Dimension Reduction and Regression for Tensor Data and Mixture Models.
Record type:
Bibliographic - Electronic resource : Monograph/item
Title/Author:
Dimension Reduction and Regression for Tensor Data and Mixture Models.
Author:
Wang, Ning.
Description:
1 online resource (182 pages)
Notes:
Source: Dissertations Abstracts International, Volume: 84-02, Section: B.
Contained By:
Dissertations Abstracts International, 84-02B.
Subject:
Statistics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28968197 (click for full text, PQDT)
ISBN:
9798841771630
Thesis (Ph.D.)--The Florida State University, 2022.
Includes bibliographical references
In modern statistics, many data sets have a complex structure, including but not limited to high dimensionality, higher-order structure, and heterogeneity. Recently, there has been growing interest in developing valid and efficient statistical methods for these data sets. In this thesis, we study three types of data complexity: (1) tensor data (a.k.a. array-valued random objects); (2) heavy-tailed data; (3) data from heterogeneous subpopulations. We address these three challenges by developing novel methodologies and efficient algorithms. Specifically, we propose likelihood-based dimension folding methods for tensor data, study robust tensor regression via a proposed tensor t-distribution, and develop an algorithm and theory for high-dimensional mixture linear regression. Our work on these three topics is elaborated as follows. In recent years, traditional multivariate analysis tools, such as multivariate regression and discriminant analysis, have been generalized from modeling random vectors and matrices to higher-order random tensors (a.k.a. array-valued random objects). Equipped with tensor algebra and high-dimensional computation techniques, concise and interpretable statistical models and estimation procedures prevail in various applications. One challenge for tensor data analysis is caused by the large dimensions of the tensor. Many statistical methods, such as linear discriminant analysis and quadratic discriminant analysis, are not applicable or are unstable for data sets whose dimension is larger than the sample size. Sufficient dimension reduction methods are flexible tools for data visualization and exploratory analysis, typically in a regression of a univariate response on a multivariate predictor. For regressions with tensor predictors, a general framework of dimension folding and several moment-based estimation procedures have been proposed in the literature.
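The dimension-folding idea described above can be sketched in a few lines: a matrix-valued predictor X (p1 x p2) is reduced mode-wise, X -> A1' X A2, so the regression only has to work with a small folded core. The projections A1 and A2 below are random orthonormal placeholders for illustration, not the likelihood-based estimators the thesis develops.

```python
import numpy as np

rng = np.random.default_rng(0)
p1, p2, d1, d2, n = 20, 30, 2, 3, 100

# Orthonormal folding directions (placeholders; in practice these are estimated).
A1 = np.linalg.qr(rng.standard_normal((p1, d1)))[0]  # p1 x d1
A2 = np.linalg.qr(rng.standard_normal((p2, d2)))[0]  # p2 x d2
X = rng.standard_normal((n, p1, p2))                 # n matrix-valued predictors

# Fold each predictor: Z_i = A1' X_i A2, a d1 x d2 core per sample.
Z = np.einsum('ab,nac,cd->nbd', A1, X, A2)
print(Z.shape)  # (100, 2, 3): the regression now sees 6 features instead of 600
```

The point of the sketch is only the shape reduction: downstream methods such as quadratic discriminant analysis operate on the small cores Z rather than the full-dimensional X.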
In this essay, we propose two likelihood-based dimension folding methods motivated by quadratic discriminant analysis for tensor data: the maximum likelihood estimators are derived under a general covariance setting and a structured envelope covariance setting. We study the asymptotic properties of both estimators and show, using simulation studies and a real-data analysis, that they are more accurate than existing moment-based estimators. Another challenge for statistical tensor models is the non-Gaussian nature of much real-world data. Unfortunately, existing approaches are either restricted to normality or implicitly use least-squares-type objective functions that are computationally efficient but sensitive to data contamination. Motivated by this, we adopt a simple tensor t-distribution that, unlike the commonly used matrix t-distributions, is compatible with tensor operators and reshaping of the data. We study tensor response regression with tensor t-errors, and develop a penalized likelihood-based estimator and a novel one-step estimator. We study the asymptotic relative efficiency of various estimators and establish the one-step estimator's oracle properties and near-optimal asymptotic efficiency. We further propose a high-dimensional modification of the one-step estimation procedure and show that it attains the minimax optimal rate of estimation. Numerical studies show the excellent performance of the one-step estimator. In the last chapter, we consider high-dimensional mixture linear regression. The expectation-maximization (EM) algorithm and its variants are widely used in statistics. In high-dimensional mixture linear regression, the model is assumed to be a finite mixture of linear regressions and the number of predictors is much larger than the sample size. The standard EM algorithm, which attempts to find the maximum likelihood estimator, becomes infeasible. We devise a penalized EM algorithm and study its statistical properties.
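The robustness mechanism of t-error regression can be seen in a much-simplified analogue of the model above (vector predictor and univariate response, rather than tensors): the EM iteration reweights each observation by its expected latent precision, so heavy-tailed points are automatically down-weighted. The function name, the fixed degrees of freedom nu, and the iteration count are illustrative choices, not the thesis's estimator.

```python
import numpy as np

def t_regression_em(X, y, nu=3.0, n_iter=50):
    """EM for linear regression with univariate t(nu) errors (toy analogue)."""
    n, p = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # OLS start
    sigma2 = np.var(y - X @ beta)
    for _ in range(n_iter):
        r = y - X @ beta
        # E-step: expected precision of each latent scale; outliers get small w.
        w = (nu + 1.0) / (nu + r**2 / sigma2)
        W = w[:, None]
        # M-step: weighted least squares with the current weights.
        beta = np.linalg.solve(X.T @ (W * X), X.T @ (w * y))
        sigma2 = np.mean(w * (y - X @ beta) ** 2)
    return beta

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.standard_normal(200)])
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=3, size=200)
print(t_regression_em(X, y))  # close to the true coefficients (1, 2)
```

Under Gaussian errors all weights tend to 1 and the iteration reduces to ordinary least squares; the heavier the tail of an observation's residual, the smaller its influence on the fit.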
Existing theoretical results for regularized EM algorithms often rely on dividing the sample into many independent batches and employing a fresh batch of samples in each iteration of the algorithm. Our algorithm and theoretical analysis do not require sample splitting. The proposed method also performs encouragingly in simulation studies and a real-data example.
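The structure of a penalized EM algorithm for mixture linear regression can be sketched as follows: the E-step computes posterior component responsibilities, and each M-step replaces the usual weighted least squares with a responsibility-weighted lasso, solved here by a few proximal-gradient steps. This is only a structural sketch under simplifying assumptions (two components, Gaussian errors, shared variance, illustrative penalty level lam), not the thesis's algorithm or its theory.

```python
import numpy as np

def soft(b, t):
    """Soft-thresholding operator, the prox of the l1 penalty."""
    return np.sign(b) * np.maximum(np.abs(b) - t, 0.0)

def penalized_em(X, y, lam=0.1, n_iter=100, m_steps=25):
    """Toy penalized EM for a 2-component mixture of sparse linear regressions."""
    n, p = X.shape
    rng = np.random.default_rng(0)
    B = rng.standard_normal((2, p))      # coefficient vector per component
    pi = np.array([0.5, 0.5])            # mixing proportions
    sigma2 = np.var(y)
    L = np.linalg.norm(X, 2) ** 2 / n    # Lipschitz bound for the gradient step
    for _ in range(n_iter):
        # E-step: responsibilities under Gaussian errors (shared sigma2).
        R = y[:, None] - X @ B.T                       # n x 2 residual matrix
        logp = np.log(pi) - R**2 / (2 * sigma2)
        logp -= logp.max(axis=1, keepdims=True)        # stabilize exp
        W = np.exp(logp)
        W /= W.sum(axis=1, keepdims=True)
        # M-step: proximal-gradient steps on each weighted lasso problem.
        for k in range(2):
            w = W[:, k]
            for _ in range(m_steps):
                g = -X.T @ (w * (y - X @ B[k])) / n    # gradient of weighted loss
                B[k] = soft(B[k] - g / L, lam / L)     # lasso prox step
        pi = W.mean(axis=0)
        sigma2 = (W * (y[:, None] - X @ B.T) ** 2).sum() / n
    return B, pi
```

The soft-thresholding in the M-step is what makes the iterates sparse, and hence usable when the number of predictors exceeds the sample size; note that, unlike the sample-splitting analyses mentioned above, this sketch reuses the full sample in every iteration.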
Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023.
Mode of access: World Wide Web.
ISBN: 9798841771630
Subjects--Topical Terms: Statistics.
Subjects--Index Terms: Dimension reduction
Index Terms--Genre/Form: Electronic books.
LDR  05518nmm a2200349K 4500
001  2356365
005  20230612110804.5
006  m o d
007  cr mn ---uuuuu
008  241011s2022 xx obm 000 0 eng d
020     $a 9798841771630
035     $a (MiAaPQ)AAI28968197
035     $a AAI28968197
040     $a MiAaPQ $b eng $c MiAaPQ $d NTU
100  1  $a Wang, Ning. $3 730784
245  10 $a Dimension Reduction and Regression for Tensor Data and Mixture Models.
264   0 $c 2022
300     $a 1 online resource (182 pages)
336     $a text $b txt $2 rdacontent
337     $a computer $b c $2 rdamedia
338     $a online resource $b cr $2 rdacarrier
500     $a Source: Dissertations Abstracts International, Volume: 84-02, Section: B.
500     $a Advisor: Zhang, Xin.
502     $a Thesis (Ph.D.)--The Florida State University, 2022.
504     $a Includes bibliographical references
533     $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538     $a Mode of access: World Wide Web
650   4 $a Statistics. $3 517247
653     $a Dimension reduction
653     $a Mixture models
653     $a Tensor data
655   7 $a Electronic books. $2 lcsh $3 542853
690     $a 0463
710  2  $a ProQuest Information and Learning Co. $3 783688
710  2  $a The Florida State University. $b Statistics. $3 3185231
773  0  $t Dissertations Abstracts International $g 84-02B.
856  40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28968197 $z click for full text (PQDT)
Holdings:
Barcode | Location | Circulation category | Material type | Call no. | Use type | Loan status | Holds
W9478721 | Electronic Resources | 11. Online viewing_V | E-book | EB | Normal use | On shelf | 0