Statistical Machine Learning for Complex Data Sets.
Record type:
Bibliographic - electronic resource : Monograph/item
Title/Author:
Statistical Machine Learning for Complex Data Sets. /
Author:
Dai, Xiaowu.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2019.
Pagination:
302 p.
Notes:
Source: Dissertations Abstracts International, Volume: 80-12, Section: B.
Contained by:
Dissertations Abstracts International, 80-12B.
Subject:
Applied Mathematics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13882330
ISBN:
9781392152898
LDR :03563nmm a2200337 4500
001    2207158
005    20190913102458.5
008    201008s2019 ||||||||||||||||| ||eng d
020    $a 9781392152898
035    $a (MiAaPQ)AAI13882330
035    $a (MiAaPQ)wisc:16140
035    $a AAI13882330
040    $a MiAaPQ $c MiAaPQ
100 1  $a Dai, Xiaowu. $3 3434099
245 10 $a Statistical Machine Learning for Complex Data Sets.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2019
300    $a 302 p.
500    $a Source: Dissertations Abstracts International, Volume: 80-12, Section: B.
500    $a Publisher info.: Dissertation/Thesis.
500    $a Advisor: Wahba, Grace; Chien, Peter.
502    $a Thesis (Ph.D.)--The University of Wisconsin - Madison, 2019.
506    $a This item must not be sold to any third party vendors.
520    $a This thesis focuses on developing theory and computational methods for a set of problems involving complex data. Chapter 2 studies multivariate nonparametric prediction with gradient information. Gradients can be easily estimated in stochastic simulations and computer experiments. We propose a unified framework to incorporate the noisy and correlated gradients into predictions. We show theoretically, through minimax optimal rates of convergence, that incorporating gradients tends to significantly improve predictions with deterministic or random designs. Chapter 3 proposes high-dimensional smoothing splines with applications to Alzheimer's disease (AD) prediction. While traditional prediction based on structural MRI uses imaging acquired at a single time point, a longitudinal study is more sensitive in detecting early pathological changes of AD. Our novel method can be applied to extract features from heterogeneous and longitudinal MRI for AD prediction, outperforming existing methods. Chapter 4 introduces a novel class of variable-selection penalties called TWIN, which provides sensible data-adaptive penalization. Under a linear sparsity regime, we show that TWIN penalties have a high probability of selecting correct models and result in minimax optimal estimators. We demonstrate, in challenging and realistic simulation settings with high correlations between active and inactive variables, that TWIN has high power in variable selection while controlling the number of false discoveries, outperforming standard penalties. Chapter 5 investigates generalizations of mini-batch SGD in deep neural networks. We theoretically justify the hypothesis that large-batch SGD tends to converge to sharp minimizers by establishing new properties of SGD. In particular, we give an explicit escaping time of SGD from a local minimum in the finite-time regime and prove that SGD tends to converge to flatter minima in the asymptotic regime (although it may take exponential time to converge), regardless of the batch size. Chapter 6 provides another look at statistical calibration problems in computer models. This viewpoint is inspired by two overarching practical considerations: (i) many computer models are inadequate for perfectly modeling physical systems; (ii) only a finite number of data points are available from physical experiments to calibrate related computer models. We provide a non-asymptotic theory and derive a novel prediction-oriented calibration method.
590    $a School code: 0262.
650  4 $a Applied Mathematics. $3 1669109
650  4 $a Statistics. $3 517247
650  4 $a Computer science. $3 523869
690    $a 0364
690    $a 0463
690    $a 0984
710 2  $a The University of Wisconsin - Madison. $b Statistics. $3 2101047
773 0  $t Dissertations Abstracts International $g 80-12B.
790    $a 0262
791    $a Ph.D.
792    $a 2019
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13882330
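The abstract's Chapter 5 concerns mini-batch SGD and the effect of batch size on the minima it reaches. As a point of reference only (not the thesis's method), a minimal sketch of plain mini-batch SGD on a least-squares problem, where `batch_size` is the parameter whose role the thesis analyzes; all names here are illustrative:

```python
import numpy as np

def minibatch_sgd(X, y, batch_size=32, lr=0.01, epochs=100, seed=0):
    """Plain mini-batch SGD for least-squares regression.

    Each epoch shuffles the rows, then updates the weights on
    successive mini-batches of `batch_size` rows.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            # Average gradient of 0.5 * ||X_b w - y_b||^2 over the batch.
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    return w

# Recover known coefficients from noiseless synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_hat = minibatch_sgd(X, y, batch_size=16, lr=0.05, epochs=200)
```

Varying `batch_size` (e.g. 16 vs. 500) changes the noise in each update while leaving the fixed point unchanged on this convex problem; the sharp-versus-flat-minima question the abstract describes only arises in the non-convex setting of deep networks.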
Holdings (1 item)

Barcode: W9383707
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0