Statistical Modeling for High-Dimensional Compositional Data with Applications to the Human Microbiome.
Record type: Bibliographic - Electronic resource : Monograph/item
Title/Author: Statistical Modeling for High-Dimensional Compositional Data with Applications to the Human Microbiome.
Author: Dao, Thy.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2021
Pagination: 81 p.
Notes: Source: Dissertations Abstracts International, Volume: 83-04, Section: B.
Contained By: Dissertations Abstracts International 83-04B.
Subject: Statistics.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28649733
ISBN: 9798460415458
Thesis (Ph.D.)--University of Arkansas, 2021.
This item must not be sold to any third party vendors.
Compositional data are data that lie on a simplex; they are common in many scientific domains, such as genomics, geology, and economics. Because the components of a composition must sum to one, traditional tests designed for unconstrained data become inappropriate, and new statistical methods are needed to analyze this special type of data. This dissertation is motivated by statistical problems arising in the analysis of compositional data. In particular, we focus on the high-dimensional and over-dispersed setting, where the dimensionality of the compositions is greater than the sample size and the dispersion parameter is moderate or large. We consider the general problem of testing for a compositional difference between K populations. We propose a new Bayesian hypothesis, together with a nonparametric, distance-based testing method. Furthermore, we use multiple variable-selection models, including LASSO, elastic net, ridge regression, and the cumulative logit model, to identify the most important subset of variables.

This dissertation is structured as follows:

Chapter 1 introduces the compositional microbiome data and then briefly reviews the statistical tests and models used in our framework, including distance correlation, LASSO, ridge regression, elastic net, and the cumulative logit and adjacent-category logit models.

Chapter 2 presents our new statistical test together with two real-world applications from human microbiome studies. We first formulate a hypothesis from the Bayesian point of view and suggest a nonparametric test based on inter-point distances to evaluate statistical significance. Unlike most existing tests for compositional data, the distance-based method is more sensitive to compositional differences than the mean-based method, especially when the data are over-dispersed or zero-inflated. It does not rely on any data transformation, sparsity assumption, or regularity conditions on the covariance matrix, but analyzes the compositions directly. The performance of this method is evaluated using simulation studies. We apply the new procedure to two human microbiome datasets: a throat microbiome dataset and an intestinal microbiome dataset.

In addition to the overall test, we also want to identify a small subset of variables that distinguish the populations. Chapter 3 introduces the procedure for selecting the most significant variables (bacteria or genera) using LASSO, ridge regression, elastic net, the cumulative logit model, and the adjacent-category logit model.

Chapter 4 validates the findings from Chapter 3 and presents visualizations using multi-dimensional scaling (MDS).

Chapter 5 discusses the results and concludes the dissertation with some future perspectives.
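To make the idea of a nonparametric, distance-based comparison of compositions concrete, here is a minimal Python sketch. It is not the dissertation's test: the abstract does not give the statistic, so this example assumes a generic permutation test on the difference between mean between-group and mean within-group L1 distances. The function names, the Dirichlet toy data, and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def pairwise_l1(x):
    """All pairwise L1 (Manhattan) distances between the rows of x."""
    return np.abs(x[:, None, :] - x[None, :, :]).sum(axis=2)


def between_minus_within(dist, labels):
    """Mean between-group distance minus mean within-group distance."""
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(labels), dtype=bool)
    within = dist[same & off_diag].mean()
    between = dist[~same].mean()
    return between - within


def permutation_test(x, labels, n_perm=999, seed=0):
    """Permutation p-value for the statistic above (assumed, generic test)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    dist = pairwise_l1(x)  # computed once; only the labels are shuffled
    observed = between_minus_within(dist, labels)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)
        if between_minus_within(dist, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy data: two groups of compositions with more components (p) than
# samples, drawn from Dirichlet distributions (hypothetical parameters).
data_rng = np.random.default_rng(1)
p, n = 50, 15
alpha_a = np.ones(p)
alpha_b = np.ones(p)
alpha_b[:5] = 20.0  # group B concentrates mass on the first 5 components
x = np.vstack([data_rng.dirichlet(alpha_a, n), data_rng.dirichlet(alpha_b, n)])
labels = np.array([0] * n + [1] * n)

stat, p_value = permutation_test(x, labels)
print(f"statistic = {stat:.3f}, permutation p-value = {p_value:.3f}")
```

The sketch works on the raw compositions, with no log-ratio transformation, and uses only the matrix of inter-point distances, which is the general flavor of approach the abstract describes. The choice of L1 distance and of this particular statistic is an assumption; other distances or statistics could be substituted without changing the permutation logic.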
Subjects--Topical Terms: Statistics.
Subjects--Index Terms: Compositional data
MARC record:
LDR 03953nmm a2200373 4500
001 2346006
005 20220613064835.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798460415458
035 $a (MiAaPQ)AAI28649733
035 $a AAI28649733
040 $a MiAaPQ $c MiAaPQ
100 1 $a Dao, Thy. $3 3685034
245 1 0 $a Statistical Modeling for High-Dimensional Compositional Data with Applications to the Human Microbiome.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 81 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-04, Section: B.
500 $a Advisor: Zhang, Qingyang.
502 $a Thesis (Ph.D.)--University of Arkansas, 2021.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0011.
650 4 $a Statistics. $3 517247
650 4 $a Biostatistics. $3 1002712
650 4 $a Bioinformatics. $3 553671
653 $a Compositional data
653 $a Distance-based
653 $a High dimensional compositional data
653 $a Microbiome
653 $a Statistical model
690 $a 0463
690 $a 0308
690 $a 0715
710 2 $a University of Arkansas. $b Mathematics. $3 2096727
773 0 $t Dissertations Abstracts International $g 83-04B.
790 $a 0011
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28649733
Holdings (1 item):
Barcode: W9468444
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Holds: 0