Dong Hwa University Library

Computationally Efficient Multiple Imputation Routines in Clustered Data.

Record type:	Bibliographic - electronic resource : Monograph/item
Title proper/Author:	Computationally Efficient Multiple Imputation Routines in Clustered Data.
Author:	Akkaya Hocagil, Tugba.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2017
Extent:	70 p.
Notes:	Source: Dissertations Abstracts International, Volume: 79-05, Section: B.
Contained By:	Dissertations Abstracts International, 79-05B.
Subject:	Biostatistics.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10690496
ISBN:	9780355430455

Thesis (Ph.D.)--State University of New York at Albany, 2017.

This item must not be added to any third party search indexes.

The presence of missing data in correlated data settings is a non-trivial problem. Inference by multiple imputation (MI) offers analysts a viable solution. However, the missing data problem is typically complicated further by diverse measurement scales, skip patterns, bounds and restrictions. Sequential regression imputation, also known as variable-by-variable imputation, has emerged as a popular imputation modeling technique, especially for complex data structures. In this dissertation, we develop three methods to handle incomplete data in hierarchically nested and non-nested multilevel data structures using the sequential regression imputation approach. The first method is concerned with incomplete Gaussian variables. It makes use of computationally efficient algorithms in the context of sequential regression imputation. Existing methods use the traditional Gibbs sampler, a well-known Markov chain Monte Carlo (MCMC) method, to sample from the conditional posterior predictive distribution of the missing data. These algorithms are known to mix slowly because they treat the random effects as missing data in addition to the raw missing data. Our method bypasses the slow convergence of the traditional Gibbs sampler by de-conditioning on simulated values of the random effects. We evaluate and compare the performance of our method with two other multiple imputation routines (the MICE and pan packages) through simulation studies. The results demonstrate that our method outperforms the alternatives with respect to computation time and operational characteristics (e.g. accuracy). The method is also illustrated using New York State birth certificate data. Our second method proposes a sequential regression imputation algorithm for settings where the lowest observational units are nested within possibly non-unique higher-order observational units (e.g. students within multiple classrooms). These algorithms rely on MCMC simulation methods to sample from the predictive distribution of the missing data. These distributions are derived from mixed-effects models tailored to reflect the combination of latent impacts of multiple clusters. We conduct three simulation studies to investigate how the rate of non-unique cluster membership affects the performance of the proposed imputation algorithm. The results show that the proposed algorithm performs well when the percentage of missing data and the rate of non-unique cluster membership are moderate. Our third method extends the computationally efficient algorithms of the first paper to categorical data. In particular, we consider binary and/or ordinal variables. These methods are based on calibration-based rounding rules applied to imputed values drawn under continuous approximations. The rules are developed to ensure that the imputed distributions of categorical data are consistent with the observed data distributions. Our limited simulation study demonstrates that, with our calibration rules, this method lets practitioners apply MI techniques that are inferentially sound with respect to bias, coverage rate and accuracy.
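
For readers unfamiliar with the term, the general idea of sequential regression (variable-by-variable) imputation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm. It assumes Gaussian variables, ignores clustering and random effects entirely, and keeps the regression coefficients fixed at their least-squares estimates instead of drawing them from a posterior. The function name `sequential_regression_impute` and its parameters are hypothetical.

```python
import numpy as np

def sequential_regression_impute(X, n_iter=10, seed=0):
    """Return one completed copy of X (2-D float array, NaN marks missing).

    Simplified illustration: single-level data, Gaussian errors,
    no random effects, no posterior draw of the coefficients.
    Assumes every column has at least a few observed values.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Initialize missing cells with the observed mean of their column.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])

    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            obs = ~miss_j
            # Regress column j on an intercept and all other columns,
            # using the current imputations of those other columns.
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(filled)), others])
            beta, *_ = np.linalg.lstsq(design[obs], filled[obs, j], rcond=None)
            resid = filled[obs, j] - design[obs] @ beta
            dof = max(int(obs.sum()) - design.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)
            # Replace missing cells with the prediction plus residual noise.
            noise = rng.normal(0.0, sigma, int(miss_j.sum()))
            filled[miss_j, j] = design[miss_j] @ beta + noise
    return filled
```

Calling the function with several different seeds would give several completed datasets, each of which is then analyzed separately before the results are pooled. Handling clustered data, as the dissertation does, would require mixed-effects models in place of the ordinary regressions used here.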

ISBN: 9780355430455
Subjects--Topical Terms:

1002712
Biostatistics.
Subjects--Index Terms:

Calibration-based rounding in ordinal clustered data

Computationally Efficient Multiple Imputation Routines in Clustered Data.
LDR:04590nmm a2200397 4500
001 2269012
005 20200908082311.5
008 220629s2017 ||||||||||||||||| ||eng d
020 $a 9780355430455
035 $a (MiAaPQ)AAI10690496
035 $a (MiAaPQ)sunyalb:12242
035 $a AAI10690496
040 $a MiAaPQ $c MiAaPQ
100 1 $a Akkaya Hocagil, Tugba. $3 3546316
245 1 0 $a Computationally Efficient Multiple Imputation Routines in Clustered Data.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2017
300 $a 70 p.
500 $a Source: Dissertations Abstracts International, Volume: 79-05, Section: B.
500 $a Publisher info.: Dissertation/Thesis.
500 $a Advisor: Yucel, Recai M.
502 $a Thesis (Ph.D.)--State University of New York at Albany, 2017.
506 $a This item must not be added to any third party search indexes.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0668.
650 4 $a Biostatistics. $3 1002712
653 $a Calibration-based rounding in ordinal clustered data
653 $a Multiple imputation
653 $a Multiple imputation in clustered data
653 $a Multiple imputation in multiple membership multilevel data
653 $a Multiple membership multilevel data
653 $a Sequential multiple imputation in clustered data
690 $a 0308
710 2 $a State University of New York at Albany. $b Biometry and Statistics. $3 2102086
773 0 $t Dissertations Abstracts International $g 79-05B.
790 $a 0668
791 $a Ph.D.
792 $a 2017
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10690496