Handling Incomplete High-Dimensional Multivariate Longitudinal Data with Mixed Data Types by Multiple Imputation Using a Longitudinal Factor Analysis Model.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Handling Incomplete High-Dimensional Multivariate Longitudinal Data with Mixed Data Types by Multiple Imputation Using a Longitudinal Factor Analysis Model./
Author:
Lu, Xiang.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2016
Description:
114 p.
Notes:
Source: Dissertations Abstracts International, Volume: 77-09, Section: B.
Contained By:
Dissertations Abstracts International, 77-09B.
Subject:
Biostatistics.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10036394
ISBN:
9781339545691
Thesis (Ph.D.)--University of California, Los Angeles, 2016.
This item must not be sold to any third party vendors.
We developed an imputation model that addresses missing data in high-dimensional longitudinal data sets with mixed data types (continuous and ordinal), based on factor analysis and a linear mixed-effects model. Markov chain Monte Carlo is used to fit the model, iteratively drawing parameters, latent variables, and missing values. The imputation model is implemented as an R package. We tested the new imputation model on simulated data sets under 32 scenarios and 2 hypothetical missing-data mechanisms. Two competing methods, PAN (Multiple Imputation for Multivariate Panel or Clustered Data) and MICE (Multiple Imputation using Chained Equations), were tested in the same way for comparison, to show the necessity of addressing the high dimensionality and the mix of continuous and ordinal data types. Part of our effort went into accelerating the simulation using C++ (a lower-level language than R) and parallel computing on the Hoffman2 Cluster. Compared with running the simulation evaluation in an R program on a single computer, our program runs approximately 600 times faster. We also tested the robustness of the new imputation model when its assumptions are violated. We found that assuming fewer factors than the true number leads to invalid inferences, while assuming more leads to reasonable inferences. We also found that omitting quadratic trends in the factor scores hurts inferences based on the imputation only when those trends are very strong. In the most unfavorable scenario we tested, in which the underlying quadratic coefficient is as large as 0.8 times the linear coefficient, the actual coverage rates of 95% interval estimates begin to fall below 90%. An application to a dental data set is shown, in comparison with PAN, NORM, and a forerunner of the newly developed method.
ISBN: 9781339545691
Subjects--Topical Terms:
1002712
Biostatistics.
Subjects--Index Terms:
Factor analysis
LDR :03187nmm a2200373 4500
001 2268989
005 20200908082306.5
008 220629s2016 ||||||||||||||||| ||eng d
020 $a 9781339545691
035 $a (MiAaPQ)AAI10036394
035 $a (MiAaPQ)ucla:14328
035 $a AAI10036394
040 $a MiAaPQ $c MiAaPQ
100 1 $a Lu, Xiang. $3 1911329
245 1 0 $a Handling Incomplete High-Dimensional Multivariate Longitudinal Data with Mixed Data Types by Multiple Imputation Using a Longitudinal Factor Analysis Model.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2016
300 $a 114 p.
500 $a Source: Dissertations Abstracts International, Volume: 77-09, Section: B.
500 $a Publisher info.: Dissertation/Thesis.
500 $a Advisor: Belin, Thomas R.
502 $a Thesis (Ph.D.)--University of California, Los Angeles, 2016.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0031.
650 4 $a Biostatistics. $3 1002712
653 $a Factor analysis
653 $a High-dimensional
653 $a Imputation
653 $a Longitudinal
653 $a Missing data
690 $a 0308
710 2 $a University of California, Los Angeles. $b Biostatistics. $3 3280770
773 0 $t Dissertations Abstracts International $g 77-09B.
790 $a 0031
791 $a Ph.D.
792 $a 2016
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10036394
Items
Inventory Number: W9421223
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: Normal
Loan Status: On shelf
No. of reservations: 0