Statistical Methods for Mobile Health and Genomics Data.
Record type: Bibliographic - Electronic resource : Monograph/item
Title: Statistical Methods for Mobile Health and Genomics Data.
Author: Quinn, Matthew.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2022
Extent: 110 p.
Notes: Source: Dissertations Abstracts International, Volume: 83-12, Section: B.
Contained by: Dissertations Abstracts International, 83-12B.
Subject: Biostatistics.
ISBN: 9798819382677
Thesis (Ph.D.)--Harvard University, 2022.
This item must not be sold to any third party vendors.
A common goal in statistical analyses is to differentiate signal from noise. This problem is ubiquitous in many fields, including mobile health (mHealth) and genomics, both of which have garnered tremendous interest in recent years as advancements in technology continue to make them even more prominent for studying human health. While this challenge of detecting signal is universal, the solutions to it are not. Different research applications introduce their own idiosyncrasies that can make existing approaches for signal detection insufficient for that specific context. In this dissertation, we present approaches for signal detection for three different problems in mHealth and genomics.
In Chapter 1, we study mHealth data, which are often collected through wearable devices, such as watches and other fitness trackers. The devices record and process data using algorithms that are subject to updates and glitches, which device manufacturers often do not publicize. As a result, devices can suddenly change how data are collected and reported over time. A researcher using mHealth data needs to be able to detect these changes in order to adjust for them. We propose Automated Selection of Changepoints using Empirical P-values and Trimming (ASCEPT) as an approach for objectively identifying where these changes occur. ASCEPT relies upon Monte Carlo simulations and regression models to accurately identify these algorithmic changes. We compare ASCEPT to an existing method on both simulated and real mHealth data.
In Chapter 2, we look at chromatin immunoprecipitation sequencing (ChIP-seq) data, which reflect where proteins bind to a genome. Researchers often compare individuals from different experimental groups or biological conditions to detect regions of the genome in which there is differential binding (DB). DB in particular regions may then be associated with different health outcomes between the two groups, in turn helping the researcher understand risk factors or mechanisms contributing to a particular disease. However, popular methods for detecting DB often do not fully account for autocorrelation within samples, biological variability across samples, or selection procedures used to find regions of interest. As a result, they often report inappropriate inference regarding the significance of DB regions. We present a permutation test pipeline for finding DB sites on a genome while accounting for autocorrelation, biological variability, and the selection procedure in order to provide accurate inference. We compare this pipeline to two popular methods on both real and simulated data.
In Chapter 3, we continue studying genomics data, but this time focus on ribonucleic acid sequencing (RNA-seq) data, which reflect gene expression. Researchers commonly use RNA-seq data to study gene co-expression, or how the expression of different genes is correlated with one another. One can use the co-expression between genes to construct networks to better understand gene regulation or biological mechanisms, often with the hope of learning more about the drivers of certain health outcomes. However, not all co-expressions in RNA-seq data are genuine. Technical issues with sequencing and normalization procedures that researchers perform may introduce spurious signals. We present evidence that this problem arises for different genes in real RNA-seq data and that the characteristics of these false signals can vary depending both on the normalization procedure used and the tissue in which the expression occurs. We present different metrics for characterizing the presence of these spurious correlations and permutation tests for assessing their statistical significance.
While we present three different research problems, they are all manifestations of the same core challenge. Whether we detect algorithmic changes in mHealth data over time, regions on the genome that contain DB, or spurious correlations among genes, the same underlying challenge of differentiating true signal from noise comes up. Additionally, while the solution to each instance is unique, we find that computational techniques, like Monte Carlo simulations and permutation tests, are particularly helpful tools in each scenario. Thus, while both the specific type of signal detection and its solution will depend on the underlying research context, there are commonalities among signal detection problems that can be helpful for understanding and addressing them.
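For readers unfamiliar with the permutation tests mentioned in the abstract, the following is a minimal generic sketch of a Monte Carlo permutation test for the correlation between two measurement vectors. It is not the dissertation's pipeline or metrics; the function names `pearson` and `permutation_pvalue`, the number of permutations, and the example data are all assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation p-value for the correlation of x and y.

    Shuffling y breaks any real association with x, so the shuffled
    correlations approximate the null distribution of "no signal".
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for reproducibility
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # The +1 correction counts the observed arrangement itself,
    # so the estimated p-value can never be exactly zero.
    return (hits + 1) / (n_perm + 1)


# Illustrative (made-up) data: y is an exact linear function of x.
x = list(range(20))
y = [2 * v + 1 for v in x]
p_value = permutation_pvalue(x, y)
```

With a strong association such as this one, essentially no shuffled arrangement matches the observed correlation, so the estimated p-value sits near its floor of 1 / (n_perm + 1).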
ISBN: 9798819382677
Subjects--Topical Terms:
1002712
Biostatistics.
Subjects--Index Terms:
Wearable devices
LDR 05503nmm a2200325 4500
001 2349773
005 20221003074951.5
008 241004s2022 eng d
020 $a 9798819382677
035 $a (MiAaPQ)AAI29209964
035 $a AAI29209964
040 $a MiAaPQ $c MiAaPQ
100 1 $a Quinn, Matthew. $0 (orcid)0000-0002-3033-1682 $3 3689189
245 1 0 $a Statistical Methods for Mobile Health and Genomics Data.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2022
300 $a 110 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-12, Section: B.
500 $a Advisor: Irizarry, Rafael; Glass, Kimberly.
502 $a Thesis (Ph.D.)--Harvard University, 2022.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0084.
650 4 $a Biostatistics. $3 1002712
653 $a Wearable devices
653 $a Chromatin immunoprecipitation sequencing
653 $a Ribonucleic acid sequencing
653 $a Algorithmic changes
690 $a 0308
710 2 0 $a Harvard University. $b Biostatistics. $3 2104931
773 0 $t Dissertations Abstracts International $g 83-12B.
790 $a 0084
791 $a Ph.D.
792 $a 2022
793 $a English
Holdings (1 item):
Barcode: W9472211
Location: Electronic resource
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0