Sampling Designs for Resource Efficient Collection of Outcome Labels for Machine-Learning, with Application to Electronic Medical Records.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Sampling Designs for Resource Efficient Collection of Outcome Labels for Machine-Learning, with Application to Electronic Medical Records.
Author:
Tan, Wei Ling Katherine.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2018
Description:
199 p.
Notes:
Source: Dissertation Abstracts International, Volume: 80-07(E), Section: B.
Contained By:
Dissertation Abstracts International 80-07B(E).
Subject:
Biostatistics.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10981508
ISBN:
9780438871168
Dissertation note:
Thesis (Ph.D.)--University of Washington, 2018.
Abstract:
In leveraging data from large-scale electronic medical record systems for research, an important step is the accurate identification of key clinical outcomes. Some outcomes must be derived or predicted from both structured and unstructured data, for example using statistical machine-learning classification. Classification requires the collection of labeled data, which is a sample where actual outcome statuses are manually coded by human clinical experts. For rare outcomes, simple random sampling (SRS) for labeled data collection results in very few cases in the sample. Such outcome class imbalance results in insufficient information for classifier modeling, yet additional abstraction is often expensive and time-consuming. In this dissertation, we propose sampling designs for labeled data collection towards machine-learning, targeting the rare outcome scenario. Our proposed designs are resource efficient, requiring a smaller sample size for modeling goals compared to SRS, yet design impacts on model development and validation can be statistically characterized to be "valid". We first introduce a stratified sampling procedure based on values of enrichment surrogates, which are summaries of structured data related to the clinical outcome requiring abstraction. Next, motivated by radiology reports with multiple co-occurring findings, we discuss extensions to the multi-label setting. Finally, for scenarios where a previously developed "source" model is to be externally transferred, we propose a framework for such "new" labeled data collection.
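The general idea of surrogate-based stratified sampling mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's procedure: the function name `stratified_label_sample`, the binary `flagged` surrogate, the population size, and all rates below are hypothetical values chosen only for illustration. The sketch oversamples the surrogate-positive stratum so that more rare cases reach the human abstractors, and attaches inverse-sampling-probability weights so that population-level quantities can still be estimated from the labeled sample.

```python
import random


def stratified_label_sample(records, surrogate, n_per_stratum, seed=0):
    """Draw a stratified sample of records to send for manual labeling.

    records: list of dicts (hypothetical structure).
    surrogate: function mapping a record to its stratum key.
    n_per_stratum: dict mapping stratum key -> number of records to draw.
    Returns a list of (record, weight) pairs, where weight = N_h / n_h is the
    inverse of the sampling probability within stratum h.
    """
    rng = random.Random(seed)
    strata = {}
    for rec in records:
        strata.setdefault(surrogate(rec), []).append(rec)

    sample = []
    for key, members in strata.items():
        n_h = min(n_per_stratum.get(key, 0), len(members))
        if n_h == 0:
            continue
        weight = len(members) / n_h
        for rec in rng.sample(members, n_h):
            sample.append((rec, weight))
    return sample


# Synthetic population (all rates are made-up assumptions): a rare outcome
# that is much more common among records flagged by the surrogate.
pop_rng = random.Random(1)
population = []
for i in range(10000):
    flagged = pop_rng.random() < 0.05
    outcome = pop_rng.random() < (0.40 if flagged else 0.01)
    population.append({"id": i, "flagged": flagged, "outcome": outcome})

# Label 100 records from each surrogate stratum instead of 200 at random.
sample = stratified_label_sample(
    population, lambda r: r["flagged"], {True: 100, False: 100}
)

cases_in_sample = sum(rec["outcome"] for rec, _ in sample)
total_weight = sum(w for _, w in sample)
# Weighted prevalence estimate that accounts for the sampling design.
weighted_prevalence = sum(w * rec["outcome"] for rec, w in sample) / total_weight
print(cases_in_sample, round(weighted_prevalence, 4))
```

Under these assumed rates, the flagged stratum contributes far more cases per labeled record than a simple random sample would, while the weights sum to the population size, which is what allows design-adjusted estimates.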
LDR :02584nmm a2200301 4500
001 2202309
005 20190513114647.5
008 201008s2018 ||||||||||||||||| ||eng d
020 $a 9780438871168
035 $a (MiAaPQ)AAI10981508
035 $a (MiAaPQ)washington:19428
035 $a AAI10981508
040 $a MiAaPQ $c MiAaPQ
100 1 $a Tan, Wei Ling Katherine. $3 3429053
245 1 0 $a Sampling Designs for Resource Efficient Collection of Outcome Labels for Machine-Learning, with Application to Electronic Medical Records.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300 $a 199 p.
500 $a Source: Dissertation Abstracts International, Volume: 80-07(E), Section: B.
500 $a Adviser: Patrick J. Heagerty.
502 $a Thesis (Ph.D.)--University of Washington, 2018.
590 $a School code: 0250.
650 4 $a Biostatistics. $3 1002712
650 4 $a Artificial intelligence. $3 516317
690 $a 0308
690 $a 0800
710 2 $a University of Washington. $b Biostatistics (Public Health). $3 3429054
773 0 $t Dissertation Abstracts International $g 80-07B(E).
790 $a 0250
791 $a Ph.D.
792 $a 2018
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10981508
Inventory Number:
W9378858
Location Name:
Electronic resources
Item Class:
11.Online viewing_V
Material type:
E-book
Call number:
EB
Usage Class:
Normal use
Loan Status:
On shelf
No. of reservations:
0