Two Studies on the Application of Machine Learning for Biomedical Big Data.
Record type:
Bibliographic record, electronic resource : Monograph/item
Title proper/Author:
Two Studies on the Application of Machine Learning for Biomedical Big Data.
Author:
Lung, Pei-Yau.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2019
Extent:
62 p.
Notes:
Source: Dissertations Abstracts International, Volume: 81-04, Section: B.
Contained By:
Dissertations Abstracts International 81-04B.
Subject:
Mathematics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13811934
ISBN:
9781085789639
Thesis (Ph.D.)--The Florida State University, 2019.
This item must not be sold to any third party vendors.
Large volumes of genomic data and new scientific discoveries in biomedical research are being produced every day by laboratories in both academia and industry. However, two issues severely affect the usability of so-called biomedical big data: 1) the majority of public genomic data do not contain enough clinical information, and 2) scientific discoveries are stored in text as unstructured data. This dissertation presents two studies, each addressing one of these issues with machine learning methods, in order to maximize the usability of biomedical big data.
In the first study, we infer missing clinical information using multiple gene expression data sets and a wide variety of machine learning methods. We propose a new performance measure, Proportion of Positives which can be predicted with High accuracy (PPH), to evaluate models in terms of their effectiveness in recovering data with missing clinical information. PPH estimates the percentage of data that can be recovered given a desired level of accuracy. The experimental results demonstrate the effectiveness of the predicted clinical information in downstream inference tasks.
In the second study, we propose a three-stage computational method to automatically extract chemical-protein interactions (CPIs) from a given text. Our method extracts CPI-pairs and CPI-triplets from sentences, where a CPI-pair consists of a chemical compound and a protein name, and a CPI-triplet consists of a CPI-pair along with an interaction word describing their relationship. We extract a diverse set of features from sentences, which are used to build multiple machine learning models. Our models use both simple features, which can be computed directly from sentences, and more sophisticated features derived using sentence structure analysis techniques. Our method performed best among systems that use non-deep-learning methods, and outperformed several deep-learning-based systems in Track 5 of the BioCreative VI challenge. The features we designed in this study are informative and can be applied to other machine learning methods, including deep learning.
ISBN: 9781085789639
Subjects--Topical Terms:
515831
Mathematics.
Subjects--Index Terms:
Machine learning
LDR    03200nmm a2200337 4500
001    2273615
005    20201109124816.5
008    220629s2019 ||||||||||||||||| ||eng d
020    $a 9781085789639
035    $a (MiAaPQ)AAI13811934
035    $a AAI13811934
040    $a MiAaPQ $c MiAaPQ
100 1  $a Lung, Pei-Yau. $3 3551066
245 10 $a Two Studies on the Application of Machine Learning for Biomedical Big Data.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2019
300    $a 62 p.
500    $a Source: Dissertations Abstracts International, Volume: 81-04, Section: B.
500    $a Advisor: Zhang, Jinfeng.
502    $a Thesis (Ph.D.)--The Florida State University, 2019.
506    $a This item must not be sold to any third party vendors.
590    $a School code: 0071.
650  4 $a Mathematics. $3 515831
650  4 $a Medicine. $3 641104
653    $a Machine learning
653    $a Biomedical big data
653    $a Clinical information
690    $a 0405
690    $a 0564
710 2  $a The Florida State University. $b Statistics. $3 3185231
773 0  $t Dissertations Abstracts International $g 81-04B.
790    $a 0071
791    $a Ph.D.
792    $a 2019
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=13811934
Holdings
Barcode:
W9425849
Location:
Electronic resource
Circulation category:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Hold status:
0