Data mining algorithms for classification of complex biomedical data.
Record type:
Bibliographic, language material, printed: Monograph/item
Title/Author:
Data mining algorithms for classification of complex biomedical data./
Author:
Lan, Liang.
Extent:
112 p.
Notes:
Source: Dissertation Abstracts International, Volume: 74-06(E), Section: B.
Contained By:
Dissertation Abstracts International 74-06B(E).
Subject:
Computer Science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3552336
ISBN:
9781267907868
LDR 04256nam a2200325 4500
001 1968298
005 20141203120934.5
008 150210s2012 ||||||||||||||||| ||eng d
020 $a 9781267907868
035 $a (MiAaPQ)AAI3552336
035 $a AAI3552336
040 $a MiAaPQ $c MiAaPQ
100 1 $a Lan, Liang. $3 2105442
245 10 $a Data mining algorithms for classification of complex biomedical data.
300 $a 112 p.
500 $a Source: Dissertation Abstracts International, Volume: 74-06(E), Section: B.
500 $a Adviser: Slobodan Vucetic.
502 $a Thesis (Ph.D.)--Temple University, 2012.
520 $a In this dissertation, I present research that contributes to solving three open problems in biomedical informatics: (1) multi-task approaches for microarray classification; (2) multi-label classification for gene and protein function prediction from multi-source biological data; (3) spatial scan for movement data.
520 $a In microarray classification, samples belong to several predefined categories (e.g., cancer vs. control tissues) and the goal is to build a predictor that classifies a new tissue sample based on its microarray measurements. When faced with small-sample, high-dimensional microarray data, most machine learning algorithms produce an overly complicated model that performs well on training data but poorly on new data. To reduce the risk of over-fitting, feature selection becomes an essential technique in microarray classification. However, standard feature selection algorithms are bound to underperform when the microarray dataset is particularly small. The best remedy is to borrow strength from external microarray datasets. In this dissertation, I present two new multi-task feature filter methods that improve classification performance by utilizing external microarray data. The first method aggregates the feature selection results from multiple microarray classification tasks. The resulting multi-task feature selection can be shown to improve the quality of the selected features and lead to higher classification accuracy. The second method jointly selects a small gene set with maximal discriminative power and minimal redundancy across multiple classification tasks by solving an objective function with integer constraints.
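The first method above (aggregating feature selection results across tasks) can be illustrated with a minimal sketch. The abstract does not state the filter score or the aggregation rule, so both are assumptions here: a signal-to-noise style score per gene, and rank averaging across tasks. The function names `filter_scores`, `ranks`, and `multitask_select` are hypothetical, and all tasks are assumed to measure the same genes in the same column order.

```python
from statistics import mean, pstdev


def filter_scores(X, y):
    """Score each feature by |mean difference| / (sum of class std devs).

    X is a list of samples (lists of floats); y holds 0/1 class labels.
    This signal-to-noise style score is an assumed choice of filter.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in pos]
        b = [row[j] for row in neg]
        # Small constant avoids division by zero for constant features.
        scores.append(abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-9))
    return scores


def ranks(scores):
    """Convert scores to ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order):
        result[j] = position
    return result


def multitask_select(tasks, k):
    """Average each feature's rank over all tasks and keep the k best.

    tasks is a list of (X, y) pairs: the target dataset plus external ones.
    """
    per_task = [ranks(filter_scores(X, y)) for X, y in tasks]
    n_features = len(per_task[0])
    avg_rank = [mean(r[j] for r in per_task) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: avg_rank[j])[:k]
```

A gene that ranks well in several related tasks is kept even when the small target dataset alone would rank it ambiguously, which is the sense in which the external data lends strength.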
520 $a In the protein function prediction problem, gene functions are predicted from a predefined set of possible functions (e.g., the functions defined in the Gene Ontology). Gene function prediction is a complex classification problem characterized by the following aspects: (1) a single gene may have multiple functions; (2) the functions are organized in a hierarchy; (3) the training data for each function are unbalanced (far fewer positive than negative examples); (4) class labels are missing; (5) multiple biological data sources are available, such as microarray data, genome sequence, and protein-protein interactions. As participants in the 2011 Critical Assessment of Function Annotation (CAFA) challenge, our team achieved the highest AUC score among 45 groups. In the competition, we benefited from focusing on the fifth aspect of the problem. Thus, in this dissertation, I discuss several schemes to integrate the prediction scores from multiple data sources and show their results. Interestingly, the experimental results show that a simple averaging integration method is competitive with other state-of-the-art data integration methods.
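The simple averaging integration mentioned above can be sketched as follows, for a single function label. This is an illustration under assumptions, not the dissertation's code: each data source is represented as a dictionary from gene identifier to a prediction score, a gene missing from a source is simply skipped for that source, and AUC is computed by pairwise comparison. The names `average_scores` and `auc` and the gene identifiers in the usage note are hypothetical.

```python
def average_scores(source_scores):
    """Average per-gene prediction scores over the sources that cover the gene.

    source_scores is a list of dicts mapping gene id -> score, one per source.
    """
    genes = set().union(*source_scores)
    combined = {}
    for gene in genes:
        values = [scores[gene] for scores in source_scores if gene in scores]
        combined[gene] = sum(values) / len(values)
    return combined


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    gene has the higher score; ties count as half.
    """
    pos = [scores[g] for g in scores if labels[g] == 1]
    neg = [scores[g] for g in scores if labels[g] == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, with sources `{"g1": 0.9, "g2": 0.2, "g3": 0.6}`, `{"g1": 0.7, "g2": 0.4}`, and `{"g1": 0.8, "g2": 0.3, "g3": 0.1}`, gene `g3` is averaged over the two sources that score it.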
520 $a The original spatial scan algorithm is used to detect spatial overdensities, that is, to discover spatial subregions with significantly higher scores according to some density measure. This algorithm is widely used to identify clusters of disease cases (e.g., identifying environmental risk factors for childhood leukemia). However, the original spatial scan algorithm only works on static spatial data. In this dissertation, I propose one possible solution for spatial scan on movement data.
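To make the idea of scanning for overdensities concrete, here is a minimal sketch of a scan on static data only. It does not represent the movement-data solution, which the abstract does not describe. The sketch assumes a rectangular grid of cells with case counts and populations, axis-aligned rectangular candidate regions, and a Poisson log-likelihood ratio in the style of Kulldorff's scan statistic. The Monte Carlo significance test normally applied to the best region is omitted, and the names `poisson_llr` and `scan_rectangles` are hypothetical.

```python
import math


def poisson_llr(c, e, total):
    """Log-likelihood ratio for a region with c observed and e expected cases.

    Returns 0 unless the region has more cases than expected.
    """
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # When every case falls inside the region, the outside term vanishes.
    outside = (total - c) * math.log((total - c) / (total - e)) if c < total else 0.0
    return inside + outside


def scan_rectangles(cases, population):
    """Return (best score, (row0, col0, row1, col1)) over all grid rectangles."""
    rows, cols = len(cases), len(cases[0])
    total_cases = sum(map(sum, cases))
    total_pop = sum(map(sum, population))
    best = (0.0, None)
    for r0 in range(rows):
        for r1 in range(r0, rows):
            for c0 in range(cols):
                for c1 in range(c0, cols):
                    cells = [(r, k) for r in range(r0, r1 + 1)
                             for k in range(c0, c1 + 1)]
                    c = sum(cases[r][k] for r, k in cells)
                    p = sum(population[r][k] for r, k in cells)
                    # Skip empty regions and the whole map.
                    if p == 0 or p == total_pop:
                        continue
                    e = total_cases * p / total_pop
                    score = poisson_llr(c, e, total_cases)
                    if score > best[0]:
                        best = (score, (r0, c0, r1, c1))
    return best
```

The exhaustive enumeration is only practical for small grids; it is used here to keep the scoring step easy to follow.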
590 $a School code: 0225.
650 4 $a Computer Science. $3 626642
650 4 $a Biology, Bioinformatics. $3 1018415
650 4 $a Information Science. $3 1017528
690 $a 0984
690 $a 0715
690 $a 0723
710 2 $a Temple University. $b Computer and Information Science. $3 1065462
773 0 $t Dissertation Abstracts International $g 74-06B(E).
790 $a 0225
791 $a Ph.D.
792 $a 2012
793 $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3552336
Holdings:
Barcode:
W9263304
Location:
Electronic resources
Circulation category:
11. Online viewing_V
Material type:
E-book
Call number:
EB
Use type:
Normal
Loan status:
On shelf
Hold status:
0