Statistical learning in drug discovery via clustering and mixtures.
Record type:
Bibliographic - language material, printed : Monograph/item
Title/Author:
Statistical learning in drug discovery via clustering and mixtures.
Author:
Wang, Xu.
Extent:
208 p.
Notes:
Source: Dissertation Abstracts International, Volume: 69-01, Section: B, page: 0400.
Contained By:
Dissertation Abstracts International 69-01B.
Subject:
Biology, Bioinformatics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR36445
ISBN:
9780494364451
Thesis (Ph.D.)--University of Waterloo (Canada), 2007.
In drug discovery, thousands of compounds are assayed to detect activity against a biological target. The goal of drug discovery is to identify compounds that are active against the target (e.g. inhibit a virus). Statistical learning in drug discovery seeks to build a model that uses descriptors characterizing molecular structure to predict biological activity. However, the characteristics of drug discovery data can make it difficult to model the relationship between molecular descriptors and biological activity. Among these characteristics are the rarity of active compounds, the large volume of compounds tested by high-throughput screening, and the complexity of molecular structure and its relationship to activity.
MARC record:
LDR     04735nam 2200313 a 45
001     962260
005     20110830
008     110831s2007 ||||||||||||||||| ||eng d
020     $a 9780494364451
035     $a (UMI)AAINR36445
035     $a AAINR36445
040     $a UMI $c UMI
100 1   $a Wang, Xu. $3 1028898
245 10  $a Statistical learning in drug discovery via clustering and mixtures.
300     $a 208 p.
500     $a Source: Dissertation Abstracts International, Volume: 69-01, Section: B, page: 0400.
502     $a Thesis (Ph.D.)--University of Waterloo (Canada), 2007.
520     $a In drug discovery, thousands of compounds are assayed to detect activity against a biological target. The goal of drug discovery is to identify compounds that are active against the target (e.g. inhibit a virus). Statistical learning in drug discovery seeks to build a model that uses descriptors characterizing molecular structure to predict biological activity. However, the characteristics of drug discovery data can make it difficult to model the relationship between molecular descriptors and biological activity. Among these characteristics are the rarity of active compounds, the large volume of compounds tested by high-throughput screening, and the complexity of molecular structure and its relationship to activity.
520     $a This thesis focuses on the design of statistical learning algorithms/models and their applications to drug discovery. The two main parts of the thesis are: an algorithm-based statistical method and a more formal model-based approach. Both approaches can facilitate and accelerate the process of developing new drugs. A unifying theme is the use of unsupervised methods as components of supervised learning algorithms/models.
520     $a In the first part of the thesis, we explore a sequential screening approach, Cluster Structure-Activity Relationship Analysis (CSARA). Sequential screening integrates high-throughput screening with mathematical modeling to sequentially select the best compounds. CSARA is a cluster-based, algorithm-driven method. To gain further insight into this method, we use three carefully designed experiments to compare its predictive accuracy with that of Recursive Partitioning, a popular structure-activity relationship analysis method. The experiments show that CSARA outperforms Recursive Partitioning. Comparisons include problems with many descriptor sets and situations in which many descriptors are not important for activity.
520     $a In the second part of the thesis, we propose and develop constrained mixture discriminant analysis (CMDA), a model-based method. The main idea of CMDA is to model the distribution of the observations given the class label (e.g. active or inactive class) as a constrained mixture distribution, and then use Bayes' rule to predict the probability of being active for each observation in the testing set. Constraints are used to deal with the otherwise explosive growth of the number of parameters with increasing dimensionality. CMDA is designed to solve several challenges in modeling drug data sets, such as multiple mechanisms, the rare target problem (i.e. imbalanced classes), and the identification of relevant subspaces of descriptors (i.e. variable selection).
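The prediction step described in this paragraph (class-conditional mixture densities combined through Bayes' rule) can be illustrated with a minimal Python sketch. It is not the CMDA model itself: the abstract does not specify the constraints, so the sketch simply assumes Gaussian components with diagonal covariance, and all function names, parameter values, and the 2% prior on the active class are hypothetical choices for illustration. No fitting is performed; the mixture parameters are set by hand.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Row-wise log density of a Gaussian with diagonal covariance (assumed component form)."""
    x = np.atleast_2d(x)
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def log_mixture_density(x, weights, means, variances):
    """Log of sum_r w_r * N(x | mean_r, diag(var_r)), computed with log-sum-exp."""
    comp = np.stack([np.log(w) + log_gauss_diag(x, m, v)
                     for w, m, v in zip(weights, means, variances)])
    top = comp.max(axis=0)
    return top + np.log(np.exp(comp - top).sum(axis=0))

def posterior_active(x, prior_active, active_mix, inactive_mix):
    """Bayes' rule: P(active | x) from the class prior and the two class-conditional mixtures."""
    log_a = np.log(prior_active) + log_mixture_density(x, *active_mix)
    log_i = np.log1p(-prior_active) + log_mixture_density(x, *inactive_mix)
    return 1.0 / (1.0 + np.exp(log_i - log_a))

# Hypothetical toy setting: two "mechanisms" (components) for the rare active class,
# one broad component for the inactive class.
active_mix = ([0.5, 0.5],
              [np.array([2.0, 0.0]), np.array([-2.0, 0.0])],
              [np.array([0.25, 0.25]), np.array([0.25, 0.25])])
inactive_mix = ([1.0],
                [np.array([0.0, 0.0])],
                [np.array([4.0, 4.0])])

x_test = np.array([[2.0, 0.0],   # sits on one active component
                   [0.0, 5.0]])  # far from both active components
p = posterior_active(x_test, prior_active=0.02,
                     active_mix=active_mix, inactive_mix=inactive_mix)
print(p)
```

Even at the center of an active component, the posterior stays well below one because the prior on the active class is small, which is one way the rare target problem shows up in this kind of model.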
520     $a We focus on the CMDA1 model, in which univariate densities form the building blocks of the mixture components. Because the CMDA1 log-likelihood function is unbounded, the EM algorithm can easily converge to degenerate solutions. A special multi-step EM algorithm is therefore developed and explored via several experimental comparisons. Using the multi-step EM algorithm, the CMDA1 model is compared to model-based clustering discriminant analysis (MclustDA). The CMDA1 model is either superior to or competitive with the MclustDA model, depending on which model generates the data. The CMDA1 model performs better than the MclustDA model when the data are high-dimensional and unbalanced, an essential feature of the drug discovery problem.
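The unboundedness mentioned here is a general property of Gaussian mixtures with free component variances, and a small Python sketch can make it concrete. The example below is not the CMDA1 likelihood: it assumes a plain two-component univariate Gaussian mixture on a hypothetical toy grid of data, places one component mean exactly on a data point, and shrinks that component's standard deviation. The function name and all values are illustrative.

```python
import numpy as np

def mixture_loglik(x, w, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of the mixture w*N(mu1, sigma1^2) + (1-w)*N(mu2, sigma2^2)."""
    def log_norm(v, mu, sigma):
        return -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((v - mu) / sigma) ** 2
    a = np.log(w) + log_norm(x, mu1, sigma1)
    b = np.log1p(-w) + log_norm(x, mu2, sigma2)
    return np.logaddexp(a, b).sum()

# Toy data: an evenly spaced grid (an assumption, chosen so the result is deterministic).
x = np.linspace(-2.0, 2.0, 21)

# Component 1 is centered on the first data point; component 2 covers all the data.
sigmas = [1e-2, 1e-4, 1e-8, 1e-16]
logliks = [mixture_loglik(x, 0.5, x[0], s, x.mean(), x.std()) for s in sigmas]

for s, ll in zip(sigmas, logliks):
    print(f"sigma1 = {s:g}: log-likelihood = {ll:.2f}")
```

As sigma1 shrinks, the density at the matched data point grows like 1/sigma1 while the second component keeps every other point's contribution finite, so the log-likelihood increases without bound. An EM run that drifts toward such a configuration ends at a degenerate solution.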
520     $a An alternative approach to the problem of degeneracy is penalized estimation. By introducing a group of simple penalty functions, we consider penalized maximum likelihood estimation of the CMDA1 and CMDA2 models. This strategy improves the convergence of the conventional EM algorithm and helps avoid degenerate solutions. Extending techniques from Chen et al. (2007), we prove that the penalized maximum likelihood estimators (PMLEs) of the two-dimensional CMDA1 model can be asymptotically consistent.
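How a variance penalty removes the degeneracy can be sketched in Python. The abstract does not give the thesis's penalty functions, so the penalty below, -a * sum_k (s2 / sigma_k^2 + log(sigma_k^2)) with s2 the sample variance and a = 1/n, is an assumption chosen only for illustration, as are the function names and the toy data. The mixture is a plain two-component univariate Gaussian mixture, not the CMDA1 or CMDA2 model.

```python
import numpy as np

def mixture_loglik(x, w, mu, sigma):
    """Log-likelihood of a two-component univariate Gaussian mixture (length-2 arrays)."""
    z = (x[:, None] - mu) / sigma
    log_comp = np.log(w) - 0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * z ** 2
    return np.logaddexp(log_comp[:, 0], log_comp[:, 1]).sum()

def variance_penalty(sigma, s2, a):
    """Assumed penalty: tends to -infinity as any component variance tends to zero."""
    return -a * np.sum(s2 / sigma ** 2 + np.log(sigma ** 2))

# Toy data: an evenly spaced grid (hypothetical, chosen to keep the example deterministic).
x = np.linspace(-2.0, 2.0, 21)
s2 = x.var()
a = 1.0 / len(x)
w = np.array([0.5, 0.5])
mu = np.array([x[0], x.mean()])  # component 1 sits exactly on a data point

plain, penalized = [], []
for s in [1e-2, 1e-4, 1e-8]:
    sigma = np.array([s, np.sqrt(s2)])
    ll = mixture_loglik(x, w, mu, sigma)
    plain.append(ll)
    penalized.append(ll + variance_penalty(sigma, s2, a))

print("plain:    ", plain)
print("penalized:", penalized)
```

The plain log-likelihood keeps rising as the first component collapses onto a data point, whereas the penalized objective falls sharply, because the s2 / sigma^2 term grows much faster than the log(1 / sigma) gain. A maximizer of the penalized objective is therefore kept away from zero-variance components.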
590     $a School code: 1141.
650  4  $a Biology, Bioinformatics. $3 1018415
650  4  $a Statistics. $3 517247
690     $a 0463
690     $a 0715
710 2   $a University of Waterloo (Canada). $3 1017669
773 0   $t Dissertation Abstracts International $g 69-01B.
790     $a 1141
791     $a Ph.D.
792     $a 2007
856 40  $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR36445
Holdings (1 item):
Barcode: W9122615
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB W9122615
Usage type: Normal
Loan status: On shelf
Holds: 0