Statistical learning and data mining in biological databases.
Record Type: Language materials, printed : Monograph/item
Title/Author: Statistical learning and data mining in biological databases.
Author: Kim, Hyunjae Ryan.
Description: 84 p.
Notes: Advisers: Rajarathnam Chandramouli; Yingying Chen; Haibo He.
Contained By: Dissertation Abstracts International 70-03B.
Subject: Biology, Bioinformatics.
Online resource: http://pqdd.sinica.edu.tw/twdaoeng/servlet/advanced?query=3351557
ISBN: 9781109078626
Thesis (Ph.D.)--Stevens Institute of Technology, 2009.
This thesis explores (i) the feasibility of using communication theory models to understand the protein synthesis process from gene to protein, (ii) the search for a genetic error-control mechanism using error-correcting coding theory, and (iii) the detection of disease-related genetic errors using statistical learning methods on biological databases, namely EST (Expressed Sequence Tag) and SNP (Single Nucleotide Polymorphism) data. Several statistical tests are proposed and tested on various biological data. These include CUSUM (Cumulative Sum) detection of abrupt changes in a stochastic process, SVD (Singular Value Decomposition) for dimensionality reduction, and HMM-SVM (Hidden Markov Model-Support Vector Machine). We propose new disease diagnosis systems based on Gene Variation Analysis. The system consists of pre-processing, similarity search and clustering by EST analysis, and disease analysis by SNP classification. Pre-processing reduces the overall noise (vector contamination, low-complexity regions, repeats) in EST data to improve the efficacy of subsequent analysis. EST clustering and assembly with the CAP3 sequence assembly program collects overlapping ESTs from the same transcript to reduce redundancy. The assembled ESTs, called consensus EST sequences, are merged based on clone-identification data to obtain the best putative gene representation. Detailed test results on several biological databases are used to draw key conclusions about the proposed mathematical analyses.
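The abstract names CUSUM detection of abrupt changes but this record gives no formulation. As a rough illustration only, a textbook one-sided CUSUM detector might look like the sketch below. The function name and the `target_mean`, `slack` and `threshold` parameters are assumptions chosen for this example and are not taken from the thesis.

```python
def cusum_change_point(samples, target_mean, slack, threshold):
    """Illustrative one-sided CUSUM (not the thesis's own method).

    Accumulates positive deviations of each sample from
    `target_mean + slack`, resetting to zero whenever the running sum
    would go negative. Returns the index of the first sample at which
    the running sum exceeds `threshold`, or None if it never does.
    """
    running_sum = 0.0
    for index, value in enumerate(samples):
        running_sum = max(0.0, running_sum + (value - target_mean - slack))
        if running_sum > threshold:
            return index
    return None
```

For a sequence of ten zeros followed by ten values of 2.0, with `target_mean=0.0`, `slack=0.5` and `threshold=4.0`, the running sum grows by 1.5 per sample after the shift and the function returns index 12, the third sample after the change.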
ISBN: 9781109078626
Subjects--Topical Terms: Biology, Bioinformatics.
LDR :02480nam 2200301 a 45
001 857106
005 20100709
008 100709s2009 ||||||||||||||||| ||eng d
020 $a 9781109078626
035 $a (UMI)AAI3351557
035 $a AAI3351557
040 $a UMI $c UMI
100 1 $a Kim, Hyunjae Ryan. $3 1024032
245 1 0 $a Statistical learning and data mining in biological databases.
300 $a 84 p.
500 $a Advisers: Rajarathnam Chandramouli; Yingying Chen; Haibo He.
500 $a Source: Dissertation Abstracts International, Volume: 70-03, Section: B, page: 1755.
502 $a Thesis (Ph.D.)--Stevens Institute of Technology, 2009.
520 $a This thesis explores (i) the feasibility of using communication theory models to understand the protein synthesis process from gene to protein, (ii) the search for a genetic error-control mechanism using error-correcting coding theory, and (iii) the detection of disease-related genetic errors using statistical learning methods on biological databases, namely EST (Expressed Sequence Tag) and SNP (Single Nucleotide Polymorphism) data. Several statistical tests are proposed and tested on various biological data. These include CUSUM (Cumulative Sum) detection of abrupt changes in a stochastic process, SVD (Singular Value Decomposition) for dimensionality reduction, and HMM-SVM (Hidden Markov Model-Support Vector Machine). We propose new disease diagnosis systems based on Gene Variation Analysis. The system consists of pre-processing, similarity search and clustering by EST analysis, and disease analysis by SNP classification. Pre-processing reduces the overall noise (vector contamination, low-complexity regions, repeats) in EST data to improve the efficacy of subsequent analysis. EST clustering and assembly with the CAP3 sequence assembly program collects overlapping ESTs from the same transcript to reduce redundancy. The assembled ESTs, called consensus EST sequences, are merged based on clone-identification data to obtain the best putative gene representation. Detailed test results on several biological databases are used to draw key conclusions about the proposed mathematical analyses.
590 $a School code: 0733.
650 4 $a Biology, Bioinformatics. $3 1018415
650 4 $a Computer Science. $3 626642
690 $a 0715
690 $a 0984
710 2 $a Stevens Institute of Technology. $3 1019501
773 0 $t Dissertation Abstracts International $g 70-03B.
790 $a 0733
790 1 0 $a Chandramouli, Rajarathnam, $e advisor
790 1 0 $a Chen, Yingying, $e advisor
790 1 0 $a He, Haibo, $e advisor
791 $a Ph.D.
792 $a 2009
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoeng/servlet/advanced?query=3351557
Inventory Number: W9072267
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB W9072267
Usage Class: General use (Normal)
Loan Status: On shelf
No. of reservations: 0