Application of language technologies in biology: Feature extraction and modeling for transmembrane helix prediction.
Record Type: Electronic resources : Monograph/item
Title/Author: Application of language technologies in biology: Feature extraction and modeling for transmembrane helix prediction.
Author: Ganapathiraju, Madhavi K.
Description: 155 p.
Notes: Source: Dissertation Abstracts International, Volume: 68-06, Section: B, page: 3482.
Contained By: Dissertation Abstracts International 68-06B.
Subject: Biology, Molecular.
Online resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3269539
ISBN: 9780549086574
Thesis (Ph.D.)--Carnegie Mellon University, 2007.
This thesis provides new insights into the application of algorithms developed for language processing to the problem of mapping protein sequences to their structure and function, in direct analogy to the mapping of words to meaning in natural language. While language algorithms, most notably hidden Markov models, have been applied in computational biology before, there has been no systematic investigation to date of what the appropriate word equivalents and vocabularies in biology are. In this thesis, we consider amino acids, chemical vocabularies and amino acid properties as fundamental building blocks of protein sequence language, and study n-grams, other positional word-associations and latent semantic analysis for the prediction of transmembrane helices.
LDR 03131nmm 2200301 4500
001 1835908
005 20080107105549.5
008 130610s2007 eng d
020 $a 9780549086574
035 $a (UMI)AAI3269539
035 $a AAI3269539
040 $a UMI $c UMI
100 1 $a Ganapathiraju, Madhavi K. $3 1924528
245 1 0 $a Application of language technologies in biology: Feature extraction and modeling for transmembrane helix prediction.
300 $a 155 p.
500 $a Source: Dissertation Abstracts International, Volume: 68-06, Section: B, page: 3482.
502 $a Thesis (Ph.D.)--Carnegie Mellon University, 2007.
520 $a This thesis provides new insights into the application of algorithms developed for language processing to the problem of mapping protein sequences to their structure and function, in direct analogy to the mapping of words to meaning in natural language. While language algorithms, most notably hidden Markov models, have been applied in computational biology before, there has been no systematic investigation to date of what the appropriate word equivalents and vocabularies in biology are. In this thesis, we consider amino acids, chemical vocabularies and amino acid properties as fundamental building blocks of protein sequence language, and study n-grams, other positional word-associations and latent semantic analysis for the prediction of transmembrane helices.
520 $a First, a toolkit referred to as the Biological Language Modeling Toolkit has been developed for biological sequence analysis through amino acid n-gram and amino acid word-association analysis. N-gram comparisons across genomes showed that biological sequence language differs from organism to organism, and resulted in the identification of genome signatures.
520 $a Next, we used a biologically well-established mapping problem, namely the mapping of protein sequences to their secondary structures, to quantitatively compare the utility of different fundamental building blocks in representing protein sequences. We found that the different vocabularies capture different aspects of protein secondary structure best. Finally, the conclusions from the study of biological vocabularies were used, in combination with latent semantic analysis and signal processing techniques, to address the biologically important but technically challenging and unsolved problem of predicting transmembrane segments.
520 $a This work led to the development of TMpro, which reduces the transmembrane segment prediction error rate by 20-50% compared to previous state-of-the-art methods. The method is a novel approach that analyzes amino acid property sequences rather than amino acid sequences. Following our work, it has already been applied by others to protein remote homology detection and protein structural type classification.
590 $a School code: 0041.
650 4 $a Biology, Molecular. $3 1017719
650 4 $a Biology, Bioinformatics. $3 1018415
650 4 $a Computer Science. $3 626642
690 $a 0307
690 $a 0715
690 $a 0984
710 2 $a Carnegie Mellon University. $3 1018096
773 0 $t Dissertation Abstracts International $g 68-06B.
790 $a 0041
791 $a Ph.D.
792 $a 2007
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3269539
Items
Inventory Number: W9226928
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: General use (Normal)
Loan Status: On shelf
No. of reservations: 0