Dong Hwa University Library

Application of statistical properties of short sequences in the analysis of 16S ribosomal RNA and the identification of bacteria.

Record Type:	Language materials, printed : Monograph/item
Title/Author:	Application of statistical properties of short sequences in the analysis of 16S ribosomal RNA and the identification of bacteria.
Author:	Zhu, Dianhui.
Description:	172 p.
Notes:	Source: Dissertation Abstracts International, Volume: 68-09, Section: B, page: 6101.
Contained By:	Dissertation Abstracts International 68-09B.
Subject:	Biology, Bioinformatics.
Online resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3279600
ISBN:	9780549206729


Thesis (Ph.D.)--University of Houston, 2007.

Comparisons of 16S ribosomal RNA (16S rRNA) are widely used to characterize relationships between bacteria and to identify unknown bacteria. The dramatically increasing number of 16S rRNA sequences and the large number of species present numerous computational challenges, many of which are addressed here. First, to facilitate efforts to characterize the sequences of hundreds of 16S rRNAs at a time, a software tool, STITCH, was developed. STITCH automates the process of splicing sequences obtained from reverse and forward primer reads and automatically searches the resulting sequence against the NCBI online database or a local database of type strains. STITCH has been used to process over 4,000 sequences. Second, an efficient software tool known as ProkProbePicker (PPP) was developed to rapidly design probe-target n-mers for all major groupings in a known phylogenetic tree using a fast string search based on the Karp-Rabin algorithm. When parallelized, the run time for this algorithm was reduced from 87 hours to 67 minutes. Third, to rapidly characterize the similarity of large numbers of 16S rRNA sequences, alignment-independent comparisons using n-mers were examined. Three measures of distance were considered: the linear correlation coefficient, the Angle distance, and the Manhattan distance. The Angle distance measure using 6-mers gave the best correlation with standard alignment-based methods and was therefore used to identify clusters of similar 16S rRNA sequences among over 300,000 database entries. Finally, an evolutionary computing approach was used to design universal arrays of 16S rRNA target subsequences that can be used to place any unidentified bacterium in the known phylogenetic context of a group of 16S rRNA sequences that represent the major clusters found with the Angle distance measure. A target set consisting of 703 20-mers was identified that was able to place an unknown organism within five tree nodes of the correct location over half the time. A larger array of 6,011 20-mers achieved accuracy that was very close to the maximum obtainable accuracy.
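
The alignment-independent comparison named in the abstract can be illustrated with a short sketch. The record does not define the three measures, so the code below makes assumptions: the Angle distance is taken to be the angle (arccos of the cosine similarity) between n-mer frequency vectors, the correlation measure is turned into a distance as 1 minus the Pearson coefficient, and all function names are hypothetical. It is not the dissertation's implementation.

```python
import math
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: RNA is handled by mapping U to T


def nmer_profile(seq, n=6):
    """Relative frequency of every possible n-mer, in a fixed lexicographic order."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(seq[i:i + n] for i in range(len(seq) - n + 1))
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    # n-mers containing ambiguous bases (e.g. N) are not in keys and are ignored.
    total = sum(counts[k] for k in keys)
    if total == 0:
        raise ValueError("sequence contains no valid n-mers")
    return [counts[k] / total for k in keys]


def manhattan_distance(p, q):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def angle_distance(p, q):
    """Angle in radians between two vectors (assumed meaning of 'Angle distance')."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def correlation_distance(p, q):
    """1 minus the Pearson correlation (assumed way to use r as a distance)."""
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    cov = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    sp = math.sqrt(sum((a - mp) ** 2 for a in p))
    sq = math.sqrt(sum((b - mq) ** 2 for b in q))
    if sp == 0 or sq == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / (sp * sq)


if __name__ == "__main__":
    # Two made-up fragments, used only to exercise the functions.
    a = nmer_profile("AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGC", n=3)
    b = nmer_profile("AGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGC", n=3)
    print(angle_distance(a, b), manhattan_distance(a, b), correlation_distance(a, b))
```

With 6-mers each profile has 4^6 = 4,096 entries, so any pair of sequences can be compared in fixed time without computing an alignment, which is what makes this kind of measure practical for hundreds of thousands of database entries.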

ISBN: 9780549206729

Subjects--Topical Terms:

1018415
Biology, Bioinformatics.

LDR:02971nam 2200253 a 45
001 861708
005 20100720
008 100720s2007 ||||||||||||||||| ||eng d
020 $a 9780549206729
035 $a (UMI)AAI3279600
035 $a AAI3279600
040 $a UMI $c UMI
100 1 $a Zhu, Dianhui. $3 1029431
245 1 0 $a Application of statistical properties of short sequences in the analysis of 16S ribosomal RNA and the identification of bacteria.
300 $a 172 p.
500 $a Source: Dissertation Abstracts International, Volume: 68-09, Section: B, page: 6101.
502 $a Thesis (Ph.D.)--University of Houston, 2007.
520 $a Comparisons of 16S ribosomal RNA (16S rRNA) are widely used to characterize relationships between bacteria and to identify unknown bacteria. The dramatically increasing number of 16S rRNA sequences and the large number of species present numerous computational challenges, many of which are addressed here. First, to facilitate efforts to characterize the sequences of hundreds of 16S rRNAs at a time, a software tool, STITCH, was developed. STITCH automates the process of splicing sequences obtained from reverse and forward primer reads and automatically searches the resulting sequence against the NCBI online database or a local database of type strains. STITCH has been used to process over 4,000 sequences. Second, an efficient software tool known as ProkProbePicker (PPP) was developed to rapidly design probe-target n-mers for all major groupings in a known phylogenetic tree using a fast string search based on the Karp-Rabin algorithm. When parallelized, the run time for this algorithm was reduced from 87 hours to 67 minutes. Third, to rapidly characterize the similarity of large numbers of 16S rRNA sequences, alignment-independent comparisons using n-mers were examined. Three measures of distance were considered: the linear correlation coefficient, the Angle distance, and the Manhattan distance. The Angle distance measure using 6-mers gave the best correlation with standard alignment-based methods and was therefore used to identify clusters of similar 16S rRNA sequences among over 300,000 database entries. Finally, an evolutionary computing approach was used to design universal arrays of 16S rRNA target subsequences that can be used to place any unidentified bacterium in the known phylogenetic context of a group of 16S rRNA sequences that represent the major clusters found with the Angle distance measure. A target set consisting of 703 20-mers was identified that was able to place an unknown organism within five tree nodes of the correct location over half the time. A larger array of 6,011 20-mers achieved accuracy that was very close to the maximum obtainable accuracy.
590 $a School code: 0087.
650 4 $a Biology, Bioinformatics. $3 1018415
650 4 $a Computer Science. $3 626642
690 $a 0715
690 $a 0984
710 2 $a University of Houston. $3 1019266
773 0 $t Dissertation Abstracts International $g 68-09B.
790 $a 0087
791 $a Ph.D.
792 $a 2007
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3279600