National Dong Hwa University Library

The University of Nebraska - Lincoln., Computer Science.

Biological sequence analyses - Theory, algorithms, and applications.

Record Type:	Electronic resources : Monograph/item
Title/Author:	Biological sequence analyses - Theory, algorithms, and applications./
Author:	Ma, Fangrui.
Description:	248 p.
Notes:	Adviser: Jitender S. Deogun.
Contained By:	Dissertation Abstracts International 70-06B.
Subject:	Biology, Bioinformatics.
Online resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3360173
ISBN:	9781109245400

Thesis (Ph.D.)--The University of Nebraska - Lincoln, 2009.

As more biological sequences become available, sequence analysis has become central to bioinformatics and computational biology. In this dissertation, we present the results of our research on genome sequence alignment, RNA folding with simple pseudoknots, and single nucleotide polymorphism (SNP) association pattern discovery with spatial constraints. For genome sequence alignment, we use a divide-and-conquer approach to reduce the computational complexity of multiple whole-genome sequence alignment. There are three major steps: finding candidate anchors, aligning multiple anchor sequences, and closing the gaps between the aligned anchors. The candidate anchors are computed using a suffix tree/array method. Multiple anchor sequences are then aligned using efficient graph-theoretic algorithms, and ClustalW is used to close the gaps. The experiments showed that the algorithms find the alignment correctly and that the longest path algorithm is more efficient than the maximum clique algorithm. A comparison with other closely related programs showed that our programs run faster. Furthermore, we introduce the concept of the solution space of genome sequence alignment to address the problem that current genome sequence alignment algorithms do not consider spatial constraints. The solution space is modeled as a multi-bipartite digraph. We provide efficient graph decomposition and traversal algorithms that process the graph and output solutions as alignments of functionally equivalent clusters. To find the maximum alignment among them, we developed an O(qn^2) algorithm for finding the maximum edge q-clique in the graph. The most conserved sites between genome sequences are equivalent to the minimum level cut of the graph. For RNA folding, we developed a dynamic programming algorithm for predicting simple pseudoknots in the optimal secondary structure of a single RNA sequence using standard thermodynamic parameters. The algorithm has time and space complexities of O(n^4) and O(n^3), respectively. In a comparison with other methods using PseudoBase, our method is more accurate. In the last part of the dissertation, we present a tool for discovering SNP association patterns with spatial constraints using the association mining method.
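
The abstract mentions aligning anchors with a longest path algorithm but does not give its details. As a rough illustration only, the sketch below chains anchors between two sequences by finding a maximum-weight path in a directed acyclic graph. It is not the dissertation's multi-genome method. The `Anchor` tuple layout, the function name `chain_anchors`, and the rule that one anchor may follow another only if it starts after the other ends in both sequences are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical anchor representation: (start in sequence A, start in sequence B, length).
# Lengths are assumed to be positive.
Anchor = Tuple[int, int, int]


def chain_anchors(anchors: List[Anchor]) -> List[Anchor]:
    """Return a chain of collinear, non-overlapping anchors with maximum total length."""
    if not anchors:
        return []
    # Anchor i may precede anchor j only if i ends before j starts in sequence A,
    # so sorting by start position in A yields a topological order of the DAG.
    order = sorted(anchors)
    n = len(order)
    best = [a[2] for a in order]  # best[j]: weight of the heaviest chain ending at j
    prev = [-1] * n               # prev[j]: predecessor of j on that chain, or -1
    for j in range(n):
        aj, bj, lj = order[j]
        for i in range(j):
            ai, bi, li = order[i]
            # Edge i -> j exists when i ends before j starts in both sequences.
            if ai + li <= aj and bi + li <= bj and best[i] + lj > best[j]:
                best[j] = best[i] + lj
                prev[j] = i
    # Backtrack from the end of the heaviest chain.
    j = max(range(n), key=best.__getitem__)
    chain = []
    while j != -1:
        chain.append(order[j])
        j = prev[j]
    return chain[::-1]


if __name__ == "__main__":
    example = [(0, 0, 5), (6, 20, 10), (7, 6, 4), (12, 11, 7), (30, 30, 3)]
    print(chain_anchors(example))
```

This quadratic version favors readability. Chaining over many anchors would normally use a faster data structure, and the gaps between chained anchors would still need a separate aligner, as the abstract describes.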

ISBN: 9781109245400

Subjects--Topical Terms:

1018415
Biology, Bioinformatics.

Biological sequence analyses - Theory, algorithms, and applications.
LDR:03313nmm 2200313 a 45 001 866852
005 20100802
008 100802s2009 ||||||||||||||||| ||eng d
020 $a 9781109245400
035 $a (UMI)AAI3360173
035 $a AAI3360173
040 $a UMI $c UMI
100 1 $a Ma, Fangrui. $3 1035527
245 1 0 $a Biological sequence analyses - Theory, algorithms, and applications.
300 $a 248 p.
500 $a Adviser: Jitender S. Deogun.
500 $a Source: Dissertation Abstracts International, Volume: 70-06, Section: B, page: .
502 $a Thesis (Ph.D.)--The University of Nebraska - Lincoln, 2009.
520 $a As more biological sequences become available, sequence analysis has become central to bioinformatics and computational biology. In this dissertation, we present the results of our research on genome sequence alignment, RNA folding with simple pseudoknots, and single nucleotide polymorphism (SNP) association pattern discovery with spatial constraints. For genome sequence alignment, we use a divide-and-conquer approach to reduce the computational complexity of multiple whole-genome sequence alignment. There are three major steps: finding candidate anchors, aligning multiple anchor sequences, and closing the gaps between the aligned anchors. The candidate anchors are computed using a suffix tree/array method. Multiple anchor sequences are then aligned using efficient graph-theoretic algorithms, and ClustalW is used to close the gaps. The experiments showed that the algorithms find the alignment correctly and that the longest path algorithm is more efficient than the maximum clique algorithm. A comparison with other closely related programs showed that our programs run faster. Furthermore, we introduce the concept of the solution space of genome sequence alignment to address the problem that current genome sequence alignment algorithms do not consider spatial constraints. The solution space is modeled as a multi-bipartite digraph. We provide efficient graph decomposition and traversal algorithms that process the graph and output solutions as alignments of functionally equivalent clusters. To find the maximum alignment among them, we developed an O(qn^2) algorithm for finding the maximum edge q-clique in the graph. The most conserved sites between genome sequences are equivalent to the minimum level cut of the graph. For RNA folding, we developed a dynamic programming algorithm for predicting simple pseudoknots in the optimal secondary structure of a single RNA sequence using standard thermodynamic parameters. The algorithm has time and space complexities of O(n^4) and O(n^3), respectively. In a comparison with other methods using PseudoBase, our method is more accurate. In the last part of the dissertation, we present a tool for discovering SNP association patterns with spatial constraints using the association mining method.
590 $a School code: 0138.
650 4 $a Biology, Bioinformatics. $3 1018415
650 4 $a Computer Science. $3 626642
690 $a 0715
690 $a 0984
710 2 $a The University of Nebraska - Lincoln. $b Computer Science. $3 1035526
773 0 $t Dissertation Abstracts International $g 70-06B.
790 $a 0138
790 1 0 $a Deogun, Jitender S., $e advisor
790 1 0 $a Elbaum, Sebastian $e committee member
790 1 0 $a Moriyama, Etsuko $e committee member
790 1 0 $a Scott, Stephen $e committee member
791 $a Ph.D.
792 $a 2009
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3360173