語系:
繁體中文
English
說明(常見問題)
回圖書館首頁
手機版館藏查詢
登入
回首頁
切換:
標籤
|
MARC模式
|
ISBD
Suffix trees and suffix arrays in pr...
~
Ko, Pang.
FindBook
Google Book
Amazon
博客來
Suffix trees and suffix arrays in primary and secondary storage.
紀錄類型:
書目-語言資料,印刷品 : Monograph/item
正題名/作者:
Suffix trees and suffix arrays in primary and secondary storage./
作者:
Ko, Pang.
面頁冊數:
69 p.
附註:
Adviser: Srinivas Aluru.
Contained By:
Dissertation Abstracts International68-07B.
標題:
Biology, Bioinformatics. -
電子資源:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3274885
ISBN:
9780549154792
Suffix trees and suffix arrays in primary and secondary storage.
Ko, Pang.
Suffix trees and suffix arrays in primary and secondary storage.
- 69 p.
Adviser: Srinivas Aluru.
Thesis (Ph.D.)--Iowa State University, 2007.
In recent years the volume of string data has increased exponentially, and the speed at which these data is being generated has also increased. Some examples of string data includes biological sequences, internet webpages, and digitalized documents, to name a few. The indexing of biological sequence data is especially challenging due to the lack of natural word and sentence boundaries. Although many algorithms are able to deal with this lack of natural boundaries, they are not able to process the large quantity of data in reasonable time. To speed up the runtime of these algorithms, suffix trees and suffix arrays are routinely used to generate a set of starting positions quickly and/or narrow down the set of possibilities need to be considered.
ISBN: 9780549154792Subjects--Topical Terms:
1018415
Biology, Bioinformatics.
Suffix trees and suffix arrays in primary and secondary storage.
LDR
:04170nam 2200313 a 45
001
957624
005
20110630
008
110630s2007 ||||||||||||||||| ||eng d
020
$a
9780549154792
035
$a
(UMI)AAI3274885
035
$a
AAI3274885
040
$a
UMI
$c
UMI
100
1
$a
Ko, Pang.
$3
1280969
245
1 0
$a
Suffix trees and suffix arrays in primary and secondary storage.
300
$a
69 p.
500
$a
Adviser: Srinivas Aluru.
500
$a
Source: Dissertation Abstracts International, Volume: 68-07, Section: B, page: 4595.
502
$a
Thesis (Ph.D.)--Iowa State University, 2007.
520
$a
In recent years the volume of string data has increased exponentially, and the speed at which these data is being generated has also increased. Some examples of string data includes biological sequences, internet webpages, and digitalized documents, to name a few. The indexing of biological sequence data is especially challenging due to the lack of natural word and sentence boundaries. Although many algorithms are able to deal with this lack of natural boundaries, they are not able to process the large quantity of data in reasonable time. To speed up the runtime of these algorithms, suffix trees and suffix arrays are routinely used to generate a set of starting positions quickly and/or narrow down the set of possibilities need to be considered.
520
$a
The first contribution of this dissertation is a linear time algorithm to sort all the suffixes of a string over a large alphabet of integers. The sorted order of suffixes of a string is also called suffix array, a data structure introduced by Manber and Myers that has numerous applications in pattern matching, string processing, and computational biology. Though the suffix tree of a string can be constructed in linear time and the sorted order of suffixes derived from it, a direct algorithm for suffix sorting is of great interest due to the space requirements of suffix trees. Our result is one of the first linear time suffix array construction algorithms, which improve upon the previously known O(n log n) time direct algorithms for suffix sorting. It can also be used to derive a different linear time construction algorithm for suffix trees. Apart from being simple and applicable for alphabets not necessarily of fixed size, this method of constructing suffix trees is more space efficient.
520
$a
The second contribution of this dissertation is providing a new suffix tree layout scheme for secondary storage and present construction, substring search, insertion and deletion algorithms using this layout scheme. For a set of strings of total length n, a pattern p and disk blocks of size B, we provide a substring search algorithm that uses O(|p|/ B +logB n) disk accesses. We present algorithms for insertion and deletion of all suffixes of a string of length m that take O(mlogB( n + m)) and O(mlog B n) disk accesses, respectively. Our results demonstrate that suffix trees can be directly used as efficient secondary storage data structures for string and sequence data.
520
$a
The last contribution of this dissertation is providing a self-adjusting variant of our layout scheme for suffix trees in secondary storage that provides optimal number of disk accesses for a sequence of string or substring queries. This has been an open problem since Sleator and Tarjan presented their splaying technique to create self-adjusting binary search trees in 1985. In addition to resolving this open problem, our scheme provides two additional advantages: (1) The partitions are slowly readjusted, requiring fewer disk accesses than splaying methods, and (2) the initial state of the layout is balanced, making it useful even when the sequence of queries is not highly skewed. Our layout scheme, and its self-adjusting variant are also applicable to PATRICIA trees, and potentially to other data structures.
590
$a
School code: 0097.
650
4
$a
Biology, Bioinformatics.
$3
1018415
650
4
$a
Computer Science.
$3
626642
690
$a
0715
690
$a
0984
710
2
$a
Iowa State University.
$b
Electrical and Computer Engineering.
$3
1018524
773
0
$t
Dissertation Abstracts International
$g
68-07B.
790
$a
0097
790
1 0
$a
Aluru, Srinivas,
$e
advisor
791
$a
Ph.D.
792
$a
2007
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3274885
筆 0 讀者評論
館藏地:
全部
電子資源
出版年:
卷號:
館藏
1 筆 • 頁數 1 •
1
條碼號
典藏地名稱
館藏流通類別
資料類型
索書號
使用類型
借閱狀態
預約狀態
備註欄
附件
W9121289
電子資源
11.線上閱覽_V
電子書
EB W9121289
一般使用(Normal)
在架
0
1 筆 • 頁數 1 •
1
多媒體
評論
新增評論
分享你的心得
Export
取書館
處理中
...
變更密碼
登入