On Optimizing LSM-Based Storage for Big Data Management Systems.
Record type:
Bibliographic - electronic resource : Monograph/item
Title/Author:
On Optimizing LSM-Based Storage for Big Data Management Systems./
Author:
Luo, Chen.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2020
Pagination:
202 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Contained By:
Dissertations Abstracts International 83-02B.
Subject:
Computer science.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28157520
ISBN:
9798522945879
Thesis (Ph.D.)--University of California, Irvine, 2020.
This item must not be sold to any third party vendors.
In recent years, the Log-Structured Merge-tree (LSM-tree) has been widely used in the storage layer in modern NoSQL systems. Different from traditional index structures that apply updates in-place, an LSM-tree first buffers all writes in memory and subsequently flushes them to disk and merges them using sequential I/Os. This out-of-place update design brings a number of advantages, including superior write performance, high space utilization, tunability, and simplification of concurrency control and recovery. These advantages have enabled LSM-trees to serve a large variety of workloads in production systems.
Despite the popularity of LSM-trees, the existing research efforts have been primarily focusing on improving the maximum throughput of LSM-trees in simple key-value store settings. This leads to several outages when adopting LSM-based storage techniques in a Big Data Management System (BDMS) with multiple heterogeneous LSM-trees and requiring performance metrics beyond just the maximum throughput.
In this dissertation, we focus on optimizing LSM-trees for BDMSs. We first propose a set of techniques to efficiently maintain and exploit LSM-based auxiliary structures, including secondary indexes and filters. These techniques include a series of optimizations for efficient batched point lookups, significantly improving the range of applicability of LSM-based secondary indexes, and several new and efficient maintenance strategies to maintain LSM-based auxiliary structures to accommodate the out-of-place update nature of LSM-trees.
In addition to maximum throughput, performance stability measures, such as percentile latency, are another important performance metric for storage systems. However, LSM-trees often exhibit large performance variance due to periodic write stalls. To tackle this problem, we propose a simple yet effective two-phase approach to evaluate write stalls of LSM-trees. We further explore the design choices of merge schedulers for various LSM-tree designs to minimize write stalls given a disk bandwidth budget.
Finally, we present adaptive memory management techniques to break down the memory walls in LSM-based storage systems, enabling the shared management of memory components of multiple datasets and adaptive memory allocation between memory components and the disk buffer cache. These techniques together have successfully reduced the disk I/O cost of LSM-based storage systems, improving the system performance and efficiency.
ISBN: 9798522945879
Subjects--Topical Terms:
523869
Computer science.
Subjects--Index Terms:
Database systems
LDR      03649nmm a2200373 4500
001      2282144
005      20211001100708.5
008      220723s2020 ||||||||||||||||| ||eng d
020      $a 9798522945879
035      $a (MiAaPQ)AAI28157520
035      $a AAI28157520
040      $a MiAaPQ $c MiAaPQ
100 1    $a Luo, Chen. $3 3560900
245 1 0  $a On Optimizing LSM-Based Storage for Big Data Management Systems.
260 1    $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300      $a 202 p.
500      $a Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
500      $a Advisor: Carey, Michael J.
502      $a Thesis (Ph.D.)--University of California, Irvine, 2020.
506      $a This item must not be sold to any third party vendors.
590      $a School code: 0030.
650 4    $a Computer science. $3 523869
650 4    $a Electrical engineering. $3 649834
650 4    $a Information science. $3 554358
650 4    $a Trees. $3 516384
650 4    $a Concurrency control. $3 3560901
653      $a Database systems
653      $a Indexing
653      $a Log-structured merge-tree
653      $a Memory management
653      $a Storage management
690      $a 0984
690      $a 0544
690      $a 0723
710 2    $a University of California, Irvine. $b Computer Science - Ph.D.. $3 2099281
773 0    $t Dissertations Abstracts International $g 83-02B.
790      $a 0030
791      $a Ph.D.
792      $a 2020
793      $a English
856 4 0  $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28157520
Holdings (1 item)
Barcode: W9433877
Location: Electronic resource
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Holds: 0