On Optimizing LSM-Based Storage for Big Data Management Systems.
Record Type:
Electronic resources : Monograph/item
Title/Author:
On Optimizing LSM-Based Storage for Big Data Management Systems.
Author:
Luo, Chen.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2020
Description:
202 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Contained By:
Dissertations Abstracts International, 83-02B.
Subject:
Computer science.
Online resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28157520
ISBN:
9798522945879
Luo, Chen.
On Optimizing LSM-Based Storage for Big Data Management Systems.
- Ann Arbor : ProQuest Dissertations & Theses, 2020 - 202 p.
Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
Thesis (Ph.D.)--University of California, Irvine, 2020.
This item must not be sold to any third party vendors.
In recent years, the Log-Structured Merge-tree (LSM-tree) has been widely used in the storage layer in modern NoSQL systems. Different from traditional index structures that apply updates in-place, an LSM-tree first buffers all writes in memory and subsequently flushes them to disk and merges them using sequential I/Os. This out-of-place update design brings a number of advantages, including superior write performance, high space utilization, tunability, and simplification of concurrency control and recovery. These advantages have enabled LSM-trees to serve a large variety of workloads in production systems.
Despite the popularity of LSM-trees, the existing research efforts have been primarily focusing on improving the maximum throughput of LSM-trees in simple key-value store settings. This leads to several outages when adopting LSM-based storage techniques in a Big Data Management System (BDMS) with multiple heterogeneous LSM-trees and requiring performance metrics beyond just the maximum throughput.
In this dissertation, we focus on optimizing LSM-trees for BDMSs. We first propose a set of techniques to efficiently maintain and exploit LSM-based auxiliary structures, including secondary indexes and filters. These techniques include a series of optimizations for efficient batched point lookups, significantly improving the range of applicability of LSM-based secondary indexes, and several new and efficient maintenance strategies to maintain LSM-based auxiliary structures to accommodate the out-of-place update nature of LSM-trees.
In addition to maximum throughput, performance stability measures, such as percentile latency, are another important performance metric for storage systems. However, LSM-trees often exhibit large performance variance due to periodic write stalls. To tackle this problem, we propose a simple yet effective two-phase approach to evaluate write stalls of LSM-trees. We further explore the design choices of merge schedulers for various LSM-tree designs to minimize write stalls given a disk bandwidth budget.
Finally, we present adaptive memory management techniques to break down the memory walls in LSM-based storage systems, enabling the shared management of memory components of multiple datasets and adaptive memory allocation between memory components and the disk buffer cache. These techniques together have successfully reduced the disk I/O cost of LSM-based storage systems, improving the system performance and efficiency.
ISBN: 9798522945879
Subjects--Topical Terms:
523869
Computer science.
Subjects--Index Terms:
Database systems
LDR 03649nmm a2200373 4500
001 2282144
005 20211001100708.5
008 220723s2020 ||||||||||||||||| ||eng d
020 $a 9798522945879
035 $a (MiAaPQ)AAI28157520
035 $a AAI28157520
040 $a MiAaPQ $c MiAaPQ
100 1 $a Luo, Chen. $3 3560900
245 1 0 $a On Optimizing LSM-Based Storage for Big Data Management Systems.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300 $a 202 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-02, Section: B.
500 $a Advisor: Carey, Michael J.
502 $a Thesis (Ph.D.)--University of California, Irvine, 2020.
506 $a This item must not be sold to any third party vendors.
520 $a In recent years, the Log-Structured Merge-tree (LSM-tree) has been widely used in the storage layer in modern NoSQL systems. Different from traditional index structures that apply updates in-place, an LSM-tree first buffers all writes in memory and subsequently flushes them to disk and merges them using sequential I/Os. This out-of-place update design brings a number of advantages, including superior write performance, high space utilization, tunability, and simplification of concurrency control and recovery. These advantages have enabled LSM-trees to serve a large variety of workloads in production systems. Despite the popularity of LSM-trees, the existing research efforts have been primarily focusing on improving the maximum throughput of LSM-trees in simple key-value store settings. This leads to several outages when adopting LSM-based storage techniques in a Big Data Management System (BDMS) with multiple heterogeneous LSM-trees and requiring performance metrics beyond just the maximum throughput. In this dissertation, we focus on optimizing LSM-trees for BDMSs. We first propose a set of techniques to efficiently maintain and exploit LSM-based auxiliary structures, including secondary indexes and filters. These techniques include a series of optimizations for efficient batched point lookups, significantly improving the range of applicability of LSM-based secondary indexes, and several new and efficient maintenance strategies to maintain LSM-based auxiliary structures to accommodate the out-of-place update nature of LSM-trees. In addition to maximum throughput, performance stability measures, such as percentile latency, are another important performance metric for storage systems. However, LSM-trees often exhibit large performance variance due to periodic write stalls. To tackle this problem, we propose a simple yet effective two-phase approach to evaluate write stalls of LSM-trees. We further explore the design choices of merge schedulers for various LSM-tree designs to minimize write stalls given a disk bandwidth budget. Finally, we present adaptive memory management techniques to break down the memory walls in LSM-based storage systems, enabling the shared management of memory components of multiple datasets and adaptive memory allocation between memory components and the disk buffer cache. These techniques together have successfully reduced the disk I/O cost of LSM-based storage systems, improving the system performance and efficiency.
590 $a School code: 0030.
650 4 $a Computer science. $3 523869
650 4 $a Electrical engineering. $3 649834
650 4 $a Information science. $3 554358
650 4 $a Trees. $3 516384
650 4 $a Concurrency control. $3 3560901
653 $a Database systems
653 $a Indexing
653 $a Log-structured merge-tree
653 $a Memory management
653 $a Storage management
690 $a 0984
690 $a 0544
690 $a 0723
710 2 $a University of California, Irvine. $b Computer Science - Ph.D.. $3 2099281
773 0 $t Dissertations Abstracts International $g 83-02B.
790 $a 0030
791 $a Ph.D.
792 $a 2020
793 $a English
856 4 0 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28157520
Inventory Number: W9433877
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: Normal
Loan Status: On shelf
No. of reservations: 0