Dong Hwa University Library

I/O Optimization in Big Data Storage Systems.

Record type:	Bibliographic - electronic resource : Monograph/item
Title/Author:	I/O Optimization in Big Data Storage Systems.
Author:	Qader, Mohiuddin Abdul.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2018.
Extent:	155 p.
Notes:	Source: Dissertations Abstracts International, Volume: 80-05, Section: B.
Contained By:	Dissertations Abstracts International 80-05B.
Subject:	Computer Engineering.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10935074
ISBN:	9780438640290


Thesis (Ph.D.)--University of California, Riverside, 2018.

This item must not be sold to any third party vendors.

The age of big data has given way to the era of the Internet of Things (IoT), in which massive-scale data is generated, stored, and used by a diverse set of physical objects: devices, vehicles, buildings, software, sensors, GPS and networks. It has become an open challenge for researchers in academia and industry to find the best ways to ingest, replicate, manage, read and deliver this rapidly growing data efficiently to millions of users in real time. Big data storage systems, especially NoSQL databases like LevelDB, Cassandra, BigTable and AsterixDB, have become extremely popular in the last decade for managing large amounts of data that do not require stringent concurrency or transaction management guarantees. In such settings, NoSQL systems achieve high rates of data writes. My research focuses on Input/Output (I/O) optimizations of such state-of-the-art big data storage systems. Specifically, my thesis addresses three aspects of optimization: indexing, partitioning and replication.

a) Indexing of non-key attributes: Current state-of-the-art big data storage systems have limited support for secondary attribute lookup queries or continuous lookup queries. To tackle these limitations, we first introduce and implement five secondary indexes on a NoSQL database. Specifically, we use the popular LevelDB database, which employs a Log-Structured Merge-Tree (LSM) for organizing its data. Our comprehensive experimental study and theoretical evaluation provide empirical guidelines for the optimal choice of secondary index, depending on the workload of different applications.

b) Indexing for publish-subscribe systems: We propose and compare several publish/subscribe storage architectures, based on the popular NoSQL LSM storage paradigm, to support high-throughput and highly dynamic continuous lookup queries. Our framework naturally supports subscriptions on both historic and future streaming data, and generates instant notifications.

c) Data partitioning: We create optimization techniques for spatial indexes via intelligent partitioning. Currently, NoSQL-based databases do not offer any spatial partitioning to achieve faster spatial query response. We propose a level-based organization of disk components and two novel component merge techniques that leverage their spatial properties.

d) Data replication: Another important feature of big data storage systems is their availability and reliability, which are achieved through replication. Paxos is a widely used consensus protocol for keeping replicas in sync. We develop an I/O-optimized, Paxos-based, fault-tolerant block storage replication engine.

ISBN: 9780438640290

Subjects--Topical Terms:

1567821
Computer Engineering.

I/O Optimization in Big Data Storage Systems.
LDR:03686nmm a2200325 4500
001 2205627
005 20190828120328.5
008 201008s2018 ||||||||||||||||| ||eng d
020 $a 9780438640290
035 $a (MiAaPQ)AAI10935074
035 $a (MiAaPQ)ucr:13513
035 $a AAI10935074
040 $a MiAaPQ $c MiAaPQ
100 1 $a Qader, Mohiuddin Abdul. $3 3432490
245 1 0 $a I/O Optimization in Big Data Storage Systems.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300 $a 155 p.
500 $a Source: Dissertations Abstracts International, Volume: 80-05, Section: B.
500 $a Publisher info.: Dissertation/Thesis.
500 $a Hristidis, Vagelis.
502 $a Thesis (Ph.D.)--University of California, Riverside, 2018.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0032.
650 4 $a Computer Engineering. $3 1567821
650 4 $a Computer science. $3 523869
690 $a 0464
690 $a 0984
710 2 $a University of California, Riverside. $b Computer Science. $3 1680199
773 0 $t Dissertations Abstracts International $g 80-05B.
790 $a 0032
791 $a Ph.D.
792 $a 2018
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10935074