Dong Hwa University Library

High performance integration of data parallel file systems and computing: Optimizing MapReduce.

Record type:	Bibliographic - Electronic resource : Monograph/item
Title proper/Author:	High performance integration of data parallel file systems and computing: Optimizing MapReduce./
Author:	Guo, Zhenhua.
Physical description:	197 p.
Note:	Source: Dissertation Abstracts International, Volume: 74-06(E), Section: B.
Contained By:	Dissertation Abstracts International 74-06B(E).
Subject:	Computer science.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3550817
ISBN:	9781267881946

Thesis (Ph.D.)--Indiana University, 2012.

This item is not available from ProQuest Dissertations & Theses.

The ongoing data deluge brings parallel and distributed computing into the new data-intensive computing era, where many assumptions made by prior research on grid and High-Performance Computing need to be reviewed to check their validity and explore their performance implications. Data parallel systems, which differ from traditional HPC architectures in that compute nodes and storage nodes are not separated, have been proposed and widely deployed in both industry and academia. Many research issues, which did not exist before or were not under serious consideration, arise in this new architecture and have a drastic influence on performance and scalability. MapReduce was introduced by the information retrieval community, and has quickly demonstrated its usefulness, scalability and applicability. Its adoption of a data-centered approach yields higher throughput for data-intensive applications.

ISBN: 9781267881946

Subjects--Topical Terms:

523869
Computer science.

High performance integration of data parallel file systems and computing: Optimizing MapReduce.
LDR:02884nmm a2200289 4500
001 2078433
005 20161122122604.5
008 170521s2012 ||||||||||||||||| ||eng d
020 $a 9781267881946
035 $a (MiAaPQ)AAI3550817
035 $a AAI3550817
040 $a MiAaPQ $c MiAaPQ
100 1 $a Guo, Zhenhua. $3 2179924
245 1 0 $a High performance integration of data parallel file systems and computing: Optimizing MapReduce.
300 $a 197 p.
500 $a Source: Dissertation Abstracts International, Volume: 74-06(E), Section: B.
500 $a Adviser: Geoffrey Fox.
502 $a Thesis (Ph.D.)--Indiana University, 2012.
506 $a This item is not available from ProQuest Dissertations & Theses.
520 $a The ongoing data deluge brings parallel and distributed computing into the new data-intensive computing era, where many assumptions made by prior research on grid and High-Performance Computing need to be reviewed to check their validity and explore their performance implications. Data parallel systems, which differ from traditional HPC architectures in that compute nodes and storage nodes are not separated, have been proposed and widely deployed in both industry and academia. Many research issues, which did not exist before or were not under serious consideration, arise in this new architecture and have a drastic influence on performance and scalability. MapReduce was introduced by the information retrieval community, and has quickly demonstrated its usefulness, scalability and applicability. Its adoption of a data-centered approach yields higher throughput for data-intensive applications.
520 $a In this thesis, we present our investigation and improvement of MapReduce. We identify the inefficiencies of various aspects of MapReduce such as data locality, task granularity, resource utilization, and fault tolerance, and propose algorithms to mitigate the performance issues. Extensive evaluation is presented to demonstrate the effectiveness of our proposed algorithms and approaches. In addition, together with Yuan Luo and Yiming Sun, I observe the inability of MapReduce to utilize cross-domain grid resources, and propose a MapReduce extension called Hierarchical MapReduce (HMR). To speed up the execution of our bioinformatics data visualization pipelines, which contain both single-pass and iterative MapReduce jobs, a workflow management system, Hybrid MapReduce (HyMR), built by Rang and me upon Hadoop and Twister, is presented. The thesis also includes a detailed performance evaluation of Hadoop and some storage systems, and provides useful insights to both framework and application developers.
590 $a School code: 0093.
650 4 $a Computer science. $3 523869
690 $a 0984
710 2 $a Indiana University. $b Computer Sciences. $3 1018516
773 0 $t Dissertation Abstracts International $g 74-06B(E).
790 $a 0093
791 $a Ph.D.
792 $a 2012
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3550817