Improving MapReduce performance in large-scale clusters.
Record Type:
Language materials, printed : Monograph/item
Title/Author:
Improving MapReduce performance in large-scale clusters.
Author:
Ahmad, Faraz.
Description:
136 p.
Notes:
Source: Dissertation Abstracts International, Volume: 75-04(E), Section: B.
Contained By:
Dissertation Abstracts International 75-04B(E).
Subject:
Engineering, Computer.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3604695
ISBN:
9781303605406
Thesis (Ph.D.)--Purdue University, 2013.
The evolution of big data has led enterprises to seek time-efficient and affordable solutions for processing large volumes of raw data on clusters of commodity hardware. MapReduce is a well-known programming model from Google for large-scale data processing that provides automatic data management and fault tolerance to improve the programmability of clusters. MapReductions are used extensively in clusters not only to provide up-to-date, organized data for interactive workloads such as search engines and social networks, but also to perform time-critical data analytics for retail enterprises and financial markets. Improving the performance of MapReductions is particularly important because of (i) the time-critical nature of MapReductions, (ii) savings in important machine hours, and (iii) cost-effective cloud solutions for users and enterprises.
LDR :02896nam a2200289 4500
001 1965744
005 20141029122203.5
008 150210s2013 ||||||||||||||||| ||eng d
020 $a 9781303605406
035 $a (MiAaPQ)AAI3604695
035 $a AAI3604695
040 $a MiAaPQ $c MiAaPQ
100 1 $a Ahmad, Faraz. $3 2102450
245 1 0 $a Improving MapReduce performance in large-scale clusters.
300 $a 136 p.
500 $a Source: Dissertation Abstracts International, Volume: 75-04(E), Section: B.
500 $a Adviser: T. N. Vijaykumar.
502 $a Thesis (Ph.D.)--Purdue University, 2013.
520 $a The evolution of big data has led enterprises to seek time-efficient and affordable solutions for processing large volumes of raw data on clusters of commodity hardware. MapReduce is a well-known programming model from Google for large-scale data processing that provides automatic data management and fault tolerance to improve the programmability of clusters. MapReductions are used extensively in clusters not only to provide up-to-date, organized data for interactive workloads such as search engines and social networks, but also to perform time-critical data analytics for retail enterprises and financial markets. Improving the performance of MapReductions is particularly important because of (i) the time-critical nature of MapReductions, (ii) savings in important machine hours, and (iii) cost-effective cloud solutions for users and enterprises.
520 $a The main thrust of the thesis is to address the MapReduce performance problems caused by an all-Map-to-all-Reduce communication, called the Shuffle, across the network bisection. Many MapReductions move large amounts of data (e.g., as much as the input data) during the Shuffle, stressing the bisection bandwidth and introducing significant runtime overhead. In this work, I make four contributions. First, I propose techniques to overlap Shuffle communication with Reduce computation to improve MapReduce performance (MaRCO) in homogeneous clusters. Second, I propose a suite of optimizations (Tarazu) that perform communication- and computation-aware load balancing to improve performance on heterogeneous clusters. Third, I identify performance bottlenecks in multi-tenant clusters due to Shuffle, and exploit a key trade-off between intra-job concurrency and data locality (ShuffleWatcher) to shape and reduce Shuffle traffic in multi-tenant clusters. Finally, I establish a benchmark suite (PUMA) of real-world applications that represents a broad range of MapReductions exhibiting application characteristics with varying computation and communication demands.
590 $a School code: 0183.
650 4 $a Engineering, Computer. $3 1669061
650 4 $a Computer Science. $3 626642
690 $a 0464
690 $a 0984
710 2 $a Purdue University. $b Electrical and Computer Engineering. $3 1018497
773 0 $t Dissertation Abstracts International $g 75-04B(E).
790 $a 0183
791 $a Ph.D.
792 $a 2013
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3604695
Items:
1 record
Inventory Number:
W9260743
Location Name:
Electronic resources
Item Class:
11.線上閱覽_V (online viewing)
Material type:
E-book
Call number:
EB
Usage Class:
Normal
Loan Status:
On shelf
No. of reservations:
0