Thread criticality and TLB enhancement techniques for chip multiprocessors.
Record type: Bibliographic - language material, printed : Monograph/item
Title/Author: Thread criticality and TLB enhancement techniques for chip multiprocessors./
Author: Bhattacharjee, Abhishek.
Physical description: 157 p.
Notes: Source: Dissertation Abstracts International, Volume: 71-10, Section: B, page: 6309.
Contained By: Dissertation Abstracts International 71-10B.
Subject: Engineering, Computer.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3424098
ISBN: 9781124230757
Thesis (Ph.D.)--Princeton University, 2010.
Numerous technology trends including debilitating power densities and rising verification costs have recently prompted a shift to multicore or chip multiprocessor (CMP) architectures. Despite their benefits, CMPs face a number of design challenges. A key challenge is how best to architect the on-chip memory hierarchy, which plays a key role in determining system performance and power characteristics.
ISBN: 9781124230757
Subjects--Topical Terms: 1669061 Engineering, Computer.
LDR :05380nam 2200349 4500
001 1403168
005 20111111141819.5
008 130515s2010 ||||||||||||||||| ||eng d
020 $a 9781124230757
035 $a (UMI)AAI3424098
035 $a AAI3424098
040 $a UMI $c UMI
100 1 $a Bhattacharjee, Abhishek. $3 1682417
245 1 0 $a Thread criticality and TLB enhancement techniques for chip multiprocessors.
300 $a 157 p.
500 $a Source: Dissertation Abstracts International, Volume: 71-10, Section: B, page: 6309.
500 $a Adviser: Margaret R. Martonosi.
502 $a Thesis (Ph.D.)--Princeton University, 2010.
520 $a Numerous technology trends including debilitating power densities and rising verification costs have recently prompted a shift to multicore or chip multiprocessor (CMP) architectures. Despite their benefits, CMPs face a number of design challenges. A key challenge is how best to architect the on-chip memory hierarchy, which plays a key role in determining system performance and power characteristics.
520 $a This thesis presents a top-down analysis, from the application-level down to the microarchitectural layer, of the role of the on-chip memory hierarchy in determining the performance and power of emerging parallel workloads. Analysis shows that two primary sources of overhead in parallel program performance arise due to imperfections in the on-chip memory. The first is the variation in execution speeds that multiple threads of a parallel program experience. As this thesis will show, this difference in thread criticality results in performance and energy degradation. The second source of overhead arises from the fact that emerging parallel workloads tend to stress their Translation Lookaside Buffers (TLBs) significantly. As application working sets increase, we show that modern TLBs experience notable miss rates, resulting in performance overheads.
520 $a Based on these observations, this thesis presents the first full-system characterization of the roles of thread criticality and TLB behavior in determining system performance. Using a combination of real-system profiling, full-system simulation, and FPGA-based emulation techniques, this thesis characterizes the causes of thread criticality and increasing TLB pressure. First, this work shows that cache misses are the primary cause of differing thread speeds. Specifically, threads that experience a greater number of cache misses run slower than their better-cached counterparts. Using this simple but powerful intuition, this thesis proposes thread criticality predictors with 93% accuracy. This thesis will also explore the usefulness of these criticality predictors for various resource management techniques on CMPs. Second, this work then characterizes the prevalence of TLB misses, showing that while parallel workloads experience high TLB miss rates, 30% to 95% of them can be classified as predictable. This predictability arises in two ways. First, multiple cores often TLB miss on the same translation. Second, cores often TLB miss on entries with virtual pages placed a predictable stride from one another.
520 $a This thesis then builds upon our workload characterization by proposing techniques to improve the on-chip memory hierarchy. First, I show how cache-based thread criticality prediction can improve parallel program performance by off-loading work from critical to non-critical threads. Specifically, Intel TBB's task stealing mechanism is augmented with criticality prediction to yield 21% average performance improvements. Second, this thesis shows that by estimating which threads are non-critical and by how much, critical threads may be run at a high clock rate while the others are slowed down, achieving 15% average energy savings. While this thesis focuses on these specific applications, we discuss the versatility of thread criticality prediction and how it may be applied in additional scenarios.
520 $a This thesis then uses the TLB characterization to propose TLB enhancement techniques. By leveraging the classes of predictable TLB misses, we propose and evaluate two techniques that use inter-core cooperation to eliminate TLB misses. First, I show the benefits of Inter-Core Cooperative (ICC) prefetching schemes, in which Leader-Follower prefetching exploits TLB misses experienced by multiple cores while Distance-based Cross-Core prefetching captures the presence of regular inter-core strides. Combining these approaches, ICC prefetching techniques can eliminate 19% to 90% of system misses. I then propose an alternative to ICC prefetching, Shared Last-Level (SLL) TLBs, which eliminate 7% to 79% of system TLB misses.
520 $a Overall, this thesis is the first to show the importance of thread criticality and TLB enhancement techniques for parallel programs on CMPs. Moreover, as CMPs experience increased core counts, heterogeneity, and application memory footprints increase, these techniques will be essential in apportioning system resources intelligently among multiple contending threads.
590 $a School code: 0181.
650 4 $a Engineering, Computer. $3 1669061
650 4 $a Engineering, Electronics and Electrical. $3 626636
650 4 $a Computer Science. $3 626642
690 $a 0464
690 $a 0544
690 $a 0984
710 2 $a Princeton University. $3 645579
773 0 $t Dissertation Abstracts International $g 71-10B.
790 1 0 $a Martonosi, Margaret R., $e advisor
790 $a 0181
791 $a Ph.D.
792 $a 2010
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3424098
Holdings (1 item)
Barcode: W9166307
Location: Electronic resource
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Usage type: Normal
Loan status: On shelf
Reservation status: 0