Performance and Power Optimization of GPU Architectures for General-purpose Computing.
Record Type:
Language materials, printed : Monograph/item
Title/Author:
Performance and Power Optimization of GPU Architectures for General-purpose Computing./
Author:
Wang, Yue.
Description:
106 p.
Notes:
Source: Dissertation Abstracts International, Volume: 75-11(E), Section: B.
Contained By:
Dissertation Abstracts International 75-11B(E).
Subject:
Engineering, Computer.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3631053
ISBN:
9781321093193
Thesis (Ph.D.)--University of South Florida, 2014.
Power-performance efficiency has become a central challenge in heterogeneous processing platforms, because power constraints must be met without hindering high performance. This dissertation proposes a framework for optimizing the power and performance of GPUs in the context of general-purpose computing on GPUs (GPGPU). To reduce the leakage power of GPU caches, we dynamically switch the L1 and L2 caches into low-power modes during periods of inactivity. The L1 cache can be put into a low-leakage (sleep) state when a processing unit is stalled because no threads are ready to be scheduled, and the L2 cache can be put into the sleep state during idle periods when there are no memory requests. The sleep mode is state-retentive, so the caches need not be flushed after they are woken up, thereby avoiding any performance degradation. Experimental results indicate that this technique reduces leakage power by 52% on average. Further, to improve performance, we redistribute the GPGPU workload across the computing units of the GPU during application execution. The fundamental idea is to monitor the workload on each multi-processing unit and redistribute it by having a portion of its unfinished threads executed on a neighboring multi-processing unit. Experimental results show that this technique improves the performance of the GPGPU workload by 15.7%. Finally, to improve both the performance and the dynamic power of GPUs, we propose two dynamic frequency scaling (DFS) techniques implemented on CPU host threads. The first is motivated by the significance of pipeline stalls during GPGPU execution. It applies a feedback control algorithm, Proportional-Integral-Derivative (PID), to regulate the frequency of the parallel processors and memory channels based on the occupancy of the memory buffering queues. The second technique aims to maximize the average throughput of all parallel processors under the dynamic power constraints. We formalize this objective as a linear programming problem and solve it at runtime. According to the simulation results, the first technique achieves more than 22% power savings with a 4% improvement in performance, and the second technique reduces power consumption by 11% with a 9% performance improvement. The contributions of this dissertation represent a significant advance toward improving the performance and reducing the energy consumption of GPGPU.
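The abstract gives no implementation details for the PID-based frequency scaling, so the following is only a minimal sketch of how such a feedback loop could look. The class name, the gains, the target occupancy, the frequency bounds, and the sign convention (a fuller memory queue raises the memory-channel frequency) are all assumptions made for illustration and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass
class PIDFrequencyController:
    """Hypothetical PID loop mapping queue occupancy to a clock frequency."""

    kp: float                # proportional gain (assumed value, MHz per unit error)
    ki: float                # integral gain (assumed value)
    kd: float                # derivative gain (assumed value)
    target_occupancy: float  # desired queue fill fraction, 0.0 to 1.0
    f_min_mhz: float         # lowest allowed frequency
    f_max_mhz: float         # highest allowed frequency
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, occupancy: float, freq_mhz: float, dt: float = 1.0) -> float:
        """Return the next frequency for one control interval of length dt."""
        # Assumption: occupancy above target means the memory channel is the
        # bottleneck, so a positive error pushes its frequency up.
        error = occupancy - self.target_occupancy
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        adjustment = (
            self.kp * error + self.ki * self.integral + self.kd * derivative
        )
        # Clamp to the supported frequency range.
        return min(self.f_max_mhz, max(self.f_min_mhz, freq_mhz + adjustment))


if __name__ == "__main__":
    ctrl = PIDFrequencyController(
        kp=100.0, ki=0.0, kd=0.0,
        target_occupancy=0.5, f_min_mhz=400.0, f_max_mhz=1000.0,
    )
    freq = 700.0
    for occupancy in (0.9, 0.7, 0.5, 0.3):
        freq = ctrl.step(occupancy, freq)
        print(f"occupancy={occupancy:.1f} -> {freq:.1f} MHz")
```

A controller for the parallel processors would plausibly use the opposite sign, lowering the core frequency while the memory queues are congested, but that too is an assumption rather than something the abstract states.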
LDR :03348nam a2200265 4500
001 1967104
005 20141112075802.5
008 150210s2014 ||||||||||||||||| ||eng d
020 $a 9781321093193
035 $a (MiAaPQ)AAI3631053
035 $a AAI3631053
040 $a MiAaPQ $c MiAaPQ
100 1 $a Wang, Yue. $3 1908045
245 1 0 $a Performance and Power Optimization of GPU Architectures for General-purpose Computing.
300 $a 106 p.
500 $a Source: Dissertation Abstracts International, Volume: 75-11(E), Section: B.
500 $a Adviser: Nagarajan Ranganathan.
502 $a Thesis (Ph.D.)--University of South Florida, 2014.
590 $a School code: 0206.
650 4 $a Engineering, Computer. $3 1669061
690 $a 0464
710 2 $a University of South Florida. $b Computer Science and Engineering. $3 1682850
773 0 $t Dissertation Abstracts International $g 75-11B(E).
790 $a 0206
791 $a Ph.D.
792 $a 2014
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3631053
Inventory Number: W9262110
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: Normal
Loan Status: On shelf
No. of reservations: 0