Compiler optimizations for SIMD/GPU/multicore architectures.
Record type: Bibliographic record, language material, printed : Monograph/item
Title/Author: Compiler optimizations for SIMD/GPU/multicore architectures.
Author: Liu, Jun.
Extent: 99 p.
Note: Source: Dissertation Abstracts International, Volume: 75-03(E), Section: B.
Contained in: Dissertation Abstracts International 75-03B(E).
Subject: Computer Science.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3576556
ISBN: 9781303566028
Thesis (Ph.D.)--The Pennsylvania State University, 2013.
In modern computer architectures, both SIMD (single-instruction multiple-data) instruction set extensions and GPUs can be used to accelerate general-purpose applications. In addition, multicore machines can potentially provide more computation power for high-performance computing, with an increasing number of cores and deeper cache hierarchies. However, writing high-performance code manually for these architectures is still tedious and difficult. In particular, the unique characteristics of these architectures may not be fully exploited.
LDR 04804nam a2200313 4500
001 1960331
005 20140611111837.5
008 150210s2013 ||||||||||||||||| ||eng d
020 $a 9781303566028
035 $a (MiAaPQ)AAI3576556
035 $a AAI3576556
040 $a MiAaPQ $c MiAaPQ
100 1 $a Liu, Jun. $3 2095965
245 1 0 $a Compiler optimizations for SIMD/GPU/multicore architectures.
300 $a 99 p.
500 $a Source: Dissertation Abstracts International, Volume: 75-03(E), Section: B.
500 $a Adviser: Mahmut Kandemir.
502 $a Thesis (Ph.D.)--The Pennsylvania State University, 2013.
520 $a In modern computer architectures, both SIMD (single-instruction multiple-data) instruction set extensions and GPUs can be used to accelerate general-purpose applications. In addition, multicore machines can potentially provide more computation power for high-performance computing, with an increasing number of cores and deeper cache hierarchies. However, writing high-performance code manually for these architectures is still tedious and difficult. In particular, the unique characteristics of these architectures may not be fully exploited.
520 $a Specifically, SIMD instruction set extensions enable the exploitation of a specific type of data parallelism called SLP (Superword Level Parallelism). While prior research shows that significant performance gains are possible when SLP is exploited, placing SIMD instructions in application code manually can be very difficult and error-prone. We propose a novel automated compiler framework for improving superword level parallelism exploitation. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, whose primary goals are to increase SIMD parallelism and, more importantly, to capture more superword reuses among the superword statements through global data access and reuse pattern analysis. Further, as a complementary optimization, our data layout optimization organizes data in memory such that the cost of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.
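To make the idea of statement grouping concrete, the following is a minimal Python sketch of a greedy pairing pass over straight-line code. It is not the algorithm of the dissertation, which this record does not describe; it only shows what "grouping isomorphic, independent statements with adjacent memory references" can look like. The names `Ref`, `Stmt`, and `group_pairs` are hypothetical, and the pass assumes constant array indices and a superword width of two.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    array: str  # array name
    index: int  # constant element index (assumption: no symbolic indices)


class Stmt(NamedTuple):
    dest: Ref    # element written
    op: str      # operator, e.g. "+" or "*"
    srcs: tuple  # tuple of Ref operands read


def isomorphic(a, b):
    # Same operator and same number of operands.
    return a.op == b.op and len(a.srcs) == len(b.srcs)


def adjacent(r, s):
    # s is the element immediately after r in the same array.
    return r.array == s.array and s.index == r.index + 1


def independent(a, b):
    # No output, flow, or anti dependence between a and b.
    return a.dest != b.dest and a.dest not in b.srcs and b.dest not in a.srcs


def group_pairs(stmts):
    """Greedily pair statement positions into 2-wide superword statements.

    Statement j may join statement i only if it can be hoisted up to
    position i, i.e. it is independent of every still-ungrouped statement
    in positions i..j-1 (grouped ones have already moved earlier).
    """
    used, groups = set(), []
    for i, a in enumerate(stmts):
        if i in used:
            continue
        for j in range(i + 1, len(stmts)):
            if j in used:
                continue
            b = stmts[j]
            between = [stmts[k] for k in range(i, j) if k not in used]
            if (isomorphic(a, b)
                    and all(independent(s, b) for s in between)
                    and adjacent(a.dest, b.dest)
                    and all(adjacent(x, y) for x, y in zip(a.srcs, b.srcs))):
                groups.append((i, j))
                used.update((i, j))
                break
    return groups


# a[0] = b[0] + c[0];  d[0] = a[0] * b[0];
# a[1] = b[1] + c[1];  d[1] = a[1] * b[1];
stmts = [
    Stmt(Ref("a", 0), "+", (Ref("b", 0), Ref("c", 0))),
    Stmt(Ref("d", 0), "*", (Ref("a", 0), Ref("b", 0))),
    Stmt(Ref("a", 1), "+", (Ref("b", 1), Ref("c", 1))),
    Stmt(Ref("d", 1), "*", (Ref("a", 1), Ref("b", 1))),
]
print(group_pairs(stmts))  # [(0, 2), (1, 3)]
```

In this example the second group reads the packed `a[0:2]` produced by the first group, which is the kind of superword reuse the abstract says the scheduling phase tries to capture; the sketch itself does no reuse analysis.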
520 $a On the other hand, GPUs are also increasingly used to accelerate general-purpose applications, leading to the emergence of GPGPU architectures. New programming models, e.g., Compute Unified Device Architecture (CUDA), have been proposed to facilitate programming general-purpose computations on GPGPUs. However, writing high-performance CUDA code manually is still tedious and difficult. In particular, the organization of data in memory can greatly affect performance due to the unique features of a custom GPGPU memory hierarchy. In this work, we propose an automatic data layout transformation framework to solve the key issues associated with a GPGPU memory hierarchy (i.e., channel skewing, data coalescing, and bank conflicts). Our approach employs a widely applicable strategy based on a novel concept called data localization. Specifically, we try to optimize the layout of the arrays accessed in kernels mapped to GPGPUs, for both the device memory and shared memory, at both coarse-grain and fine-grain parallelization levels.
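Of the issues listed above, shared-memory bank conflicts are the easiest to model on paper. The Python sketch below counts how many threads of one warp land on the same bank when each thread reads one element of a column in a row-major array. It assumes 32 banks of 4-byte words and a 32-thread warp, which is common on NVIDIA GPUs but is an assumption here, and the function name `conflict_degree` is hypothetical. Row padding is a standard layout trick used for illustration; it is not the data-localization strategy of the dissertation.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 shared-memory banks, one 4-byte word wide
WARP_SIZE = 32  # assumption: 32 threads per warp


def conflict_degree(row_pitch, column=0):
    """Largest number of threads in one warp that hit the same bank.

    Thread t reads element [t][column] of a row-major array of 4-byte
    words whose rows are row_pitch words apart. A result of 1 means the
    access is conflict-free; 32 means it is fully serialized.
    """
    banks = Counter(
        (t * row_pitch + column) % NUM_BANKS for t in range(WARP_SIZE)
    )
    return max(banks.values())


print(conflict_degree(32))  # 32: every thread maps to the same bank
print(conflict_degree(33))  # 1: one word of padding per row removes the conflict
```

Changing the row pitch from 32 to 33 words changes only where data sits in memory, not what the kernel computes, which is why such fixes lend themselves to automatic layout transformation.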
520 $a In addition, iteration space tiling is an important technique for optimizing loops that constitute a large fraction of execution time in computation kernels of both scientific codes and embedded applications. While tiling has been studied extensively in the context of both uniprocessor and multiprocessor platforms, prior research has paid less attention to tile scheduling, especially when targeting multicore machines with deep on-chip cache hierarchies. We propose a cache hierarchy-aware tile scheduling algorithm for multicore machines, with the purpose of maximizing both horizontal and vertical data reuse in on-chip caches and balancing the workload across different cores. This scheduling algorithm is one of the key components in a source-to-source translation tool that we developed for automatic loop parallelization and multithreaded code generation from sequential codes. To the best of our knowledge, this is the first effort that develops a fully automated tile scheduling strategy customized for on-chip cache topologies of multicore machines.
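The difference between tiling and tile scheduling can be sketched in a few lines of Python. The first function below splits a two-dimensional iteration space into tiles; the second assigns those tiles to cores. The scheduling heuristic is an illustrative assumption, not the algorithm of the dissertation: it gives each group of cores that share a cache a contiguous run of tiles, then deals tiles round-robin within the group to balance load. The names `tiles` and `schedule` and the `cache_groups` topology format are hypothetical.

```python
def tiles(n, m, tile):
    """Return (i0, i1, j0, j1) half-open bounds covering an n x m space."""
    return [
        (i0, min(i0 + tile, n), j0, min(j0 + tile, m))
        for i0 in range(0, n, tile)
        for j0 in range(0, m, tile)
    ]


def schedule(tile_list, cache_groups):
    """Map tiles to cores.

    cache_groups is a list of lists of core ids; cores in the same inner
    list are assumed to share a cache. Each group receives a contiguous
    run of tiles (neighbouring tiles stay under one shared cache), and
    tiles are dealt round-robin to the cores inside the group.
    """
    plan = {core: [] for group in cache_groups for core in group}
    total_cores = len(plan)
    start = 0
    cores_seen = 0
    for group in cache_groups:
        cores_seen += len(group)
        end = len(tile_list) * cores_seen // total_cores
        for k, t in enumerate(tile_list[start:end]):
            plan[group[k % len(group)]].append(t)
        start = end
    return plan


# Hypothetical machine: cores 0-1 share one cache, cores 2-3 share another.
plan = schedule(tiles(8, 8, 4), [[0, 1], [2, 3]])
for core, work in plan.items():
    print(core, work)
```

A real cache hierarchy-aware scheduler would also have to account for tile dependences and the reuse pattern of the loop nest, which this sketch ignores.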
590 $a School code: 0176.
650 4 $a Computer Science. $3 626642
650 4 $a Engineering, Computer. $3 1669061
690 $a 0984
690 $a 0464
710 2 $a The Pennsylvania State University. $b Computer Science and Engineering. $3 2095963
773 0 $t Dissertation Abstracts International $g 75-03B(E).
790 $a 0176
791 $a Ph.D.
792 $a 2013
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3576556
Holdings (1 item)
Barcode: W9255159
Location: Electronic resources
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Usage type: General use (Normal)
Loan status: On shelf
Holds: 0