Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms.
Record type:
Bibliographic - electronic resource: Monograph/item
Title/Author:
Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms.
Author:
Wu, Jing.
Extent:
179 p.
Notes:
Source: Dissertation Abstracts International, Volume: 76-03(E), Section: B.
Contained By:
Dissertation Abstracts International 76-03B(E).
Subject:
Engineering, Computer.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3644509
ISBN:
9781321327618
Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms.
Wu, Jing.
Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms.
- 179 p.
Source: Dissertation Abstracts International, Volume: 76-03(E), Section: B.
Thesis (Ph.D.)--University of Maryland, College Park, 2014.
This item must not be sold to any third party vendors.
An emerging trend in processor architecture seems to indicate that the number of cores per chip doubles every two years while clock speed stays the same or decreases. Of particular interest to this thesis is the class of many-core processors, which are becoming more attractive due to their high performance, low cost, and low power consumption. The main goal of this dissertation is to develop optimization techniques for mapping algorithms and applications onto CUDA GPUs and CPU-GPU heterogeneous platforms.
ISBN: 9781321327618
Subjects--Topical Terms:
1669061
Engineering, Computer.
Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms.
LDR
:03283nmm a2200325 4500
001
2056406
005
20150526083649.5
008
170521s2014 ||||||||||||||||| ||eng d
020
$a
9781321327618
035
$a
(MiAaPQ)AAI3644509
035
$a
AAI3644509
040
$a
MiAaPQ
$c
MiAaPQ
100
1
$a
Wu, Jing.
$3
1001228
245
1 0
$a
Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms.
300
$a
179 p.
500
$a
Source: Dissertation Abstracts International, Volume: 76-03(E), Section: B.
500
$a
Adviser: Joseph F. JaJa.
502
$a
Thesis (Ph.D.)--University of Maryland, College Park, 2014.
506
$a
This item must not be sold to any third party vendors.
520
$a
An emerging trend in processor architecture seems to indicate that the number of cores per chip doubles every two years while clock speed stays the same or decreases. Of particular interest to this thesis is the class of many-core processors, which are becoming more attractive due to their high performance, low cost, and low power consumption. The main goal of this dissertation is to develop optimization techniques for mapping algorithms and applications onto CUDA GPUs and CPU-GPU heterogeneous platforms.
520
$a
The fast Fourier transform (FFT) is a fundamental tool in computational science and engineering, and hence a GPU-optimized implementation is of paramount importance. We first study the mapping of the 3D FFT onto recent CUDA GPUs and develop a new approach that minimizes the number of global memory accesses and overlaps the computations along the different dimensions. We obtain some of the fastest known implementations for the computation of multidimensional FFTs.
520
$a
We then present a highly multithreaded FFT-based direct Poisson solver that is optimized for recent NVIDIA GPUs. In addition to massive multithreading, our algorithm carefully manages the multiple layers of the memory hierarchy so that all global memory accesses are coalesced into 128-byte device memory transactions. As a result, we have achieved up to 375 GFLOPS with a bandwidth of 120 GB/s on the GTX 480.
520
$a
We further extend our methodology to CPU-GPU heterogeneous platforms for the case in which the input is too large to fit in the GPU global memory. We develop optimization techniques for memory-bound and computation-bound applications. The main challenge here is to minimize data transfer between the CPU memory and the device memory and to overlap these transfers with kernel execution as much as possible. For memory-bound applications, we achieve a near-peak effective PCIe bus bandwidth of 9-10 GB/s and performance as high as 145 GFLOPS for multidimensional FFT computations and for solving the Poisson equation. We extend our CPU-GPU software pipeline to a computation-bound application, DGEMM, and achieve the illusion of a device memory as large as the CPU memory together with a computation throughput similar to that of a pure GPU implementation.
590
$a
School code: 0117.
650
4
$a
Engineering, Computer.
$3
1669061
650
4
$a
Computer Science.
$3
626642
690
$a
0464
690
$a
0984
710
2
$a
University of Maryland, College Park.
$b
Electrical Engineering.
$3
1018746
773
0
$t
Dissertation Abstracts International
$g
76-03B(E).
790
$a
0117
791
$a
Ph.D.
792
$a
2014
793
$a
English
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3644509
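Illustrative note (not part of the catalog record): the abstract describes a CPU-GPU software pipeline that overlaps host-device transfers with kernel execution for inputs too large for GPU memory. The sketch below is a minimal timing model of that general idea, not the dissertation's method. The function names, the three-stage split (host-to-device copy, kernel, device-to-host copy), and the stage times are all hypothetical. It also assumes that each stage processes one chunk at a time and that different stages can run concurrently, which on real hardware depends on the number of copy engines and on the use of asynchronous streams.

```python
from typing import Sequence


def pipelined_makespan(n_chunks: int, stage_times: Sequence[float]) -> float:
    """Finish time when chunks flow through stages that overlap across chunks.

    Hypothetical model: every chunk visits the stages in order, each stage
    handles one chunk at a time, and a chunk enters a stage as soon as it has
    left the previous stage and the stage has finished the previous chunk.
    """
    if n_chunks <= 0 or not stage_times:
        return 0.0
    stage_free = [0.0] * len(stage_times)  # time at which each stage is next idle
    for _ in range(n_chunks):
        ready = 0.0  # time at which the current chunk can enter the next stage
        for s, t in enumerate(stage_times):
            start = max(ready, stage_free[s])
            ready = start + t
            stage_free[s] = ready
    return stage_free[-1]


def serial_makespan(n_chunks: int, stage_times: Sequence[float]) -> float:
    """Finish time with no overlap: every chunk runs all stages back to back."""
    return max(n_chunks, 0) * sum(stage_times)


if __name__ == "__main__":
    # Assumed per-chunk times (arbitrary units): copy in, kernel, copy out.
    times = [2.0, 1.0, 2.0]
    print(pipelined_makespan(4, times))  # 11.0
    print(serial_makespan(4, times))     # 20.0
```

With equal per-chunk times, the pipelined total is the sum of the stage times plus (n - 1) times the slowest stage. This is why, in a transfer-dominated (memory-bound) setting, throughput under this model is limited by the transfer stage rather than by the kernel.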
Holdings
1 record
Barcode: W9288895
Location: Electronic resource
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0