Improving communication performance in GPU-accelerated HPC clusters.

Record type: Bibliography - Electronic resource : Monograph/item
Title/Author: Improving communication performance in GPU-accelerated HPC clusters.
Author: Faraji, Iman.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2018
Physical description: 193 p.
Notes: Source: Masters Abstracts International, Volume: 79-08.
Contained by: Masters Abstracts International, 79-08.
Subject: Computer Engineering.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10760623
LDR     03606nmm a2200301 4500
001     2208074
005     20190929184212.5
008     201008s2018 ||||||||||||||||| ||eng d
035     $a (MiAaPQ)AAI10760623
035     $a (MiAaPQ)QueensUCan197423833
035     $a AAI10760623
040     $a MiAaPQ $c MiAaPQ
100 1   $a Faraji, Iman. $3 3435086
245 1 0 $a Improving communication performance in GPU-accelerated HPC clusters.
260 1   $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300     $a 193 p.
500     $a Source: Masters Abstracts International, Volume: 79-08.
500     $a Publisher info.: Dissertation/Thesis.
500     $a Advisor: Afsahi, Ahmad.
502     $a Thesis (Ph.D.)--Queen's University (Canada), 2018.
506     $a This item must not be sold to any third party vendors.
520     $a In recent years, GPUs have been adopted in many High-Performance Computing (HPC) clusters due to their massive computational power and energy efficiency. The Message Passing Interface (MPI) is the de facto standard for parallel programming. Many HPC applications, written in MPI, use parallel processes and multiple GPUs to achieve higher performance and greater GPU memory capacity. In such applications, efficiently performing GPU inter-process communication is key to application performance. In this dissertation, we present proposals to improve GPU inter-process communication in HPC clusters using novel GPU-aware designs, efficient and scalable algorithms, topology-aware designs, and hardware features. Specifically, we propose various approaches to improve the efficiency of MPI communication routines in GPU clusters. We also propose designs that evaluate the total application inter-process communication and provide solutions to improve its efficiency. First, we propose efficient GPU-aware algorithms to improve MPI collective performance. We show the importance of minimizing CPU intervention on GPU collective performance. We also utilize GPU features to enhance both collective communication and computation. As inter-process communications scale across multi-GPU nodes and clusters, efficient inter-process communication routines must consider the physical structure of the underlying system. Given the hierarchical nature of GPU clusters with multi-GPU nodes, we propose hierarchy-aware designs for GPU collectives and show that different algorithms are favored at different hierarchy levels. With the presence of multiple data copy mechanisms in modern GPU clusters, it is crucial to make an informed decision on how to use them for efficient inter-process communication. In this regard, we propose designs that intelligently decide which data copy mechanisms to use in GPU collectives. Using these designs, we reveal the importance of using multiple data copy mechanisms in performing multiple inter-process communications. Finally, we provide topology-aware solutions to improve the application's inter-process communication efficiency, both within multi-GPU nodes and across GPU clusters. First, we study the performance of different communication channels used for GPU inter-process communication. Next, we propose topology-aware designs that consider both the system's physical topology and the application's communication pattern. These designs improve communication performance by performing more intensive inter-process communication on stronger communication channels.
590     $a School code: 0283.
650   4 $a Computer Engineering. $3 1567821
690     $a 0464
710 2   $a Queen's University (Canada). $3 1017786
773 0   $t Masters Abstracts International $g 79-08.
790     $a 0283
791     $a Ph.D.
792     $a 2018
793     $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10760623
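Note on the "GPU-aware" MPI communication described in the abstract: a GPU-aware (CUDA-aware) MPI lets a rank hand a device-resident buffer directly to an MPI routine, so the library can move the data without an explicit host staging copy. The sketch below is a minimal, generic illustration of that idea only, not the dissertation's specific designs; it assumes an MPI library built with CUDA support and one process per GPU, and the buffer size and contents are placeholders. It would be compiled with the MPI C wrapper and linked against the CUDA runtime.

/* Illustration only: a GPU-aware MPI collective on a device buffer.
 * Assumes a CUDA-aware MPI build and one MPI process per GPU. */
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, ndev;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);                 /* bind each rank to a GPU */

    const int n = 1 << 20;
    float *d_buf;                               /* device-resident buffer */
    cudaMalloc((void **)&d_buf, n * sizeof(float));
    cudaMemset(d_buf, 0, n * sizeof(float));    /* placeholder data */

    /* The device pointer goes straight into the collective; without GPU
     * awareness this would require staging through a host buffer with
     * cudaMemcpy before and after the call. */
    MPI_Allreduce(MPI_IN_PLACE, d_buf, n, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}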
Holdings (1 item)

Barcode: W9384623
Location: Electronic Resources
Circulation category: 11.線上閱覽_V (online viewing)
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0