Improving communication performance in GPU-accelerated HPC clusters.
Record Type: Electronic resources : Monograph/item
Title/Author: Improving communication performance in GPU-accelerated HPC clusters.
Author: Faraji, Iman.
Published: Ann Arbor : ProQuest Dissertations & Theses, 2018
Description: 193 p.
Notes: Source: Masters Abstracts International, Volume: 79-08.
Contained By: Masters Abstracts International, 79-08.
Subject: Computer Engineering.
Online resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10760623
Faraji, Iman.
Improving communication performance in GPU-accelerated HPC clusters.
- Ann Arbor : ProQuest Dissertations & Theses, 2018 - 193 p.
Source: Masters Abstracts International, Volume: 79-08.
Thesis (Ph.D.)--Queen's University (Canada), 2018.
This item must not be sold to any third party vendors.
In recent years, GPUs have been adopted in many High-Performance Computing (HPC) clusters due to their massive computational power and energy efficiency. The Message Passing Interface (MPI) is the de facto standard for parallel programming. Many HPC applications written in MPI use parallel processes and multiple GPUs to achieve higher performance and greater GPU memory capacity. In such applications, efficient GPU inter-process communication is key to application performance. In this dissertation, we present proposals to improve GPU inter-process communication in HPC clusters using novel GPU-aware designs, efficient and scalable algorithms, topology-aware designs, and hardware features. Specifically, we propose various approaches to improve the efficiency of MPI communication routines in GPU clusters. We also propose designs that evaluate the total application inter-process communication and provide solutions to improve its efficiency. First, we propose efficient GPU-aware algorithms to improve MPI collective performance. We show the importance of minimizing CPU intervention for GPU collective performance. We also utilize GPU features to enhance both collective communication and computation. As inter-process communications scale across multi-GPU nodes and clusters, efficient inter-process communication routines must consider the physical structure of the underlying system. Given the hierarchical nature of GPU clusters with multi-GPU nodes, we propose hierarchy-aware designs for GPU collectives and show that different algorithms are favored at different hierarchy levels. With the presence of multiple data copy mechanisms in modern GPU clusters, it is crucial to make an informed decision on how to use them for efficient inter-process communications. In this regard, we propose designs that intelligently decide which data copy mechanisms to use in GPU collectives. Using these designs, we reveal the importance of using multiple data copy mechanisms when performing multiple inter-process communications. Finally, we provide topology-aware solutions to improve the application inter-process communication efficiency, both within multi-GPU nodes and across GPU clusters. First, we study the performance of different communication channels used for GPU inter-process communications. Next, we propose topology-aware designs that consider both the system's physical topology and the application's communication pattern. These designs improve communication performance by performing more intensive inter-process communication over stronger communication channels.
Subjects--Topical Terms:
1567821
Computer Engineering.
LDR :03606nmm a2200301 4500
001 2208074
005 20190929184212.5
008 201008s2018 ||||||||||||||||| ||eng d
035 $a (MiAaPQ)AAI10760623
035 $a (MiAaPQ)QueensUCan197423833
035 $a AAI10760623
040 $a MiAaPQ $c MiAaPQ
100 1 $a Faraji, Iman. $3 3435086
245 1 0 $a Improving communication performance in GPU-accelerated HPC clusters.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300 $a 193 p.
500 $a Source: Masters Abstracts International, Volume: 79-08.
500 $a Publisher info.: Dissertation/Thesis.
500 $a Advisor: Afsahi, Ahmad.
502 $a Thesis (Ph.D.)--Queen's University (Canada), 2018.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0283.
650 4 $a Computer Engineering. $3 1567821
690 $a 0464
710 2 $a Queen's University (Canada). $3 1017786
773 0 $t Masters Abstracts International $g 79-08.
790 $a 0283
791 $a Ph.D.
792 $a 2018
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10760623
Inventory Number: W9384623
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: Normal
Loan Status: On shelf
No. of reservations: 0