National Dong Hwa University Library

Improving Multi-GPU Strong Scaling through Optimization of Fine-Grained Transfers.

Record type:	Bibliographic record - electronic resource : Monograph/item
Title/Author:	Improving Multi-GPU Strong Scaling through Optimization of Fine-Grained Transfers.
Author:	Muthukrishnan, Harini.
Physical description:	1 online resource (139 pages)
Note:	Source: Dissertations Abstracts International, Volume: 84-01, Section: B.
Contained by:	Dissertations Abstracts International 84-01B.
Subject:	Computer engineering.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29274985 (click for full text, PQDT)
ISBN:	9798438776116

Improving Multi-GPU Strong Scaling through Optimization of Fine-Grained Transfers. - 1 online resource (139 pages)

Source: Dissertations Abstracts International, Volume: 84-01, Section: B.

Thesis (Ph.D.)--University of Michigan, 2022.

Includes bibliographical references

Despite dramatic improvements in GPU and interconnect architectures, inter-GPU communication remains the most significant architectural bottleneck in multi-GPU systems. With hundreds of thousands of independent concurrently executing threads, maximizing interconnect utilization without degrading computational efficiency when strong-scaling HPC workloads is an open problem. In this dissertation, I will explore fine-grained peer-to-peer stores as the communication paradigm for improved multi-GPU strong scaling and propose three solutions to overcome the limitations of existing GPU and interconnect architectures to benefit from such transfers. First, I will detail PROACT, a joint compile and runtime system that transparently fine-tunes inter-GPU data movement for each application's needs, thus achieving the interconnect efficiency of bulk transfers at the programming simplicity of peer-to-peer stores. Next, I will demonstrate how GPS, a HW/SW memory management technique, employs selective page replication and proactive remote stores to improve read locality while conserving the interconnect bandwidth. Finally, I will discuss FinePack, a set of architectural enhancements to overcome the limitations of existing multi-GPU interconnects to perform small transfers efficiently.

Electronic reproduction. Ann Arbor, Mich. : ProQuest, 2023

Mode of access: World Wide Web

ISBN: 9798438776116

Subjects--Topical Terms:
621879 Computer engineering.

Subjects--Index Terms:
Fine-grained transfers

Index Terms--Genre/Form:
542853 Electronic books.

Improving Multi-GPU Strong Scaling through Optimization of Fine-Grained Transfers.
LDR 02693nmm a2200385K 4500
001 2354476
005 20230414084807.5
006 m o d
007 cr mn ---uuuuu
008 241011s2022 xx obm 000 0 eng d
020 $a 9798438776116
035 $a (MiAaPQ)AAI29274985
035 $a (MiAaPQ)umichrackham004116
035 $a AAI29274985
040 $a MiAaPQ $b eng $c MiAaPQ $d NTU
100 1 $a Muthukrishnan, Harini. $3 3694827
245 1 0 $a Improving Multi-GPU Strong Scaling through Optimization of Fine-Grained Transfers.
264 0 $c 2022
300 $a 1 online resource (139 pages)
336 $a text $b txt $2 rdacontent
337 $a computer $b c $2 rdamedia
338 $a online resource $b cr $2 rdacarrier
500 $a Source: Dissertations Abstracts International, Volume: 84-01, Section: B.
500 $a Advisor: Dreslinski, Ronald G.; Wenisch, Thomas F.
502 $a Thesis (Ph.D.)--University of Michigan, 2022.
504 $a Includes bibliographical references
520 $a Despite dramatic improvements in GPU and interconnect architectures, inter-GPU communication remains the most significant architectural bottleneck in multi-GPU systems. With hundreds of thousands of independent concurrently executing threads, maximizing interconnect utilization without degrading computational efficiency when strong-scaling HPC workloads is an open problem. In this dissertation, I will explore fine-grained peer-to-peer stores as the communication paradigm for improved multi-GPU strong scaling and propose three solutions to overcome the limitations of existing GPU and interconnect architectures to benefit from such transfers. First, I will detail PROACT, a joint compile and runtime system that transparently fine-tunes inter-GPU data movement for each application's needs, thus achieving the interconnect efficiency of bulk transfers at the programming simplicity of peer-to-peer stores. Next, I will demonstrate how GPS, a HW/SW memory management technique, employs selective page replication and proactive remote stores to improve read locality while conserving the interconnect bandwidth. Finally, I will discuss FinePack, a set of architectural enhancements to overcome the limitations of existing multi-GPU interconnects to perform small transfers efficiently.
533 $a Electronic reproduction. $b Ann Arbor, Mich. : $c ProQuest, $d 2023
538 $a Mode of access: World Wide Web
650 4 $a Computer engineering. $3 621879
650 4 $a Computer science. $3 523869
650 4 $a Multimedia communications. $3 590562
653 $a Fine-grained transfers
653 $a Graphics Processing Unit
653 $a Inter-GPU communication
655 7 $a Electronic books. $2 lcsh $3 542853
690 $a 0464
690 $a 0984
690 $a 0558
710 2 $a ProQuest Information and Learning Co. $3 783688
710 2 $a University of Michigan. $b Computer Science & Engineering. $3 3285590
773 0 $t Dissertations Abstracts International $g 84-01B.
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=29274985 $z click for full text (PQDT)