Study of Parallel Programming Models on Computer Clusters with Accelerators.
Record type: Bibliographic - electronic resource : Monograph/item
Title: Study of Parallel Programming Models on Computer Clusters with Accelerators.
Author: Lai, Chenggang.
Extent: 47 p.
Note: Source: Masters Abstracts International, Volume: 54-02.
Contained by: Masters Abstracts International, 54-02(E).
Subject: Computer engineering.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1570873
ISBN: 9781321401288
Lai, Chenggang. Study of Parallel Programming Models on Computer Clusters with Accelerators. - 47 p.
Source: Masters Abstracts International, Volume: 54-02.
Thesis (M.S.)--University of Arkansas, 2015.
This item must not be sold to any third party vendors.
In order to reach exascale computing capability, accelerators have become a crucial part of developing supercomputers. This work examines the potential of two of the latest acceleration technologies, the Intel Many Integrated Core (MIC) architecture and Graphics Processing Units (GPUs). The thesis applies three benchmarks under three different configurations: MPI+CPU, MPI+GPU, and MPI+MIC. The benchmarks include an intensely communicating application, a loosely communicating application, and an embarrassingly parallel application. The thesis also carries out a detailed study of the scalability and performance of MIC processors under two programming models, the offload model and the native model, on the Beacon computer cluster.
ISBN: 9781321401288
Subjects--Topical Terms: Computer engineering.
LDR  03501nmm a2200289 4500
001  2063972
005  20151106144751.5
008  170521s2015 ||||||||||||||||| ||eng d
020  $a 9781321401288
035  $a (MiAaPQ)AAI1570873
035  $a AAI1570873
040  $a MiAaPQ $c MiAaPQ
100 1  $a Lai, Chenggang. $3 3178529
245 10  $a Study of Parallel Programming Models on Computer Clusters with Accelerators.
300  $a 47 p.
500  $a Source: Masters Abstracts International, Volume: 54-02.
500  $a Adviser: Miaoqing Huang.
502  $a Thesis (M.S.)--University of Arkansas, 2015.
506  $a This item must not be sold to any third party vendors.
520  $a In order to reach exascale computing capability, accelerators have become a crucial part of developing supercomputers. This work examines the potential of two of the latest acceleration technologies, the Intel Many Integrated Core (MIC) architecture and Graphics Processing Units (GPUs). The thesis applies three benchmarks under three different configurations: MPI+CPU, MPI+GPU, and MPI+MIC. The benchmarks include an intensely communicating application, a loosely communicating application, and an embarrassingly parallel application. The thesis also carries out a detailed study of the scalability and performance of MIC processors under two programming models, the offload model and the native model, on the Beacon computer cluster.
520  $a Across the benchmarks, the results demonstrate different performance and scalability for the GPU and the MIC. (1) For the embarrassingly parallel case, the GPU-based parallel implementation on the Keeneland computer cluster performs better than the other accelerators, but the MIC-based implementation shows better scalability than the GPU implementation. The performance of the native model and the offload model on the MIC is very close. (2) For the loosely communicating case, the performance on the GPU and the MIC is very close, and the MIC-based parallel implementation still demonstrates strong scalability when using 120 MIC processors. (3) For the intensely communicating case, the MPI implementations on CPUs and GPUs both show strong scalability, and GPUs consistently outperform the other accelerators; the MIC-based implementation, however, does not scale well. The behavior of the two MIC models also differs from the embarrassingly parallel case: the native model consistently outperforms the offload model by roughly 10 times, and allocating more MIC processors yields little performance gain, because the increased communication cost offsets the gain from the reduced workload on each MIC core. This work also tests performance and scalability by varying the number of threads on each MIC card from 10 to 60. Under the intensely communicating case, the offload model shows different capabilities at different thread counts: scalability holds as the number of threads increases from 10 to 30, the computation time decreases at a smaller rate from 30 to 50 threads, and at 60 threads the computation time increases, because the communication overhead offsets the performance gain when 60 threads are deployed on a single MIC card.
590  $a School code: 0011.
650  4  $a Computer engineering. $3 621879
690  $a 0464
710 2  $a University of Arkansas. $b Computer Engineering. $3 1672578
773 0  $t Masters Abstracts International $g 54-02(E).
790  $a 0011
791  $a M.S.
792  $a 2015
793  $a English
856 40  $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=1570873
Holdings (1 item):
Barcode: W9296630
Location: Electronic Resources
Circulation category: 11.線上閱覽_V (online reading)
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Reservation status: 0
Remarks: (none)
Attachments: (none)