Optimizing Distributed Computing Systems via Machine Learning.
Record type:
Bibliographic - electronic resource : Monograph/item
Title / Author:
Optimizing Distributed Computing Systems via Machine Learning.
Author:
Wang, Hao.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2020
Pagination:
156 p.
Notes:
Source: Dissertations Abstracts International, Volume: 82-06, Section: B.
Contained by:
Dissertations Abstracts International, 82-06B.
Subject:
Computer engineering.
Electronic resource:
https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28031730
ISBN:
9798698546979
Wang, Hao.
Optimizing Distributed Computing Systems via Machine Learning.
- Ann Arbor : ProQuest Dissertations & Theses, 2020 - 156 p.
Source: Dissertations Abstracts International, Volume: 82-06, Section: B.
Thesis (Ph.D.)--University of Toronto (Canada), 2020.
This item must not be sold to any third party vendors.
The prosperity of Big Data owes much to advances in distributed computing systems, which make it possible to process massive volumes of data in parallel efficiently. As distributed architectures enable scalable, parallel computation across thousands of servers, critical challenges also arise: allocating resources, scheduling tasks, and optimizing computation algorithms across servers. The performance of distributed architectures is too dynamic and volatile for deterministic algorithms to address these challenges. In this dissertation, we apply machine learning techniques to optimize distributed systems. In particular, we contribute to two areas that drive the advance of distributed systems in resource provisioning, task scheduling, and architecture design.

Optimizing large-scale data analytics systems. Large-scale data analytics---processing data across geographically distributed datacenters---suffers from fluctuating resource provisioning and heterogeneous demands. Most existing solutions are based on simplified heuristics derived from prior experience, assuming that network bandwidth is the bottleneck. However, we find that the performance bottleneck alternates: disk I/O and memory can also become the bottleneck. We propose Lube, a new system that detects bottlenecks at runtime and minimizes job completion times by mitigating them. By examining the execution of analytic queries, we further show that existing query optimizers can choose the slowest execution plan because they are agnostic to bandwidth fluctuations. We therefore present Turbo, a system that dynamically optimizes execution plans in response to runtime resource variations.

Accelerating distributed machine learning systems. Traditional machine learning, which gathers training data into a server-based cluster, introduces non-trivial overhead and data privacy issues. To address these problems, we propose to refactor distributed machine learning systems with new architectures. In this dissertation, Siren enables distributed machine learning on serverless architectures and simplifies machine learning development and deployment by learning the optimal resource allocation for training. We then focus on optimizing federated learning, a new distributed machine learning paradigm that performs training locally on devices to preserve data privacy. However, the statistical issues posed by local data significantly slow down existing federated learning algorithms. We present Favor, a framework that optimizes federated learning by carefully selecting devices to mitigate these statistical issues.
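The abstract does not detail how Favor selects devices, so the sketch below is only a hypothetical illustration of the general idea behind statistical-issue-aware device selection: preferring clients whose local label distributions are least skewed, so the aggregated update better reflects the global data. The function names, the L1 skew metric, and the greedy top-k selection are all assumptions for illustration, not the dissertation's actual algorithm.

```python
from collections import Counter

def label_skew(labels, num_classes):
    """L1 distance between a device's empirical label distribution
    and the uniform distribution (0 = perfectly balanced data)."""
    counts = Counter(labels)
    total = len(labels)
    return sum(abs(counts.get(c, 0) / total - 1 / num_classes)
               for c in range(num_classes))

def select_devices(device_labels, num_classes, k):
    """Greedily pick the k devices whose local data are least skewed."""
    ranked = sorted(device_labels,
                    key=lambda d: label_skew(device_labels[d], num_classes))
    return ranked[:k]

devices = {
    "phone_a": [0, 0, 0, 0, 1],     # heavily skewed toward class 0
    "phone_b": [0, 1, 0, 1, 0, 1],  # only covers two of three classes
    "phone_c": [0, 0, 1, 1, 2, 2],  # covers all three classes evenly
}
print(select_devices(devices, num_classes=3, k=2))  # → ['phone_c', 'phone_b']
```

A real system would also have to weigh skew against device availability, compute power, and network conditions, which is presumably why a learned selection policy is attractive here.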
ISBN: 9798698546979
Subjects--Topical Terms:
Computer engineering.
Subjects--Index Terms:
Distributed Computing System
LDR    03792nmm a2200361 4500
001    2282823
005    20211022115952.5
008    220723s2020 ||||||||||||||||| ||eng d
020    $a 9798698546979
035    $a (MiAaPQ)AAI28031730
035    $a AAI28031730
040    $a MiAaPQ $c MiAaPQ
100 1  $a Wang, Hao. $3 1911290
245 10 $a Optimizing Distributed Computing Systems via Machine Learning.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2020
300    $a 156 p.
500    $a Source: Dissertations Abstracts International, Volume: 82-06, Section: B.
500    $a Advisor: Li, Baochun.
502    $a Thesis (Ph.D.)--University of Toronto (Canada), 2020.
506    $a This item must not be sold to any third party vendors.
520    $a
The prosperity of Big Data owes much to advances in distributed computing systems, which make it possible to process massive volumes of data in parallel efficiently. As distributed architectures enable scalable, parallel computation across thousands of servers, critical challenges also arise: allocating resources, scheduling tasks, and optimizing computation algorithms across servers. The performance of distributed architectures is too dynamic and volatile for deterministic algorithms to address these challenges. In this dissertation, we apply machine learning techniques to optimize distributed systems. In particular, we contribute to two areas that drive the advance of distributed systems in resource provisioning, task scheduling, and architecture design. Optimizing large-scale data analytics systems: large-scale data analytics---processing data across geographically distributed datacenters---suffers from fluctuating resource provisioning and heterogeneous demands. Most existing solutions are based on simplified heuristics derived from prior experience, assuming that network bandwidth is the bottleneck. However, we find that the performance bottleneck alternates: disk I/O and memory can also become the bottleneck. We propose Lube, a new system that detects bottlenecks at runtime and minimizes job completion times by mitigating them. By examining the execution of analytic queries, we further show that existing query optimizers can choose the slowest execution plan because they are agnostic to bandwidth fluctuations. We therefore present Turbo, a system that dynamically optimizes execution plans in response to runtime resource variations. Accelerating distributed machine learning systems: traditional machine learning, which gathers training data into a server-based cluster, introduces non-trivial overhead and data privacy issues. To address these problems, we propose to refactor distributed machine learning systems with new architectures. In this dissertation, Siren enables distributed machine learning on serverless architectures and simplifies machine learning development and deployment by learning the optimal resource allocation for training. We then focus on optimizing federated learning, a new distributed machine learning paradigm that performs training locally on devices to preserve data privacy. However, the statistical issues posed by local data significantly slow down existing federated learning algorithms. We present Favor, a framework that optimizes federated learning by carefully selecting devices to mitigate these statistical issues.
590    $a School code: 0779.
650  4 $a Computer engineering. $3 621879
650  4 $a Artificial intelligence. $3 516317
653    $a Distributed Computing System
653    $a Machine Learning
653    $a Artificial intelligence
653    $a Turbo
653    $a Siren
690    $a 0464
690    $a 0800
710 2  $a University of Toronto (Canada). $b Electrical and Computer Engineering. $3 2096349
773 0  $t Dissertations Abstracts International $g 82-06B.
790    $a 0779
791    $a Ph.D.
792    $a 2020
793    $a English
856 40 $u https://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28031730
Holdings (1 item):
Barcode: W9434556
Location: Electronic resources
Circulation category: 11. Online reading_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Hold status: 0