National Dong Hwa University Library

Resource and Data Efficient Deep Learning.

Record type:	Bibliographic - electronic resource : Monograph/item
Title/Author:	Resource and Data Efficient Deep Learning./
Author:	Coleman, Cody Austun.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:	175 p.
Note:	Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
Contained By:	Dissertations Abstracts International 83-05B.
Subject:	Software.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28812872
ISBN:	9798494452610

Resource and Data Efficient Deep Learning.
Coleman, Cody Austun.

Resource and Data Efficient Deep Learning. - Ann Arbor : ProQuest Dissertations & Theses, 2021 - 175 p.

Source: Dissertations Abstracts International, Volume: 83-05, Section: B.

Thesis (Ph.D.)--Stanford University, 2021.

This item must not be sold to any third party vendors.

Using massive computation, deep learning allows machines to translate large amounts of data into models that accurately predict the real world, enabling powerful applications like virtual assistants and autonomous vehicles. As datasets and computer systems have continued to grow in scale, so has the quality of machine learning models, creating an expensive appetite in practitioners and researchers for data and computation. To address this demand, this dissertation discusses ways to measure and improve both the computational and data efficiency of deep learning. First, we introduce DAWNBench and MLPerf as a systematic way to measure end-to-end machine learning system performance. Researchers have proposed numerous hardware, software, and algorithmic optimizations to improve the computational efficiency of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision) and can even impact the final model's accuracy on unseen data. Because of these trade-offs between accuracy and computational efficiency, it has been difficult to compare and understand the impact of these optimizations. We propose and evaluate a new metric, time-to-accuracy, that can be used to compare different system designs and use it to evaluate high-performing systems by organizing two public benchmark competitions, DAWNBench and MLPerf. MLPerf has now grown into an industry-standard benchmark co-organized by over 70 organizations. Second, we present ways to perform data selection on large-scale datasets efficiently. Data selection methods, such as active learning and core-set selection, improve the data efficiency of machine learning by identifying the most informative data points to label or train on. Across the data selection literature, there are many ways to identify these training examples.
However, classical data selection methods are prohibitively expensive to apply in deep learning because of the larger datasets and models. To make these methods tractable, we propose (1) "selection via proxy" (SVP) to avoid expensive training and reduce the computation per example and (2) "similarity search for efficient active learning and search" (SEALS) to reduce the number of examples processed. Both methods lead to order-of-magnitude performance improvements, making techniques like active learning on billions of unlabeled images practical for the first time.
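
The time-to-accuracy metric named in the abstract can be illustrated with a minimal sketch. This is not the DAWNBench or MLPerf implementation: the function name `time_to_accuracy`, the checkpoint format (elapsed seconds paired with validation accuracy), and the two toy training logs are all hypothetical, chosen only to show how a single number can compare systems that trade accuracy against speed.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical checkpoint format: (elapsed_seconds, validation_accuracy).
Checkpoint = Tuple[float, float]


def time_to_accuracy(checkpoints: Iterable[Checkpoint], target: float) -> Optional[float]:
    """Return the earliest elapsed time at which accuracy reaches `target`.

    Returns None if the run never reaches the target accuracy.
    """
    for elapsed, accuracy in sorted(checkpoints):
        if accuracy >= target:
            return elapsed
    return None


# Toy, made-up logs for illustration only (not measured results).
full_precision_run = [(600.0, 0.80), (1200.0, 0.91), (1800.0, 0.94)]
reduced_precision_run = [(400.0, 0.78), (800.0, 0.90), (1200.0, 0.92), (1600.0, 0.93)]

target_accuracy = 0.93
print(time_to_accuracy(full_precision_run, target_accuracy))
print(time_to_accuracy(reduced_precision_run, target_accuracy))
```

Fixing the accuracy target first and then measuring time means an optimization that speeds up each step but degrades the final model is only credited if the run still reaches the target.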

ISBN: 9798494452610
Subjects--Topical Terms:

619355
Software.

Resource and Data Efficient Deep Learning.
LDR:03514nmm a2200313 4500
001 2349871
005 20221010063644.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798494452610
035 $a (MiAaPQ)AAI28812872
035 $a (MiAaPQ)STANFORDmy863wx9641
035 $a AAI28812872
040 $a MiAaPQ $c MiAaPQ
100 1 $a Coleman, Cody Austun. $3 3689295
245 1 0 $a Resource and Data Efficient Deep Learning.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 175 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
500 $a Advisor: Zaharia, Matei;Bailis, Peter;Li, Fei-Fei.
502 $a Thesis (Ph.D.)--Stanford University, 2021.
506 $a This item must not be sold to any third party vendors.
520 $a Using massive computation, deep learning allows machines to translate large amounts of data into models that accurately predict the real world, enabling powerful applications like virtual assistants and autonomous vehicles. As datasets and computer systems have continued to grow in scale, so has the quality of machine learning models, creating an expensive appetite in practitioners and researchers for data and computation. To address this demand, this dissertation discusses ways to measure and improve both the computational and data efficiency of deep learning. First, we introduce DAWNBench and MLPerf as a systematic way to measure end-to-end machine learning system performance. Researchers have proposed numerous hardware, software, and algorithmic optimizations to improve the computational efficiency of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision) and can even impact the final model's accuracy on unseen data. Because of these trade-offs between accuracy and computational efficiency, it has been difficult to compare and understand the impact of these optimizations. We propose and evaluate a new metric, time-to-accuracy, that can be used to compare different system designs and use it to evaluate high-performing systems by organizing two public benchmark competitions, DAWNBench and MLPerf. MLPerf has now grown into an industry-standard benchmark co-organized by over 70 organizations. Second, we present ways to perform data selection on large-scale datasets efficiently. Data selection methods, such as active learning and core-set selection, improve the data efficiency of machine learning by identifying the most informative data points to label or train on. Across the data selection literature, there are many ways to identify these training examples.
However, classical data selection methods are prohibitively expensive to apply in deep learning because of the larger datasets and models. To make these methods tractable, we propose (1) "selection via proxy" (SVP) to avoid expensive training and reduce the computation per example and (2) "similarity search for efficient active learning and search" (SEALS) to reduce the number of examples processed. Both methods lead to order-of-magnitude performance improvements, making techniques like active learning on billions of unlabeled images practical for the first time.
590 $a School code: 0212.
650 4 $a Software. $2 gtt. $3 619355
650 4 $a Active learning. $3 527777
650 4 $a Deep learning. $3 3554982
650 4 $a Artificial intelligence. $3 516317
650 4 $a Computer science. $3 523869
690 $a 0800
690 $a 0984
710 2 $a Stanford University. $3 754827
773 0 $t Dissertations Abstracts International $g 83-05B.
790 $a 0212
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28812872