Resource and Data Efficient Deep Learning.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Resource and Data Efficient Deep Learning./
Author:
Coleman, Cody Austun.
Published:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Description:
175 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
Contained By:
Dissertations Abstracts International 83-05B.
Subject:
Software.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28812872
ISBN:
9798494452610
Thesis (Ph.D.)--Stanford University, 2021.
This item must not be sold to any third party vendors.
Using massive computation, deep learning allows machines to translate large amounts of data into models that accurately predict the real world, enabling powerful applications like virtual assistants and autonomous vehicles. As datasets and computer systems have continued to grow in scale, so has the quality of machine learning models, creating an expensive appetite in practitioners and researchers for data and computation. To address this demand, this dissertation discusses ways to measure and improve both the computational and data efficiency of deep learning. First, we introduce DAWNBench and MLPerf as a systematic way to measure end-to-end machine learning system performance. Researchers have proposed numerous hardware, software, and algorithmic optimizations to improve the computational efficiency of deep learning. While some of these optimizations perform the same operations faster (e.g., increasing GPU clock speed), many others modify the semantics of the training procedure (e.g., reduced precision) and can even impact the final model's accuracy on unseen data. Because of these trade-offs between accuracy and computational efficiency, it has been difficult to compare and understand the impact of these optimizations. We propose and evaluate a new metric, time-to-accuracy, that can be used to compare different system designs, and use it to evaluate high-performing systems by organizing two public benchmark competitions, DAWNBench and MLPerf. MLPerf has now grown into an industry-standard benchmark co-organized by over 70 organizations. Second, we present ways to perform data selection on large-scale datasets efficiently. Data selection methods, such as active learning and core-set selection, improve the data efficiency of machine learning by identifying the most informative data points to label or train on. Across the data selection literature, there are many ways to identify these training examples.
However, classical data selection methods are prohibitively expensive to apply in deep learning because of the larger datasets and models. To make these methods tractable, we propose (1) "selection via proxy" (SVP) to avoid expensive training and reduce the computation per example and (2) "similarity search for efficient active learning and search" (SEALS) to reduce the number of examples processed. Both methods lead to order-of-magnitude performance improvements, making techniques like active learning on billions of unlabeled images practical for the first time.
ISBN: 9798494452610
Subjects--Topical Terms:
619355
Software.
Resource and Data Efficient Deep Learning.
LDR
:03514nmm a2200313 4500
001
2349871
005
20221010063644.5
008
241004s2021 ||||||||||||||||| ||eng d
020
$a
9798494452610
035
$a
(MiAaPQ)AAI28812872
035
$a
(MiAaPQ)STANFORDmy863wx9641
035
$a
AAI28812872
040
$a
MiAaPQ
$c
MiAaPQ
100
1
$a
Coleman, Cody Austun.
$3
3689295
245
1 0
$a
Resource and Data Efficient Deep Learning.
260
1
$a
Ann Arbor :
$b
ProQuest Dissertations & Theses,
$c
2021
300
$a
175 p.
500
$a
Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
500
$a
Advisor: Zaharia, Matei; Bailis, Peter; Li, Fei-Fei.
502
$a
Thesis (Ph.D.)--Stanford University, 2021.
506
$a
This item must not be sold to any third party vendors.
590
$a
School code: 0212.
650
4
$a
Software.
$2
gtt.
$3
619355
650
4
$a
Active learning.
$3
527777
650
4
$a
Deep learning.
$3
3554982
650
4
$a
Artificial intelligence.
$3
516317
650
4
$a
Computer science.
$3
523869
690
$a
0800
690
$a
0984
710
2
$a
Stanford University.
$3
754827
773
0
$t
Dissertations Abstracts International
$g
83-05B.
790
$a
0212
791
$a
Ph.D.
792
$a
2021
793
$a
English
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28812872
Items
Inventory Number: W9472309
Location Name: Electronic resources
Item Class: 11. Online viewing_V
Material type: E-book
Call number: EB
Usage Class: Normal
Loan Status: On shelf
No. of reservations: 0
Opac note:
Attachments: