Creating Hardware Component Knowledge Bases from Pdf Datasheets.
Record type:
Bibliographic - electronic resource : Monograph/item
Title proper/Author:
Creating Hardware Component Knowledge Bases from Pdf Datasheets./
Author:
Hsiao, Luke.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021.
Pages/Volumes:
125 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
Contained By:
Dissertations Abstracts International 83-05B.
Subject:
Construction.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28688410
ISBN:
9798544204756
Thesis (Ph.D.)--Stanford University, 2021.
This item must not be sold to any third party vendors.
Hardware component databases are vital resources in designing electronics. These databases store information about hardware components that allows designers to find the components they need. However, creating detailed hardware databases requires hundreds of thousands of hours of manual data entry. As a result, existing databases are often proprietary, incomplete, and prone to sporadic human data-entry errors. Knowledge base construction (KBC) systems help automate the process of creating and populating structured databases and have been applied effectively to many domains. Knowledge base construction techniques reduce dependency on human input, making it faster, easier, and cheaper to build these databases.
This dissertation presents a machine-learning-based approach for creating hardware component databases directly from manufacturers' published component datasheets. Extracting data directly from datasheets is challenging for three reasons. First, the data is relational in nature; accurate interpretation relies on non-local context. Second, datasheets are filled with technical jargon. Third, datasheets are PDFs, a format that decouples visual locality from locality within the document. These challenges illuminate why human input is required, but human input is error-prone, time-consuming, and expensive. Instead of relying solely on human input, the approach in this dissertation, which combines a rich data model, weak supervision, data augmentation, and multi-task learning, presents a more automated alternative. When utilized effectively, these machine-learning techniques create large knowledge bases cheaply and in just days.
This dissertation consists of three parts. First, it presents Fonduer, a novel knowledge base construction system for richly formatted data based on a multimodal data model and weak supervision. It motivates Fonduer by studying the challenging properties of richly formatted data like the PDF datasheets electronics manufacturers use to publish component specifications. These insights lead to developing the building blocks necessary to enable automated information extraction from hardware datasheets. Fonduer is validated across various domains beyond only hardware datasheets by creating large knowledge bases in days.
Second, this dissertation shows how Fonduer can be used to build hardware component knowledge bases in practice. The multimodal information that Fonduer captures provides signals utilized in training data generation and the augmentation of deep learning models for multi-task learning. An evaluation of this approach on datasheets of three types of components achieves an average quality of 0.77 F1, a quality comparable to existing human-curated knowledge bases.
Third, this dissertation demonstrates the utility of Fonduer with end-to-end applications and empirical results applied to real-world use cases. Two end-to-end applications, the enhancement of product catalogs with thumbnail images and the analysis of electrical characteristics, demonstrate that hardware component knowledge bases created in days make hardware component selection easier.
Together, these results show three things. First, it is possible to automate the generation of hardware component knowledge bases. Second, these generated knowledge bases can be of higher quality than existing human-curated knowledge bases. Finally, these higher-quality knowledge bases open the door to innovative applications and tools for designing electronics.
ISBN: 9798544204756
Subjects--Topical Terms:
3561054
Construction.
LDR :04573nmm a2200349 4500
001 2344876
005 20220531062203.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798544204756
035 $a (MiAaPQ)AAI28688410
035 $a (MiAaPQ)STANFORDsf776wm9525
035 $a AAI28688410
040 $a MiAaPQ $c MiAaPQ
100 1 $a Hsiao, Luke. $3 3683706
245 1 0 $a Creating Hardware Component Knowledge Bases from Pdf Datasheets.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 125 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-05, Section: B.
500 $a Advisor: Levis, Philip; Winstein, Keith; Re, Chris.
502 $a Thesis (Ph.D.)--Stanford University, 2021.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0212.
650 4 $a Construction. $3 3561054
650 4 $a Software. $2 gtt. $3 619355
650 4 $a Search engines. $3 869493
650 4 $a Libraries. $3 525303
650 4 $a Web studies. $3 2122754
650 4 $a Civil engineering. $3 860360
650 4 $a Computer science. $3 523869
650 4 $a Engineering. $3 586835
650 4 $a Library science. $3 539284
690 $a 0646
690 $a 0543
690 $a 0984
690 $a 0537
690 $a 0399
710 2 $a Stanford University. $3 754827
773 0 $t Dissertations Abstracts International $g 83-05B.
790 $a 0212
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28688410
Holdings (1 item):
Barcode: W9467314
Location: Electronic resource
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Hold status: 0