An Information Based Optimal Subdata Selection Algorithm for Big Data Linear Regression and a Suitable Variable Selection Algorithm.
Record type: Bibliographic, electronic resource : Monograph/item
Title/Author: An Information Based Optimal Subdata Selection Algorithm for Big Data Linear Regression and a Suitable Variable Selection Algorithm.
Author: Zheng, Yi.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2017
Extent: 46 p.
Note: Source: Masters Abstracts International, Volume: 56-04.
Contained by: Masters Abstracts International 56-04(E).
Subject: Statistics.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10275712
ISBN: 9781369767995
LDR    02378nmm a2200289 4500
001    2125389
005    20171106112414.5
008    180830s2017 ||||||||||||||||| ||eng d
020    $a 9781369767995
035    $a (MiAaPQ)AAI10275712
035    $a AAI10275712
040    $a MiAaPQ $c MiAaPQ
100 1  $a Zheng, Yi. $3 1034006
245 13 $a An Information Based Optimal Subdata Selection Algorithm for Big Data Linear Regression and a Suitable Variable Selection Algorithm.
260 1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2017
300    $a 46 p.
500    $a Source: Masters Abstracts International, Volume: 56-04.
500    $a Adviser: John Stufken.
502    $a Thesis (M.S.)--Arizona State University, 2017.
520    $a This thesis proposes a new information-based optimal subdata selection (IBOSS) algorithm, the Squared Scaled Distance Algorithm (SSDA). It is based on the invariance of the determinant of the information matrix under orthogonal transformations, especially rotations. Extensive simulation results show that the new IBOSS algorithm retains the desirable asymptotic properties of IBOSS and yields a larger determinant of the subdata information matrix. It has the same order of time complexity as the D-optimal IBOSS algorithm; however, it exploits vectorized calculation, avoiding explicit for-loops, and is approximately 6 times as fast as the D-optimal IBOSS algorithm in R. The robustness of SSDA is studied from three aspects: nonorthogonality, inclusion of interaction terms, and variable misspecification. A new, accurate variable selection algorithm is proposed to support the implementation of IBOSS algorithms when a large number of variables is present with only a few important variables among them. By aggregating random-subsample results, this variable selection algorithm is much more accurate than the LASSO method applied to the full data. Since its time complexity depends only on the number of variables, it is also computationally efficient when the number of variables is fixed as n increases and is not massively large. More importantly, by using subsamples it solves the problem that the full data cannot be stored in memory when a data set is too large.
590    $a School code: 0010.
650  4 $a Statistics. $3 517247
650  4 $a Computer science. $3 523869
690    $a 0463
690    $a 0984
710 2  $a Arizona State University. $b Statistics. $3 3184918
773 0  $t Masters Abstracts International $g 56-04(E).
790    $a 0010
791    $a M.S.
792    $a 2017
793    $a English
856 40 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10275712
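The abstract's central idea, selecting the subdata rows whose squared scaled distance from the data center is largest, can be sketched in a few vectorized lines. This is a minimal illustration based on one plausible reading of the abstract, not the thesis's actual code; the function name `ssda_subdata` and all parameter choices are assumptions.

```python
import numpy as np

def ssda_subdata(X, k):
    """Pick the k rows of X with the largest squared scaled distance
    from the column means, in one vectorized pass (no per-column loop)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # scale each covariate
    d2 = np.sum(Z * Z, axis=1)                # squared scaled distances
    return np.argpartition(d2, -k)[-k:]       # indices of the k largest

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 5))             # toy big-data design matrix
k = 1_000
idx = ssda_subdata(X, k)

# Extreme points should enlarge the information-matrix determinant
# relative to a uniform random subsample of the same size.
sub = X[idx]
rand = X[rng.choice(len(X), k, replace=False)]
det_ssda = np.linalg.det(sub.T @ sub)
det_rand = np.linalg.det(rand.T @ rand)
```

Because `np.argpartition` runs in linear time and every step is vectorized, the sketch mirrors the abstract's claims of D-optimal-order time complexity without explicit for-loops; on a design like this it should also produce a visibly larger determinant than a uniform random subsample of the same size.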
Holdings
Barcode: W9336001
Location: Electronic resources
Circulation category: 01.外借(書)_YB
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0