Universal outlier hypothesis testing with applications to anomaly detection.
Record type: Bibliographic record, electronic resource : Monograph/item
Title: Universal outlier hypothesis testing with applications to anomaly detection.
Author: Li, Yun.
Extent: 117 p.
Note: Source: Dissertation Abstracts International, Volume: 77-12(E), Section: B.
Contained by: Dissertation Abstracts International 77-12B(E).
Subject: Electrical engineering.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10151841
ISBN: 9781369065442
Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2015.
Outlier hypothesis testing is studied in a universal setting. Multiple sequences of observations are collected, a small subset (possibly empty) of which are outliers. A sequence is considered an outlier if the observations in that sequence are distributed according to an "outlier" distribution, distinct from the "typical" distribution governing the observations in the majority of the sequences. The outlier and typical distributions are not fully known, and they can be arbitrarily close. The goal is to design a universal test to best discern the outlier sequence(s). Both fixed sample size and sequential settings are considered in this dissertation. In the fixed sample size setting, for models with exactly one outlier, the generalized likelihood test is shown to be universally exponentially consistent. A single letter characterization of the error exponent achieved by such a test is derived, and it is shown that the test achieves the optimal error exponent asymptotically as the number of sequences goes to infinity. When the null hypothesis with no outlier is included, a modification of the generalized likelihood test is shown to achieve the same error exponent under each non-null hypothesis, and also consistency under the null hypothesis. Then, models with multiple outliers are considered. When the outliers can be distinctly distributed, in order to achieve exponential consistency, it is shown that it is essential that the number of outliers be known at the outset. For the setting with a known number of distinctly distributed outliers, the generalized likelihood test is shown to be universally exponentially consistent. The limiting error exponent achieved by such a test is characterized, and the test is shown to be asymptotically exponentially consistent. 
For the setting with an unknown number of identically distributed outliers, a modification of the generalized likelihood test is shown to achieve a positive error exponent under each non-null hypothesis, and consistency under the null hypothesis. In the sequential setting, a test with the flavor of the repeated significance test is proposed. The test is shown to be universally consistent, and universally exponentially consistent under non-null hypotheses. In addition, with the typical distribution being known, the test is shown to be asymptotically optimal universally when the number of outliers is the largest possible. In all cases, the asymptotic performance of the proposed test when none of the underlying distributions is known is shown to converge to that when only the typical distribution is known as the number of sequences goes to infinity. For models with continuous alphabets, a test with the same structure as the generalized likelihood test is proposed, and it is shown to be universally consistent. It is also demonstrated that there is a close connection between universal outlier hypothesis testing and cluster analysis. The performance of various proposed tests is evaluated against a synthetic data set, and contrasted with that of two popular clustering methods. Applied to a real data set for spam detection, the sequential test is shown to outperform the fixed sample size test when the lengths of the sequences exceed a certain value. In addition, the performance of the proposed tests is shown to be superior to that of another kernel-based test for large sample sizes.
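For the fixed sample size setting with exactly one outlier and a finite alphabet, the generalized likelihood test named in the abstract can be sketched as below. This record does not state the test statistic, so the form used here is an assumption: each sequence is held out in turn, and the held-out index that makes the remaining sequences look most alike (smallest summed KL divergence from their pooled empirical distribution) is declared the outlier. The function names `empirical_distributions`, `kl_divergence`, and `single_outlier_test` are hypothetical and chosen only for illustration.

```python
import numpy as np

def empirical_distributions(sequences, alphabet_size):
    """Return a matrix whose row i is the empirical distribution of sequence i.

    Assumes equal-length sequences of integer symbols in {0, ..., alphabet_size - 1}.
    """
    seqs = np.asarray(sequences)
    counts = np.stack([np.bincount(s, minlength=alphabet_size) for s in seqs])
    return counts / seqs.shape[1]

def kl_divergence(p, q):
    """D(p || q) in nats, using the convention 0 * log(0 / q) = 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def single_outlier_test(sequences, alphabet_size):
    """Illustrative single-outlier decision rule (assumed form, not from the record).

    For each candidate index i, pool the empirical distributions of all other
    sequences and sum their KL divergences from that pooled distribution.
    The candidate with the smallest sum is returned as the declared outlier.
    """
    types = empirical_distributions(sequences, alphabet_size)
    m = types.shape[0]
    if m < 3:
        raise ValueError("need at least three sequences")
    scores = np.empty(m)
    for i in range(m):
        rest = np.delete(types, i, axis=0)   # all sequences except candidate i
        pooled = rest.mean(axis=0)           # estimate of the typical distribution
        scores[i] = sum(kl_divergence(g, pooled) for g in rest)
    return int(np.argmin(scores)), scores
```

Neither the typical nor the outlier distribution is passed in, which is the sense in which such a rule is universal. The sketch does not cover the null hypothesis, multiple outliers, or the sequential setting.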
LDR :04306nmm a2200289 4500
001 2116185
005 20170417135056.5
008 180830s2015 ||||||||||||||||| ||eng d
020 $a 9781369065442
035 $a (MiAaPQ)AAI10151841
035 $a AAI10151841
040 $a MiAaPQ $c MiAaPQ
100 1 $a Li, Yun. $3 1262653
245 1 0 $a Universal outlier hypothesis testing with applications to anomaly detection.
300 $a 117 p.
500 $a Source: Dissertation Abstracts International, Volume: 77-12(E), Section: B.
500 $a Adviser: Venugopal Veeravalli.
502 $a Thesis (Ph.D.)--University of Illinois at Urbana-Champaign, 2015.
590 $a School code: 0090.
650 4 $a Electrical engineering. $3 649834
650 4 $a Statistics. $3 517247
650 4 $a Computer science. $3 523869
690 $a 0544
690 $a 0463
690 $a 0984
710 2 $a University of Illinois at Urbana-Champaign. $b Electrical and Computer Engineering. $3 3182633
773 0 $t Dissertation Abstracts International $g 77-12B(E).
790 $a 0090
791 $a Ph.D.
792 $a 2015
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10151841
Holdings (1 item)
Barcode: W9326805
Location: Electronic resources
Circulation category: 01.外借(書)_YB
Material type: E-book
Call number: EB
Use type: Normal
Loan status: On shelf
Holds: 0