Estimation and Compression Over Large Alphabets.
Record type: Bibliographic - Electronic resource : Monograph/item
Title/Author: Estimation and Compression Over Large Alphabets. / Acharya, Jayadev.
Author: Acharya, Jayadev.
Description: 95 p.
Note: Source: Dissertation Abstracts International, Volume: 75-07(E), Section: B.
Contained by: Dissertation Abstracts International, 75-07B(E).
Subject: Engineering, Electronics and Electrical.
Electronic resource: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3615302
ISBN: 9781303812156
Acharya, Jayadev. Estimation and Compression Over Large Alphabets. - 95 p.
Source: Dissertation Abstracts International, Volume: 75-07(E), Section: B.
Thesis (Ph.D.)--University of California, San Diego, 2014.
This item must not be sold to any third party vendors.
ISBN: 9781303812156
Subjects--Topical Terms: Engineering, Electronics and Electrical.
LDR 04811nmm a2200337 4500
001 2057369
005 20150610074910.5
008 170521s2014 ||||||||||||||||| ||eng d
020 $a 9781303812156
035 $a (MiAaPQ)AAI3615302
035 $a AAI3615302
040 $a MiAaPQ $c MiAaPQ
100 1 $a Acharya, Jayadev. $3 3171202
245 1 0 $a Estimation and Compression Over Large Alphabets.
300 $a 95 p.
500 $a Source: Dissertation Abstracts International, Volume: 75-07(E), Section: B.
500 $a Adviser: Alon Orlitsky.
502 $a Thesis (Ph.D.)--University of California, San Diego, 2014.
506 $a This item must not be sold to any third party vendors.
520 $a Compression, estimation, and prediction are basic problems in information theory, statistics, and machine learning. These problems have been studied extensively in all three fields, though much of this work focuses on the asymptotic regime, in which the alphabet size is fixed and the length of the observed sequence grows. Compression of long i.i.d. sequences over a small alphabet has been studied extensively. Kieffer, Davisson, and others showed that no scheme can compress i.i.d. sequences over an infinite alphabet with vanishing redundancy. With the advent, over the past decade, of data with larger underlying alphabets, researchers have considered various methods and models for which efficient compression is possible.
520 $a We use redundancy, the number of bits used beyond the optimum (the entropy), as the performance metric. We consider three general models to address compression over large alphabets.
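As a concrete illustration of this metric (an illustrative sketch, not from the dissertation): for a distribution p over k symbols, a naive fixed-length code spends ceil(log2 k) bits per symbol, and its per-symbol redundancy is that cost minus the entropy of p:

```python
import math

def entropy(p):
    """Shannon entropy of a probability vector p, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def fixed_length_redundancy(p):
    """Extra bits per symbol paid by a naive fixed-length code
    (ceil(log2 k) bits for a k-symbol alphabet) relative to entropy(p)."""
    code_len = math.ceil(math.log2(len(p)))
    return code_len - entropy(p)

# A skewed 8-symbol distribution: the fixed-length code pays 3 bits/symbol,
# so the gap to the entropy is the redundancy.
p = [0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]
print(entropy(p), fixed_length_redundancy(p))
```

The function names here are hypothetical; the point is only that redundancy is always measured against the entropy baseline, and it vanishes exactly when the code matches the source (e.g., a fixed-length code on a uniform power-of-two alphabet).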
520 $a The first model considers sources with only a few modes. Most natural distributions over the integers consist of only a few modes, and mixtures of a few simple distributions also have this property. However, even the class M of all monotone distributions over N has infinite redundancy. In spite of this, Foster, Stine, and Wyner constructed encoding schemes with diminishing redundancy for any monotone distribution with finite entropy. We restrict our attention to M_k, the class of monotone distributions over an alphabet of size k. The precise redundancy of this class was characterized by Shamir for the range k = O(n), i.e., for alphabet sizes at most linear in the block length. We extend this characterization and show that as long as the underlying alphabet size is sub-exponential in the block length, monotone distributions can be compressed with diminishing per-symbol redundancy. We extend these results to distributions with a constant number of modes whose locations are unknown.
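To make the class M_k tangible (a hypothetical helper, not an algorithm from the dissertation): empirical frequencies from a monotone source need not themselves be monotone, and the standard pool-adjacent-violators algorithm projects them onto the non-increasing distributions over {1, ..., k}:

```python
def monotone_fit(freqs):
    """Pool-Adjacent-Violators: least-squares non-increasing fit to freqs.
    Projects empirical symbol frequencies onto the class of monotone
    (non-increasing) distributions over {1, ..., k}; total mass is preserved."""
    # Each block stores (total, count); merge while the non-increasing
    # order is violated, then expand each block to its average.
    blocks = []
    for f in freqs:
        total, count = f, 1
        while blocks and blocks[-1][0] / blocks[-1][1] < total / count:
            t, c = blocks.pop()
            total += t
            count += c
        blocks.append((total, count))
    out = []
    for total, count in blocks:
        out.extend([total / count] * count)
    return out
```

For example, the non-monotone empirical vector [0.5, 0.1, 0.2, 0.2] is smoothed into a valid member of M_4 with the same total mass.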
520 $a A second, elegant approach, proposed by Boucheron, Garivier, and Gassiat, considers distributions that are bounded by an envelope. They characterize the redundancy of such classes and, in particular, bound the redundancy of power-law and exponential envelopes. Bontemps, and later Bontemps, Boucheron, and Gassiat, consider the class of sub-exponential envelopes and characterize its redundancy precisely. However, these methods do not work for distributions with a heavy tail, e.g., power-law distributions. Poisson sampling is a widely used method to introduce independence among the numbers of symbol appearances, thereby simplifying the analysis of many algorithms. We show that for discrete distributions, the redundancy of Poisson-sampled sequences suffices to characterize the redundancy of fixed-length sequences. Using this, we derive simple bounds on the redundancy of envelope classes. We then demonstrate the efficacy of these bounds by proving tight bounds on the redundancy of power-law classes, answering an open question of Boucheron et al.
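A quick sketch of the Poisson sampling trick itself (an illustration, not the dissertation's construction): if the sequence length is drawn as N ~ Poisson(n) and the symbols are then drawn i.i.d. from p, the count of symbol i is Poisson(n * p_i), independently of all other counts:

```python
import math
import random

def poisson_sampled_counts(p, n, rng):
    """Draw a sequence of Poisson(n) length i.i.d. from p and return
    per-symbol counts; under Poisson sampling, counts[i] ~ Poisson(n * p[i])
    independently across symbols."""
    # Knuth's product-of-uniforms Poisson sampler (fine for modest n).
    threshold = math.exp(-n)
    length, prod = 0, rng.random()
    while prod > threshold:
        length += 1
        prod *= rng.random()
    # Draw each symbol from the categorical distribution p.
    counts = [0] * len(p)
    for _ in range(length):
        x = rng.random()
        acc = 0.0
        for i, pi in enumerate(p):
            acc += pi
            if x < acc:
                counts[i] += 1
                break
        else:
            counts[-1] += 1  # guard against floating-point rounding at the top
    return counts

print(poisson_sampled_counts([0.6, 0.3, 0.1], n=20, rng=random.Random(0)))
```

The payoff is analytical: with independent counts, the redundancy of a class factorizes over symbols, which is what makes the per-symbol bounds for envelope classes tractable.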
520 $a The third approach, proposed initially by Aberg, Shtarkov, and Smeets, and studied extensively by Orlitsky, Santhanam, Zhang, and Shamir, compresses the structure (called the pattern) and the dictionary of the sequence separately. In particular, they show that patterns can be compressed with redundancy that grows as O(n^(1/2)) in the block length n, independent of the underlying alphabet size. This problem can also be interpreted as the Good-Turing probability estimation problem under logarithmic loss. We develop several simple and useful tools for bounding the redundancy of distribution classes and combine them with Poisson sampling to show that the redundancy of patterns grows as O(n^(1/3)).
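To make "pattern" concrete (an illustrative helper, not code from the thesis): the pattern replaces each symbol by the order of its first appearance, discarding symbol identities entirely, which is why its redundancy cannot depend on the alphabet size:

```python
def pattern(seq):
    """Replace each symbol by the 1-based index of its first appearance.
    The output is alphabet-agnostic: any two sequences with the same
    repetition structure share the same pattern."""
    first_seen = {}
    out = []
    for s in seq:
        if s not in first_seen:
            first_seen[s] = len(first_seen) + 1
        out.append(first_seen[s])
    return out

print(pattern("abracadabra"))  # → [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```

For instance, "abracadabra" and "xyzxwxvxyzx" have identical patterns, so a pattern coder treats them as the same object; the dictionary (which symbol maps to which index) is what gets encoded separately.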
590 $a School code: 0033.
650 4 $a Engineering, Electronics and Electrical. $3 626636
650 4 $a Information Technology. $3 1030799
690 $a 0544
690 $a 0489
710 2 $a University of California, San Diego. $b Electrical Engineering (Communication Theory and Systems). $3 1018760
773 0 $t Dissertation Abstracts International $g 75-07B(E).
790 $a 0033
791 $a Ph.D.
792 $a 2014
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3615302
Holdings: 1 item
Barcode: W9289873 · Location: Electronic Resources · Circulation category: 11. Online reading_V · Material type: E-book · Call number: EB · Use type: Normal · Loan status: On shelf · Holds: 0