National Dong Hwa University Library



Probabilistic query models for transaction data.

Record type:	Bibliographic - language material, printed : Monograph/item
Title proper/Author:	Probabilistic query models for transaction data./
Author:	Pavlov, Dmitry Yurievich.
Extent:	171 p.
Note:	Chair: Padhraic Smyth.
Contained By:	Dissertation Abstracts International 63-01B
Subject:	Computer Science
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3039221
ISBN:	0493524673

Thesis (Ph.D.)--University of California, Irvine, 2002.

Interactive querying of massive data sets is an increasingly important application in fields such as interactive data mining, exploratory data analysis, prediction of customer behaviour, and optimization of database management systems (DBMS). Formally, the task can be stated as follows: given a binary transaction data set and a query on a subset of its attributes, find (predict) the probability that a randomly selected data record satisfies the query. The straightforward solution of scanning the data directly does not scale to typical transaction data sets, which are huge in a data mining sense. The independence model is fast, memory efficient, and one of the most popular solutions in commercial DBMS, but it has long been criticized for often being quite inaccurate. Other techniques proposed in the literature suffer from the curse of dimensionality and are practical only for low-dimensional data sets.
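To make the task statement concrete, the following is a minimal Python sketch, not code from the dissertation. It contrasts the direct scan with the independence model on a toy data set. The function names (`exact_query_probability`, `attribute_marginals`, `independence_estimate`), the representation of a query as a dictionary from attribute index to required value, and the toy records are all assumptions made for illustration.

```python
def exact_query_probability(records, query):
    """Direct scan: fraction of records matching every (attribute, value) pair."""
    matches = sum(
        all(record[attr] == value for attr, value in query.items())
        for record in records
    )
    return matches / len(records)


def attribute_marginals(records):
    """One pass over the data: P(attribute = 1) for each attribute."""
    n = len(records)
    return [sum(column) / n for column in zip(*records)]


def independence_estimate(marginals, query):
    """Independence model: multiply per-attribute marginals, ignoring correlations."""
    estimate = 1.0
    for attr, value in query.items():
        p_one = marginals[attr]
        estimate *= p_one if value == 1 else 1.0 - p_one
    return estimate


# Hypothetical toy data: attributes 0 and 1 always occur together.
records = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
query = {0: 1, 1: 1}  # "attribute 0 is 1 AND attribute 1 is 1"

exact = exact_query_probability(records, query)
approx = independence_estimate(attribute_marginals(records), query)
print(exact, approx)  # 0.5 0.25
```

In this sketch the independence model only needs the stored marginals at query time, which is why it is fast and memory efficient, but it underestimates the query probability by half because the two attributes are perfectly correlated.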

ISBN: 0493524673

Subjects--Topical Terms:

890869
Computer Science

Probabilistic query models for transaction data.
LDR:03434nam 2200289 a 45
001 936588
005 20110510
008 110510s2002 eng d
020 $a 0493524673
035 $a (UnM)AAI3039221
035 $a AAI3039221
040 $a UnM $c UnM
100 1 $a Pavlov, Dmitry Yurievich. $3 1260302
245 1 0 $a Probabilistic query models for transaction data.
300 $a 171 p.
500 $a Chair: Padhraic Smyth.
500 $a Source: Dissertation Abstracts International, Volume: 63-01, Section: B, page: 0358.
502 $a Thesis (Ph.D.)--University of California, Irvine, 2002.
520 $a Interactive querying of massive data sets is an increasingly important application in fields such as interactive data mining, exploratory data analysis, prediction of customer behaviour, and optimization of database management systems (DBMS). Formally, the task can be stated as follows: given a binary transaction data set and a query on a subset of its attributes, find (predict) the probability that a randomly selected data record satisfies the query. The straightforward solution of scanning the data directly does not scale to typical transaction data sets, which are huge in a data mining sense. The independence model is fast, memory efficient, and one of the most popular solutions in commercial DBMS, but it has long been criticized for often being quite inaccurate. Other techniques proposed in the literature suffer from the curse of dimensionality and are practical only for low-dimensional data sets.
520 $a In this dissertation, we develop a number of novel probabilistic approaches to the querying problem. Unlike previously proposed methods, our techniques satisfy the following requirements: (a) they work on high-dimensional binary transaction data, and (b) they allow the user to effectively trade accuracy in the estimates for query execution time and memory taken by the model. We empirically explore the tradeoffs in accuracy, time, and memory offered by these models and establish that the choice of the best-performing model for any particular querying situation depends on a variety of factors, such as the density of the data and the distribution of user queries. We illustrate that no single model universally dominates.
520 $a This naturally leads to the question of how to automatically determine which approximation model is optimal for any given situation. To resolve this question, we leverage recent results in the machine learning and statistics literature showing that combinations of models can outperform any single model for prediction tasks. Specifically, for query approximation, we propose a data-adaptive technique for combining probabilistic query approximation models. We demonstrate that on real-world and simulated data sets the combined model can reduce the error of any single model by up to 50%. Furthermore, we demonstrate how time and memory constraints can be easily incorporated into the model-combining algorithm. Adapting the model is a straightforward and scalable optimization problem that can be completely automated, providing a practical and query-adaptive framework for online query approximation.
590 $a School code: 0030
650 $a Computer Science $3 890869
690 $a 098
710 2 $a University of California, Irvine $3 1260303
773 0 $t Dissertation Abstracts International $g 63-01B
790 $a 003
790 1 $a Smyth, Padhraic, $e advisor
791 $a Ph.D.
792 $a 200
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3039221