Dong Hwa University Library (東華大學圖書館)

Comparison of clustering algorithms and its application to document clustering.

Record type:	Bibliographic - electronic resource : Monograph/item
Title proper/Author:	Comparison of clustering algorithms and its application to document clustering./
Author:	Chen, Jie.
Extent:	214 p.
Notes:	Source: Dissertation Abstracts International, Volume: 66-03, Section: B, page: 1543.
Contained By:	Dissertation Abstracts International 66-03B.
Subject:	Computer Science. -
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3169789
ISBN:	0542059452

Comparison of clustering algorithms and its application to document clustering.
Chen, Jie.

Comparison of clustering algorithms and its application to document clustering. - 214 p.

Source: Dissertation Abstracts International, Volume: 66-03, Section: B, page: 1543.

Thesis (Ph.D.)--Princeton University, 2005.

We investigate the methodology for evaluating and comparing the quality of clustering algorithms. We study the issues raised in evaluation, such as data generation and choice of evaluation metrics. We give a head-to-head comparison of six important clustering algorithms from different research communities.

ISBN: 0542059452

Subjects--Topical Terms:

626642
Computer Science.

Comparison of clustering algorithms and its application to document clustering.
LDR:03388nmm 2200337 4500
001 1848446
005 20051202072552.5
008 130614s2005 eng d
020 $a 0542059452
035 $a (UnM)AAI3169789
035 $a AAI3169789
040 $a UnM $c UnM
100 1 $a Chen, Jie. $3 623762
245 1 0 $a Comparison of clustering algorithms and its application to document clustering.
300 $a 214 p.
500 $a Source: Dissertation Abstracts International, Volume: 66-03, Section: B, page: 1543.
500 $a Adviser: Andrea S. LaPaugh.
502 $a Thesis (Ph.D.)--Princeton University, 2005.
520 $a We investigate the methodology for evaluating and comparing the quality of clustering algorithms. We study the issues raised in evaluation, such as data generation and choice of evaluation metrics. We give a head-to-head comparison of six important clustering algorithms from different research communities.
520 $a We generate data using an extended mixture-of-Gaussians model that controls data characteristics such as the shape, volume, and size of a class. The added control facilitates more thorough exploration of the parameter space of the generative model and therefore makes the algorithmic evaluation more comprehensive.
520 $a We summarize datasets by their data characteristics. Conclusions from previous comparisons based on specific datasets are hard to apply to other datasets. In contrast, conclusions based on data characteristics can be easily generalized. Therefore, we can predict the performance of an algorithm. The prediction is a significant step forward for comparison studies.
520 $a We compare three evaluation indices from different research fields. We recommend the f-score measurement if a single evaluation index is used. However, it is more desirable to apply multiple indices and vote among them to determine the ranking of algorithms.
520 $a We isolate the objective function (sum-of-squares) and the optimization approach (greedy) used in the popular k-means algorithm. We investigate the effectiveness and efficiency of the four algorithms resulting from substituting another objective function (min-max cut) and another optimization approach (Kernighan-Lin). We illustrate by a quantitative study that, contrary to conventional wisdom, k-means is not only fast but also produces quality clusters. It achieves 95% of the best quality among the candidate algorithms while running an order of magnitude faster.
520 $a We present detailed studies of the algorithmic behavior in response to data characteristics. We give an alternative definition of cluster separation and show that the new definition measures the degree of cluster difficulty for spherical and ellipsoidal clusters more consistently.
520 $a We develop and systematically evaluate a practical version of a spectral clustering algorithm originally specified for provable guarantees of correctness. We observe that the modified algorithm can find the perfect solution when the clusters are well separated, where iterative algorithms such as k-means tend to miss the perfect solution.
590 $a School code: 0181.
650 4 $a Computer Science. $3 626642
690 $a 0984
710 2 0 $a Princeton University. $3 645579
773 0 $t Dissertation Abstracts International $g 66-03B.
790 1 0 $a LaPaugh, Andrea S., $e advisor
790 $a 0181
791 $a Ph.D.
792 $a 2005
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3169789
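
Illustrative note (not part of the catalog record): the abstract names a sum-of-squares objective and an f-score evaluation index but does not define either. The Python sketch below shows one common reading of each, as an assumption only: the within-cluster sum of squared Euclidean distances to the cluster mean, and a class-size-weighted best-match F-score. The function names and the exact F-score definition are hypothetical and may differ from those used in the dissertation.

```python
from collections import Counter


def sum_of_squares(points, labels):
    """Sum of squared Euclidean distances from each point to its cluster mean.

    Assumed form of the k-means objective; lower is better.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        mean = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - mean[d]) ** 2 for p in members for d in range(dim))
    return total


def clustering_f_score(true_classes, cluster_labels):
    """Class-size-weighted best-match F-score (one common definition, assumed here).

    For each true class, take the best F value over all clusters, then
    average those best values weighted by class size. Ranges from 0 to 1.
    """
    n = len(true_classes)
    class_sizes = Counter(true_classes)
    cluster_sizes = Counter(cluster_labels)
    overlap = Counter(zip(true_classes, cluster_labels))
    score = 0.0
    for cls, cls_size in class_sizes.items():
        best = 0.0
        for clu, clu_size in cluster_sizes.items():
            common = overlap[(cls, clu)]
            if common == 0:
                continue
            precision = common / clu_size
            recall = common / cls_size
            best = max(best, 2 * precision * recall / (precision + recall))
        score += (cls_size / n) * best
    return score
```

For example, with the four points (0, 0), (0, 2), (10, 0), (10, 2) and true classes a, a, b, b, the labeling [0, 0, 1, 1] gives a sum of squares of 4.0 and an F-score of 1.0, while the labeling [0, 1, 0, 1] gives 100.0 and 0.5. The first index needs only the data, whereas the second needs known class labels, which is one reason a comparison study might consult several indices.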