National Dong Hwa University Library


Developing and Deploying Data Mining Techniques in Healthcare.

Record type:	Bibliographic record - electronic resource : Monograph/item
Title/Author:	Developing and Deploying Data Mining Techniques in Healthcare./
Author:	Piri, Saeed.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2017
Extent:	125 p.
Notes:	Source: Dissertations Abstracts International, Volume: 79-10, Section: B.
Contained By:	Dissertations Abstracts International 79-10B.
Subject:	Industrial engineering.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10601600
ISBN:	9780355853629

Developing and Deploying Data Mining Techniques in Healthcare.
Piri, Saeed.

Developing and Deploying Data Mining Techniques in Healthcare. - Ann Arbor : ProQuest Dissertations & Theses, 2017 - 125 p.

Source: Dissertations Abstracts International, Volume: 79-10, Section: B.

Thesis (Ph.D.)--Oklahoma State University, 2017.

This item must not be sold to any third party vendors.

Improving healthcare is a top priority for all nations. US healthcare expenditure was $3 trillion in 2014, and in the same year healthcare accounted for 17.5% of GDP. These statistics show the importance of improving the healthcare delivery system. In this research, we developed several data mining methods and algorithms to address healthcare problems. These methods can also be applied to problems in other domains. The first part of this dissertation concerns the rare item problem in association analysis, that is, the discovery of rare rules, which include rare items. In this study, we introduced a novel assessment metric, called adjusted_support, to address this problem. By applying this metric, we can retrieve rare rules without over-generating association rules. We applied this method to perform association analysis on complications of diabetes. The second part of this dissertation develops a clinical decision support system for predicting retinopathy. Retinopathy is the leading cause of vision loss among American adults. In this research, we analyzed data from more than 1.4 million diabetic patients and developed four sets of predictive models: basic, comorbid, over-sampled, and ensemble models. The results show that incorporating comorbidity data and oversampling improved prediction accuracy. In addition, we developed a novel "confidence margin" ensemble approach that outperformed the existing ensemble models. We also addressed the issue of ties in voting-based ensemble models by comparing the confidence margins of the base predictors. The third part of this dissertation addresses the problem of imbalanced data learning, which is a major challenge in machine learning. While a standard machine learning technique can perform well on balanced datasets, its performance deteriorates dramatically when applied to imbalanced datasets. This poor performance is especially troublesome in detecting the minority class, which is usually the class of interest. In this study, we proposed a synthetic informative minority over-sampling (SIMO) algorithm embedded into a support vector machine. We applied SIMO to 15 publicly available benchmark datasets and compared its performance with seven existing approaches. The results showed that SIMO outperformed all existing approaches.
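
The abstract mentions resolving ties in voting-based ensembles by comparing the confidence margins of the base predictors. The sketch below illustrates that general idea only and is not taken from the dissertation. It assumes each base predictor outputs class probabilities, and it assumes the "confidence margin" is the gap between a predictor's two highest class probabilities. The function names `confidence_margin` and `vote_with_margin_tiebreak` are hypothetical.

```python
from collections import Counter


def confidence_margin(probs):
    """Gap between the two highest class probabilities.

    This definition is an assumption made for illustration; the
    dissertation's own definition may differ.
    """
    top = sorted(probs.values(), reverse=True)
    return top[0] - (top[1] if len(top) > 1 else 0.0)


def vote_with_margin_tiebreak(predictions):
    """Majority vote over base predictors, breaking ties by margin.

    predictions: list of dicts, one per base predictor, each mapping
    a class label to that predictor's probability for the label.
    """
    # Each base predictor votes for its most probable class.
    votes = [max(p, key=p.get) for p in predictions]
    counts = Counter(votes)
    best = max(counts.values())
    tied = [label for label, n in counts.items() if n == best]
    if len(tied) == 1:
        return tied[0]
    # Tie: for each tied class, take the largest margin among the
    # predictors that voted for it, then pick the class with the
    # largest such margin.
    strongest = {
        label: max(
            confidence_margin(p)
            for p, v in zip(predictions, votes)
            if v == label
        )
        for label in tied
    }
    return max(strongest, key=strongest.get)


if __name__ == "__main__":
    # Two predictors disagree; the first is far more confident.
    preds = [
        {"retinopathy": 0.90, "none": 0.10},
        {"retinopathy": 0.45, "none": 0.55},
    ]
    print(vote_with_margin_tiebreak(preds))
```

With an even number of base predictors a plain majority vote can split evenly; in this sketch the class backed by the most confident predictor wins the tie.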

ISBN: 9780355853629
Subjects--Topical Terms:

526216
Industrial engineering.

Developing and Deploying Data Mining Techniques in Healthcare.
LDR:03494nmm a2200337 4500
001 2207491
005 20190920131238.5
008 201008s2017 ||||||||||||||||| ||eng d
020 $a 9780355853629
035 $a (MiAaPQ)AAI10601600
035 $a (MiAaPQ)okstate:15272
035 $a AAI10601600
040 $a MiAaPQ $c MiAaPQ
100 1 $a Piri, Saeed. $3 3434481
245 1 0 $a Developing and Deploying Data Mining Techniques in Healthcare.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2017
300 $a 125 p.
500 $a Source: Dissertations Abstracts International, Volume: 79-10, Section: B.
500 $a Publisher info.: Dissertation/Thesis.
500 $a Advisor: Liu, Tieming.
502 $a Thesis (Ph.D.)--Oklahoma State University, 2017.
506 $a This item must not be sold to any third party vendors.
520 $a Improving healthcare is a top priority for all nations. US healthcare expenditure was $3 trillion in 2014, and in the same year healthcare accounted for 17.5% of GDP. These statistics show the importance of improving the healthcare delivery system. In this research, we developed several data mining methods and algorithms to address healthcare problems. These methods can also be applied to problems in other domains. The first part of this dissertation concerns the rare item problem in association analysis, that is, the discovery of rare rules, which include rare items. In this study, we introduced a novel assessment metric, called adjusted_support, to address this problem. By applying this metric, we can retrieve rare rules without over-generating association rules. We applied this method to perform association analysis on complications of diabetes. The second part of this dissertation develops a clinical decision support system for predicting retinopathy. Retinopathy is the leading cause of vision loss among American adults. In this research, we analyzed data from more than 1.4 million diabetic patients and developed four sets of predictive models: basic, comorbid, over-sampled, and ensemble models. The results show that incorporating comorbidity data and oversampling improved prediction accuracy. In addition, we developed a novel "confidence margin" ensemble approach that outperformed the existing ensemble models. We also addressed the issue of ties in voting-based ensemble models by comparing the confidence margins of the base predictors. The third part of this dissertation addresses the problem of imbalanced data learning, which is a major challenge in machine learning. While a standard machine learning technique can perform well on balanced datasets, its performance deteriorates dramatically when applied to imbalanced datasets. This poor performance is especially troublesome in detecting the minority class, which is usually the class of interest. In this study, we proposed a synthetic informative minority over-sampling (SIMO) algorithm embedded into a support vector machine. We applied SIMO to 15 publicly available benchmark datasets and compared its performance with seven existing approaches. The results showed that SIMO outperformed all existing approaches.
590 $a School code: 0664.
650 4 $a Industrial engineering. $3 526216
650 4 $a Health care management. $3 2122906
650 4 $a Systems science. $3 3168411
690 $a 0546
690 $a 0769
690 $a 0790
710 2 $a Oklahoma State University. $b Industrial Engineering & Management. $3 1065143
773 0 $t Dissertations Abstracts International $g 79-10B.
790 $a 0664
791 $a Ph.D.
792 $a 2017
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10601600