National Dong Hwa University Library

A Novel Approach for Improving the Quality of Data Using Aggregation Mechanism.

Record type:	Bibliographic - Electronic resource : Monograph/item
Title/Author:	A Novel Approach for Improving the Quality of Data Using Aggregation Mechanism./
Author:	Al-khateeb, Shadi Ahmed.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2021
Extent:	102 p.
Notes:	Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:	Dissertations Abstracts International, 83-03B.
Subject:	Metadata.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28684742
ISBN:	9798544203049

Thesis (Ph.D.)--University of Pittsburgh, 2021.

This item must not be sold to any third party vendors.

With the advent of big data applications, it is becoming increasingly important to manage and analyze large volumes of data. However, it is not always possible to analyze very large volumes of detailed data efficiently. Data aggregation techniques therefore emerged as an efficient way to reduce data size while summarizing the key information in the original data. For example, yearly stock sales are used instead of daily sales to provide a general summary of the sales. Data aggregation groups raw data elements in order to facilitate the assessment of higher-level concepts. However, aggregation can lose important details of the original data, so it should be done in a way that keeps the data informative despite that loss. In some cases, only aggregated versions of the data are available, because of data collection constraints and the high storage and processing requirements of big data. In these cases, we need to find the relationship between aggregated datasets and original datasets. Data disaggregation is one solution to this issue. However, accurate disaggregation is not always possible or easy to apply. In this dissertation, we introduce a novel approach that makes data more informative without disaggregating it. We propose an information-preserving, signature-based preprocessing strategy, as well as an aggregation-based information retrieval architecture using signatures. We compensate for the loss of detail in the raw data by highlighting the most informative parts of the aggregated data. Our approach can be used to assess similarity and correspondence between datasets and to link aggregated historical data with the most closely related datasets. We extended our approach to time series datasets. We also created hybrid signatures that can be used at any aggregation level.
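
The yearly-versus-daily sales example in the abstract can be illustrated with a short sketch. This is not the dissertation's method: the function names, the sample figures, and in particular `toy_signature` are hypothetical, chosen only to show that two different daily series can collapse to the same yearly total, and that keeping a small descriptor next to the aggregate can still tell them apart.

```python
from collections import defaultdict
from datetime import date
from statistics import pstdev


def aggregate_by_year(daily_sales):
    """Sum (date, amount) records into one total per calendar year."""
    totals = defaultdict(float)
    for day, amount in daily_sales:
        totals[day.year] += amount
    return dict(totals)


def toy_signature(daily_sales):
    """Hypothetical descriptor per year: (total, max, population std).

    An illustrative stand-in only; the signatures proposed in the
    dissertation are not specified in this record.
    """
    by_year = defaultdict(list)
    for day, amount in daily_sales:
        by_year[day.year].append(amount)
    return {
        year: (sum(values), max(values), round(pstdev(values), 2))
        for year, values in by_year.items()
    }


# Two made-up daily series: one steady, one with a single spike.
steady = [(date(2020, 1, d), 100.0) for d in range(1, 5)]
spiky = [
    (date(2020, 1, 1), 10.0),
    (date(2020, 1, 2), 10.0),
    (date(2020, 1, 3), 10.0),
    (date(2020, 1, 4), 370.0),
]

steady_totals = aggregate_by_year(steady)
spiky_totals = aggregate_by_year(spiky)

# The yearly totals coincide, so the aggregate alone hides the difference.
print(steady_totals == spiky_totals)
# The extra descriptor differs between the two series.
print(toy_signature(steady) != toy_signature(spiky))
```

Summing by year is the simplest aggregation; the same pattern applies to other grouping keys (month, region) and other reducers (mean, count).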

ISBN: 9798544203049

Subjects--Topical Terms:

590006
Metadata.

LDR:03048nmm a2200325 4500
001 2344857
005 20220531062159.5
008 241004s2021 ||||||||||||||||| ||eng d
020 $a 9798544203049
035 $a (MiAaPQ)AAI28684742
035 $a (MiAaPQ)Pittsburgh41036
035 $a AAI28684742
040 $a MiAaPQ $c MiAaPQ
100 1 $a Al-khateeb, Shadi Ahmed. $3 3683682
245 1 0 $a A Novel Approach for Improving the Quality of Data Using Aggregation Mechanism.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2021
300 $a 102 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500 $a Advisor: Grant, John;Pelechrinis, Konstantinos;Munro, Paul;Zadorozhny, Vladimir.
502 $a Thesis (Ph.D.)--University of Pittsburgh, 2021.
506 $a This item must not be sold to any third party vendors.
590 $a School code: 0178.
650 4 $a Metadata. $3 590006
650 4 $a Deep learning. $3 3554982
650 4 $a Wavelet transforms. $3 3681479
650 4 $a Image retrieval. $3 3562846
650 4 $a Neural networks. $3 677449
650 4 $a Signal processing. $3 533904
650 4 $a Color. $3 533870
650 4 $a Databases. $3 747532
650 4 $a Decomposition. $3 3561186
650 4 $a Algorithms. $3 536374
650 4 $a Time series. $3 3561811
650 4 $a Semantics. $3 520060
650 4 $a Statistical analysis. $3 3543751
650 4 $a Linguistics. $3 524476
650 4 $a Language. $3 643551
650 4 $a Mathematics. $3 515831
650 4 $a Research. $3 531893
650 4 $a Accuracy. $3 3559958
650 4 $a Datasets. $3 3541416
650 4 $a Classification. $3 595585
650 4 $a Methods. $3 3560391
690 $a 0290
690 $a 0679
690 $a 0405
710 2 $a University of Pittsburgh. $3 958527
773 0 $t Dissertations Abstracts International $g 83-03B.
790 $a 0178
791 $a Ph.D.
792 $a 2021
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28684742