Faster Convolutional Neural Networks Training.
Record type:
Bibliographic record - Electronic resource : Monograph/item
Title / Author:
Faster Convolutional Neural Networks Training.
Author:
Jiang, Shanshan.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2021
Pagination:
90 p.
Notes:
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Contained By:
Dissertations Abstracts International, 83-03B.
Subject:
Computer science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28713771
ISBN:
9798535566931
Faster Convolutional Neural Networks Training. / Jiang, Shanshan.
- Ann Arbor : ProQuest Dissertations & Theses, 2021. - 90 p.
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
Thesis (Ph.D.)--The University of North Carolina at Charlotte, 2021.
This item must not be sold to any third party vendors.
Convolutional Neural Network (CNN) models have become the mainstream method in Artificial Intelligence (AI) for computer vision tasks such as image classification and image segmentation. Deep CNNs involve a large volume of convolution computation, so training a CNN requires powerful GPU resources. Training a large CNN may take days or even weeks, which is time-consuming and costly. When multiple runs are needed to search for the optimal CNN hyperparameter settings, the search can take a couple of months on limited GPUs, which is unacceptable and hinders the development of CNNs. It is therefore essential to train CNNs faster.
There are two kinds of methods to train CNNs faster when no additional computing resources are available. The first is model compression, either by reducing parameters or by using less storage to represent the models; this reduces training time by reducing the architecture complexity. The second is to reduce the input data fed into the network without changing the network architecture.
Architecture complexity reduction is a popular research area for faster CNN training. Nowadays, mobile devices such as smartphones and smart cars rely on deep CNNs to accomplish complex tasks such as human body recognition and face recognition. Due to the high real-time demands and the memory constraints of mobile device applications, conventional large CNNs are not suitable, and CNN model compression has become a trend for training deep CNN models at lower computation cost. Many successful networks have been designed to address this problem, such as ResNeXt, MobileNet, ShuffleNet, and GhostNet; they use 1x1 convolution, depthwise convolution, or group convolution in place of standard convolution to reduce computation. However, fewer studies address the following questions. First, does the variety of convolution layers (whether the output channel number is larger or smaller than the input channel number) affect the performance of different compression strategies? Second, does the expansion ratio of a convolution layer (the output channel number over the input channel number when the output channel number is larger, or the input channel number over the output channel number when the input channel number is larger) affect the performance of different compression strategies? Third, does the compression ratio (the reduced parameter number or FLOPs over the original parameter number or FLOPs) affect the performance of different compression strategies? Current networks tend to use the same convolution strategy inside a basic network block, ignoring the variety of network layers. We have proposed a novel Conditional Reduction (CR) module to compress a single 1x1 convolution layer, then developed a novel three-layer Conditional block (C-block) to compress CNN bottlenecks or inverted bottlenecks, and finally developed a novel Conditional Network (CRnet) based on the CR module and C-block. We have tested the CRnet on two image classification datasets, CIFAR-10 and CIFAR-100, with multiple network expansion ratios and compression ratios. The experiments verify the correctness of our methods, highlight the importance of the input-output channel pattern when selecting a compression strategy, and show that the proposed CRnet balances model complexity and accuracy better than state-of-the-art group convolution and Ghost module compression.
Data reduction shortens training time in a direct and simple way by dropping data. Some existing works drop data by ranking sample importance, but the ranking process takes extra time when there is a large number of training samples. When tuning different network settings to search for an optimal configuration, we want a way to remove a large percentage of the training time with little or no accuracy loss. Fewer studies address the following questions. First, what are suitable sampling ratios? Second, should the same sampling ratio be used for every training epoch? Third, does the sampling ratio perform differently on small and large datasets? We have proposed a flat reduced random sampling training strategy and a bottleneck reduced random sampling strategy, and a three-stage training method based on bottleneck reduced random sampling that accounts for the distinct behavior of early-stage and end-stage training. Furthermore, we have proved, via four theorems and two corollaries, results on the visibility of each sample over the whole training process and on the theoretical reduction in training time. We have tested the two sampling strategies on three image classification datasets: CIFAR-10, CIFAR-100, and ImageNet. The experiments show that the two proposed sampling strategies reduce a significant percentage of training time at a very small accuracy loss.
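The abstract above contrasts standard convolution with 1x1, depthwise, and group convolution, and defines an expansion ratio and a compression ratio for a layer. The sketch below is a minimal illustration of those two ratios and of how a grouped 1x1 convolution cuts parameters, assuming PyTorch and hypothetical channel sizes; it is not code from the dissertation and not the proposed CR module or CRnet.

```python
# Illustrative sketch only: parameter counts for a standard 1x1 convolution
# versus a grouped 1x1 convolution, with hypothetical layer sizes.
import torch.nn as nn

c_in, c_out, groups = 64, 256, 4                      # hypothetical values

# Expansion ratio: the larger channel count over the smaller one (256/64 = 4.0).
expansion_ratio = max(c_in, c_out) / min(c_in, c_out)

standard = nn.Conv2d(c_in, c_out, kernel_size=1, bias=False)
grouped = nn.Conv2d(c_in, c_out, kernel_size=1, groups=groups, bias=False)

params_standard = sum(p.numel() for p in standard.parameters())  # 64*256 = 16384
params_grouped = sum(p.numel() for p in grouped.parameters())    # 16384/4 = 4096

# One reading of the compression ratio defined above: parameters removed
# by the compressed layer over the original parameter count.
compression_ratio = (params_standard - params_grouped) / params_standard  # 0.75

print(expansion_ratio, params_standard, params_grouped, compression_ratio)
```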
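The abstract also describes training on a randomly drawn subset of the data in each epoch (flat or bottleneck reduced random sampling). The sketch below illustrates the general idea, assuming a PyTorch training loop and a hypothetical per-epoch ratio schedule; it is not the dissertation's exact three-stage method.

```python
# Illustrative sketch only: epoch-wise reduced random sampling. The model,
# dataset, and ratio schedule are placeholders, not values from the dissertation.
import random

import torch
from torch.utils.data import DataLoader, SubsetRandomSampler


def train_with_reduced_sampling(model, dataset, ratios, device="cpu"):
    """Train for len(ratios) epochs, drawing a fresh random subset each epoch.

    ratios: one sampling ratio per epoch, e.g. [1.0, 0.5, 0.5, 1.0], meaning
    full data early and late with a reduced "bottleneck" in between.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    loss_fn = torch.nn.CrossEntropyLoss()
    n = len(dataset)

    for ratio in ratios:
        # A new random subset every epoch keeps each sample visible, with high
        # probability, at some point during the whole training run.
        indices = random.sample(range(n), int(ratio * n))
        loader = DataLoader(dataset, batch_size=128,
                            sampler=SubsetRandomSampler(indices))
        model.train()
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
```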
ISBN: 9798535566931
Subjects--Topical Terms: Computer science.
Subjects--Index Terms: Convolutional Neural Network
Faster Convolutional Neural Networks Training.
LDR
:05943nmm a2200349 4500
001
2344667
005
20220531064621.5
008
241004s2021 ||||||||||||||||| ||eng d
020
$a
9798535566931
035
$a
(MiAaPQ)AAI28713771
035
$a
AAI28713771
040
$a
MiAaPQ
$c
MiAaPQ
100
1
$a
Jiang, Shanshan.
$3
2103204
245
1 0
$a
Faster Convolutional Neural Networks Training.
260
1
$a
Ann Arbor :
$b
ProQuest Dissertations & Theses,
$c
2021
300
$a
90 p.
500
$a
Source: Dissertations Abstracts International, Volume: 83-03, Section: B.
500
$a
Advisor: Wang, Sheng-Guo.
502
$a
Thesis (Ph.D.)--The University of North Carolina at Charlotte, 2021.
506
$a
This item must not be sold to any third party vendors.
590
$a
School code: 0694.
650
4
$a
Computer science.
$3
523869
650
4
$a
Random access memory.
$3
623617
650
4
$a
Accuracy.
$3
3559958
650
4
$a
Deep learning.
$3
3554982
650
4
$a
Application programming interface.
$3
3562904
650
4
$a
Datasets.
$3
3541416
650
4
$a
Artificial intelligence.
$3
516317
650
4
$a
Neural networks.
$3
677449
650
4
$a
Design techniques.
$3
3561498
650
4
$a
Bottlenecks.
$3
3683459
653
$a
Convolutional Neural Network
653
$a
Deep learning
653
$a
Architecture complexity reduction
653
$a
Application programming interface
690
$a
0984
690
$a
0800
710
2
$a
The University of North Carolina at Charlotte.
$b
Computer Science.
$3
3181506
773
0
$t
Dissertations Abstracts International
$g
83-03B.
790
$a
0694
791
$a
Ph.D.
792
$a
2021
793
$a
English
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28713771
Holdings (1 item):
Barcode: W9467105
Location: Electronic resources
Circulation category: 11. Online viewing_V
Material type: E-book
Call number: EB
Use type: Normal
Loan status: Available (on shelf)
Holds: 0