Improving batch normalization with skewness reduction for deep neural networks

Pak Lun Kevin Ding; Sarah Martin; Baoxin Li

doi:10.1109/ICPR48806.2021.9412949

Engineering, Ira A. Fulton Schools of (IAFSE)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Batch Normalization (BN) is a well-known technique used in training deep neural networks. The main idea behind batch normalization is to normalize the features of the layers (i.e., to transform them to have a mean of zero and a variance of one). Such a procedure encourages a smoother optimization landscape for the loss function and improves both the speed and the performance of network training. In this paper, we demonstrate that the performance of the network can be improved if the distributions of the output features in the same layer are similar. Because normalizing based on mean and variance does not necessarily give the features the same distribution, we propose a new normalization scheme: Batch Normalization with Skewness Reduction (BNSR). Compared with other normalization approaches, BNSR transforms not only the mean and variance but also the skewness of the data. By addressing this property of a distribution, we are able to make the output distributions of the layers more similar. The nonlinearity of BNSR may further improve the expressiveness of the underlying network. BNSR is compared with other normalization schemes on the CIFAR-100 and ImageNet datasets. Experimental results show that the proposed approach can outperform other state-of-the-art methods that are not equipped with BNSR.
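The abstract's observation that mean and variance normalization alone does not equalize feature distributions can be illustrated with a small sketch. The code below is not the BNSR transform itself, which the abstract does not specify; it only shows, under assumed toy data, that standardizing a feature to zero mean and unit variance leaves its skewness unchanged. The function names `standardize` and `skewness`, the `eps` value, and the two synthetic feature channels are all hypothetical choices made for illustration.

```python
import math
import random


def standardize(xs, eps=1e-5):
    """Batch-norm style standardization without learned scale/shift (assumed form)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n  # biased (population) variance
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def skewness(xs):
    """Sample skewness: the third central moment divided by variance**1.5."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    third = sum((x - mean) ** 3 for x in xs) / n
    return third / var ** 1.5


rng = random.Random(0)  # fixed seed so the run is reproducible

# Two hypothetical feature channels observed over one mini-batch.
symmetric = [rng.gauss(0.0, 1.0) for _ in range(4096)]   # roughly symmetric
skewed = [rng.expovariate(1.0) for _ in range(4096)]     # right-skewed

sym_n = standardize(symmetric)
skw_n = standardize(skewed)

# Both channels now have (approximately) zero mean and unit variance,
# yet their skewness values still differ.
print("skewness, symmetric channel:", skewness(symmetric), "->", skewness(sym_n))
print("skewness, skewed channel:   ", skewness(skewed), "->", skewness(skw_n))
```

Because standardization is an affine map with a positive scale, the third standardized moment is preserved, so the two channels keep differently shaped distributions after normalization. This is the gap that a skewness-reducing step, as proposed in the paper, is meant to narrow.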

Original language	English (US)
Title of host publication	Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	7165-7172
Number of pages	8
ISBN (Electronic)	9781728188089
DOIs	https://doi.org/10.1109/ICPR48806.2021.9412949
State	Published - 2020
Event	25th International Conference on Pattern Recognition, ICPR 2020 - Virtual, Milan, Italy Duration: Jan 10 2021 → Jan 15 2021

Publication series

Name	Proceedings - International Conference on Pattern Recognition
ISSN (Print)	1051-4651

Conference

Conference	25th International Conference on Pattern Recognition, ICPR 2020
Country/Territory	Italy
City	Virtual, Milan
Period	1/10/21 → 1/15/21

ASJC Scopus subject areas

Computer Vision and Pattern Recognition

Cite this

Kevin Ding, P. L., Martin, S., & Li, B. (2020). Improving batch normalization with skewness reduction for deep neural networks. In Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition (pp. 7165-7172). Article 9412949 (Proceedings - International Conference on Pattern Recognition). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICPR48806.2021.9412949

Improving batch normalization with skewness reduction for deep neural networks. / Kevin Ding, Pak Lun; Martin, Sarah; Li, Baoxin.
Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition. Institute of Electrical and Electronics Engineers Inc., 2020. p. 7165-7172 9412949 (Proceedings - International Conference on Pattern Recognition).

Kevin Ding, PL, Martin, S & Li, B 2020, Improving batch normalization with skewness reduction for deep neural networks. in Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition., 9412949, Proceedings - International Conference on Pattern Recognition, Institute of Electrical and Electronics Engineers Inc., pp. 7165-7172, 25th International Conference on Pattern Recognition, ICPR 2020, Virtual, Milan, Italy, 1/10/21. https://doi.org/10.1109/ICPR48806.2021.9412949

Kevin Ding PL, Martin S, Li B. Improving batch normalization with skewness reduction for deep neural networks. In Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition. Institute of Electrical and Electronics Engineers Inc. 2020. p. 7165-7172. 9412949. (Proceedings - International Conference on Pattern Recognition). doi: 10.1109/ICPR48806.2021.9412949

@inproceedings{ac447978a5924b6f8fe005f119f145e4,

title = "Improving batch normalization with skewness reduction for deep neural networks",

abstract = "Batch Normalization (BN) is a well-known technique used in training deep neural networks. The main idea behind batch normalization is to normalize the features of the layers (i.e., to transform them to have a mean of zero and a variance of one). Such a procedure encourages a smoother optimization landscape for the loss function and improves both the speed and the performance of network training. In this paper, we demonstrate that the performance of the network can be improved if the distributions of the output features in the same layer are similar. Because normalizing based on mean and variance does not necessarily give the features the same distribution, we propose a new normalization scheme: Batch Normalization with Skewness Reduction (BNSR). Compared with other normalization approaches, BNSR transforms not only the mean and variance but also the skewness of the data. By addressing this property of a distribution, we are able to make the output distributions of the layers more similar. The nonlinearity of BNSR may further improve the expressiveness of the underlying network. BNSR is compared with other normalization schemes on the CIFAR-100 and ImageNet datasets. Experimental results show that the proposed approach can outperform other state-of-the-art methods that are not equipped with BNSR.",

author = "{Kevin Ding}, {Pak Lun} and Sarah Martin and Baoxin Li",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE; 25th International Conference on Pattern Recognition, ICPR 2020 ; Conference date: 10-01-2021 Through 15-01-2021",

year = "2020",

doi = "10.1109/ICPR48806.2021.9412949",

language = "English (US)",

series = "Proceedings - International Conference on Pattern Recognition",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "7165--7172",

booktitle = "Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition",

}

TY - GEN

T1 - Improving batch normalization with skewness reduction for deep neural networks

AU - Kevin Ding, Pak Lun

AU - Martin, Sarah

AU - Li, Baoxin

PY - 2020

Y1 - 2020

N2 - Batch Normalization (BN) is a well-known technique used in training deep neural networks. The main idea behind batch normalization is to normalize the features of the layers (i.e., to transform them to have a mean of zero and a variance of one). Such a procedure encourages a smoother optimization landscape for the loss function and improves both the speed and the performance of network training. In this paper, we demonstrate that the performance of the network can be improved if the distributions of the output features in the same layer are similar. Because normalizing based on mean and variance does not necessarily give the features the same distribution, we propose a new normalization scheme: Batch Normalization with Skewness Reduction (BNSR). Compared with other normalization approaches, BNSR transforms not only the mean and variance but also the skewness of the data. By addressing this property of a distribution, we are able to make the output distributions of the layers more similar. The nonlinearity of BNSR may further improve the expressiveness of the underlying network. BNSR is compared with other normalization schemes on the CIFAR-100 and ImageNet datasets. Experimental results show that the proposed approach can outperform other state-of-the-art methods that are not equipped with BNSR.

UR - http://www.scopus.com/inward/record.url?scp=85110515891&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85110515891&partnerID=8YFLogxK

U2 - 10.1109/ICPR48806.2021.9412949

DO - 10.1109/ICPR48806.2021.9412949

M3 - Conference contribution

AN - SCOPUS:85110515891

T3 - Proceedings - International Conference on Pattern Recognition

SP - 7165

EP - 7172

BT - Proceedings of ICPR 2020 - 25th International Conference on Pattern Recognition

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 25th International Conference on Pattern Recognition, ICPR 2020

Y2 - 10 January 2021 through 15 January 2021

ER -
