Appropriate Number of Standard 2 × 2 Max Pooling Layers and Their Allocation in Convolutional Neural Networks for Diverse and Heterogeneous Datasets

Open access

Abstract

The problem of appropriately allocating pooling layers in convolutional neural networks is considered. The study uses the CIFAR-10, NORB, and EEACL26 datasets so that the solution is not “overfitted” to a single dataset. For highly accurate image recognition on these datasets, networks with the max pooling operation are used. The most common form of this operation, a 2 × 2 pooling layer, is applied with a stride of 2 and without padding after convolutional layers. Based on the performance of a series of network architectures, a rule for the best allocation of max pooling layers is formulated. The rule is to insert a few pooling layers after the first convolutional layers and one pooling layer after the penultimate convolutional layer (“11...100...010”). For much simpler datasets, the best allocation is “11...100...0”.
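As a rough illustration of the allocation notation, the following Python sketch builds a pattern of the form “11...100...010” and tracks how the feature map side length shrinks under 2 × 2 max pooling with a stride of 2 and no padding. Reading each digit as "a pooling layer follows this convolutional layer" is an interpretation of the abstract's wording, and the assumption that convolutional layers preserve the spatial size is made only for this example. The function names `max_pool_2x2`, `allocation_pattern`, and `spatial_sizes` are hypothetical and do not come from the paper.

```python
def max_pool_2x2(image):
    """2 x 2 max pooling with stride 2 and no padding on a list-of-lists image.

    A trailing odd row or column is dropped, as happens without padding.
    """
    rows = len(image) // 2
    cols = len(image[0]) // 2
    return [
        [
            max(
                image[2 * r][2 * c],
                image[2 * r][2 * c + 1],
                image[2 * r + 1][2 * c],
                image[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def allocation_pattern(num_conv_layers, num_leading_pools):
    """Build a pattern such as '110010' (assumed reading of '11...100...010').

    '1' means a pooling layer follows that convolutional layer, '0' means none.
    The last '1' sits after the penultimate convolutional layer.
    """
    if num_leading_pools < 1 or num_conv_layers < num_leading_pools + 2:
        raise ValueError("need at least num_leading_pools + 2 convolutional layers")
    zeros = num_conv_layers - num_leading_pools - 2
    return "1" * num_leading_pools + "0" * zeros + "10"


def spatial_sizes(input_size, pattern):
    """Side length of the feature map after each convolutional layer.

    Assumes (for illustration only) that convolutions keep the spatial size,
    so only pooling layers halve it (integer division, no padding).
    """
    sizes = []
    size = input_size
    for flag in pattern:
        if flag == "1":
            size //= 2
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    pattern = allocation_pattern(num_conv_layers=6, num_leading_pools=2)
    print(pattern)                      # 110010
    print(spatial_sizes(32, pattern))   # [16, 8, 8, 8, 4, 4]
    print(max_pool_2x2([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # [[5]]
```

With a 32 × 32 input, as in CIFAR-10, six convolutional layers, and two leading pooling layers, the sketch halves the side length three times in total. How many leading pooling layers count as "a few" is not stated in the abstract, so the value 2 here is arbitrary.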

Information Technology and Management Science

The Journal of Riga Technical University
