On Training Deep Neural Networks Using a Streaming Approach

Piotr Duda 1, Maciej Jaworski 1, Andrzej Cader 2, and Lipo Wang 3
  • 1 Department of Computer Engineering, Częstochowa, Poland
  • 2 Clark University, Worcester
  • 3 School of Electrical and Electronic Engineering, Singapore

Abstract

In recent years, many deep learning methods have allowed for a significant improvement of systems based on artificial intelligence. Their effectiveness results from the ability to analyze large labeled datasets. The price for such high accuracy is a long training time, necessary to process such large amounts of data. On the other hand, along with the increase in the amount of collected data, the field of data stream analysis has developed. It enables data to be processed immediately, with no need to store them. In this work, we take advantage of the benefits of data streaming in order to accelerate the training of deep neural networks. The work includes an analysis of two approaches to network learning, presented against the background of traditional stochastic and batch-based methods.
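The core streaming idea referred to in the abstract can be illustrated with a minimal sketch: each example arrives once, triggers a single gradient update, and is then discarded, in contrast to batch-based training that stores the dataset and revisits it over epochs. The model below (a single logistic unit) and the toy data are hypothetical placeholders, not the paper's actual network or datasets.

```python
import math
import random

def sgd_stream(stream, lr=0.1, dim=2):
    """Single-pass streaming training of a logistic-regression unit:
    every example updates the weights once and is never stored."""
    w = [0.0] * dim
    b = 0.0
    for x, y in stream:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1.0 / (1.0 + math.exp(-z))          # sigmoid prediction
        g = p - y                                # gradient of the log-loss w.r.t. z
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]
        b -= lr * g
    return w, b

# Toy stream: 2-D points labelled by the sign of x0 + x1 (hypothetical data).
random.seed(0)
def stream(n=2000):
    for _ in range(n):
        x = [random.uniform(-1, 1), random.uniform(-1, 1)]
        yield x, 1 if x[0] + x[1] > 0 else 0

w, b = sgd_stream(stream())
correct = sum(
    (1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0) == y
    for x, y in stream(500)
)
print(correct / 500)  # accuracy on a fresh slice of the stream
```

A batch-based variant would instead collect all examples into memory and loop over them for several epochs; the streaming loop touches each example exactly once, which is what makes it attractive when data arrive faster than they can be stored.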



OPEN ACCESS
