
Supposed Maximum Mutual Information for Improving Generalization and Interpretation of Multi-Layered Neural Networks


References

[1] R. Kamimura, Mutual information maximization for improving and interpreting multi-layered neural network, in Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI 2017), 2017. DOI: 10.1109/SSCI.2017.8285182

[2] R. Linsker, Self-organization in a perceptual network, Computer, vol. 21, no. 3, pp. 105–117, 1988. DOI: 10.1109/2.36

[3] R. Linsker, How to generate ordered maps by maximizing the mutual information between input and output signals, Neural Computation, vol. 1, no. 3, pp. 402–411, 1989. DOI: 10.1162/neco.1989.1.3.402

[4] R. Linsker, Local synaptic learning rules suffice to maximize mutual information in a linear network, Neural Computation, vol. 4, no. 5, pp. 691–702, 1992. DOI: 10.1162/neco.1992.4.5.691

[5] R. Linsker, Improved local learning rule for information maximization and related applications, Neural Networks, vol. 18, no. 3, pp. 261–265, 2005. DOI: 10.1016/j.neunet.2005.01.002

[6] R. Battiti, Using mutual information for selecting features in supervised neural net learning, IEEE Transactions on Neural Networks, vol. 5, no. 4, pp. 537–550, 1994. DOI: 10.1109/72.298224

[7] S. Becker, Mutual information maximization: models of cortical self-organization, Network: Computation in Neural Systems, vol. 7, pp. 7–31, 1996. DOI: 10.1080/0954898X.1996.11978653

[8] G. Deco, W. Finnoff, and H. Zimmermann, Unsupervised mutual information criterion for elimination of overtraining in supervised multilayer networks, Neural Computation, vol. 7, no. 1, pp. 86–107, 1995. DOI: 10.1162/neco.1995.7.1.86

[9] G. Deco and D. Obradovic, An information-theoretic approach to neural computing. Springer Science & Business Media, 2012.

[10] J. C. Principe, D. Xu, and J. Fisher, Information theoretic learning, in Unsupervised Adaptive Filtering, vol. 1, pp. 265–319, 2000.

[11] J. C. Principe, Information theoretic learning: Rényi's entropy and kernel perspectives. Springer Science & Business Media, 2010. DOI: 10.1007/978-1-4419-1570-2

[12] P. A. Estévez, M. Tesmer, C. A. Perez, and J. M. Zurada, Normalized mutual information feature selection, IEEE Transactions on Neural Networks, vol. 20, no. 2, pp. 189–201, 2009. DOI: 10.1109/TNN.2008.2005601

[13] P. Comon, Independent component analysis, in Higher-Order Statistics, pp. 29–38, 1992.

[14] A. J. Bell and T. J. Sejnowski, The independent components of natural scenes are edge filters, Vision Research, vol. 37, no. 23, pp. 3327–3338, 1997.

[15] A. Hyvärinen and E. Oja, Independent component analysis: algorithms and applications, Neural Networks, vol. 13, no. 4, pp. 411–430, 2000. DOI: 10.1016/S0893-6080(00)00026-5

[16] P. Comon, Independent component analysis: a new concept, Signal Processing, vol. 36, pp. 287–314, 1994. DOI: 10.1016/0165-1684(94)90029-9

[17] A. J. Bell and T. J. Sejnowski, An information-maximization approach to blind separation and blind deconvolution, Neural Computation, vol. 7, no. 6, pp. 1129–1159, 1995.

[18] J. Karhunen, A. Hyvärinen, R. Vigário, J. Hurri, and E. Oja, Applications of neural blind separation to signal and image processing, in Proceedings of the 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-97), vol. 1, pp. 131–134, IEEE, 1997.

[19] H. B. Barlow, Unsupervised learning, Neural Computation, vol. 1, no. 3, pp. 295–311, 1989. DOI: 10.1162/neco.1989.1.3.295

[20] H. B. Barlow, T. P. Kaushal, and G. J. Mitchison, Finding minimum entropy codes, Neural Computation, vol. 1, no. 3, pp. 412–423, 1989. DOI: 10.1162/neco.1989.1.3.412

[21] R. Kamimura, Simple and stable internal representation by potential mutual information maximization, in International Conference on Engineering Applications of Neural Networks, pp. 309–316, Springer, 2016. DOI: 10.1007/978-3-319-44188-7_23

[22] R. Kamimura, Self-organizing selective potentiality learning to detect important input neurons, in Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1619–1626, IEEE, 2015.

[23] R. Kamimura, Collective interpretation and potential joint information maximization, in Intelligent Information Processing VIII: Proceedings of the 9th IFIP TC 12 International Conference (IIP 2016), Melbourne, VIC, Australia, November 18–21, 2016, pp. 12–21, Springer, 2016. DOI: 10.1007/978-3-319-48390-0_2

[24] R. Kamimura, Repeated potentiality assimilation: simplifying learning procedures by positive, independent and indirect operation for improving generalization and interpretation, in Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), pp. 803–810, IEEE, 2016. DOI: 10.1109/IJCNN.2016.7727282

[25] R. Kamimura, Collective mutual information maximization to unify passive and positive approaches for improving interpretation and generalization, Neural Networks, vol. 90, pp. 56–71, 2017. DOI: 10.1016/j.neunet.2017.03.001

[26] R. Kamimura, Direct potentiality assimilation for improving multi-layered neural networks, in Proceedings of the 2017 Federated Conference on Computer Science and Information Systems, pp. 19–23, 2017. DOI: 10.15439/2017F552

[27] R. Andrews, J. Diederich, and A. B. Tickle, Survey and critique of techniques for extracting rules from trained artificial neural networks, Knowledge-Based Systems, vol. 8, no. 6, pp. 373–389, 1995. DOI: 10.1016/0950-7051(96)81920-4

[28] J. M. Benítez, J. L. Castro, and I. Requena, Are artificial neural networks black boxes?, IEEE Transactions on Neural Networks, vol. 8, no. 5, pp. 1156–1164, 1997.

[29] M. Ishikawa, Rule extraction by successive regularization, Neural Networks, vol. 13, no. 10, pp. 1171–1183, 2000.

[30] T. Q. Huynh and J. A. Reggia, Guiding hidden layer representations for improved rule extraction from neural networks, IEEE Transactions on Neural Networks, vol. 22, no. 2, pp. 264–275, 2011. DOI: 10.1109/TNN.2010.2094205

[31] B. Mak and T. Munakata, Rule extraction from expert heuristics: a comparative study of rough sets with neural network and ID3, European Journal of Operational Research, vol. 136, pp. 212–229, 2002. DOI: 10.1016/S0377-2217(01)00062-5

[32] J. Yosinski, J. Clune, T. Fuchs, and H. Lipson, Understanding neural networks through deep visualization, in ICML Workshop on Deep Learning, 2015.

[33] D. Erhan, Y. Bengio, A. Courville, and P. Vincent, Visualizing higher-layer features of a deep network, Technical Report 1341, University of Montreal, 2009.

[34] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks, vol. 61, pp. 85–117, 2015. DOI: 10.1016/j.neunet.2014.09.003

[35] M. G. Cardoso, Logical discriminant models, in Quantitative Modelling in Marketing and Management, pp. 223–253, World Scientific, 2013. DOI: 10.1142/9789814407724_0008
