Rule extraction from neural networks is an active research topic. Over the last 20 years, many authors have presented techniques for extracting symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few of these techniques address ensembles of neural networks, and even fewer address networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLPs) and from DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state-of-the-art results. Finally, in the last classification problem, on digit recognition, the rules generated from the MNIST dataset can be viewed as discriminatory features in particular areas of the digits. Qualitatively, with respect to rule complexity, measured as the number of generated rules and the number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem we show that it is possible to determine the feature detectors created by neural networks, and that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.