Rule extraction from neural networks is an active research topic. Over the last 20 years, many authors have presented techniques for extracting symbolic rules from Multi Layer Perceptrons (MLPs). Nevertheless, very few addressed ensembles of neural networks, and even fewer addressed networks trained by deep learning. On several datasets we performed rule extraction from ensembles of Discretized Interpretable Multi Layer Perceptrons (DIMLPs) and from DIMLPs trained by deep learning. The results obtained on the Thyroid dataset and the Wisconsin Breast Cancer dataset show that the predictive accuracy of the extracted rules compares very favorably with state-of-the-art results. Finally, in the last classification problem, on digit recognition, the rules generated from the MNIST dataset can be viewed as discriminatory features in particular digit areas. Qualitatively, with respect to rule complexity, measured by the number of generated rules and the number of antecedents per rule, deep DIMLPs and DIMLPs trained by arcing give similar results on a binary classification problem involving digits 5 and 8. On the whole MNIST problem we showed that it is possible to determine the feature detectors created by neural networks, and that the complexity of the extracted rulesets can be well balanced between accuracy and interpretability.
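The rule-complexity measure mentioned above (number of rules and number of antecedents per rule) can be illustrated with a minimal Python sketch. The data structures, the feature names `pixel_a` and `pixel_b`, the thresholds, and the two example rules are all hypothetical and chosen only for illustration; they are not rules extracted in the study, and this is not the DIMLP extraction procedure itself.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Antecedent:
    """One condition of a rule, e.g. pixel_a >= 0.5 (hypothetical names)."""
    feature: str
    op: str          # either "<" or ">="
    threshold: float

    def holds(self, sample: Dict[str, float]) -> bool:
        value = sample[self.feature]
        if self.op == "<":
            return value < self.threshold
        return value >= self.threshold


@dataclass(frozen=True)
class Rule:
    """A conjunction of antecedents that predicts one class label."""
    antecedents: Tuple[Antecedent, ...]
    label: str

    def covers(self, sample: Dict[str, float]) -> bool:
        return all(a.holds(sample) for a in self.antecedents)


def complexity(ruleset: Sequence[Rule]) -> Tuple[int, float]:
    """Return (number of rules, average number of antecedents per rule)."""
    if not ruleset:
        return 0, 0.0
    return len(ruleset), mean(len(r.antecedents) for r in ruleset)


# Hypothetical ruleset for a two-class problem (digits "5" and "8").
RULES = [
    Rule((Antecedent("pixel_a", ">=", 0.5), Antecedent("pixel_b", "<", 0.2)), "5"),
    Rule((Antecedent("pixel_a", "<", 0.5),), "8"),
]

if __name__ == "__main__":
    n_rules, avg_antecedents = complexity(RULES)
    print(f"rules: {n_rules}, antecedents per rule: {avg_antecedents:.2f}")
```

Under this representation, a smaller rule count and fewer antecedents per rule indicate a more interpretable ruleset, which is the quantity traded off against accuracy.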
The purpose of this study was to extract more concise rules with the Recursive-Rule Extraction (Re-RX) algorithm by replacing the C4.5 program currently employed in Re-RX with the J48graft algorithm. Experiments were conducted to extract rules from six two-class mixed datasets, each having both discrete and continuous attributes, and to compare the resulting accuracy, comprehensibility and conciseness. On the CARD1, CARD2, CARD3, German, Bene1 and Bene2 datasets, Re-RX with J48graft provided more concise rules than the original Re-RX algorithm. Compared to Re-RX, Re-RX with J48graft reduced the number of rules by 43.2%, 37% and 21% for the German, Bene1 and Bene2 datasets, respectively. Furthermore, Re-RX with J48graft showed 8.87% better accuracy than the Re-RX algorithm on the German dataset. These results confirm that Re-RX used in conjunction with J48graft has the capacity to facilitate migration from existing data systems toward new concise analytic systems and Big Data.
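As a reading aid for the reported reductions, the following Python sketch shows how a percentage reduction in rule count is computed, assuming the figures are relative to the rule count of the original Re-RX algorithm. The function name and the example counts (40 and 30 rules) are hypothetical and are not taken from the study.

```python
def percent_reduction(baseline_rules: float, new_rules: float) -> float:
    """Relative reduction in rule count, in percent of the baseline."""
    if baseline_rules <= 0:
        raise ValueError("baseline_rules must be positive")
    return 100.0 * (baseline_rules - new_rules) / baseline_rules


if __name__ == "__main__":
    # Hypothetical counts: 40 rules with the baseline, 30 with the variant.
    print(f"{percent_reduction(40, 30):.1f}% fewer rules")
```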