Use of Linear Genetic Programming and Artificial Neural Network Methods to Solve Classification Task
This paper presents a comparative analysis of linear genetic programming and artificial neural network methods for solving classification tasks. Classification tasks usually involve data sets with a large number of attributes and records and more than two classes, which are processed using, for example, generated classification rules. As a result, classifying a large number of records with classical methods yields a high classification error. Artificial neural networks are often used to solve classification tasks and mostly achieve good results. Linear genetic programming is a new direction in evolutionary algorithms that has not been widely researched, and its application areas are not well defined. However, some advantages of linear genetic programming stem from its genetic operators, whose structure does not require complicated calculations.
In this work, approximately 400 experiments were conducted with linear genetic programming and artificial neural network methods, using various data sets with different numbers of records, attributes and classes.
Based on the results obtained, conclusions were drawn about the applicability of linear genetic programming and artificial neural networks to classification problems, and suggestions for improving their performance were proposed.
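The remark about structurally simple genetic operators can be illustrated with a minimal sketch. It assumes a generic register-machine encoding of linear genetic programming, not the configuration used in the paper; the instruction set, register count, and all names (`OPS`, `N_REGISTERS`, `run`, `mutate`, `crossover`, `classify`) are hypothetical.

```python
import random

# Hypothetical instruction set: each instruction is (dest, op, src_a, src_b).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}
N_REGISTERS = 4  # assumed register count


def run(program, inputs):
    """Execute the instructions in order; inputs fill the first registers, r0 is the output."""
    regs = [0.0] * N_REGISTERS
    regs[:len(inputs)] = inputs
    for dest, op, a, b in program:
        regs[dest] = OPS[op](regs[a], regs[b])
    return regs[0]


def classify(program, inputs):
    """Two-class decision: a positive output in r0 means class 1, otherwise class 0."""
    return 1 if run(program, inputs) > 0 else 0


def random_instruction(rng):
    return (rng.randrange(N_REGISTERS), rng.choice(sorted(OPS)),
            rng.randrange(N_REGISTERS), rng.randrange(N_REGISTERS))


def mutate(program, rng):
    """Mutation: replace one randomly chosen instruction; the parent is left unchanged."""
    child = list(program)
    child[rng.randrange(len(child))] = random_instruction(rng)
    return child


def crossover(a, b, rng):
    """One-point crossover on instruction lists; assumes both parents have at least two instructions."""
    return a[:rng.randrange(1, len(a))] + b[rng.randrange(1, len(b)):]
```

Because a program is a flat list of instructions, both operators reduce to list slicing and element replacement, with no tree traversal involved.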
Mining Online Store Client Assessment Classification Rules with Genetic Algorithms
The paper presents research into algorithms, such as the genetic algorithm (GA), that were not designed for mining classification rules yet contain all the functions needed to do so. The main task of the research is the application of GA to classification rule mining. A classic GA was modified to match the chosen classification task and was compared with other popular classification algorithms - JRip, J48 and the Naive Bayes classifier. The paper describes the proposed algorithm and the application task and provides a comparative analysis of the results against the other algorithms.
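How a GA can be turned towards rule mining may be sketched as follows. This is an illustrative sketch and not the modified GA described in the paper: the toy data, the wildcard rule encoding, the sensitivity-times-specificity fitness, and the selection scheme are all assumptions made for the example.

```python
import random

# Hypothetical toy data: (attribute values, class label).
DATA = [
    ((0, 1, 0), 1), ((0, 1, 1), 1), ((0, 0, 1), 0),
    ((1, 1, 0), 0), ((1, 0, 0), 0), ((0, 1, 0), 1),
]
N_VALUES = [2, 2, 2]  # number of distinct values per attribute
TARGET = 1            # class predicted by every candidate rule


def covers(rule, record):
    """A rule is one gene per attribute: a required value, or None as a wildcard."""
    return all(g is None or g == v for g, v in zip(rule, record))


def fitness(rule, data=DATA, target=TARGET):
    """Assumed fitness: sensitivity * specificity of the rule on the data."""
    tp = fp = tn = fn = 0
    for record, label in data:
        hit = covers(rule, record)
        if hit and label == target:
            tp += 1
        elif hit:
            fp += 1
        elif label == target:
            fn += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens * spec


def random_gene(n_values, rng):
    return rng.choice([None] + list(range(n_values)))


def crossover(a, b, rng):
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]


def mutate(rule, rng, rate=0.2):
    return [random_gene(n, rng) if rng.random() < rate else g
            for g, n in zip(rule, N_VALUES)]


def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [[random_gene(n, rng) for n in N_VALUES] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = [mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness)
```

Calling `evolve()` returns the best rule found for the `TARGET` class; a complete rule miner would repeat the search per class and remove covered records between runs.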
Impact of Antibody Panel Size on Classification Accuracy
This paper experimentally studies the influence of antibody panel size reduction on classification results. The study includes four classification methods and five feature evaluators applied to five different high-dimensional biomedical data sets (1200 features). The behaviour of the classifiers on these data sets is examined to reveal overall trends in the impact of dimensionality reduction on classification accuracy.
Madara Gasparovica, Natalia Novoselova and Ludmila Aleksejeva
Using Fuzzy Logic to Solve Bioinformatics Tasks
The goal of this research is to investigate, collect and identify published methods that use fuzzy techniques in bioinformatics tasks. Special attention is paid to how the advantages of fuzzy techniques are used in stages such as preprocessing, optimization and classifier construction for classification tasks as difficult as processing microarray data. The article also reviews the most popular databases used in bioinformatics. The most promising methods are described in more detail. Conclusions are drawn about the performance of the algorithms and their use in further research.
Transportation Mode Choice Analysis Based on Classification Methods
Mode choice analysis has received the most attention among discrete choice problems in travel behavior literature. Most traditional mode choice models are based on the principle of random utility maximization derived from econometric theory. This paper investigates performance of mode choice analysis with classification methods - decision trees, discriminant analysis and multinomial logit. Experimental results have demonstrated satisfactory quality of classification.
This paper presents a literature review of articles published in the last ten years on the use of decision tree classifiers in gene microarray data analysis. The main focus is on studies that address the cancer classification problem using single decision tree classifiers (the C4.5 and CART algorithms) and decision tree forests (e.g. random forests), showing the strengths and weaknesses of the proposed methodologies compared to other popular classification methods. The article also touches on the use of decision tree classifiers in gene selection.
Madara Gasparovica, Irena Tuleiko and Ludmila Aleksejeva
Influence of Membership Functions on Classification of Multi-Dimensional Data
The aim of this study is to explore whether the number of intervals for each attribute influences the classification result and whether a larger number of intervals provides better classification accuracy with the Fuzzy PRISM algorithm. Feature selection was carried out using the Fast correlation-based filter solution, and the reduced data sets were then used in experiments with the settings of the previous experiment series. The article also provides conclusions about the obtained classification results and analyzes the criteria of certain experiments and their impact on the final result. In addition, a series of experiments was carried out to assess whether and how the classification result is influenced by the categorization of continuous data, which is one of the steps in constructing membership functions; the Fuzzy unordered rule induction algorithm was used for this purpose. The experiments were carried out using four real data sets - Golub leukemia, Singh prostate, and the Gastric cancer and leukemia donor data sets of the Latvian Biomedical Research and Study Center.
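The role of the number of intervals can be made concrete with a small sketch. It assumes evenly spaced triangular membership functions, which is only one common construction and not necessarily the one used with Fuzzy PRISM or the Fuzzy unordered rule induction algorithm in the paper; the function names are hypothetical.

```python
def triangular_partition(lo, hi, n_sets):
    """Return the peak positions of n_sets evenly spaced triangular fuzzy sets on [lo, hi]."""
    if n_sets < 2:
        raise ValueError("at least two fuzzy sets are needed")
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)]


def memberships(x, peaks):
    """Degree of membership of x in each set; neighbouring triangles overlap so the degrees sum to 1."""
    width = peaks[1] - peaks[0]
    x = min(max(x, peaks[0]), peaks[-1])  # clamp values outside the attribute range
    return [max(0.0, 1.0 - abs(x - p) / width) for p in peaks]
```

With `peaks = triangular_partition(0.0, 10.0, 3)`, the value 2.5 belongs half to the first set and half to the second; raising `n_sets` narrows each triangle, which is the kind of change whose effect on accuracy the study examines.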
Using Data Structure Properties in Decision Tree Classifier Design
This paper studies performance enhancement techniques for decision tree classifiers (DTC) that are based on data structure analysis. To improve the performance of DTC, two methods are used - class decomposition, which uses the structure of class density, and taxonomy-based DTC design, which uses interactions between attribute values. The paper presents an experimental exploration of the methods, their strengths and weaknesses, and outlines directions for further research.
Arnis Kirshners, Galina Kuleshova and Arkady Borisov
Demand Forecasting Based on the Set of Short Time Series
This paper addresses the processing of short historical time series and discrete descriptive parameters in order to forecast demand solely on the basis of the parameters that describe a new product. Several data mining methods are used for data processing, including data extraction, pre-processing, cluster analysis and classification. Data are prepared for the data mining processes using user-defined parameters entered into the forecasting system. In the selected short historical time series, clustering determines the class membership of each object, which is the basis for creating prototypes. The k-means clustering algorithm is employed to find the optimal number of clusters in the sample; the number of clusters is determined on the basis of the mean absolute error. Classification with inductive decision trees then determines the correlation between the prototypes produced by clustering and the product-describing parameters. For new product demand clustering, the decision tree obtained from classification is used: the parameters describing the new product are projected onto the tree, and the tree leaf indicating the number of the prototype produced by clustering is found. The structure of the prototype curve depicts possible demand for the new product in the next period.
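The clustering step can be sketched as follows, assuming plain k-means with squared Euclidean assignment and a mean absolute error computed between each series and its nearest prototype. The toy series, the deterministic initialisation, and all names are hypothetical; the abstract does not state how the final number of clusters is picked from the error values, so the sketch only computes them, and the decision tree step is omitted.

```python
# Hypothetical short demand series, one list per product.
SERIES = [
    [1, 2, 3], [1, 2, 4], [2, 2, 3],
    [10, 9, 8], [11, 9, 8], [10, 10, 7],
]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mae(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def kmeans(series, k, iterations=20):
    """Return k prototype curves (centroids) for the given series."""
    # Deterministic initialisation (an assumption): k series spread evenly through the list.
    centroids = [list(series[i * len(series) // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for s in series:
            nearest = min(range(k), key=lambda c: sq_dist(s, centroids[c]))
            clusters[nearest].append(s)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def clustering_error(series, centroids):
    """Mean absolute error between each series and its closest prototype."""
    return sum(min(mae(s, c) for c in centroids) for s in series) / len(series)


# Error for each candidate number of clusters.
errors = {k: clustering_error(SERIES, kmeans(SERIES, k)) for k in (1, 2, 3)}
```

On this toy sample the error drops sharply from one cluster to two, reflecting the two visibly different demand shapes; each centroid plays the role of a prototype curve.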
From Inductive Learning Towards Interactive Inductive Learning
The growing amount of information in the world encourages the use of automatic data processing techniques that reduce humans' routine work. A wide range of methods is used for machine learning; however, inductive learning algorithms are preferable in systems where the decision-making steps must be understood and the results processed further, for instance expert systems, where the rules induced by learning algorithms can be used. As classification tasks become more complicated, a computer program may not be able to make a sufficiently informed decision by itself. In such situations, a collaborative approach between the machine and the system's user (an expert) would be useful. An inductive learning system learns classification from training examples and uses the induced rules to classify new cases. If a decision cannot be inferred from the rule base, a guess is made. Under uncertain conditions, an interactive inductive system could instead ask a human for the decision and extend its knowledge base with the rule derived from this human-made decision. The paper summarises the approaches discussed in related work and classifies them by the phase of the inductive learning process in which the human interaction occurs. As a result, a new approach to an interactive inductive system is presented. A conceptual example of topographical map classification using this system is demonstrated.
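The ask-the-expert loop described above can be sketched in a few lines. This is a conceptual illustration and not the system proposed in the paper: the rule representation, the `ask_expert` callback, and the policy of storing the whole case as a new rule are assumptions made for the example.

```python
class InteractiveClassifier:
    """Rule-based classifier that consults an expert when no rule applies."""

    def __init__(self, rules, ask_expert):
        self.rules = list(rules)      # each rule: (conditions dict, label)
        self.ask_expert = ask_expert  # hypothetical callback: case -> label

    def classify(self, case):
        for conditions, label in self.rules:
            if all(case.get(key) == value for key, value in conditions.items()):
                return label
        # No rule fires: ask the human instead of guessing, then remember the answer.
        label = self.ask_expert(case)
        self.rules.append((dict(case), label))
        return label
```

For instance, with a single rule `({"colour": "blue"}, "water")`, a green map region triggers one call to the expert; the same region is classified from the extended rule base the next time it appears.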