A Study on the Behaviour of the Algorithm for Finding Relevant Attributes and Membership Functions
One of the more recent approaches in machine learning is the use of fuzzy rules to solve classification problems. This paper describes an algorithm for finding relevant attributes and searching for membership functions. Experimental results are used to clarify which data sets allow primary membership functions to be obtained automatically from primary data. This ability to obtain membership functions is one of the algorithm's advantages, because it simplifies the classification task. Its applicability to fuzzy data is another merit. As a result, reliable fuzzy classification rules that separate the classes are obtained. Reconstructing the primary membership functions also reduces the number of IF-THEN rules derived from decision tables by up to a factor of three. Four experiments are conducted with different training and testing data set sizes. Conclusions are drawn about the training and testing data set sizes needed to achieve better results, as well as about the kind of data for which the algorithm is appropriate. Finally, possible directions for further research are outlined.
Using Fuzzy Algorithms for Modular Rules Induction
The goal of this research is to explore and compare two fuzzy algorithms that extract modular IF-THEN rules: Fuzzy PRISM and the Fuzzy AQR learning strategy. The article describes the historical need for algorithms based on a different induction process: it points out the weak spots of the ID3 algorithm and the need for improvements. The PRISM algorithm is proposed as an improvement to ID3 that changes its principal induction strategy. Both algorithms examined in this article are modifications of the PRISM algorithm. The paper provides step-by-step descriptions of both algorithms, a comparison of the results they achieve on real data, as well as conclusions and directions for future research.
Pēteris Grabusts, Arkādijs Borisovs and Ludmila Aleksejeva
The aim of the article is to analyse the methods for constructing decision trees that use decision tree learning with statement propositionalized attributes. Classical decision tree learning algorithms, as well as decision tree learning with propositionalized attributes, are reviewed. The article provides a detailed analysis of one methodology concerning the importance of decision trees in knowledge presentation. The use of ontologies is proposed for developing decision tree classification systems. Applying the methodology would allow classification accuracy to be improved.
Madara Gasparovica, Natalia Novoselova and Ludmila Aleksejeva
Using Fuzzy Logic to Solve Bioinformatics Tasks
The goal of this research is to investigate, collect and identify published methods that use fuzzy techniques in bioinformatics tasks. Special attention is paid to how the advantages of fuzzy techniques are used in the various stages, such as preprocessing, optimization and classifier building, of a classification task as difficult as processing microarray data. The article also reviews the most popular databases used in bioinformatics. The most promising methods are described in more detail. Conclusions are drawn about the performance of the algorithms and their use in further research.
Madara Gasparovica, Irena Tuleiko and Ludmila Aleksejeva
Influence of Membership Functions on Classification of Multi-Dimensional Data
The aim of this study is to explore whether the number of intervals for each attribute influences the classification result, and whether a larger number of intervals provides better classification accuracy with the Fuzzy PRISM algorithm. Feature selection has been carried out using the Fast Correlation-Based Filter solution, and the reduced data sets have then been used in experiments with the settings of the previous experiment series. The article also provides conclusions about the obtained classification results and analyzes the criteria of certain experiments and their impact on the final result. A further series of experiments was carried out, using the Fuzzy Unordered Rule Induction Algorithm, to assess whether and how the classification result is influenced by the categorization of continuous data, which is one of the steps in constructing membership functions. The experiments have been carried out using four real data sets: Golub leukemia, Singh prostate, and the gastric cancer and leukemia donor data sets of the Latvian Biomedical Research and Study Center.
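To illustrate how the number of intervals per attribute shapes its membership functions, the following is a minimal sketch in Python. It assumes evenly spaced triangular fuzzy sets over an attribute's range, which is one common construction and not necessarily the one used in the study; the function names `triangular_partition` and `memberships` are hypothetical.

```python
def triangular_partition(lo, hi, n_sets):
    """Return the peak positions of n_sets evenly spaced triangular
    fuzzy sets covering the attribute range [lo, hi].
    (Assumed construction for illustration only.)"""
    if n_sets < 2:
        raise ValueError("at least two fuzzy sets are needed")
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    step = (hi - lo) / (n_sets - 1)
    return [lo + i * step for i in range(n_sets)]


def memberships(x, peaks):
    """Membership degree of x in each triangular set.
    Neighbouring triangles overlap so that the degrees sum to 1
    for any x inside the covered range."""
    step = peaks[1] - peaks[0]
    # Clamp values outside the range to the nearest boundary set.
    x = min(max(x, peaks[0]), peaks[-1])
    return [max(0.0, 1.0 - abs(x - p) / step) for p in peaks]


# Three fuzzy sets ("low", "medium", "high") on a made-up range [0, 10].
peaks3 = triangular_partition(0.0, 10.0, 3)
degrees3 = memberships(2.5, peaks3)

# Five fuzzy sets on the same range give a finer partition.
peaks5 = triangular_partition(0.0, 10.0, 5)
degrees5 = memberships(2.5, peaks5)
```

With three sets the value 2.5 belongs partly to the first and partly to the second set, while with five sets it falls exactly on the peak of the second set, so the same value is described differently depending on the chosen number of intervals.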
Ivars Namatēvs, Ludmila Aleksejeva and Inese Poļaka
Extracting meaningful information with artificial neural networks, with a focus on developing new insights into sports performance and supporting decision making, is crucial for success. The aim of this article is to create a theoretical framework and structurally connect the sports and multi-layer artificial neural network domains by: (a) describing sports as a complex socio-technical system; (b) identifying a pre-processing subsystem for classification; (c) selecting features using the data-driven valued tolerance ratio method; (d) designing a predictive system model of sports performance using a backpropagation neural network. This would allow performance levels to be identified, classified and forecast for an enlarged data set.
This article describes a fuzzy classification system developed by the authors that is particularly applicable to bioinformatics data classification. The description focuses on the following steps in the system: 1) data preprocessing; 2) classifier training and construction of the rule base; 3) classification of new records; and 4) evaluation of the results. It also explains the details of the processes in each step, as well as the processes of missing data replacement, reduction of the number of alternatives and functions, construction of membership functions and stretching of the induced rules. The article concludes with a justification of the methods and algorithms chosen for each process of the system.
The accumulation of knowledge and its use have become important factors that promote economic development, as they contribute to a country's competitiveness in the global economy. The research derives its basic significance from defining new approaches to the organisation, functioning and efficiency of the higher education system (HES), emphasising its qualitative aspects. The aim of the article is to describe the influence of education reform on economic competitiveness, paying special attention to analysing and evaluating international experience from an interdisciplinary perspective, including economics, pedagogy, etc. Quantitative indicators are used to characterise specific features of the HES and the interaction of this system with the overall context of state development. Some aspects of the Latvian HES are also analysed. The economic activity of inhabitants often depends directly on their level of education. In order to reorganise the Latvian HES and increase its competitiveness and efficiency, thus ensuring quality and availability, the Latvian education system must define a medium-term (4–5 years) and long-term (10–15 years) development plan that is coordinated with national economic development.
Arnis Kirshners, Inese Polaka and Ludmila Aleksejeva
Data mining methods are applied to a medical task: finding information about the influence of Helicobacter pylori on increased gastric cancer risk by analysing adverse factors of individual lifestyle. During data preprocessing, the data are cleared of noise and other factors, reduced in dimensionality, transformed for the task and cleared of non-informative attributes. Data classification using the C4.5, CN2 and k-nearest neighbour algorithms is carried out to find relationships between the analysed attributes and the descriptive class attribute, Helicobacter pylori presence, which could influence the risk of cancer development. Experimental analysis is carried out using data from the database of the Latvian-based project “Interdisciplinary Research Group for Early Cancer Detection and Cancer Prevention”.
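Of the three classifiers named above, k-nearest neighbour is the simplest to sketch. The following minimal Python example assumes Euclidean distance and majority voting; the function name `knn_predict`, the two numeric attributes and all values are invented for illustration and are not taken from the project database.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training records, using Euclidean distance.
    `train` is a list of (feature_tuple, class_label) pairs."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the training set size")
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy records with two made-up numeric attributes (illustrative only).
toy_train = [
    ((1.0, 1.0), "negative"),
    ((1.2, 0.8), "negative"),
    ((0.9, 1.1), "negative"),
    ((4.0, 4.2), "positive"),
    ((4.1, 3.9), "positive"),
    ((3.8, 4.0), "positive"),
]

label_a = knn_predict(toy_train, (1.1, 1.0), k=3)
label_b = knn_predict(toy_train, (4.0, 4.0), k=3)
```

An odd `k` is used here so that a two-class vote cannot end in a tie; in practice the attributes would normally be scaled first so that no single attribute dominates the distance.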