Search Results

Open access

Pavels Osipovs and Arkady Borisov

Abstract

When developing an anomaly detection system for an electronic information system, it is necessary to review the existing research on user behaviour modelling. Approaches to user behaviour modelling are very diverse and include algorithms based on neural networks, agent-based approaches, Bayesian networks and ontologies. Each approach has its own advantages, disadvantages and distinctive features, and differs in how applicable it is to the infrastructure of modern complex electronic systems.

Open access

Arnis Kirshners and Arkady Borisov

Analysis of Short Time Series in Gene Expression Tasks

The article analyzes various clustering approaches that are used in gene expression tasks. The chosen approaches are described and examined from the viewpoint of data mining clustering algorithms. The article gives a short description of the working principles and characteristics of the examined methods and algorithms, and of the data sets used in the experiments. It presents the results of experiments that apply clustering algorithms to short time series in bioinformatics tasks, specifically gene expression problems, and provides conclusions and an evaluation of each approach used. Finally, the article analyzes the prospects for building a new method that is based on data mining approaches and principles, solves bioinformatics tasks involving short time series, and presents its results in a form that is easy for bioinformatics experts to interpret.

Open access

Inese Polaka and Arkady Borisov

Impact of Antibody Panel Size on Classification Accuracy

This paper experimentally studies the influence of antibody panel size reduction on classification results. The presented study includes four classification methods and five feature evaluators that are applied to five different biomedical data sets with large dimensionality (1200 features). The behaviour of the classifiers in these data sets is examined to reveal overall trends of dimensionality reduction impact on classification accuracy.

Open access

Pavel Osipov and Arkady Borisov

Practice of Web Data Mining Methods Application

The recent growth of information on the Internet places high demands on the efficiency of processing algorithms. This paper discusses several algorithms from the field of Web Data Mining that have proved effective in many existing applications. The paper is divided into two logical parts: the first gives a theoretical description of the algorithms, and the second contains examples of their successful use in solving real problems. Algorithms for detecting near-duplicate documents are currently used by all the leading search engines in the world. The paper describes the following algorithms of this kind: shingles, signature methods and image-based algorithms. Two clustering methods are also considered: fuzzy c-means (FCM) clustering and ant colony clustering (the Standard Ant Clustering Algorithm, SACA). In conclusion, the paper describes the successful application of fuzzy clustering together with the DataEngine software toolkit to improve the efficiency of the bank "BCI Bank", as well as the combined use of ant colony clustering and linear genetic programming to improve the prediction of server load for the high-load Internet portal of Monash Institut.
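The shingles idea mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes word-level shingles of length k and uses Jaccard similarity as the resemblance measure. The function names, the default k = 3 and the whitespace tokenisation are illustrative assumptions, not details taken from the paper.

```python
from __future__ import annotations


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) of a text.

    Tokenisation by lowercasing and whitespace splitting is an assumption
    made for this sketch only.
    """
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < k:
        # A text shorter than k words yields a single, shorter shingle.
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def resemblance(doc1: str, doc2: str, k: int = 3) -> float:
    """Estimate how close two documents are to being duplicates (0.0 to 1.0)."""
    return jaccard(shingles(doc1, k), shingles(doc2, k))
```

A value close to 1.0 suggests near-duplicate documents. Production systems typically hash the shingles and compare compact sketches rather than full sets; that optimisation is omitted here.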

Open access

Darya Plinere and Arkady Borisov

SWRL: Rule Acquisition Using Ontology

Rule-based systems are now very common, and ontology-based systems are becoming ever more popular, especially in combination with rule-based ones. The most widely used ontology development platform is Protégé. Protégé provides a knowledge acquisition tool, but rule acquisition remains the main issue for ontology-based rule systems. This paper presents an approach that uses SWRL rules Tab, a plug-in for Protégé, for rule acquisition. SWRL rules Tab transforms conjunctive rules into Jess rules in IF…THEN form.

Open access

Pēteris Grabusts and Arkady Borisov

Clustering Methodology for Time Series Mining

A time series is a sequence of real-valued data representing measurements of a variable at successive time intervals. Time series analysis is a well-known task; in recent years, however, research has explored the use of clustering for time series analysis. The main motivation for representing a time series in the form of clusters is to better capture the main characteristics of the data. The central goal of this paper was to investigate clustering methodology for time series data mining, to explore the capabilities of time series similarity measures and to use them in the analysis of time series clustering results. The more complex similarity measures considered include the Longest Common Subsequence (LCSS) method. Two tasks were completed. The first was to define time series similarity measures; it was established that the LCSS method detects time series similarity better than the Euclidean distance. The second was to explore the capabilities of the classical k-means clustering algorithm in time series clustering. The experiment showed that the results of time series clustering with the k-means algorithm correspond to the results obtained with the LCSS method, so the clustering results for the specific time series are adequate.
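To make the comparison between LCSS and the Euclidean distance concrete, here is a minimal Python sketch of one common LCSS formulation for real-valued series. Two points are treated as matching when they differ by at most a threshold eps, and the similarity is the LCSS length divided by the length of the shorter series. The threshold value, the normalisation and the function names are assumptions made for illustration; the paper may define the measure differently.

```python
import math
from typing import Sequence


def lcss_length(a: Sequence[float], b: Sequence[float], eps: float = 0.5) -> int:
    """Length of the longest common subsequence, matching points within eps."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]


def lcss_similarity(a: Sequence[float], b: Sequence[float], eps: float = 0.5) -> float:
    """LCSS length normalised by the shorter series length (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps) / min(len(a), len(b))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance; defined here only for series of equal length."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

With the series [1, 2, 3, 4, 5] and [1, 2, 100, 4, 5], the single outlier produces a large Euclidean distance, while LCSS skips it and still matches four of the five points. This robustness to outliers and shifts is one commonly cited reason for preferring LCSS-style measures.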

Open access

Pavel Osipov and Arkady Borisov

Use of the Deferred Approach in Scientific Applications

This paper examines the implementation of a security system with strict requirements on the time taken to evaluate each user transaction, using as an example a system for user behaviour modelling based on Markov models. It evaluates the effectiveness of the classical approach to implementing a server that calculates the user model metric, of an approach using lightweight threads, and of a newer paradigm, the Deferred-based event model.

A number of tests on various configurations are conducted. They show which server is preferable for the Deferred-based type of system and suggest an approach to implementing this type of request service.
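As an illustration of the kind of per-transaction metric a Markov user model can produce, the following Python sketch fits a first-order Markov chain to sequences of user actions and scores a new session by its average log transition probability. It does not reproduce the server architectures compared in the paper. The class name, the additive smoothing and the scoring formula are all assumptions made for this example.

```python
import math
from collections import defaultdict
from typing import Iterable, Sequence


class MarkovUserModel:
    """First-order Markov chain over user actions (hypothetical sketch)."""

    def __init__(self, smoothing: float = 1.0) -> None:
        self.smoothing = smoothing
        # counts[prev][nxt] is how often action nxt followed action prev.
        self.counts = defaultdict(lambda: defaultdict(int))
        self.actions = set()

    def fit(self, sessions: Iterable[Sequence[str]]) -> "MarkovUserModel":
        for session in sessions:
            self.actions.update(session)
            for prev, nxt in zip(session, session[1:]):
                self.counts[prev][nxt] += 1
        return self

    def transition_prob(self, prev: str, nxt: str) -> float:
        """Smoothed P(nxt | prev), reserving one extra slot for unseen actions."""
        row = self.counts.get(prev, {})
        total = sum(row.values())
        vocab = len(self.actions) + 1
        return (row.get(nxt, 0) + self.smoothing) / (total + self.smoothing * vocab)

    def score(self, session: Sequence[str]) -> float:
        """Average log-probability of the transitions; lower means more unusual."""
        pairs = list(zip(session, session[1:]))
        if not pairs:
            return 0.0
        return sum(math.log(self.transition_prob(p, n)) for p, n in pairs) / len(pairs)
```

A session whose score falls well below the scores seen during training could be flagged for review. Scoring a session is linear in its length, which matters when each transaction must be evaluated within a strict time budget.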

Open access

Pavel Osipov and Arkady Borisov

Usage of Ontologies in Systems of Data Exchange

This paper describes methods and techniques for effectively extracting knowledge from large volumes of heterogeneous data. It also discusses methods for structuring raw data through automatic classification using ontologies. The first part of the article describes the basic technologies for realizing the Semantic Web. Much attention is paid to ontologies as the major concept for structuring information at a very high level. The second part examines the AVT-DTL algorithm proposed by Jun Zhang, which automatically creates classifiers from the available raw, potentially incomplete data. The algorithm uses a new concept of floating levels of ontology; the test results show that it outperforms the best existing algorithms for creating classifiers.

Open access

Darya Plinere and Arkady Borisov

A Negotiation-Based Multi-Agent System for Supply Chain Management

The supply chain is a key concept in logistics. A supply chain is a set of logistics system nodes that is linearly ordered by the material, information or financial flow in order to analyze or synthesize a specific set of logistic functions and (or) costs. Multi-agent systems are suitable for domains that involve interactions between different people or organizations with different (possibly conflicting) goals and proprietary information. They view the supply chain as a set of intelligent agents, each responsible for one or more activities in the supply chain. The ontology, in turn, describes the domain area and becomes a mechanism that aids in understanding and analyzing the information flow between agents. The use of ontologies in a multi-agent system provides the following benefits: the ontology enables knowledge structuring and sharing, increases the reliability of the agent system and provides the basis for interaction between the agents. This paper proposes a method of applying a multi-agent system to the cooperation of supply chain nodes and shows the interaction between agents inside one of the supply chain nodes, the manufacturer node.

Open access

Arnis Kirshners and Arkady Borisov

Processing Short Time Series with Data Mining Methods

This article examines several data mining approaches to short time series analysis. The methods are built on clustering algorithms, with or without modifications. The proposed methods handle short time series in which the number of observations differs between series and the available historical information is short. The inspected approaches are intended for complex tasks where statistical analysis methods cannot be applied or do not provide the necessary efficiency. The proposed methods are based on grid-based clustering and modifications of the k-means algorithm.