Search Results

1 - 5 of 5 items

  • Author: Demetrovics Janos

Abstract

The problem of finding reducts plays an important role in processing information in decision tables. The objective of the attribute reduction problem is to reject redundant attributes in order to find the core attributes for data processing. Attribute reduction in decision tables is the process of finding a minimal subset of conditional attributes that preserves the classification ability of the decision table. In this paper we present the time complexity of the problem of finding all reducts of a consistent decision table. We prove that this time complexity is exponential with respect to the number of attributes of the decision table. Our proof is performed in two steps. The first step is to show that there exists an exponential algorithm which finds all reducts. The second step is to prove that the time complexity of the problem of finding all reducts of a decision table is at least exponential.
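As a rough illustration of why an exponential algorithm for finding all reducts exists, the sketch below enumerates every subset of conditional attributes of a small consistent decision table and keeps the minimal ones that still determine the decision. It is a brute-force sketch, not the algorithm from the paper; the function names, the row layout (a tuple of condition values paired with a decision) and the sample table are assumptions made for illustration.

```python
from itertools import combinations


def is_consistent_on(rows, attrs):
    """Return True if rows that agree on `attrs` always share one decision."""
    seen = {}
    for cond, decision in rows:
        key = tuple(cond[a] for a in attrs)
        # setdefault stores the first decision seen for this key.
        if seen.setdefault(key, decision) != decision:
            return False
    return True


def all_reducts(rows, n_attrs):
    """Enumerate all reducts by checking subsets in order of increasing size.

    Up to 2**n_attrs subsets are examined, so the running time is
    exponential in the number of conditional attributes.
    """
    reducts = []
    for size in range(n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            candidate = set(subset)
            # A superset of an already found reduct cannot be minimal.
            if any(r <= candidate for r in reducts):
                continue
            if is_consistent_on(rows, subset):
                reducts.append(candidate)
    return reducts


# Hypothetical consistent decision table: three conditional attributes
# (indexed 0, 1, 2) and one decision attribute.
example_rows = [
    ((0, 0, 1), "no"),
    ((0, 1, 0), "yes"),
    ((1, 0, 0), "yes"),
    ((1, 1, 1), "no"),
]

print(all_reducts(example_rows, 3))  # prints [{2}, {0, 1}]
```

In this table attribute 2 alone determines the decision, and so does the pair of attributes 0 and 1, while neither attribute 0 nor attribute 1 suffices on its own.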

Abstract

In rough set theory, the number of all reducts for a given decision table can be exponential with respect to the number of attributes. This paper investigates the problem of determining the set of all reductive attributes, that is, the attributes which are present in at least one reduct of an incomplete decision table. We theoretically prove that this problem can be solved in polynomial time. This result shows that the problem of determining the union of all reducts can be solved in polynomial time, and that the problem of determining the set of all redundant attributes, which are not present in any reduct, can also be solved in polynomial time.

Abstract

Feature selection is a vital problem which needs to be solved effectively in knowledge discovery in databases and pattern recognition for two basic reasons: minimizing costs and accurately classifying data. Feature selection using rough set theory is also called attribute reduction. It has attracted a lot of attention from researchers, and numerous promising results have been obtained. However, most of them are applied to static data, and attribute reduction in dynamic databases is still in its early stages. This paper focuses on developing incremental methods and algorithms to derive reducts, employing a distance measure, when the condition attribute set of a decision system varies. We also conduct experiments on UCI data sets, and the experimental results show that the proposed algorithms are better in terms of time consumption and reducts’ cardinality than a non-incremental heuristic algorithm and the incremental approach using information entropy proposed by the authors in [17].

Abstract

According to the traditional rough set theory approach, attribute reduction methods are performed on decision tables with a discretized value domain, which are decision tables obtained by data discretization methods. In recent years, researchers have proposed methods based on the fuzzy rough set approach to solve the problem of attribute reduction in decision tables with a numerical value domain. In this paper, we propose a fuzzy distance between two partitions and an attribute reduction method for numerical decision tables based on the proposed fuzzy distance. Experiments on data sets show that the classification accuracy of the proposed method is better than that of methods based on fuzzy entropy.

Abstract

Mining High Utility Sequential Patterns (HUSP) is an emerging topic in data mining which attracts many researchers. HUSP mining algorithms can extract sequential patterns having high utility (importance) in a quantitative sequence database. In real-world applications, the time intervals between elements are also very important. However, recent HUSP mining algorithms cannot extract sequential patterns with time intervals between elements. Thus, in this paper, we propose an algorithm for mining high utility sequential patterns with time intervals. We consider not only the utilities of sequential patterns, but also their time intervals. The sequence weight utility value is used to ensure the important downward closure property. In addition, we use four time constraints for dealing with time intervals in a sequence, in order to extract more meaningful patterns. Experimental results show that our proposed method is efficient and effective in mining high utility sequential patterns with time intervals.
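To make the role of the sequence weight utility more concrete, the sketch below computes it in the way it is commonly defined in the HUSP literature: the sum of the total utilities of all sequences that contain a pattern. This is only an illustrative sketch and not the algorithm from the paper. It ignores time intervals and the four time constraints, which the abstract does not specify, and the data layout, function names and sample database are assumptions.

```python
def sequence_utility(seq):
    """Total utility of a quantitative sequence (sum over all its items)."""
    return sum(u for element in seq for u in element.values())


def contains(seq, pattern):
    """Return True if `pattern` occurs in `seq` as an ordered subsequence.

    `seq` is a list of elements, each a dict mapping item -> utility.
    `pattern` is a list of item sets. Each pattern element must be a subset
    of a distinct sequence element, in order (greedy earliest match).
    """
    i = 0
    for element in seq:
        if i < len(pattern) and pattern[i].issubset(element):
            i += 1
    return i == len(pattern)


def swu(database, pattern):
    """Sequence weight utility: utilities of all sequences containing pattern."""
    return sum(sequence_utility(s) for s in database if contains(s, pattern))


# Hypothetical quantitative sequence database; the numbers are item utilities.
example_db = [
    [{"a": 5, "b": 2}, {"c": 3}],   # sequence utility 10
    [{"a": 1}, {"b": 4}],           # sequence utility 5
    [{"c": 6}, {"a": 2, "c": 1}],   # sequence utility 9
]

print(swu(example_db, [{"a"}]))          # prints 24
print(swu(example_db, [{"a"}, {"c"}]))   # prints 10
```

Extending the pattern from `a` to `a` followed by `c` can only shrink the set of sequences that contain it, so the value cannot increase. This is the downward closure property that allows unpromising patterns to be pruned early.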