Open Access

Mining Similar Traces of Entities on Web

Cybernetics and Information Technologies
Special Issue on Logistics, Informatics and Service Science

Events about entities have been widely collected on the Web, allowing us to analyze how peer entities interact and to learn the relationships that exist among them. In this paper we investigate similar traces, which have not been adequately studied so far. Intuitively, peer entities tend to have similar traces. Mining similar traces poses two challenges: (1) the time lags between traces are usually unknown and varying; (2) the events of entities are large in scale, and a model representing all of them is complex. We propose a simple but practical method that addresses these challenges. First, sliding windows are used to select the significant events and then find the candidate topic sequences. Second, dynamic programming is employed to mine similar candidate topic sequences of entities. Finally, an efficient method is proposed to mine all the similar traces of entities. The method is able to mine similar traces of peer entities with high accuracy. We conduct comprehensive experiments on synthetic datasets to demonstrate the efficiency of the proposed method.
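The first two steps can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify the significance criterion or the dynamic program, so the sketch assumes a topic is significant when it occurs at least `min_count` times inside a time window, and it swaps in a longest-common-subsequence comparison as the dynamic-programming step. All function names, parameters, and the event format `(time, topic)` are hypothetical.

```python
from collections import Counter


def candidate_topic_sequence(events, window, min_count):
    """Return topics in the order they become significant.

    events: list of (time, topic) pairs sorted by time (assumed format).
    A topic counts as significant when it appears at least `min_count`
    times among events newer than `t - window` (assumed criterion).
    """
    seq = []
    counts = Counter()
    start = 0
    for t_end, topic in events:
        counts[topic] += 1
        # Slide the left edge so the window covers less than `window` time units.
        while t_end - events[start][0] >= window:
            counts[events[start][1]] -= 1
            start += 1
        # Record the topic once per consecutive run of significance.
        if counts[topic] >= min_count and (not seq or seq[-1] != topic):
            seq.append(topic)
    return seq


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]


def trace_similarity(a, b):
    """Share of the longer sequence covered by the common subsequence."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))
```

Because the comparison works on the order of topics rather than on their timestamps, two entities whose events follow the same order score as similar even when the lag between them is unknown or changes over time.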

eISSN: 1314-4081
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology