Markov Chains in the Task of Author’s Writing Style Profile Construction

Open access

Abstract

This paper examines the use of Markov chains for constructing a profile of an author’s writing style. The constructed profile can then be used to analyze other texts and calculate their level of similarity to it. Extracting a writing style profile that is characteristic of a specific person is a topical task in many spheres of human activity; one example is authorship detection for scientific and fiction texts. The paper describes the basic theoretical apparatus used for profile construction and the software implementation of the experimental system, and it presents the experiments performed together with their results and analysis.
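As a rough illustration of the idea summarized in the abstract, the sketch below builds a first-order Markov chain over the characters of a text and scores another text against it. The abstract does not state which units the chain is built over, how transition probabilities are estimated, or which similarity measure is used, so the character-level tokens, the function names `build_profile` and `similarity`, the `floor` parameter, and the mean log-probability score are all assumptions made for this example and not the paper's method.

```python
import math
from collections import Counter, defaultdict


def build_profile(text):
    """Estimate first-order transition probabilities between adjacent characters.

    Hypothetical helper: the character-level unit is an assumption.
    """
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    # Normalize each row of counts into a probability distribution.
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


def similarity(profile, text, floor=1e-6):
    """Mean log-probability of the text's transitions under the profile.

    Higher (closer to zero) means the text looks more like the profile.
    Transitions never seen in the profile get the small probability `floor`
    (an assumed smoothing choice).
    """
    pairs = list(zip(text, text[1:]))
    if not pairs:
        return float("-inf")
    total = sum(math.log(profile.get(a, {}).get(b, floor)) for a, b in pairs)
    return total / len(pairs)


profile = build_profile("abab")
same_style = similarity(profile, "abab")
other_style = similarity(profile, "aaaa")
```

In this toy case the text that repeats the profile's transitions scores higher than the one that uses transitions the profile has never seen. A realistic profile would need far longer training texts and probably a more careful smoothing scheme.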

Information Technology and Management Science

The Journal of Riga Technical University
