Markov Chains in the Task of Author’s Writing Style Profile Construction / A Study of the Applicability of Markov Chains to Author Style Identification

Open access

Abstract

This paper examines the use of Markov chains for constructing a profile of an author’s writing style. The constructed profile can then be used to analyze other texts and calculate their similarity to it. Extracting a writing-style profile that is characteristic of a specific person is a relevant task in many fields; detecting the authorship of scientific and fiction texts is one example. The paper describes the basic theoretical apparatus used for profile construction and the software implementation of the experimental system, and presents the experiments performed together with their results and analysis.
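The general idea of a Markov-chain style profile can be illustrated with a minimal sketch. It is not the paper's implementation: the character-level first-order chain, the function names `build_profile` and `similarity`, the average log-probability score, and the `floor` value for unseen transitions are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
import math


def build_profile(text):
    """Estimate first-order transition probabilities between characters.

    Hypothetical helper: the profile is a dict mapping each character
    to the distribution of characters that follow it in `text`.
    """
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    profile = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        profile[prev] = {nxt: n / total for nxt, n in followers.items()}
    return profile


def similarity(profile, text, floor=1e-6):
    """Average log-probability per transition of `text` under `profile`.

    Transitions never seen in the profile get the assumed probability
    `floor`. Higher (closer to zero) means more similar.
    """
    pairs = list(zip(text, text[1:]))
    if not pairs:
        return 0.0
    log_p = sum(math.log(profile.get(a, {}).get(b, floor)) for a, b in pairs)
    return log_p / len(pairs)


# Toy usage: a profile built from one text scores a text with the same
# transition pattern higher than a text with a different pattern.
profile = build_profile("abababab")
same_style = similarity(profile, "abab")
other_style = similarity(profile, "aaaa")
```

In this toy case `same_style` exceeds `other_style`, because every transition in "abab" was observed in the profiled text while none of those in "aaaa" were. A realistic system would likely work on words or other stylistic features and use a more careful smoothing scheme.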


