Stylometric authorship attribution aims to identify the author of an anonymous or disputed document by examining its writing style. The development of powerful machine-learning-based stylometric authorship attribution methods poses a serious privacy threat to individuals, such as journalists and activists, who wish to publish anonymously. Researchers have proposed several authorship obfuscation approaches that make targeted changes (e.g., word or phrase replacements) to evade attribution while preserving semantics. Unfortunately, existing authorship obfuscation approaches fall short because they require manual effort, require significant training data, or do not work for long documents. To address these limitations, we propose Mutant-X, a genetic-algorithm-based random search framework that automatically obfuscates text to evade attribution while keeping the semantics of the obfuscated text similar to the original text. Specifically, Mutant-X sequentially changes the text using mutation and crossover techniques, guided by a fitness function that takes into account both attribution probability and semantic relevance. Although Mutant-X requires black-box knowledge of the adversary’s classifier, it does not require any additional training data and works on documents of any length. We evaluate Mutant-X against a variety of authorship attribution methods on two different text corpora. Our results show that Mutant-X can decrease the accuracy of state-of-the-art authorship attribution methods by as much as 64% while preserving semantics much better than existing automated authorship obfuscation approaches. While Mutant-X advances the state of the art in automated authorship obfuscation, we find that it does not generalize to a stronger threat model in which the adversary uses a different attribution classifier from the one Mutant-X assumes. Our findings call for future research to improve the generalizability (or transferability) of automated authorship obfuscation approaches.
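The search loop described in the abstract (mutation and crossover guided by a fitness function that balances attribution probability against semantic similarity) can be illustrated with a minimal Python sketch. This is not the Mutant-X implementation: the synonym table, the toy black-box classifier, the position-overlap similarity score, the fitness weight, and all function names below are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import random

# Hypothetical synonym table (assumption); the abstract does not say where
# replacement words come from.
SYNONYMS = {
    "big": ["large", "huge"],
    "quick": ["fast", "rapid"],
    "said": ["stated", "remarked"],
    "good": ["fine", "decent"],
}

def toy_attribution_prob(words):
    """Stand-in for the adversary's black-box classifier (assumption):
    the fraction of words that are the author's 'favourite' words."""
    hits = sum(w in SYNONYMS for w in words)
    return hits / max(len(words), 1)

def toy_similarity(original, candidate):
    """Stand-in semantic score (assumption): fraction of unchanged positions."""
    same = sum(a == b for a, b in zip(original, candidate))
    return same / max(len(original), 1)

def fitness(original, candidate, weight=0.75):
    """Higher is better: low attribution probability, high similarity."""
    evade = 1.0 - toy_attribution_prob(candidate)
    return weight * evade + (1.0 - weight) * toy_similarity(original, candidate)

def mutate(words, rng):
    """Replace one randomly chosen replaceable word with a synonym."""
    candidates = [i for i, w in enumerate(words) if w in SYNONYMS]
    out = list(words)
    if candidates:
        i = rng.choice(candidates)
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def crossover(a, b, rng):
    """Single-point crossover of two equal-length word lists."""
    if len(a) < 2:
        return list(a)
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def obfuscate(text, generations=30, pop_size=8, seed=0):
    """Evolve word-level variants of `text` and return the fittest one."""
    rng = random.Random(seed)
    original = text.split()
    population = [mutate(original, rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            p1, p2 = rng.sample(population, 2)
            children.append(mutate(crossover(p1, p2, rng), rng))
        # Elitist selection: keep the fittest pop_size of parents and children.
        ranked = sorted(population + children,
                        key=lambda c: fitness(original, c), reverse=True)
        population = ranked[:pop_size]
    return " ".join(population[0])

if __name__ == "__main__":
    print(obfuscate("he said the big dog was quick and good"))
```

In this toy setting the classifier is queried only through its output probability, which mirrors the black-box assumption stated above; a real system would substitute an actual attribution model and a proper semantic similarity measure for the two stand-in functions.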