The EU General Data Protection Regulation (GDPR) is one of the most demanding and comprehensive privacy regulations of all time. A year after it went into effect, we study its impact on the landscape of online privacy policies. We conduct the first longitudinal, in-depth, and at-scale assessment of privacy policies before and after the GDPR. We gauge the complete consumption cycle of these policies, from users’ first impressions to the compliance assessment. We create a diverse corpus of two sets of 6,278 unique English-language privacy policies from inside and outside the EU, covering their pre-GDPR and post-GDPR versions. The results of our tests and analyses suggest that the GDPR has been a catalyst for a major overhaul of privacy policies inside and outside the EU. This overhaul, which manifests in extensive textual changes, especially for EU-based websites, brings mixed benefits to users.
While privacy policies have become considerably longer, our user study with 470 participants on Amazon MTurk indicates a significant improvement in the visual representation of privacy policies from the users’ perspective for EU websites. We further develop a new workflow for the automated assessment of requirements in privacy policies. Using this workflow, we show that, after the GDPR, privacy policies cover more data practices and are more consistent with seven compliance requirements. We also assess how transparent organizations are about their privacy practices by performing a specificity analysis. In this analysis, we find evidence of positive changes triggered by the GDPR, with the specificity level improving on average. Still, we find the landscape of privacy policies to be in a transitional phase: many policies still do not meet several key GDPR requirements, or their improved coverage comes with reduced specificity.