Missed by Filter Lists: Detecting Unknown Third-Party Trackers with Invisible Pixels

Imane Fouad, Nataliia Bielova, Arnaud Legout and Natasa Sarafijanovic-Djukic

Abstract

Web tracking has been extensively studied over the last decade. To detect tracking, previous studies and user tools rely on filter lists. However, it has been shown that filter lists miss trackers. In this paper, we propose an alternative method to detect trackers, inspired by analyzing the behavior of invisible pixels. By crawling 84,658 webpages from 8,744 domains, we find that third-party invisible pixels are widely deployed: they are present on more than 94.51% of domains and constitute 35.66% of all third-party images. We propose a fine-grained behavioral classification of tracking based on the analysis of invisible pixels. We use this classification to detect new categories of tracking and uncover new collaborations between domains on the full dataset of 4,216,454 third-party requests. We demonstrate that two popular methods to detect tracking, based on EasyList & EasyPrivacy and on Disconnect lists, respectively miss 25.22% and 30.34% of the trackers that we detect. Moreover, we find that if we combine all three lists, 379,245 requests originating from 8,744 domains still track users on 68.70% of websites.
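The abstract does not spell out how an invisible pixel is recognized, so the following Python sketch is only an illustration under stated assumptions, not the authors' implementation. It assumes an invisible pixel is an image response of at most 1x1 pixels, approximates "third-party" by comparing the last two host labels of the page and request URLs (a real crawler would use the Public Suffix List), and uses a hypothetical record layout with `url`, `content_type`, `width` and `height` keys.

```python
from urllib.parse import urlparse


def site_of(url: str) -> str:
    """Naive stand-in for the registrable domain: the last two host labels.

    Assumption for illustration only; it is wrong for suffixes such as
    co.uk, where the Public Suffix List would be needed.
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_third_party(page_url: str, request_url: str) -> bool:
    """A request is third-party when its site differs from the page's site."""
    return site_of(page_url) != site_of(request_url)


def is_invisible_pixel(record: dict) -> bool:
    """Assumed definition: an image response no larger than 1x1 pixels."""
    return (
        record["content_type"].startswith("image/")
        and record["width"] <= 1
        and record["height"] <= 1
    )


def invisible_pixel_share(page_url: str, records: list) -> float:
    """Fraction of third-party images on a page that are invisible pixels."""
    third_party_images = [
        r for r in records
        if r["content_type"].startswith("image/")
        and is_third_party(page_url, r["url"])
    ]
    if not third_party_images:
        return 0.0
    pixels = [r for r in third_party_images if is_invisible_pixel(r)]
    return len(pixels) / len(third_party_images)


# Hypothetical responses observed while loading one page.
PAGE = "https://news.example/article"
RECORDS = [
    {"url": "https://tracker.example.net/p.gif",
     "content_type": "image/gif", "width": 1, "height": 1},
    {"url": "https://cdn.example.org/banner.png",
     "content_type": "image/png", "width": 300, "height": 250},
    {"url": "https://www.news.example/logo.png",
     "content_type": "image/png", "width": 120, "height": 40},
    {"url": "https://static.example.net/app.js",
     "content_type": "application/javascript", "width": 0, "height": 0},
]

share = invisible_pixel_share(PAGE, RECORDS)
```

Aggregating such per-page results over a crawl gives statistics of the kind reported above, such as the share of third-party images that are invisible pixels. Classifying the tracking behavior behind each pixel request is a separate step that this sketch does not attempt.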


