Several recent studies have demonstrated that people exhibit a high degree of behavioural uniqueness. This has serious privacy implications, as most individuals become increasingly re-identifiable in large datasets or can be tracked while browsing the web using only a few of their attributes, called their fingerprints. Often, the success of these attacks depends on explicit constraints on the number of attributes that can be learned about individuals, i.e., the size of their fingerprints. These constraints can be budgetary as well as technical constraints imposed by the data holder. For instance, Apple restricts the number of applications that can be called by another application on iOS in order to mitigate the potential privacy threats of leaking the list of installed applications on a device. In this work, we address the problem of identifying the attributes (e.g., smartphone applications) that can serve as a fingerprint of users given constraints on the size of the fingerprint. We give the best fingerprinting algorithms in general, and evaluate their effectiveness on several real-world datasets. Our results show that current privacy guards limiting the number of attributes that can be queried about individuals are insufficient to mitigate the potential privacy risks in many practical cases.
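To make the problem statement concrete, the following is a minimal sketch of size-constrained fingerprint selection. It is not the algorithm proposed in this work: it assumes each user is represented as a set of attributes (e.g., installed applications) and uses a simple greedy heuristic that repeatedly adds the attribute yielding the most uniquely identifiable users. The names `count_unique` and `greedy_fingerprint`, and the toy data, are hypothetical and chosen for illustration only.

```python
from collections import Counter


def count_unique(users, attrs):
    """Count users whose restriction to `attrs` is shared by no other user."""
    projections = [frozenset(u & attrs) for u in users]
    counts = Counter(projections)
    return sum(1 for p in projections if counts[p] == 1)


def greedy_fingerprint(users, k):
    """Greedily pick at most k attributes (illustrative heuristic only).

    At each step, add the attribute that maximises the number of users
    made unique by the attributes chosen so far. Ties are broken
    alphabetically so that the result is deterministic.
    """
    candidates = set().union(*users)
    chosen = set()
    for _ in range(k):
        best = max(
            sorted(candidates - chosen),
            key=lambda a: count_unique(users, chosen | {a}),
            default=None,
        )
        if best is None:  # no attributes left to add
            break
        chosen.add(best)
    return chosen


# Toy data (assumed): each user is the set of applications they have installed.
users = [
    {"mail", "maps"},
    {"mail", "chat"},
    {"mail", "maps", "chat"},
    {"game"},
]

fingerprint = greedy_fingerprint(users, k=2)
print(sorted(fingerprint), count_unique(users, fingerprint))
print(count_unique(users, {"chat", "maps"}))
```

On this toy data the greedy choice is `chat` and `game`, which makes only two of the four users unique, whereas querying `chat` and `maps` would distinguish all four. The sketch therefore illustrates the constrained selection problem, not an optimal solution to it.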