Websites extensively track users via identifiers that uniquely map to client machines or user accounts. Although such tracking has desirable properties like enabling personalization and website analytics, it also raises serious concerns about online user privacy, and can potentially enable illicit surveillance by adversaries who broadly monitor network traffic.
In this work we seek to understand how latent identifiers can appear in user traffic in forms beyond those already well known and studied, such as browser and Flash cookies. We develop a methodology for processing large network traces to semi-automatically discover identifiers sent by clients that distinguish users, devices, or browsers, such as usernames, cookies, custom user agents, and IMEI numbers. We address the challenges of scaling such discovery up to enterprise-sized data by devising multistage filtering and streaming algorithms. The resulting methodology reflects trade-offs between reducing the ultimate analysis burden and the risk of missing potential identifier strings. We analyze 15 days of data from a site with several hundred users and capture dozens of latent identifiers, primarily in HTTP request components, but also in non-HTTP protocols.
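To make the core idea concrete, the following is a minimal sketch of one possible filtering criterion, not the pipeline used in this work. It assumes that traffic has already been reduced to (client, day, token) records, and it treats a token as a candidate identifier if it is sent by exactly one client and recurs on several distinct days. The function name `candidate_identifiers`, the `min_days` threshold, and the sample records are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def candidate_identifiers(records, min_days=2):
    """Return {token: client} for tokens that look like per-client identifiers.

    records: iterable of (client, day, token) tuples (an assumed input format).
    A token qualifies if it was seen from exactly one client and on at least
    `min_days` distinct days (a hypothetical persistence threshold).
    """
    clients_by_token = defaultdict(set)
    days_by_token = defaultdict(set)

    # Single pass over the records: note who sent each token and on which days.
    for client, day, token in records:
        clients_by_token[token].add(client)
        days_by_token[token].add(day)

    # Keep tokens that are unique to one client and persistent across days.
    return {
        token: next(iter(clients))
        for token, clients in clients_by_token.items()
        if len(clients) == 1 and len(days_by_token[token]) >= min_days
    }


# Illustrative, made-up records.
records = [
    ("10.0.0.1", 1, "uid=a1b2c3"),
    ("10.0.0.1", 2, "uid=a1b2c3"),
    ("10.0.0.2", 1, "uid=d4e5f6"),
    ("10.0.0.2", 3, "uid=d4e5f6"),
    ("10.0.0.1", 1, "lang=en"),    # shared by two clients: not distinguishing
    ("10.0.0.2", 2, "lang=en"),
    ("10.0.0.1", 1, "nonce=123"),  # seen on one day only: not persistent
]
result = candidate_identifiers(records)
```

A stricter threshold reduces the number of strings that must be inspected manually but risks discarding identifiers that appear only rarely, which is the kind of trade-off described above.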