Website fingerprinting allows a local, passive observer monitoring a web-browsing client’s encrypted channel to determine her web activity. Previous attacks have shown that website fingerprinting could be a threat to anonymity networks such as Tor under laboratory conditions. However, there are significant differences between laboratory conditions and realistic conditions. First, in laboratory tests we collect the training data set together with the testing data set, so the training data set is fresh, but an attacker may not be able to maintain a fresh data set. Second, laboratory packet sequences correspond to a single page each, but for realistic packet sequences the split between pages is not obvious. Third, packet sequences may include background noise from other types of web traffic. These differences adversely affect website fingerprinting under realistic conditions. In this paper, we tackle these three problems to bridge the gap between laboratory and realistic conditions for website fingerprinting. We show that we can maintain a fresh training set with minimal resources. We demonstrate several classification-based techniques that allow us to split full packet sequences effectively into sequences corresponding to a single page each. We describe several new algorithms for tackling background noise. With our techniques, we are able to build the first website fingerprinting system that can operate directly on packet sequences collected in the wild.