To partly address people’s concerns over web tracking, Google has created the Ad Settings webpage, which provides information about, and some choice over, the profiles Google creates on users. We present AdFisher, an automated tool that explores how user behaviors, Google’s ads, and Ad Settings interact. AdFisher can run browser-based experiments and analyze data using machine learning and significance tests. Our tool uses a rigorous experimental design and statistical analysis to ensure the statistical soundness of our results. Using AdFisher, we find that Ad Settings was opaque about some features of a user’s profile, that it does provide some choice over ads, and that these choices can lead to seemingly discriminatory ads. In particular, we found that visiting webpages associated with substance abuse changed the ads shown but not the settings page. We also found that setting the gender to female resulted in fewer instances of an ad related to high-paying jobs than setting it to male. We cannot determine who caused these findings due to our limited visibility into the ad ecosystem, which includes Google, advertisers, websites, and users. Nevertheless, these results can form the starting point for deeper investigations by either the companies themselves or by regulatory bodies.
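To illustrate the kind of significance test mentioned above, the following is a minimal sketch of a generic permutation test comparing two groups of browser agents. It is not AdFisher's implementation: the function name `permutation_test`, the choice of the difference in group means as the test statistic, and the ad counts are all assumptions made for illustration only.

```python
import random


def permutation_test(treated, control, n_permutations=10000, seed=0):
    """One-sided permutation test on the difference of group means.

    Returns the observed difference (treated mean minus control mean) and
    an estimated p-value: the fraction of random relabelings whose
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_treated = len(treated)
    n_control = len(control)
    observed = sum(treated) / n_treated - sum(control) / n_control

    pooled = list(treated) + list(control)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_treated]) / n_treated
                - sum(pooled[n_treated:]) / n_control)
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical data: number of times one ad was shown to each agent.
male_counts = [9, 7, 8, 10, 6, 9]
female_counts = [2, 1, 3, 0, 2, 1]

observed_diff, p = permutation_test(male_counts, female_counts)
print(f"observed difference in means: {observed_diff:.2f}, p-value: {p:.4f}")
```

A permutation test of this kind makes no distributional assumptions about the ad counts; it relies only on the group labels having been assigned at random, which is why it pairs naturally with a randomized experimental design.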