Automated Experiments on Ad Privacy Settings

Open access

Abstract

To partly address people’s concerns over web tracking, Google has created the Ad Settings webpage to provide information about and some choice over the profiles Google creates on users. We present AdFisher, an automated tool that explores how user behaviors, Google’s ads, and Ad Settings interact. AdFisher can run browser-based experiments and analyze data using machine learning and significance tests. Our tool uses a rigorous experimental design and statistical analysis to ensure the statistical soundness of our results. Using AdFisher, we find that Ad Settings was opaque about some features of a user’s profile, that it does provide some choice on ads, and that these choices can lead to seemingly discriminatory ads. In particular, we found that visiting webpages associated with substance abuse changed the ads shown but not the settings page. We also found that setting the gender to female resulted in fewer instances of an ad related to high-paying jobs than setting it to male. We cannot determine who caused these findings due to our limited visibility into the ad ecosystem, which includes Google, advertisers, websites, and users. Nevertheless, these results can form the starting point for deeper investigations by either the companies themselves or by regulatory bodies.
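
As an illustration of the kind of significance test the abstract refers to, the following is a minimal sketch of a one-sided permutation test on the difference in mean ad counts between two randomly assigned groups of browser agents. It is not AdFisher's code: the function name `permutation_test`, the choice of test statistic, and the ad counts are all hypothetical and chosen only to show the idea.

```python
import random


def permutation_test(treatment, control, n_permutations=10000, seed=0):
    """One-sided permutation test on the difference in means.

    Returns an estimated p-value for the null hypothesis that the
    group labels have no effect on the measurements.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_t = len(treatment)
    n_c = len(control)
    observed = sum(treatment) / n_t - sum(control) / n_c

    pooled = list(treatment) + list(control)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled data.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_t]) / n_t - sum(pooled[n_t:]) / n_c
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical data: number of times a given ad was shown to each agent.
male_counts = [9, 11, 8, 10, 12, 9, 10, 11]
female_counts = [2, 1, 3, 0, 2, 1, 2, 1]

p_value = permutation_test(male_counts, female_counts)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value here would indicate that a difference this large is unlikely if the group assignment had no effect. A permutation test of this kind makes no distributional assumptions about the measurements, which is one reason it suits randomized experiments.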

