NoMoATS: Towards Automatic Detection of Mobile Tracking

Anastasia Shuba 1 and Athina Markopoulou 2
  • 1
  • 2 University of California, Irvine

Abstract

Today’s mobile apps employ third-party advertising and tracking (A&T) libraries, which may pose a threat to privacy. State-of-the-art approaches detect and block outgoing A&T HTTP/S requests by using manually curated filter lists (e.g., EasyList) and, more recently, machine learning classifiers. The major bottleneck of both filter lists and classifiers is that they rely on experts and the community to inspect traffic and manually create filter list rules, which are then used to block traffic or to label ground truth datasets. We propose NoMoATS, a system that removes this bottleneck by reducing the daunting task of manually creating filter rules to the much easier and more scalable task of labeling A&T libraries. Our system leverages stack trace analysis to automatically label which network requests are generated by A&T libraries. Using NoMoATS, we collect and label a new mobile traffic dataset. We use this dataset to train decision tree classifiers, which can be applied in real time on the mobile device and achieve an average F-score of 93%. We show that both our automatic labeling and our classifiers discover thousands of requests destined to hundreds of different hosts that were previously undetected by popular filter lists. To the best of our knowledge, our system is the first to (1) automatically label which mobile network requests are engaged in A&T, while requiring only that libraries be manually labeled by purpose, and (2) apply on-device machine learning classifiers that operate at the granularity of URLs, can inspect connections across all apps, and detect not only ads but also tracking.
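The labeling idea in the abstract can be illustrated with a minimal Python sketch. It is not the NoMoATS implementation. It assumes that each captured network request comes with a stack trace given as a list of fully qualified `package.Class.method` strings, and that a manually labeled set of A&T library package prefixes is available. The names `AT_LIBRARY_PREFIXES`, `frame_package`, `matches_prefix`, and `label_request`, as well as the package names themselves, are hypothetical and chosen only for illustration.

```python
from typing import Iterable, List

# Hypothetical A&T library package prefixes. In practice these would come
# from a manually labeled list of libraries, not from a hard-coded set.
AT_LIBRARY_PREFIXES = {
    "com.exampleads.sdk",
    "com.exampletracker",
}


def frame_package(frame: str) -> str:
    """Return the package part of a frame such as 'com.foo.Bar.method'."""
    parts = frame.split(".")
    # Drop the trailing class name and method name when both are present.
    return ".".join(parts[:-2]) if len(parts) > 2 else frame


def matches_prefix(package: str, prefix: str) -> bool:
    """True if package equals prefix or is a sub-package of it."""
    return package == prefix or package.startswith(prefix + ".")


def label_request(stack_trace: List[str],
                  at_prefixes: Iterable[str] = AT_LIBRARY_PREFIXES) -> bool:
    """Label a request as A&T if any frame belongs to a labeled library."""
    prefixes = list(at_prefixes)
    for frame in stack_trace:
        package = frame_package(frame)
        if any(matches_prefix(package, p) for p in prefixes):
            return True
    return False
```

Under these assumptions, a request whose trace contains a frame such as `com.exampleads.sdk.net.HttpClient.send` is labeled as A&T, while a trace made up only of app and platform frames is not. Matching on package boundaries rather than raw string prefixes avoids labeling an unrelated package such as `com.exampleadsx` by accident.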


