What Does The Crowd Say About You? Evaluating Aggregation-based Location Privacy

Open access

Abstract

Information about people’s movements and the locations they visit enables an increasing number of mobility analytics applications, e.g., in the context of urban and transportation planning. In this setting, rather than collecting or sharing raw data, entities often use aggregation as a privacy protection mechanism, aiming to hide individual users’ location traces. Furthermore, to bound information leakage from the aggregates, they can perturb either the input or the output of the aggregation so that the released data is differentially private.

In this paper, we set out to evaluate the impact of releasing aggregate location time-series on the privacy of the individuals contributing to the aggregation. We introduce a framework that allows us to reason about privacy against an adversary attempting to predict users’ locations or recover their mobility patterns. We formalize these attacks as inference problems, and discuss a few strategies to model the adversary’s prior knowledge based on the information she may have access to. We then use the framework to quantify the privacy loss stemming from aggregate location data, with and without the protection of differential privacy, using two real-world mobility datasets. We find that aggregates do leak information about individuals’ punctual locations and mobility profiles. The density of the observations, as well as their timing, plays an important role: for example, regular patterns during peak hours are better protected than sporadic movements. Finally, our evaluation shows that both output and input perturbation offer little additional protection, unless they introduce large amounts of noise, ultimately destroying the utility of the data.
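
As a rough illustration of the setting described above, the sketch below builds an aggregate location time-series from toy traces and applies output perturbation with Laplace noise. It is a minimal example under stated assumptions, not the paper's implementation: the names `aggregate`, `laplace_noise`, and `perturb_output`, the toy data, and the choice of sensitivity (the largest number of cells a single user can contribute to) are all hypothetical. Input perturbation, where each user randomizes her own report before aggregation, is not shown.

```python
import random
from collections import Counter


def aggregate(traces):
    """Count how many distinct users appear in each (location, time_slot) cell."""
    counts = Counter()
    for user_trace in traces.values():
        # Each user contributes at most 1 to any given cell.
        for cell in set(user_trace):
            counts[cell] += 1
    return counts


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_output(counts, cells, epsilon, sensitivity, rng):
    """Add Laplace noise to every cell, including empty ones (output perturbation).

    Assumption: `sensitivity` bounds the total change in the counts caused by
    adding or removing one user (here, the maximum number of cells per user).
    """
    scale = sensitivity / epsilon
    return {cell: counts.get(cell, 0) + laplace_noise(scale, rng) for cell in cells}


# Toy data (hypothetical): user -> list of (location, time_slot) visits.
traces = {
    "alice": [("A", 0), ("B", 1)],
    "bob": [("A", 0), ("A", 1)],
    "carol": [("B", 1)],
}
cells = [(loc, t) for loc in ("A", "B") for t in (0, 1)]

counts = aggregate(traces)
sensitivity = max(len(set(trace)) for trace in traces.values())
rng = random.Random(0)  # fixed seed only to make the sketch reproducible
noisy = perturb_output(counts, cells, epsilon=1.0, sensitivity=sensitivity, rng=rng)

for cell in cells:
    print(cell, counts.get(cell, 0), round(noisy[cell], 2))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger protection at the cost of less accurate counts, which is the privacy versus utility trade-off the abstract refers to.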

