Expectation-Maximization Tensor Factorization for Practical Location Privacy Attacks

Open access

Abstract

Location privacy attacks based on a Markov chain model have been widely studied to de-anonymize or de-obfuscate mobility traces. An adversary can perform various kinds of location privacy attacks using a personalized transition matrix, which is trained for each target user. However, the amount of training data available to the adversary can be very small, since many users do not disclose much location information in their daily lives. In addition, many locations can be missing from the training traces, since many users do not disclose their locations continuously but rather sporadically. In this paper, we show that the Markov chain model can be a threat even in this realistic situation. Specifically, we focus on a training phase (i.e. mobility profile building phase) and propose Expectation-Maximization Tensor Factorization (EMTF), which alternates between computing a distribution of missing locations (E-step) and computing personalized transition matrices via tensor factorization (M-step). Since the time complexity of EMTF is exponential in the number of missing locations, we propose two approximate learning methods, one of which uses the Viterbi algorithm while the other uses the Forward Filtering Backward Sampling (FFBS) algorithm. We apply our learning methods to a de-anonymization attack and a localization attack, and evaluate them using three real datasets. The results show that our learning methods significantly outperform a random guess, even when there is only one training trace composed of 10 locations per user, and each location is missing with probability 80% (i.e. even when users hardly disclose two temporally-continuous locations).

If the inline PDF is not rendering correctly, you can download the PDF file here.

  • [1] Acar E Dunlavy DM Kolda TG Morup M (2011) Scalable tensor factorizations for incomplete data. Chemometrics and Intelligent Laboratory Systems 106(1):41–56

  • [2] Albright R Cox J Duling D Langville AN Meyer CD (2014) Algorithms initializations and convergence for the nonnegative matrix factorization. SAS Technical Report pp 1–18

  • [3] Anandkumar A Ge R Hsu D Kakade SM Telgarsky M (2014) Tensor decompositions for learning latent variable models. Journal of Machine Learning Research 15(1):2773–2832

  • [4] Andrés ME Bordenabe NE Chatzikokolakis K Palamidessi C (2013) Geo-indistinguishability: Differential privacy for location-based systems. In: Proceedings of the 20th ACM Conference on Computer and Communications Security (CCS’13) pp 901–914

  • [5] Bishop C (2006) Pattern Recognition and Machine Learning. Springer

  • [6] Cheng Z Caverlee J Lee K Sui DZ (2011) Exploring millions of footprints in location sharing services. Proceedings of the 5th International AAAI Conference on Weblogs and Social Media (ICWSM’11) pp 81–88

  • [7] Cho E Myers SA Leskovec J (2011) Friendship and mobility: User movement in location-based social networks. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’11) pp 1082–1090

  • [8] Cichocki A Zdunek R Phan AH Amari S (2009) Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley

  • [9] Eagle N Pentland A Lazer D (2009) Inferring friendship network structure by using mobile phone data. Proceedings of the National Academy of Sciences (PNAS) 106(36):15274–15278

  • [10] Freudiger J Shokri R Hubaux JP (2011) Evaluating the privacy risk of location-based services. In: Proceedings of the 15th international conference on Financial Cryptography and Data Security (FC’11) pp 31–46

  • [11] Gambs S Killijian MO Núñez del Prado Cortez M (2014) De-anonymization attack on geolocated data. Journal of Computer and System Sciences 80(8):1597–1614

  • [12] Gedik B Liu L (2008) Protecting location privacy with personalized k-anonymity: Architecture and algorithms. IEEE Transactions on Mobile Computing 7(1):1–18

  • [13] Ghinita G (2013) Privacy for Location-based Services. Morgan & Claypool Publishers

  • [14] Golle P Partridge K (2009) On the anonymity of home/work location pairs. In: Proceedings of the 7th International Conference on Pervasive Computing (Pervasive’09) pp 390–397

  • [15] Hoh B Gruteser M Xiong H Alrabady A (2010) Achieving guaranteed anonymity in GPS traces via uncertainty-aware path cloaking. IEEE Transactions on Mobile Computing 9(8):1089–1107

  • [16] Hull B Bychkovsky V Zhang Y Chen K Goraczko M (2006) CarTel: A distributed mobile sensor computing system. In: Proceedings of the 4th International Conference on Embedded Networked Sensor Systems (SenSys’06) pp 125–138

  • [17] Ji S Li W Srivatsa M Beyah R (2014) Structural data de-anonymization: Quantification practice and implications. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS’14) pp 1040–1053

  • [18] Koren Y Bell R Volinsky C (2009) Matrix factorization techniques for recommender systems. IEEE Computer 42(8):30–37

  • [19] Krumm J (2009) A survey of computational location privacy. Personal and Ubiquitous Computing 13(6):391–399

  • [20] Kuleshov V Chaganty AT Liang P (2015) Tensor factorization via matrix factorization. In: Proceedings of the 18th International Conference on Artificial Intelligence and Statistics (AISTATS’15) pp 507–516

  • [21] Matsuo Y Okazaki N Izumi K Nakamura Y Nishimura T Hasida K (2007) Inferring long-term user properties based on users’ location history. In: Proceedings of the 20th International Joint Conference on Artifical Intelligence (IJCAI’ 07) pp 2159–2165

  • [22] Minami K Borisov N (2010) Protecting location privacy against inference attacks. In: Proceedings of the 9th Annual ACM Workshop on Privacy in the Electronic Society (WPES’10) pp 123–126

  • [23] de Montjoye YA Hidalgo AC Verleysen M Blondel VD (2013) Unique in the Crowd: The Privacy Bounds of Human Mobility. Scientific Reports 3(1376):1–5

  • [24] Mulder YD Danezis G Batina L Preneel B (2008) Identification via location-profiling in GSM networks. In: Proceedings of the 7th ACM Workshop on Privacy in the Electronic Society (WPES’08) pp 23–32

  • [25] Murakami T Watanabe H (2016) Localization attacks using matrix and tensor factorization. IEEE Transactions on Information Forensics and Security 11(8):1647–1660

  • [26] Murakami T Kanemura A Hino H (2017) Group sparsity tensor factorization for Re-identification of Open Mobility Traces. IEEE Transactions on Information Forensics and Security 12(3):689–704

  • [27] Murakami T Kanemura A Hino H (2015) Group Sparsity Tensor Factorization for De-anonymization of Mobility Traces. In: Proceedings of the 14th IEEE International Conference on Trust Security and Privacy in Computing and Communications (TrustCom’15) pp 621–629

  • [28] Murphy KP (2012) Machine Learning: A Probabilistic Perspective The MIT Press chap 17 p 619

  • [29] Narain S Vo-Huu TD Block K and Noubir G (2016) Inferring User Routes and Locations using Zero-Permission Mobile Sensors. In: Proceedings of the 2016 IEEE Symposium on Security and Privacy (S&P’16) pp 397–413

  • [30] Please Rob Me - Raising Awareness about Over-sharing (2010) http://pleaserobme.com/

  • [31] Rabiner LR (1989) A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of IEEE 77(2):257–286

  • [32] Rendle S Shmidt-Thieme L (2010) Pairwise interaction tensor factorization for personalized tag recommendation. In: Proceedings of the 3rd ACM International Conference on Web Search and Data Mining (WSDM’10) pp 81–90

  • [33] Rendle S Freudenthaler C Shmidt-Thieme L (2010) Factorizing personalized markov chains for next-basket recommendation. In: Proceedings of the 19th International Conference on World Wide Web (WWW’10) pp 811–820

  • [34] Shekhar S Evans MR Gunturi V Yang K (2012) Spatial big-data challenges intersecting mobility and cloud computing. In: Proceedings of the 11th ACM International Workshop on Data Engineering for Wireless and Mobile Access (MobiDE’12) pp 1–12

  • [35] Shokri R Theodorakopoulos G Boudec JYL Hubaux JP (2011) Quantifying location privacy. In: Proceedings of the 2011 IEEE Symposium on Security and Privacy (S&P’11) pp 247–262

  • [36] Shokri R Theodorakopoulos G Danezis G Hubaux JP Boudec JYL (2011) Quantifying location privacy: The case of sporadic location exposure. In: Proceedings of the 11th International Conference on Privacy Enhancing Technologies (PETS’11) pp 57–76

  • [37] Shokri R Theodorakopoulos G Troncoso C Hubaux JP Boudec JYL (2012) Protecting location privacy: Optimal strategy against localization attacks. Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS’12) pp 617–627

  • [38] Song L Kotz D Jain R He X (2006) Evaluating next-cell predictors with extensive wi-fi mobility data. IEEE Transactions on Mobile Computing 5(12):1633–1649

  • [39] Srivatsa M Hicks M (2012) Deanonymizing mobility traces: Using social networks as a side-channel. In: Proceedings of the 2012 ACM conference on Computer and communications security (CCS’12) pp 628–637

  • [40] Vaccari A (2014) Introducing a new optional feature called nearby friends. http://newsroom.fb.com/news/2014/04/introducing-a-new-optional-feature-called-nearby-friends/

  • [41] Voelcker J (2006) Stalked by satellite: An alarming rise in GPS-enabled harassment. IEEE Spectrum 47(7):15–16

  • [42] Wang N Yao T Wang J Yeung DY (2012) A probabilistic approach to robust matrix factorization. In: Proceedings of the 12th European Conference on Computer Vision (ECCV’12) vol 7578 pp 126–139

  • [43] Webroot Inc (2010) Webroot survey finds geolocation apps prevalent amongst mobile device users but 55% concerned about loss of privacy. http://www.webroot.com/ca/en/company/pressroom/releases/social-networks-mobile-security

  • [44] Xue AY Zhang R Zheng Y Xie X Huang J Xu Z (2013) Destination prediction by sub-trajectory synthesis and privacy protection against such prediction. In: Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE’13) pp 254–265

  • [45] Yang D Zhang D Zheng VW Yu Z (2015) Modeling user activity preference by leveraging user spatial temporal characteristics in lbsns. IEEE Transactions on Systems Man and Cybernetics: Systems 45(1):129–142

  • [46] Zhang S Wang W Ford J Makedon F (2006) Learning from incomplete ratings usig non-negative matrix factorization. In: Proceedings of the 6th SIAM International Conference on Data Mining (SDM’06) pp 548–552

  • [47] Zheng Y Zhang L Xie X Ma WY (2009) Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of the 18th International Conference on World Wide Web (WWW’09) pp 791–800

  • [48] Zheng Y Xie X Ma WY (2010) GeoLife: A collaborative social networking service among user location and trajectory. IEEE Data Engineering Bulletin 32(2):32–40

Search
Journal information
Cited By
Metrics
All Time Past Year Past 30 Days
Abstract Views 0 0 0
Full Text Views 369 177 3
PDF Downloads 206 93 3