Berker Ağır, Kévin Huguenin, Urs Hengartner and Jean-Pierre Hubaux
Mobile users increasingly make use of location-based online services enabled by localization systems. Not only do they share their locations to obtain contextual services in return (e.g., ‘nearest restaurant’), but they also share, with their friends, information about the venues (e.g., the type, such as a restaurant or a cinema) they visit. This introduces an additional dimension to the threat to location privacy: location semantics, combined with location information, can be used to improve location inference by learning and exploiting patterns at the semantic level (e.g., people go to cinemas after going to restaurants). Conversely, the type of the venue a user visits can be inferred, which also threatens her semantic location privacy. In this paper, we formalize this problem and analyze the effect of venue-type information on location privacy. We introduce inference models that consider location semantics and semantic privacy-protection mechanisms and evaluate them by using datasets of semantic check-ins from Foursquare, totaling more than a thousand users in six large cities. Our experimental results show that there is a significant risk for users’ semantic location privacy and that semantic information improves inference of user locations.
Alexandra-Mihaela Olteanu, Mathias Humbert, Kévin Huguenin and Jean-Pierre Hubaux
Most popular location-based social networks, such as Facebook and Foursquare, let their (mobile) users post location and co-location (involving other users) information. Such posts bring social benefits to the users who post them but also to their friends who view them. Yet, they also represent a severe threat to the users’ privacy, as co-location information introduces interdependences between users. We propose the first game-theoretic framework for analyzing the strategic behaviors, in terms of information sharing, of users of OSNs. To design parametric utility functions that are representative of the users’ actual preferences, we also conduct a survey of 250 Facebook users and use conjoint analysis to quantify the users’ benefits o f sharing vs. viewing (co)-location information and their preference for privacy vs. benefits. Our survey findings expose the fact that, among the users, there is a large variation, in terms of these preferences. We extensively evaluate our framework through data-driven numerical simulations. We study how users’ individual preferences influence each other’s decisions, we identify several factors that significantly affect these decisions (among which, the mobility data of the users), and we determine situations where dangerous patterns can emerge (e.g., a vicious circle of sharing, or an incentive to over-share) – even when the users share similar preferences.
Mathias Humbert, Kévin Huguenin, Joachim Hugonot, Erman Ayday and Jean-Pierre Hubaux
People increasingly have their genomes sequenced and some of them share their genomic data online. They do so for various purposes, including to find relatives and to help advance genomic research. An individual’s genome carries very sensitive, private information such as its owner’s susceptibility to diseases, which could be used for discrimination. Therefore, genomic databases are often anonymized. However, an individual’s genotype is also linked to visible phenotypic traits, such as eye or hair color, which can be used to re-identify users in anonymized public genomic databases, thus raising severe privacy issues. For instance, an adversary can identify a target’s genome using known her phenotypic traits and subsequently infer her susceptibility to Alzheimer’s disease. In this paper, we quantify, based on various phenotypic traits, the extent of this threat in several scenarios by implementing de-anonymization attacks on a genomic database of OpenSNP users sequenced by 23andMe. Our experimental results show that the proportion of correct matches reaches 23% with a supervised approach in a database of 50 participants. Our approach outperforms the baseline by a factor of four, in terms of the proportion of correct matches, in most scenarios. We also evaluate the adversary’s ability to predict individuals’ predisposition to Alzheimer’s disease, and we observe that the inference error can be halved compared to the baseline. We also analyze the effect of the number of known phenotypic traits on the success rate of the attack. As progress is made in genomic research, especially for genotype-phenotype associations, the threat presented in this paper will become more serious.
Kirill Nikitin, Ludovic Barman, Wouter Lueks, Matthew Underwood, Jean-Pierre Hubaux and Bryan Ford
Most encrypted data formats leak metadata via their plaintext headers, such as format version, encryption schemes used, number of recipients who can decrypt the data, and even the recipients’ identities. This leakage can pose security and privacy risks to users, e.g., by revealing the full membership of a group of collaborators from a single encrypted e-mail, or by enabling an eavesdropper to fingerprint the precise encryption software version and configuration the sender used.
We propose that future encrypted data formats improve security and privacy hygiene by producing Padded Uniform Random Blobs or PURBs: ciphertexts indistinguishable from random bit strings to anyone without a decryption key. A PURB’s content leaks nothing at all, even the application that created it, and is padded such that even its length leaks as little as possible.
Encoding and decoding ciphertexts with no cleartext markers presents efficiency challenges, however. We present cryptographically agile encodings enabling legitimate recipients to decrypt a PURB efficiently, even when encrypted for any number of recipients’ public keys and/or passwords, and when these public keys are from different cryptographic suites. PURBs employ Padmé, a novel padding scheme that limits information leakage via ciphertexts of maximum length M to a practical optimum of O(log log M) bits, comparable to padding to a power of two, but with lower overhead of at most 12% and decreasing with larger payloads.
David Froelicher, Patricia Egger, João Sá Sousa, Jean Louis Raisaro, Zhicong Huang, Christian Mouchet, Bryan Ford and Jean-Pierre Hubaux
Current solutions for privacy-preserving data sharing among multiple parties either depend on a centralized authority that must be trusted and provides only weakest-link security (e.g., the entity that manages private/secret cryptographic keys), or leverage on decentralized but impractical approaches (e.g., secure multi-party computation). When the data to be shared are of a sensitive nature and the number of data providers is high, these solutions are not appropriate. Therefore, we present UnLynx, a new decentralized system for efficient privacy-preserving data sharing. We consider m servers that constitute a collective authority whose goal is to verifiably compute on data sent from n data providers. UnLynx guarantees the confidentiality, unlinkability between data providers and their data, privacy of the end result and the correctness of computations by the servers. Furthermore, to support differentially private queries, UnLynx can collectively add noise under encryption. All of this is achieved through a combination of a set of new distributed and secure protocols that are based on homomorphic cryptography, verifiable shuffling and zero-knowledge proofs. UnLynx is highly parallelizable and modular by design as it enables multiple security/privacy vs. runtime tradeoffs. Our evaluation shows that UnLynx can execute a secure survey on 400,000 personal data records containing 5 encrypted attributes, distributed over 20 independent databases, for a total of 2,000,000 ciphertexts, in 24 minutes.
Anh Pham, Italo Dacosta, Bastien Jacot-Guillarmod, Kévin Huguenin, Taha Hajar, Florian Tramèr, Virgil Gligor and Jean-Pierre Hubaux
In the past few years, we have witnessed a rise in the popularity of ride-hailing services (RHSs), an online marketplace that enables accredited drivers to use their own cars to drive ride-hailing users. Unlike other transportation services, RHSs raise significant privacy concerns, as providers are able to track the precise mobility patterns of millions of riders worldwide. We present the first survey and analysis of the privacy threats in RHSs. Our analysis exposes high-risk privacy threats that do not occur in conventional taxi services. Therefore, we propose PrivateRide, a privacy-enhancing and practical solution that offers anonymity and location privacy for riders, and protects drivers’ information from harvesting attacks. PrivateRide lowers the high-risk privacy threats in RHSs to a level that is at least as low as that of many taxi services. Using real data-sets from Uber and taxi rides, we show that PrivateRide significantly enhances riders’ privacy, while preserving tangible accuracy in ride matching and fare calculation, with only negligible effects on convenience. Moreover, by using our Android implementation for experimental evaluations, we show that PrivateRide’s overhead during ride setup is negligible. In short, we enable privacy-conscious riders to achieve levels of privacy that are not possible in current RHSs and even in some conventional taxi services, thereby offering a potential business differentiator.