Abstract

Android parental control applications are used by parents to monitor and limit their children’s mobile behaviour (e.g., mobile app usage, web browsing, calling, and texting). In order to offer this service, parental control apps require privileged access to system resources and access to sensitive data. This may significantly reduce the dangers associated with children’s online activities, but it raises important privacy concerns. These concerns have so far been overlooked by the organizations that provide the public with recommendations regarding the use of parental control applications.

We conduct the first in-depth study of the ecosystem of Android parental control apps from a privacy and regulatory point of view. We exhaustively study 46 apps from 43 developers with a combined 20M installs in the Google Play Store. Using a combination of static and dynamic analysis, we find that: these apps are on average more permissions-hungry than the top 150 apps in the Google Play Store, and tend to request more dangerous permissions with new releases; 11% of the apps transmit personal data in the clear; 34% of the apps gather and send personal information without appropriate consent; and 72% of the apps share data with third parties (including online advertising and analytics services) without mentioning their presence in their privacy policies. In summary, parental control applications lack transparency and compliance with regulatory requirements. This holds even for those applications recommended by European and other national security centers.
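
As a rough, hedged illustration of the kind of static permission analysis such a study involves (not the authors' actual pipeline): the snippet below uses androguard (classic 3.x import path; androguard 4.x moved the module) to intersect an APK's requested permissions with a hand-picked subset of Android's dangerous permissions. The APK filename and the permission subset are illustrative assumptions.

```python
# Minimal sketch, not the authors' pipeline: list the "dangerous"
# permissions requested in an APK's manifest via androguard (3.x-style
# import path). The APK filename is hypothetical.
from androguard.core.bytecodes.apk import APK

# A small, illustrative subset of Android's dangerous permissions.
DANGEROUS = {
    "android.permission.READ_SMS",
    "android.permission.RECORD_AUDIO",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.READ_CONTACTS",
    "android.permission.READ_CALL_LOG",
}

def dangerous_permissions(apk_path):
    """Return the dangerous permissions requested in the APK manifest."""
    apk = APK(apk_path)  # parses AndroidManifest.xml
    return set(apk.get_permissions()) & DANGEROUS

print(dangerous_permissions("parental_control.apk"))  # hypothetical APK
```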

Abstract

We describe and evaluate an attack that reconstructs the histogram of any target attribute of a sensitive dataset which can only be queried through a specific class of real-world privacy-preserving algorithms that we call bounded perturbation algorithms. A defining property of such an algorithm is that it perturbs answers to queries by adding zero-mean noise distributed within a bounded (possibly undisclosed) range. Other key properties of the algorithm include only allowing restricted queries (enforced via an online interface), suppressing answers to queries which are satisfied by only a small group of individuals (e.g., by returning a zero as an answer), and adding the same perturbation to two queries which are satisfied by the same set of individuals (to thwart differencing or averaging attacks). A real-world example of such an algorithm is the one deployed by the Australian Bureau of Statistics (ABS) in its online tool TableBuilder, which allows users to create tables, graphs and maps of Australian census data [30]. We assume an attacker (say, a curious analyst) who is given oracle access to the algorithm via an interface. We describe two attacks on the algorithm. Both attacks are based on carefully constructing (different) queries that evaluate to the same answer. The first attack finds the hidden perturbation parameter r (if it is assumed not to be public knowledge). The second attack removes the noise to obtain the original answer of some (counting) query of choice. We also show how to use this attack to find the number of individuals in the dataset with a target attribute value a of any attribute A, and then for all attribute values a_i ∈ A. None of the attacks presented here depend on any background information. Our attacks are a practical illustration of the (informal) fundamental law of information recovery, which states that “overly accurate estimates of too many statistics completely destroys privacy” [9, 15].
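
To make the averaging idea concrete, here is a toy simulation (our own construction under assumed parameters, not the paper's exact attack): an oracle adds zero-mean noise bounded by r to counting queries and reuses the same perturbation whenever two queries have the same satisfying set. By phrasing many queries whose true answers coincide but whose satisfying sets differ, an attacker obtains independent noise draws and averages them away; small-group suppression is omitted for brevity.

```python
# Toy simulation of why bounded zero-mean noise fails against averaging.
# Assumed parameters throughout; small-group suppression omitted.
import random

random.seed(0)
R = 2  # hidden perturbation bound (assumed not to be public)

class Oracle:
    """Answers counting queries with cached, bounded, zero-mean noise."""
    def __init__(self, records):
        self.records = records
        self.cache = {}  # same satisfying set -> same perturbation

    def count(self, predicate):
        sat = frozenset(i for i, rec in enumerate(self.records)
                        if predicate(rec))
        if sat not in self.cache:
            self.cache[sat] = random.randint(-R, R)
        return len(sat) + self.cache[sat]

# Records are (target_attribute, postcode) pairs; we want the true count
# of records with target_attribute == 1.
records = [(random.randint(0, 1), random.randint(0, 9)) for _ in range(1000)]
oracle = Oracle(records)
true_count = sum(1 for a, _ in records if a == 1)

# Each split threshold t yields two sub-queries with distinct satisfying
# sets (hence independent noise) whose true answers sum to the target.
estimates = []
for t in range(1, 10):
    lo = oracle.count(lambda rec, t=t: rec[0] == 1 and rec[1] < t)
    hi = oracle.count(lambda rec, t=t: rec[0] == 1 and rec[1] >= t)
    estimates.append(lo + hi)

print(true_count, round(sum(estimates) / len(estimates)))  # nearly equal
```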

Abstract

Web measurement is a powerful approach to studying various tracking practices that may compromise the privacy of millions of users. Researchers have built several measurement frameworks and performed a few studies to measure web tracking in the desktop environment. However, little is known about web tracking in the mobile environment, and no tool is readily available for performing a comparative measurement study across mobile and desktop environments. In this work, we built a framework called WTPatrol that allows us and other researchers to perform web tracking measurement in both mobile and desktop environments. Using WTPatrol, we performed the first comparative measurement study of web tracking on 23,310 websites that have both mobile and desktop versions of their web pages. We conducted an in-depth comparison of the web tracking practices of those websites between mobile and desktop environments from two perspectives: web tracking based on JavaScript APIs and web tracking based on HTTP cookies. Overall, we found that mobile web tracking has unique characteristics, especially due to mobile-specific trackers, and has become almost as prevalent as desktop web tracking. The potential impact of mobile web tracking is more severe, however, because a user may use a mobile device frequently in different places and be continuously tracked. We further give suggestions to help web users, developers, and researchers defend against web tracking.
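
As a minimal, hedged sketch of the cookie-based perspective (not WTPatrol itself): one common heuristic flags a cookie as a third-party tracking signal when its registrable domain (eTLD+1) differs from the visited page's. The tldextract dependency, hostnames, and cookie names below are illustrative.

```python
# Minimal sketch, not WTPatrol: flag cookies whose registrable domain
# (eTLD+1) differs from the visited page's, a common third-party
# tracking heuristic. Hosts and cookie names are toy examples.
import tldextract

def etld1(host):
    ext = tldextract.extract(host)
    return f"{ext.domain}.{ext.suffix}"

def third_party_cookies(page_host, cookies):
    """cookies: (cookie_domain, cookie_name) pairs seen on the page."""
    site = etld1(page_host)
    return [(d, n) for d, n in cookies if etld1(d) != site]

observed = [("www.example.com", "session"), ("tracker.adnet.com", "uid")]
print(third_party_cookies("m.example.com", observed))
# -> [('tracker.adnet.com', 'uid')]
```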

Abstract

Differential privacy (DP) provides formal guarantees that the output of a database query does not reveal too much information about any individual present in the database. While many differentially private algorithms have been proposed in the scientific literature, there are only a few end-to-end implementations of differentially private query engines. Crucially, existing systems assume that each individual is associated with at most one database record, which is unrealistic in practice. We propose a generic and scalable method to perform differentially private aggregations on databases, even when individuals can each be associated with arbitrarily many rows. We express this method as an operator in relational algebra, and implement it in an SQL engine. To validate this system, we test the utility of typical queries on industry benchmarks, and verify its correctness with a stochastic test framework we developed. We highlight the promises and pitfalls we encountered when deploying such a system in practice, and we publish its core components as open-source software.
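
A minimal sketch of the core idea, with assumed parameter names (not the paper's engine): when one person may own many rows, first bound each user's contribution, then scale the noise to that bound.

```python
# Minimal sketch, assuming illustrative parameter names: differentially
# private SUM when a user may contribute many rows. We keep at most
# max_rows_per_user rows per user and clip values, so one user's data
# changes the sum by at most S = max_rows_per_user * value_bound, and
# Laplace(S / epsilon) noise suffices for epsilon-DP.
import random
from collections import defaultdict

def dp_sum(rows, epsilon, max_rows_per_user, value_bound):
    """rows: iterable of (user_id, value) pairs."""
    kept = defaultdict(list)
    for uid, v in rows:
        if len(kept[uid]) < max_rows_per_user:        # contribution bounding
            kept[uid].append(max(-value_bound, min(value_bound, v)))
    total = sum(v for vs in kept.values() for v in vs)
    scale = max_rows_per_user * value_bound / epsilon
    # Laplace(scale) sampled as a difference of two exponentials.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return total + noise

rows = [("alice", 3.0), ("alice", 9.0), ("alice", 2.0), ("bob", 4.0)]
print(dp_sum(rows, epsilon=1.0, max_rows_per_user=2, value_bound=5.0))
```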

Abstract

Small TCP flows make up the majority of web flows. For them, the TCP three-way handshake induces significant delay overhead. The TCP Fast Open (TFO) protocol can significantly decrease this delay via zero round-trip time (0-RTT) handshakes for all TCP connections that follow a full initial handshake to the same host. However, this comes at the cost of privacy limitations, as well as some performance limitations. In this paper, we investigate the TFO deployment on popular websites and browsers. We find that a client revisiting a website for the first time fails to use an abbreviated TFO handshake in 40% of all cases because web servers load-balance across multiple IP addresses. Our analysis further reveals significant privacy problems in the protocol's design and implementation. Network-based attackers and online trackers can exploit TFO to track the online activities of users. As a countermeasure, we introduce a novel protocol called TCP Fast Open Privacy (FOP). TCP FOP prevents tracking by network attackers and impedes third-party tracking, while still allowing 0-RTT handshakes as in TFO. As a proof of concept, we have implemented the proposed protocol for the Linux kernel and a TLS library. Our measurements indicate that TCP FOP outperforms TLS over TFO when websites are served from multiple IP addresses.
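
For readers unfamiliar with the mechanism under analysis, the sketch below shows plain TFO usage on Linux in Python (real socket constants; host and port are illustrative). The kernel caches a server-issued cookie after the first full handshake; later connections to the same IP carry data in the SYN, which is exactly the state that makes clients linkable and that stops working when servers rotate IP addresses.

```python
# Sketch of plain TFO on Linux (socket constants are Linux-specific;
# host/port are illustrative). Run the server, then the client twice:
# the second client connect can be 0-RTT thanks to the cached cookie.
import socket

HOST, PORT = "127.0.0.1", 8080

def tfo_server():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Enable TFO; the value bounds the queue of pending TFO connections.
    srv.setsockopt(socket.IPPROTO_TCP, socket.TCP_FASTOPEN, 16)
    srv.bind((HOST, PORT))
    srv.listen()
    conn, _ = srv.accept()
    print(conn.recv(1024))  # on 0-RTT connects this data rode in the SYN

def tfo_client():
    cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # sendto() with MSG_FASTOPEN connects and sends in one call. The first
    # call falls back to a full handshake to fetch the cookie; repeat
    # calls to the SAME IP are 0-RTT, which is why load balancing across
    # multiple addresses defeats the abbreviated handshake, as measured
    # above.
    cli.sendto(b"GET / HTTP/1.1\r\n\r\n", socket.MSG_FASTOPEN, (HOST, PORT))
    cli.close()
```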

Abstract

Today’s environment of data-driven business models relies heavily on collecting as much personal data as possible. Besides being protected by governmental regulation, internet users can also try to protect their privacy on an individual basis. One of the best-known ways to accomplish this is to use privacy-enhancing technologies (PETs). For such services, the number of users is particularly important, because it determines the size of the anonymity set: the more users use the service, the more difficult it is to trace an individual user. There is a great deal of research determining the technical properties of PETs like Tor or JonDonym, but the use behavior of their users is rarely considered, although it is a decisive factor for the acceptance of a PET and therefore an important driver for increasing the user base.

We undertake a first step towards understanding the use behavior of PETs, employing a mixed-methods approach. We conducted an online survey with 265 users of the anonymity services Tor and JonDonym (124 users of Tor and 141 users of JonDonym). We use the technology acceptance model as a theoretical starting point and extend it with the constructs perceived anonymity and trust in the service in order to account for the specific nature of PETs. Our model explains almost half of the variance of the behavioral intention to use the two PETs. The results indicate that both newly added variables are highly relevant factors in the path model. We augment these insights with a qualitative analysis of answers to open questions about the users’ concerns, the circumstances under which they would pay money and choose a paid premium tariff (only for JonDonym), the features they would like to have, and why they would or would not recommend Tor/JonDonym. Thereby, we provide additional insights into users’ attitudes and perceptions of the services and propose new use factors, not covered by our model, for future research.

Abstract

Privacy-preserving machine learning (PPML) via Secure Multi-party Computation (MPC) has gained momentum in the recent past. Assuming a minimal network of pairwise private channels, we propose FLASH, an efficient four-party PPML framework over rings ℤ_{2^ℓ}, the first of its kind in the PPML regime to achieve the strongest security notion of Guaranteed Output Delivery (all parties obtain the output irrespective of the adversary’s behaviour). State-of-the-art ML frameworks such as ABY3 by Mohassel et al. (ACM CCS’18) and SecureNN by Wagh et al. (PETS’19) operate in the setting of 3 parties with one malicious corruption but achieve only the weaker security guarantee of abort. We demonstrate PPML with real-time efficiency using the following custom-made tools, which overcome the limitations of the aforementioned state of the art: (a) a dot product protocol whose cost is independent of the vector size, unlike ABY3, SecureNN, and ASTRA by Chaudhari et al. (ACM CCSW’19), all of which depend linearly on the vector size; and (b) truncation and MSB extraction protocols that run in a constant number of rounds and avoid circuits such as the Parallel Prefix Adder (PPA) and Ripple Carry Adder (RCA), unlike ABY3, whose round complexity is of the order of the depth of these circuits. We then exhibit the application of our FLASH framework to the secure server-aided prediction of vital algorithms: Linear Regression, Logistic Regression, Deep Neural Networks, and Binarized Neural Networks. We substantiate our theoretical claims through improved benchmarks of the aforementioned algorithms compared with the current best framework, ABY3. All the protocols are implemented over a 64-bit ring in LAN and WAN settings. Our experiments demonstrate that, for the MNIST dataset, the improvement in throughput ranges from 24× to 1390× over LAN and WAN together.
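
To see why a dot product can cost communication independent of vector length, here is a simplified plaintext simulation of 2-out-of-3 replicated sharing (the ABY3-style semi-honest setting; FLASH's actual four-party, maliciously secure protocol differs): each party folds all of its per-element cross terms into a single ring element, so only one element per party would ever need to be reshared.

```python
# Simplified simulation (3-party replicated sharing, semi-honest; NOT
# FLASH's 4-party protocol): local shares of a dot product collapse to
# one ring element per party, independent of the vector length.
import random

MOD = 2 ** 64  # the ring Z_{2^64}

def share(x):
    """2-out-of-3 replicated shares: x = x1 + x2 + x3 (mod 2^64);
    party 0 holds (x1, x2), party 1 holds (x2, x3), party 2 holds (x3, x1)."""
    x1, x2 = random.randrange(MOD), random.randrange(MOD)
    x3 = (x - x1 - x2) % MOD
    return [(x1, x2), (x2, x3), (x3, x1)]

def local_dot(my_x, my_y):
    """One party's additive share of <x, y>: a single accumulated element."""
    acc = 0
    for (xi, xj), (yi, yj) in zip(my_x, my_y):
        acc = (acc + xi * yi + xi * yj + xj * yi) % MOD
    return acc

x, y = [3, 1, 4, 1, 5], [9, 2, 6, 5, 3]
sx = [share(v) for v in x]  # sx[k][p] = party p's share pair for x[k]
sy = [share(v) for v in y]
partials = [local_dot([s[p] for s in sx], [s[p] for s in sy])
            for p in range(3)]
print(sum(partials) % MOD)                # reconstructed dot product
print(sum(a * b for a, b in zip(x, y)))   # plaintext check: equal
```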

Abstract

The meaning of differential privacy (DP) is tightly bound to the notion of distance between databases, typically defined as the number of changed rows. Considering the semantics of the data, this metric may not be the most suitable one, particularly when it judges two databases to be farther apart than the data owner intended, which weakens the privacy guarantee between them. In this paper, we give a mechanism to specify continuous metrics that depend on the locations and amounts of changes in a much more nuanced manner. Our metrics turn the set of databases into a Banach space. In order to construct DP information release mechanisms based on our metrics, we introduce derivative sensitivity, an analogue of local sensitivity for continuous functions. We use this notion in an analysis that determines the amount of noise to be added to the result of a database query in order to obtain a given level of differential privacy, and demonstrate that derivative sensitivity allows us to employ powerful mechanisms from calculus to perform the analysis for a variety of queries. We have implemented the analyzer and evaluated its efficiency and precision.
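
A worked toy instance of the underlying principle (a classical Lipschitz special case, not the paper's full derivative-sensitivity machinery): if a query f is L-Lipschitz with respect to the chosen metric d, then Laplace noise of scale L/ε gives ε-DP with respect to d; derivative sensitivity sharpens L into a local bound on the derivative. The metric and query below are our own illustrative choices.

```python
# Toy instance: epsilon-DP w.r.t. the continuous metric
# d(x, x') = sum_i |x_i - x'_i| for the mean query, which is
# (1/n)-Lipschitz in d. Not the paper's full mechanism.
import random

def laplace(scale):
    # A difference of two exponentials is Laplace(0, scale).
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def private_mean(values, epsilon):
    """|mean(x) - mean(x')| <= d(x, x') / n, so Laplace(1 / (n * epsilon))
    noise yields epsilon-DP with respect to d."""
    n = len(values)
    return sum(values) / n + laplace(1 / (n * epsilon))

print(private_mean([0.2, 0.9, 0.4, 0.7], epsilon=0.5))
```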

Abstract

In order to disseminate information in a social network, it is important to first identify the influential spreaders in the network. Using them as the seed spreaders, the aim is to ensure that the information cascades throughout the network. The traditional approach to identifying influential nodes is to determine the top-r ranked nodes according to various ranking methods such as PageRank, k-Shell decomposition, ClusterRank, and VoteRank. In the current work, we study the problem of ranking the nodes when the underlying graph is held in a distributed fashion by a set of parties, each of whom considers its share of the data private. In particular, we design efficient secure multiparty computation (MPC) protocols for k-Shell decomposition, PageRank, and VoteRank. For improved efficiency, we employ the oblivious RAM construct in conjunction with efficient data-oblivious graph data structures. We are the first to propose a secure variant of the VoteRank algorithm. We prove that the proposed protocols are asymptotically more efficient and have lower runtime in practice than the previously best known MPC protocols for computing k-Shell decomposition and PageRank centrality scores.
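
For reference, here is a plaintext version of VoteRank (following Zhang et al.'s published algorithm; the toy graph and choice of r are illustrative): this is the computation a secure variant must evaluate obliviously, without revealing any party's edges.

```python
# Plaintext VoteRank reference (toy graph; NOT the secure protocol):
# repeatedly elect the node with the highest total neighbor voting
# ability, then weaken its voters by f = 1 / <average degree>.
def voterank(adj, r):
    """adj: {node: set of neighbors}. Returns the top-r elected spreaders."""
    n = len(adj)
    f = n / sum(len(nb) for nb in adj.values())   # 1 / average degree
    ability = {v: 1.0 for v in adj}
    elected = []
    for _ in range(r):
        # Each node's score is the total voting ability of its neighbors.
        scores = {v: sum(ability[u] for u in adj[v])
                  for v in adj if v not in elected}
        winner = max(scores, key=scores.get)
        elected.append(winner)
        ability[winner] = 0.0                     # the winner stops voting
        for u in adj[winner]:                     # its voters are weakened
            ability[u] = max(0.0, ability[u] - f)
    return elected

graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2, 4}, 4: {3}}
print(voterank(graph, r=2))
```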