In this paper, we present a rigorous methodology for quantifying the anonymity provided by Tor against a variety of structural attacks, i.e., adversaries that corrupt Tor nodes and thereby perform eavesdropping attacks to deanonymize Tor users. First, we provide an algorithmic approach for computing the anonymity impact of such structural attacks against Tor. The algorithm is parametric in the considered path selection algorithm and is, hence, capable of reasoning about variants of Tor and alternative path selection algorithms as well. Second, we present formalizations of various instantiations of structural attacks against Tor and show that the computed anonymity impact of each of these adversaries indeed constitutes a worst-case anonymity bound for the cryptographic realization of Tor. Third, we use our methodology to conduct a rigorous, large-scale evaluation of Tor’s anonymity which establishes worst-case anonymity bounds against various structural attacks for Tor and for alternative path selection algorithms such as DistribuTor, SelekTOR, and LASTor. This yields the first rigorous anonymity comparison between different path selection algorithms. As part of our analysis, we quantify the anonymity impact of a path selection transition phase, i.e., a phase in which a small number of users decide to run an alternative algorithm while the vast majority still use the original one. The source code of our implementation is publicly available.
The decreasing costs of molecular profiling have provided the biomedical research community with a plethora of new types of biomedical data, enabling a breakthrough towards more precise and personalized medicine. Naturally, the increasing availability of data also enables physicians to compare patients’ data and treatments easily and to find similar patients in order to propose the optimal therapy. Such similar patient queries (SPQs) are of utmost importance to medical practice and will be relied upon in future health information exchange systems. While privacy-preserving solutions have been studied previously, they are limited to genomic data and ignore the other newly available types of biomedical data.
In this paper, we propose new cryptographic techniques for finding similar patients in a privacy-preserving manner with various types of biomedical data, including genomic, epigenomic and transcriptomic data as well as their combination. We design protocols for two of the most common similarity metrics in biomedicine: the Euclidean distance and the Pearson correlation coefficient. Moreover, unlike previous approaches, we account for the fact that certain locations contribute differently to a given disease or phenotype: a query can be limited to the relevant locations, and these locations can be assigned different weights. Our protocols are specifically designed to be highly efficient in terms of communication and bandwidth, requiring only one or two rounds of communication and thus enabling scalable parallel queries. We rigorously prove our protocols to be secure based on cryptographic games and instantiate our technique with three of the most important types of biomedical data, namely DNA, microRNA expression, and DNA methylation. Our experimental results show that our protocols can compute a similarity query over a typical number of positions against a database of 1,000 patients in a few seconds. Finally, we propose and formalize strategies to mitigate the threat of malicious users or hospitals.
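To make the two similarity metrics concrete, the following is a minimal plaintext sketch of a query restricted to selected positions with per-position weights. It involves no cryptography and is not the protocol proposed here; it only illustrates the kind of function such a protocol would evaluate privately. The function names, the argument layout, and the particular weighted form of the Pearson coefficient are assumptions made for illustration.

```python
import math
from typing import Sequence


def weighted_euclidean(x: Sequence[float], y: Sequence[float],
                       positions: Sequence[int],
                       weights: Sequence[float]) -> float:
    """Weighted Euclidean distance over the selected positions only.

    Assumed form: sqrt(sum_i w_i * (x_i - y_i)^2), i ranging over `positions`.
    """
    return math.sqrt(sum(w * (x[i] - y[i]) ** 2
                         for i, w in zip(positions, weights)))


def weighted_pearson(x: Sequence[float], y: Sequence[float],
                     positions: Sequence[int],
                     weights: Sequence[float]) -> float:
    """Weighted Pearson correlation over the selected positions only.

    Assumed form: weighted covariance divided by the product of the
    weighted standard deviations (the common normalizer cancels out).
    """
    total = sum(weights)
    pairs = list(zip(positions, weights))
    mean_x = sum(w * x[i] for i, w in pairs) / total
    mean_y = sum(w * y[i] for i, w in pairs) / total
    cov = sum(w * (x[i] - mean_x) * (y[i] - mean_y) for i, w in pairs)
    var_x = sum(w * (x[i] - mean_x) ** 2 for i, w in pairs)
    var_y = sum(w * (y[i] - mean_y) ** 2 for i, w in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical profiles: only positions 0..2 are relevant to the query.
query = [1.0, 2.0, 3.0, 9.0]
patient = [2.0, 4.0, 6.0, 0.0]
relevant = [0, 1, 2]
uniform = [1.0, 1.0, 1.0]
distance = weighted_euclidean(query, patient, relevant, uniform)
correlation = weighted_pearson(query, patient, relevant, uniform)
```

In this toy example the last position is excluded from the query, so the two profiles are perfectly correlated on the relevant positions even though they differ strongly elsewhere.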