Studies of Internet censorship rely on an experimental technique called probing. From a client within each country under investigation, the experimenter attempts to access network resources that are suspected to be censored, and records what happens. The set of resources to be probed is a crucial, but often neglected, element of the experimental design.
We analyze the content and longevity of 758,191 webpages drawn from 22 different probe lists, of which 15 are alleged to be actual blacklists of censored webpages in particular countries, three were compiled using a priori criteria for selecting pages with an elevated chance of being censored, and four are controls. We find that the lists have very little overlap in terms of specific pages. Mechanically assigning a topic to each page, however, reveals common themes, and suggests that handcurated probe lists may be neglecting certain frequently censored topics. We also find that pages on controversial topics tend to have much shorter lifetimes than pages on uncontroversial topics. Hence, probe lists need to be continuously updated to be useful.
To carry out this analysis, we have developed automated infrastructure for collecting snapshots of webpages, weeding out irrelevant material (e.g. site “boilerplate” and parked domains), translating text, assigning topics, and detecting topic changes. The system scales to hundreds of thousands of pages collected.
Available online public/governmental services requiring authentication by citizens have considerably expanded in recent years. This has hindered the usability and security associated with credential management by users and service providers. To address the problem, some countries have proposed nation-scale identification/authentication systems that intend to greatly reduce the burden of credential management, while seemingly offering desirable privacy benefits. In this paper we analyze two such systems: the Federal Cloud Credential Exchange (FCCX) in the United States and GOV.UK Verify in the United Kingdom, which altogether aim at serving more than a hundred million citizens. Both systems propose a brokered identification architecture, where an online central hub mediates user authentications between identity providers and service providers. We show that both FCCX and GOV.UK Verify suffer from serious privacy and security shortcomings, fail to comply with privacy-preserving guidelines they are meant to follow, and may actually degrade user privacy. Notably, the hub can link interactions of the same user across different service providers and has visibility over private identifiable information of citizens. In case of malicious compromise it is also able to undetectably impersonate users. Within the structural design constraints placed on these nation-scale brokered identification systems, we propose feasible technical solutions to the privacy and security issues we identified. We conclude with a strong recommendation that FCCX and GOV.UK Verify be subject to a more in-depth technical and public review, based on a defined and comprehensive threat model, and adopt adequate structural adjustments.
Monero is a privacy-centric cryptocurrency that allows users to obscure their transactions by including chaff coins, called “mixins,” along with the actual coins they spend. In this paper, we empirically evaluate two weaknesses in Monero’s mixin sampling strategy. First, about 62% of transaction inputs with one or more mixins are vulnerable to “chain-reaction” analysis - that is, the real input can be deduced by elimination. Second, Monero mixins are sampled in such a way that they can be easily distinguished from the real coins by their age distribution; in short, the real input is usually the “newest” input. We estimate that this heuristic can be used to guess the real input with 80% accuracy over all transactions with 1 or more mixins. Next, we turn to the Monero ecosystem and study the importance of mining pools and the former anonymous marketplace AlphaBay on the transaction volume. We find that after removing mining pool activity, there remains a large amount of potentially privacy-sensitive transactions that are affected by these weaknesses. We propose and evaluate two countermeasures that can improve the privacy of future transactions.