Open Access

Measuring Disclosure Risk and Data Utility for Flexible Table Generators

Journal of Official Statistics
Special Issue on New Techniques and Technologies for Statistics

Statistical agencies are making increasing use of the internet to disseminate census tabular outputs through web-based flexible table-generating servers that allow users to define and generate their own tables. The key questions in the development of these servers are: (1) what data should be used to generate the tables, and (2) what statistical disclosure control (SDC) method should be applied. To generate flexible tables, the server has to be able to measure the disclosure risk in the final output table, apply the SDC method, and then iteratively reassess the disclosure risk. SDC methods may be applied to the underlying data used to generate the tables, to the final output table generated from the original data, or to both. Besides assessing disclosure risk, the server should provide a measure of data utility by comparing the perturbed table to the original table. In this article, we examine aspects of the design and development of a flexible table-generating server for census tables and demonstrate a disclosure risk-data utility analysis for comparing SDC methods. We propose measures for disclosure risk and data utility that are based on information theory.
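
As an illustration of what information-theoretic measures of this kind can look like, the sketch below computes two generic quantities on a table flattened to a list of cell counts. The abstract does not specify the measures, so the choices here are assumptions for illustration, not the article's definitions: a risk indicator defined as one minus the normalized Shannon entropy of the cell proportions (counts concentrated in few cells give a value near 1), and a utility loss defined as the Hellinger distance between the original and perturbed cell proportions. The function names `entropy_risk` and `hellinger_distance` are hypothetical.

```python
import math


def entropy_risk(counts):
    """Illustrative risk indicator: 1 - H / log(K).

    H is the Shannon entropy of the cell proportions and K the number
    of cells. This is an assumed measure, not the article's definition.
    """
    total = sum(counts)
    k = len(counts)
    if total == 0 or k < 2:
        return 0.0
    # Empty cells contribute nothing to the entropy (0 * log 0 := 0).
    h = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - h / math.log(k)


def hellinger_distance(original, perturbed):
    """Hellinger distance between the cell proportions of two tables.

    Both tables must have the same cells in the same order and a
    positive total. Returns a value in [0, 1]; 0 means identical
    proportions.
    """
    if len(original) != len(perturbed):
        raise ValueError("tables must have the same number of cells")
    total_o, total_p = sum(original), sum(perturbed)
    if total_o <= 0 or total_p <= 0:
        raise ValueError("tables must have positive totals")
    squared = sum(
        (math.sqrt(o / total_o) - math.sqrt(p / total_p)) ** 2
        for o, p in zip(original, perturbed)
    )
    return math.sqrt(squared / 2.0)


if __name__ == "__main__":
    original = [12, 1, 0, 7]    # hypothetical cell counts
    perturbed = [12, 0, 0, 6]   # hypothetical counts after an SDC method
    print("risk (original): ", entropy_risk(original))
    print("risk (perturbed):", entropy_risk(perturbed))
    print("utility loss:    ", hellinger_distance(original, perturbed))
```

In a server of the kind described above, quantities like these could be recomputed after each application of an SDC method, so that risk and utility are reassessed together before a table is released.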

eISSN: 2001-7367
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Mathematics, Probability and Statistics