Prospects for Protecting Business Microdata when Releasing Population Totals via a Remote Server

Open access

Abstract

Many statistical agencies face the challenge of maintaining the confidentiality of respondents while providing as much analytical value as possible from their data. Datasets relating to businesses present particular difficulties because they are likely to contain information about large enterprises that dominate industries and may be more easily identified. Agencies therefore tend to take a cautious approach to releasing business data (e.g., trusted access, remote access and synthetic data). The Australian Bureau of Statistics has developed a remote server, called TableBuilder, which has the capability to allow users to specify and request tables created from business microdata. The tables are confidentialised automatically by perturbing cell values, and the results are returned quickly to the users. The perturbation method is designed to protect against attacks, which are attempts to undo the confidentialisation, such as the well-known differencing attack. This paper considers the risk and utility trade-off when releasing three Australian Bureau of Statistics business collections via its TableBuilder product.
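
As an illustration of the differencing attack mentioned above, the following minimal sketch compares two overlapping totals with and without perturbation. Everything in it is an assumption made for illustration: the business names and turnover values are invented, and the multiplicative noise keyed to the set of contributing units is a generic stand-in, not the method actually used in TableBuilder.

```python
import random

# Hypothetical turnover values (in $m) for the businesses in one industry cell.
turnover = {"A": 950.0, "B": 12.0, "C": 8.5, "D": 5.0}


def cell_total(units):
    """Exact (unprotected) total for the requested set of units."""
    return sum(turnover[u] for u in units)


def perturbed_total(units, scale=0.05, seed=0):
    """Total with multiplicative noise (assumed scheme, for illustration only).

    The noise is seeded by the set of contributing units, so repeating the
    same query returns the same answer and averaging repeats gains nothing.
    """
    rng = random.Random(f"{seed}:{sorted(units)}")
    return cell_total(units) * (1.0 + rng.uniform(-scale, scale))


all_units = ["A", "B", "C", "D"]
without_b = ["A", "C", "D"]  # same cell, requested so that business B drops out

# Differencing attack on exact totals: the difference is B's value.
exact_diff = cell_total(all_units) - cell_total(without_b)

# The same attack on perturbed totals: each total may move by up to roughly 48
# (5% of about 970), which is far larger than B's contribution of 12.
noisy_diff = perturbed_total(all_units) - perturbed_total(without_b)

print("difference of exact totals:    ", exact_diff)
print("difference of perturbed totals:", noisy_diff)
```

In this sketch the unprotected difference recovers business B's value exactly, while the difference of the perturbed totals is dominated by noise. The cost is that each published total is itself only approximate, which is the risk and utility trade-off the paper examines.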

Journal of Official Statistics

The Journal of Statistics Sweden
