This special issue on “Systems and Architectures for High-Quality Statistics Production” is a stimulating resource for statistical agencies and private sector data collectors in a challenging time characterized by massive amounts of data, from a variety of sources, available at varying intervals, and of varying quality.
Traditionally, statistical products were created from a single source, most often surveys or administrative data. However, neither surveys nor administrative data alone can meet the data needs of today’s society. In addition, the need to reduce the costs of data production requires that multiple sources be used in combination. The same cost pressure calls for streamlined production cycles, and the increasing difficulty of data collection itself requires production systems to be much more flexible than they have been in the past. Increasingly, these pressures are driving statistical agencies and private data collectors to redesign their entire data production cycle. The examples in this special issue from Statistics Netherlands and Statistics New Zealand demonstrate such developments in government agencies; the example from RTI reflects efforts visible among private sector data collectors. This commentary highlights some issues related to organizational challenges, and others that create the basis for reproducible research and are therefore of general interest to the research community.