Small Area Model-Based Estimators Using Big Data Sources

Open access


The timely, accurate monitoring of social indicators, such as poverty or inequality, on a fine-grained spatial and temporal scale is crucial for understanding social phenomena and for policymaking, but it poses a great challenge to official statistics. This article argues that an interdisciplinary approach, combining the body of statistical research in small area estimation with the body of research in social data mining based on Big Data, can provide novel means to tackle this problem successfully. Big Data derived from the digital crumbs that humans leave behind in their daily activities are in fact providing ever more accurate proxies of social life. Social data mining from these data, coupled with advanced model-based techniques for fine-grained estimates, has the potential to provide a novel microscope through which to view and understand social complexity. This article suggests three ways to use Big Data together with small area estimation techniques, and shows how Big Data has the potential to mirror aspects of well-being and other socioeconomic phenomena.

Journal of Official Statistics

The Journal of Statistics Sweden
