Combining survey and auxiliary data to produce official statistics is gaining interest at federal agencies and among policy makers because of its efficiency. Recent studies have shown that small area estimation models are a practical way to integrate data from multiple sources and improve estimation at fine levels of aggregation. In this article, agricultural predictions are constructed using a hierarchical Bayes subarea-level model fit to data from different sources. Auxiliary data are first used to complement the survey data and define the prediction space, and then to define covariates for the model. Finally, not-in-sample predictions are constructed from the model output, and benchmarking constraints are imposed on the final set of in-sample and not-in-sample predictions. Unlike most studies that discuss not-in-sample prediction, this article illustrates a method that uses the data available from multiple sources to define the prediction space. As a consequence, the resulting framework provides a larger set of nationwide predictions as candidates for official statistics, and extrapolation is not a concern. Challenges in developing methods to combine different data sources are discussed in the context of planted acreage prediction.