Effect of Missing Data on Classification Error in Panel Surveys

Open access


Survey measures of sensitive outcomes are prone to wave nonresponse and measurement error (classification error, in the case of categorical outcomes). Both types of error can lead to biased estimates and erroneous conclusions if they are not understood and addressed. The National Crime Victimization Survey (NCVS) is a nationally representative rotating panel survey with seven waves that measures property and violent crime victimization. Because not all crime is reported to the police, there is no gold standard measure of whether a respondent was victimized. For panel data, Markov Latent Class Analysis (MLCA) is a model-based approach that uses response patterns across interview waves to estimate false positive and false negative classification probabilities; it is typically applied only to complete data.

This article uses Full Information Maximum Likelihood (FIML) to include respondents with partial information in MLCA. We assess the impact of including partial respondents by comparing complete-case and FIML MLCA models on three criteria: reduction of bias in the estimates, differences in model specification, and variability in the classification error estimates. The goal is to determine the potential of FIML to improve MLCA estimates of classification error. While we apply this process to the NCVS, the approach is general and can be applied to any panel survey.
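To make the idea concrete, the following minimal sketch shows how a partial response pattern can contribute to the likelihood of a two-state latent Markov model. It is an illustration only, not the authors' model or software: the function name `pattern_likelihood`, the three-wave patterns, and all parameter values are hypothetical, and the sketch assumes ignorable (missing at random) nonresponse and ignores covariates and the complex sample design.

```python
from itertools import product


def pattern_likelihood(pattern, initial, transition, emission):
    """Likelihood of one respondent's observed responses across waves.

    pattern:    list of 0/1 responses, with None for a missed wave
    initial:    initial[s] = P(true state s at wave 1)
    transition: transition[s][t] = P(state t at next wave | state s)
    emission:   emission[s][y] = P(report y | true state s)
    """
    n_states = len(initial)
    # Forward probabilities: joint probability of the responses seen so far
    # and the current latent state.
    forward = list(initial)
    for wave, response in enumerate(pattern):
        if wave > 0:
            # Move the latent state forward one wave.
            forward = [
                sum(forward[s] * transition[s][t] for s in range(n_states))
                for t in range(n_states)
            ]
        if response is not None:
            # Observed wave: weight each state by the probability of the report.
            forward = [forward[s] * emission[s][response] for s in range(n_states)]
        # Missed wave: no emission term is applied, which is equivalent to
        # summing over both possible responses (a factor of 1).
    return sum(forward)


# Hypothetical parameter values, for illustration only.
initial = [0.9, 0.1]  # state 0 = not victimized, state 1 = victimized
transition = [[0.95, 0.05], [0.60, 0.40]]
false_positive, false_negative = 0.02, 0.30
emission = [
    [1 - false_positive, false_positive],
    [false_negative, 1 - false_negative],
]

# A complete respondent and a respondent who missed the second wave.
complete = pattern_likelihood([0, 1, 0], initial, transition, emission)
partial = pattern_likelihood([0, None, 0], initial, transition, emission)

# Sanity check: the likelihoods of all complete three-wave patterns sum to 1.
total = sum(
    pattern_likelihood(list(p), initial, transition, emission)
    for p in product([0, 1], repeat=3)
)
```

In this sketch the partial respondent's likelihood equals the sum of the likelihoods of the complete patterns consistent with the observed waves, so the respondent still informs the classification error parameters instead of being dropped as in a complete-case analysis.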


Journal of Official Statistics

The Journal of Statistics Sweden
