Detecting Fraudulent Interviewers by Improved Clustering Methods – The Case of Falsifications of Answers to Parts of a Questionnaire

Open access

Abstract

Falsified interviews pose a serious threat to empirical research based on survey data, so identifying such cases is important for ensuring data quality. Previous research has shown that applying cluster analysis to a set of indicators helps to identify suspicious interviewers when a substantial share of their interviews are complete falsifications. This analysis is extended to the case in which only a share of the questions within each interview provided by an interviewer is fabricated. The assessment is based on synthetic datasets with properties set a priori. These are constructed from a unique experimental dataset containing both real and fabricated data for each respondent. This bootstrap approach makes it possible to evaluate the robustness of the method as the share of fabricated answers per interview decreases. The results indicate a substantial loss of discriminatory power in the standard cluster analysis when the share of fabricated answers within an interview becomes small. A novel clustering method that allows constraints to be imposed on cluster sizes improves performance, in particular when only a few falsifiers are present. This new approach will help to increase the robustness of survey data by detecting potential falsifiers more reliably.
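To make the idea of a size-constrained clustering concrete, the following is a minimal sketch and not the authors' procedure. It assumes each interviewer is summarized by a short vector of indicator values, splits the interviewers into two clusters by minimizing the within-cluster sum of squares, and caps the size of one cluster with a hypothetical parameter `max_suspicious`. The search is plain brute-force enumeration, which is only feasible for small examples. The function names and the toy indicator values are invented for illustration.

```python
from itertools import combinations


def within_ss(points):
    """Sum of squared Euclidean distances of points to their centroid."""
    if not points:
        return 0.0
    dim = len(points[0])
    centroid = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in points for d in range(dim))


def constrained_two_clusters(indicators, max_suspicious):
    """Split interviewers into two clusters, capping the size of one of them.

    indicators: dict mapping interviewer id -> list of indicator values.
    max_suspicious: assumed upper bound on the size of the capped cluster.
    Returns the ids in the size-capped cluster as a frozenset.
    """
    ids = sorted(indicators)
    best_cost, best_subset = float("inf"), frozenset()
    # Enumerate every candidate capped cluster of admissible size.
    for size in range(1, max_suspicious + 1):
        for subset in combinations(ids, size):
            chosen = set(subset)
            inside = [indicators[i] for i in ids if i in chosen]
            outside = [indicators[i] for i in ids if i not in chosen]
            cost = within_ss(inside) + within_ss(outside)
            if cost < best_cost:
                best_cost, best_subset = cost, frozenset(subset)
    return best_subset


# Toy indicator values, invented for illustration only.
toy = {
    "int01": [0.10, 0.20],
    "int02": [0.12, 0.18],
    "int03": [0.09, 0.22],
    "int04": [0.11, 0.21],
    "int05": [0.80, 0.90],
    "int06": [0.15, 0.19],
}
flagged = constrained_two_clusters(toy, max_suspicious=2)
```

On this toy input the capped cluster contains only `int05`, the one interviewer whose indicator values lie far from the rest. Membership in the capped cluster marks an interviewer as a candidate for follow-up checks such as reinterviews; it is not proof of falsification.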


