Estimation of True Quantiles from Quantitative Data Obfuscated with Additive Noise

Debolina Ghatak 1 and Bimal Roy 1
  • 1 Indian Statistical Institute, Applied Statistics Unit, Kolkata 700108, India

Abstract

Privacy protection and data security have recently received substantial attention because of the growing need to protect sensitive information such as credit card data and medical data. There are many ways to protect data. Here, we consider approaches that also retain some of the data's statistical usefulness. One such approach is to mask the data with additive or multiplicative noise and then recover desired parameters of the original distribution from the masked data and knowledge of the noise distribution. In this article, we discuss the estimation of any desired quantile of a quantitative data set masked with additive noise. We also propose a method for choosing appropriate parameters of the noise distribution and discuss the advantages of this method over some existing methods.
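
To make the masking idea concrete, the following is a minimal illustrative sketch and not the estimator proposed in the article. It assumes, purely for illustration, that the original data are approximately normal and that the noise is zero-mean Gaussian with a known standard deviation; the function names `mask` and `estimate_quantile_normal` are hypothetical. Under these assumptions, the variance of the original data can be recovered by subtracting the known noise variance from the variance of the masked data, which in turn gives a corrected quantile estimate.

```python
import random
from statistics import NormalDist, fmean, pvariance


def mask(values, noise_sd, rng):
    """Mask each value with independent zero-mean Gaussian noise (assumed noise model)."""
    return [x + rng.gauss(0.0, noise_sd) for x in values]


def estimate_quantile_normal(masked, noise_sd, p):
    """Estimate the p-th quantile of the ORIGINAL data from masked data.

    Assumption (not from the article): the original data are roughly normal,
    so mean and variance determine every quantile.
    """
    mean_x = fmean(masked)  # zero-mean noise leaves the mean unchanged
    # Var(Y) = Var(X) + Var(Z) for independent X and Z, so subtract the noise variance.
    var_x = max(pvariance(masked) - noise_sd ** 2, 0.0)
    return mean_x + (var_x ** 0.5) * NormalDist().inv_cdf(p)


rng = random.Random(0)
original = [rng.gauss(50.0, 10.0) for _ in range(20000)]  # synthetic data
masked = mask(original, noise_sd=5.0, rng=rng)

p = 0.9
true_q = 50.0 + 10.0 * NormalDist().inv_cdf(p)   # population quantile of the synthetic data
naive_q = sorted(masked)[int(p * len(masked))]   # sample quantile of the masked data
est_q = estimate_quantile_normal(masked, noise_sd=5.0, p=p)

print(f"true: {true_q:.2f}  naive: {naive_q:.2f}  corrected: {est_q:.2f}")
```

In this synthetic setting the naive quantile of the masked data overshoots the upper quantile, because the added noise inflates the spread, while the variance-corrected estimate lies closer to the true value. A nonparametric treatment would not rely on the normality assumption made here.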


