Monte Carlo and Reconstruction Membership Inference Attacks against Generative Models

Open access


We present two information leakage attacks that outperform previous work on membership inference against generative models. The first attack allows membership inference without assumptions on the type of the generative model. In contrast to previous evaluation metrics for generative models, such as Kernel Density Estimation, it only considers samples of the model that are close to training data records. The second attack specifically targets Variational Autoencoders and achieves high membership inference accuracy. Furthermore, previous work mostly considers adversaries who perform single-record membership inference. We argue for considering regulatory actors who perform set membership inference to identify the use of specific datasets for training. The attacks are evaluated on two generative model architectures, Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs), trained on standard image datasets. Our results show that the two attacks yield success rates superior to previous work on most datasets while relying on only very mild assumptions. We envision the two attacks, in combination with the formalization of membership inference attack types, as especially useful, for example for enforcing data privacy standards and for automatically assessing model quality in machine-learning-as-a-service setups. In practice, our work motivates the use of GANs, since they prove less vulnerable to information leakage attacks while producing detailed samples.
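To make the idea behind the first attack more concrete, the following is a minimal sketch of a Monte Carlo style membership score: it counts the fraction of generated samples that fall within a small distance of a candidate record, so that only samples close to the record contribute. The function names, the Euclidean distance, the fixed threshold `eps`, and the toy data are all assumptions made for illustration; the abstract does not specify these details, and the attack as published may differ. The second function sketches set membership inference under the same assumptions by comparing the mean score of two candidate sets.

```python
import numpy as np

def mc_membership_score(candidate, samples, eps):
    """Fraction of generated samples within distance eps of the candidate.

    Hypothetical helper: Euclidean distance and a fixed eps are assumptions.
    """
    # Distance from the candidate record to every generated sample.
    dists = np.linalg.norm(samples - candidate, axis=1)
    # Only samples closer than eps contribute to the score.
    return float(np.mean(dists < eps))

def guess_training_set(set_a, set_b, samples, eps):
    """Guess which of two candidate sets was used for training.

    Hypothetical decision rule: pick the set with the higher mean score.
    """
    score_a = np.mean([mc_membership_score(x, samples, eps) for x in set_a])
    score_b = np.mean([mc_membership_score(x, samples, eps) for x in set_b])
    return "a" if score_a > score_b else "b"

# Toy stand-in for samples drawn from a generative model (2-D for brevity).
samples = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])

member = np.array([0.0, 0.0])        # lies where the model concentrates samples
non_member = np.array([10.0, 10.0])  # far from every generated sample

member_score = mc_membership_score(member, samples, eps=0.5)
non_member_score = mc_membership_score(non_member, samples, eps=0.5)
set_guess = guess_training_set([member], [non_member], samples, eps=0.5)
```

In this toy setting the record near the bulk of the generated samples receives the higher score. In practice the threshold and the distance measure would need to be chosen per dataset, and the number of generated samples would be far larger.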

