Private Evaluation of Decision Trees using Sublinear Cost

Anselme Tueno 1 , Florian Kerschbaum 2 , and Stefan Katzenbeisser 3
  • 1 SAP SE,
  • 2 University of Waterloo,
  • 3 TU Darmstadt,


Decision trees are widespread machine learning models used for data classification and have many applications in areas such as healthcare, remote diagnostics, spam filtering, etc. In this paper, we address the problem of privately evaluating a decision tree on private data. In this scenario, the server holds a private decision tree model and the client wants to classify its private attribute vector using the server’s private model. The goal is to obtain the classification while preserving the privacy of both – the decision tree and the client input. After the computation, only the classification result is revealed to the client, while nothing is revealed to the server. Many existing protocols require a constant number of rounds. However, some of these protocols perform as many comparisons as there are decision nodes in the entire tree and others transform the whole plaintext decision tree into an oblivious program, resulting in higher communication costs. The main idea of our novel solution is to represent the tree as an array. Then we execute only d – the depth of the tree – comparisons. Each comparison is performed using a small garbled circuit, which output secret-shares of the index of the next node. We get the inputs to the comparison by obliviously indexing the tree and the attribute vector. We implement oblivious array indexing using either garbled circuits, Oblivious Transfer or Oblivious RAM (ORAM). Using ORAM, this results in the first protocol with sub-linear cost in the size of the tree. We implemented and evaluated our solution using the different array indexing procedures mentioned above. As a result, we are not only able to provide the first protocol with sublinear cost for large trees, but also reduce the communication cost for the large real-world data set “Spambase” from 18 MB to 1[triangleright]2 MB and the computation time from 17 seconds to less than 1 second in a LAN setting, compared to the best related work.

If the inline PDF is not rendering correctly, you can download the PDF file here.

  • [1] A. Aly and M. V. Vyve. Securely solving classical network flow problems. In ICISC, pages 205–221, 2014.

  • [2] G. Asharov, Y. Lindell, T. Schneider, and M. Zohner. More efficient oblivious transfer and extensions for faster secure computation. In CCS, pages 535–548, New York, NY, USA, 2013. ACM.

  • [3] G. Asharov, Y. Lindell, T. Schneider, and M. Zohner. More efficient oblivious transfer extensions. J. Cryptology, 30(3):805–858, 2017.

  • [4] M. Barni, P. Failla, V. Kolesnikov, R. Lazzeretti, A.-R. Sadeghi, and T. Schneider. Secure evaluation of private linear branching programs with medical applications. In ESORICS, pages 424–439, Berlin, Heidelberg, 2009. Springer-Verlag.

  • [5] D. Beaver. Commodity-based cryptography (extended abstract). In STOC, pages 446–455, New York, NY, USA, 1997. ACM.

  • [6] M. Bellare, V. T. Hoang, S. Keelveedhi, and P. Rogaway. Efficient garbling from a fixed-key blockcipher. In SP, pages 478–492, Washington, DC, USA, 2013. IEEE Computer Society.

  • [7] A. Ben-David, N. Nisan, and B. Pinkas. Fairplaymp: A system for secure multi-party computation. In CCS, pages 257–266, New York, NY, USA, 2008. ACM.

  • [8] M. Ben-Or, S. Goldwasser, and A. Wigderson. Completeness theorems for non-cryptographic fault-tolerant distributed computation. In STOC, pages 1–10, 1988.

  • [9] M. Blanton, A. Steele, and M. Alisagari. Data-oblivious graph algorithms for secure computation and outsourcing. In ASIACCS, pages 207–218, 2013.

  • [10] D. Bogdanov, S. Laur, and J. Willemson. Sharemind: A framework for fast privacy-preserving computations. In ESORICS, pages 192–206, 2008.

  • [11] R. Bost, R. A. Popa, S. Tu, and S. Goldwasser. Machine learning classification over encrypted data. In NDSS, 2015.

  • [12] J. Brickell, D. E. Porter, V. Shmatikov, and E. Witchel. Privacy-preserving remote diagnostics. In CCS, pages 498–507, New York, NY, USA, 2007. ACM.

  • [13] S. S. Burra, E. Larraia, J. B. Nielsen, P. S. Nordholt, C. Orlandi, E. Orsini, P. Scholl, and N. P. Smart. High performance multi-party computation for binary circuits based on oblivious transfer. IACR Cryptology ePrint Archive, 2015:472, 2015.

  • [14] J. Catlett. Overpruning large decision trees. In IJCAI, pages 764–769, San Francisco, CA, USA, 1991. Morgan Kaufmann Publishers Inc.

  • [15] D. Chaum, C. Crépeau, and I. Damgard. Multiparty unconditionally secure protocols. In STOC, pages 11–19, 1988.

  • [16] M. D. Cock, R. Dowsley, C. Horst, R. Katti, A. C. A. Nascimento, S. C. Newman, and W. Poon. Efficient and private scoring of decision trees, support vector machines and logistic regression models based on pre-computation. IACR Cryptology ePrint Archive, 2016:736, 2016.

  • [17] R. Cramer, I. Damgård, and J. B. Nielsen. Multiparty computation from threshold homomorphic encryption. In EURO-CRYPT, pages 280–299, 2001.

  • [18] I. Damgård, M. Geisler, and M. Krøigaard. Efficient and secure comparison for on-line auctions. In ACISP, pages 416–430, 2007.

  • [19] I. Damgård, M. Geisler, M. Krøigaard, and J. B. Nielsen. Asynchronous multiparty computation: Theory and implementation. In PKC, pages 160–179, 2009.

  • [20] I. Damgård, M. Keller, E. Larraia, V. Pastro, P. Scholl, and N. P. Smart. Practical covertly secure MPC for dishonest majority - or: Breaking the SPDZ limits. In ESORICS, pages 1–18, 2013.

  • [21] I. Damgård, V. Pastro, N. P. Smart, and S. Zakarias. Multi-party computation from somewhat homomorphic encryption. In CRYPTO, pages 643–662, 2012.

  • [22] I. Damgård and R. Thorbek. Efficient conversion of secret-shared values between different fields. IACR Cryptology ePrint Archive, 2008:221, 2008.

  • [23] D. Demmler, T. Schneider, and M. Zohner. ABY - A framework for efficient mixed-protocol secure two-party computation. In NDSS, 2015.

  • [24] J. Doerner and A. Shelat. Scaling oram for secure computation. In CCS, pages 523–535, 2017.

  • [25] Y. Ejgenberg, M. Farbstein, M. Levy, and Y. Lindell. SCAPI: the secure computation application programming interface. IACR Cryptology ePrint Archive, 2012:629, 2012.

  • [26] M. Franz, A. Holzer, S. Katzenbeisser, C. Schallhart, and H. Veith. CBMC-GC: an ANSI C compiler for secure two-party computations. In CC ‘14, pages 244–249, 2014.

  • [27] M. Fredrikson, S. Jha, and T. Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In CCS, pages 1322–1333, 2015.

  • [28] O. Goldreich. Towards a theory of software protection and simulation by oblivious rams. In STOC, pages 182–194, New York, NY, USA, 1987. ACM.

  • [29] O. Goldreich. Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press, New York, NY, USA, 2004.

  • [30] O. Goldreich and R. Ostrovsky. Software protection and simulation on oblivious rams. J. ACM, 43(3):431–473, May 1996.

  • [31] S. D. Gordon, J. Katz, V. Kolesnikov, F. Krell, T. Malkin, M. Raykova, and Y. Vahlis. Secure two-party computation in sublinear (amortized) time. In CCS, pages 513–524, 2012.

  • [32] T. Graepel, K. Lauter, and M. Naehrig. Ml confidential: Machine learning on encrypted data. In Proceedings of the 15th International Conference on Information Security and Cryptology, ICISC’12, pages 1–21, Berlin, Heidelberg, 2013. Springer-Verlag.

  • [33] C. Hazay and Y. Lindell. Efficient Secure Two-Party Protocols: Techniques and Constructions. Springer-Verlag New York, Inc., New York, NY, USA, 1st edition, 2010.

  • [34] W. Henecka, S. Kögl, A. Sadeghi, T. Schneider, and I. Wehrenberg. TASTY: tool for automating secure two-party computations. In CCS, pages 451–462, 2010.

  • [35] E. Hesamifard, H. Takabi, M. Ghasemi, and C. Jones. Privacy-preserving machine learning in cloud. In Proceedings of the 2017 on Cloud Computing Security Workshop, CCSW ‘17, pages 39–43, New York, NY, USA, 2017. ACM.

  • [36] Y. Ishai, J. Kilian, K. Nissim, and E. Petrank. Extending oblivious transfers efficiently. In CRYPTO, pages 145–161, 2003.

  • [37] A. Jarrous and B. Pinkas. Secure hamming distance based computation and its applications. In ACNS, pages 107–124, 2009.

  • [38] M. Keller, E. Orsini, and P. Scholl. Actively secure OT extension with optimal overhead. In CRYPTO, pages 724–741, 2015.

  • [39] M. Keller, E. Orsini, and P. Scholl. Mascot: Faster malicious arithmetic secure computation with oblivious transfer. In CCS, pages 830–842, 2016.

  • [40] M. Keller and P. Scholl. Efficient, oblivious data structures for MPC. In ASIACRYPT, pages 506–525, 2014.

  • [41] V. Kolesnikov and R. Kumaresan. Improved OT extension for transferring short secrets. In CRYPTO, pages 54–70. Springer, 2013.

  • [42] V. Kolesnikov, A. Sadeghi, and T. Schneider. Improved garbled circuit building blocks and applications to auctions and computing minima. In CANS, pages 1–20, 2009.

  • [43] V. Kolesnikov and T. Schneider. Improved garbled circuit: Free XOR gates and applications. In ICALP, pages 486–498, 2008.

  • [44] V. Kolesnikov and T. Schneider. A practical universal circuit construction and secure evaluation of private functions. In FC, pages 83–97, 2008.

  • [45] Y. Lindell and B. Pinkas. Privacy preserving data mining. In CRYPTO, volume 1880, pages 36–54, Berlin and New York, 2000. Springer.

  • [46] Y. Lindell and B. Pinkas. Privacy preserving data mining. Journal of Cryptology, 15(3):177–206, 2002.

  • [47] Y. Lindell and B. Pinkas. Secure multiparty computation for privacy-preserving data mining. IACR Cryptology ePrint Archive, 2008:197, 2008.

  • [48] Y. Lindell and B. Pinkas. A proof of security of yao’s protocol for two-party computation. J. Cryptol., 22(2):161–188, Apr. 2009.

  • [49] C. Liu, X. S. Wang, K. Nayak, Y. Huang, and E. Shi. Oblivm: A programming framework for secure computation. In SP, pages 359–376, 2015.

  • [50] D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella. Fairplay— a secure two-party computation system. In SSYM, pages 20–20, Berkeley, CA, USA, 2004. USENIX Association.

  • [51] P. Mohassel, S. S. Sadeghian, and N. P. Smart. Actively secure private function evaluation. In ASIACRYPT, pages 486–505, 2014.

  • [52] P. Mohassel and Y. Zhang. Secureml: A system for scalable privacy-preserving machine learning. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017, pages 19–38, 2017.

  • [53] M. Naor and B. Pinkas. Efficient oblivious transfer protocols. In SODA, pages 448–457, Philadelphia, PA, USA, 2001. Society for Industrial and Applied Mathematics.

  • [54] M. Naor and B. Pinkas. Computationally secure oblivious transfer. Journal of Cryptology, 18:1–35, Jan 2005.

  • [55] J. B. Nielsen and C. Orlandi. LEGO for two-party secure computation. In TCC, pages 368–386, 2009.

  • [56] A. Patra, P. Sarkar, and A. Suresh. Fast actively secure OT extension for short secrets. In NDSS. The Internet Society, 2017.

  • [57] B. Pinkas, T. Schneider, N. P. Smart, and S. C. Williams. Secure two-party computation is practical. IACR Cryptology ePrint Archive, 2009:314, 2009.

  • [58] P. Pullonen, D. Bogdanov, and T. Schneider. The design and implementation of a two-party protocol suite for share-mind 3, Sept. 2012.

  • [59] E. Shi, T.-H. H. Chan, E. Stefanov, and M. Li. Oblivious ram with o((logn)3) worst-case cost. In ASIACRYPT, pages 197–214, 2011.

  • [60] R. K. H. Tai, J. P. K. Ma, Y. Zhao, and S. S. M. Chow. Privacy-preserving decision trees evaluation via linear functions. In ESORICS, pages 494–512, 2017.

  • [61] F. Tramèr, F. Zhang, A. Juels, M. K. Reiter, and T. Risten-part. Stealing machine learning models via prediction apis. In USENIX, pages 601–618, 2016.

  • [62] X. Wang, T. H. Chan, and E. Shi. Circuit ORAM: on tightness of the goldreich-ostrovsky lower bound. In CCS, pages 850–861, 2015.

  • [63] X. S. Wang, Y. Huang, T.-H. H. Chan, A. Shelat, and E. Shi. Scoram: Oblivious ram for secure computation. In CCS, pages 191–202, 2014.

  • [64] X. S. Wang, K. Nayak, C. Liu, T.-H. H. Chan, E. Shi, E. Stefanov, and Y. Huang. Oblivious data structures. In CCS, pages 215–226, New York, NY, USA, 2014. ACM.

  • [65] I. H. Witten, E. Frank, and M. A. Hall. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kauf-mann Publishers Inc., San Francisco, CA, USA, 3rd edition, 2011.

  • [66] D. J. Wu, T. Feng, M. Naehrig, and K. Lauter. Privately evaluating decision trees and random forests. PoPETs, 2016(4):335–355, 2016.

  • [67] X. Wu, M. Fredrikson, S. Jha, and J. F. Naughton. A methodology for formalizing model-inversion attacks. In CSF, pages 355–370, 2016.

  • [68] A. C. Yao. Protocols for secure computations. In SFCS, pages 160–164, Washington, DC, USA, 1982. IEEE Computer Society.

  • [69] S. Zahur, M. Rosulek, and D. Evans. Two halves make a whole - reducing data transfer in garbled circuits using half gates. In EUROCRYPT, pages 220–250, 2015.


Journal + Issues