Decision trees and random forests are widely used classifiers in machine learning. Service providers often host classification models in a cloud service and provide an interface for clients to use the model remotely. While the model is sensitive information of the server, the input query and prediction results are sensitive information of the client. This motivates the need for private decision tree evaluation, where the service provider does not learn the client’s input and the client does not learn the model except for its size and the result.
In this work, we identify the three phases of private decision tree evaluation protocols: feature selection, comparison, and path evaluation. We systematize constant-round protocols for each of these phases to identify the best available instantiations using the two main paradigms for secure computation: garbling techniques and homomorphic encryption. There is a natural tradeoff between runtime and communication considering these two paradigms: garbling techniques use fast symmetric-key operations but require a large amount of communication, while homomorphic encryption is computationally heavy but requires little communication. Our contributions are as follows: Firstly, we systematically review and analyse state-of-the-art protocols for the three phases of private decision tree evaluation. Our methodology allows us to identify novel combinations of these protocols that provide better tradeoffs than existing protocols. Thereafter, we empirically evaluate all combinations of these protocols by providing communication and runtime measures, and provide recommendations based on the identified concrete tradeoffs.
If the inline PDF is not rendering correctly, you can download the PDF file here.
[ALSZ13] Gilad Asharov, Yehuda Lindell, Thomas Schneider, and Michael Zohner. More efficient oblivious transfer and extensions for faster secure computation. In ACM Computer and Communications Security (CCS’13), pages 535–548. ACM, 2013.
[ALSZ15] Gilad Asharov, Yehuda Lindell, Thomas Schneider, and Michael Zohner. More efficient oblivious transfer extensions with security for malicious adversaries. In Advances in Cryptology – EUROCRYPT’15, volume 9056 of LNCS, pages 673–701. Springer, 2015.
[ALSZ17] Gilad Asharov, Yehuda Lindell, Thomas Schneider, and Michael Zohner. More efficient oblivious transfer extensions. J. Cryptology, 30(3):805–858, 2017.
[AM18] Bushra A. AlAhmadi and Ivan Martinovic. Malclassifier: Malware family classification using network flow sequence behaviour. In 2018 APWG Symposium on Electronic Crime Research (eCrime’18), pages 1–13. IEEE, 2018.
[Bar15] Jeff Barr. Amazon machine learning – make data-driven decisions at scale. aws.amazon.com/blogs/aws/amazon-machine-learning-make-data-drivendecisions-at-scale, 2015. Accessed: 2018-08-19.
[BBB+16] Elaine B. Barker, William C. Barker, William E. Burr, W. Timothy Polk, and Miles E. Smid. Sp 800-57. recommendation for key management, part 1: General (revised). Technical report, Gaithersburg, MD, United States, 2016.
[BDK+18] Niklas Büscher, Daniel Demmler, Stefan Katzenbeisser, David Kretzmer, and Thomas Schneider. HyCC: Compilation of hybrid protocols for practical secure computation. In ACM Computer and Communications Security (CCS’18), pages 847–861. ACM, 2018.
[Bea95] Donald Beaver. Precomputing oblivious transfer. In Advances in Cryptology – CRYPTO’95, volume 963 of LNCS, pages 97–109. Springer, 1995.
[BFK+09] Mauro Barni, Pierluigi Failla, Vladimir Kolesnikov, Riccardo Lazzeretti, Ahmad-Reza Sadeghi, and Thomas Schneider. Secure evaluation of private linear branching programs with medical applications. In European Symposium on Research in Computer Security (ESORICS’09), volume 5789 of LNCS, pages 424–439. Springer, 2009.
[BFL+11] Mauro Barni, Pierluigi Failla, Riccardo Lazzeretti, Ahmad-Reza Sadeghi, and Thomas Schneider. Privacy-preserving ECG classification with branching programs and neural networks. IEEE Transactions on Information Forensics and Security, 6(2):452–468, 2011.
[BFR+18] Ferdinand Brasser, Tommaso Frassetto, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, and Christian Weinert. Voiceguard: Secure and private speech processing. In Annual Conference of the International Speech Communication Association (INTERSPEECH’18), pages 1303–1307. ISCA, 2018.
[BHKR13] Mihir Bellare, Viet Tung Hoang, Sriram Keelveedhi, and Phillip Rogaway. Efficient garbling from a fixed-key blockcipher. In IEEE Symposium on Security and Privacy (S&P’13), pages 478–492. IEEE, 2013.
[Big18] Inc. BigML. Machine learning made easy, beautiful and understandable. https://bigml.com/, 2018. Accessed: 2018-08-24.
[BJL12] Dan Bogdanov, Roman Jagomägis, and Sven Laur. A universal toolkit for cryptographically secure privacy-preserving data mining. In Pacific Asia Workshop on Intelligence and Security Informatics (PAISI’12), volume 7299 of LNCS, pages 112–126. Springer, 2012.
[BMR90] Donald Beaver, Silvio Micali, and Phillip Rogaway. The round complexity of secure protocols (extended abstract). In ACM Symposium on Theory of Computing (STOC’90), pages 503–513. ACM, 1990.
[BPSW07] Justin Brickell, Donald E. Porter, Vitaly Shmatikov, and Emmett Witchel. Privacy-preserving remote diagnostics. In ACM Computer and Communications Security (CCS’07), pages 498–507. ACM, 2007.
[BPTG15] Raphael Bost, Raluca Ada Popa, Stephen Tu, and Shafi Goldwasser. Machine learning classification over encrypted data. In Network and Distributed System Security Symposium (NDSS’15). The Internet Society, 2015.
[BS09] Justin Brickell and Vitaly Shmatikov. Privacy-preserving classifier learning. In Financial Cryptography and Data Security (FC’09), volume 5628 of LNCS, pages 128–147. Springer, 2009.
[BSR18] Diogo Barradas, Nuno Santos, and Luís Rodrigues. Effective detection of multimedia protocol tunneling using machine learning. In USENIX Security Symposium’18, pages 169–185. USENIX, 2018.
[CDH+17] Martine De Cock, Rafael Dowsley, Caleb Horst, Raj Katti, Anderson C. A. Nascimento, Wing-Sea Poon, and Stacey C. Truex. Efficient and private scoring of decision trees, support vector machines and logistic regression models based on pre-computation. IEEE Transactions on Dependable and Secure Computing, To appear., 2017.
[CO18] Michele Ciampi and Claudio Orlandi. Combining private set-intersection with secure two-party computation. In Security and Cryptography for Networks (SCN’18), volume 11035 of Lecture Notes in Computer Science, pages 464–482. Springer, 2018.
[DCBA14] Manuel Fernández Delgado, Eva Cernadas, Senén Barro, and Dinani Gomes Amorim. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15(1):3133–3181, 2014.
[DGK07] Ivan Damgård, Martin Geisler, and Mikkel Krøigaard. Efficient and secure comparison for on-line auctions. In Australasian Conference on Information Security and Privacy (ACISP’07), volume 4586 of LNCS, pages 416–430. Springer, 2007.
[DGK08] Ivan Damgård, Martin Geisler, and Mikkel Krøigaard. Homomorphic encryption and secure comparison. International Journal of Applied Cryptography, 1(1):22–31, 2008.
[DGK09] Ivan Damgård, Martin Geisler, and Mikkel Krøigaard. A correction to ’Efficient and secure comparison for on-line auctions’. International Journal of Advanced Computer Technology (IJACT), 1(4):323–324, 2009.
[DJ01] Ivan Damgård and Mads Jurik. A generalisation, a simplification and some applications of Paillier’s probabilistic public-key system. In Public Key Cryptography (PKC’01), volume 1992 of LNCS, pages 119–136. Springer, 2001.
[DSZ15] Daniel Demmler, Thomas Schneider, and Michael Zohner. ABY - A framework for efficient mixed-protocol secure two-party computation. In Network and Distributed System Security Symposium (NDSS’15). The Internet Society, 2015.
[ElG85] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Advances in Cryptology – CRYPTO’85, volume 196 of LNCS, pages 10–18. Springer, 1985.
[FW12] Pui Kuen Fong and Jens H. Weber-Jahnke. Privacy preserving decision tree learning using unrealized data sets. IEEE Transactions on Knowledge and Data Engineering, 24(2):353–364, 2012.
[GBC+97] Michael D. Garris, James L. Blue, Gerald T. Candela, Patrick J. Grother, Stanley Janet, and Charles L. Wilson. NIST form-based handprint recognition system (release 2.0). Interagency/Internal Report (NISTIR) - 5959, 1997.
[GKG+18] Srishti Gupta, Abhinav Khattar, Arpit Gogia, Ponnurangam Kumaraguru, and Tanmoy Chakraborty. Collective classification of spam campaigners on Twitter: A hierarchical meta-path based approach. In World Wide Web Conference on World Wide Web (WWW’18), pages 529–538. ACM, 2018.
[GKS17] Daniel Günther, Ágnes Kiss, and Thomas Schneider. More efficient universal circuit constructions. In Advances in Cryptology – ASIACRYPT’17, volume 10625 of LNCS, pages 443–470. Springer, 2017.
[HEK12] Yan Huang, David Evans, and Jonathan Katz. Private set intersection: Are garbled circuits better than custom protocols? In Network and Distributed System Security Symposium (NDSS’12). The Internet Society, 2012.
[IKNP03] Yuval Ishai, Joe Kilian, Kobbi Nissim, and Erez Petrank. Extending oblivious transfers efficiently. In Advances in Cryptology – CRYPTO’03, volume 2729 of LNCS, pages 145–161. Springer, 2003.
[IP07] Yuval Ishai and Anat Paskin. Evaluating branching programs on encrypted data. In Theory of Cryptography Conference (TCC’07), volume 4392 of LNCS, pages 575–594. Springer, 2007.
[IR89] Russell Impagliazzo and Steven Rudich. Limits on the provable consequences of one-way permutations. In ACM Symposium on Theory of Computing (STOC’89), pages 44–61. ACM, 1989.
[JS18] Marc Joye and Fariborz Salehi. Private yet efficient decision tree evaluation. In Data and Applications Security and Privacy (DBSec’18), volume 10980 of LNCS, pages 243–259. Springer, 2018.
[JSD+18] Mika Juuti, Sebastian Szyller, Alexey Dmitrenko, Samuel Marchal, and N. Asokan. PRADA: protecting against DNN model stealing attacks. CoRR, abs/1805.02628, 2018.
[JVC18] Chiraag Juvekar, Vinod Vaikuntanathan, and Anantha Chandrakasan. GAZELLE: A low latency framework for secure neural network inference. In USENIX Security Symposium’18, pages 1651–1669. USENIX, 2018.
[KM11] Jonathan Katz and Lior Malka. Constant-round private function evaluation with linear complexity. In Advances in Cryptology – ASIACRYPT’11, volume 7073 of LNCS, pages 556–571. Springer, 2011.
[KMAM18] Manish Kesarwani, Bhaskar Mukhoty, Vijay Arya, and Sameep Mehta. Model extraction warning in mlaas paradigm. In Annual Computer Security Applications Conference (ACSAC’18), pages 371–380. ACM, 2018.
[KOS15] Marcel Keller, Emmanuela Orsini, and Peter Scholl. Actively secure OT extension with optimal overhead. In Advances in Cryptology – CRYPTO’15, volume 9215 of LNCS, pages 724–741. Springer, 2015.
[KS08a] Vladimir Kolesnikov and Thomas Schneider. Improved garbled circuit: Free XOR gates and applications. In International Colloquium on Automata, Languages and Programming (ICALP’08), volume 5126 of LNCS, pages 486–498. Springer, 2008.
[KS08b] Vladimir Kolesnikov and Thomas Schneider. A practical universal circuit construction and secure evaluation of private functions. In Financial Cryptography and Data Security (FC’08), volume 5143 of LNCS, pages 83–97. Springer, 2008.
[KS16] Ágnes Kiss and Thomas Schneider. Valiant’s universal circuit is practical. In Advances in Cryptology – EUROCRYPT’16, volume 9665 of LNCS, pages 699–728. Springer, 2016.
[KSS09] Vladimir Kolesnikov, Ahmad-Reza Sadeghi, and Thomas Schneider. Improved garbled circuit building blocks and applications to auctions and computing minima. In Cryptology and Network Security (CANS’09), volume 5888 of LNCS, pages 1–20. Springer, 2009.
[Lic18] Moshe Lichman. UCI machine learning repository. https://archive.ics.uci.edu/ml. Irvine, CA: University of California, School of Information and Computer Science, 2018. Accessed: 2018-08-24.
[LJLA17] Jian Liu, Mika Juuti, Yao Lu, and N. Asokan. Oblivious neural network predictions via MiniONN transformations. In ACM Computer and Communications Security (CCS’17), pages 619–631. ACM, 2017.
[LP00] Yehuda Lindell and Benny Pinkas. Privacy preserving data mining. In Advances in Cryptology – CRYPTO’00, volume 1880 of LNCS, pages 36–54. Springer, 2000.
[LP02] Yehuda Lindell and Benny Pinkas. Privacy preserving data mining. Journal of Cryptology, 15(3):177–206, 2002.
[LZS18] Wenjie Lu, Jun-Jie Zhou, and Jun Sakuma. Non-interactive and output expressive private comparison from homomorphic encryption. In ACM Asia Conference on Computer and Communications Security (AsiaCCS’18), pages 67–74. ACM, 2018.
[MAAG15] Michael J. Mayhew, Michael Atighetchi, Aaron Adler, and Rachel Greenstadt. Use of machine learning in big data analytics for insider threat detection. In IEEE Military Communications Conference (MILCOM’15), pages 915–922. IEEE, 2015.
[MLJ17] Inc. MLJAR. MLJAR: Machine learning for all. https://mljar.com/, 2016-2017. Accessed: 2018-08-24.
[MS13] Payman Mohassel and Seyed Saeed Sadeghian. How to hide circuits in MPC an efficient framework for private function evaluation. In Advances in Cryptology – EUROCRYPT’13, volume 7881 of LNCS, pages 557–574. Springer, 2013.
[MSS14] Payman Mohassel, Seyed Saeed Sadeghian, and Nigel P. Smart. Actively secure private function evaluation. In Advances in Cryptology – ASIACRYPT’14, volume 8874 of LNCS, pages 486–505. Springer, 2014.
[MZ17] Payman Mohassel and Yupeng Zhang. SecureML: A system for scalable privacy-preserving machine learning. In IEEE Symposium on Security and Privacy (S&P’17), pages 19–38. IEEE, 2017.
[NIZ+16] Ana Nika, Asad Ismail, Ben Y. Zhao, Sabrina Gaito, Gian Paolo Rossi, and Haitao Zheng. Understanding and predicting data hotspots in cellular networks. Mobile Networks and Applications (MONET), 21(3):402–413, 2016.
[OTGM18] Rebekah Overdorf, Carmela Troncoso, Rachel Greenstadt, and Damon McCoy. Under the underground: Predicting private interactions in underground forums. CoRR, abs/1805.04494, 2018.
[Pai99] Pascal Paillier. Public-key cryptosystems based on composite degree residuosity classes. In Advances in Cryptology – EUROCRYPT’99, volume 1592 of LNCS, pages 223–238. Springer, 1999.
[PMG+17] Nicolas Papernot, Patrick D. McDaniel, Ian J. Good-fellow, Somesh Jha, Z. Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In ACM Asia Conference on Computer and Communications Security (AsiaCCS’17), 2017, pages 506–519. ACM, 2017.
[RMD18] Alejandro Rago, Claudia Marcos, and J. Andres Diaz-Pace. Using semantic roles to improve text classification in the requirements domain. Language Resources and Evaluation, 52(3):801–837, 2018.
[RWT+18] M. Sadegh Riazi, Christian Weinert, Oleksandr Tkachenko, Ebrahim M. Songhori, Thomas Schneider, and Farinaz Koushanfar. Chameleon: A hybrid secure computation framework for machine learning applications. In ACM Asia Conference on Computer and Communications Security (AsiaCCS’18), pages 707–721. ACM, 2018.
[Sch08] Thomas Schneider. Practical secure function evaluation. Master’s thesis, Friedrich-Alexander University Erlangen-Nürnberg, Germany, February 27, 2008.
[SS08] Ahmad-Reza Sadeghi and Thomas Schneider. Generalized universal circuits for secure evaluation of private functions with application to data classification. In Information Security and Cryptology (ICISC’08), volume 5461 of LNCS, pages 336–353. Springer, 2008.
[SSSS17] Reza Shokri, Marco Stronati, Congzheng Song, and Vitaly Shmatikov. Membership inference attacks against machine learning models. In IEEE Symposium on Security and Privacy (S&P’17), pages 3–18. IEEE, 2017.
[TASF09] Ajay Kumar Tanwani, M. Jamal Afridi, M. Zubair Shafiq, and Muddassar Farooq. Guidelines to select machine learning scheme for classification of biomedical datasets. In Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBIO’09), volume 5483 of LNCS, pages 128–139. Springer, 2009.
[Ten98] Michel Tenenhaus. La régression PLS: théorie et pratique. Editions technip, 1998.
[TKK19] Anselme Tueno, Florian Kerschbaum, and Stefan Katzenbeisser. Private evaluation of decision trees using sublinear cost. Proceedings on Privacy Enhancing Technologies (PoPETs), 2019(1):266–286, 2019.
[TMZC17] Raymond K. H. Tai, Jack P. K. Ma, Yongjun Zhao, and Sherman S. M. Chow. Privacy-preserving decision trees evaluation via linear functions. In European Symposium on Research in Computer Security (ESORICS’17), volume 10493 of LNCS, pages 494–512. Springer, 2017.
[TZJ+16] Florian Tramèr, Fan Zhang, Ari Juels, Michael K. Reiter, and Thomas Ristenpart. Stealing machine learning models via prediction APIs. In USENIX Security Symposium’16, pages 601–618. USENIX, 2016.
[Val76] Leslie G. Valiant. Universal circuits (preliminary report). In ACM Symposium on Theory of Computing (STOC’76), pages 196–203. ACM, 1976.
[VC05] Jaideep Vaidya and Chris Clifton. Privacy-preserving decision trees over vertically partitioned data. In Data and Applications Security (DBSec’05), volume 3654 of LNCS, pages 139–152. Springer, 2005.
[VCKP08] Jaideep Vaidya, Chris Clifton, Murat Kantarcioglu, and A. Scott Patterson. Privacy-preserving decision trees over vertically partitioned data. ACM Transactions on Knowledge Discovery from Data (TKDD), 2(3):14:1–14:27, 2008.
[Wak68] Abraham Waksman. A permutation network. Journal of the ACM, 15(1):159–163, 1968.
[WFNL16] David J. Wu, Tony Feng, Michael Naehrig, and Kristin E. Lauter. Privately evaluating decision trees and random forests. Proceedings on Privacy Enhancing Technologies (PoPETs), 2016(4):335–355, 2016.
[WGC18] Sameer Wagh, Divya Gupta, and Nishanth Chandran. SecureNN: Efficient and private neural network training. Cryptology ePrint Archive, 2018/442, 2018.
[Wis18] Wise.io. Machine learning for the industrial internet of things. wise.io, 2018. Accessed: 2018-08-24.
[Yao82] Andrew C.-C. Yao. Protocols for secure computations (extended abstract). In Foundations of Computer Science (FOCS’82), pages 160–164. IEEE, 1982.
[Yao86] Andrew C.-C. Yao. How to generate and exchange secrets (extended abstract). In Foundations of Computer Science (FOCS’86), pages 162–167. IEEE, 1986.
[YGL17] Fengpeng Yuan, Xianyi Gao, and Janne Lindqvist. How busy are you?: Predicting the interruptibility intensity of mobile users. In Conference on Human Factors in Computing Systems (CHI’17), pages 5346–5360. ACM, 2017.
[ZRE15] Samee Zahur, Mike Rosulek, and David Evans. Two halves make a whole - reducing data transfer in garbled circuits using half gates. In Advances in Cryptology – EUROCRYPT’15, volume 9057 of LNCS, pages 220–250. Springer, 2015.