Decision trees and random forests are widely used classifiers in machine learning. Service providers often host classification models in a cloud service and provide an interface for clients to use the model remotely. While the model is sensitive information of the server, the input query and prediction results are sensitive information of the client. This motivates the need for private decision tree evaluation, where the service provider does not learn the client’s input and the client does not learn the model except for its size and the result.
In this work, we identify the three phases of private decision tree evaluation protocols: feature selection, comparison, and path evaluation. We systematize constant-round protocols for each of these phases to identify the best available instantiations using the two main paradigms for secure computation: garbling techniques and homomorphic encryption. There is a natural tradeoff between runtime and communication considering these two paradigms: garbling techniques use fast symmetric-key operations but require a large amount of communication, while homomorphic encryption is computationally heavy but requires little communication. Our contributions are as follows: Firstly, we systematically review and analyse state-of-the-art protocols for the three phases of private decision tree evaluation. Our methodology allows us to identify novel combinations of these protocols that provide better tradeoffs than existing protocols. Thereafter, we empirically evaluate all combinations of these protocols by providing communication and runtime measures, and provide recommendations based on the identified concrete tradeoffs.
If the inline PDF is not rendering correctly, you can download the PDF file here.
[ALSZ13] Gilad Asharov Yehuda Lindell Thomas Schneider and Michael Zohner. More efficient oblivious transfer and extensions for faster secure computation. In ACM Computer and Communications Security (CCS’13) pages 535–548. ACM 2013.
[ALSZ15] Gilad Asharov Yehuda Lindell Thomas Schneider and Michael Zohner. More efficient oblivious transfer extensions with security for malicious adversaries. In Advances in Cryptology – EUROCRYPT’15 volume 9056 of LNCS pages 673–701. Springer 2015.
[ALSZ17] Gilad Asharov Yehuda Lindell Thomas Schneider and Michael Zohner. More efficient oblivious transfer extensions. J. Cryptology 30(3):805–858 2017.
[AM18] Bushra A. AlAhmadi and Ivan Martinovic. Malclassifier: Malware family classification using network flow sequence behaviour. In 2018 APWG Symposium on Electronic Crime Research (eCrime’18) pages 1–13. IEEE 2018.
[Bar15] Jeff Barr. Amazon machine learning – make data-driven decisions at scale. aws.amazon.com/blogs/aws/amazon-machine-learning-make-data-drivendecisions-at-scale 2015. Accessed: 2018-08-19.
[BBB+16] Elaine B. Barker William C. Barker William E. Burr W. Timothy Polk and Miles E. Smid. Sp 800-57. recommendation for key management part 1: General (revised). Technical report Gaithersburg MD United States 2016.
[BDK+18] Niklas Büscher Daniel Demmler Stefan Katzenbeisser David Kretzmer and Thomas Schneider. HyCC: Compilation of hybrid protocols for practical secure computation. In ACM Computer and Communications Security (CCS’18) pages 847–861. ACM 2018.
[Bea95] Donald Beaver. Precomputing oblivious transfer. In Advances in Cryptology – CRYPTO’95 volume 963 of LNCS pages 97–109. Springer 1995.
[BFK+09] Mauro Barni Pierluigi Failla Vladimir Kolesnikov Riccardo Lazzeretti Ahmad-Reza Sadeghi and Thomas Schneider. Secure evaluation of private linear branching programs with medical applications. In European Symposium on Research in Computer Security (ESORICS’09) volume 5789 of LNCS pages 424–439. Springer 2009.
[BFL+11] Mauro Barni Pierluigi Failla Riccardo Lazzeretti Ahmad-Reza Sadeghi and Thomas Schneider. Privacy-preserving ECG classification with branching programs and neural networks. IEEE Transactions on Information Forensics and Security 6(2):452–468 2011.
[BFR+18] Ferdinand Brasser Tommaso Frassetto Korbinian Riedhammer Ahmad-Reza Sadeghi Thomas Schneider and Christian Weinert. Voiceguard: Secure and private speech processing. In Annual Conference of the International Speech Communication Association (INTERSPEECH’18) pages 1303–1307. ISCA 2018.
[BHKR13] Mihir Bellare Viet Tung Hoang Sriram Keelveedhi and Phillip Rogaway. Efficient garbling from a fixed-key blockcipher. In IEEE Symposium on Security and Privacy (S&P’13) pages 478–492. IEEE 2013.
[Big18] Inc. BigML. Machine learning made easy beautiful and understandable. https://bigml.com/ 2018. Accessed: 2018-08-24.
[BJL12] Dan Bogdanov Roman Jagomägis and Sven Laur. A universal toolkit for cryptographically secure privacy-preserving data mining. In Pacific Asia Workshop on Intelligence and Security Informatics (PAISI’12) volume 7299 of LNCS pages 112–126. Springer 2012.
[BMR90] Donald Beaver Silvio Micali and Phillip Rogaway. The round complexity of secure protocols (extended abstract). In ACM Symposium on Theory of Computing (STOC’90) pages 503–513. ACM 1990.
[BPSW07] Justin Brickell Donald E. Porter Vitaly Shmatikov and Emmett Witchel. Privacy-preserving remote diagnostics. In ACM Computer and Communications Security (CCS’07) pages 498–507. ACM 2007.
[BPTG15] Raphael Bost Raluca Ada Popa Stephen Tu and Shafi Goldwasser. Machine learning classification over encrypted data. In Network and Distributed System Security Symposium (NDSS’15). The Internet Society 2015.
[BS09] Justin Brickell and Vitaly Shmatikov. Privacy-preserving classifier learning. In Financial Cryptography and Data Security (FC’09) volume 5628 of LNCS pages 128–147. Springer 2009.
[BSR18] Diogo Barradas Nuno Santos and Luís Rodrigues. Effective detection of multimedia protocol tunneling using machine learning. In USENIX Security Symposium’18 pages 169–185. USENIX 2018.
[CDH+17] Martine De Cock Rafael Dowsley Caleb Horst Raj Katti Anderson C. A. Nascimento Wing-Sea Poon and Stacey C. Truex. Efficient and private scoring of decision trees support vector machines and logistic regression models based on pre-computation. IEEE Transactions on Dependable and Secure Computing To appear. 2017.
[CO18] Michele Ciampi and Claudio Orlandi. Combining private set-intersection with secure two-party computation. In Security and Cryptography for Networks (SCN’18) volume 11035 of Lecture Notes in Computer Science pages 464–482. Springer 2018.
[DCBA14] Manuel Fernández Delgado Eva Cernadas Senén Barro and Dinani Gomes Amorim. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research 15(1):3133–3181 2014.
[DGK07] Ivan Damgård Martin Geisler and Mikkel Krøigaard. Efficient and secure comparison for on-line auctions. In Australasian Conference on Information Security and Privacy (ACISP’07) volume 4586 of LNCS pages 416–430. Springer 2007.
[DGK08] Ivan Damgård Martin Geisler and Mikkel Krøigaard. Homomorphic encryption and secure comparison. International Journal of Applied Cryptography 1(1):22–31 2008.
[DGK09] Ivan Damgård Martin Geisler and Mikkel Krøigaard. A correction to ’Efficient and secure comparison for on-line auctions’. International Journal of Advanced Computer Technology (IJACT) 1(4):323–324 2009.
[DJ01] Ivan Damgård and Mads Jurik. A generalisation a simplification and some applications of Paillier’s probabilistic public-key system. In Public Key Cryptography (PKC’01) volume 1992 of LNCS pages 119–136. Springer 2001.
[DSZ15] Daniel Demmler Thomas Schneider and Michael Zohner. ABY - A framework for efficient mixed-protocol secure two-party computation. In Network and Distributed System Security Symposium (NDSS’15). The Internet Society 2015.
[ElG85] Taher ElGamal. A public key cryptosystem and a signature scheme based on discrete logarithms. In Advances in Cryptology – CRYPTO’85 volume 196 of LNCS pages 10–18. Springer 1985.
[FW12] Pui Kuen Fong and Jens H. Weber-Jahnke. Privacy preserving decision tree learning using unrealized data sets. IEEE Transactions on Knowledge and Data Engineering 24(2):353–364 2012.
[GBC+97] Michael D. Garris James L. Blue Gerald T. Candela Patrick J. Grother Stanley Janet and Charles L. Wilson. NIST form-based handprint recognition system (release 2.0). Interagency/Internal Report (NISTIR) - 5959 1997.
[GKG+18] Srishti Gupta Abhinav Khattar Arpit Gogia Ponnurangam Kumaraguru and Tanmoy Chakraborty. Collective classification of spam campaigners on Twitter: A hierarchical meta-path based approach. In World Wide Web Conference on World Wide Web (WWW’18) pages 529–538. ACM 2018.
[GKS17] Daniel Günther Ágnes Kiss and Thomas Schneider. More efficient universal circuit constructions. In Advances in Cryptology – ASIACRYPT’17 volume 10625 of LNCS pages 443–470. Springer 2017.
[HEK12] Yan Huang David Evans and Jonathan Katz. Private set intersection: Are garbled circuits better than custom protocols? In Network and Distributed System Security Symposium (NDSS’12). The Internet Society 2012.
[IK17] Aleksandar Ilic and Oleksandr Kuvshynov. Evaluating boosted decision trees for billions of users. https://code.facebook.com/posts/975025089299409/evaluating-boosted-decision-trees-for-billions-ofusers 2017. Accessed: 2018-08-19.
[IKNP03] Yuval Ishai Joe Kilian Kobbi Nissim and Erez Petrank. Extending oblivious transfers efficiently. In Advances in Cryptology – CRYPTO’03 volume 2729 of LNCS pages 145–161. Springer 2003.
[IP07] Yuval Ishai and Anat Paskin. Evaluating branching programs on encrypted data. In Theory of Cryptography Conference (TCC’07) volume 4392 of LNCS pages 575–594. Springer 2007.
[IR89] Russell Impagliazzo and Steven Rudich. Limits on the provable consequences of one-way permutations. In ACM Symposium on Theory of Computing (STOC’89) pages 44–61. ACM 1989.
[JS18] Marc Joye and Fariborz Salehi. Private yet efficient decision tree evaluation. In Data and Applications Security and Privacy (DBSec’18) volume 10980 of LNCS pages 243–259. Springer 2018.
[JSD+18] Mika Juuti Sebastian Szyller Alexey Dmitrenko Samuel Marchal and N. Asokan. PRADA: protecting against DNN model stealing attacks. CoRR abs/1805.02628 2018.
[JVC18] Chiraag Juvekar Vinod Vaikuntanathan and Anantha Chandrakasan. GAZELLE: A low latency framework for secure neural network inference. In USENIX Security Symposium’18 pages 1651–1669. USENIX 2018.
[KM11] Jonathan Katz and Lior Malka. Constant-round private function evaluation with linear complexity. In Advances in Cryptology – ASIACRYPT’11 volume 7073 of LNCS pages 556–571. Springer 2011.
[KMAM18] Manish Kesarwani Bhaskar Mukhoty Vijay Arya and Sameep Mehta. Model extraction warning in mlaas paradigm. In Annual Computer Security Applications Conference (ACSAC’18) pages 371–380. ACM 2018.
[KOS15] Marcel Keller Emmanuela Orsini and Peter Scholl. Actively secure OT extension with optimal overhead. In Advances in Cryptology – CRYPTO’15 volume 9215 of LNCS pages 724–741. Springer 2015.
[KS08a] Vladimir Kolesnikov and Thomas Schneider. Improved garbled circuit: Free XOR gates and applications. In International Colloquium on Automata Languages and Programming (ICALP’08) volume 5126 of LNCS pages 486–498. Springer 2008.
[KS08b] Vladimir Kolesnikov and Thomas Schneider. A practical universal circuit construction and secure evaluation of private functions. In Financial Cryptography and Data Security (FC’08) volume 5143 of LNCS pages 83–97. Springer 2008.
[KS16] Ágnes Kiss and Thomas Schneider. Valiant’s universal circuit is practical. In Advances in Cryptology – EUROCRYPT’16 volume 9665 of LNCS pages 699–728. Springer 2016.
[KSS09] Vladimir Kolesnikov Ahmad-Reza Sadeghi and Thomas Schneider. Improved garbled circuit building blocks and applications to auctions and computing minima. In Cryptology and Network Security (CANS’09) volume 5888 of LNCS pages 1–20. Springer 2009.
[Lic18] Moshe Lichman. UCI machine learning repository. https://archive.ics.uci.edu/ml. Irvine CA: University of California School of Information and Computer Science 2018. Accessed: 2018-08-24.
[LJLA17] Jian Liu Mika Juuti Yao Lu and N. Asokan. Oblivious neural network predictions via MiniONN transformations. In ACM Computer and Communications Security (CCS’17) pages 619–631. ACM 2017.
[LP00] Yehuda Lindell and Benny Pinkas. Privacy preserving data mining. In Advances in Cryptology – CRYPTO’00 volume 1880 of LNCS pages 36–54. Springer 2000.
[LP02] Yehuda Lindell and Benny Pinkas. Privacy preserving data mining. Journal of Cryptology 15(3):177–206 2002.
[LZS18] Wenjie Lu Jun-Jie Zhou and Jun Sakuma. Non-interactive and output expressive private comparison from homomorphic encryption. In ACM Asia Conference on Computer and Communications Security (AsiaCCS’18) pages 67–74. ACM 2018.
[MAAG15] Michael J. Mayhew Michael Atighetchi Aaron Adler and Rachel Greenstadt. Use of machine learning in big data analytics for insider threat detection. In IEEE Military Communications Conference (MILCOM’15) pages 915–922. IEEE 2015.
[MLJ17] Inc. MLJAR. MLJAR: Machine learning for all. https://mljar.com/ 2016-2017. Accessed: 2018-08-24.
[MS13] Payman Mohassel and Seyed Saeed Sadeghian. How to hide circuits in MPC an efficient framework for private function evaluation. In Advances in Cryptology – EUROCRYPT’13 volume 7881 of LNCS pages 557–574. Springer 2013.
[MSS14] Payman Mohassel Seyed Saeed Sadeghian and Nigel P. Smart. Actively secure private function evaluation. In Advances in Cryptology – ASIACRYPT’14 volume 8874 of LNCS pages 486–505. Springer 2014.
[MZ17] Payman Mohassel and Yupeng Zhang. SecureML: A system for scalable privacy-preserving machine learning. In IEEE Symposium on Security and Privacy (S&P’17) pages 19–38. IEEE 2017.
[NIZ+16] Ana Nika Asad Ismail Ben Y. Zhao Sabrina Gaito Gian Paolo Rossi and Haitao Zheng. Understanding and predicting data hotspots in cellular networks. Mobile Networks and Applications (MONET) 21(3):402–413 2016.
[OTGM18] Rebekah Overdorf Carmela Troncoso Rachel Greenstadt and Damon McCoy. Under the underground: Predicting private interactions in underground forums. CoRR abs/1805.04494 2018.
[Pai99] Pascal Paillier. Public-key cryptosystems based on composite degree residuosity classes. In Advances in Cryptology – EUROCRYPT’99 volume 1592 of LNCS pages 223–238. Springer 1999.
[PMG+17] Nicolas Papernot Patrick D. McDaniel Ian J. Good-fellow Somesh Jha Z. Berkay Celik and Ananthram Swami. Practical black-box attacks against machine learning. In ACM Asia Conference on Computer and Communications Security (AsiaCCS’17) 2017 pages 506–519. ACM 2017.
[RG16] Carl Rabeler and Craig Guyer. Microsoft decision trees algorithm. https://docs.microsoft.com/enus/sql/analysis-services/data-mining/microsoftdecision-trees-algorithm 2016. Accessed: 2018-08-19.
[RMD18] Alejandro Rago Claudia Marcos and J. Andres Diaz-Pace. Using semantic roles to improve text classification in the requirements domain. Language Resources and Evaluation 52(3):801–837 2018.
[RWT+18] M. Sadegh Riazi Christian Weinert Oleksandr Tkachenko Ebrahim M. Songhori Thomas Schneider and Farinaz Koushanfar. Chameleon: A hybrid secure computation framework for machine learning applications. In ACM Asia Conference on Computer and Communications Security (AsiaCCS’18) pages 707–721. ACM 2018.
[Sch08] Thomas Schneider. Practical secure function evaluation. Master’s thesis Friedrich-Alexander University Erlangen-Nürnberg Germany February 27 2008.
[Ser18] Amazon Web Services. Data privacy. https://aws.amazon.com/compliance/data-privacy-faq 2018. Accessed: 2018-08-19.
[SS08] Ahmad-Reza Sadeghi and Thomas Schneider. Generalized universal circuits for secure evaluation of private functions with application to data classification. In Information Security and Cryptology (ICISC’08) volume 5461 of LNCS pages 336–353. Springer 2008.
[SSSS17] Reza Shokri Marco Stronati Congzheng Song and Vitaly Shmatikov. Membership inference attacks against machine learning models. In IEEE Symposium on Security and Privacy (S&P’17) pages 3–18. IEEE 2017.
[TASF09] Ajay Kumar Tanwani M. Jamal Afridi M. Zubair Shafiq and Muddassar Farooq. Guidelines to select machine learning scheme for classification of biomedical datasets. In Evolutionary Computation Machine Learning and Data Mining in Bioinformatics (EvoBIO’09) volume 5483 of LNCS pages 128–139. Springer 2009.
[Ten98] Michel Tenenhaus. La régression PLS: théorie et pratique. Editions technip 1998.
[TKK19] Anselme Tueno Florian Kerschbaum and Stefan Katzenbeisser. Private evaluation of decision trees using sublinear cost. Proceedings on Privacy Enhancing Technologies (PoPETs) 2019(1):266–286 2019.
[TMZC17] Raymond K. H. Tai Jack P. K. Ma Yongjun Zhao and Sherman S. M. Chow. Privacy-preserving decision trees evaluation via linear functions. In European Symposium on Research in Computer Security (ESORICS’17) volume 10493 of LNCS pages 494–512. Springer 2017.
[TZJ+16] Florian Tramèr Fan Zhang Ari Juels Michael K. Reiter and Thomas Ristenpart. Stealing machine learning models via prediction APIs. In USENIX Security Symposium’16 pages 601–618. USENIX 2016.
[Val76] Leslie G. Valiant. Universal circuits (preliminary report). In ACM Symposium on Theory of Computing (STOC’76) pages 196–203. ACM 1976.
[VC05] Jaideep Vaidya and Chris Clifton. Privacy-preserving decision trees over vertically partitioned data. In Data and Applications Security (DBSec’05) volume 3654 of LNCS pages 139–152. Springer 2005.
[VCKP08] Jaideep Vaidya Chris Clifton Murat Kantarcioglu and A. Scott Patterson. Privacy-preserving decision trees over vertically partitioned data. ACM Transactions on Knowledge Discovery from Data (TKDD) 2(3):14:1–14:27 2008.
[Wak68] Abraham Waksman. A permutation network. Journal of the ACM 15(1):159–163 1968.
[WFNL16] David J. Wu Tony Feng Michael Naehrig and Kristin E. Lauter. Privately evaluating decision trees and random forests. Proceedings on Privacy Enhancing Technologies (PoPETs) 2016(4):335–355 2016.
[WGC18] Sameer Wagh Divya Gupta and Nishanth Chandran. SecureNN: Efficient and private neural network training. Cryptology ePrint Archive 2018/442 2018.
[Wis18] Wise.io. Machine learning for the industrial internet of things. wise.io 2018. Accessed: 2018-08-24.
[Yao82] Andrew C.-C. Yao. Protocols for secure computations (extended abstract). In Foundations of Computer Science (FOCS’82) pages 160–164. IEEE 1982.
[Yao86] Andrew C.-C. Yao. How to generate and exchange secrets (extended abstract). In Foundations of Computer Science (FOCS’86) pages 162–167. IEEE 1986.
[YGL17] Fengpeng Yuan Xianyi Gao and Janne Lindqvist. How busy are you?: Predicting the interruptibility intensity of mobile users. In Conference on Human Factors in Computing Systems (CHI’17) pages 5346–5360. ACM 2017.
[ZRE15] Samee Zahur Mike Rosulek and David Evans. Two halves make a whole - reducing data transfer in garbled circuits using half gates. In Advances in Cryptology – EUROCRYPT’15 volume 9057 of LNCS pages 220–250. Springer 2015.