Decision trees are widespread machine learning models used for data classification and have many applications in areas such as healthcare, remote diagnostics, spam filtering, etc. In this paper, we address the problem of privately evaluating a decision tree on private data. In this scenario, the server holds a private decision tree model and the client wants to classify its private attribute vector using the server’s private model. The goal is to obtain the classification while preserving the privacy of both – the decision tree and the client input. After the computation, only the classification result is revealed to the client, while nothing is revealed to the server. Many existing protocols require a constant number of rounds. However, some of these protocols perform as many comparisons as there are decision nodes in the entire tree and others transform the whole plaintext decision tree into an oblivious program, resulting in higher communication costs. The main idea of our novel solution is to represent the tree as an array. Then we execute only d – the depth of the tree – comparisons. Each comparison is performed using a small garbled circuit, which output secret-shares of the index of the next node. We get the inputs to the comparison by obliviously indexing the tree and the attribute vector. We implement oblivious array indexing using either garbled circuits, Oblivious Transfer or Oblivious RAM (ORAM). Using ORAM, this results in the first protocol with sub-linear cost in the size of the tree. We implemented and evaluated our solution using the different array indexing procedures mentioned above. As a result, we are not only able to provide the first protocol with sublinear cost for large trees, but also reduce the communication cost for the large real-world data set “Spambase” from 18 MB to 1[triangleright]2 MB and the computation time from 17 seconds to less than 1 second in a LAN setting, compared to the best related work.
Secret sharing schemes are desirable across a variety of real-world settings due to the security and privacy properties they can provide, such as availability and separation of privilege. However, transitioning secret sharing schemes from theoretical research to practical use must account for gaps in achieving these properties that arise due to the realities of concrete implementations, threat models, and use cases. We present a formalization and analysis, using Ellison’s notion of ceremonies, that demonstrates how simple variations in use cases of secret sharing schemes result in the potential loss of some security properties, a result that cannot be derived from the analysis of the underlying cryptographic protocol alone. Our framework accounts for such variations in the design and analysis of secret sharing implementations by presenting a more detailed user-focused process and defining previously overlooked assumptions about user roles and actions within the scheme to support analysis when designing such ceremonies. We identify existing mechanisms that, when applied to an appropriate implementation, close the security gaps we identified. We present our implementation including these mechanisms and a corresponding security assessment using our framework.
Encrypting data before sending it to the cloud protects it against attackers, but requires the cloud to compute on encrypted data. Trusted modules, such as SGX enclaves, promise to provide a secure environment in which data can be decrypted and then processed. However, vulnerabilities in the executed program, which becomes part of the trusted code base (TCB), give attackers ample opportunity to execute arbitrary code inside the enclave. This code can modify the dataflow of the program and leak secrets via SGX side-channels. Since any larger code base is rife with vulnerabilities, it is not a good idea to outsource entire programs to SGX enclaves. A secure alternative relying solely on cryptography would be fully homomorphic encryption. However, due to its high computational complexity it is unlikely to be adopted in the near future. Researchers have made several proposals for transforming programs to perform encrypted computations on less powerful encryption schemes. Yet current approaches do not support programs making control-flow decisions based on encrypted data.
We introduce the concept of dataflow authentication (DFAuth) to enable such programs. DFAuth prevents an adversary from arbitrarily deviating from the dataflow of a program. Our technique hence offers protections against the side-channel attacks described above. We implemented DFAuth using a novel authenticated homomorphic encryption scheme, a Java bytecode-tobytecode compiler producing fully executable programs, and an SGX enclave running a small and program-independent TCB. We applied DFAuth to an existing neural network that performs machine learning on sensitive medical data. The transformation yields a neural network with encrypted weights, which can be evaluated on encrypted inputs in 0.86 s.