My Genome Belongs to Me: Controlling Third Party Computation on Genomic Data

Open access

Abstract

An individual’s genetic information is possibly the most valuable personal information there is. While knowledge of a person’s DNA sequence can facilitate the diagnosis of several heritable diseases and allow personalized treatment, its exposure poses significant threats to the patient’s privacy. Currently known solutions for privacy-respecting computation require the owner of the DNA either to be heavily involved in the execution of a cryptographic protocol or to outsource access control completely to a third party. This motivates the demand for cryptographic protocols that enable computation over encrypted genomic data while keeping the owner of the genome in full control. We envision a scenario where data owners can enforce arbitrary and dynamic access policies, depending on the intended use of the analysis results and on the credentials of whoever is conducting the analysis. At the same time, data owners are not required to maintain a local copy of their entire genetic data and do not need to exhaust their computational resources in an expensive cryptographic protocol.

In this work, we present METIS, a system that supports computation over encrypted data stored in the cloud while leaving the decision on admissible computations to the data owner. It is based on garbled circuits and supports any polynomial-time computable function. A critical feature of our system is that the data owner is spared heavy computation, and her communication complexity is independent of the size of the input data and only linear in the size of the circuit’s output. We demonstrate the practicality of our approach with an implementation and an evaluation of several functions over real datasets.
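For readers unfamiliar with garbled circuits, the following is a minimal textbook-style sketch of garbling and evaluating a single AND gate. It is not the METIS construction, which the abstract does not spell out. The names `garble_and_gate` and `evaluate`, the use of SHA-256 as the row-encryption function, and the 16-byte label length are all assumptions chosen for illustration.

```python
import hashlib
import secrets

LABEL_LEN = 16              # bytes per wire label (toy parameter, an assumption)
TAG = b"\x00" * LABEL_LEN   # zero padding that marks a correctly decrypted row


def _pad(label_a: bytes, label_b: bytes) -> bytes:
    """Derive a 32-byte one-time pad from two input-wire labels."""
    return hashlib.sha256(label_a + label_b).digest()


def _xor(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(a ^ b for a, b in zip(x, y))


def garble_and_gate():
    """Garble one AND gate; return (wire labels, shuffled garbled table)."""
    # Each wire gets two random labels: index 0 encodes bit 0, index 1 encodes bit 1.
    labels = {
        wire: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
        for wire in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][bit_a & bit_b]
            pad = _pad(labels["a"][bit_a], labels["b"][bit_b])
            # Encrypt the output label (plus the tag) under the two input labels.
            table.append(_xor(pad, out_label + TAG))
    # Shuffle so that row position does not reveal the input bits.
    secrets.SystemRandom().shuffle(table)
    return labels, table


def evaluate(table, label_a: bytes, label_b: bytes) -> bytes:
    """Recover the output label from one label per input wire."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_LEN:] == TAG:   # only the matching row decrypts to the tag
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted correctly")


if __name__ == "__main__":
    labels, table = garble_and_gate()
    result = evaluate(table, labels["a"][1], labels["b"][1])
    print(result == labels["out"][1])
```

The evaluator holds one label per input wire and learns one output label, but cannot tell which bit any label encodes without the garbler's mapping. Practical systems replace the trial decryption used here with optimizations such as point-and-permute and free-XOR.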

