Third-party apps that work on top of personal cloud services, such as Google Drive and Dropbox, require access to the user’s data in order to provide some functionality. Through detailed analysis of a hundred popular Google Drive apps from Google’s Chrome store, we discover that the existing permission model is quite often misused: around two-thirds of the analyzed apps are over-privileged, i.e., they access more data than they need to function. In this work, we analyze three different permission models that aim to discourage users from installing over-privileged apps. In experiments with 210 real users, we discover that the most successful permission model is our novel ensemble method that we call Far-reaching Insights. Far-reaching Insights inform users about the data-driven insights that apps can make about them (e.g., their topics of interest, collaboration and activity patterns, etc.). Thus, they seek to bridge the gap between what third parties can actually know about users and users’ perception of their privacy leakage. The efficacy of Far-reaching Insights in bridging this gap is demonstrated by our results, as Far-reaching Insights prove to be, on average, twice as effective as the current model in discouraging users from installing over-privileged apps. In an effort to promote general privacy awareness, we deployed PrivySeal, a publicly available privacy-focused app store that uses Far-reaching Insights. Based on the knowledge extracted from data of the store’s users (over 115 gigabytes of Google Drive data from 1440 users with 662 installed apps), we also delineate the ecosystem for third-party cloud apps from the standpoint of developers and cloud providers. Finally, we present several general recommendations that can guide future work in the area of privacy for the cloud. To the best of our knowledge, ours is the first work that tackles the privacy risk posed by third-party apps on cloud platforms in such depth.