Qiaozhi Wang, Hao Xue, Fengjun Li, Dongwon Lee and Bo Luo
With the growing popularity of online social networks, a large amount of private or sensitive information has been posted online. In particular, studies show that users sometimes reveal too much information or unintentionally release regretful messages, especially when they are careless, emotional, or unaware of privacy risks. As such, there exist great needs to be able to identify potentially-sensitive online contents, so that users could be alerted with such findings. In this paper, we propose a context-aware, text-based quantitative model for private information assessment, namely PrivScore, which is expected to serve as the foundation of a privacy leakage alerting mechanism. We first solicit diverse opinions on the sensitiveness of private information from crowdsourcing workers, and examine the responses to discover a perceptual model behind the consensuses and disagreements. We then develop a computational scheme using deep neural networks to compute a context-free PrivScore (i.e., the “consensus” privacy score among average users). Finally, we integrate tweet histories, topic preferences and social contexts to generate a personalized context-aware PrivScore. This privacy scoring mechanism could be employed to identify potentially-private messages and alert users to think again before posting them to OSNs.