Deep learning has been successful in various domains, including image recognition, speech recognition, and natural language processing. However, research on its application to graph mining is still at an early stage. Here we present Model R, a neural network model that provides a deep learning approach to the link weight prediction problem. The model uses a node embedding technique that extracts node embeddings (knowledge of nodes) from the known links’ weights (relations between nodes) and uses this knowledge to predict the unknown links’ weights. We evaluate Model R experimentally and compare it with the stochastic block model and its derivatives. The results show that deep learning can be applied successfully to link weight prediction: Model R outperforms the stochastic block model and its derivatives by up to 73% in terms of prediction accuracy. We also analyze the node embeddings and confirm that closeness in the embedding space correlates with stronger relationships as measured by link weight. We anticipate that this approach will provide effective solutions to more graph mining tasks.
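To make the idea of learning node embeddings from known link weights concrete, the following is a minimal sketch in NumPy. It is not the Model R architecture, which this abstract does not specify. It assumes, purely for illustration, separate source and destination embedding tables, one ReLU hidden layer, a linear output, squared-error loss, and full-batch gradient descent. The function names (`train_link_weight_model`, `predict_link_weights`), the hyperparameters, and the toy graph are all hypothetical.

```python
import numpy as np


def train_link_weight_model(edges, weights, num_nodes, dim=4, hidden=8,
                            epochs=500, lr=0.05, seed=0):
    """Fit node embeddings and a small regressor on known link weights.

    Illustrative assumptions (not taken from the paper): separate source and
    destination embedding tables, one ReLU hidden layer, mean squared error.
    """
    rng = np.random.default_rng(seed)
    src, dst = edges[:, 0], edges[:, 1]
    E_src = rng.normal(0.0, 0.1, (num_nodes, dim))   # source-role embeddings
    E_dst = rng.normal(0.0, 0.1, (num_nodes, dim))   # destination-role embeddings
    W1 = rng.normal(0.0, 0.1, (2 * dim, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, 1))
    b2 = np.zeros(1)
    n = len(weights)

    for _ in range(epochs):
        # Forward pass: look up and concatenate the two node embeddings.
        x = np.concatenate([E_src[src], E_dst[dst]], axis=1)
        z = x @ W1 + b1
        h = np.maximum(z, 0.0)
        pred = (h @ W2 + b2).ravel()

        # Backward pass for the mean squared error.
        g_pred = (2.0 / n) * (pred - weights)[:, None]
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (z > 0)
        g_W1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)
        g_x = g_z @ W1.T

        # Gradient descent step on the regressor parameters.
        W2 -= lr * g_W2
        b2 -= lr * g_b2
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        # ufunc.at accumulates updates for nodes that occur in several links.
        np.subtract.at(E_src, src, lr * g_x[:, :dim])
        np.subtract.at(E_dst, dst, lr * g_x[:, dim:])

    return {"E_src": E_src, "E_dst": E_dst,
            "W1": W1, "b1": b1, "W2": W2, "b2": b2}


def predict_link_weights(params, edges):
    """Predict a weight for each (source, destination) pair in `edges`."""
    x = np.concatenate([params["E_src"][edges[:, 0]],
                        params["E_dst"][edges[:, 1]]], axis=1)
    h = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return (h @ params["W2"] + params["b2"]).ravel()


# Toy directed graph with made-up weights (hypothetical data).
edges = np.array([[0, 1], [1, 2], [2, 0], [0, 2], [1, 0]])
weights = np.array([0.9, 0.2, 0.5, 0.4, 0.8])

params = train_link_weight_model(edges, weights, num_nodes=3)
print(predict_link_weights(params, np.array([[2, 1]])))  # an unseen link
```

After training, the rows of `E_src` and `E_dst` play the role of the node embeddings, so distances between them can be inspected in the same spirit as the embedding analysis mentioned above. A practical implementation would more likely rely on a framework with automatic differentiation and mini-batch training than on hand-written gradients.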