An English Neural Network that Learns Texts, Finds Hidden Knowledge, and Answers Questions

Yuanzhi Ke 1 and Masafumi Hagiwara 1
  • 1 Graduate School of Science and Technology, Keio University, Japan


This paper proposes a novel neural network that automatically learns and recalls the contents of texts and answers questions about those contents, whether the source is a large corpus or a short passage. The proposed network combines parse trees, semantic networks, and inference models. It contains layers corresponding to sentences, clauses, phrases, words, and synonym sets. The neurons in the phrase layer and the word layer are labeled with their parts of speech and semantic roles. The network organizes itself automatically to represent the contents of a given text. Its structure and algorithms use the labels and the synonym-set neurons to relate sentences about similar things. In the experiments, the proposed network with both the labels and the synonym sets outperforms variants that lack the labels or the synonym sets but are otherwise identical in structure and algorithms. The experiments also show that the proposed network tolerates noise, answers factoid questions, and solves single-choice questions from an exercise book for non-native English learners.


