From Distributional Semantics to Conceptual Spaces: A Novel Computational Method for Concept Creation

School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Road, London E1 4NS, UK


We investigate the relationship between lexical spaces and contextually defined conceptual spaces, with applications to creative concept discovery. We define a computational method for discovering members of concepts based on semantic spaces: starting with a standard distributional model derived from corpus co-occurrence statistics, we dynamically select characteristic dimensions associated with seed terms, and thereby a subspace of terms that defines the related concept. This approach performs as well as, and in some cases better than, leading distributional semantic models on a WordNet-based concept discovery task, while also providing a model of concepts as convex regions within a space with interpretable dimensions. In particular, it performs well on more specific, contextualized concepts. To investigate this, we move beyond WordNet to a set of human empirical studies in which we compare model output against human responses on a membership task for novel concepts. Finally, a separate panel of judges rates both model output and human responses. The ratings are similar in many cases, and the commonalities and divergences between them reveal interesting issues for computational concept discovery.
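The general idea of selecting dimensions from seed terms and then searching the resulting subspace can be illustrated with a toy sketch. This is not the authors' implementation: the term weights, the dimension names, the selection rule (highest mean weight across seeds), and the ranking rule (Euclidean distance to the seed centroid) are all assumptions chosen for illustration, and the function names `select_dimensions` and `discover` are hypothetical.

```python
from math import sqrt

# Toy term-by-context weights (made-up numbers standing in for
# co-occurrence statistics). Each list position is one dimension.
DIMENSIONS = ["bark", "leash", "purr", "engine", "wheel"]
SPACE = {
    "dog":   [5.0, 4.0, 0.0, 0.0, 0.0],
    "wolf":  [4.0, 0.5, 0.0, 0.0, 0.0],
    "puppy": [4.5, 4.5, 0.5, 0.0, 0.0],
    "cat":   [0.0, 0.5, 5.0, 0.0, 0.0],
    "car":   [0.0, 0.0, 0.5, 5.0, 4.0],
    "truck": [0.0, 0.0, 0.0, 4.5, 5.0],
}


def select_dimensions(seeds, k):
    """Return indices of the k dimensions with the highest mean seed weight.

    Assumption: 'characteristic' is approximated by mean weight.
    """
    n_dims = len(DIMENSIONS)
    means = [sum(SPACE[s][d] for s in seeds) / len(seeds) for d in range(n_dims)]
    ranked = sorted(range(n_dims), key=lambda d: means[d], reverse=True)
    return ranked[:k]


def project(term, dims):
    """Restrict a term's vector to the selected dimensions."""
    return [SPACE[term][d] for d in dims]


def discover(seeds, k=2, top_n=2):
    """Rank non-seed terms by distance to the seed centroid in the subspace."""
    dims = select_dimensions(seeds, k)
    seed_points = [project(s, dims) for s in seeds]
    centroid = [sum(col) / len(seed_points) for col in zip(*seed_points)]

    def dist(term):
        point = project(term, dims)
        return sqrt(sum((a - b) ** 2 for a, b in zip(point, centroid)))

    candidates = [t for t in SPACE if t not in seeds]
    return sorted(candidates, key=dist)[:top_n]


if __name__ == "__main__":
    dims = select_dimensions(["dog", "puppy"], 2)
    print([DIMENSIONS[d] for d in dims])
    print(discover(["dog", "puppy"], k=2, top_n=2))
```

Because the selected dimensions are named contexts rather than latent factors, the subspace stays interpretable: with the seeds `dog` and `puppy`, the sketch picks `bark` and `leash`, and candidate members are judged only on those two axes.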



