A Bio-Inspired Integration Method for Object Semantic Representation

Open access

Abstract

This work has two motivations. First, the semantic gap is a hard problem that troubles almost every sub-field of Artificial Intelligence. We regard the semantic gap as the conflict between the abstractness of high-level symbolic definitions and the detail and diversity of low-level stimuli. Second, in object recognition, a pre-defined prototype of an object is crucial and indispensable for bi-directional perceptual processing. On the one hand, this prototype is learned from perceptual experience; on the other hand, it should be able to guide future top-down processing. Humans do this very well, so we simulate the underlying physiological mechanism. We use the mechanism of classical and non-classical receptive fields (nCRF) to design a hierarchical model and form a multi-layer prototype of an object. This prototype also serves as a realistic definition of a concept and as a representation that denotes its semantics. We regard this model as the most fundamental infrastructure on which semantics can be grounded. An AND-OR tree is constructed to record the prototypes of a concept; its nodes may hold either raw data at the low level or symbols at the high level, and explicit production rules are also available. For pixel processing, knowledge should be represented in a data form; for scene reasoning, knowledge should be represented in a symbolic form. The physiological mechanism happens to be the bridge that joins the two seamlessly. This offers a possible route to solving the semantic gap problem and prevents discontinuity in low-order structures.
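As a rough illustration of how an AND-OR tree can record a prototype and expose explicit production rules, the following minimal sketch may help. It is not the authors' implementation: the class names `Leaf` and `Node`, the `matches` and `rules` methods, and the toy "cup" prototype are all hypothetical, and low-level data patches are stood in for by plain string tags rather than real image data.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class Leaf:
    """Terminal node: a high-level symbol or a tag standing in for a low-level data patch."""
    name: str

    def matches(self, evidence: Set[str]) -> bool:
        # A leaf is satisfied when its tag was observed in the evidence set.
        return self.name in evidence

    def rules(self) -> List[str]:
        # Terminals contribute no production rules.
        return []


@dataclass
class Node:
    """Internal node: an AND node needs every child, an OR node needs at least one."""
    name: str
    kind: str  # "AND" or "OR" (hypothetical encoding)
    children: List[Union["Node", Leaf]] = field(default_factory=list)

    def matches(self, evidence: Set[str]) -> bool:
        results = [child.matches(evidence) for child in self.children]
        return all(results) if self.kind == "AND" else any(results)

    def rules(self) -> List[str]:
        # Read the tree top-down as explicit production rules.
        separator = " AND " if self.kind == "AND" else " OR "
        out = [f"{self.name} -> " + separator.join(c.name for c in self.children)]
        for child in self.children:
            out.extend(child.rules())
        return out


# Toy prototype: a cup is a body together with a handle on either side.
cup = Node("cup", "AND", [
    Leaf("body"),
    Node("handle", "OR", [Leaf("handle_left"), Leaf("handle_right")]),
])

found_with_handle = cup.matches({"body", "handle_left"})
found_without_handle = cup.matches({"body"})
cup_rules = cup.rules()
```

In this sketch the same structure is read two ways: `matches` checks observed evidence against the prototype bottom-up, while `rules` lists the symbolic productions that could guide top-down processing.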

Journal of Artificial Intelligence and Soft Computing Research

The Journal of Polish Neural Network Society, the University of Social Sciences in Lodz & Czestochowa University of Technology

Journal Information

CiteScore 2017: 5.00

SCImago Journal Rank (SJR) 2017: 0.492
Source Normalized Impact per Paper (SNIP) 2017: 2.813
