Mimicking speaker’s lip movement on a 3D head model using cosine function fitting

Open access

Abstract

Real-time mimicking of human facial movement on a 3D head model is a challenging task that has attracted the attention of many researchers. In this work we propose a new method for improving the capture of lip shape. We present an automatic lip-movement tracking method that fits a cosine function to interpolate between extracted lip features, making the detection more accurate. To evaluate the proposed method, we study the mimicking of a speaker's lip movements on a 3D head model. A Microsoft Kinect II is used to capture the videos; both RGB and depth information serve to locate the speaker's mouth, after which a cosine function is fitted to track changes in the features extracted from the lips.
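The core idea described above, interpolating between sparse lip feature points by fitting a cosine function, can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation: the function names, the model parametrization, and the synthetic contour data are all assumptions made for the example.

```python
# Illustrative sketch: fit a parametric cosine to sparse lip feature points
# (e.g. contour heights sampled along the mouth width), then evaluate it
# densely to interpolate the lip shape. Hypothetical names and data.
import numpy as np
from scipy.optimize import curve_fit

def cosine_model(x, amplitude, frequency, phase, offset):
    """Parametric cosine used to interpolate between lip feature points."""
    return amplitude * np.cos(frequency * x + phase) + offset

def fit_lip_contour(x_feat, y_feat):
    """Fit the cosine model to sparse lip feature points.

    x_feat: horizontal feature positions, normalized to [0, 1]
    y_feat: corresponding vertical positions (contour height)
    Returns the fitted (amplitude, frequency, phase, offset).
    """
    # Initial guess: half the vertical range, one half-period across the mouth
    p0 = [(y_feat.max() - y_feat.min()) / 2, np.pi, 0.0, y_feat.mean()]
    params, _ = curve_fit(cosine_model, x_feat, y_feat, p0=p0)
    return params

# Synthetic example: 8 sparse samples of an idealized upper-lip contour
x = np.linspace(0.0, 1.0, 8)
y = 0.3 * np.cos(np.pi * x + 0.1) + 1.0
params = fit_lip_contour(x, y)

# Dense interpolation along the mouth width gives a smooth contour
# that can drive the corresponding vertices of the 3D head model.
x_dense = np.linspace(0.0, 1.0, 100)
y_dense = cosine_model(x_dense, *params)
```

In a real pipeline the sparse `(x_feat, y_feat)` samples would come from the mouth region located in the Kinect RGB and depth streams, and the fitted curve would be re-estimated every frame to track the lip movement.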

Bulletin of the Polish Academy of Sciences Technical Sciences