Hand gesture recognition based on free-form contours and probabilistic inference

Open access

A computer vision system is described that captures color image sequences, detects and recognizes static hand poses (i.e., "letters"), and interprets pose sequences in terms of gestures (i.e., "words"). The hand is detected with a double active contour method. Tracking the hand pose over a short sequence allows the detection of "modified poses", such as the diacritic letters of national alphabets. The set of static hand poses corresponds to the hand signs of a thumb alphabet. Finally, by tracking hand poses over a longer image sequence, the pose sequence is interpreted in terms of gestures. Dynamic Bayesian models and their inference methods (particle filter and Viterbi search) are applied at this stage, allowing a bi-driven control of the entire system.
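
To illustrate the Viterbi search mentioned above, the following is a minimal sketch of decoding a noisy sequence of per-frame pose labels into the most probable sequence of underlying poses. It is not the authors' model: the two pose states (`"A"`, `"B"`), the observation symbols (`"a"`, `"b"`), and all probabilities are hypothetical values chosen only for the example, and the helper name `viterbi` is likewise an assumption.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Backtrack from the best final state to recover the full path.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-pose model (illustrative numbers, not taken from the paper).
states = ["A", "B"]
log_start = {s: math.log(0.5) for s in states}
# Poses tend to persist between frames: stay with 0.8, switch with 0.2.
log_trans = {p: {s: math.log(0.8 if p == s else 0.2) for s in states} for p in states}
# The per-frame classifier reports the correct pose with probability 0.9.
log_emit = {
    "A": {"a": math.log(0.9), "b": math.log(0.1)},
    "B": {"a": math.log(0.1), "b": math.log(0.9)},
}

# A single misclassified frame ("b") inside a run of "a" frames.
frames = ["a", "a", "b", "a", "a"]
decoded = viterbi(frames, states, log_start, log_trans, log_emit)
print(decoded)
```

With these assumed numbers, the isolated `"b"` frame is treated as a classification error and the decoded path stays in pose `"A"` throughout, which is the kind of temporal smoothing a sequence model adds on top of frame-by-frame pose recognition.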
