Hand gesture recognition based on free-form contours and probabilistic inference
A computer vision system is described that captures color image sequences, detects and recognizes static hand poses (i.e., "letters"), and interprets pose sequences as gestures (i.e., "words"). The hand is detected with a double active contour method. Tracking the hand pose over a short image sequence allows the detection of "modified poses", analogous to diacritic letters in national alphabets. The set of static hand poses corresponds to the hand signs of a thumb alphabet. Finally, by tracking hand poses over a longer image sequence, the pose sequence is interpreted in terms of gestures. Dynamic Bayesian models and their inference methods (particle filtering and Viterbi search) are applied at this stage, allowing a bi-driven control of the entire system.
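To illustrate the Viterbi search mentioned above, the following is a minimal sketch of decoding the most probable sequence of hidden "letters" from a sequence of noisy per-frame pose labels. It is not the system's actual model: the two states, the observation symbols, and all probabilities are hypothetical values chosen only for illustration, and a plain hidden Markov model stands in for the dynamic Bayesian models of the paper.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_path, log_probability) for a discrete hidden Markov model."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of state s on the best path at time t
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]

def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: (to_log(v) if isinstance(v, dict) else math.log(v))
            for k, v in table.items()}

# Hypothetical toy model: two letters, "sticky" transitions, noisy pose classifier
states = ["A", "B"]
log_start = to_log({"A": 0.5, "B": 0.5})
log_trans = to_log({"A": {"A": 0.8, "B": 0.2}, "B": {"A": 0.2, "B": 0.8}})
log_emit = to_log({"A": {"a": 0.9, "b": 0.1}, "B": {"a": 0.1, "b": 0.9}})

# The middle frame is classified as "b", but the sequence model smooths it out
path, log_prob = viterbi(["a", "b", "a"], states, log_start, log_trans, log_emit)
print(path)  # ['A', 'A', 'A']
```

In this toy setting the isolated "b" observation is treated as classifier noise, because staying in state "A" is more probable than switching to "B" and back. Working in log-probabilities avoids numerical underflow on longer sequences.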