The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci., XLII-2/W4, 155–161, 2017
https://doi.org/10.5194/isprs-archives-XLII-2-W4-155-2017

  10 May 2017

PARAMETRIC REPRESENTATION OF THE SPEAKER’S LIPS FOR MULTIMODAL SIGN LANGUAGE AND SPEECH RECOGNITION

D. Ryumin1,2 and A. A. Karpov1,2
  • 1 St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences (SPIIRAS), Saint-Petersburg, Russian Federation
  • 2 ITMO University, Saint-Petersburg, Russian Federation

Keywords: Sign language, Gestures, Speech recognition, Computer Vision, Principal Component Analysis, Machine learning, Face detection, Linear contrasting

Abstract. In this article, we propose a new method for the parametric representation of the human lip region. We describe the functional diagram of the method and give implementation details, explaining its key stages and features. The results of automatic detection of the regions of interest are illustrated. The processing speed of the method is reported for several computers of differing performance. This universal method allows the parametric representation of the speaker’s lips to be applied to tasks in biometrics, computer vision, and machine learning, and to the automatic recognition of faces, elements of sign languages, and audio-visual speech, including lip-reading.