
Lip language recognition method fusing channel attention and selective feature fusion mechanism

A feature-fusion and attention technique, applied to lip language recognition with deep neural networks, that addresses low recognition accuracy and insufficient feature extraction from the lip region, yielding more accurate lip reading.

Active Publication Date: 2021-06-25
HEFEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

Relatively few studies address sentence-level lip recognition, that is, prediction of complete sentence sequences. Existing models do not fully extract the features of the lip region, so recognition accuracy remains low and leaves room for improvement.



Examples


Embodiment Construction

[0028] In this embodiment, a lip language recognition method that fuses channel attention with a selective feature fusion mechanism identifies what a speaker says from the movement of the speaker's lip region in a video and maps it to text, achieving lip reading based on deep learning. The sentence-level lip language recognition dataset GRID is downloaded, and images of the speaker's lip region are obtained through facial feature detection and processing. A complete lip language recognition model is then built, with batch normalization used to speed up training and the fused channel attention mechanism used to improve the model's performance. The Adam optimization algorithm updates and optimizes the model parameters, and the final trained model is used to recognize the content expressed by the speaker according to the movement of the speak...
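
The excerpt does not spell out the form of the channel attention mechanism. As a minimal sketch, assuming a squeeze-and-excitation style gate (an assumption, not confirmed by the source), each channel of a 3D feature map is averaged over time and space, passed through a small bottleneck network, and rescaled by the resulting weight. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy values are all hypothetical.

```python
import math


def channel_attention(features, w1, w2):
    """Rescale each channel by a learned gate (squeeze-and-excitation style).

    features: list of C channels, each a flat list of activations
              (the T*H*W values of one channel of a 3D feature map).
    w1:       hidden x C weight matrix of the bottleneck layer (hypothetical).
    w2:       C x hidden weight matrix of the expansion layer (hypothetical).
    Returns (rescaled_features, gates).
    """
    # Squeeze: global average over time and space, one scalar per channel.
    squeezed = [sum(channel) / len(channel) for channel in features]

    # Excitation, step 1: bottleneck projection followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]

    # Excitation, step 2: expansion followed by a sigmoid, giving one
    # gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]

    # Scale: multiply every activation of a channel by that channel's gate.
    rescaled = [[g * v for v in channel] for g, channel in zip(gates, features)]
    return rescaled, gates


# Toy input: 3 channels with 2 activations each, hidden size 1.
features = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
w1 = [[0.5, 0.5, 0.5]]
w2 = [[1.0], [0.0], [-1.0]]
rescaled, gates = channel_attention(features, w1, w2)
```

In a real model the weights would be trained together with the 3D convolutions; the sketch only shows how per-channel gates emphasize or suppress feature channels.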



Abstract

The invention discloses a lip language recognition method fusing channel attention and a selective feature fusion mechanism. The method comprises four steps: (1) downloading the GRID data set used to train the model and preprocessing it; (2) building a lip language recognition network and selecting a suitable objective function to optimize the model parameters; (3) evaluating the model with corresponding evaluation metrics; and (4) performing lip language recognition on video with the trained model. The method encodes the input video frames with a stacked 3D convolutional neural network, a selective feature fusion network, and a bidirectional GRU network, adds a channel attention mechanism between every two 3D convolutional layers, and finally generates the output text with a CTC decoder, so that the speaker's lip region features are learned better and lip reading becomes more accurate.
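
The abstract names a CTC decoder but does not say which decoding strategy it uses. A minimal sketch, assuming greedy (best-path) decoding: take the highest-scoring symbol in each frame, collapse consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the toy alphabet, and the scores are hypothetical and are not taken from the patent.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding: argmax per frame, merge repeats, remove blanks.

    frame_scores: list of per-frame score lists, one score per symbol.
    alphabet:     list mapping symbol index to output character;
                  the entry at index `blank` is never emitted.
    """
    # Best symbol index for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]

    decoded = []
    previous = None
    for index in best_path:
        # Emit a symbol only when it differs from the previous frame's
        # symbol (collapsing repeats) and is not the blank.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Toy example with a three-symbol alphabet; index 0 is the CTC blank.
alphabet = ["<blank>", "a", "b"]
frame_scores = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged)
    [0.90, 0.05, 0.05],  # blank (separates the two a's)
    [0.10, 0.60, 0.30],  # a
    [0.10, 0.20, 0.70],  # b
    [0.20, 0.10, 0.70],  # b (repeat, merged)
    [0.80, 0.10, 0.10],  # blank
]
text = ctc_greedy_decode(frame_scores, alphabet)
```

The blank between the second and fourth frames is what lets the output contain two consecutive `a` characters; without it the repeats would be merged into one.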

Description

Technical field

[0001] The invention belongs to the technical field of computer machine learning and artificial intelligence, and mainly relates to a lip language recognition method based on a deep neural network.

Background

[0002] Lip language plays a vital role in human communication and speech comprehension, yet research shows that humans are poor at lip reading. Good lip language recognition technology can supplement audio-based speech recognition: it can be used to improve hearing aids and to improve the acquisition of language information in silent, secure, and noisy environments. It therefore has great practical value and has attracted increasing attention. Before the advent of deep learning, most work on lip reading relied on hand-designed features, which was computationally intensive and less accurate. Deep learning methods that have been developed in recent years are used to extract static features of the speaker's lip re...


Application Information

IPC(8): G06K9/00, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06V40/161, G06V40/171, G06V40/20, G06V20/41, G06N3/045, G06F18/25, G06F18/214
Inventor: 薛峰, 杨添, 王文博, 洪自坤
Owner HEFEI UNIV OF TECH