
A person recognition method and system based on a multi-frame audio-video fusion network

A person-recognition and fusion-network technology, applied in the field of person recognition methods and systems based on multi-frame audio-video fusion networks. It addresses the problem that low-quality frames reduce the discriminability of visual features, which in turn degrades the fused features, and achieves an excellent recognition effect by avoiding that influence.

Active Publication Date: 2021-09-24
INST OF COMPUTING TECH CHINESE ACAD OF SCI
Cites: 8 · Cited by: 0
  • Summary
  • Abstract
  • Description
  • Claims
  • Application Information

AI Technical Summary

Problems solved by technology

At present, person-recognition algorithms based on audio-video fusion can make full use of both face and voiceprint features to determine a person's identity, but existing fusion algorithms fail to address the decline in the discriminability of visual features under low-quality conditions.
[0004] While researching person recognition in the field of network audio-video surveillance, the inventors found the following defects in the existing technology. First, single-modality algorithms struggle with the practical complexity of network audio-video surveillance: face recognition degrades severely on low-quality images, and the accuracy of voiceprint recognition is likewise limited. Second, network audio-video surveillance often contains a large number of hard-to-recognize images.
Directly extracting face features from these hard-to-recognize images reduces the discriminability of the features, which in turn degrades the subsequent fused features.




Detailed Description of Embodiments

[0039] In recent years, video has accounted for the vast majority of network traffic, and its share continues to grow. Massive volumes of video are inevitably mixed with illegal videos, which spread quickly, reach a wide audience, and are extremely harmful. Intelligent analysis of video content to prevent illegal videos from flooding the Internet has therefore become an urgent problem. Illegal video is a complex concept: identifying it accurately requires not only analyzing low-level visual features but also understanding high-level semantic associations, which is a very challenging task. Since people are the main subject of video content, accurate identification of specific people can effectively assist the intelligent analysis of illegal videos. As shown in Figure 1, the multi-frame audio-video fusion algorithm is mainly divided into three stages: the fusion of multi-frame visual features, the fusion of multi-frame voiceprint ...
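The first of the three stages above, fusing the face features of K frames so that low-quality frames contribute less, can be sketched as a quality-weighted average. This is a minimal numpy illustration, not the patented implementation; the softmax-over-quality-scores weighting and all names here are assumptions for exposition.

```python
import numpy as np

def fuse_visual_features(frame_feats, quality_scores):
    """Weighted fusion of per-frame face features: frames with low
    quality scores get small softmax weights, so hard-to-recognize
    frames contribute less to the fused multi-frame visual feature."""
    w = np.exp(quality_scores - np.max(quality_scores))
    w /= w.sum()              # softmax weights over the K frames
    return w @ frame_feats    # (K,) @ (K, D) -> fused feature (D,)

# Toy example: K=5 frames of D=8-dimensional face features.
K, D = 5, 8
feats = np.random.default_rng(0).normal(size=(K, D))
scores = np.array([2.0, 0.1, -1.0, 1.5, 0.0])  # e.g. predicted face quality
fused = fuse_visual_features(feats, scores)
print(fused.shape)  # (8,)
```

With equal quality scores this reduces to a plain average; a frame with a much higher score dominates the fused feature.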



Abstract

The present invention proposes a person recognition method and system based on a multi-frame audio-video fusion network, comprising: a visual feature fusion step, which decodes the video to be recognized, obtains K consecutive frames of the video (K a positive integer), extracts the face features of each of the K frames, and weights and fuses all the face features to obtain a multi-frame visual feature; a voiceprint feature fusion step, which extracts the voiceprint features of each of the K consecutive frames and fuses them with a temporal recurrent neural network to obtain a multi-frame voiceprint feature; and an audio-video feature fusion step, which fuses the multi-frame visual feature and the multi-frame voiceprint feature with a fully connected layer, constrains the fusion process with a classification loss to obtain a multi-frame audio-video fusion feature, and performs person recognition according to that fusion feature.
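The three steps of the abstract can be sketched end-to-end with plain numpy. This is a toy illustration under assumed dimensions: the weighted visual fusion, a simple tanh RNN standing in for the temporal recurrent network, and a single fully connected layer producing class logits; all weights are random and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
K, Dv, Da, H, C = 5, 8, 6, 6, 10  # frames, visual dim, audio dim, RNN hidden, identities

# --- Step 1: weighted fusion of K visual (face) features ---
vis = rng.normal(size=(K, Dv))
w = np.exp(rng.normal(size=K)); w /= w.sum()   # per-frame fusion weights
multi_vis = w @ vis                            # multi-frame visual feature (Dv,)

# --- Step 2: recurrent fusion of K voiceprint features ---
aud = rng.normal(size=(K, Da))
Wx = rng.normal(size=(Da, H)) * 0.1
Wh = rng.normal(size=(H, H)) * 0.1
h = np.zeros(H)
for t in range(K):                 # simple tanh RNN over the K frames
    h = np.tanh(aud[t] @ Wx + h @ Wh)
multi_aud = h                      # last hidden state = multi-frame voiceprint feature

# --- Step 3: fully connected fusion of the two modalities ---
fused_in = np.concatenate([multi_vis, multi_aud])   # (Dv + H,)
W_fc = rng.normal(size=(Dv + H, C)) * 0.1
logits = fused_in @ W_fc           # class scores; a classification loss (e.g.
print(logits.shape)                # softmax cross-entropy) would constrain training
```

In training, the classification loss on `logits` back-propagates through all three stages so the fusion weights and recurrent network are learned jointly.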

Description

Technical Field

[0001] The present invention relates to the field of person recognition, and in particular to a person recognition method and system based on a multi-frame audio-video fusion network.

Background Technique

[0002] Person recognition in video mainly uses the intrinsic or extrinsic attributes of a person to determine their identity. At present, the common approach is to use biometric characteristics, such as the face or voiceprint, and the corresponding algorithms include face recognition and voiceprint recognition. Mainstream face recognition algorithms use convolutional neural networks to learn, from large-scale face datasets, a mapping from raw face images to identity-invariant features. Researchers often carefully design loss functions, such as pairwise (contrastive) loss, triplet loss, and center loss, to constrain the mapping from images to features. ...
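Of the loss functions named above, the triplet loss is easy to state concretely: it requires an anchor feature to be closer to a same-identity (positive) feature than to a different-identity (negative) one by at least a margin. A minimal numpy sketch, with toy features chosen for illustration:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet loss: pull same-identity features together and push
    different-identity features apart by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to same identity
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to other identity
    return max(d_pos - d_neg + margin, 0.0)

# Toy 4-D features: anchor is close to positive, far from negative,
# so the margin is already satisfied and the loss is zero.
a = np.array([1.0, 0.0, 0.0, 0.0])
p = np.array([0.9, 0.1, 0.0, 0.0])
n = np.array([0.0, 1.0, 0.0, 0.0])
print(triplet_loss(a, p, n))  # → 0.0
```

A larger margin (or a harder negative) makes the loss positive, producing a gradient that reshapes the feature mapping.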

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G06K9/62; G06K9/00; G10L17/02
CPC: G10L17/02; G06V40/168; G06F18/253
Inventor: 高科, 王永杰
Owner: INST OF COMPUTING TECH CHINESE ACAD OF SCI