Visual and auditory perception integrated multitask collaborative identification method and system

A recognition method and multi-task technology in the field of multi-source heterogeneous data processing and recognition, addressing the problems of overly complex deep neural network models, huge parameter counts, and the difficulty of rapidly and evenly allocating network resources.

Publication Status: Inactive · Publication Date: 2018-11-13
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0006] Deep neural network training requires large amounts of training data, which leaves it ineffective on small-scale data tasks. Given the high cost of training on and labeling massive data, it also performs poorly on real recognition tasks with continuous streaming input.
[0007] The deep neural networ...



Examples


Embodiment 1

[0082] As shown in Figure 1, Embodiment 1 of the present invention provides a multi-task cooperative recognition method and system that integrates audio-visual perception.

[0083] The multi-task cooperative recognition system integrating audio-visual perception disclosed in Embodiment 1 of the present invention includes:

[0084] The general-purpose feature extraction module is used to establish a time-synchronous matching mechanism for multi-source heterogeneous data and to realize a multi-source data association description model based on latent high-level shared semantics, enabling efficient mutual support and information complementarity between data from different channels and maximizing data de-redundancy;
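To make the time-synchronous matching mechanism concrete, below is a minimal sketch that pairs frames from two heterogeneous streams by timestamp. The patent does not disclose a specific matching algorithm; the nearest-neighbor rule, the `synchronize` function, and the `tolerance` parameter are all illustrative assumptions.

```python
# Minimal sketch of time-synchronous matching for two heterogeneous
# streams (e.g. audio frames and video frames). All names and the
# nearest-neighbor rule are illustrative, not from the patent.
from bisect import bisect_left

def synchronize(audio_ts, video_ts, tolerance=0.02):
    """Pair each video timestamp with the nearest audio timestamp
    within `tolerance` seconds. Both lists must be sorted ascending."""
    pairs = []
    for t in video_ts:
        i = bisect_left(audio_ts, t)
        # candidates: the audio timestamps just before and after t
        candidates = [j for j in (i - 1, i) if 0 <= j < len(audio_ts)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(audio_ts[k] - t))
        if abs(audio_ts[j] - t) <= tolerance:
            pairs.append((j, t))  # (audio frame index, video timestamp)
    return pairs

# Example: 100 fps audio frames vs. 25 fps video frames
audio_ts = [k * 0.010 for k in range(1000)]
video_ts = [k * 0.040 for k in range(250)]
print(len(synchronize(audio_ts, video_ts)))  # -> 250 matched pairs
```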

[0085] The deep collaborative feature learning module is used to establish a long-term dependent generative memory model and to explore a semi-supervised continuous learning system based on collaborative attention and deep autonomy, realizing dynamic self-learning with...
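As a concrete illustration of a long-term memory model combined with collaborative (cross-modal) attention, here is a hedged sketch built from standard PyTorch components. The patent names these components but not this exact architecture; the class, layer choices, and dimensions are assumptions.

```python
# Hedged sketch: cross-modal ("collaborative") attention feeding an LSTM
# that models long-term dependencies. Architecture details are assumed.
import torch
import torch.nn as nn

class CollaborativeMemory(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        # cross-modal attention: each stream attends to the other
        self.a2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        # long-term dependency model over the fused sequence
        self.memory = nn.LSTM(2 * dim, dim, batch_first=True)

    def forward(self, audio, video):
        # audio, video: (batch, time, dim), already time-synchronized
        a_ctx, _ = self.a2v(audio, video, video)  # audio attends to video
        v_ctx, _ = self.v2a(video, audio, audio)  # video attends to audio
        fused = torch.cat([a_ctx, v_ctx], dim=-1)
        out, (h, _) = self.memory(fused)
        return out, h.squeeze(0)  # per-step features, final memory state

feats, state = CollaborativeMemory()(torch.randn(2, 50, 256),
                                     torch.randn(2, 50, 256))
```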

Embodiment 2

[0095] Embodiment 2 of the present invention provides a method for multi-task recognition using the above-mentioned system. The method includes: general feature description of massive multi-source audio-visual media perception data, including establishing a time-synchronization matching mechanism for multi-source heterogeneous data and realizing a multi-source data association description model based on latent high-level shared semantics; deep collaborative feature learning for long-term memory of continuously input streaming media data, including establishing a long-term dependent generative memory model and exploring a semi-supervised continuous learning system based on collaborative attention and deep autonomy; and an intelligent multi-task deep collaborative enhanced feedback recognition model under an adaptive framework, including adaptive sensing computing theory based on the cooperative work of agents, introducing adaptive deep collaborative enhanced feedback and mul...
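The adaptive adjustment of task priorities through enhanced feedback can be illustrated with a small sketch. The update rule below, in which tasks with higher recent error receive a larger share of the weight, is an assumption made for illustration and is not taken from the patent text.

```python
# Illustrative sketch of adaptive task prioritization: weights are updated
# from a feedback signal (e.g. recent recognition error) and renormalized,
# so priorities shift with the sensed environment. The update rule and the
# task names are assumptions, not disclosed by the patent.
import math

def update_priorities(weights, feedback, lr=0.5):
    """weights: dict task -> weight; feedback: dict task -> error in [0, 1].
    Tasks with higher recent error receive more weight (more resources)."""
    logits = {t: math.log(w) + lr * feedback[t] for t, w in weights.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {t: math.exp(v) / z for t, v in logits.items()}

prio = {"speech": 1 / 3, "face": 1 / 3, "gesture": 1 / 3}
prio = update_priorities(prio, {"speech": 0.9, "face": 0.1, "gesture": 0.3})
print(prio)  # "speech" now gets the largest share of resources
```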

Embodiment 3

[0103] As shown in Figure 1, Embodiment 3 of the present invention provides a multi-task cooperative recognition method that integrates audio-visual perception.

[0104] First, a transfer algorithm is explored to establish a general feature description method for multi-source audiovisual media perception data.

[0105] In order to achieve efficient collaborative analysis across different audio-visual tasks, and to extract highly robust and versatile feature descriptions of multi-source audio-visual perception data as prototype features for subsequent collaborative learning, it is first necessary to analyze the characteristics of the audio-visual perception data. Most actually acquired audio data is a one-dimensional time series whose descriptive power lies mainly in its spectro-temporal cues; it must be described using a spectral transformation in the auditory perceptual domain combined with prosodic information from adjacent audio frames. The visual perception data ...
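As a worked example of such a spectro-temporal audio description, the sketch below computes a log-mel spectrogram (a spectral transformation in the auditory perceptual domain) and stacks each frame with its neighbours to capture adjacent-frame context. The use of `librosa` and all parameter values are illustrative assumptions.

```python
# Hedged sketch of a spectro-temporal audio description: a log-mel
# spectrogram with adjacent frames stacked as context. Parameter values
# (window, hop, mel bands, context width) are illustrative assumptions.
import numpy as np
import librosa

def audio_features(y, sr=16000, n_mels=40, context=2):
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=400,
                                         hop_length=160, n_mels=n_mels)
    logmel = librosa.power_to_db(mel)          # (n_mels, frames)
    padded = np.pad(logmel, ((0, 0), (context, context)), mode="edge")
    # stack each frame with `context` neighbours on both sides
    frames = [padded[:, i:i + 2 * context + 1].reshape(-1)
              for i in range(logmel.shape[1])]
    return np.stack(frames)                    # (frames, n_mels*(2c+1))

feats = audio_features(np.random.randn(16000).astype(np.float32))
print(feats.shape)  # e.g. (101, 200) for one second of 16 kHz audio
```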



Abstract

The invention provides a visual and auditory perception integrated multi-task collaborative identification method and system, belonging to the technical field of multi-source heterogeneous data processing and identification. The system comprises a universal feature extraction module, a collaborative feature learning module, and a scene-adaptive feedback, assessment and identification module. Based on a time-synchronization matching mechanism for multi-source heterogeneous data, universal features of the multi-source heterogeneous data are extracted; a long-term dependent memory model is established, and the universal features, serving as prior knowledge, are continuously learned in cooperation with a collaborative attention mechanism based on external dependence; environmental perception parameters in the multi-source heterogeneous data are extracted, a progressive-network deep collaborative reinforcement identification mechanism is established, and multi-task identification is achieved according to the learned features of the memory model and the task demands. Combining a scene-adaptive computing theory based on environmental perception, the system judges the weight of each recognition task through deep reinforcement feedback, adaptively adjusts the priority of each task according to environmental change, and thereby outputs multiple visual and auditory perception identification results simultaneously.

Description

technical field

[0001] The invention relates to the technical field of multi-source heterogeneous data processing and recognition, and in particular to a multi-task cooperative recognition method and system that integrates audio-visual perception.

Background technique

[0002] After some sixty years of ups and downs, artificial intelligence, built on deep neural network algorithms and supported by big data, cloud computing, and smart terminals, is about to ride the waves of the IT era such as the Internet, mobile Internet, and Internet of Things into a new era of full-scale growth. The continuous growth of communication bandwidth and the continuous improvement of transmission speed have rapidly lowered the threshold for obtaining massive audio/video data. Faced with the urgent need for ultra-high-speed, mobile, and universal storage and processing of massive data, traditional weak artificial intelligence based on single-mode single-tas...

Claims


Application Information

IPC(8): G06F17/30
Inventor: 明悦
Owner: BEIJING UNIV OF POSTS & TELECOMM