Visual and auditory perception integrated multitask collaborative identification method and system

A recognition method and multi-task technology in the field of multi-source heterogeneous data processing and recognition, addressing the problems of overly complex deep neural network models, huge parameter counts, and the difficulty of rapidly and evenly allocating network resources.

Publication Status: Inactive · Publication Date: 2018-11-13
BEIJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0006] Deep neural network training requires large amounts of training data, which leaves it ineffective on small-scale data tasks. Given the high cost of training on and labeling massive data, it also performs poorly on real recognition tasks with continuous streaming input.
[0007] The deep neural networ...



Examples


Embodiment 1

[0082] As shown in Figure 1, Embodiment 1 of the present invention provides a multi-task cooperative recognition method and system that integrates audio-visual perception.

[0083] The multi-task cooperative recognition system integrating audio-visual perception disclosed in Embodiment 1 of the present invention includes:

[0084] The general-purpose feature extraction module is used to establish a time-synchronous matching mechanism for multi-source heterogeneous data and to realize a multi-source data association description model based on latent high-level shared semantics, enabling efficient mutual support and information complementarity between data from different channels and maximizing data de-redundancy;
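To make the time-synchronous matching mechanism concrete, below is a minimal sketch that pairs frames from two heterogeneous streams by timestamp. The patent does not disclose a specific matching algorithm; the nearest-neighbor rule, the `synchronize` function, and the `tolerance` parameter are all illustrative assumptions.

```python
# Minimal sketch of time-synchronous matching for two heterogeneous
# streams (e.g. audio frames and video frames). All names and the
# nearest-neighbor rule are illustrative, not from the patent.
from bisect import bisect_left

def synchronize(audio_ts, video_ts, tolerance=0.02):
    """Pair each video timestamp with the nearest audio timestamp
    within `tolerance` seconds. Both lists must be sorted ascending."""
    pairs = []
    for t in video_ts:
        i = bisect_left(audio_ts, t)
        # candidates: the audio timestamps just before and after t
        candidates = [j for j in (i - 1, i) if 0 <= j < len(audio_ts)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(audio_ts[k] - t))
        if abs(audio_ts[j] - t) <= tolerance:
            pairs.append((j, t))  # (audio frame index, video timestamp)
    return pairs

# Example: 100 fps audio frames vs. 25 fps video frames
audio_ts = [k * 0.010 for k in range(1000)]
video_ts = [k * 0.040 for k in range(250)]
print(len(synchronize(audio_ts, video_ts)))  # -> 250 matched pairs
```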

[0085] The deep collaborative feature learning module is used to establish a long-term dependent generative memory model and to explore a semi-supervised continuous learning system based on collaborative attention and deep autonomy, realizing dynamic self-learning with...
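As a concrete illustration of a long-term memory model combined with collaborative (cross-modal) attention, here is a hedged sketch built from standard PyTorch components. The patent names these components but not this exact architecture; the class, layer choices, and dimensions are assumptions.

```python
# Hedged sketch: cross-modal ("collaborative") attention feeding an LSTM
# that models long-term dependencies. Architecture details are assumed.
import torch
import torch.nn as nn

class CollaborativeMemory(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        # cross-modal attention: each stream attends to the other
        self.a2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2a = nn.MultiheadAttention(dim, heads, batch_first=True)
        # long-term dependency model over the fused sequence
        self.memory = nn.LSTM(2 * dim, dim, batch_first=True)

    def forward(self, audio, video):
        # audio, video: (batch, time, dim), already time-synchronized
        a_ctx, _ = self.a2v(audio, video, video)  # audio attends to video
        v_ctx, _ = self.v2a(video, audio, audio)  # video attends to audio
        fused = torch.cat([a_ctx, v_ctx], dim=-1)
        out, (h, _) = self.memory(fused)
        return out, h.squeeze(0)  # per-step features, final memory state

feats, state = CollaborativeMemory()(torch.randn(2, 50, 256),
                                     torch.randn(2, 50, 256))
```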

Embodiment 2

[0095] Embodiment 2 of the present invention provides a method for multi-task recognition using the above-mentioned system. The method includes: general feature description of massive multi-source audio-visual media perception data, including establishing a time-synchronization matching mechanism for multi-source heterogeneous data and realizing a multi-source data association description model based on latent high-level shared semantics; deep collaborative feature learning for long-term memory of continuously input streaming media data, including establishing a long-term dependent generative memory model and exploring a semi-supervised continuous learning system based on collaborative attention and deep autonomy; and an intelligent multi-task deep collaborative enhanced feedback recognition model under an adaptive framework, including adaptive sensing computing theory based on the cooperative work of agents, introducing adaptive deep collaborative enhanced feedback and mul...
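The adaptive adjustment of task priorities through enhanced feedback can be illustrated with a small sketch. The update rule below, in which tasks with higher recent error receive a larger share of the weight, is an assumption made for illustration and is not taken from the patent text.

```python
# Illustrative sketch of adaptive task prioritization: weights are updated
# from a feedback signal (e.g. recent recognition error) and renormalized,
# so priorities shift with the sensed environment. The update rule and the
# task names are assumptions, not disclosed by the patent.
import math

def update_priorities(weights, feedback, lr=0.5):
    """weights: dict task -> weight; feedback: dict task -> error in [0, 1].
    Tasks with higher recent error receive more weight (more resources)."""
    logits = {t: math.log(w) + lr * feedback[t] for t, w in weights.items()}
    z = sum(math.exp(v) for v in logits.values())
    return {t: math.exp(v) / z for t, v in logits.items()}

prio = {"speech": 1 / 3, "face": 1 / 3, "gesture": 1 / 3}
prio = update_priorities(prio, {"speech": 0.9, "face": 0.1, "gesture": 0.3})
print(prio)  # "speech" now gets the largest share of resources
```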

Embodiment 3

[0103] As shown in Figure 1, Embodiment 3 of the present invention provides a multi-task cooperative recognition method that integrates audio-visual perception.

[0104] First, a transfer algorithm is explored to establish a general feature description method for multi-source audiovisual media perception data.

[0105] In order to achieve efficient collaborative analysis across different audio-visual tasks, and to extract highly robust and versatile feature descriptions of multi-source audio-visual perception data as prototype features for subsequent collaborative learning, it is first necessary to analyze the characteristics of the audio-visual perception data. Most actually acquired audio data is a one-dimensional time series whose descriptive power lies mainly in its spectro-temporal cues; it must be described using a spectral transformation in the auditory perceptual domain combined with prosodic information from adjacent audio frames. The visual perception data ...
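As a worked example of such a spectro-temporal audio description, the sketch below computes a log-mel spectrogram (a spectral transformation in the auditory perceptual domain) and stacks each frame with its neighbours to capture adjacent-frame context. The use of `librosa` and all parameter values are illustrative assumptions.

```python
# Hedged sketch of a spectro-temporal audio description: a log-mel
# spectrogram with adjacent frames stacked as context. Parameter values
# (window, hop, mel bands, context width) are illustrative assumptions.
import numpy as np
import librosa

def audio_features(y, sr=16000, n_mels=40, context=2):
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=400,
                                         hop_length=160, n_mels=n_mels)
    logmel = librosa.power_to_db(mel)          # (n_mels, frames)
    padded = np.pad(logmel, ((0, 0), (context, context)), mode="edge")
    # stack each frame with `context` neighbours on both sides
    frames = [padded[:, i:i + 2 * context + 1].reshape(-1)
              for i in range(logmel.shape[1])]
    return np.stack(frames)                    # (frames, n_mels*(2c+1))

feats = audio_features(np.random.randn(16000).astype(np.float32))
print(feats.shape)  # e.g. (101, 200) for one second of 16 kHz audio
```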



Abstract

The invention provides a visual and auditory perception integrated multi-task collaborative identification method and system, belonging to the technical field of multi-source heterogeneous data processing and identification. The system comprises a universal feature extraction module, a collaborative feature learning module, and a scene-adaptive feedback, assessment and identification module. Based on a time-synchronization matching mechanism for multi-source heterogeneous data, universal features of the multi-source heterogeneous data are extracted; a long-term dependent memory model is established, and the universal features, serving as prior knowledge, are continuously learned in cooperation with a collaborative attention mechanism based on external dependence; environmental perception parameters in the multi-source heterogeneous data are extracted, a progressive-network deep collaborative reinforcement identification mechanism is established, and multi-task identification is achieved according to the learned features of the memory model and the task demands. Combining a scene-adaptive computing theory based on environmental perception, the system judges the weight of each recognition task through deep reinforcement feedback, adaptively adjusts the priority of each task according to environmental change, and thereby outputs multiple visual and auditory perception identification results simultaneously.

Description

technical field

[0001] The invention relates to the technical field of multi-source heterogeneous data processing and recognition, and in particular to a multi-task cooperative recognition method and system that integrates audio-visual perception.

Background technique

[0002] After some sixty years of ups and downs, artificial intelligence, built on deep neural network algorithms and supported by big data, cloud computing, and smart terminals, is about to ride the waves of the IT era such as the Internet, mobile Internet, and Internet of Things into a new era of full-scale growth. The continuous growth of communication bandwidth and the continuous improvement of transmission speed have rapidly lowered the threshold for obtaining massive audio/video data. Faced with the urgent need for ultra-high-speed, mobile, and universal storage and processing of massive data, traditional weak artificial intelligence based on single-mode single-tas...

Claims


Application Information

IPC(8): G06F17/30
Inventor: 明悦
Owner: BEIJING UNIV OF POSTS & TELECOMM