Multi-mode and adversarial learning-based multi-task target detection and identification method and device

A multi-modal target detection technology, applied in the field of deep-learning target detection, that addresses low efficiency, overfitting, and the failure to produce a well-performing model, and achieves improved accuracy, improved robustness, and fast, accurate detection and recognition.

Pending Publication Date: 2022-07-29

HUNAN UNIV

0 Cites, 4 Cited by


AI Technical Summary

Problems solved by technology

Existing approaches train the two tasks with separate convolutional neural networks, ignoring the connection between the tasks; the parameters learned for one task are used only within that task. This is inefficient, and because the training sample for each task is small, it risks overfitting and fails to produce a well-performing model.


Examples

Embodiment 1

[0095] This embodiment of the present invention proposes a target detection and recognition method based on multi-modal, multi-task adversarial learning. As shown in Figure 1, the method includes:

[0096] Step 1: Prepare the required image data set, normalize the images, and manually label the positions and types of the objects in all images; then expand the data set with traditional data augmentation methods, and use the labelme software to annotate all RGB images with semantic information and generate semantic maps;

[0097] The details are as follows:

[0098] (1) The training images are collected by taking photographs and searching online; the images must contain different types of target objects.

[0099] (2) Normalize all image data by converting each image to a standard size of 256×256 pixels.

[0100] (3) Using the labelme software, the position of each target in the training sample image is tightly enclosed by a rectangular f...
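The size normalization in item (2) can be illustrated with a minimal, standard-library-only sketch. It assumes nearest-neighbour resampling, which the text shown here does not specify. It also assumes that any rectangle annotated on the original image has to be rescaled to the 256×256 grid; if the images are labelled after resizing, `scale_box` is unnecessary. The names `resize_nearest` and `scale_box` and the 512×384 test image are hypothetical.

```python
from typing import List, Tuple

Image = List[List[Tuple[int, int, int]]]  # rows of RGB pixels
Box = Tuple[float, float, float, float]   # (x_min, y_min, x_max, y_max)

TARGET_SIZE = 256  # standard image size stated in the text


def resize_nearest(image: Image, size: int = TARGET_SIZE) -> Image:
    """Resize to size x size with nearest-neighbour sampling (assumed method)."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[y * src_h // size][x * src_w // size] for x in range(size)]
        for y in range(size)
    ]


def scale_box(box: Box, src_w: int, src_h: int, size: int = TARGET_SIZE) -> Box:
    """Map a box from original pixel coordinates onto the resized image."""
    x0, y0, x1, y1 = box
    return (x0 * size / src_w, y0 * size / src_h,
            x1 * size / src_w, y1 * size / src_h)


# Hypothetical 512x384 image whose pixel values encode their position.
original = [[(x % 256, y % 256, 0) for x in range(512)] for y in range(384)]
resized = resize_nearest(original)
box = scale_box((128.0, 96.0, 384.0, 288.0), src_w=512, src_h=384)
```

Nearest-neighbour sampling is chosen here only because it needs no external library; an image library's bilinear or area interpolation would usually be preferred in practice.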

Embodiment 2

[0147] Based on the above method, an embodiment of the present invention also provides a multi-task target detection and recognition device based on multi-modality and adversarial learning, including:

[0148] Semantic map acquisition unit: annotates the targets in the RGB images, preprocesses the images, and obtains the corresponding semantic maps;

[0149] Recognition network construction unit: constructs a multi-task recognition network model based on multi-modality and adversarial learning from a multi-modal feature fusion network, a region proposal network, and a multi-task target detection network connected in sequence;

[0150] The multi-modal feature fusion network consists of two ResNet18 backbone CNNs followed by a concat fusion network;

[0151] The region proposal network outputs random windows and proposal boxes;

[0152] The multi-tasks in the multi-task target detection network include three auxiliary tasks and one main task, wherein the main task is a t...
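The concat fusion in [0150] and the random windows in [0151] can be sketched roughly as follows, using only the standard library. Several details are assumptions rather than statements from the patent: the feature maps are plain nested lists, the 512-channel 8×8 shape is what a standard ResNet18 would produce for a 256×256 input, and the windows are sampled uniformly. The names `concat_fusion` and `random_windows` and the `min_side` parameter are hypothetical.

```python
import random
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]
Window = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max)


def concat_fusion(rgb_feat: FeatureMap, sem_feat: FeatureMap) -> FeatureMap:
    """Fuse two backbone outputs by concatenating along the channel axis."""
    rgb_shape = (len(rgb_feat[0]), len(rgb_feat[0][0]))
    sem_shape = (len(sem_feat[0]), len(sem_feat[0][0]))
    if rgb_shape != sem_shape:
        raise ValueError("spatial sizes must match before concatenation")
    return rgb_feat + sem_feat


def random_windows(n: int, image_size: int = 256, min_side: int = 16,
                   seed: int = 0) -> List[Window]:
    """Sample n axis-aligned windows inside the image (assumed uniform)."""
    rng = random.Random(seed)
    windows = []
    for _ in range(n):
        w = rng.randint(min_side, image_size)
        h = rng.randint(min_side, image_size)
        x0 = rng.randint(0, image_size - w)
        y0 = rng.randint(0, image_size - h)
        windows.append((x0, y0, x0 + w, y0 + h))
    return windows


def constant_map(channels: int, side: int, value: float) -> FeatureMap:
    """Build a placeholder feature map filled with one value."""
    return [[[value] * side for _ in range(side)] for _ in range(channels)]


# Placeholder outputs of the RGB branch and the semantic-map branch.
rgb_feat = constant_map(512, 8, 1.0)
sem_feat = constant_map(512, 8, 2.0)
fused = concat_fusion(rgb_feat, sem_feat)
windows = random_windows(5)
```

Concatenation keeps both modalities' channels side by side, so the layers that follow can learn how to weight RGB and semantic features instead of having them summed in advance.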

Embodiment 3

[0162] An embodiment of the present invention also provides an electronic terminal, which includes at least:

[0163] one or more processors;

[0164] one or more memories;

[0165] The processor invokes the computer program stored in the memory to execute: the steps of the foregoing multi-task target detection and recognition method based on multi-modality and adversarial learning.

[0166] It should be understood that the specific implementation process refers to the related content of Embodiment 1.

[0167] The terminal also includes a communication interface for communicating with external devices and exchanging data. For example, it communicates with the collection equipment of the operation information collection subsystem and the communication modules of other trains to obtain the real-time operation information of the train itself and its adjacent trains.

[0168] Among them, the memory may include high-speed RAM memory, an...


Abstract

The invention discloses a target detection and identification method and device based on multi-modal, multi-task adversarial learning. The method divides the whole model into a feature extraction stage, a region proposal stage, and a multi-task target detection stage. In the feature extraction stage, a multi-modal feature fusion method extracts features from both the RGB image and the semantic image of the input data, which makes the model more sensitive to the position of the target in the image and strengthens the extraction of the target's semantic information. The region proposal stage generates random windows and proposal boxes as input to the next stage. In the multi-task target detection stage, a multi-task learning method is adopted, and the detection precision of the main task is improved by jointly training three auxiliary tasks. For the target detection network, the idea of adversarial learning is introduced: two generative adversarial networks are added to generate multi-style samples, which improves the robustness of the model.
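The joint training of one main task with three auxiliary tasks can be illustrated with a weighted-sum objective. This is a common multi-task formulation and an assumption here: the abstract does not state the patent's actual loss or its weights. The name `joint_loss`, the default weights, and the sample loss values are hypothetical.

```python
from typing import Sequence


def joint_loss(main_loss: float,
               aux_losses: Sequence[float],
               aux_weights: Sequence[float] = (0.5, 0.5, 0.5)) -> float:
    """Combine the main-task loss with weighted auxiliary-task losses.

    The weights are illustrative placeholders, not values from the patent.
    """
    if len(aux_losses) != len(aux_weights):
        raise ValueError("one weight is required per auxiliary loss")
    weighted_aux = sum(w * loss for w, loss in zip(aux_weights, aux_losses))
    return main_loss + weighted_aux


# Hypothetical per-batch losses: one main task and three auxiliary tasks.
total = joint_loss(1.0, [0.5, 0.25, 0.125])
```

Because all four terms are summed into one objective, gradients from the auxiliary tasks also flow into the shared feature extractor, which is how joint training can help the main task.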

Description

Technical field

[0001] The invention belongs to the field of deep-learning target detection, and relates to a multi-task target detection and identification method and device based on multi-modality and adversarial learning.

Background

[0002] Object detection technology locates and identifies objects of interest in images. With the development of computer vision and the rise of artificial intelligence, target detection technology has made great progress in recent years, driven by strong application demand in the era of intelligence. It has been widely used in major national industrial fields such as national defense and the military industry. However, the rapid development of industry has placed higher requirements on target detection technology, and traditional methods can no longer support the further development of these industries. For this reason, this paper uses the difference in shape, color, and texture features between different...


Application Information


Patent Type & Authority: Applications (China)

IPC(8): G06V10/10, G06V10/32, G06V10/74, G06V10/764, G06V10/80, G06V10/82, G06V10/56, G06K9/62, G06N3/04, G06N3/08

CPC: G06N3/08, G06N3/047, G06N3/045, G06F18/22, G06F18/2415, G06F18/253, Y02T10/40

Inventor: 张辉, 吴刘宸, 钟杭, 曹意宏, 王耀南, 刘理, 毛建旭, 冯冰玉

Owner: HUNAN UNIV
