
Multi-mode and adversarial learning-based multi-task target detection and identification method and device

A multi-modal target detection technology, applied in the field of deep-learning-based target detection, which addresses problems such as failure to produce a usable model, overfitting, and low efficiency, and achieves the effects of improved accuracy, improved robustness, and fast, accurate detection and recognition.

Pending Publication Date: 2022-07-29
HUNAN UNIV
Cites: 0 · Cited by: 4

AI Technical Summary

Problems solved by technology

A separate convolutional neural network is trained for each of the two tasks, ignoring the connection between them, and the parameters learned in one task are used only for that specific task. This is not only inefficient, but the risk of overfitting on too small a training sample also means that a well-performing model cannot be produced.

Method used



Examples


Embodiment 1

[0095] An embodiment of the present invention proposes a target detection and recognition method based on multi-modal, multi-task adversarial learning. As shown in Figure 1, the method includes:

[0096] Step 1: Prepare the required image data set, normalize the data set images, and manually label the positions and types of objects in all data images; then expand the data set with traditional data augmentation methods, and use the labelme software to annotate all RGB images with semantic information and generate semantic maps;

[0097] The details are as follows:

[0098] (1) The image training data set is collected through in-house photography and online search; the images are required to contain different types of target objects.

[0099] (2) Normalize all image data and resize each image to a standard size of 256×256 pixels.

[0100] (3) Using the labelme software, the position of each target in the training sample images is tightly enclosed by a rectangular f...
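
A minimal Python sketch of this preprocessing step, assuming PIL/NumPy for image handling and labelme's standard JSON layout ("shapes" entries with "label", "points", "shape_type"); the class list and helper names below are illustrative, not taken from the patent.

    # Hedged preprocessing sketch: resize images to 256x256, apply a simple
    # flip augmentation, and rasterize a labelme JSON annotation into a
    # per-pixel semantic map. Class names below are placeholders.
    import json
    import numpy as np
    from PIL import Image, ImageDraw, ImageOps

    TARGET_SIZE = (256, 256)
    CLASS_IDS = {"background": 0, "class_a": 1, "class_b": 2}  # assumed classes

    def normalize_image(path):
        """Load an RGB image and resize it to the standard 256x256 input size."""
        return Image.open(path).convert("RGB").resize(TARGET_SIZE, Image.BILINEAR)

    def augment(img):
        """Traditional augmentation: keep the original plus a horizontal mirror."""
        return [img, ImageOps.mirror(img)]

    def labelme_to_semantic_map(json_path):
        """Rasterize labelme rectangle/polygon shapes into a semantic map."""
        with open(json_path) as f:
            ann = json.load(f)
        mask = Image.new("L", (ann["imageWidth"], ann["imageHeight"]), 0)
        draw = ImageDraw.Draw(mask)
        for shape in ann.get("shapes", []):
            cls = CLASS_IDS.get(shape["label"], 0)
            pts = [tuple(p) for p in shape["points"]]
            if shape.get("shape_type") == "rectangle":
                draw.rectangle([pts[0], pts[1]], fill=cls)
            else:
                draw.polygon(pts, fill=cls)
        return np.array(mask.resize(TARGET_SIZE, Image.NEAREST))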

Embodiment 2

[0147] Based on the above method, an embodiment of the present invention also provides a multi-task target detection and recognition device based on multi-modality and adversarial learning, including:

[0148] A semantic map acquisition unit: annotates and preprocesses the targets in the RGB images and obtains the corresponding semantic maps;

[0149] A recognition network construction unit: constructs a multi-task recognition network model based on multi-modality and adversarial learning from a multi-modal feature fusion network, a region proposal network, and a multi-task target detection network connected in sequence;

[0150] The multi-modal feature fusion network is formed by two ResNet18 backbone CNNs followed by a concat fusion network;

[0151] The region proposal network outputs random windows and proposal boxes;

[0152] The tasks in the multi-task target detection network include three auxiliary tasks and one main task, wherein the main task is a t...
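
As a rough illustration of the fusion unit described above, here is a minimal PyTorch sketch: two ResNet18 backbones, one for the RGB image and one for the semantic map, whose feature maps are concatenated along the channel axis. Everything beyond "two ResNet18 backbones followed by concat" (input sizes, weights, the downstream head) is an assumption, not taken from the patent.

    # Minimal sketch of the multi-modal feature fusion network: two ResNet18
    # backbones whose spatial feature maps are concatenated channel-wise.
    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    class MultiModalFusion(nn.Module):
        def __init__(self):
            super().__init__()
            # Drop avgpool/fc so each backbone yields a spatial feature map.
            self.rgb_backbone = nn.Sequential(*list(resnet18(weights=None).children())[:-2])
            self.sem_backbone = nn.Sequential(*list(resnet18(weights=None).children())[:-2])

        def forward(self, rgb, sem):
            f_rgb = self.rgb_backbone(rgb)   # (N, 512, H/32, W/32)
            f_sem = self.sem_backbone(sem)   # (N, 512, H/32, W/32)
            return torch.cat([f_rgb, f_sem], dim=1)  # fused (N, 1024, H/32, W/32)

    # Example with 256x256 inputs; the semantic map is assumed to be expanded
    # to three channels so it fits the stock ResNet18 stem.
    fused = MultiModalFusion()(torch.randn(1, 3, 256, 256), torch.randn(1, 3, 256, 256))
    print(fused.shape)  # torch.Size([1, 1024, 8, 8])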

Embodiment 3

[0162] An embodiment of the present invention also provides an electronic terminal, characterized in that it includes at least:

[0163] one or more processors;

[0164] one or more memories;

[0165] The processor invokes the computer program stored in the memory to execute the steps of the foregoing multi-task target detection and recognition method based on multi-modality and adversarial learning.

[0166] It should be understood that for the specific implementation process, reference is made to the related content of Embodiment 1.

[0167] The terminal also includes a communication interface for communicating with external devices and performing interactive data transmission. For example, it communicates with the collection equipment of the operation information collection subsystem and the communication modules of other trains to obtain real-time operation information of the train itself and of adjacent trains.

[0168] Here, the memory may include high-speed RAM memory, an...



Abstract

The invention discloses a target detection and identification method and device based on multi-modal, multi-task adversarial learning. The overall model is divided into a feature extraction stage, a region proposal stage, and a multi-task target detection stage. In the feature extraction stage, a multi-modal feature fusion method extracts features from the RGB image and the semantic image of the input data, making the model more sensitive to the position information of targets in the image and strengthening the extraction of target semantic information. The region proposal stage generates random windows and proposal boxes as input to the next stage. In the multi-task target detection stage, a multi-task learning method is adopted, and the detection precision of the main task is improved by jointly training three auxiliary tasks. For the target detection network, the idea of adversarial learning is introduced: two generative adversarial networks are added to generate multi-style samples, thereby improving the robustness of the model.
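
A hedged sketch of the joint objective implied by this abstract: the main detection loss combined with three weighted auxiliary-task losses and an adversarial term driven by the generated multi-style samples. The weights and the form of each individual loss are assumptions for illustration; the excerpt does not specify them.

    # Joint objective sketch: L = L_main + sum_i w_i * L_aux_i + w_adv * L_adv.
    # Weights are placeholders; the patent excerpt does not give them.
    import torch

    def joint_loss(main_loss, aux_losses, adv_loss,
                   aux_weights=(0.3, 0.3, 0.3), adv_weight=0.1):
        total = main_loss
        for w, l in zip(aux_weights, aux_losses):
            total = total + w * l
        return total + adv_weight * adv_loss

    # Toy usage with placeholder scalar losses.
    loss = joint_loss(torch.tensor(1.2),
                      [torch.tensor(0.5), torch.tensor(0.4), torch.tensor(0.3)],
                      torch.tensor(0.2))
    print(loss.item())  # ~1.58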

Description

Technical field

[0001] The invention belongs to the field of deep-learning-based target detection, and relates to a multi-task target detection and identification method and device based on multi-modality and adversarial learning.

Background technique

[0002] Object detection technology locates and identifies objects of interest in images. With the development of computer vision and the rise of artificial intelligence, target detection technology has made great progress in recent years owing to its broad application demand in the era of intelligence, and it has been widely used in major national industrial fields such as national defense and the military industry. However, the rapid development of industry has placed higher requirements on target detection technology, and traditional methods can no longer support the further development of various industries. For this reason, the present invention exploits the differences in shape, color, and texture features between different...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06V10/10, G06V10/32, G06V10/74, G06V10/764, G06V10/80, G06V10/82, G06V10/56, G06K9/62, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/047, G06N3/045, G06F18/22, G06F18/2415, G06F18/253, Y02T10/40
Inventors: 张辉, 吴刘宸, 钟杭, 曹意宏, 王耀南, 刘理, 毛建旭, 冯冰玉
Owner: HUNAN UNIV