An object detection method based on cross-modal and multi-scale feature fusion

A multi-scale feature fusion and object detection technology, applied in the field of image recognition. It addresses the speed limitations of existing fusion-based detectors and the inability to directly obtain a general feature representation of depth information when no large labeled depth dataset is available, and it achieves real-time detection speed with improved detection performance.

Active Publication Date: 2021-10-15

ZHEJIANG UNIV OF TECH


Problems solved by technology

However, the industry has no large-scale labeled depth image dataset covering a sufficient number of categories, so a general feature representation of depth information cannot be obtained directly.

[0004] On the other hand, existing fusion-feature detection methods are limited in speed. They often need high-performance GPUs and lengthy computation to produce results, which cannot meet the strict real-time requirements of industrial systems.


Embodiment Construction

[0028] The technical solution of the present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments, and the following embodiments do not constitute a limitation of the present invention.

[0029] The general idea of the present invention is to fuse depth image and RGB image features across modalities without relying on a large labeled depth image dataset, and to complete object recognition, localization and detection accurately, efficiently and in real time. A fusion model is trained that accepts cross-modal RGB and depth image input and outputs the location and category of multiple objects in real time. This solution needs to complete cross-modal feature transfer: initialize the depth map information network from the RGB model parameters and train the depth map model; and then initialize the feature extraction of the fusion network proposed by the present invention based on ...
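The two-stage initialization described in [0029] can be sketched as follows. This is only an illustrative sketch, not the patented implementation: parameters are modeled as plain dictionaries of nested lists, and the function names (`transfer_parameters`, `init_fusion_params`), the layer names and the `rgb.` / `depth.` branch prefixes are all hypothetical. Training of the depth model between the two stages is omitted.

```python
import copy


def shape_of(value):
    """Return the shape of a nested list as a tuple, e.g. [[1, 2]] -> (1, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def transfer_parameters(source_params, target_params):
    """Stage 1: copy every parameter whose name and shape match.

    Returns the initialized target parameters and the list of copied names.
    Parameters without a match keep their original (e.g. random) values.
    """
    initialized = dict(target_params)
    copied = []
    for name, value in source_params.items():
        if name in initialized and shape_of(initialized[name]) == shape_of(value):
            initialized[name] = copy.deepcopy(value)
            copied.append(name)
    return initialized, copied


def init_fusion_params(rgb_params, depth_params, fusion_params):
    """Stage 2: fill the two feature-extraction branches of the fusion model.

    Assumes (hypothetically) that fusion parameters are named
    'rgb.<layer>' and 'depth.<layer>'.
    """
    initialized = dict(fusion_params)
    for prefix, branch in (("rgb.", rgb_params), ("depth.", depth_params)):
        for name, value in branch.items():
            key = prefix + name
            if key in initialized and shape_of(initialized[key]) == shape_of(value):
                initialized[key] = copy.deepcopy(value)
    return initialized


# Toy parameters standing in for a trained RGB detection network.
rgb_params = {"conv1.weight": [[0.1, 0.2], [0.3, 0.4]],
              "head.weight": [[1.0, 2.0, 3.0]]}
# Depth network before initialization; its head has a different shape.
depth_init = {"conv1.weight": [[0.0, 0.0], [0.0, 0.0]],
              "head.weight": [[0.0, 0.0]]}

depth_params, copied = transfer_parameters(rgb_params, depth_init)
# ... the depth model would be trained here ...

fusion_init = {"rgb.conv1.weight": [[0.0, 0.0], [0.0, 0.0]],
               "depth.conv1.weight": [[0.0, 0.0], [0.0, 0.0]],
               "fusion_head.weight": [[0.0]]}
fusion_params = init_fusion_params(rgb_params, depth_params, fusion_init)
print(copied)
print(sorted(fusion_params))
```

Matching on both name and shape means layers that differ between networks, such as a detection head with a different number of outputs, simply keep their own initialization.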


Abstract

The invention discloses an object detection method based on the fusion of cross-modal and multi-scale features. The depth map detection network model is initialized with the network parameters of the RGB detection network model. The feature extraction weights of the fusion network model are then initialized from the resulting RGB detection network model and depth map detection network model, respectively. Finally, a fusion network model that fuses multi-scale and cross-modal features is trained. The invention does not rely on a large labeled depth image dataset, and it fuses depth image and RGB image features across modalities to complete object recognition, localization and detection accurately, efficiently and in real time. The fusion network model needs only a consumer-grade graphics card and a CPU as hardware to reach real-time detection speed.
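To make the idea of fusing multi-scale, cross-modal features concrete, here is a minimal sketch. The text shown here does not state which fusion operator the invention uses, so channel-wise concatenation at each scale is an assumption made purely for illustration. Feature maps are modeled as nested lists with shape (channels, height, width), and all names (`constant_map`, `concat_channels`, `fuse_multiscale`) are hypothetical.

```python
def constant_map(channels, height, width, value):
    """Build a (C, H, W) feature map filled with one value (toy data)."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


def spatial_size(feat):
    """Return (H, W) of a (C, H, W) nested-list feature map."""
    return (len(feat[0]), len(feat[0][0]))


def concat_channels(rgb_feat, depth_feat):
    """Fuse two feature maps of equal spatial size along the channel axis.

    Assumption: concatenation stands in for the unspecified fusion step.
    """
    if spatial_size(rgb_feat) != spatial_size(depth_feat):
        raise ValueError("RGB and depth feature maps differ in spatial size")
    return rgb_feat + depth_feat


def fuse_multiscale(rgb_pyramid, depth_pyramid):
    """Fuse the two modalities scale by scale."""
    if len(rgb_pyramid) != len(depth_pyramid):
        raise ValueError("pyramids must have the same number of scales")
    return [concat_channels(r, d) for r, d in zip(rgb_pyramid, depth_pyramid)]


# Two scales per modality: a fine 4x4 map and a coarse 2x2 map.
rgb_pyramid = [constant_map(2, 4, 4, 1.0), constant_map(4, 2, 2, 1.0)]
depth_pyramid = [constant_map(2, 4, 4, 0.5), constant_map(4, 2, 2, 0.5)]

fused = fuse_multiscale(rgb_pyramid, depth_pyramid)
shapes = [(len(f), len(f[0]), len(f[0][0])) for f in fused]
print(shapes)
```

Each fused map keeps its spatial resolution and doubles its channel count, so a downstream detection head would see both modalities at every scale.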

Description

Technical field

[0001] The present invention relates to the field of image recognition, and in particular to an object detection method based on cross-modal, multi-scale feature fusion, which simultaneously detects, localizes and precisely identifies objects in color-depth images (RGB-D images, which carry both color and depth information).

Background technique

[0002] Industry has a constant and urgent need for faster, more accurate and more generalizable object detection methods. RGB images are severely degraded in some special environments, for example by motion or glare. Detection based on RGB image features alone therefore often cannot reach the expected accuracy, so information from other sensors, such as depth information, is needed to improve object detection performance.

[0003] Since convolutional neural networks are used for object recognition and detection tasks, most high-prec...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06K9/62, G06K9/46

CPC: G06V10/56, G06F18/253, G06F18/214

Inventor: 刘盛, 尹科杰, 刘儒瑜, 陈一彬, 沈康

Owner: ZHEJIANG UNIV OF TECH
