Visual salience and semantic attribute based cross-modal image natural language description method

A technology combining semantic attributes with natural language generation, applied to cross-modal image natural language description based on visual salience and semantic attributes. It addresses the lack of focus and the low accuracy of target descriptions in existing methods by increasing the importance of salient content and reducing the contribution of less relevant regions, thereby improving description accuracy.

Active Publication Date: 2018-02-13
XIDIAN UNIV
Cites: 8 · Cited by: 51

AI Technical Summary

Problems solved by technology

[0003] To sum up, the problems existing in the prior art are: the current top-down image



Examples


Embodiment Construction

[0034] In order to make the objects, technical solutions, and advantages of the present invention clearer, the present invention will be further described in detail below in conjunction with the embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention, not to limit it.

[0035] The application principle of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0036] As shown in Figure 1, the cross-modal image natural language description method based on visual salience and semantic attributes provided by an embodiment of the present invention includes the following steps (see the illustrative sketch after the steps):

[0037] S101: Divide the image into sub-regions and use a CNN to extract multi-scale deep visual features from the image;

[0038] S102: Input the multi-scale feature vectors extracted by the CNN into the pre-trained saliency model and regress the saliency score of each s...
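The following is a minimal sketch of steps S101 and S102 under assumed choices: a PyTorch/torchvision VGG-16 backbone, a fixed 4×4 region grid, two input scales, and a small MLP saliency regressor. The patent text does not specify these details, so they are illustrative placeholders rather than the patented configuration.

```python
import torch
import torch.nn as nn
import torchvision.models as models
import torchvision.transforms.functional as TF


class MultiScaleRegionEncoder(nn.Module):
    """S101: split an image into a grid of sub-regions and extract
    multi-scale deep visual features for each region with a CNN."""

    def __init__(self, grid=4, scales=(224, 160)):
        super().__init__()
        backbone = models.vgg16(weights=None)   # hypothetical backbone choice
        self.trunk = backbone.features          # convolutional layers only
        self.pool = nn.AdaptiveAvgPool2d(1)     # one vector per region and scale
        self.grid, self.scales = grid, scales

    def forward(self, image):                   # image: (3, H, W) float tensor
        _, H, W = image.shape
        h, w = H // self.grid, W // self.grid
        feats = []
        for i in range(self.grid):
            for j in range(self.grid):
                region = image[:, i * h:(i + 1) * h, j * w:(j + 1) * w]
                per_scale = []
                for s in self.scales:           # rescale the same region
                    r = TF.resize(region, [s, s], antialias=True)
                    f = self.pool(self.trunk(r.unsqueeze(0))).flatten(1)
                    per_scale.append(f)
                feats.append(torch.cat(per_scale, dim=1))
        return torch.cat(feats, dim=0)          # (grid*grid, D) region features


class SaliencyRegressor(nn.Module):
    """S102: regress a saliency score for each sub-region from its
    multi-scale feature; a pre-trained model would be loaded in practice."""

    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(),
                                 nn.Linear(256, 1), nn.Sigmoid())

    def forward(self, region_feats):            # (R, D)
        return self.mlp(region_feats).squeeze(-1)  # (R,) scores in [0, 1]
```

In practice, the per-region scores regressed this way would be arranged back into a saliency map and used to weight the original image (or its region features) before semantic attribute detection, as the abstract describes.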



Abstract

The invention belongs to the technical field of computer vision and natural language processing, and discloses a visual salience and semantic attribute based cross-modal image natural language description method. The method comprises the following steps: multi-scale deep visual features of all regions are extracted with a convolutional neural network; a pre-trained saliency model regresses an image saliency map, which is used to weight the original image; a predefined dictionary is built to serve as the set of semantic attribute categories, and semantic attribute detection is conducted on the visually salient image; the semantic attributes are computed through multi-instance learning; the image features are weighted by the semantic attributes; the visual-salience-based semantic attribute features are decoded by a long short-term memory network to generate the image description. The method has the advantage of high accuracy and can be used for image retrieval in complex scenes, multi-object image semantic understanding, and the like.
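As a hedged illustration of the later stages summarized above, the sketch below shows one way the attribute detection (multi-instance learning with noisy-OR pooling over regions) and the LSTM decoding of attribute-weighted features could be wired up in PyTorch. The noisy-OR pooling, the fusion layer, and all dimensions are assumptions for illustration, not the patented architecture.

```python
import torch
import torch.nn as nn


class AttributeDetector(nn.Module):
    """Score every sub-region against a predefined attribute dictionary and
    aggregate region scores with a noisy-OR, a common multi-instance
    learning pooling (an assumption, not necessarily the patented one)."""

    def __init__(self, feat_dim, num_attributes):
        super().__init__()
        self.scorer = nn.Linear(feat_dim, num_attributes)

    def forward(self, region_feats):                         # (B, R, D)
        p_region = torch.sigmoid(self.scorer(region_feats))  # (B, R, A)
        # probability that at least one region exhibits each attribute
        return 1.0 - torch.prod(1.0 - p_region, dim=1)       # (B, A)


class CaptionDecoder(nn.Module):
    """Decode an attribute-weighted visual feature into a word sequence
    with a long short-term memory network (single layer, for illustration)."""

    def __init__(self, feat_dim, num_attributes, vocab_size,
                 embed_dim=512, hidden_dim=512):
        super().__init__()
        self.fuse = nn.Linear(feat_dim + num_attributes, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_feat, attr_probs, captions):
        # fold the attribute probabilities into the visual feature so that
        # the detected semantics re-weight what the decoder conditions on
        v = self.fuse(torch.cat([image_feat, attr_probs], dim=-1)).unsqueeze(1)
        tokens = self.embed(captions)                         # (B, T, E)
        h, _ = self.lstm(torch.cat([v, tokens], dim=1))       # visual step first
        return self.out(h)                                    # per-step vocabulary logits
```

During training, the per-step logits would be compared against the ground-truth caption with a cross-entropy loss; at inference, words would be sampled or beam-searched one step at a time from the same decoder.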

Description

Technical field

[0001] The invention belongs to the technical field of computer vision and natural language processing, and in particular relates to a cross-modal image natural language description method based on visual salience and semantic attributes.

Background technique

[0002] An automatic image description system can automatically generate accurate, fluent natural language descriptions, close to those written by humans, based on the interactive relationships between the objects and the environment in an image, so as to understand the semantics of the content in a visual scene. The system unifies image visual features and semantic information, makes the semantic information reflect the visual content more objectively, and uses the semantic information for high-level reasoning, large-scale image organization, and final image understanding. Compared with other popular directions in the field of computer vision, such as image retrieval and image segmentation, the essence...

Claims


Application Information

IPC(8): G06K9/62, G06N3/08
CPC: G06N3/084, G06F18/217, G06F18/214
Inventor 田春娜王蔚高新波李明郎君王秀美张相南刘恒袁瑾
Owner XIDIAN UNIV