Image description generation method based on external triple and abstract relationship

An image description and triple technology, applied to neural learning methods, computer components, and character and pattern recognition. It addresses the problem of overly simple generated descriptions and achieves more accurate description.

Pending Publication Date: 2022-04-12
HANGZHOU DIANZI UNIV
Cites: 0 · Cited by: 4

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to address the deficiencies of the prior art by providing an image description generation method based on external triples and abstract relationships, which solves the problem that descriptions generated by traditional image description generation methods are too simple, and improves prediction accuracy over the original.



Embodiment Construction

[0013] The present invention will be further described below in conjunction with the accompanying drawings.

[0014] Referring to Figures 1 and 5, a flow diagram of an overall embodiment of the present invention is shown.

[0015] To solve these problems, the present invention constructs an external relation library, retrieves similarity relations and abstract relations from the library according to the image target categories, and integrates them with scene graph features. Specifically, an open-domain knowledge extraction tool is first used to extract triples from image description texts, build the external relation library, and encode the features of the triples. Based on the text similarity of the relations in the triples, triples with high similarity are clustered into one class, called an abstract relation. Meanwhile, the model performs target detection on the image to obtain target visual features and semantic labels. According to the text similari...
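The clustering step described above can be sketched as follows. This is a hypothetical illustration, not the patent's actual implementation: the patent excerpt does not specify the text-similarity measure, so token-level Jaccard similarity is used here as a stand-in, and the `jaccard` and `cluster_relations` names, the threshold value, and the sample triples are all assumptions.

```python
# Hypothetical sketch of the "abstract relation" step: triples whose relation
# text is similar above a threshold are grouped into one cluster (an abstract
# relation). Jaccard similarity over tokens is an assumed stand-in for the
# text-similarity measure, which this excerpt does not specify.

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two relation strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def cluster_relations(triples, threshold=0.3):
    """Greedy single-pass clustering of (subject, relation, object) triples
    by the text similarity of their relation strings."""
    clusters = []  # each cluster holds triples with mutually similar relations
    for t in triples:
        _, rel, _ = t
        for cluster in clusters:
            # Compare against the cluster's first (representative) relation.
            if jaccard(rel, cluster[0][1]) >= threshold:
                cluster.append(t)
                break
        else:
            clusters.append([t])
    return clusters

triples = [
    ("man", "is riding", "horse"),
    ("woman", "riding on", "bike"),
    ("dog", "sitting on", "grass"),
]
clusters = cluster_relations(triples)
# "is riding" and "riding on" share the token "riding" and cluster together;
# "sitting on" forms its own cluster.
```

A real system would likely use embedding-based similarity rather than token overlap, but the greedy threshold clustering shape is the same.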



Abstract

The invention discloses an image description generation method based on external triples and abstract relations. The method comprises the following steps: firstly, extracting triples from image description texts, constructing an external relation library, and feature-encoding the triples; triples whose text similarity exceeds a threshold are clustered into one class. Meanwhile, the model performs target detection on the image to obtain a target visual feature set and a target category set, and queries the external relation library, by text similarity, for triples whose targets are similar to the detected target categories. The model predicts the targets, attributes, and relations of the image from the target visual features to generate a scene graph; a convolutional neural network fuses the visual features with the text features and feature-encodes the targets, attributes, and relations. Finally, the target, attribute, and relation encodings of the scene graph are fused with the encodings of the similarity relations and the abstract relations, and the fused features are input into a double-layer LSTM sequence generation model to obtain the final image description. With the invention, the descriptions generated by the model are richer in expression.
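The final fusion step in the abstract can be sketched in NumPy. This is an assumed illustration only: the per-feature dimension, the concatenation-then-projection fusion, and all variable names are hypothetical stand-ins, since the patent excerpt does not give the actual architecture details.

```python
import numpy as np

# Hypothetical sketch of the fusion step: scene-graph encodings (target,
# attribute, relation) are concatenated with the similarity-relation and
# abstract-relation encodings, then projected to the decoder's input size.
# All dimensions and the random "learned" projection are illustrative.

rng = np.random.default_rng(0)
d = 16  # assumed per-feature encoding size

target_enc = rng.standard_normal(d)    # scene-graph target encoding
attr_enc = rng.standard_normal(d)      # scene-graph attribute encoding
rel_enc = rng.standard_normal(d)       # scene-graph relation encoding
sim_rel_enc = rng.standard_normal(d)   # similarity-relation encoding
abs_rel_enc = rng.standard_normal(d)   # abstract-relation encoding

# Fuse by concatenation: a 5*d vector.
fused = np.concatenate([target_enc, attr_enc, rel_enc, sim_rel_enc, abs_rel_enc])

# Project to the input dimension of the double-layer LSTM decoder
# (stand-in random weights in place of learned parameters).
d_in = 32
W = rng.standard_normal((d_in, fused.size)) / np.sqrt(fused.size)
lstm_input = W @ fused
```

In the described method this projected vector would be fed, per decoding step, into a two-layer LSTM that emits the caption tokens; attention-weighted fusion would be a common alternative to plain concatenation.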

Description

Technical field

[0001] The present invention relates to an image description generation method, specifically an image description generation method based on external triples and abstract relationships, and belongs to the field of image description generation.

Background technique

[0002] Image caption generation is a comprehensive task combining computer vision and natural language processing, and it is extremely challenging. Inspired by encoder-decoder architectures, attention mechanisms, and reinforcement learning-based training objectives from natural language processing, modern image description generation models have made remarkable progress, even surpassing humans on some evaluation metrics, and researchers are paying increasing attention to the field.

[0003] Image description generation technology continues to develop, but one problem has never been solved and cannot be ignored: the existing model is on...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06V10/762; G06V10/80; G06V10/82; G06F16/35; G06K9/62; G06N3/04; G06N3/08
Inventors: Jiang Ming, Chen Jingxiang, Zhang Min, Li Pengfei
Owner: HANGZHOU DIANZI UNIV