Image description generation method based on external triple and abstract relationship

An image description and triple technology, applied to neural learning methods, computer components, and character and pattern recognition. It addresses the problem of overly simple generated descriptions and achieves more accurate description.

Pending Publication Date: 2022-04-12
HANGZHOU DIANZI UNIV
Cites: 0 · Cited by: 4

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to address the deficiencies of the prior art by providing an image description generation method based on external triples and abstract relationships, which solves the problem that descriptions generated by traditional image description generation methods are too simple, and improves prediction accuracy over the original.



Embodiment Construction

[0013] The present invention will be further described below in conjunction with the accompanying drawings.

[0014] Referring to Figures 1 and 5, a flow diagram of an overall embodiment of the present invention is shown.

[0015] To solve these problems, the present invention constructs an external relation library, retrieves similarity relations and abstract relations from the library according to the image target categories, and integrates them with scene graph features. Specifically, an open-domain knowledge extraction tool is first used to extract triples from image description texts, build the external relation library, and encode the features of the triples. Based on the text similarity of the relations in the triples, triples with high similarity are clustered into one class, called an abstract relation. Meanwhile, the model performs target detection on the image to obtain target visual features and semantic labels. According to the text similari...
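The clustering step described above can be sketched as follows. This is a hypothetical illustration, not the patent's actual implementation: the patent excerpt does not specify the text-similarity measure, so token-level Jaccard similarity is used here as a stand-in, and the `jaccard` and `cluster_relations` names, the threshold value, and the sample triples are all assumptions.

```python
# Hypothetical sketch of the "abstract relation" step: triples whose relation
# text is similar above a threshold are grouped into one cluster (an abstract
# relation). Jaccard similarity over tokens is an assumed stand-in for the
# text-similarity measure, which this excerpt does not specify.

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two relation strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def cluster_relations(triples, threshold=0.3):
    """Greedy single-pass clustering of (subject, relation, object) triples
    by the text similarity of their relation strings."""
    clusters = []  # each cluster holds triples with mutually similar relations
    for t in triples:
        _, rel, _ = t
        for cluster in clusters:
            # Compare against the cluster's first (representative) relation.
            if jaccard(rel, cluster[0][1]) >= threshold:
                cluster.append(t)
                break
        else:
            clusters.append([t])
    return clusters

triples = [
    ("man", "is riding", "horse"),
    ("woman", "riding on", "bike"),
    ("dog", "sitting on", "grass"),
]
clusters = cluster_relations(triples)
# "is riding" and "riding on" share the token "riding" and cluster together;
# "sitting on" forms its own cluster.
```

A real system would likely use embedding-based similarity rather than token overlap, but the greedy threshold clustering shape is the same.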



Abstract

The invention discloses an image description generation method based on external triples and abstract relations. The method comprises the following steps: firstly, extracting triples from image description texts, constructing an external relation library, and feature-encoding the triples; triples whose text similarity exceeds a threshold are clustered into one class. Meanwhile, the model performs target detection on the image to obtain a target visual feature set and a target category set, and queries the external relation library, by text similarity, for triples whose targets are similar to the detected target categories. The model predicts the targets, attributes, and relations of the image from the target visual features to generate a scene graph; a convolutional neural network fuses the visual features with the text features and feature-encodes the targets, attributes, and relations. Finally, the target, attribute, and relation encodings of the scene graph are fused with the encodings of the similarity relations and the abstract relations, and the fused features are input into a double-layer LSTM sequence generation model to obtain the final image description. With the invention, the descriptions generated by the model are richer in expression.
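The final fusion step in the abstract can be sketched in NumPy. This is an assumed illustration only: the per-feature dimension, the concatenation-then-projection fusion, and all variable names are hypothetical stand-ins, since the patent excerpt does not give the actual architecture details.

```python
import numpy as np

# Hypothetical sketch of the fusion step: scene-graph encodings (target,
# attribute, relation) are concatenated with the similarity-relation and
# abstract-relation encodings, then projected to the decoder's input size.
# All dimensions and the random "learned" projection are illustrative.

rng = np.random.default_rng(0)
d = 16  # assumed per-feature encoding size

target_enc = rng.standard_normal(d)    # scene-graph target encoding
attr_enc = rng.standard_normal(d)      # scene-graph attribute encoding
rel_enc = rng.standard_normal(d)       # scene-graph relation encoding
sim_rel_enc = rng.standard_normal(d)   # similarity-relation encoding
abs_rel_enc = rng.standard_normal(d)   # abstract-relation encoding

# Fuse by concatenation: a 5*d vector.
fused = np.concatenate([target_enc, attr_enc, rel_enc, sim_rel_enc, abs_rel_enc])

# Project to the input dimension of the double-layer LSTM decoder
# (stand-in random weights in place of learned parameters).
d_in = 32
W = rng.standard_normal((d_in, fused.size)) / np.sqrt(fused.size)
lstm_input = W @ fused
```

In the described method this projected vector would be fed, per decoding step, into a two-layer LSTM that emits the caption tokens; attention-weighted fusion would be a common alternative to plain concatenation.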

Description

Technical field

[0001] The present invention relates to an image description generation method, specifically an image description generation method based on external triples and abstract relationships, and belongs to the field of image description generation.

Background technique

[0002] Image caption generation is a comprehensive task combining computer vision and natural language processing, and it is extremely challenging. Inspired by encoder-decoder architectures, attention mechanisms, and reinforcement learning-based training objectives from natural language processing, modern image description generation models have made remarkable progress, even surpassing humans on some evaluation metrics, and researchers are paying increasing attention to the field.

[0003] Image description generation technology continues to develop, but one problem has never been solved and cannot be ignored: the existing model is on...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06V10/762; G06V10/80; G06V10/82; G06F16/35; G06K9/62; G06N3/04; G06N3/08
Inventors: Jiang Ming, Chen Jingxiang, Zhang Min, Li Pengfei
Owner: HANGZHOU DIANZI UNIV