Image description generation method based on text hierarchical structure

A text-hierarchy-based image description technology, applied in the fields of neural learning methods, neural architectures, and biological neural network models. It addresses problems such as under-used semantic information, deviation from intuitive visual cognition, and descriptions that do not conform to natural language habits, and it aims to produce richer and more accurate natural language descriptions.

Active Publication Date: 2021-10-29
HUBEI UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0006] However, such methods do not distinguish non-visual words. The gradient from non-visual words may be misleading and may reduce the overall effectiveness of visual signals in guiding the generation of text descriptions. In addition, semantic information is not fully utilized and the semantic feature extraction ability is insufficient, so the generated descriptions are not vivid enough and do not conform to natural language habits.
And the image description generation model under the existing attention mechanism usually only pays attention to the local info


Embodiment Construction

[0071] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0072] The image description generation method based on the text hierarchy structure proposed by the present invention uses the VGG network model to extract local features and global features from the image in the image encoding stage. In the decoding stage, it uses a double-layer improved LSTM structure and an adaptive attention mechanism to decode the image features and translate them into natural language, as shown in Figure 1. The method...
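The adaptive attention step mentioned in [0072] can be illustrated with a minimal sketch. It assumes the widely used "visual sentinel" formulation, in which the decoder scores each local image feature together with one extra language vector and lets a softmax decide how much weight each receives. The exact equations of the invention are not given in this excerpt, so the function name `adaptive_attention`, the plain dot-product scoring, and the `sentinel` input are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def adaptive_attention(local_feats, sentinel, hidden):
    """Blend k local image features with a language 'sentinel' vector.

    local_feats: list of k region vectors (assumed encoder output)
    sentinel:    vector summarising language context (assumption)
    hidden:      current decoder hidden state
    Returns (context, beta); beta is the weight placed on the sentinel,
    i.e. how strongly the decoder relies on language rather than vision.
    """
    # Score every region and the sentinel against the hidden state.
    # Real models use learned projections; a dot product keeps this short.
    scores = [dot(v, hidden) for v in local_feats] + [dot(sentinel, hidden)]
    weights = softmax(scores)
    beta = weights[-1]
    # Context = attention-weighted regions + beta * sentinel.
    context = [
        sum(w * v[i] for w, v in zip(weights[:-1], local_feats))
        + beta * sentinel[i]
        for i in range(len(hidden))
    ]
    return context, beta


regions = [[1.0, 0.0], [0.0, 1.0]]
context, beta = adaptive_attention(regions, [0.0, 0.0], [1.0, 0.0])
```

A large `beta` corresponds to a non-visual word (for example "the" or "of"), where the decoder should lean on language context instead of image regions.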


Abstract

The invention discloses an image description generation method based on a text hierarchical structure. A double-layer LSTM decoder is constructed and a visual and language information selection mechanism is introduced: image global features, word embeddings, and an attention guiding mechanism are used to select effectively between image features and language information, so that the decoded semantic information describes the image in more accurate sentences. To address the insufficient semantic feature extraction capability of traditional language models, an ordered long short-term memory network improved by a FARIMA filter is introduced in the decoding stage. Semantic information at different text hierarchy levels is retained by encoding the hierarchical structure of sentences, and the content is semantically aligned using image spatial information. This improves the cross-modal representation capability of the decoder when aligning image features with semantic features and strengthens the long-term dependence of the network. The descriptions produced by the proposed method are richer in semantic relationships and more in line with natural language habits.
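For readers unfamiliar with FARIMA (fractionally integrated ARMA), its long-memory behaviour comes from the fractional differencing operator (1 - B)^d, where B is the backshift operator and d may be non-integer. The sketch below only computes the standard coefficients of that operator and applies them as a truncated filter to a sequence. How the invention wires such a filter into the ordered LSTM is not stated in this excerpt, so the function names and the truncation length `n` are illustrative assumptions.

```python
def frac_diff_weights(d, n):
    """First n coefficients of (1 - B)^d, with B the backshift operator.

    Uses the recurrence w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k,
    which follows from the binomial series expansion.
    """
    weights = [1.0]
    for k in range(1, n):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff_filter(series, d, n):
    """Apply the truncated filter: y_t = sum_k w_k * x_{t-k}."""
    w = frac_diff_weights(d, n)
    out = []
    for t in range(len(series)):
        # Only use lags that exist at the start of the series.
        out.append(sum(w[k] * series[t - k] for k in range(min(n, t + 1))))
    return out


# For fractional d the weights decay slowly instead of cutting off,
# so distant past values keep a small influence on the present.
slow_decay = frac_diff_weights(0.5, 3)
```

With d = 1 the filter reduces to ordinary first differencing. With 0 < d < 1 the weights shrink gradually, which is the property usually associated with long-range dependence.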

Description

Technical field

[0001] The invention relates to the fields of computer vision and natural language processing, in particular to an image description generation method based on a text hierarchy.

Background

[0002] Image description generation is regarded as one of the most important comprehensive topics at the intersection of computer vision and natural language processing. The main task of image description is to feed an image to the machine, have the machine recognize and understand the objects in the image, their attributes, and their relationships, and automatically generate a descriptive text in human natural language with correct semantics and grammar.

[0003] Image description generation not only requires the model to recognize the targets in the image, but also to recognize other visual elements, such as the actions and attributes of the targets, understand the relationships between the targets, and generate human-readable description sent...

Application Information

IPC(8): G06K9/62, G06K9/46, G06N3/04, G06N3/08
CPC: G06N3/049, G06N3/08, G06N3/045, G06F18/2415, G06F18/214, Y02D10/00
Inventor: 靳华中, 袁福祥, 包志熙, 黎林, 姚颖
Owner HUBEI UNIV OF TECH