
An image subtitle generation method based on MLL and ASCA-FR

An ASCA-FR image caption technology, applied in character and pattern recognition, instruments, computer parts, etc. It addresses the problems of low caption-description accuracy, captions that poorly reflect image content, and an overly thin training signal, and achieves captions that are accurately described, fluently expressed, and grammatical through a fuller training process.

Active Publication Date: 2019-05-03
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

The disadvantage of this method is that its visual attention model, when producing an output, considers only the feature set of the image and the word generated at the previous moment, and uses only the forward generation process from image to caption. As a result, the accuracy of the caption description is low and the caption cannot reflect the content of the image well.
The disadvantage of this method is that the loss function used to train the network is only a cross-entropy loss on the label captions, which makes the training signal too thin; the generated captions are not fluent and contain many grammatical errors.
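For concreteness, the cross-entropy objective criticized here scores each generated word only against the corresponding label-caption word. The following minimal PyTorch sketch illustrates that objective; the tensor shapes and names are illustrative assumptions, not taken from the patent.

import torch
import torch.nn.functional as F

def cross_entropy_caption_loss(logits, targets, pad_id=0):
    # logits: (batch, T, vocab) decoder word scores per time step (assumed shapes).
    # targets: (batch, T) label-caption word ids.
    # Only agreement with the label words is rewarded; nothing in this
    # objective constrains fluency or consistency with the image features.
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten to (batch*T, vocab)
        targets.reshape(-1),                  # flatten to (batch*T,)
        ignore_index=pad_id,                  # skip padded positions
    )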




Embodiment Construction

[0040] The present invention is described in further detail below with reference to Figure 1.

[0041] Referring to Figure 1, the implementation steps of the present invention are described in detail.

[0042] Step 1: Generate the natural image test set and training set.

[0043] At least 10,000 natural images are randomly selected from the Internet or public image datasets to form a natural image collection.

[0044] No more than 5000 natural images are randomly selected from the natural image collection to form a natural image test set.

[0045] Each natural image remaining in the collection is given an English label caption. Any label caption longer than L words is truncated to L words, where L denotes the maximum number of English words allowed in a caption. The truncated label captions and their corresponding natural images constitute the natural image training set, as sketched below.
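A minimal sketch of this data preparation, assuming images are referenced by file path and each label caption is a whitespace-separated English sentence; the function and variable names are hypothetical, while the 10,000 / 5,000 / L figures follow paragraphs [0043]-[0045].

import random

def build_datasets(image_paths, caption_by_path, L, test_size=5000):
    # Split the natural image collection into a test set and a
    # caption-labelled training set, truncating captions to L words.
    assert len(image_paths) >= 10000           # per paragraph [0043]
    paths = list(image_paths)
    random.shuffle(paths)                      # random selection
    test_set = paths[:test_size]               # no more than 5000 test images
    train_set = []
    for path in paths[test_size:]:
        words = caption_by_path[path].split()
        truncated = " ".join(words[:L])        # delete the part beyond L words
        train_set.append((path, truncated))
    return test_set, train_set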

[0046] Set the English end character to .

[0047] The Englis...



Abstract

The invention discloses an image caption generation method based on multi-scale learning (MLL) and a joint attention mechanism over adjacent time nodes with feature reconstruction (ASCA-FR). The invention mainly solves two problems of the prior art: the output of the attention model at a given moment considers only the feature set of the image and the word vector of the previous moment, and the network is trained with only a cross-entropy loss function, so the generated captions are described inaccurately and expressed unsmoothly. The method comprises the following specific steps: (1) generating a natural image test set and a training set; (2) extracting feature vectors; (3) constructing the ASCA-FR network; (4) training the ASCA-FR network; (5) obtaining natural image captions. Because the constructed ASCA-FR network is trained with the MLL loss function, the generated captions are accurately described and smoothly expressed.
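To make the joint attention mechanism over adjacent time nodes concrete, here is a minimal sketch in which the attention weights at step t are conditioned on both the current decoder state and the attended context of the previous step; all layer sizes, names, and wiring are assumptions for illustration and do not reproduce the patent's exact ASCA-FR network.

import torch
import torch.nn as nn

class AdjacentStepCoAttention(nn.Module):
    # Attention over image region features that also sees the attended
    # context of the adjacent (previous) time step.
    def __init__(self, feat_dim, hid_dim, attn_dim):
        super().__init__()
        self.w_feat = nn.Linear(feat_dim, attn_dim)   # image region features
        self.w_hid = nn.Linear(hid_dim, attn_dim)     # decoder state at step t
        self.w_prev = nn.Linear(feat_dim, attn_dim)   # context from step t-1
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, feats, h_t, ctx_prev):
        # feats: (batch, regions, feat_dim); h_t: (batch, hid_dim);
        # ctx_prev: (batch, feat_dim) attended context from step t-1.
        e = self.score(torch.tanh(
            self.w_feat(feats)
            + self.w_hid(h_t).unsqueeze(1)
            + self.w_prev(ctx_prev).unsqueeze(1)
        )).squeeze(-1)                                # (batch, regions)
        alpha = torch.softmax(e, dim=1)               # attention weights
        ctx_t = (alpha.unsqueeze(-1) * feats).sum(1)  # new context vector
        return ctx_t, alpha

Conditioning on ctx_prev is what distinguishes this sketch from the standard visual attention criticized above, which uses only the image feature set and the current state.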

Description

technical field

[0001] The invention belongs to the technical field of image processing, and further relates to an image caption generation method using Adjacent Step Co-Attention and Feature Reconstruction (ASCA-FR). The present invention can extract and process the semantic information in any natural image to generate an image caption corresponding to that natural image.

Background technique

[0002] For a natural image, a human needs only a quick glance to organize vivid language in the brain that describes the visual scene information in the image. With the rapid development of artificial intelligence and deep learning technology, image caption generation, as an important research topic in the field of natural language processing, has attracted more and more attention. The task of image caption generation is to automatically generate, for any natural image, a caption that is closely related to its semantic information. However, due to the complex and d...

Claims


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/62
Inventor: 何立火, 李琪琦, 高新波, 蔡虹霞, 路文, 张怡, 屈琳子, 钟炎喆, 武天妍
Owner: XIDIAN UNIV