An image caption generation method based on MLL and ASCA-FR

ASCA-FR is an image caption generation technology applied in character and pattern recognition, instruments, computer components, and related fields. It addresses the problems of low caption-description accuracy, captions that reflect image content poorly, and a thin training signal, and achieves caption expression that is fluent, grammatical, and accurate, along with a fuller training process.

Active Publication Date: 2019-05-03
XIDIAN UNIV

AI Technical Summary

Problems solved by technology

The disadvantage of this method is that its visual attention model, when producing output, considers only the image feature set and the word generated at the previous moment, and it relies solely on the forward generation process from image to caption. As a result, the accuracy of the caption description is low, and the caption cannot reflect the content of the image well.
The disadvantage of this method is that the loss function used to train the network is only a cross-entropy loss over the label captions, which makes the training signal too thin; the generated captions are consequently not fluent and contain many grammatical errors.
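The cross-entropy objective that this criticism targets is the standard teacher-forcing loss over the label captions. As a point of reference, here is a minimal sketch of that loss in PyTorch; the tensor shapes, the pad_idx convention, and the function name are illustrative assumptions rather than anything specified by the patent.

```python
import torch
import torch.nn.functional as F

def caption_cross_entropy(logits: torch.Tensor,
                          targets: torch.Tensor,
                          pad_idx: int = 0) -> torch.Tensor:
    """Teacher-forcing cross-entropy over label captions.

    logits:  (batch, seq_len, vocab_size) decoder scores
    targets: (batch, seq_len) word indices of the label caption
    Padding positions are ignored so only real words are scored.
    """
    batch, seq_len, vocab_size = logits.shape
    return F.cross_entropy(
        logits.reshape(batch * seq_len, vocab_size),
        targets.reshape(batch * seq_len),
        ignore_index=pad_idx,
    )
```

Because this loss only scores each next word against the ground-truth prefix, it carries no signal about the fluency of whole generated sentences; that is the thinness of supervision the patent attributes to the prior art.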



Examples


Embodiment Construction

[0040] The present invention is described in further detail below with reference to Figure 1.

[0041] With reference to Figure 1, the implementation steps of the present invention are described in further detail.

[0042] Step 1: Generate the natural image test set and training set.

[0043] At least 10,000 natural images are randomly selected from the Internet or public image datasets to form a natural image collection.

[0044] No more than 5000 natural images are randomly selected from the natural image collection to form a natural image test set.

[0045] Configure an English label caption for each remaining natural image in the natural image set, and delete the part of each English label caption longer than L words, where L denotes the maximum number of English words allowed in a caption. The truncated label captions, together with their corresponding natural images, constitute the natural image training set.

[0046] Set the English end character to .

[0047] The Englis...
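The excerpt breaks off above, but as described, Step 1 reduces to a random train/test split of the image collection plus truncation of every English label caption to at most L words. A minimal sketch under that reading follows; the function name, the caption storage as a dict, and the fixed seed are illustrative assumptions, not the patent's specification.

```python
import random

def build_splits(images, captions, L, test_size=5000, seed=0):
    """Split a natural-image collection as in Step 1.

    images:   list of image identifiers (at least 10,000 of them)
    captions: dict mapping image id -> English label caption (string)
    L:        maximum number of English words kept per caption
    """
    rng = random.Random(seed)
    shuffled = list(images)
    rng.shuffle(shuffled)
    test_set = shuffled[:test_size]       # no more than 5,000 test images
    # Truncate each remaining caption to its first L words; the
    # truncated captions with their images form the training set.
    train_set = [
        (img, " ".join(captions[img].split()[:L]))
        for img in shuffled[test_size:]
    ]
    return train_set, test_set
```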



Abstract

The invention discloses an image caption generation method based on multi-scale learning (MLL) and ASCA-FR, a joint attention mechanism over adjacent time nodes with feature reconstruction. The invention mainly addresses two problems of the prior art: the output of the attention model at a given moment considers only the feature set of the image and the word vector of the previous moment, and the network is trained with only a cross-entropy loss function, which together yield inaccurate caption descriptions and unfluent expression. The method comprises the following specific steps: (1) generate a natural image test set and training set; (2) extract feature vectors; (3) construct the ASCA-FR network; (4) train the ASCA-FR network; (5) obtain natural image captions. By training the constructed ASCA-FR network with the MLL loss function, the method produces captions that are described accurately and expressed fluently.

Description

technical field

[0001] The invention belongs to the technical field of image processing, and further relates to an image caption generation method based on ASCA-FR (Adjacent Step Co-Attention and Feature Reconstruction). The present invention can extract and process the semantic information in any natural image to generate an image caption corresponding to that natural image.

Background technique

[0002] For a natural image, humans can, with just a quick glance, organize vivid language in the brain to describe the visual scene information in the image. With the rapid development of artificial intelligence and deep learning technology, image caption generation, as an important research topic in the field of natural language processing, has attracted more and more attention. The task of image caption generation is to automatically generate, for any natural image, captions that are closely related to its semantic information. However, due to the complex and d...
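The description is cut off before the network itself is defined, so the following is only an interpretive sketch of what adjacent-step co-attention with feature reconstruction could look like: attention weights at step t are conditioned on the decoder hidden states of two adjacent time steps, and a reconstruction branch regresses the attended context back toward the image features as an auxiliary loss. Every module name, dimension, and wiring choice here is an assumption, not the patented architecture.

```python
import torch
import torch.nn as nn

class AdjacentStepCoAttention(nn.Module):
    """One decoding step: co-attention over image regions driven by
    the hidden states of adjacent time steps (h_{t-1} and h_t), plus
    a feature-reconstruction term. Illustrative only."""

    def __init__(self, feat_dim: int, hid_dim: int):
        super().__init__()
        # Attention scores see the region feature and both adjacent
        # hidden states.
        self.att = nn.Linear(feat_dim + 2 * hid_dim, 1)
        # Reconstruction branch maps the attended context back into
        # image-feature space.
        self.recon = nn.Linear(feat_dim, feat_dim)

    def forward(self, feats, h_prev, h_curr):
        # feats: (batch, regions, feat_dim); h_*: (batch, hid_dim)
        n = feats.size(1)
        h = torch.cat([h_prev, h_curr], dim=-1)      # (batch, 2*hid_dim)
        h = h.unsqueeze(1).expand(-1, n, -1)         # broadcast over regions
        scores = self.att(torch.cat([feats, h], dim=-1))
        alpha = torch.softmax(scores, dim=1)         # attention over regions
        context = (alpha * feats).sum(dim=1)         # (batch, feat_dim)
        # Auxiliary loss: the reconstructed context should stay close
        # to the (mean) image features, supplementing cross-entropy.
        recon_loss = ((self.recon(context) - feats.mean(dim=1)) ** 2).mean()
        return context, recon_loss
```

Under this reading, the total training objective would combine the caption cross-entropy with the reconstruction term, which is one plausible way the MLL loss mentioned in the abstract could be assembled.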


Application Information

Patent Type & Authority: Application (China)
IPC(8): G06K9/62
Inventors: He Lihuo, Li Qiqi, Gao Xinbo, Cai Hongxia, Lu Wen, Zhang Yi, Qu Linzi, Zhong Yanzhe, Wu Tianyan
Owner: XIDIAN UNIV