Image title generation method based on conditional embedding pre-training language model

A language model and pre-training technology, applied in the fields of neural learning methods, biological neural network models, and computer components, that addresses the problem that a pre-trained language model cannot continuously learn from image information, and that achieves good robustness and adaptability.

Active Publication Date: 2021-07-20
HANGZHOU DIANZI UNIV
18 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The method of the present invention addresses the problem that a pre-trained language model cannot continuously learn from image information when performing downstream tasks.

Examples

Embodiment 1

[0060] As shown in Figure 6, the targets detected by the target detection algorithm are: flower, vase, lavender. A keyword set W = {flower, vase, lavender} is constructed, and the input sequence S is composed of the keyword set and the special characters improved in steps 1-2. S is input into the CE-UNILM model, and the predicted result is: a flower in a vase of purple lavender.
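
The construction of S from the keyword set can be sketched as follows. This is a minimal illustration, not the patent's implementation: the special characters defined in steps 1-2 are not reproduced in this excerpt, so the `[CLS]` and `[SEP]` tokens and the helper name `build_input_sequence` are hypothetical placeholders.

```python
from typing import List

# Hypothetical special tokens. The patent's own special characters
# (defined in its steps 1-2) are not shown in this excerpt.
CLS, SEP = "[CLS]", "[SEP]"


def build_input_sequence(detected: List[str]) -> List[str]:
    """Compose an input sequence S from detected object labels."""
    # Build the keyword set W: drop duplicates, keep detection order.
    keywords = list(dict.fromkeys(detected))
    # Wrap the keywords in the (assumed) start and separator tokens.
    return [CLS] + keywords + [SEP]


# Labels from Embodiment 1, as returned by a target detector.
W = ["flower", "vase", "lavender"]
S = build_input_sequence(W)
print(S)
```

In this sketch the title tokens would be generated after the separator by the language model, which is outside the scope of the example.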

Abstract

The invention discloses an image title generation method based on a conditional embedding pre-training language model. It provides a network built on a pre-trained language model, called CE-UNILM. A KEN is constructed at the input end of the pre-trained language model UNILM; the KEN runs a target detection method on the image and feeds the detected results into the model as key text information through keyword embedding. A VEN is constructed to extract image features, and the encoded image is fed into the model through conditional embedding. The invention also proposes the CELN, a mechanism that uses the visual embedding to adjust feature selection in the pre-trained language model; the CELN is applied to the transformer in the unified pre-trained language model. The results show that the method has better robustness and adaptive ability.
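
One way to picture the CELN mechanism is as a layer normalization whose gain and bias are shifted by projections of the visual embedding. The sketch below assumes that form; the abstract does not give the exact formula, and the function name, parameter names, and projection matrices `w_gamma` and `w_beta` are illustrative assumptions.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def conditional_layer_norm(x: Vector, cond: Vector, gamma: Vector, beta: Vector,
                           w_gamma: Matrix, w_beta: Matrix,
                           eps: float = 1e-5) -> Vector:
    """Layer norm whose gain and bias depend on a condition vector (assumed form)."""
    # Standard layer-norm statistics over the hidden dimension.
    mean = sum(x) / len(x)
    var = sum((xi - mean) ** 2 for xi in x) / len(x)
    # Condition-dependent offsets added to the learned gain and bias.
    d_gamma = matvec(w_gamma, cond)
    d_beta = matvec(w_beta, cond)
    return [
        (g + dg) * (xi - mean) / math.sqrt(var + eps) + (b + db)
        for xi, g, dg, b, db in zip(x, gamma, d_gamma, beta, d_beta)
    ]


hidden = [1.0, 2.0, 3.0, 4.0]   # one token's hidden state
visual = [0.5, -0.5]            # stand-in for a visual embedding
gamma, beta = [1.0] * 4, [0.0] * 4
zeros = [[0.0, 0.0] for _ in range(4)]

# With zero projections the layer reduces to ordinary layer normalization.
plain = conditional_layer_norm(hidden, visual, gamma, beta, zeros, zeros)
# A non-zero bias projection shifts every feature by a condition-dependent amount.
shifted = conditional_layer_norm(hidden, visual, gamma, beta, zeros,
                                 [[2.0, 0.0] for _ in range(4)])
```

Initializing the projections to zero, as in `plain`, leaves the pre-trained layer normalization unchanged at the start of fine-tuning, which is a common design choice for conditional normalization layers.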

Description

Technical field

[0001] The invention belongs to the technical field of image description and relates to a method for generating image titles, in particular a method based on a conditional embedding pre-trained language model.

Background art

[0002] Large-scale pre-trained language models have greatly improved performance on text understanding and text generation tasks. This has also changed how researchers work, making the fine-tuning of pre-trained language models for downstream tasks a mainstream approach. Research on image-text and speech-text tasks is growing, with applications including image captioning, video captioning, image question answering, and video question answering.

[0003] Compared with the traditional encoder-decoder pipeline, pre-trained language models achieve excellent results on natural language processing tasks. This is because articles and sentences are inherent...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06K9/46, G06N3/04, G06N3/08
CPC: G06N3/08, G06V10/40, G06N3/044, G06F18/2411, G06F18/214
Inventor: 张旻, 林培捷, 李鹏飞, 姜明, 汤景凡
Owner: HANGZHOU DIANZI UNIV