Image multi-subtitle automatic generation method based on multiscale hierarchical residual network

An automatic, multi-scale generation technology applied in the field of multi-subtitle acquisition. It addresses the problem that existing methods easily ignore image details, and achieves the effects of solving the gradient vanishing and gradient explosion problems, reducing parameters, and augmenting the funnel structure.

Active Publication Date: 2018-03-27
ZHEJIANG GONGSHANG UNIVERSITY

AI Technical Summary

Problems solved by technology

[0008] Detection-based methods: although sequence-based methods achieve high accuracy on the subtitle acquisition task...

Method used



Examples


Example Embodiment

[0052] In order to describe the present invention in more detail, the technical solution of the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0053] The multi-subtitle acquisition method provided in this embodiment can obtain a non-fixed number of categorical target descriptors in an image, and can be applied to semantic image search, the visual intelligence of chatbots, and the acquisition of subtitles for images and videos shared on social media.

[0054] Using the image multi-caption automatic generation method based on the multi-scale hierarchical residual network of this embodiment to semantically describe the targets in an image involves two parts: training and testing. Before describing these two parts, the multi-caption generation model adopted in this embodiment is introduced below.

[0055] Figure 1 is a schematic diagram of the framework of the multi-subtitle...
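Although the framework description is truncated above, the abstract indicates that a funnel (hourglass-style) network is used to capture multi-scale target information before caption decoding. The following is a minimal sketch of such a funnel encoder in PyTorch; the class name, channel sizes, and layer choices are illustrative assumptions and not the patented architecture.

import torch
import torch.nn as nn

class FunnelEncoder(nn.Module):
    # Downsample to a coarse representation, then upsample and fuse with the
    # fine path so that both large and small targets stay represented.
    def __init__(self, channels=64):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.fuse = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())

    def forward(self, img):
        fine = self.stem(img)                      # full-resolution features
        coarse = self.down(fine)                   # funnel neck: coarse, high-level features
        return self.fuse(self.up(coarse) + fine)   # multi-scale fusion

encoder = FunnelEncoder()
feats = encoder(torch.randn(1, 3, 224, 224))       # -> (1, 64, 224, 224) feature map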



Abstract

The invention discloses an image multi-subtitle automatic generation method based on a multi-scale hierarchical residual network, which adopts an improved funnel network to capture multi-scale target information. When the funnel framework network is constructed, a densely connected aggregation residual block is put forward, and a residual LSTM (Long Short-Term Memory) is further put forward in order to solve the problems of gradient vanishing and gradient explosion. The method achieves high experimental performance and has an obvious advantage on the multi-subtitle acquisition task.
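The residual LSTM mentioned above is described only at a high level on this page. As a hedged illustration of the general idea, the sketch below adds a residual (identity) shortcut around a standard LSTM cell so that gradients can bypass the recurrent transform; the projection layer and the placement of the skip connection are assumptions, not the patent's exact formulation.

import torch
import torch.nn as nn

class ResidualLSTMCell(nn.Module):
    # Standard LSTM cell with a shortcut from the input to the hidden output,
    # giving gradients a path around the recurrent transformation.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.proj = (nn.Identity() if input_size == hidden_size
                     else nn.Linear(input_size, hidden_size))

    def forward(self, x, state):
        h, c = self.cell(x, state)
        return h + self.proj(x), c                 # residual shortcut on the hidden state

cell = ResidualLSTMCell(512, 512)
h = torch.zeros(8, 512)
c = torch.zeros(8, 512)
h, c = cell(torch.randn(8, 512), (h, c))           # one decoding step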

Description

Technical field

[0001] The invention relates to multi-subtitle acquisition technology, and in particular to an image multi-subtitle automatic generation method based on a multi-scale hierarchical residual network.

Background technique

[0002] Multi-caption acquisition means obtaining a non-fixed number of categorical target descriptors in an image. This work serves as a foundational service for many important applications, such as semantic image search, visual intelligence for chatbots, sharing images and videos on social media, helping people perceive the world around them, and more.

[0003] Current studies combine convolutional neural networks and recurrent neural networks to predict captions from image feature maps. However, several bottlenecks have been encountered in improving performance: 1) target detection is still an open problem in computer vision; 2) the mapping from image feature space to description space is a nonlinear multimodal mapping; 3) a deeper network makes it easier to learn thi...
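For context, the conventional pipeline referred to in paragraph [0003] pairs a convolutional encoder with a recurrent decoder: the image feature seeds the recurrent state and the decoder predicts the caption word by word. A minimal sketch is given below; the layer sizes and vocabulary size are illustrative assumptions rather than values from the patent.

import torch
import torch.nn as nn

class CaptionDecoder(nn.Module):
    # Baseline CNN-feature-to-caption decoder: the image feature initializes the
    # LSTM state, and the LSTM scores the next word at every step.
    def __init__(self, feat_dim=2048, embed_dim=256, hidden_dim=512, vocab_size=10000):
        super().__init__()
        self.init_h = nn.Linear(feat_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_feat, tokens):
        h0 = torch.tanh(self.init_h(image_feat)).unsqueeze(0)   # (1, B, hidden)
        c0 = torch.zeros_like(h0)
        y, _ = self.lstm(self.embed(tokens), (h0, c0))
        return self.out(y)                                      # (B, T, vocab) scores

decoder = CaptionDecoder()
scores = decoder(torch.randn(4, 2048), torch.randint(0, 10000, (4, 12)))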

Claims


Application Information

IPC(8): G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/08, G06V30/413, G06V30/40, G06N3/045
Inventor: 田彦 (Tian Yan), 王勋 (Wang Xun), 黄刚 (Huang Gang)
Owner: ZHEJIANG GONGSHANG UNIVERSITY