
Method and device of training a captioning model, computer equipment and storage medium

A captioning-model technology, applied in the field of training captioning models, that solves problems such as low training quality, high training difficulty, and heavy memory and data consumption, and achieves the effects of simplifying the training process, improving training quality, and saving memory and data.

Pending Publication Date: 2020-10-30
TENCENT AMERICA LLC

AI Technical Summary

Problems solved by technology

[0006] Embodiments of the present application provide a method and device for training captioning models, computer equipment, and storage media, aiming to solve the problems that existing captioning-model training methods consume both memory and data, that training is difficult, and that training quality is not high.


Embodiment Construction

[0017] Currently, great progress has been made in image and video captioning, much of it driven by advances in machine translation. For example, the encoder-decoder framework and the attention mechanism were first introduced in machine translation and later extended to captioning. Both image captioning and video captioning methods follow this pipeline and apply an attention mechanism during caption generation. Compared with image captioning, video captioning describes dynamic scenes rather than static ones.
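The attention mechanism mentioned above can be sketched in a few lines. This is a minimal dot-product attention over per-frame video features, not the patent's specific formulation; the names (`attend`, `encoder_feats`, `decoder_state`) are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_state, encoder_feats):
    """Dot-product attention: score each frame's feature vector
    against the current decoder state, normalize the scores into
    weights, and return the weighted sum as the context vector."""
    scores = encoder_feats @ decoder_state   # (T,) one score per frame
    weights = softmax(scores)                # (T,) sums to 1
    context = weights @ encoder_feats        # (D,) attended feature
    return context, weights

# toy example: 4 video frames with 3-dim features
feats = np.array([[1., 0., 0.],
                  [0., 1., 0.],
                  [0., 0., 1.],
                  [1., 1., 0.]])
state = np.array([1., 0., 0.])
context, weights = attend(state, feats)
```

At each decoding step the caption generator would recompute `weights`, letting it focus on different frames for different words.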

[0018] As can be seen in Figure 1, video captioning is much more difficult due to larger appearance variations. Some related techniques propose boundary-aware long short-term memory (LSTM) units to automatically detect temporal video segments. Some related techniques integrate natural language knowledge into their networks by training linguistic LSTM models on large external text datasets. Some related techniques extend the Gated Recur...



Abstract

A method of training a captioning model used to perform automatic video captioning of an input video, including initializing a plurality of long short-term memory (LSTM) units included in the captioning model using cross-entropy loss; training the LSTM units using reinforcement learning; training the LSTM units and a plurality of convolutional neural networks (CNNs) included in the captioning model using multitask training; and generating a video caption corresponding to the input video using the captioning model.
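The abstract describes three training stages: cross-entropy initialization, reinforcement learning, and multitask training. A minimal sketch of the losses behind the first two stages, assuming standard cross-entropy and REINFORCE formulations; the function names are illustrative, not from the patent:

```python
import numpy as np

def cross_entropy(probs, target_idx):
    """Stage 1: maximum-likelihood (cross-entropy) loss on the
    ground-truth token, used to initialize the LSTM decoder."""
    return -np.log(probs[target_idx])

def reinforce_loss(log_prob_sampled, reward, baseline):
    """Stage 2: policy-gradient surrogate loss; the advantage
    (reward - baseline) scales the sampled caption's log-probability,
    so captions scoring above the baseline are reinforced."""
    return -(reward - baseline) * log_prob_sampled

def training_schedule():
    """Stage order described in the abstract."""
    return ["initialize LSTMs with cross-entropy",
            "train LSTMs with reinforcement learning",
            "train LSTMs and CNNs with multitask training"]

# toy next-token distribution over a 3-word vocabulary
probs = np.array([0.1, 0.7, 0.2])
ce = cross_entropy(probs, 1)   # loss on ground-truth token index 1
rl = reinforce_loss(np.log(0.7), reward=0.8, baseline=0.5)
```

In stage 2 the reward is typically a caption-quality metric of the sampled caption, with a baseline (for example, the greedy caption's score) subtracted to reduce gradient variance; stage 3 then fine-tunes the CNN feature extractors jointly with the decoder.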

Description

[0001] Priority Information

[0002] This application claims priority to U.S. Application No. 16/396,924, entitled "End-to-End Video Captioning with Multi-Task Reinforcement Learning," filed April 29, 2019, the entire contents of which are incorporated by reference in this application.

Technical Field

[0003] This application relates to video captioning technology. Specifically, the present application relates to a method and device for training a captioning model, a computer device, and a storage medium.

Background

[0004] Video captions are crucial for many downstream applications such as video retrieval, indexing, and browsing. Existing video captioning methods are trained component by component, and the quality of the overall system is limited by the performance of each individual component.

[0005] End-to-end (E2E) training in related techniques is often hampered by hardware constraints (e.g., graphics processing unit (GPU) memory) and is prone to overfit...


Application Information

IPC(8): H04N21/488; H04N21/81; H04N5/278; G06N3/04; G06N3/08
CPC: H04N21/4884; H04N21/8133; H04N5/278; G06N3/08; G06N3/044; G06N3/045; G06N3/006; G06N3/088; G06V20/41; G06V10/82; G06V10/764; G06F18/2413; G06N20/00; G06V20/47; G06F18/217
Inventor 宫博庆
Owner TENCENT AMERICA LLC