GRU codec training method and audio abstract generation method and device

This codec training technology, applied to audio summary generation and GRU codec training, addresses three problems: search results that differ from the expected content, manual labeling that is time-consuming and labor-intensive, and a degraded user experience. Its effects are to save human, material, and financial resources and to improve efficiency and accuracy.

Active Publication Date: 2019-09-10
AISPEECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] As a result, people searching for related content often get results that differ from what they expected, which harms the user experience.
For this reason, operators of audio and video application software have had to manually label the content uploaded by users and store the labels in association with the corresponding audio and video.
[0005] This approach is time-consuming and labor-intensive, and its labor costs are high.
Moreover, because each annotator's understanding is limited, the labels often lack diversity, so when a user searches with wording that differs from the labels, the audio and video content that actually matches cannot be found.

Embodiment Construction

[0046] To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of this application.

[0047] This application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, progra...

Abstract

The invention discloses a GRU codec training method. The method comprises: segmenting a sample audio into a plurality of sub-sample audio segments; obtaining sample Fbank filter features of the sub-sample audio segments; inputting the sample Fbank filter features into a to-be-trained GRU encoder to obtain a fixed-length sample feature vector; inputting the sample feature vector into a to-be-trained GRU decoder to obtain the corresponding sample word vector embedding and sample hidden layer vector; generating the corresponding reference sample word vector embedding from the sample annotation sentence associated with the sample audio; generating a word-level cross-entropy loss from the sample word vector embedding and the reference sample word vector embedding; and optimizing the network parameters of the to-be-trained GRU decoder and GRU encoder based at least on the word-level cross-entropy loss. The method makes it possible to generate a text summary automatically from audio, saves manpower, material resources, and financial resources, and greatly improves efficiency and accuracy.
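
Three of the steps named in the abstract can be illustrated with a small pure-Python sketch: splitting audio into sub-segments, folding a variable-length frame sequence into a fixed-length vector with a GRU, and averaging a word-level cross-entropy. This is not the patented implementation. The names `segment`, `GRUCell`, `encode`, and `word_cross_entropy` are hypothetical, the weights are random and untrained, biases are omitted, the frames are stand-ins rather than real Fbank features, and computing the loss over vocabulary logits is an assumption about how the word-level loss might be realized.

```python
import math
import random


def segment(samples, seg_len):
    """Split a sample sequence into consecutive segments; the last may be shorter."""
    return [samples[i:i + seg_len] for i in range(0, len(samples), seg_len)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Simplified GRU cell (no biases), for illustration only."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], reset_h))]
        # Blend the previous state with the candidate state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def encode(cell, frames):
    """Run the cell over all frames; the final state has a fixed length."""
    h = [0.0] * cell.hidden_dim
    for x in frames:
        h = cell.step(x, h)
    return h


def word_cross_entropy(logits, target_ids):
    """Mean negative log-likelihood of the reference word at each position."""
    total = 0.0
    for row, target in zip(logits, target_ids):
        m = max(row)
        log_norm = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_norm - row[target]
    return total / len(target_ids)


if __name__ == "__main__":
    rng = random.Random(0)
    cell = GRUCell(input_dim=3, hidden_dim=5, rng=rng)
    short = [[0.1, 0.2, 0.3]] * 2   # stand-in feature frames
    long = [[0.1, 0.2, 0.3]] * 7
    print(len(encode(cell, short)), len(encode(cell, long)))
```

Whatever the number of input frames, `encode` returns a vector whose length equals the hidden size, which is the property that lets a decoder consume audio of any duration. In practice a deep-learning framework would supply the GRU, the feature extraction, and the gradient-based parameter updates.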

Description

Technical field
[0001] The present application relates to the technical field of artificial intelligence, and in particular to a GRU codec training method and an audio summary generation method and device.
Background
[0002] With the rapid development and maturity of smart terminal devices (smartphones, tablet computers, etc.), multimedia (audio, video, etc.) application software has become widespread (for example, Himalaya, Douyin, Kuaishou, etc.).
[0003] Users can upload content to these applications, or search them to listen to or watch content they are interested in. At present, however, audio and video content is usually characterized only by the name the uploader gives it, and practice has shown that this name often does not match the actual content (for example, some uploaders, simply to attract clicks, name their own audio and video after a current hot event, in order to achieve th...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/635, G06K9/62, G06N3/04, G10L19/16, G10L19/26
CPC: G06F16/635, G10L19/26, G10L19/167, G06N3/045, G06F18/214
Inventor: 吴梦玥, 俞凯, 徐薛楠, 丁翰林
Owner: AISPEECH CO LTD