Audio text alignment method and system based on Doc2Vec

An audio-text alignment technology, applied to the production of audio-paired e-books, that addresses reduced recognition accuracy, long decoding times, and high space complexity.

Active Publication Date: 2021-07-30
BEIJING UNIV OF POSTS & TELECOMM +1
Cites: 5; Cited by: 0

AI Technical Summary

Problems solved by technology

Running speech recognition directly on long audio both takes a long time to decode and reduces the accuracy of the recognition output.
The second problem is the alignment algorithm. Existing alignment algorithms are mature, but they still suffer from high time complexity and high space complexity.

Embodiment Construction

[0064] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0065] The embodiment of the present invention discloses an audio-text alignment method based on Doc2Vec. The ultimate goal of audio-text alignment is to establish an association relationship between audio and text in the time dimension, that is, to find the corresponding text content in the audio time interval. There are generally three alignment levels for audio and text: paragraph alignment, sentence alignment, and word alignment. Since eBooks are inherentl...

Abstract

The invention discloses an audio-text alignment method and system based on Doc2Vec. The method comprises the following steps: estimating a threshold with AIC-FCM optimized by a simulated annealing genetic algorithm, and segmenting book-length audio into sentence-level short audio clips; performing speech recognition on the short audio clips and outputting sentence-level short texts; extracting paragraphs from the e-book with a Doc2Vec model to obtain paragraph-level texts; and matching the short texts against the paragraph texts by text similarity, using a dynamic matching method based on threshold prediction, to complete the text alignment. Compared with traditional audio-text alignment algorithms, the long-audio segmentation is closer to the ideal segmentation result, the alignment quality is roughly equal to that of Doc2Vec, and the time complexity is reduced by about 35%.
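The final matching step can be illustrated with a minimal sketch. It is not the patented method: a bag-of-words cosine similarity stands in for Doc2Vec embeddings, a fixed `threshold` stands in for the predicted threshold, and the names `bow_vector`, `cosine`, and `align` are hypothetical. The only idea it borrows from the abstract is that each recognized sentence is compared with paragraph texts and accepted only when the similarity clears a threshold.

```python
import math
import re
from collections import Counter


def bow_vector(text):
    """Lowercased bag-of-words counts (assumed stand-in for a Doc2Vec vector)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(short_texts, paragraphs, threshold=0.3):
    """Map each recognized sentence to a paragraph index, or None if unmatched.

    Assumes the audio follows the book's reading order, so only the current
    paragraph and the one after it are candidates for each sentence.
    """
    para_vecs = [bow_vector(p) for p in paragraphs]
    assignments = []
    current = 0
    for sentence in short_texts:
        vec = bow_vector(sentence)
        candidates = range(current, min(current + 2, len(paragraphs)))
        # max() keeps the first best candidate, so ties stay in the current paragraph.
        best = max(candidates, key=lambda i: cosine(vec, para_vecs[i]), default=None)
        if best is not None and cosine(vec, para_vecs[best]) >= threshold:
            current = best
            assignments.append(best)
        else:
            # Similarity below the (assumed fixed) threshold: leave unaligned.
            assignments.append(None)
    return assignments
```

Restricting candidates to the current and next paragraph keeps the cost linear in the number of sentences, which is one simple way to avoid comparing every sentence with every paragraph. A real system would replace `bow_vector` with vectors inferred by a trained Doc2Vec model.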

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and more specifically to the alignment of audio and text, so as to improve the efficiency and quality of producing audio-paired e-books.

Background technique

[0002] Based on the association between audiobooks and e-books, e-books that can play sound can be produced. This kind of e-book has important practical value in education, especially in language education. However, such e-books are not yet widely produced, because they mostly rely on manual annotation, which limits their development to a large extent. The audio-text technology that would make this type of book technically feasible still has some problems. First, in terms of audio processing, the audio accompanying the book is generally read and recorded by professionals according to the text of the e-book, so it has the ability to match th...

Application Information

IPC(8): G06F40/194, G06F16/35, G06F16/65, G06N3/12
CPC: G06F40/194, G06F16/35, G06F16/65, G06N3/126
Inventors: 陈科良, 崔岩松, 任维政, 张晓欢, 樊昌熙, 孙孟寒, 张帅, 崔晨岩
Owner: BEIJING UNIV OF POSTS & TELECOMM