A doc2vec-based audio text alignment method and system

An audio-text alignment technique, applied to the production of e-books with synchronized audio. It addresses long decoding times, reduced recognition accuracy, and high time complexity, with the main effect of reducing time complexity.

Active Publication Date: 2021-12-21
BEIJING UNIV OF POSTS & TELECOMM +1

AI Technical Summary

Problems solved by technology

Recognizing long audio directly incurs a large decoding time and also reduces the accuracy of the recognition output.
The second problem is the alignment algorithm. Existing alignment algorithms are mature, but they still suffer from high time complexity and high space complexity.
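
The first problem is commonly handled by cutting long audio at pauses before recognition. The sketch below is a minimal illustration of that idea, assuming per-frame energies have already been computed. The function name `split_on_silence`, the fixed `threshold`, and the `min_silence` parameter are hypothetical, and the fixed threshold is a stand-in, not the AIC-FCM threshold estimation that the abstract names.

```python
from typing import List, Tuple


def split_on_silence(
    energies: List[float], threshold: float, min_silence: int = 2
) -> List[Tuple[int, int]]:
    """Return [start, end) frame ranges of speech.

    A segment ends once at least `min_silence` consecutive frames fall
    below `threshold` (a fixed value here; an assumption for illustration).
    """
    segments: List[Tuple[int, int]] = []
    start = None  # first frame of the current speech segment
    quiet = 0     # consecutive quiet frames seen inside the segment
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence:
                # Close the segment just after its last loud frame.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Audio ended mid-segment; trim any trailing quiet frames.
        segments.append((start, len(energies) - quiet))
    return segments


frame_energies = [0.0, 0.9, 0.8, 0.1, 0.7, 0.0, 0.0, 0.0, 0.6, 0.9, 0.0]
sentence_frames = split_on_silence(frame_energies, threshold=0.5)
```

A single quiet frame (index 3 in the example) does not split the segment, which is why the pause length matters as much as the threshold itself.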

Examples

Embodiment Construction

[0064] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0065] The embodiment of the present invention discloses an audio-text alignment method based on Doc2Vec. The ultimate goal of audio-text alignment is to establish an association relationship between audio and text in the time dimension, that is, to find the corresponding text content in the audio time interval. There are generally three alignment levels for audio and text: paragraph alignment, sentence alignment, and word alignment. Since eBooks are inherentl...
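
The association between audio time intervals and text can be pictured as a list of time-stamped spans. The following is a minimal sketch of such a structure at sentence level; the `AlignedSpan` class and `text_at` helper are hypothetical names introduced for illustration and do not come from the patent.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlignedSpan:
    start: float  # start time in seconds
    end: float    # end time in seconds (exclusive)
    text: str     # text spoken during [start, end)


def text_at(spans: List[AlignedSpan], t: float) -> Optional[str]:
    """Return the text aligned with time t, or None if t falls in a gap.

    Assumes `spans` is sorted by start time and non-overlapping.
    """
    starts = [s.start for s in spans]
    i = bisect_right(starts, t) - 1  # last span starting at or before t
    if i >= 0 and t < spans[i].end:
        return spans[i].text
    return None


spans = [
    AlignedSpan(0.0, 2.5, "First sentence."),
    AlignedSpan(3.0, 5.0, "Second sentence."),
]
```

The same structure serves paragraph or word alignment by changing what each span's `text` holds.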

Abstract

The invention discloses a Doc2Vec-based audio-text alignment method and system. The method includes: performing threshold estimation based on AIC-FCM optimized by a simulated annealing genetic algorithm; dividing the long audio accompanying the book into short, sentence-level audio segments; performing speech recognition on the short audio to output sentence-level short texts; extracting paragraphs from the e-book based on the Doc2Vec model to obtain paragraph-level texts; and performing text similarity matching between the short texts and the paragraph texts, using a dynamic matching method based on threshold prediction, to complete the text alignment. Compared with the traditional audio-text alignment algorithm, the method comes closer to the ideal segmentation result in long-audio segmentation, its alignment effect is basically the same as that of Doc2vec, and the time complexity is reduced by about 35%.
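
The matching step can be illustrated with a small sketch. It is not the patent's algorithm: bag-of-words cosine similarity stands in for Doc2Vec paragraph vectors so that the example runs with the standard library alone, and the advance rule, the `align` function, and the fixed `threshold` are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a Doc2Vec embedding (assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def align(sentences: List[str], paragraphs: List[str],
          threshold: float = 0.2) -> List[int]:
    """Assign each recognized sentence a paragraph index, in reading order.

    The pointer only moves forward: a sentence advances to the next
    paragraph when it is more similar to that paragraph than to the
    current one and the similarity reaches `threshold`.
    """
    para_vecs = [embed(p) for p in paragraphs]
    current = 0
    assignment: List[int] = []
    for sentence in sentences:
        vec = embed(sentence)
        if current + 1 < len(para_vecs):
            sim_current = cosine(vec, para_vecs[current])
            sim_next = cosine(vec, para_vecs[current + 1])
            if sim_next >= threshold and sim_next > sim_current:
                current += 1
        assignment.append(current)
    return assignment


paragraphs = ["the cat sat on the mat", "dogs bark loudly at night"]
recognized = ["the cat sat", "on the mat", "dogs bark", "loudly at night"]
assignment = align(recognized, paragraphs)
```

Because each sentence is compared with only the current and the next paragraph, the number of similarity computations grows linearly with the number of sentences instead of with sentences times paragraphs.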

Description

Technical field

[0001] The invention relates to the technical field of natural language processing, and more specifically to the problem of aligning audio with text, so as to improve the efficiency and quality of producing e-books with synchronized audio.

Background art

[0002] Based on the association between audiobooks and e-books, e-books that can play sound can be produced. This kind of e-book has important practical value in education, especially in language education scenarios. However, the production of such e-books is currently not widespread. The reason is that such books mostly rely on manual annotation for production, which limits their development to a large extent. Audio-text alignment, the technology that makes this type of book feasible, still has some problems. First of all, in terms of audio processing, the audio accompanying the book is generally read and recorded by professionals according to the text of the e-book, so it has the ability to match th...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/194, G06F16/35, G06F16/65, G06N3/12
CPC: G06F40/194, G06F16/35, G06F16/65, G06N3/126
Inventor: 陈科良, 崔岩松, 任维政, 张晓欢, 樊昌熙, 孙孟寒, 张帅, 崔晨岩
Owner: BEIJING UNIV OF POSTS & TELECOMM