A high-performance audio and video automatic sentence segmentation method and system

A high-performance audio and video technology, applied in the field of automatic sentence segmentation for audio and video. It addresses the problems of high time and labor cost, misaligned start and end points, and the slow speed of manual dictation, and achieves savings in time and labor cost, elimination of the influence of noise or background sound, and improved processing efficiency.

Active Publication Date: 2020-12-11
深圳亿幕信息科技有限公司


Problems solved by technology

Because manual marking involves a certain delay, the captured start and end points are often misaligned and require additional manual adjustment.
As a result, the whole process takes a long time and the accuracy is low.
For example, 30 minutes of audio typically requires 40 minutes to 1 hour of segmentation work, which is extremely inefficient.
During subtitle production, if sentences are not segmented first and transcription is instead done directly by dictation, the work is difficult to parallelize, and human dictation is slower than automatic machine segmentation, adding considerable time and labor cost.

Method used



Examples


Embodiment 1

[0044] As shown in Figure 1, Embodiment 1 of the present invention provides a high-performance audio and video automatic sentence segmentation method, comprising the following steps:

[0045] S1: Read the message to be processed from the MNS message queue; a worker thread downloads the corresponding media file according to the queue task and converts it into a WAV-format file;
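The conversion in step S1 can be sketched as follows. This is a minimal illustration assuming ffmpeg is the conversion tool (the patent does not name one); `build_ffmpeg_cmd`, the mono channel layout, and the 16 kHz sample rate are all assumptions, not part of the patent.

```python
import subprocess

def build_ffmpeg_cmd(media_path, wav_path, sample_rate=16000):
    """Assemble an ffmpeg command converting a downloaded media file
    to a mono WAV file; the sample rate is an assumed default."""
    return ["ffmpeg", "-y", "-i", media_path,
            "-ac", "1", "-ar", str(sample_rate), wav_path]

def convert_to_wav(media_path, wav_path):
    # Run the conversion; raises CalledProcessError on failure.
    subprocess.run(build_ffmpeg_cmd(media_path, wav_path), check=True)
```

Splitting command construction from execution keeps the command inspectable and testable without actually invoking ffmpeg.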

[0046] S2: Set a sentence-duration threshold t0 and randomly select multiple non-noise sampling points from the WAV file; compute the time interval t between adjacent non-noise sampling points. When t > t0, the earlier non-noise sampling point is marked as a sentence break (a "period") and its timestamp is recorded; the span between two adjacent breaks is an independent clause;
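The break-detection rule in S2 reduces to a gap test over timestamps. The sketch below assumes the non-noise sampling points have already been reduced to a sorted list of timestamps in seconds; `find_breaks` is a hypothetical name.

```python
def find_breaks(sample_times, t0):
    """Given sorted timestamps (seconds) of non-noise sampling points,
    mark a sentence break at the earlier point whenever the gap to the
    next point exceeds the sentence-duration threshold t0."""
    breaks = []
    for a, b in zip(sample_times, sample_times[1:]):
        if b - a > t0:
            breaks.append(a)  # timestamp of the break ("period")
    return breaks
```

With t0 = 1.0 s, a 1.5 s silence between sampling points at 1.0 s and 2.5 s yields a single break at 1.0 s.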

[0047] S3: Produce subtitles matching the WAV file, and segment and mark the subtitles according to the timestamps; match the segmented subtitles with the media file according to t...
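The matching in S3 (using each break's timestamp as the start time of the corresponding text, as the abstract describes) might look like the following sketch; the function name and the zero start time for the first line are assumptions.

```python
def assign_start_times(lines, break_times):
    """Pair each subtitle line with the timestamp of the preceding
    break, using it as that line's start time (the abstract states the
    period's timestamp is the start time of the corresponding text)."""
    starts = [0.0] + break_times  # first line assumed to start at 0.0
    return list(zip(starts[:len(lines)], lines))
```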

Embodiment 2

[0051] Embodiment 2 discloses a high-performance audio and video automatic sentence segmentation method on the basis of Embodiment 1. Embodiment 2 further defines that, in step S1, the MNS message queue includes an input channel and at least two output channels; each output channel is a consumer process or consumer thread for task messages, with the number of consumer processes equal to the number of CPUs and the number of consumer threads equal to the number of CPUs.

[0052] The number of consumer processes or consumer threads is set automatically according to the number of server CPUs before startup. By default it equals the CPU count, which ensures that multiple consumer processes or threads run simultaneously without conflicts and without leaving resources idle.
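A minimal sketch of sizing the consumer pool to the CPU count, as described above. Threads are used here purely for illustration (the patent allows either processes or threads), and `handle_task` is a hypothetical placeholder for the real download-and-convert work.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def handle_task(task):
    # Placeholder consumer: real code would download the media file
    # named in the task and convert it to WAV.
    return f"processed {task}"

def run_consumers(tasks):
    """Size the worker pool to the CPU count (Embodiment 2's default)."""
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as ex:
        return list(ex.map(handle_task, tasks))
```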

[0053] As shown in Figure 2, the specific method of step S2 is as follows:

[0054] S2.1: Set an amplitude threshold as the noise threshold A0 and randomly select multiple ...
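Since the text of S2.1 is truncated here, the following is only a hedged sketch of the stated idea: a sampling point counts as non-noise when its amplitude exceeds the noise threshold A0. The function name, the seeded RNG, and the selection details are assumptions.

```python
import random

def sample_non_noise(amplitudes, a0, k, seed=0):
    """Pick up to k sample indices whose amplitude exceeds the noise
    threshold A0 (hypothetical sketch of the truncated step S2.1)."""
    rng = random.Random(seed)  # seeded for reproducibility
    candidates = [i for i, a in enumerate(amplitudes) if a > a0]
    return sorted(rng.sample(candidates, min(k, len(candidates))))
```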

Embodiment 3

[0062] Embodiment 3 discloses a high-performance audio and video automatic sentence segmentation method on the basis of Embodiment 1. Embodiment 3 further defines each sampling point as a run of consecutive frames, with every sampling point containing the same number of frames. To ensure the data are valid and reliable, the amplitude A is the maximum amplitude over all valid frames in the sampling point; t is the time interval between the last frame of the earlier sampling point and the first frame of the later sampling point; T is the duration between the last frame before a period and the first frame after it; and the timestamp is the time point of the period's last frame.
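Embodiment 3's definition of a sampling point's amplitude (the maximum amplitude over all frames in the point) reduces to a one-liner. This sketch assumes each frame is a list of signed sample values, which is not stated in the patent.

```python
def sample_amplitude(frames):
    """Amplitude A of a sampling point: the maximum absolute sample
    value over all frames in the point (per Embodiment 3)."""
    return max(abs(v) for frame in frames for v in frame)
```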

[0063] As shown in Figure 3, based on the above premise, the specific method of step S2.1 is as follows:

[0064] S2.1.1: Set an amplitude threshold as the noise threshold A0 and randomly select a plurality of sampling points from the WAV-format file;

[006...



Abstract

The invention provides a high-performance audio and video automatic sentence segmentation method and a corresponding system. Messages to be processed are managed uniformly through an MNS message queue, and a multi-threaded processing mode enables continuous processing of large numbers of tasks, improving processing efficiency. Selecting non-noise sampling points from WAV-format files effectively eliminates the influence of noise or background sound and reduces the probability of meaningless breaks. A sentence-duration threshold t0 is set according to language habits; when the interval t between adjacent non-noise sampling points satisfies t > t0, the minimum sentence-length requirement is met and a break can be made at that point. When making subtitles, to align the time axis and match each sentence with its text, the timestamp of each break is used as the start time of the corresponding text, and the subtitles are matched one by one to complete their configuration. This automatic method effectively shortens segmentation time and improves segmentation accuracy, greatly saving time and labor cost.

Description

Technical Field

[0001] The invention belongs to the technical field of audio and video subtitle production, and in particular relates to a high-performance audio and video automatic sentence segmentation method and system.

Background

[0002] At present, audio and video subtitles are mainly produced through manual sentence segmentation. Manual segmentation requires listening to all of the speech and marking the start and end points of each sentence by tapping shortcut keys, remembering positions, recognizing the speech, and so on. Because manual marking involves a certain delay, the captured start and end points are often misaligned and require additional manual adjustment. As a result, the whole process takes a long time and the accuracy is low. For example, 30 minutes of audio typically requires 40 minutes to 1 hour of segmentation work, which is extremely inefficient. During subtitle production, if sentence segme...

Claims


Application Information

Patent Type & Authority: Patent (China)
IPC(8): G11B27/031; G11B27/10
CPC: G11B27/031; G11B27/10
Inventors: 邱理, 陈镇诚
Owner: 深圳亿幕信息科技有限公司