
Audio editing system and audio editing method

An audio editing system and audio editing method, applied in the field of audio editing, that address slow model convergence, limited flexibility, and error accumulation, and achieve fast convergence, high flexibility, and small error.

Active Publication Date: 2016-12-14
SONY CORP +1
Cites: 3, Cited by: 0

AI Technical Summary

Problems solved by technology

Both existing frameworks have advantages and disadvantages. The former carries the errors of the segmentation step into the clustering process without correcting them, and because of the limitations of the distance measure, these errors accumulate. The latter, based on a hidden Markov model, initializes the model by dividing the audio data directly into equal parts. This introduces a relatively large initial error, which slows the convergence of the model. In addition, because the hidden Markov model classifies the audio frame by frame, unconstrained segmentation introduces further error. The usual remedy is to impose a duration limit on the dwell time of each hidden Markov model, but this greatly limits the flexibility of the system.
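The equal-parts initialization criticized above can be sketched as follows. This is only a minimal illustration, not the patent's procedure: the function name `equal_split_labels` and its parameters are hypothetical, and the sketch assumes the audio is already represented as a sequence of frames.

```python
from typing import List


def equal_split_labels(num_frames: int, num_states: int) -> List[int]:
    """Assign each frame to one of num_states contiguous, near-equal blocks.

    Hypothetical helper: it ignores the audio content entirely, which is
    why the initial labels can be far from the true speaker boundaries.
    """
    if num_frames <= 0 or num_states <= 0:
        raise ValueError("num_frames and num_states must be positive")
    # Frame t falls in block floor(t * num_states / num_frames).
    return [t * num_states // num_frames for t in range(num_frames)]


if __name__ == "__main__":
    print(equal_split_labels(10, 3))  # prints [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Because the block boundaries fall at fixed positions regardless of where speakers actually change, every model starts from mixed data, which is consistent with the large initial error described in the text.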




Embodiment Construction

[0038] The present invention will be described in detail below in conjunction with specific embodiments and the accompanying drawings. The description is divided into several embodiments for convenience, but each embodiment is only an illustration, and those skilled in the art will appreciate that various modifications, amendments, substitutions, and the like are possible. Specific numerical examples are used to facilitate understanding of the invention, but unless otherwise specified, those values are merely examples, and any appropriate value can be used. Likewise, specific mathematical formulas are used to facilitate understanding, but unless otherwise specified, those formulas are merely examples, and any appropriate formula can be used. The distinction between the respective embodiments is not essential to the present invention, and the matters described in the respective...



Abstract

The audio editing system includes: a plurality of initial segmentation devices, each of which initially segments the audio stream from one of multiple channels into a plurality of segments; a device that combines the resulting segmentation points, selects the audio stream of the optimal channel between every two adjacent segmentation points to obtain multiple initial segments, and fuses these initial segments into a unified audio data file; an audio clustering device that performs supervised clustering of the initial segments using a hierarchical clustering algorithm, so that initial segments of the same nature are grouped into one category; and a re-segmentation device that uses the clustering result of the audio clustering device to train a hidden Markov model for each category and performs Viterbi alignment segmentation on the unified audio file to obtain the re-segmented audio stream. This high-precision speaker segmentation improves the accuracy of the final speaker clustering.
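The Viterbi alignment step in the abstract can be illustrated with a small sketch. It is a generic Viterbi decoder over per-frame class scores, not the patent's trained hidden Markov models: the function names, the `switch_penalty` parameter, and the toy scores are all assumptions made for illustration, and the per-frame log-likelihoods are taken as given rather than computed from audio.

```python
from typing import List, Sequence, Tuple


def viterbi_segment(frame_loglik: Sequence[Sequence[float]],
                    switch_penalty: float = 2.0) -> List[int]:
    """Return the best class label for each frame.

    frame_loglik[t][k] is the (assumed, precomputed) log-likelihood of
    frame t under class k. Staying in a class is free; changing class
    subtracts switch_penalty (a hypothetical stand-in for transition costs).
    """
    if not frame_loglik:
        return []
    num_classes = len(frame_loglik[0])
    score = list(frame_loglik[0])
    backptr: List[List[int]] = []
    for obs in frame_loglik[1:]:
        new_score, pointers = [], []
        for k in range(num_classes):
            # Best predecessor of class k, charging a penalty for a switch.
            cand = [score[j] - (0.0 if j == k else switch_penalty)
                    for j in range(num_classes)]
            best_prev = max(range(num_classes), key=lambda j: cand[j])
            new_score.append(cand[best_prev] + obs[k])
            pointers.append(best_prev)
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final class.
    state = max(range(num_classes), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def path_to_segments(path: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Collapse a label path into (start_frame, end_frame_exclusive, label)."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((start, t, path[start]))
            start = t
    return segments


if __name__ == "__main__":
    # Toy scores: frame 1 weakly prefers class 1, but the penalty smooths it.
    loglik = [[-1, -3], [-2, -1.5], [-1, -3], [-3, -1], [-3, -1], [-3, -1]]
    path = viterbi_segment(loglik, switch_penalty=2.0)
    print(path_to_segments(path))  # prints [(0, 3, 0), (3, 6, 1)]
```

In the toy input, a frame-by-frame decision would flip to class 1 at frame 1 and back again; decoding the whole sequence jointly removes that spurious boundary, which is the kind of effect alignment over a unified file is meant to provide.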

Description

Technical field

[0001] The invention relates to the technical field of audio clustering, and in particular to an audio editing system and an audio editing method.

Background art

[0002] Speaker clustering is a specific application of clustering technology in speech signal processing. Its purpose is to classify speech segments so that each class contains data from only one speaker and all data from the same speaker are merged into the same class, yielding speaker-specific information. From an application point of view, speaker clustering can be applied to audio information management, retrieval, and other fields. It facilitates speaker tracking in audio streams of conferences, voicemails, lectures, and news broadcasts, enabling structured analysis, understanding, and management of audio data. Clustering algorithms are also of great practical value to speech recognition systems. Almost all automatic speech recognition systems today use adaptive ...
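As a rough illustration of the speaker clustering idea described in the background, the sketch below groups segments bottom-up by merging the two closest clusters until no pair is close enough. It is a plain agglomerative procedure under stated assumptions, not the clustering device of the invention: each segment is assumed to be summarized by a fixed-length feature vector, and the Euclidean centroid distance, the `threshold` parameter, and the function name are all hypothetical choices.

```python
import math
from typing import List, Sequence


def agglomerative_cluster(vectors: Sequence[Sequence[float]],
                          threshold: float) -> List[List[int]]:
    """Group segment indices by repeatedly merging the two closest clusters.

    vectors[i] is an assumed per-segment feature summary. Merging stops
    when the closest pair of cluster centroids is farther than threshold.
    """
    clusters: List[List[int]] = [[i] for i in range(len(vectors))]

    def centroid(members: List[int]) -> List[float]:
        dim = len(vectors[members[0]])
        return [sum(vectors[m][d] for m in members) / len(members)
                for d in range(dim)]

    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break  # remaining clusters are treated as different speakers
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    feats = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9]]
    print(agglomerative_cluster(feats, threshold=1.0))  # prints [[0, 2], [1, 3]]
```

Each returned list holds the indices of segments attributed to one speaker. In practice the distance measure and stopping rule matter a great deal, which is exactly the weakness of distance-based clustering noted earlier on the page.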


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L25/48
Inventor: 卢鲤, 赵庆卫, 颜永红, 刘昆, 吴伟国
Owner: SONY CORP