Optimal mapping cross-language tone conversion method and system based on PPG consistency

An optimal-mapping timbre conversion technology, applied in the field of language recognition, that addresses inaccurate cross-language PPG description, differing distribution coverage, and inconsistent input data. Its effects are improved user satisfaction, better conveyance of information and intent, and a richer selection of speaker voices.

Pending Publication Date: 2021-08-31
SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0008] The purpose of the present invention is to remedy the defects of the prior art by providing a PPG-consistency-based optimal-mapping cross-language timbre conversion method, system and electronic device, so as to solve three prior-art problems: inaccurate cross-language description by PPG, differing degrees of distribution coverage, and inconsistency of the input data between the neural network training and synthesis stages.

Examples

Embodiment Construction

[0047] Specific embodiments of the present invention are described in more detail below with reference to the schematic drawings. The advantages and features of the present invention will become more apparent from the following description. Note that all drawings are greatly simplified and not drawn to scale; they serve only to illustrate the embodiments of the present invention conveniently and clearly.

[0048] First, the terms used in this application are explained:

[0049] PPG: Phonetic PosteriorGram. When an ASR system performs the task of classifying speech frames into phonemes, it can usually produce, for each speech frame, the posterior probability of that frame belonging to each possible phoneme; this is called the phonetic posteriorgram. The PPG of a frame can therefore serve as a representation of the speech content of that frame. With multi-speaker general ASR, the PPG of an...
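As a minimal illustration of the definition in [0049], the sketch below turns per-frame phoneme scores into a PPG by applying a softmax to each frame. The phoneme inventory, the score values, and the function names are hypothetical; a real ASR acoustic model would supply the scores, and the source does not state how its ASR computes the posteriors.

```python
import math

# Hypothetical three-phoneme inventory, for illustration only.
PHONEMES = ["a", "i", "s"]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def frames_to_ppg(frame_scores):
    """Return one posterior distribution over phonemes per speech frame."""
    return [softmax(scores) for scores in frame_scores]

# Made-up classifier scores for two speech frames (assumed values).
frame_scores = [
    [2.0, 0.5, -1.0],
    [0.1, 3.0, 0.2],
]

ppg = frames_to_ppg(frame_scores)
for row in ppg:
    # Show the most likely phoneme for each frame.
    best = max(range(len(row)), key=lambda k: row[k])
    print(PHONEMES[best], [round(p, 3) for p in row])
```

Each row of `ppg` is a probability distribution over the phoneme inventory, which is what lets it stand in for the speech content of a frame.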

Abstract

The invention discloses a PPG-consistency-based optimal-mapping cross-language timbre conversion method, system and electronic device. The method proceeds in three steps. First, frame-level acoustic features of the voice to be converted are extracted by speech signal processing, and the PPG, a representation of the frame-level voice content corresponding to the voice waveform, is computed by ASR. Second, using a preset large corpus of the target speaker, an optimal search is carried out in the target speaker's PPG set to obtain a mapping sequence that accurately represents the voice content of the voice to be converted and conforms to the characteristics of the target speaker. Finally, the mapping sequence is converted into a natural voice waveform by a neural network acoustic model and a vocoder. Because the relation between the voice to be converted and the target speaker's corpus is modeled through PPGs, which represent frame-level voice content and are not tied to any specific language, cross-language timbre conversion can be realized.
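The search step described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the search criterion, so the sketch uses a plain per-frame nearest-neighbour search with squared Euclidean distance, and every name and value in it is hypothetical. The invention's actual optimal search may use a different distance or additional constraints.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two PPG frames (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def map_to_target(source_ppg, target_ppg_set):
    """For each source frame, return the index of the closest target frame."""
    mapping = []
    for frame in source_ppg:
        best = min(
            range(len(target_ppg_set)),
            key=lambda j: squared_distance(frame, target_ppg_set[j]),
        )
        mapping.append(best)
    return mapping

# Toy PPG frames from the target speaker's corpus (assumed values).
target_ppg_set = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.05, 0.90],
]

# Toy PPG frames of the voice to be converted (assumed values).
source_ppg = [
    [0.70, 0.20, 0.10],
    [0.20, 0.10, 0.70],
    [0.15, 0.70, 0.15],
]

indices = map_to_target(source_ppg, target_ppg_set)
# The mapping sequence is built only from the target speaker's own frames.
mapped_sequence = [target_ppg_set[j] for j in indices]
print(indices)
```

In this sketch the mapped sequence contains only frames drawn from the target speaker's PPG set, so a downstream acoustic model would see the same kind of input at synthesis time as it saw during training.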

Description

Technical field

[0001] The invention relates to the technical field of language recognition, in particular to a PPG-consistency-based optimal-mapping cross-language timbre conversion method, system and electronic device.

Background

[0002] Voice conversion is the modification of one speaker's speech so that it sounds as if it had been uttered by another specific speaker. Voice conversion can be widely used in many fields, including customized feedback in computer-aided pronunciation training systems, development of personalized speaking aids for speech-impaired subjects, and film dubbing with various people's voices.

[0003] With the rise of globalization, content in different languages alternates within text or speech in social media posts, informal messages and voice navigation. In a human-machine spoken dialogue system, when such sentences are synthesized, the voice must remain consistent and the pronunciation should be accurate and natural, but in fact...

Application Information

IPC(8): G10L15/00, G10L21/003
CPC: G10L15/005, G10L21/003
Inventors: 吴志勇, 户建坤, 陈学源
Owner SHENZHEN GRADUATE SCHOOL TSINGHUA UNIV