Device and method for generating voice animation from audio signal

This audio-signal and voice technology is applied in devices for generating voice animation. It is described as addressing problems such as decision-making, with the effects of high accuracy and a simple method of execution.

Pending Publication Date: 2021-12-03

TCL CORPORATION

Problems solved by technology

When audio is the only input available, for example because of resource constraints or privacy concerns, the quality of the results depends largely on real-time phoneme recognition, in particular on its accuracy and latency.

Embodiment Construction

[0046] The technical solutions of the embodiments of the present invention are described below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments that persons of ordinary skill in the art obtain from these embodiments without creative effort fall within the scope of protection of the present invention.

[0047] Several recent works demonstrate that believable voice animation is achievable. In some of them, the audio input is divided into short segments called frames, and basic frequency features are extracted from each frame. Phonemes are the perceptually distinct units of sound in a given language that distinguish one word from another. They are predicted by identifying vowels and basic fricative consonants from the features, and are then mapped to corresponding static animations called visemes, i.e. the counterparts of phonemes in the visual domain....
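The framing and phoneme-to-viseme steps described in [0047] can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function names, the frame and hop sizes, and the `PHONEME_TO_VISEME` table (including its phoneme symbols and viseme names) are all hypothetical, and the per-frame phoneme labels are assumed to come from some recognizer that is not shown here.

```python
# Hypothetical phoneme-to-viseme table. The phoneme symbols and viseme
# names are illustrative assumptions, not taken from the patent.
PHONEME_TO_VISEME = {
    "AA": "open", "IY": "wide", "UW": "round",
    "F": "lip_teeth", "V": "lip_teeth",
    "M": "closed", "B": "closed", "P": "closed",
    "sil": "rest",
}


def split_into_frames(samples, frame_size, hop_size):
    """Split a sample sequence into fixed-length, possibly overlapping frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop_size):
        frames.append(samples[start:start + frame_size])
    return frames


def visemes_from_phonemes(frame_phonemes, hop_ms):
    """Map per-frame phoneme labels to (start_time_ms, viseme) keyframes.

    Consecutive frames that share a viseme are merged into one keyframe.
    Unknown phonemes fall back to the "rest" viseme (an assumption).
    """
    keyframes = []
    for index, phoneme in enumerate(frame_phonemes):
        viseme = PHONEME_TO_VISEME.get(phoneme, "rest")
        if not keyframes or keyframes[-1][1] != viseme:
            keyframes.append((index * hop_ms, viseme))
    return keyframes


if __name__ == "__main__":
    frames = split_into_frames(list(range(10)), frame_size=4, hop_size=2)
    print(len(frames), frames[-1])
    print(visemes_from_phonemes(["sil", "M", "M", "AA", "AA", "F"], hop_ms=10))
```

Merging runs of identical visemes keeps the keyframe list short, which is one plausible way to drive static mouth shapes from frame-level predictions.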

Abstract

The invention discloses a device and a method for generating a voice animation from an audio signal. The method comprises the following steps: receiving the audio signal; converting the received audio signal into frequency-domain audio features; inputting the frequency-domain audio features into a neural network, which performs neural-network processing to recognize phonemes, the neural network having been trained with a phoneme data set that comprises audio signals with corresponding accurate phoneme labels; and generating a voice animation from the recognized phonemes.
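As a rough illustration of the "frequency-domain audio features" step in the abstract, the sketch below computes a magnitude spectrum for one audio frame with a naive discrete Fourier transform. The abstract does not say which features are used, so the plain magnitude spectrum, the Hann window, and the helper names are assumptions made for this example. The neural-network phoneme recognizer is outside the scope of the sketch.

```python
import cmath
import math


def hann_window(length):
    """Symmetric Hann window; requires length >= 2."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def magnitude_spectrum(frame):
    """Return DFT magnitudes for bins 0..N/2 of a Hann-windowed frame.

    This is an O(N^2) textbook DFT, chosen for clarity rather than speed.
    """
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hann_window(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(windowed))
        spectrum.append(abs(acc))
    return spectrum


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest bin in the frame's spectrum."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / len(frame)


if __name__ == "__main__":
    rate = 8000
    # A 1000 Hz test tone, 64 samples long (falls exactly on DFT bin 8).
    tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(64)]
    print(dominant_frequency(tone, rate))
```

In a full pipeline, one such feature vector per frame would be passed to the trained network, which outputs a phoneme label per frame.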

Description

Technical field

[0001] The invention relates to the field of animation generation, in particular to a device and method for generating voice animation.

Background technique

[0002] Avatars are very popular these days in streaming video, gaming, and virtual reality-related applications. These characters interact with each other in the virtual world, and sometimes they interact with viewers or players in the real world. Humans are very sensitive to any facial artifacts and incongruous or out-of-sync performance of virtual characters, which makes facial animation, especially voice animation, very challenging since animation involves both voice and mouth movements.

[0003] In realistic voice animation, artists seek the most immersive experience with highly anthropomorphic virtual characters. However, high costs in the production process and huge requirements for data including audio, video and possibly 3D models greatly undermine its expansion potential. In the case of an o...

Application Information

IPC(8): G10L21/10, G10L25/24, G10L15/14, G10L15/16

CPC: G10L21/10, G10L25/24, G10L15/16, G10L15/142, G10L2015/025, G10L15/02, G10L13/047, G10L25/30

Inventors: 郁子潇, 汪灏泓

Owner: TCL CORPORATION
