
Methods and systems for image and voice processing

A voice and image processing technology, applied in the field of systems and techniques for digital image and voice processing. It addresses the problems that conventional techniques for processing computer-generated videos require large amounts of computer resources and an inordinate amount of time, and achieves the effects of reducing the number of images required and minimizing or reducing errors.

Active Publication Date: 2021-02-25
NEON EVOLUTION INC
Cites: 7 | Cited by: 24

AI Technical Summary

Benefits of technology

This patent describes a way to use neural networks to classify and match faces, as well as a method to change the voice in a recording without affecting the recording's overall structure and content. The invention uses a one-shot or few-shot architecture, meaning a small number of example images is used to train the neural networks for later classification, in contrast to other methods that may require thousands of facial images for training. The patent also describes how a system can use an autoencoder to automatically generate a latent voice representation from a training voice sample, which can then be used to perform the voice change. The invention is useful for applications such as voice swapping and audio re-recording.
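The latent voice representation mentioned above can be illustrated with a small autoencoder that compresses frames of a training voice sample and learns to reconstruct them. The sketch below is a minimal, hypothetical example assuming mel-spectrogram frames and arbitrary layer sizes; the patent does not specify this particular network.

```python
# Hypothetical sketch: deriving a latent voice representation with an autoencoder.
# Input shapes and layer sizes are illustrative assumptions, not taken from the patent.
import torch
import torch.nn as nn

class VoiceAutoencoder(nn.Module):
    def __init__(self, n_mels=80, latent_dim=128):
        super().__init__()
        # Encoder compresses a mel-spectrogram frame into a latent vector.
        self.encoder = nn.Sequential(
            nn.Linear(n_mels, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )
        # Decoder attempts to reconstruct the input frame from the latent vector.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, n_mels),
        )

    def forward(self, mel_frames):
        latent = self.encoder(mel_frames)   # latent voice representation
        recon = self.decoder(latent)        # reconstruction used for the training loss
        return latent, recon

# Training sketch: minimize reconstruction error on frames of the training voice sample.
model = VoiceAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
mel_frames = torch.randn(32, 80)            # stand-in for real mel-spectrogram frames
for _ in range(10):
    latent, recon = model(mel_frames)
    loss = nn.functional.mse_loss(recon, mel_frames)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```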

Problems solved by technology

Conventional techniques for processing computer-generated videos may require large amounts of computer resources and take an inordinate amount of time.



Examples


Embodiment Construction

[0035]As discussed above, conventional techniques for processing computer generated videos require large amounts of computer resources and take an inordinate amount of time. Further, certain relatively new applications for digital image processing, such as face-swapping, are becoming ever more popular, creating further demand for computer resources.

[0036]Conventionally, face-swapping is performed by capturing an image or a video of a person (sometimes referred to as the source) whose face is to be used to replace a face of another person in a destination video. For example, a face region in the source image and the destination image may be recognized, the face region from the source may be used to replace the face region in the destination, and an output image/video is generated. The source face in the output preserves the expressions of the face in the original destination image/video (e.g., has lip motions, eye motions, eyelid motions, eyebrow motions, nostril flaring, etc.). If insuff...
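As a rough illustration of the conventional region-replacement step described above, the sketch below locates a face region in the source and destination images (using a hypothetical detector stub named detect_face_box) and copies the resized source region over the destination region. A naive pixel copy like this does not by itself preserve the destination's expressions; the expression-preserving swap discussed in this document relies on a trained model rather than direct copying.

```python
# Minimal, hypothetical sketch of the region detection and replacement step.
import numpy as np

def detect_face_box(image: np.ndarray):
    """Hypothetical stand-in for a real face detector; returns (top, left, height, width)."""
    h, w = image.shape[:2]
    return h // 4, w // 4, h // 2, w // 2   # placeholder: assume the face occupies the center

def swap_face_region(source: np.ndarray, destination: np.ndarray) -> np.ndarray:
    """Copy the detected source face region over the detected destination face region."""
    st, sl, sh, sw = detect_face_box(source)
    dt, dl, dh, dw = detect_face_box(destination)
    face = source[st:st + sh, sl:sl + sw]
    # Resize the source face to fit the destination region (nearest-neighbor, for brevity).
    rows = np.linspace(0, sh - 1, dh).astype(int)
    cols = np.linspace(0, sw - 1, dw).astype(int)
    resized = face[rows][:, cols]
    output = destination.copy()
    output[dt:dt + dh, dl:dl + dw] = resized
    return output

# Usage sketch with random stand-in images.
source_img = np.random.rand(256, 256, 3)
destination_img = np.random.rand(192, 192, 3)
output_img = swap_face_region(source_img, destination_img)
```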



Abstract

Systems and methods are disclosed that are configured to train an autoencoder using images that include faces, wherein the autoencoder comprises an input layer, an encoder configured to output a latent image from a corresponding input image, and a decoder configured to attempt to reconstruct the input image from the latent image. An image sequence of a face exhibiting a plurality of facial expressions and transitions between facial expressions is generated and accessed. Images of the plurality of facial expressions and transitions between facial expressions are captured from a plurality of different angles and using different lighting. An autoencoder is trained using source images that include the face with different facial expressions captured at different angles with different lighting, and using destination images that include a destination face. The trained autoencoder is used to generate an output in which the likeness of the face in the destination images is swapped with the likeness of the source face, while preserving the expressions of the destination face.
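One common way to realize the swap described in the abstract is a shared encoder with a separate decoder per identity; the abstract itself only specifies an encoder and a decoder, so the layout below is an assumption for illustration. Training minimizes reconstruction error on both the source and destination image sets; at inference, a destination face is encoded and decoded with the source decoder, producing the source likeness while the destination expression is carried through the latent image.

```python
# Sketch of a shared-encoder face-swap autoencoder (an assumed arrangement, not the
# patent's specified architecture). Sizes are arbitrary for illustration.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())

def deconv_block(c_in, c_out):
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 4, stride=2, padding=1), nn.ReLU())

class FaceSwapAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared encoder maps a 64x64 RGB face crop to a latent image.
        self.encoder = nn.Sequential(conv_block(3, 32), conv_block(32, 64), conv_block(64, 128))
        # One decoder per identity; both learn from the same latent space.
        self.decoder_src = nn.Sequential(
            deconv_block(128, 64), deconv_block(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())
        self.decoder_dst = nn.Sequential(
            deconv_block(128, 64), deconv_block(64, 32),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, x, identity):
        latent = self.encoder(x)
        decoder = self.decoder_src if identity == "src" else self.decoder_dst
        return decoder(latent)

model = FaceSwapAutoencoder()
src_batch = torch.rand(4, 3, 64, 64)   # source faces: varied expressions, angles, lighting
dst_batch = torch.rand(4, 3, 64, 64)   # destination faces
# Joint reconstruction loss, backpropagated during training.
loss = (nn.functional.mse_loss(model(src_batch, "src"), src_batch)
        + nn.functional.mse_loss(model(dst_batch, "dst"), dst_batch))
# Inference: encode destination faces, decode with the source decoder to swap likeness.
swapped = model(dst_batch, "src")
```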

Description

INCORPORATION BY REFERENCE TO ANY PRIORITY APPLICATIONS

[0001] Any and all applications for which a foreign or domestic priority claim is identified in the Application Data Sheet as filed with the present application are hereby incorporated by reference under 37 CFR 1.57.

BACKGROUND OF THE INVENTION

Field of the Invention

[0002] This document relates to systems and techniques for digital image and voice processing.

Description of the Related Art

[0003] Conventional techniques for processing computer-generated videos may require large amounts of computer resources and take an inordinate amount of time. Hence, more computer resource-efficient and time-efficient techniques are needed to perform advanced forms of digital image processing, such as face-swapping.

SUMMARY

[0004] The following presents a simplified summary of one or more aspects in order to provide a basic understanding of such aspects. This summary is not an extensive overview of all contemplated aspects, and is intended to neither ident...


Application Information

Patent Type & Authority: Applications (United States)
IPC(8): G06K9/62, G06K9/00, G06T1/20, G06T7/32, G06V10/764, G06V10/774
CPC: G06K9/6256, G06K9/00302, G06T2207/20081, G06T1/20, G06T7/32, G06K9/00268, G06T11/001, G06T11/60, G06T13/80, G10L2021/0135, G10L2021/105, G10L21/0356, G06T17/00, G06T13/40, G06N3/084, G06N3/088, G06V40/168, G06V40/174, G06V10/454, G06V10/82, G06V10/764, G06V10/774, G06N3/045, G06F18/214
Inventors: BERLIN, CODY GUSTAVE; BOGAN, III, CARL DAVIS; LANDE, KENNETH MICHAEL; LASER, JACOB MYLES; LEE, BRIAN SUNG; ØLAND, ANDERS
Owner: NEON EVOLUTION INC