
Automated voice and speech labeling

An automated voice and speech labeling technology, applied in the field of speech data analysis systems, addresses the problems that existing approaches require a large amount of training, that domain training places several limitations on the application of the approach, and that training demands pose a substantial barrier to adoption by new users, and achieves the effect of reducing the training required and the difficulty of speech analysis.

Active Publication Date: 2013-10-03
SRC INC

AI Technical Summary

Benefits of technology

The present invention provides a speech-to-text transcription system that does not require excessive training. The system uses speech decomposition and normalization methods to provide a normalized signal. It compares obtained data to a database of language, dialect, accent, and speaker attributes in order to produce a transcription of the speech. Other objects and advantages of the invention are to improve the accuracy and efficiency of speech-to-text transcription systems.
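The pipeline described above (decompose and normalize the signal, then compare the resulting measurements against a database of language, dialect, accent, and speaker attributes) can be illustrated with a minimal sketch. All function names, the choice of features, and the nearest-profile matching rule here are illustrative assumptions, not the patent's actual method.

```python
import numpy as np

def normalize(signal):
    """Scale a raw audio signal to zero mean and unit peak amplitude."""
    signal = signal - np.mean(signal)
    peak = np.max(np.abs(signal))
    return signal / peak if peak > 0 else signal

def extract_features(signal, rate):
    """Compute toy measurements: RMS energy and dominant frequency (Hz)."""
    rms = np.sqrt(np.mean(signal ** 2))
    spectrum = np.abs(np.fft.rfft(signal))
    dominant_hz = np.fft.rfftfreq(len(signal), d=1.0 / rate)[np.argmax(spectrum)]
    return np.array([rms, dominant_hz])

def best_match(features, database):
    """Return the attribute label whose stored profile is nearest to `features`."""
    return min(database, key=lambda label: np.linalg.norm(features - database[label]))

# Usage: a 440 Hz tone should match the hypothetical profile for "speaker_a".
rate = 16000
t = np.linspace(0, 1, rate, endpoint=False)
tone = np.sin(2 * np.pi * 440 * t)
feats = extract_features(normalize(tone), rate)
db = {"speaker_a": np.array([0.7, 440.0]), "speaker_b": np.array([0.7, 220.0])}
print(best_match(feats, db))
```

A real system would use far richer acoustic measurements than these two scalars; the point of the sketch is only the shape of the normalize-measure-compare flow.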

Problems solved by technology

However, each of these techniques applies only rudimentary signal processing techniques, and none are able to achieve high levels of accuracy without a large amount of training.
Although effective, domain training results in several limitations on the application of the approach, both in the specific speech domain and in how much confidence the user has in the product.
Additionally, in situations in which a significant amount of training is required, the time and effort required can be a substantial barrier to adoption by new users.
In addition to training requirements, ASR systems continue to suffer from less-than-perfect accuracy, with some estimating a current peak effectiveness of only 80-90%.
Although ASR systems can greatly increase productivity, the need to correct converted speech detracts from the possible productivity maximum.

Method used


Image

  • Automated voice and speech labeling

Examples


Embodiment Construction

[0020]Referring now to the drawings, wherein like reference numerals refer to like parts throughout, FIG. 1 depicts a flowchart overview of a method of automated voice and speech labeling according to one embodiment of the present invention. The initial input comprises digital audio signal 2, which can comprise any digital audio signal, but preferably comprises words spoken aloud by a human being or a computerized device. As a pre-processing step, an analog waveform, for example, may be digitized according to any method of analog-to-digital conversion known in the art, including through the use of a commercially available analog-to-digital converter. A digital signal that is either received by the system for analysis or created by digitization of an analog signal can be further prepared for downstream analysis by any process of digital manipulation known in the art, including but not limited to storage of the signal in a database until it is needed or the system is ready to a...
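The pre-processing step above (sampling an analog waveform, quantizing it, and storing the result until the system is ready for analysis) can be sketched as follows. The function names, the 16-bit PCM quantization, and the in-memory store are assumptions made for illustration; the patent leaves the converter and storage unspecified.

```python
import numpy as np

def digitize(analog_fn, duration_s, sample_rate, bits=16):
    """Sample a continuous-time function and quantize to signed integers (PCM)."""
    t = np.arange(0, duration_s, 1.0 / sample_rate)
    samples = analog_fn(t)                          # sample the waveform
    max_code = 2 ** (bits - 1) - 1                  # e.g. 32767 for 16-bit
    return np.clip(np.round(samples * max_code),
                   -max_code - 1, max_code).astype(np.int16)

# A simple in-memory dictionary standing in for the database that holds
# signals until they are needed downstream.
signal_store = {}

def store_signal(key, pcm):
    signal_store[key] = pcm

# Usage: digitize one second of a 300 Hz tone at 8 kHz and stash it.
pcm = digitize(lambda t: np.sin(2 * np.pi * 300 * t), 1.0, 8000)
store_signal("utterance_001", pcm)
print(len(signal_store["utterance_001"]))  # 8000 samples
```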



Abstract

A system and method for voice and speech analysis which correlates a speaker signal source and a normalized signal comprising measurements of input acoustic data to a database of language, dialect, accent, and/or speaker attributes in order to create a transcription of the input acoustic data.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001]This application claims priority to U.S. Provisional Application No. 61/617,884, filed on Mar. 30, 2012, the entirety of which is hereby incorporated by reference.

BACKGROUND OF THE INVENTION

[0002]1. Field of the Invention

[0003]The present invention relates to a speech data analysis system, and, more specifically, to a system that correlates a speaker signal source and a normalized signal comprising measurements of input acoustic data to a database of language, dialect, accent, and/or speaker attributes in order to create a detailed transcription of the input acoustic data.

[0004]2. Description of the Related Art

[0005]Speech transcription is an evolving area of technology served by several disparate technologies targeted at subsets of the issue. Individual systems and applications focus on and attempt to solve their own problems, including speech-to-text, phrase and word recognition, language recognition, and speaker identification. However, ...

Claims


Application Information

Patent Type & Authority: Application (United States)
IPC(8): G10L15/26
CPC: G10L15/10; G10L15/26; G06F16/685; G10L15/22
Inventors: ELLER, DAVID DONALD; MORPHET, STEVEN BRIAN; BOYETT, WATSON BRENT
Owner SRC INC