Method for text error correction after voice recognition based on domain identification

A speech recognition and text error correction technology, applied in speech recognition, speech analysis, natural language data processing, etc. It addresses the problems of heavy manual intervention, low error correction efficiency, and the inability to correct proper names, with the effects of reducing time loss, preserving data accuracy and authenticity, and enhancing practicability and robustness.

Active Publication Date: 2018-02-27
SICHUAN CHANGHONG ELECTRIC CO LTD
8 Cites, 55 Cited by

AI Technical Summary

Problems solved by technology

[0006] The technical problem to be solved by the present invention is to propose a method for correcting text errors after speech recognition based on domain recognition, which solves the problems that processing methods in the traditional technology require a lot of manual intervention, have low error correction efficiency, and cannot correct proper names.



Embodiment Construction

[0037] The present invention aims to propose a method for correcting text errors after speech recognition based on domain recognition, which solves the problems that processing methods in the traditional technology require a lot of manual intervention, have low error correction efficiency, and cannot correct proper names.

[0038] The invention adopts a Bigram model and the whoosh search engine to determine the domain of the input text. By introducing the Markov assumption, the Bigram model alleviates the problems of sparse data and an overly large parameter space in n-grams: it assumes that the appearance of a word depends only on the word immediately preceding it, thereby establishing a relationship between words. The whoosh search engine helps with domain discrimination and builds an index from the input text, which enables fast identification of candidate sets by fuzzy matching and improves the speed of text error correction after multi-domain semantic recognit...
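The bigram-based domain decision described in [0038] can be illustrated with a minimal sketch. It is not the patent's implementation: the toy corpora, the add-one smoothing, and the function names `train_bigram`, `log_prob`, and `guess_domain` are all assumptions made for illustration, and the whoosh indexing step is left out so the example runs on the standard library alone.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences (hypothetical helper)."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens  # sentence-start marker
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def log_prob(tokens, model, vocab_size):
    """Log-probability of a sentence under the Markov assumption:
    each word depends only on the word before it.
    Add-one smoothing is an assumption, not taken from the source."""
    unigrams, bigrams = model
    padded = ["<s>"] + tokens
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total

def guess_domain(tokens, models, vocab_size):
    """Pick the domain whose bigram model scores the sentence highest."""
    return max(models, key=lambda d: log_prob(tokens, models[d], vocab_size))

# Toy per-domain corpora, invented for this example.
corpora = {
    "music": [["play", "a", "song", "by", "the", "band"],
              ["play", "the", "next", "song"]],
    "video": [["watch", "a", "movie", "tonight"],
              ["watch", "the", "next", "episode"]],
}
vocab = {w for sents in corpora.values() for s in sents for w in s} | {"<s>"}
models = {d: train_bigram(s) for d, s in corpora.items()}

print(guess_domain(["play", "the", "song"], models, len(vocab)))
```

In a real system the per-domain corpora would be far larger, and the sentence would first be segmented into words before scoring.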


Abstract

The invention belongs to the field of voice recognition text processing and discloses a method for text error correction after voice recognition based on domain identification. It aims to solve the problem that processing methods in the prior art need a lot of manual intervention, have low error correction efficiency, and cannot correct proper names. The method comprises the following steps: (a) error detection and analysis are conducted on the text obtained after voice recognition, and the domain to which the text sentences belong is preliminarily determined; (b) sentences to undergo error correction are segmented according to predefined syntax rules and divided into redundant portions and core portions; (c) a search engine is used to perform fuzzy string matching and determine the candidate specific-word-bank sets for the core portions of the sentences; (d) similarity scores are calculated according to edit distances, and error correction is conducted on the redundant portions and the core portions; (e) the corrected redundant portions and core portions are fused, and the error correction results are output.
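Step (d) scores candidates by edit distance. A minimal sketch of that idea follows, using the standard Levenshtein distance; the normalization by the longer string's length, the 0.6 threshold, and the names `similarity` and `best_candidate` are assumptions for illustration, since the source does not state its exact scoring formula.

```python
def edit_distance(a, b):
    """Levenshtein distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]

def similarity(a, b):
    """Score in [0, 1]; normalizing by the longer length is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest

def best_candidate(text, candidates, threshold=0.6):
    """Return the most similar candidate, or the input unchanged when
    nothing reaches the (hypothetical) threshold."""
    best = max(candidates, key=lambda c: similarity(text, c), default=None)
    if best is None or similarity(text, best) < threshold:
        return text
    return best

print(best_candidate("changhang", ["changhong", "skyworth"]))
```

The candidate list here stands in for the set that the search engine's fuzzy matching would return in step (c).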

Description

Technical field

[0001] The invention belongs to the field of speech recognition and text processing, and in particular relates to a method for correcting text errors after speech recognition based on domain recognition.

Background

[0002] In recent years, with the increasing demand for and development of artificial intelligence, it has become a top priority for computers to correctly understand human language. Speech recognition can be mainly divided into pre-processing and post-processing. Pre-processing mainly covers speech signal processing, which extracts and analyzes the parameters of the user's speech. Post-processing involves the conversion of syllables to Chinese characters; in other words, it is the process of converting speech signal information into an internal code that the computer can recognize. In the actual post-processing process of speech recognit...


Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F17/27, G06F17/30, G10L15/183, G10L15/26
CPC: G10L15/183, G10L15/26, G06F16/3343, G06F16/90344, G06F40/211, G06F40/232, G06F40/253, G06F40/284
Inventor: 杨鑫, 刘楚雄, 唐军
Owner: SICHUAN CHANGHONG ELECTRIC CO LTD