Punctuation addition method and device in speech recognition

A speech recognition and punctuation technology, applied in fields such as instruments, computing, and electrical digital data processing, that addresses the problem that recognition results lack effectiveness.

Inactive Publication Date: 2013-06-19
BEIJING SINOVOICE TECH CO LTD
Cites: 0; cited by: 12

AI Technical Summary

Problems solved by technology

[0004] The embodiment of the present invention discloses a method and device for adding punctuation in speech reco

Examples

Embodiment 1

[0047] This embodiment describes in detail a method for adding punctuation in speech recognition. Figure 1 shows a flowchart of the method.

Step 100: feature extraction is performed on the current word in the sentence obtained through speech recognition.

For example, if the speech recognition content is "Hello, have you eaten yet?", the sentence is split into seven words (glossed here as you, good, you, eat, rice, did, do), and the feature extraction proceeds as follows:

① P2_null P1_null C_you L1_good L2_you
② P2_null P1_you C_good L1_you L2_eat
③ P2_you P1_good C_you L1_eat L2_rice
④ P2_good P1_you C_eat L1_rice L2_did
⑤ P2_you P1_eat C_rice L1_did L2_do
⑥ P2_eat P1_rice C_did L1_do L2_null
⑦ P2_rice P1_did C_do L1_null L2_null

[0048] Here, C denotes the current word, P1 denotes the first word before the current word, P2 denotes the second word before the current word, L1 denotes the first word after the current ...

Embodiment 2

[0057] This embodiment describes in detail a method for adding punctuation in speech recognition.

[0058] Figure 2 shows a flowchart of a method for adding punctuation in speech recognition in an embodiment of the present invention.

[0059] Step 200, feature extraction is performed on the current word in the sentence obtained through speech recognition.

[0060] The step 200 may specifically be:

[0061] Following the order of the words in the sentence obtained by speech recognition, each word is taken in turn as the current word, and the n words before it and the m words after it are taken as the features of the current word.

[0062] Here, n and m are positive integers, and they may be equal or unequal. The n preceding words and the m following words may include empty (null) words, which stand in for positions that fall outside the sentence.
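The windowing described in [0061] and [0062] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `window_features`, the `NULL` placeholder string, and the English glosses used as tokens are assumptions made for the example. With n = m = 2 it reproduces the P2/P1/C/L1/L2 layout used in Embodiment 1.

```python
from typing import List

NULL = "null"  # assumed placeholder for positions outside the sentence


def window_features(words: List[str], n: int = 2, m: int = 2) -> List[List[str]]:
    """Return one feature row per word: P_n..P_1, C, L_1..L_m."""
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive integers")
    rows = []
    for i, word in enumerate(words):
        feats = []
        # Preceding words, farthest first (P_n ... P_1).
        for k in range(n, 0, -1):
            j = i - k
            prev_word = words[j] if j >= 0 else NULL
            feats.append(f"P{k}_{prev_word}")
        # The current word itself.
        feats.append(f"C_{word}")
        # Following words, nearest first (L_1 ... L_m).
        for k in range(1, m + 1):
            j = i + k
            next_word = words[j] if j < len(words) else NULL
            feats.append(f"L{k}_{next_word}")
        rows.append(feats)
    return rows


if __name__ == "__main__":
    # English glosses standing in for the seven recognized words.
    tokens = ["you", "good", "you", "eat", "rice", "did", "do"]
    for row in window_features(tokens, n=2, m=2):
        print(" ".join(row))
```

Because n and m are independent parameters, an asymmetric window such as `window_features(tokens, n=1, m=3)` works the same way and still yields one row per word.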

[0063] For example, if the voice recognition content is "Hello, have you eaten?", the c...

Embodiment 3

[0097] This embodiment describes in detail a device for adding punctuation in speech recognition.

[0098] Figure 3 shows a structural diagram of a device for adding punctuation in speech recognition in an embodiment of the present invention.

[0099] The device for adding punctuation in speech recognition may specifically include:

[0100] An extraction module 30, an identification module 32, and a selection module 34.

[0101] The functions of each module and the relationship between each module are introduced in detail below.

[0102] The extraction module 30 is configured to perform feature extraction on the current word in the sentence obtained through speech recognition.

[0103] The identification module 32 is configured to identify the extracted features of the current word using a pre-established maximum entropy model, to obtain the identification character that follows the current word.
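The role of the identification module can be illustrated with the standard conditional form of a maximum entropy model, in which each candidate identification character is scored by summing the weights of the active (feature, label) pairs and the scores are normalized with a softmax. The sketch below is an assumption-laden illustration only: the label set `N`/`C`/`Q`, the weight values, and the function names are hypothetical, and the text shown here does not specify how the model is trained or which identification characters it uses.

```python
import math
from typing import Dict, Iterable, List, Tuple

Weights = Dict[Tuple[str, str], float]  # (feature, label) -> weight


def maxent_probs(features: Iterable[str], labels: Iterable[str],
                 weights: Weights) -> Dict[str, float]:
    """Compute p(label | features) proportional to exp(sum of active weights)."""
    feats: List[str] = list(features)
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def predict(features: Iterable[str], labels: Iterable[str],
            weights: Weights) -> str:
    """Return the most probable identification character."""
    probs = maxent_probs(features, labels, weights)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    # Hypothetical labels: N = no punctuation, C = comma, Q = question mark.
    labels = ["N", "C", "Q"]
    # Hypothetical weights; a real model would learn these from data.
    weights = {
        ("C_do", "Q"): 2.0,
        ("L1_null", "Q"): 1.0,
        ("C_good", "C"): 1.5,
    }
    last_word = ["P2_rice", "P1_did", "C_do", "L1_null", "L2_null"]
    print(predict(last_word, labels, weights))
    print(maxent_probs(last_word, labels, weights))
```

Unseen (feature, label) pairs simply contribute a weight of zero, so a word with no known features receives a uniform distribution over the labels.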

[0104] For exam...


Abstract

The invention discloses a method and a device for adding punctuation in speech recognition, to solve the problem that recognition results obtained through speech recognition lack practicality. The method comprises: extracting features of the current word in a sentence obtained through speech recognition; identifying the extracted features of the current word with a preset maximum entropy model to obtain the identification character that follows the current word; and, according to the association between each identification character and each punctuation mark, selecting from a known identification character set the punctuation mark that corresponds to the obtained identification character and is to be added after the current word. The punctuation mark to be added after the current word (which may be empty) is thus predicted from the logical relationship between the current word and several words before and after it, together with the preset maximum entropy model. Adding the punctuation improves the practicality of the speech recognition result.
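The final selection step, which maps each identification character to a punctuation mark (possibly empty) and appends it after the corresponding word, can be sketched as below. The identification character set `N`/`C`/`P`/`Q`, the table name `ID_TO_PUNCT`, and the function name are hypothetical choices for illustration; the abstract only states that such an association exists.

```python
from typing import Dict, List

# Hypothetical association between identification characters and punctuation.
# "N" maps to the empty string, i.e. no punctuation after the word.
ID_TO_PUNCT: Dict[str, str] = {"N": "", "C": ",", "P": ".", "Q": "?"}


def add_punctuation(words: List[str], id_chars: List[str],
                    table: Dict[str, str] = ID_TO_PUNCT,
                    sep: str = " ") -> str:
    """Append to each word the punctuation selected by its identification character."""
    if len(words) != len(id_chars):
        raise ValueError("need exactly one identification character per word")
    pieces = [word + table[char] for word, char in zip(words, id_chars)]
    return sep.join(pieces)


if __name__ == "__main__":
    words = ["hello", "have", "you", "eaten"]
    id_chars = ["C", "N", "N", "Q"]  # as if predicted by the model
    print(add_punctuation(words, id_chars))
```

For text written without spaces between words, such as Chinese, passing `sep=""` joins the pieces directly.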

Description

Technical field

[0001] Embodiments of the present invention relate to the technical field of speech recognition, in particular to a method and device for adding punctuation in speech recognition.

Background technique

[0002] When performing speech recognition, the received speech content can only be recognized and converted into texts such as Chinese characters or English. When the received speech content is a series of text speech, the result of recognition conversion is only a series of Chinese characters or English text.

[0003] Since punctuation marks belong to unpronounced information, general speech recognition results are only text information such as Chinese characters or English, without punctuation information. Punctuation information needs to be manually added to the speech recognition results by the user. However, in continuous speech recognition, there are not many studies on the automatic addition of punctuation marks. Most of them are recognized as commas ...


Application Information

IPC(8): G06F17/28
Inventor: 李健, 吴飞, 郑晓明, 张连毅, 武卫东
Owner: BEIJING SINOVOICE TECH CO LTD