Voice recognition character string processing comparison method based on Pinyin

A speech recognition and character string technology, applied in the fields of electrical digital data processing, natural language data processing, and special data processing applications, that addresses problems such as incorrect comparisons.

Pending Publication Date: 2018-11-23
深圳市艾塔文化科技有限公司

AI Technical Summary

Problems solved by technology

In some special situations, such as recognizing personal names or device names, general-purpose speech recognition cannot meet users' needs. The recognition algorithm may turn the input speech into one character string pronounced "Yu Guoquan" while the user actually said a different, identically pronounced "Yu Guoquan". Because a name has little contextual relevance, the comparison will be incorrect.

Detailed Description of the Embodiments

[0016] The present invention is described in further detail below with reference to a preferred embodiment:

[0017] The recognition algorithm of the present invention is a "secondary processing" step layered on top of a conventional Chinese character recognition algorithm: it converts the recognized Chinese character string into a pinyin string and then compares that pinyin string with the target pinyin string.
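The secondary comparison described in [0017] can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the lookup table `CHAR_TO_PINYIN`, the helper names, and the two example names are all hypothetical, chosen only because the characters are homophones.

```python
# Hypothetical, deliberately tiny character-to-pinyin table (toneless).
# The characters and names below are illustrative, not taken from the source.
CHAR_TO_PINYIN = {
    "于": "yu", "余": "yu",
    "国": "guo",
    "权": "quan", "全": "quan",
}


def to_pinyin(text):
    """Convert each character to its pinyin syllable; unknown characters pass through."""
    return [CHAR_TO_PINYIN.get(ch, ch) for ch in text]


def same_pronunciation(recognized, target):
    """Secondary processing: compare pinyin strings instead of the characters."""
    return to_pinyin(recognized) == to_pinyin(target)


recognized = "于国权"  # what the recognizer produced (assumed example)
target = "余国全"      # the name stored in the target list (assumed example)

print(recognized == target)                    # character comparison fails
print(same_pronunciation(recognized, target))  # pinyin comparison succeeds
```

A character-level comparison rejects the pair because the written forms differ, while the pinyin-level comparison accepts it because both names are pronounced the same.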

[0018] The inventive method comprises the following steps:

[0019] Step 1: Pinyin encoding. All Chinese pinyin syllables are encoded in a manner similar to Unicode encoding, by enumerating every pinyin combination (tones can be included as needed). Each pinyin syllable is encoded in two bytes (16 bits), and the highest bit of the first byte is 1, as shown in the following table:

[0020]

Pinyin    Encoding (hexadecimal)
a         8080
ai        8081
an        8082
ang       8083
ao        8084
b         8085
……        ……
zu        820A
zuan      820B
zui       820C
zun       82...
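The two-byte encoding in the table can be sketched as follows. This is only an illustration under stated assumptions: `PINYIN_CODES` holds just the complete rows that survive in the table, the helper names are hypothetical, and storing the first byte first (big-endian order) is an assumption based on how the hexadecimal codes are written.

```python
# Rows copied from the table above; the remaining rows are elided in the source.
PINYIN_CODES = {
    "a": 0x8080, "ai": 0x8081, "an": 0x8082, "ang": 0x8083, "ao": 0x8084,
    "zu": 0x820A, "zuan": 0x820B, "zui": 0x820C,
}


def is_pinyin_code(code):
    """True for a 16-bit value whose first byte has its highest bit set."""
    return 0x8000 <= code <= 0xFFFF


def encode_syllables(syllables):
    """Pack syllables into bytes, two bytes per syllable (assumed big-endian)."""
    return b"".join(PINYIN_CODES[s].to_bytes(2, "big") for s in syllables)


def decode_codes(data):
    """Split a packed byte string back into 16-bit pinyin codes."""
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


packed = encode_syllables(["an", "zui"])
print(packed.hex())  # 8082820c
```

Because every code has the top bit of its first byte set, a pinyin code can be told apart from 7-bit ASCII text in the same byte stream, and two encoded strings can be compared with a plain byte comparison.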

Abstract

The invention relates to a Pinyin-based method for processing and comparing speech recognition character strings. When existing speech recognition technology is applied to certain special situations, such as recognizing personal names or equipment names, incorrect comparisons easily cause errors. The method is a "secondary processing" step built on a general Chinese character recognition algorithm: recognized Chinese character strings are converted into Pinyin strings, which are then compared with target Pinyin strings. The method comprises the following steps: 1, Pinyin coding: encode all Chinese character Pinyin in a manner similar to Unicode coding, enumerating all Chinese character Pinyin combinations; 2, code conversion: convert character strings that represent Chinese characters in encodings such as GBK, Unicode, and UTF-8 into Pinyin strings; and 3, polyphone processing: enumerate the polyphones among all family names, handle them specially, and assign them the same Pinyin codes. The method achieves fast and accurate recognition and thereby avoids misjudgment.
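Steps 2 and 3 of the abstract can be sketched as follows. This is one possible reading, not the patent's implementation: the tables `DEFAULT_READING` and `SURNAME_READING`, the helper names, and the example name are hypothetical, and the sketch assumes the family name is the first character of the string.

```python
# Hypothetical tables for illustration; a real system would cover every character.
DEFAULT_READING = {"单": "dan", "曾": "ceng", "涛": "tao"}

# Step 3 (assumed interpretation): each polyphonic family-name character is
# pinned to one canonical syllable, so all of its readings share one pinyin code.
SURNAME_READING = {"单": "shan", "曾": "zeng"}


def decode_text(raw, encoding):
    """Step 2, first half: turn GBK or UTF-8 bytes into a text string."""
    return raw.decode(encoding)


def name_to_pinyin(name):
    """Step 2, second half, with the step 3 override on the first character."""
    syllables = []
    for index, ch in enumerate(name):
        if index == 0 and ch in SURNAME_READING:
            syllables.append(SURNAME_READING[ch])
        else:
            syllables.append(DEFAULT_READING[ch])
    return syllables


from_gbk = name_to_pinyin(decode_text("单涛".encode("gbk"), "gbk"))
from_utf8 = name_to_pinyin(decode_text("单涛".encode("utf-8"), "utf-8"))
print(from_gbk, from_utf8)
```

Whatever encoding the recognizer emits, both inputs reduce to the same pinyin sequence, and the family-name override keeps a polyphonic surname from being matched under its everyday reading.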

Description

Technical field

[0001] The invention relates to the field of digital electronic products, in particular to a Pinyin-based method for processing and comparing speech recognition character strings.

Background

[0002] In general, speech recognition is a technology that converts input speech into text through feature recognition. In some special situations, such as recognizing personal names or device names, this general-purpose technology cannot meet users' needs. The speech recognition algorithm may recognize the input speech as one character string pronounced "Yu Guoquan" while the user actually said a different, identically pronounced "Yu Guoquan". Because a name has little contextual relevance, the comparison will be incorrect.

Contents of the invention

[0003] The object of the present invention is to provide a Pinyin-based method for processing and comparing character strings for speech recognition. In the case of small character sets that require "special nouns" to ide...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30, G06F17/22
CPC: G06F40/126
Inventor: 孙涛
Owner: 深圳市艾塔文化科技有限公司