Method and system for processing pinyin string in process of inputting Chinese characters

A Chinese pinyin and Chinese character input technology, applied in the field of Chinese pinyin string processing methods and systems. It addresses the lack of support for fuzzy sound input, low efficiency, and large resource consumption, so as to reduce system resource consumption, improve processing efficiency, and improve the rationality of the results.

Inactive Publication Date: 2011-09-28
ALIBABA GRP HLDG LTD
2 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0007] The embodiments of the present application provide a method and system for segmenting Chinese pinyin strings, which address the large resource consumption, low efficiency, and lack of support for fuzzy sound input in existing pinyin string processing technology.



Examples


example 1

[0127] The pinyin string input by the user is xianzi, which is segmented according to the segmentation rules to obtain the segmentation path x ian z i. This path is then expanded: since ian can be expanded to i followed by the zero-initial syllable an, and z can be expanded to zh, the segmentation paths obtained by expansion, together with the path obtained by segmenting the pinyin string input by the user, may include:

[0128] x ian z i

[0129] x i an z i

[0130] x i ang z i

[0131] x iang z i

[0132] x ian zh i

[0133] x i an zh i

[0134] x i ang zh i

[0135] x iang zh i

[0136] Wherein, the first segmentation path is obtained by segmentation according to the pinyin string input by the user, and the subsequent segmentation paths are obtained after expansion. After syllable extraction processing and syllable legality verification processing, the following legal syllable sequence sets are obtained:

[0137] [x,ian][z,i]

[0138] [x,i][,an][z,i]

[0139] [x,i][...
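The expansion step in this example can be read as choosing one alternative for each substring and combining the choices. The following is a minimal sketch under that reading, not the patent's implementation; the `EXPANSIONS` table and the `expand_path` function are hypothetical names, and the table holds only the alternatives needed to reproduce the eight paths listed above.

```python
from itertools import product

# Hypothetical expansion table (an assumption for illustration): each
# substring maps to the alternatives it may be rewritten as, including
# itself, a split form, and fuzzy-sound variants.
EXPANSIONS = {
    "ian": [["ian"], ["i", "an"], ["i", "ang"], ["iang"]],
    "z": [["z"], ["zh"]],
}

def expand_path(path):
    """Return every path obtained by picking one alternative per substring."""
    # Substrings without an entry in the table expand only to themselves.
    choices = [EXPANSIONS.get(sub, [[sub]]) for sub in path]
    results = []
    for combo in product(*choices):
        # Flatten the chosen alternatives back into one substring sequence.
        results.append([piece for alt in combo for piece in alt])
    return results

paths = expand_path(["x", "ian", "z", "i"])
for p in paths:
    print(" ".join(p))
```

Because every alternative list starts with the unexpanded substring, the first path returned is the one obtained directly from the user's input, and the remaining paths are the expanded ones.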

example 2

[0147] The pinyin string input by the user is fangan, which is segmented according to the segmentation rules to obtain the segmentation path f ang an. This path is then expanded: since ang can be expanded into the two results an and an g, the segmentation paths obtained by expansion, together with the path obtained by segmenting the pinyin string input by the user, may include:

[0148] f ang an

[0149] f an g an

[0150] f an an

[0151] Wherein, the first segmentation path is obtained by segmentation according to the pinyin string input by the user, and the subsequent segmentation paths are obtained after expansion. After syllable extraction processing and syllable legality verification processing, the following legal syllable sequence sets are obtained:

[0152] [f,ang][,an]

[0153] [f, an] [g, an]

[0154] [f,an][,an]

[0155] The above [f, ang] [, an] can be mapped to 方案 ("scheme"), and [f, an] [g, an] can be mapped to 反感 ("antipathy"). It can be seen that through the above proc...
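The syllable extraction step that turns a path such as f an g an into [f, an] [g, an] can be sketched as a single pass that pairs each initial with the final that follows it and gives a final with no preceding initial an empty initial. This is an illustrative sketch only; `INITIALS` and `extract_syllables` are hypothetical names, and treating y and w as initials is an assumption.

```python
# Assumed set of initials; y and w are treated as initials here.
INITIALS = {
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s", "y", "w",
}

def extract_syllables(path):
    """Group a substring sequence into (initial, final) pairs."""
    syllables = []
    i = 0
    while i < len(path):
        if path[i] in INITIALS:
            # An initial takes the next substring as its final, unless the
            # next substring is missing or is itself an initial.
            has_final = i + 1 < len(path) and path[i + 1] not in INITIALS
            final = path[i + 1] if has_final else ""
            syllables.append((path[i], final))
            i += 2 if has_final else 1
        else:
            # A final with no preceding initial forms a zero-initial syllable.
            syllables.append(("", path[i]))
            i += 1
    return syllables

for path in (["f", "ang", "an"], ["f", "an", "g", "an"], ["f", "an", "an"]):
    print(extract_syllables(path))
```

An initial left without a final is emitted with an empty final so that a later validity check can reject the sequence rather than have it silently dropped here.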

example 3

[0157] The pinyin string input by the user is piao, which is segmented according to the segmentation rules to obtain the segmentation path p iao. This path is then expanded: since iao can be expanded to i ao, the segmentation paths obtained by expansion, together with the path obtained by segmenting the pinyin string input by the user, may include:

[0158] p iao

[0159] p i ao

[0160] Wherein, the first segmentation path is obtained by segmentation according to the pinyin string input by the user, and the subsequent segmentation paths are obtained after expansion. After syllable extraction processing and syllable legality verification processing, the following legal syllable sequence sets are obtained:

[0161] [p, iao]

[0162] [p, i] [, ao]

[0163] The above [p, iao] can be mapped to 飘 ("Gone with the Wind"), and [p, i][, ao] can be mapped to 皮袄 ("fur jacket"). It can be seen that through the above processing, the problem that the mapp...
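The legality verification applied to the syllable sequences can be illustrated as a lookup against a syllable table, discarding any sequence that contains a syllable not in the table. The sketch below assumes a toy table, `VALID_SYLLABLES`, that covers only this example; the function names and the third candidate are hypothetical and do not come from the source.

```python
# Toy subset of a syllable table (an assumption); a real table would list
# every legal syllable the input method accepts.
VALID_SYLLABLES = {"piao", "pi", "ao"}

def is_valid(sequence):
    """A sequence is valid only if every initial+final is in the table."""
    return all(initial + final in VALID_SYLLABLES for initial, final in sequence)

def filter_valid(sequences):
    """Drop every syllable sequence that contains an invalid syllable."""
    return [seq for seq in sequences if is_valid(seq)]

candidates = [
    [("p", "iao")],
    [("p", "i"), ("", "ao")],
    [("p", ""), ("", "iao")],  # hypothetical candidate with a dangling initial
]

for seq in filter_valid(candidates):
    print(seq)
```

Here the third candidate is removed because neither "p" nor "iao" appears in the toy table, while the two sequences listed in the example are kept.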



Abstract

The invention discloses a method and system for processing a pinyin string in the process of inputting Chinese characters. The method comprises the steps of: segmenting a received pinyin string, taking the initials and finals in the pinyin string as segmented sub-strings, to obtain a segmented sub-string sequence; expanding the segmented sub-strings in the sequence, and generating a set of expanded sub-string sequences from the expansion results; extracting syllables from each expanded sub-string sequence in the set according to syllable composition characteristics, to obtain a corresponding syllable sequence; and verifying the validity of the syllables in each syllable sequence, and deleting any syllable sequence that includes invalid syllables according to the verification results. The method and system address the problems of conventional pinyin string processing technology, such as large consumption of system resources, low efficiency, and lack of support for fuzzy sounds.
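The first step of the method, splitting the received string into initials and finals, can be sketched as follows. The listing does not reproduce the segmentation rules, so greedy longest-prefix matching is used here as a stand-in assumption; `INITIALS`, `FINALS`, `longest_prefix`, and `segment` are hypothetical names, and ü is written as v.

```python
# Assumed inventories of initials and finals; the final list is not exhaustive.
INITIALS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l", "g",
            "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]
FINALS = ["a", "o", "e", "i", "u", "v", "ai", "ei", "ui", "ao", "ou", "iu",
          "ie", "ue", "ve", "er", "an", "en", "in", "un", "vn", "ang", "eng",
          "ing", "ong", "ia", "iao", "ian", "iang", "iong", "ua", "uo",
          "uai", "uan", "uang"]

def longest_prefix(text, candidates):
    """Return the longest candidate that text starts with, or None."""
    for cand in sorted(candidates, key=len, reverse=True):
        if text.startswith(cand):
            return cand
    return None

def segment(pinyin):
    """Split a pinyin string into a sequence of initials and finals."""
    pieces, pos = [], 0
    while pos < len(pinyin):
        rest = pinyin[pos:]
        # Initials begin with consonants and finals with vowels, so at most
        # one of the two lookups can match at a given position.
        piece = longest_prefix(rest, INITIALS) or longest_prefix(rest, FINALS)
        if piece is None:
            raise ValueError(f"cannot segment {rest!r} at position {pos}")
        pieces.append(piece)
        pos += len(piece)
    return pieces

for text in ("xianzi", "fangan", "piao"):
    print(text, "->", " ".join(segment(text)))
```

Greedy matching yields only a single path per input (for example f ang an for fangan); alternatives such as f an g an would come from the later expansion step rather than from the segmenter.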

Description

Technical field

[0001] The present application relates to the technical field of computer Chinese character input, in particular to a method and system for processing Chinese pinyin strings in the process of Chinese character input.

Background

[0002] A Chinese character input method (Input Method Editor, IME) is a method of inputting Chinese characters with a keyboard according to certain encoding rules. In terms of the principle of Chinese character input, input methods fall into two categories: shape-based codes, such as the Wubi input method, which encodes characters based on their strokes; and phonetic codes, such as the Pinyin input method, which encodes characters based on their pronunciation.

[0003] The pinyin input method is a method of inputting Chinese characters according to their pinyin. In order to convert the pinyin input by the user into Chinese characters for output, it is necessary to first segment t...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/023
Inventor: 薛永刚, 陈培军, 秦吉胜, 侯磊
Owner: ALIBABA GRP HLDG LTD