Chinese pinyin fast word segmentation method based on a word search tree

A word search tree applied to Chinese pinyin, in the fields of special data processing applications, instruments, and electrical digital data processing. It addresses problems such as low query performance, low efficiency, and missed words, with the effects of improving memory usage efficiency, ensuring accuracy, and improving search efficiency.
CN102867049B · Active · Publication Date: 2015-02-25 · 康威通信技术股份有限公司

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: 康威通信技术股份有限公司
Publication Date: 2015-02-25

Abstract

The invention discloses a fast Chinese pinyin word segmentation method based on a word search tree. The method is implemented on a computer or embedded mobile device and comprises the following steps: 1, building a Chinese character pinyin search tree from the list of all known Chinese character pinyin; 2, using the built word search tree in combination with a hash table to segment a given string of Chinese pinyin; 3, producing the word segmentation result; and 4, destroying the search tree and releasing resources. Because strings that share a common prefix share nodes in the tree, construction space is saved and unnecessary string comparisons are greatly reduced; the redundant indexed hash table improves search efficiency; and the time complexity of the algorithm is reduced to a minimum.
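
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the names `TrieNode`, `build_trie`, `segment`, and the syllable list `TOY_SYLLABLES` are hypothetical, the syllable list is a tiny subset chosen for the example, and the longest-match-first backtracking strategy is an assumption. The source does not say how the search tree and the hash table are combined, so the hash table is represented here only by the per-node `children` dictionaries.

```python
# Hypothetical toy subset of pinyin syllables; a real list would hold all of them.
TOY_SYLLABLES = ["zhong", "guo", "ren", "fang", "fan", "an", "ang",
                 "gan", "ga", "xi", "xian", "a"]


class TrieNode:
    """One node of the search tree; children are keyed by a single letter."""
    __slots__ = ("children", "is_syllable")

    def __init__(self):
        self.children = {}        # letter -> TrieNode (a small hash table per node)
        self.is_syllable = False  # True if the path from the root spells a syllable


def build_trie(syllables):
    """Step 1: build the search tree; shared prefixes share nodes."""
    root = TrieNode()
    for syllable in syllables:
        node = root
        for letter in syllable:
            node = node.children.setdefault(letter, TrieNode())
        node.is_syllable = True
    return root


def segment(root, text):
    """Steps 2 and 3: split text into syllables, or return None if impossible.

    Assumed strategy: at each position prefer the longest syllable and
    backtrack to shorter ones when the remainder cannot be segmented.
    """
    dead_ends = set()  # start positions already known to be unsegmentable

    def walk(start):
        if start == len(text):
            return []
        if start in dead_ends:
            return None
        # A single pass down the tree finds every syllable starting at `start`.
        node, ends = root, []
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.is_syllable:
                ends.append(i + 1)
        for end in reversed(ends):  # longest candidate first
            rest = walk(end)
            if rest is not None:
                return [text[start:end]] + rest
        dead_ends.add(start)
        return None

    return walk(0)


root = build_trie(TOY_SYLLABLES)
print(segment(root, "zhongguoren"))
print(segment(root, "fangan"))
del root  # Step 4: drop the tree so its memory can be reclaimed
```

With this toy list, "xiang" is not a syllable, so segmenting `"xiang"` first tries `"xian"`, fails on the leftover `"g"`, and backtracks to `"xi"` + `"ang"`. In a language without garbage collection, step 4 would be an explicit recursive free of the nodes.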

Description

Technical field

[0001] The invention belongs to the technical field of Chinese information processing on computers and various handheld embedded mobile devices, and in particular relates to a fast Chinese pinyin word segmentation method based on a word search tree.

Background art

[0002] Automatically recognizing, by a software algorithm, the pinyin of each individual character within a continuous string of Chinese pinyin is an essential technique for pinyin input methods and for search engines (which associate Chinese sentences with pinyin keywords). One approach uses the pinyin of all existing single Chinese characters as keys to build a hash table, and segments a continuous pinyin string by searching and matching against that hash table multiple times. However, this method suffers from low efficiency.
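
To make the cost of the hash-table approach concrete, here is a minimal sketch of it. The function name `segment_with_hash` and the lookup counter are hypothetical, and forward maximum matching is an assumed matching order, since the text does not say how candidates are tried. The cap of 6 letters reflects the longest standard pinyin syllables (for example "zhuang").

```python
def segment_with_hash(text, syllables, max_len=6):
    """Segment text by repeated hash-table lookups (assumed forward maximum matching).

    Returns (segments, lookups); segments is None if no greedy split exists.
    """
    table = set(syllables)  # hash table keyed by syllable
    segments = []
    lookups = 0
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, shrinking one letter at a time.
        for length in range(min(max_len, len(text) - pos), 0, -1):
            lookups += 1
            candidate = text[pos:pos + length]  # a new substring per attempt
            if candidate in table:
                segments.append(candidate)
                pos += length
                break
        else:
            return None, lookups  # no syllable starts at this position
    return segments, lookups


words, lookups = segment_with_hash("zhongguoren", ["zhong", "guo", "ren"])
print(words, lookups)
```

Every failed attempt builds and hashes a fresh substring, so a single syllable can cost several lookups; this repeated searching and matching is the inefficiency the paragraph describes.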

[0003] In order to improve efficiency, the above-mentioned hash table is improved as follows in the prior a...
