Chinese character fragmenting device

A Chinese word segmentation technology, applied in the field of Chinese character segmentation devices, that addresses the problems of repeated calculation, reduced work efficiency, and insufficient segmentation accuracy.

Status: Inactive. Publication Date: 2003-09-17
PANASONIC CORP
1 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0007] 3. Relaxation iteration requires repeated calculation, so the computational load is heavy, which reduces work efficiency.
[0008] 4. For some applications, such as dictionary translation, a Chinese word segmentation accuracy of 95% is not sufficient.


Examples

Embodiment Construction

[0034] The term "semantics" in the present invention refers to the meaning of a word, as indicated by its semantic code. The preferred embodiment of the present invention adopts the semantic taxonomy of the 1985 edition of the dictionary published by Kadokawa Shoten (角川書店), Japan. Under this taxonomy, a four-digit hexadecimal code serves as the classification code of a word. The leftmost digit indicates the general category, the second digit the subcategory, the third digit the segment, and the rightmost digit the subsection. All words in the dictionary are grouped into 10 general categories: nature, character, change, action, mood, person, aptitude, society, art, and object. Each general category is in turn divided into 10 subcategories. The following is an example of the semantic taxonomy:

[0035] Semantic code: Description

[0036] 0: Nature

[0037] 02: Meteorology, a subcategory belonging to the nature cat...
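The four-digit code layout described in [0034] can be sketched as a small parser. This is only an illustration under stated assumptions: the names `SemanticCode`, `parse_semantic_code`, and `shared_prefix_depth` are hypothetical, the code `0213` is a made-up value, and the mapping of digits 1 to 9 onto category names simply follows the order in which the text lists them (the source confirms only that 0 is nature).

```python
from typing import NamedTuple

# General categories in the order listed in the text. Only "0 = nature" is
# confirmed by the source; the remaining index order is an assumption.
GENERAL_CATEGORIES = [
    "nature", "character", "change", "action", "mood",
    "person", "aptitude", "society", "art", "object",
]


class SemanticCode(NamedTuple):
    general: int     # leftmost digit: general category
    subclass: int    # second digit: subcategory
    segment: int     # third digit: segment
    subsection: int  # rightmost digit: subsection


def parse_semantic_code(code: str) -> SemanticCode:
    """Split a four-digit hexadecimal semantic code into its four levels."""
    if len(code) != 4:
        raise ValueError("a semantic code must have exactly four digits")
    # int(ch, 16) raises ValueError for any non-hexadecimal character.
    return SemanticCode(*(int(ch, 16) for ch in code))


def shared_prefix_depth(a: str, b: str) -> int:
    """Count how many leading levels two codes share (0 to 4)."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


if __name__ == "__main__":
    code = parse_semantic_code("0213")  # hypothetical code under "02"
    print(code, GENERAL_CATEGORIES[code.general])
    print(shared_prefix_depth("0213", "0257"))
```

Two codes that share a longer prefix fall into the same finer-grained class, which is one plausible way such codes could be compared; the source excerpt does not say how the device actually uses them.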

Abstract

The Chinese character segmentation device uses phonetic information about characters stored in the computer to segment Chinese sentences into words. The character-to-speech conversion part converts a sentence input to the computer into a string of phonetic symbols. The candidate word selection part uses the phonetic symbols as retrieval keys to extract possible candidate characters or words together with their relevant information. The best candidate string determination part uses the start and end positions of each candidate character or word to build a candidate word network for the retrieval keys. Once an overall evaluation has been obtained, dynamic programming is used to find the best segmentation. The device achieves a word segmentation accuracy above 98% without cumbersome repeated calculation, and can markedly improve work efficiency and accuracy.
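The last step of the abstract, finding the best path through a candidate word network with dynamic programming, can be sketched as follows. This is a minimal sketch, not the patented method: the `Candidate` type, the function `best_segmentation`, the additive scoring, and the example scores are all assumptions made for illustration, and the phonetic lookup that would produce the candidates is not modeled.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    start: int    # index of the first character covered
    end: int      # index one past the last character covered
    word: str
    score: float  # hypothetical evaluation score, higher is better


def best_segmentation(n: int, candidates: List[Candidate]) -> Optional[List[str]]:
    """Return the highest-scoring word sequence covering positions 0..n.

    best[i] holds the best total score of any candidate path that covers
    characters 0..i exactly; back[i] remembers the last candidate on it.
    Returns None when no path covers the whole sentence.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)
    back: List[Optional[Candidate]] = [None] * (n + 1)
    best[0] = 0.0

    # Processing candidates by end position guarantees best[start] is final
    # before it is used, because every candidate has start < end.
    for cand in sorted(candidates, key=lambda c: c.end):
        if not 0 <= cand.start < cand.end <= n:
            raise ValueError(f"candidate out of range: {cand}")
        if best[cand.start] == neg_inf:
            continue  # no path reaches this candidate's start
        total = best[cand.start] + cand.score
        if total > best[cand.end]:
            best[cand.end] = total
            back[cand.end] = cand

    if best[n] == neg_inf:
        return None

    # Walk the back-pointers from the end of the sentence to the start.
    words: List[str] = []
    pos = n
    while pos > 0:
        cand = back[pos]
        words.append(cand.word)
        pos = cand.start
    words.reverse()
    return words


if __name__ == "__main__":
    # Illustrative four-character sentence and made-up scores.
    lattice = [
        Candidate(0, 1, "昨", 1.0), Candidate(1, 2, "天", 1.0),
        Candidate(0, 2, "昨天", 3.0), Candidate(1, 3, "天下", 2.5),
        Candidate(2, 3, "下", 1.0), Candidate(3, 4, "雨", 1.0),
        Candidate(2, 4, "下雨", 3.0),
    ]
    print(best_segmentation(4, lattice))
```

Because each position's best score is computed once from already-final earlier positions, the search is a single pass over the candidates after sorting, which is consistent with the abstract's claim of avoiding repeated calculation.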

Description

Technical field

[0001] The invention relates to a device for segmenting Chinese characters, which uses a computer to segment the Chinese characters in Chinese sentences into words.

Background art

[0002] In contemporary computer application research, using computers to process natural languages such as Chinese and English has become a popular research field. Automatic translation, speech processing, automatic correction of documents, and computer-aided instruction are commonly referred to as natural language processing. The analysis of a natural language sentence can be divided into sequential steps: input, word segmentation, syntax analysis, and semantic analysis. Word segmentation is the process of converting the sequence of characters in an input sentence into a sequence of words. For example, if the input sentence is "It rained yesterday", the possible word segmentation results include "Yesterday * sky * Down * rain", "yesterday * Down ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G01F3/22, G06F17/00
Inventor: 郭俊桔
Owner: PANASONIC CORP