Method and system for dividing Chinese sentences

A Chinese word segmentation technology, applied in special-purpose data processing, instruments, electrical digital data processing, and related fields, that addresses the low accuracy of existing Chinese word segmentation methods and improves both efficiency and accuracy.

Inactive Publication Date: 2007-12-05
TENCENT TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0007] The purpose of the present invention is to provide a Chinese word segmentation system that solves the low accuracy of existing Chinese word segmentation methods.


Examples

Embodiment Construction

[0043] In order to make the objectives, technical solutions and advantages of the present invention clearer, the following further describes the present invention in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention, but not to limit the present invention.

[0044] The present invention performs atomic segmentation of the input Chinese text, performs dictionary word segmentation and specific word recognition separately on the resulting atomic sequence, and adds each independent word segmentation result to the segmentation word map. It then generates an optimal word segmentation path from the word segmentation results in the segmentation word map, and finally outputs the comprehensive word segmentation result according to that path. Since the technical scheme of the present invention comprehensiv...
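The pipeline in [0044] can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the patent's implementation: each character is treated as one atom, the hypothetical `recognize_numbers` stands in for a specific word recognizer, and the "optimal" path is taken to be the one with the fewest words, since the excerpt does not say how paths are scored.

```python
def recognize_numbers(atoms):
    """Hypothetical specific-word recognizer: spans of consecutive digits."""
    spans, i = [], 0
    while i < len(atoms):
        if atoms[i].isdigit():
            j = i
            while j < len(atoms) and atoms[j].isdigit():
                j += 1
            if j - i > 1:
                spans.append((i, j))
            i = j
        else:
            i += 1
    return spans


def segment(text, dictionary, recognizers=()):
    # Atomic segmentation (assumption: one character per atom).
    atoms = list(text)
    n = len(atoms)

    # Initial word map: edges[i] holds every end position j such that
    # atoms[i:j] is a candidate word. Single atoms are always candidates.
    edges = [{i + 1} for i in range(n)]

    # Dictionary segmentation: add every multi-atom span found in the dictionary.
    for i in range(n):
        for j in range(i + 2, n + 1):
            if "".join(atoms[i:j]) in dictionary:
                edges[i].add(j)

    # Specific word recognition: each recognizer adds its own spans independently.
    for recognize in recognizers:
        for i, j in recognize(atoms):
            edges[i].add(j)

    # Optimal path (assumption: fewest words), found by dynamic programming
    # over positions 0..n, which are already in topological order.
    best = [0] + [float("inf")] * n
    back = [0] * (n + 1)
    for i in range(n):
        for j in sorted(edges[i]):
            if best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                back[j] = i

    # Walk the back-pointers from the end to recover the words.
    words, j = [], n
    while j > 0:
        i = back[j]
        words.append("".join(atoms[i:j]))
        j = i
    return words[::-1]


print(segment("中文分词系统2007", {"中文", "分词", "系统"}, [recognize_numbers]))
```

Because the dictionary and each recognizer only add edges to a shared graph, a recognizer can be switched on or off without touching the path search.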


Abstract

The invention discloses a Chinese word segmentation method and system in the field of Chinese information processing. The method comprises the following steps: A. performing atomic segmentation on the input Chinese text and building an initial segmentation word map from the atom sequence; B. performing dictionary word segmentation and specific word recognition based on the atom sequence, and adding each independent word segmentation result to the segmentation word map; C. generating an optimal word segmentation path according to the word segmentation results in the segmentation word map, and outputting the comprehensive word segmentation result according to that path. The invention improves the accuracy of Chinese word segmentation with high efficiency, and can selectively recognize each kind of specific word according to the specific conditions.
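Step A can be sketched in a few lines of Python. The abstract does not define an atom, so this sketch assumes one: a run of ASCII letters or digits counts as a single atom, and every other non-space character is its own atom. The names `atomic_segment` and `initial_word_map` are hypothetical.

```python
import re

# Assumption: an atom is a run of ASCII letters/digits, or any other
# single non-whitespace character (for example, one Chinese character).
ATOM_PATTERN = re.compile(r"[A-Za-z0-9]+|\S")


def atomic_segment(text):
    """Split text into atoms; whitespace is dropped."""
    return ATOM_PATTERN.findall(text)


def initial_word_map(atoms):
    """Initial word map: one edge (start, end, word) per atom."""
    return [(i, i + 1, atom) for i, atom in enumerate(atoms)]


atoms = atomic_segment("我有2007年的QQ号")
print(atoms)
print(initial_word_map(atoms)[:3])
```

Under this assumption, "2007" and "QQ" each enter the word map as a single node, so the later steps never split them.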

Description

Technical field

[0001] The invention relates to the field of Chinese information processing, and more specifically, to a Chinese word segmentation method and system.

Background technique

[0002] Chinese information processing technology is now widely used in computer networks, database technology, software engineering and other computer fields. Automatic Chinese word segmentation is an important basic task in Chinese information processing. Many Chinese information processing applications involve word segmentation problems, such as machine translation, automatic summarization, automatic classification, full-text retrieval of Chinese literature, and search engines. Since Chinese text is written continuously with no spaces between words, the first problem encountered in Chinese text processing is word segmentation. Correct segmentation of words is a necessary condition for Chinese text processing.

[0003] Chinese word segmentation algorithms can b...

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/27
Inventor: 张会鹏
Owner: TENCENT TECH (SHENZHEN) CO LTD