Method and device for Chinese word segmentation

A Chinese word segmentation technology, applied in the field of word segmentation, that addresses the problem of inaccurate word segmentation results by avoiding wrong word segmentation combinations and thereby improving accuracy.

Active Publication Date: 2019-07-09
BEIJING GRIDSUM TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] The invention provides a method and device for Chinese word segmentation that can solve the problem of inaccurate word segmentation results.


Examples

Detailed Description of Embodiments

[0027] Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided for more thorough understanding of the present disclosure and to fully convey the scope of the present disclosure to those skilled in the art.

[0028] An embodiment of the present invention provides a method for Chinese word segmentation. As shown in Figure 1, the method includes:

[0029] 101. Perform forward matching word segmentation and reverse matching word segmentation on the same target string, respectively, to obtain a forward word segmentation sequence and a reverse word segmentation sequence.

[0030] The target character string refers to the Chinese chara...
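Step 101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the source only says "forward matching" and "reverse matching", so the use of maximum matching, the window size `max_len`, the fallback to single characters, and the toy `DICTIONARY` are all assumptions made for the example.

```python
# Hypothetical toy dictionary; a real system would use a large machine dictionary.
DICTIONARY = {"研究", "研究生", "生命", "命", "的", "起源"}


def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, taking the longest dictionary word at each position."""
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Assumption: an unmatched single character is emitted as its own word.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def reverse_max_match(text, dictionary, max_len=4):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    words = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                j -= size
                break
    words.reverse()  # words were collected from the end of the string
    return words


forward_seq = forward_max_match("研究生命的起源", DICTIONARY)
reverse_seq = reverse_max_match("研究生命的起源", DICTIONARY)
print(forward_seq)  # ['研究生', '命', '的', '起源']
print(reverse_seq)  # ['研究', '生命', '的', '起源']
```

In this example the two directions disagree on the first four characters, which is the kind of disagreement the later steps of the method resolve.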

Abstract

The invention discloses a method and device for Chinese word segmentation and relates to the technical field of word segmentation. The method and device are intended to solve the problem of inaccurate word segmentation results. The method comprises the following steps: forward matching word segmentation and reverse matching word segmentation are performed on the same target character string to obtain a forward word segmentation sequence and a reverse word segmentation sequence, respectively; conflict words between the forward word segmentation sequence and the reverse word segmentation sequence are searched for, wherein the conflict words include a first conflict word, which is contained in the forward word segmentation sequence but not in the reverse word segmentation sequence, and a second conflict word, which is contained in the reverse word segmentation sequence but not in the forward word segmentation sequence; a contribution value of the first conflict word is computed and recorded as a first contribution value; a contribution value of the second conflict word is computed and recorded as a second contribution value; the first contribution value and the second contribution value are compared, and the conflict word with the larger contribution value is recorded as a high-quality conflict word; and the high-quality conflict word is combined with the non-conflict words to determine the final word segmentation result of the target character string. The method and device are mainly applied to Chinese word segmentation.
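The conflict-resolution steps in the abstract can be sketched as below. The visible text does not say how a contribution value is computed, so the frequency table `FREQ`, the `contribution` callback, the summing of values over a group of adjacent conflict words, and the tie-breaking toward the reverse sequence are all hypothetical choices made only to keep the example runnable.

```python
# Hypothetical word-frequency table standing in for the "contribution value".
FREQ = {"研究生": 5, "命": 1, "研究": 20, "生命": 15, "的": 100, "起源": 8}


def resolve_conflicts(forward, reverse, contribution):
    """Merge two segmentations of the same string, region by region.

    Regions where both sequences agree are copied through as non-conflict
    words. Regions where they differ contain the conflict words; the side
    with the larger total contribution value is kept.
    """
    result = []
    i = j = 0
    while i < len(forward) and j < len(reverse):
        f_group, r_group = [forward[i]], [reverse[j]]
        f_len, r_len = len(forward[i]), len(reverse[j])
        i += 1
        j += 1
        # Extend the shorter side until both groups cover the same characters.
        while f_len != r_len:
            if f_len < r_len:
                f_group.append(forward[i])
                f_len += len(forward[i])
                i += 1
            else:
                r_group.append(reverse[j])
                r_len += len(reverse[j])
                j += 1
        if f_group == r_group:
            result.extend(f_group)  # non-conflict word
        else:
            first_value = sum(contribution(w) for w in f_group)
            second_value = sum(contribution(w) for w in r_group)
            # Assumption: ties go to the reverse sequence.
            result.extend(f_group if first_value > second_value else r_group)
    return result


final = resolve_conflicts(
    ["研究生", "命", "的", "起源"],   # forward word segmentation sequence
    ["研究", "生命", "的", "起源"],   # reverse word segmentation sequence
    lambda word: FREQ.get(word, 0),
)
print(final)  # ['研究', '生命', '的', '起源']
```

Here the first conflict words are "研究生" and "命" (total 6) and the second conflict words are "研究" and "生命" (total 35), so the reverse-side words are kept and combined with the non-conflict words "的" and "起源".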

Description

Technical field

[0001] The invention relates to the technical field of word segmentation, in particular to a method and device for Chinese word segmentation.

Background

[0002] Chinese word segmentation is the basis of text mining. For a piece of Chinese input, successful word segmentation allows a computer to automatically identify the meaning of a sentence. Chinese word segmentation refers to dividing a Chinese character string into individual words and obtaining a word segmentation sequence composed of these independent words.

[0003] The most widely used Chinese word segmentation method at present is mechanical word segmentation, which matches the Chinese character string to be analyzed against the entries of a "sufficiently large" machine dictionary according to a certain strategy. If the characters are the same as those of an entry, the match is successful, that is, a word is recognized.

[0004] For a specific word with a specific meaning, ther...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27
CPC: G06F40/284
Inventors: 胡斌, 崔维福
Owner: BEIJING GRIDSUM TECH CO LTD