
Text processing method and text processing system

A text processing method and system, applied in the computer field, that address the low efficiency and accuracy of text processing.

Active Publication Date: 2013-06-26
新浪技术(中国)有限公司
4 Cites, 4 Cited by

AI Technical Summary

Problems solved by technology

[0007] Embodiments of the present invention provide a text processing method and system to solve the problem of low efficiency and accuracy of text processing in the prior art.



Examples


Embodiment Construction

[0022] In order to shorten the path length when decoding with the CRF algorithm, in the embodiment of the present invention entity recognition is performed on the text with the unit word as the unit. To perform entity recognition at the unit-word level, the entity word attribute of each unit word must be determined according to the part of speech of that unit word, and entity recognition is then performed according to the entity word attributes of the unit words. Word segmentation, part-of-speech tagging and entity recognition therefore need to be combined.
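The effect of the sequence unit on decoding length can be illustrated with a generic Viterbi decoder, the standard decoding procedure for linear-chain CRFs. This is a minimal sketch, not the patent's implementation: the label set (`ENT`, `O`), the scores and the function name `viterbi` are assumptions chosen for illustration. It counts how many transition scores are evaluated, which grows with the number of positions in the sequence.

```python
from typing import Dict, List, Tuple


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            labels: List[str]) -> Tuple[List[str], int]:
    """Return the best label path and the number of transition scores evaluated."""
    # Scores of the best path ending in each label at the first position.
    best = {y: emissions[0].get(y, 0.0) for y in labels}
    back: List[Dict[str, str]] = []
    evaluated = 0
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            candidates = []
            for prev in labels:
                evaluated += 1
                candidates.append((best[prev] + transitions.get((prev, y), 0.0), prev))
            score, arg = max(candidates)
            new_best[y] = score + scores.get(y, 0.0)
            pointers[y] = arg
        best = new_best
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, evaluated


LABELS = ["ENT", "O"]  # hypothetical label set
# Hypothetical scores for three unit words.
word_emissions = [{"ENT": 2.0, "O": 0.0}, {"ENT": 0.0, "O": 1.0}, {"ENT": 2.0, "O": 0.0}]
# The same text tagged character by character would have six positions.
char_emissions = [{} for _ in range(6)]

word_path, word_cost = viterbi(word_emissions, {}, LABELS)
char_path, char_cost = viterbi(char_emissions, {}, LABELS)
```

With K labels and T positions the decoder evaluates (T - 1) * K * K transitions, so tagging three unit words instead of six characters reduces the work in this toy case from 20 evaluations to 8.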

[0023] Preferred embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.

[0024] Figure 1 shows the text processing process provided by the embodiment of the present invention, which specifically includes the following steps:

[0025] S101: Perform word segmentation processing on the text to obtain each unit word in the text.
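The excerpt does not say which segmentation algorithm step S101 uses. As one possible illustration, the sketch below applies greedy forward maximum matching against a small dictionary. The function name `segment`, the `max_len` parameter and the toy lexicon are all hypothetical.

```python
from typing import List, Set


def segment(text: str, lexicon: Set[str], max_len: int = 4) -> List[str]:
    """Greedy forward maximum matching: take the longest lexicon entry at each position."""
    words: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the loop makes progress.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy dictionary: Andy Lau, Beijing, Great Hall of the People.
LEXICON = {"刘德华", "北京", "人民大会堂"}
unit_words = segment("刘德华在北京", LEXICON, max_len=5)
print(unit_words)
```

Each element of the returned list plays the role of one unit word for the later steps.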

[002...


Abstract

The invention discloses a text processing method and a text processing system for solving the problems of low efficiency and low accuracy of text processing in the prior art. The text processing system performs word segmentation on a text to obtain unit words. For each unit word, the features of the unit word are determined according to the unit word and the characters in the unit word, and the entity word attribute of the unit word is determined according to the unit word and the characters in the unit word. The entity words in the text are identified according to the entity word attributes of the unit words, and the text is processed according to the identified entity words. Because the text processing system performs entity recognition with the unit word as the unit, the path length is effectively shortened when a CRF algorithm is used for decoding, and the efficiency and accuracy of entity recognition are improved. The efficiency and accuracy of subsequent text processing based on the identified entity words are improved as a result.
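The two per-word steps in the abstract, deriving features from a unit word and its characters and then identifying entity words from the unit words' entity word attributes, might be sketched as follows. The feature names and the `B-`/`I-`/`O` attribute scheme are assumptions made for illustration; the excerpt does not specify either, and the attribute values here are supplied by hand rather than predicted by a trained CRF.

```python
from typing import Dict, List, Optional, Tuple


def unit_word_features(word: str) -> Dict[str, object]:
    """Hypothetical features built from the unit word and the characters in it."""
    return {
        "word": word,
        "first_char": word[0],
        "last_char": word[-1],
        "length": len(word),
    }


def merge_entities(words: List[str], attrs: List[str]) -> List[Tuple[str, str]]:
    """Join consecutive unit words tagged B-X, I-X, ... into one entity word of type X."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    kind: Optional[str] = None
    for word, attr in zip(words, attrs):
        if attr.startswith("B-"):
            # A new entity starts; close any entity still open.
            if current:
                entities.append(("".join(current), kind))
            current, kind = [word], attr[2:]
        elif attr.startswith("I-") and current and attr[2:] == kind:
            current.append(word)
        else:
            # Non-entity unit word (or an inconsistent tag) ends the open entity.
            if current:
                entities.append(("".join(current), kind))
            current, kind = [], None
    if current:
        entities.append(("".join(current), kind))
    return entities


words = ["刘", "德华", "在", "北京"]          # hypothetical segmentation result
attrs = ["B-PER", "I-PER", "O", "B-LOC"]     # hypothetical entity word attributes
features = [unit_word_features(w) for w in words]
entities = merge_entities(words, attrs)
```

In this toy input the person name is split across two unit words, and the merge step reassembles it into a single entity word before any downstream processing.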

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a text processing method and system.

Background

[0002] At present, text processing is widely used in various fields. Generally, it is necessary to perform word segmentation, part-of-speech tagging and entity recognition on the text, and then process the text according to the word segmentation results, the tagged parts of speech and the recognized entity words.

[0003] Here, entity words are words such as names of people, places and institutions, for example Andy Lau, Beijing and the Great Hall of the People. Words other than entity words are non-entity words.

[0004] In the prior art, the above-mentioned word segmentation, part-of-speech tagging and entity recognition are generally regarded as three independent processes, or word segmentation and part-of-speech tagging are regarded as one process, and entity recognition is regarded as a separ...


Application Information

IPC(8): G06F17/21
Inventor: 戴明洋
Owner: 新浪技术(中国)有限公司