Chinese text processing method and device thereof

A Chinese text processing technology, applied in the field of information processing, that addresses problems such as the high false-positive probability of retrieval results.

Status: Inactive; Publication Date: 2014-05-28
北京系统工程研究所
Cites: 4; Cited by: 0

AI Technical Summary

Problems solved by technology

[0006] However, the inventors found that in network content forensics, segmenting the text under examination with the Chinese word segmentation method leads to a higher probability of false positives in the retrieval results.

Examples

Embodiment Construction

[0082] The inventors' earlier Chinese patent application No. 200910083457.2, titled "Network Forensics Method and System", provides a network forensics method comprising: capturing the data stream flowing through the monitored network; extracting plain text selections and the network connection records corresponding to them from the data stream; storing the plain text selections and the corresponding network connection records; and performing forensic analysis on the stored plain text selections and network connection records. When storing, a text segmentation method can be used to segment the plain text selections, and each resulting plain text fragment is mapped to its corresponding IP pair and then stored; correspondingly, in the forensic analysis stage, for the plain text to be...
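
The storage step described in [0082], where each plain text fragment is mapped to its IP pair, can be pictured as a small inverted index. The following is only a minimal sketch under stated assumptions: the class name `FragmentIndex`, its methods, and the example addresses are hypothetical and do not come from the patent, and the lookup shown is a plain exact-match query rather than the forensic analysis procedure, which the excerpt truncates.

```python
from collections import defaultdict

# Hypothetical representation of an IP pair: (source address, destination address).
IpPair = tuple[str, str]


class FragmentIndex:
    """Illustrative in-memory mapping from text fragments to IP pairs."""

    def __init__(self) -> None:
        # Each fragment maps to the set of IP pairs whose traffic contained it.
        self._index: dict[str, set[IpPair]] = defaultdict(set)

    def store(self, fragments: list[str], ip_pair: IpPair) -> None:
        """Record that every fragment was seen on the given connection."""
        for fragment in fragments:
            self._index[fragment].add(ip_pair)

    def ip_pairs_for(self, fragment: str) -> set[IpPair]:
        """Return a copy of the IP pairs recorded for an exact fragment."""
        return set(self._index.get(fragment, set()))


if __name__ == "__main__":
    index = FragmentIndex()
    # Fragments are supplied by hand here; a real system would obtain them
    # from a text segmentation step.
    index.store(["网络", "取证"], ("192.0.2.1", "198.51.100.7"))
    print(index.ip_pairs_for("取证"))
```

Because the index is keyed on exact fragments, a short fragment that occurs in many unrelated texts returns many IP pairs, which is one way to picture the false-positive problem the patent targets.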

Abstract

The invention discloses a Chinese text processing method and device. The method comprises: obtaining a Chinese text to be segmented; segmenting the text with a Chinese word segmentation method to obtain N0 initial text fragments; and performing M-level aggregation processing on the N0 initial text fragments, wherein N0, NM and M are integers no less than 1. The method and device can reduce the false-positive probability of retrieval results.
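
The abstract does not state how each aggregation level combines fragments, nor does it define NM. The sketch below is therefore an illustration under explicit assumptions, not the patented procedure: it assumes that level M joins every run of M + 1 consecutive initial fragments, and that NM is the number of fragments produced. The function name `aggregate_fragments` is hypothetical, and the initial fragments are supplied by hand rather than produced by a real word segmenter.

```python
def aggregate_fragments(initial: list[str], levels: int) -> list[str]:
    """Join runs of (levels + 1) consecutive initial fragments.

    Assumption (not from the source): level M aggregation is modelled as a
    sliding window of M + 1 adjacent fragments over the initial segmentation.
    """
    if not initial:
        raise ValueError("at least one initial fragment is required (N0 >= 1)")
    if levels < 1:
        raise ValueError("levels must be at least 1 (M >= 1)")
    window = levels + 1
    if len(initial) <= window:
        # Too few fragments for a full window: return one combined fragment,
        # so the result always holds at least one fragment.
        return ["".join(initial)]
    return ["".join(initial[i:i + window]) for i in range(len(initial) - window + 1)]


if __name__ == "__main__":
    # Hand-written word segmentation of a short phrase, for illustration only.
    words = ["网络", "内容", "取证", "方法"]
    print(aggregate_fragments(words, 1))
    print(aggregate_fragments(words, 2))
```

Under this assumption, longer aggregated fragments carry more context than single words, so an exact match on one of them is less likely to come from an unrelated text.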

Description

Technical field

[0001] The invention relates to the field of information processing, in particular to a Chinese text processing method and device.

Background

[0002] Chinese text segmentation technology is widely used in information processing. For example, before an Internet search engine can search for text keywords, it must first segment the text content on the Internet appropriately in order to search more accurately. Chinese text segmentation is also needed in other areas of information processing, such as machine translation, speech synthesis, automatic classification, automatic summarization, and automatic proofreading.

[0003] Existing Chinese text segmentation methods mainly include the fixed-length segmentation method, the Chinese word segmentation method, and so on. The fixed-length segmentation method segments the text into pieces of a fixed length (such as 4 characters) according to the preset field length...
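
As an illustration of the fixed-length segmentation method mentioned in [0003], the following minimal sketch cuts a string into consecutive pieces of a preset length. The excerpt is truncated before it says whether pieces overlap or how a short final piece is handled, so the sketch assumes non-overlapping pieces and keeps a shorter last piece; the function name `fixed_length_segments` is hypothetical.

```python
def fixed_length_segments(text: str, length: int = 4) -> list[str]:
    """Split text into consecutive, non-overlapping pieces of `length` characters.

    Assumption (not from the source): the final piece is kept even when it is
    shorter than `length`.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [text[i:i + length] for i in range(0, len(text), length)]


if __name__ == "__main__":
    # Python strings are sequences of Unicode code points, so each Chinese
    # character counts as one position.
    print(fixed_length_segments("中文文本切分技术经常被使用"))
```

Because the cut points ignore word boundaries, a word can be split across two pieces, which is the usual motivation for word-based segmentation instead.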

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/21, G06F17/30
Inventors: 邹涛, 许博义, 黄敏桓, 刘丽, 赵刚
Owner: 北京系统工程研究所