Text processing method and device

A text processing technology, applied in the computer field, that addresses problems such as poor text detection results and has the effect of improving computer performance and speed.

Active Publication Date: 2013-02-13
ALIBABA CLOUD COMPUTING LTD

AI Technical Summary

Problems solved by technology

[0008] The main purpose of this application is to provide a method and device for processing text, in order to solve the problem of poor text detection results in the prior art.


Embodiment Construction

[0029] It should be noted that, in the case of no conflict, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and embodiments.

[0030] Figure 1 is a flowchart of the main steps of the method for processing text according to an embodiment of the present application. As shown in Figure 1, the method mainly includes the following steps:

[0031] Step S11: Get the text to be processed.

[0032] Step S13: Calculate the similarity between the text segment to be processed and multiple text segments in the pre-stored text segment set to obtain multiple similarity values.

[0033] In this step, any existing or future string similarity comparison algorithm can be used for the calculation. Examples of such algorithms include the Levenshtein distance algorithm, the LCS algorithm, the vector product algorithm...
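As an illustration of Step S13, the following is a minimal Python sketch of the Levenshtein distance, one of the algorithms named above. The function names `levenshtein` and `similarity_values` are hypothetical and do not come from the application; the sketch also assumes that a smaller value means the two text segments are more alike.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b.

    Counts the minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b.
    """
    if len(a) < len(b):
        a, b = b, a  # keep the stored row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity_values(segment: str, stored_segments: list[str]) -> list[int]:
    """Hypothetical helper: one distance per pre-stored text segment."""
    return [levenshtein(segment, stored) for stored in stored_segments]
```

Comparing the segment to be processed against every pre-stored segment in this way yields the multiple similarity values that Step S13 refers to.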

Abstract

The invention provides a text processing method and device for solving the problem of poor text detection results in the prior art. The method comprises the following steps: looking up, in an inverted index, the keywords of a text segment to be processed; counting, for each text segment in a pre-stored text segment set, the number of times that text segment or its identifier occurs in the terms containing those keywords; and selecting a plurality of text segments from the pre-stored set in descending order of occurrence count. The inverted index is built over the pre-stored text segment set and comprises a plurality of terms, each of which contains a keyword and stores the text segments, or the identifiers of the text segments, that contain that keyword. The method then calculates the similarity between the text segment to be processed and each of the selected text segments to obtain a plurality of similarity values, judges whether the minimum of the similarity values is within a set range, and, if so, outputs information comprising preset content.
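The flow described in the abstract can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the implementation from the application: all function names are hypothetical, keywords are obtained by whitespace splitting, segment identifiers are list indices, and the similarity value is taken to be a normalized distance computed with the standard library's `difflib.SequenceMatcher` rather than one of the algorithms the description lists.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def build_inverted_index(segments):
    """Map each keyword to the identifiers of the segments containing it."""
    index = defaultdict(set)
    for seg_id, text in enumerate(segments):
        for keyword in set(text.split()):  # assumption: whitespace tokenization
            index[keyword].add(seg_id)
    return index


def select_candidates(query, index, top_n):
    """Count how often each segment id occurs in the terms matching the
    query's keywords, then return the top_n ids by descending count."""
    counts = Counter()
    for keyword in set(query.split()):
        for seg_id in index.get(keyword, ()):
            counts[seg_id] += 1
    return [seg_id for seg_id, _ in counts.most_common(top_n)]


def distance(a, b):
    """Assumed similarity value: 0.0 for identical strings, up to 1.0."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def is_flagged(query, segments, index, top_n=3, max_distance=0.2):
    """Return True if the smallest distance to a candidate is within range."""
    candidates = select_candidates(query, index, top_n)
    if not candidates:
        return False
    values = [distance(query, segments[seg_id]) for seg_id in candidates]
    return min(values) <= max_distance


# Usage with a small, made-up blacklist of addresses.
blacklist = [
    "12 main street springfield",
    "99 ocean avenue miami",
    "12 main st springfield",
]
blacklist_index = build_inverted_index(blacklist)
print(is_flagged("12 main street springfield il", blacklist, blacklist_index))
```

Restricting the similarity calculation to the segments that share the most keywords with the query avoids comparing the query against the whole pre-stored set, which appears to be the source of the speed-up the summary mentions.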

Description

Technical field

[0001] This application relates to computer technology, and in particular to a method and device for processing text.

Background art

[0002] On the Internet, text processing is often required to prevent the dissemination of useless or harmful information. For example, in anti-spam settings, a mail receiving device such as mail client software exactly matches the sender address of an incoming message against pre-stored blacklist addresses and rejects the message if all characters in the two are identical. In this case the processed text is the email address. As another example, in an e-commerce system some users engage in fraudulent behavior. To limit such behavior, the addresses (usually mailing addresses) left by these users need to be checked. Currently, an address blacklist is also used, and each address is exactly matched against it; if all characters in the address are the same as all character...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 许泰清徐磊石胡四海
Owner: ALIBABA CLOUD COMPUTING LTD