A text deduplication method and device
A text deduplication technology, applied in the field of text processing, that can solve problems such as a large amount of calculation, a high false positive rate, and complex implementation.
Embodiment Construction
[0025] In order to make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the solutions of the present invention in detail with reference to the accompanying drawings and embodiments.
[0026] In the embodiment of the present invention, text deduplication is accomplished through the following three steps:
[0027] Step 1. Build a case library:
[0028] To deduplicate text, multiple texts are first designated as case texts, and each case text is processed to build the case library.
[0029] The processing of each case text includes the following steps:
[0030] A1. Extract the feature words of the case text to obtain a feature word string.
[0031] Any existing word segmentation method can be used to extract the feature words of the text.
[0032] For example, for the case text: What happened to your car:
[0033] Extract feature words to get the following feature word string: What happened to your ca...
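Step A1 can be sketched in Python as follows. This is a minimal illustration, not the method of the invention: the function names `extract_feature_words` and `feature_word_string` are hypothetical, and a simple regular-expression tokenizer with lowercasing stands in for the "existing word segmentation method", which the text does not specify. The space separator used to join the words is likewise an assumption.

```python
import re
from typing import List


def extract_feature_words(text: str) -> List[str]:
    """Split a text into candidate feature words.

    Assumption: a regex tokenizer with lowercasing stands in for the
    word segmentation method, which the description leaves open.
    """
    return re.findall(r"\w+", text.lower())


def feature_word_string(text: str, sep: str = " ") -> str:
    """Join the extracted feature words into one feature word string.

    Assumption: the separator is a single space.
    """
    return sep.join(extract_feature_words(text))


if __name__ == "__main__":
    # Apply step A1 to every designated case text.
    case_texts = ["What happened to your car:"]
    for case_text in case_texts:
        print(feature_word_string(case_text))
```

For languages written without spaces between words, a dedicated segmenter would replace the regular expression, while the rest of the sketch would stay the same.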


