SimBlock algorithm for high-quality text similarity calculation and method of implementing it
A high-quality text similarity technology, applied in computing, unstructured text data retrieval, text database clustering/classification, etc. It addresses problems such as the inability to mark similar substrings, and improves overall stability and scheduling performance.
Detailed Description of the Embodiments
[0055] The present invention is further illustrated below with reference to a specific embodiment. It should be understood that this embodiment serves only to illustrate the present invention and is not intended to limit its scope. It should further be understood that, after reading the teachings of the present invention, those skilled in the art may make various changes or modifications to it, and such equivalent forms also fall within the scope defined by the appended claims of the present application.
[0056] This embodiment discloses the SimBlock algorithm (similar block matrix algorithm), which realizes high-quality text similarity calculation. Building on string vectorization and cosine similarity, it supplements the local order information of the strings. It specifically includes the following steps:
[0057] Convert string 1 and string 2, the two strings to be compared, into an ordered stack of each charac...
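The baseline that paragraph [0056] builds on, string vectorization followed by cosine similarity, can be sketched as below. This is only an illustration of that baseline and not the SimBlock algorithm itself, whose steps are truncated in this text. The character-count vectorization and the function names `char_vector` and `cosine_similarity` are assumptions made for the example; the source does not specify how strings are vectorized.

```python
import math
from collections import Counter


def char_vector(text: str) -> Counter:
    """Hypothetical vectorization: map each character to its count."""
    return Counter(text)


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the character-count vectors of a and b."""
    va, vb = char_vector(a), char_vector(b)
    # Dot product over the characters that occur in both strings.
    dot = sum(va[ch] * vb[ch] for ch in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string has no direction; treat as dissimilar
    return dot / (norm_a * norm_b)
```

Under this assumed vectorization, `cosine_similarity("abc", "cba")` evaluates to 1.0 up to floating-point rounding, even though the characters appear in reverse order. A count-based vector discards character order entirely, which is consistent with the motivation stated in [0056] for supplementing local order information.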