
Methods for Mass Text Matching

A technology applied in the field of text matching. It addresses the large storage space, long computation time, and difficulty of parallel computing found in existing methods, with the effect of saving storage space, improving efficiency, and speeding up the calculation process.

Active Publication Date: 2018-09-04
CHINA UNIONPAY
1 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, the existing methods for document matching have the following problems: (1) since external documents are usually sparse after vectorization, storing them as full vectors takes up a large amount of storage space; (2) because the dot product must be calculated one by one between every pair of vectors, the number of dot products is very large, so the calculation takes a long time; (3) it is difficult to perform the calculations in parallel.
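Problem (1) can be illustrated with a minimal sketch. The vocabulary size, term indices, and weights below are made-up values for illustration and do not come from the patent; the dictionary layout is just one common way to hold a sparse vector.

```python
# Hypothetical illustration: the vocabulary size and the document are invented.
VOCAB_SIZE = 10_000


def to_dense(sparse_vec, size):
    """Expand a {term_index: weight} mapping into a full-length list."""
    dense = [0.0] * size
    for index, weight in sparse_vec.items():
        dense[index] = weight
    return dense


# A document that uses only three distinct terms out of the whole vocabulary.
sparse_doc = {12: 1.0, 857: 2.0, 9_001: 1.0}
dense_doc = to_dense(sparse_doc, VOCAB_SIZE)

stored_dense = len(dense_doc)    # one slot per vocabulary term
stored_sparse = len(sparse_doc)  # one entry per nonzero term
print(stored_dense, stored_sparse)
```

Under these assumptions the dense form stores one value per vocabulary term, while the sparse form stores only the nonzero entries.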



Examples


Detailed Description of the Embodiments

[0028] Figure 1 is a flowchart of a method for massive text matching according to an embodiment of the present invention. As shown in Figure 1, the method for massive text matching disclosed in the present invention includes the following steps:
(A1) group the database documents and the external documents separately, and determine the number of groups s of the database documents and the number of groups t of the external documents;
(A2) calculate the total number n of database documents and the total number m of external documents, use k real matrices M to represent the vector space of the database documents, and use k2 sparse matrices P to represent the vector space of the external documents;
(A3) determine whether there is a sparse matrix P for which the corresponding calculation has not yet been performed (that is, determine whether there is still an exte...
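Step (A1) can be sketched as follows. The excerpt does not say how the groups are formed, so contiguous, near-equal-sized groups are an assumption here, and `split_into_groups` and the sample document names are hypothetical.

```python
def split_into_groups(documents, group_count):
    """Split documents into group_count contiguous groups of near-equal size.

    Assumption: the source only says the documents are grouped; the
    contiguous, balanced split used here is one simple choice.
    """
    if group_count <= 0:
        raise ValueError("group_count must be positive")
    base, extra = divmod(len(documents), group_count)
    groups, start = [], 0
    for g in range(group_count):
        size = base + (1 if g < extra else 0)  # first `extra` groups get one more
        groups.append(documents[start:start + size])
        start += size
    return groups


database_docs = [f"db{i}" for i in range(7)]   # n = 7 (made-up data)
external_docs = [f"ext{i}" for i in range(5)]  # m = 5 (made-up data)

db_groups = split_into_groups(database_docs, 3)   # s = 3
ext_groups = split_into_groups(external_docs, 2)  # t = 2
print([len(g) for g in db_groups], [len(g) for g in ext_groups])
```

Each group can then be vectorized into its own matrix, which keeps any single matrix small enough to hold in memory.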


Abstract

The invention provides a method for massive text matching. The method includes: grouping the database documents and the external documents separately, and determining the number of groups s of the database documents; calculating the total number n of database documents and the total number m of external documents, using k real matrices M to represent the vector space of the database documents, and using k2 sparse matrices P to represent the vector space of the external documents; taking a sparse matrix P on which the corresponding calculation has not yet been performed as the current target sparse matrix P, performing the corresponding calculation on the current target sparse matrix P to obtain a similarity matrix S, and determining, on the basis of the similarity matrix S, the database documents that best match the external documents represented by the current target sparse matrix P. The method saves storage space, takes little time, and can be processed in parallel.
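The flow in the abstract can be sketched as a block-by-block loop. This is an illustration under stated assumptions rather than the patented procedure: the abstract does not spell out the "corresponding calculation", so similarity is taken to be a plain dot product, each sparse matrix P is held as a list of `{term_index: weight}` rows, and `similarity_block`, `best_matches`, and the sample data are hypothetical.

```python
def similarity_block(sparse_block, dense_block):
    """Return S where S[i][j] = dot(sparse row i, dense row j)."""
    return [
        [sum(w * dense_row[idx] for idx, w in sparse_row.items())
         for dense_row in dense_block]
        for sparse_row in sparse_block
    ]


def best_matches(sparse_blocks, dense_blocks):
    """For every external document, return (database index, score) of its best match."""
    results = []
    for sparse_block in sparse_blocks:  # one "current target" P at a time
        best = [(None, float("-inf"))] * len(sparse_block)
        offset = 0                      # global index of the block's first row
        for dense_block in dense_blocks:
            S = similarity_block(sparse_block, dense_block)
            for i, row in enumerate(S):
                for j, score in enumerate(row):
                    if score > best[i][1]:
                        best[i] = (offset + j, score)
            offset += len(dense_block)
        results.extend(best)
    return results


# Made-up data: 3 database documents in 2 blocks, 3 external documents in 2 blocks.
dense_blocks = [
    [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],  # database documents 0 and 1
    [[0.0, 0.0, 1.0, 1.0]],                        # database document 2
]
sparse_blocks = [
    [{0: 2.0}, {2: 1.0, 3: 1.0}],  # external documents 0 and 1
    [{1: 3.0}],                    # external document 2
]
matches = best_matches(sparse_blocks, dense_blocks)
print(matches)
```

Because each sparse block is processed without reference to the others, the outer loop is the natural place to distribute work across processes, which is consistent with the parallel-processing advantage the abstract claims.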

Description

Technical Field

[0001] The present invention relates to a method for text matching, and more particularly, to a method for massive text matching.

Background

[0002] At present, with the increasing demand for information and data processing and the growing variety of businesses in different fields, it is becoming more and more important to match external documents against massive numbers of database documents.

[0003] Existing methods for document matching usually proceed as follows: establish a vector space model of the external documents and the database documents; calculate, one by one, the similarity between the vector corresponding to each external document and the vector corresponding to each database document; take the database document with the greatest similarity to an external document as the database document that best matches that external document, and repeat this cycle until the database document that best matches each external do...
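The one-by-one approach described in [0003] can be sketched as a double loop. The dot product is assumed as the similarity measure (the summary of [0004] mentions dot products), and the function names and vectors are made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def match_one_by_one(external_vectors, database_vectors):
    """Return (best database index per external document, dot products computed)."""
    matches, dot_count = [], 0
    for ext in external_vectors:
        scores = []
        for db in database_vectors:
            scores.append(dot(ext, db))
            dot_count += 1
        # Index of the highest score; ties resolve to the earliest document.
        matches.append(max(range(len(scores)), key=scores.__getitem__))
    return matches, dot_count


external_vectors = [[1, 0, 1], [0, 2, 0]]  # m = 2 (made-up data)
database_vectors = [[0, 1, 0], [1, 0, 1]]  # n = 2 (made-up data)
baseline_matches, baseline_dots = match_one_by_one(external_vectors, database_vectors)
print(baseline_matches, baseline_dots)
```

The count of dot products grows as m × n, which is the time cost this baseline incurs when both collections are large.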


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/24532, G06F16/3347
Inventor: 刘军, 冯兴
Owner: CHINA UNIONPAY