Method and device for matching texts

A text matching method and device, applied in the field of data processing, addresses the problems of large data processing volume, degraded system performance, and slow processing speed. Its effects are improved system performance, a simple matching process, and broad applicability.
CN102411583A · Inactive · Publication Date: 2012-04-11 · ALIBABA CLOUD COMPUTING LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: ALIBABA CLOUD COMPUTING LTD
Publication Date: 2012-04-11
Estimated Expiration: Not applicable · inactive patent

Figures 1-3 (images not included)

Abstract

The invention discloses a method and a device for matching texts. The method comprises the following steps: acquiring new texts for the current period from the content information collected in that period and storing them in a database; performing word segmentation on the new texts and extracting keywords; calculating the weight of each extracted keyword in each text in the database according to a prestored word frequency list; periodically updating the word frequency list according to the occurrence frequency of each word in each text in the database; calculating, from the keyword weights, the similarity between each new text and each text in the database, or the similarity between any two texts in the database; and determining the related texts of each text stored in the database according to the calculated similarity. By establishing and updating the word frequency list, the method avoids the prior-art need to recalculate all texts on every matching run, which reduces the matching workload and improves system performance.
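The steps in the abstract can be illustrated with a minimal Python sketch. The excerpt does not give the weighting or similarity formulas, so everything below is an assumption made for illustration: the whitespace-style `segment` tokenizer stands in for real word segmentation, the weight is a smoothed TF-IDF-style score computed from the frequency list, similarity is cosine similarity, and the function names and the `threshold` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def segment(text):
    """Stand-in word segmentation (assumption): lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def update_frequency_list(database):
    """Rebuild the word frequency list: number of stored texts containing each word.

    In the described method this would run periodically, not on every match.
    """
    freq = Counter()
    for text in database:
        freq.update(set(segment(text)))
    return freq


def keyword_weights(text, freq, n_texts):
    """Weight each keyword using the prestored frequency list (assumed TF-IDF form)."""
    counts = Counter(segment(text))
    return {
        word: count * (math.log((1 + n_texts) / (1 + freq.get(word, 0))) + 1.0)
        for word, count in counts.items()
    }


def cosine_similarity(a, b):
    """Cosine similarity between two sparse keyword-weight dictionaries."""
    dot = sum(weight * b[word] for word, weight in a.items() if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def related_texts(new_text, database, freq, threshold=0.2):
    """Return stored texts whose similarity to new_text reaches the threshold,
    most similar first. The threshold value is an arbitrary illustration."""
    n_texts = len(database)
    new_vec = keyword_weights(new_text, freq, n_texts)
    scored = [
        (cosine_similarity(new_vec, keyword_weights(text, freq, n_texts)), text)
        for text in database
    ]
    return [text for score, text in sorted(scored, reverse=True) if score >= threshold]


database = [
    "cheap red running shoes",
    "red running shoes on sale",
    "stock market report today",
]
freq = update_frequency_list(database)
print(related_texts("running shoes sale", database, freq))
```

The point of the sketch is that `freq` is computed once and reused: weighting a new text only requires a lookup in the prestored list, not a fresh pass over every stored text.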

Description

Technical field

[0001] This application relates to the field of data processing, and in particular to a method and device for matching texts across a large volume of data.

Background

[0002] Existing text comparison generally uses full calculation and matching. Whenever the degree of correlation between texts is needed, all acquired texts must be processed to obtain the similarity of every pair of texts. Each similarity calculation therefore runs over the entire text data set, so the amount of computation is very large: the running time is on the order of O(N^2), and the calculation takes longer and longer as the number of texts N increases.
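To make the O(N^2) growth concrete, the short Python sketch below counts pairwise comparisons. It is an illustration only: the function names are hypothetical, and the incremental count reflects one reading of the abstract (each new text is compared against the stored texts and the other new texts), not a formula stated in the source.

```python
def full_comparisons(n_total):
    """Pairs compared when every text is matched against every other text."""
    return n_total * (n_total - 1) // 2


def incremental_comparisons(n_stored, n_new):
    """Pairs compared when only the new texts are matched (assumed reading):
    each new text against every stored text, plus new texts against each other."""
    return n_new * n_stored + n_new * (n_new - 1) // 2


# Example: 10,000 stored texts and 100 new texts in the current period.
print(full_comparisons(10_000 + 100))
print(incremental_comparisons(10_000, 100))
```

Under these assumptions the incremental count equals the full count for all texts minus the full count for the texts already stored, which is why it grows with the number of new texts rather than with the square of the whole collection.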

[0003] Comparing such a large volume of data severely affects the system performance of the equipment. It puts great pressure on the system's I/O communication, data storage, and network data transmission, resulting in sl...

Claims
