Text similarity measurement method and device based on multi-model fusion

A text similarity measurement method and device based on multi-model fusion, applied in the field of text similarity measurement. It addresses the problems that existing approaches fail to consider document semantics, cannot infer the true meaning of documents, and therefore produce inaccurate similarity values. Its stated effects are continuously improved learning ability, improved recall rate and accuracy rate, and avoidance of manual feature extraction.
CN112784587A · Active · Publication Date: 2021-05-11 · QUANZHOU POWER SUPPLY COMPANY OF STATE GRID FUJIAN ELECTRIC POWER +2

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications (China)
Current Assignee / Owner: QUANZHOU POWER SUPPLY COMPANY OF STATE GRID FUJIAN ELECTRIC POWER
Publication Date: 2021-05-11

Abstract

The invention provides a text similarity measurement method based on multi-model fusion. The method comprises the following steps: preparing a training set and a test set; selecting four deep learning training models: Bert, Paddle, Xlnet and Tree-LSTM; for each training model, obtaining C sub-models; for each sub-model, calculating a similarity score and a loss function for the input data; evaluating the sub-models; for each sub-model, selecting and fixing the hyperparameter combination with the best evaluation value; continuing to train each sub-model until its loss function converges, and saving the resulting 4C sub-models; fusing the 4C sub-models with a Boosting scheme, so that the similarity scores of the sub-models are combined by weighted addition to obtain a similarity measurement model; and testing and adjusting the similarity measurement model using the test set data. According to the method, the accuracy of similarity measurement is effectively improved, the recall rate and accuracy of similarity judgment are improved, and the generalization ability of the model is improved.
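
The fusion step described above (weighted addition of sub-model similarity scores) can be illustrated with a minimal Python sketch. The abstract does not state how the Boosting weights are computed, so the AdaBoost-style formula used here, alpha = 0.5 * ln((1 - err) / err), is an assumption, as are the function names `boosting_weights` and `fuse_scores` and the idea of deriving weights from per-sub-model validation error rates.

```python
import math
from typing import List, Sequence


def boosting_weights(error_rates: Sequence[float], eps: float = 1e-9) -> List[float]:
    """Turn per-sub-model error rates into normalized fusion weights.

    Assumption: AdaBoost-style weighting, alpha = 0.5 * ln((1 - err) / err).
    The patent abstract only says "Boosting scheme" and "weighted addition".
    """
    alphas = []
    for err in error_rates:
        # Clamp to avoid division by zero or log of zero.
        err = min(max(err, eps), 1.0 - eps)
        # Sub-models no better than chance (err >= 0.5) get zero weight.
        alphas.append(max(0.5 * math.log((1.0 - err) / err), 0.0))
    total = sum(alphas)
    if total == 0.0:
        # Fall back to uniform weights if no sub-model beats chance.
        return [1.0 / len(alphas)] * len(alphas)
    return [a / total for a in alphas]


def fuse_scores(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted addition of the sub-models' similarity scores for one text pair."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


if __name__ == "__main__":
    # Hypothetical validation error rates for four sub-models.
    weights = boosting_weights([0.10, 0.20, 0.30, 0.40])
    # Hypothetical similarity scores the same sub-models give one text pair.
    fused = fuse_scores([0.9, 0.8, 0.7, 0.6], weights)
    print(weights, fused)
```

In this sketch a sub-model with a lower error rate receives a larger weight, and because the weights sum to 1 the fused score stays in the same range as the individual similarity scores.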
Description

Technical field

[0001] The invention relates to a method and device for measuring text similarity based on multi-model fusion.

Background art

[0002] Text similarity measurement is the measurement of how similar two texts are, and it has a wide range of applications in many fields. For example, in information retrieval, similarity can be used to identify similar words and improve the recall rate. In automatic question answering, similarity can be used to calculate the degree of match between a user's natural-language question and the questions in the corpus, and the answer corresponding to the best-matching question is returned as the response. In machine translation, bilingual translation is performed by analyzing the similarity of sentences, so whether similarity can be accurately defined and calculated affects the final translation quality. At this time, the us...
