Method and device for calculating theme similarity

A technique in the field of topic similarity computation that addresses problems such as vocabulary mismatch.

Pending Publication Date: 2021-04-30
CHINA ELECTRIC POWER RES INST +2

AI Technical Summary

Problems solved by technology

[0004] Embodiments of the present disclosure provide a method and device for calculating topic similarity, so as to at least solve the following technical problem: most topic similarity calculation methods in the prior art are based on vector space models in which each dimension of the vector is represented by a word (term), and such methods encounter serious vocabulary mismatch when applied to similarity calculation for short text segments.

Examples

Embodiment 1

[0019] According to this embodiment, a method for calculating topic similarity is provided. It should be noted that the steps shown in the flowcharts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.

[0020] The method embodiments provided in this embodiment can be executed in a server or a similar computing device. FIG. 1 shows a block diagram of the hardware structure of a computing device for implementing the method for calculating topic similarity. As shown in FIG. 1, the computing device may include one or more processors (which may include, but are not limited to, processing devices such as a microprocessor (MCU) or a programmable logic device (FPGA)), memory for storing data, and memory for communicat...

Embodiment 2

[0087] FIG. 3 shows a device 300 for calculating topic similarity based on a domain word dictionary according to this embodiment; the device 300 corresponds to the method according to the first aspect of Embodiment 1. As shown in FIG. 3, the device 300 includes: a domain word obtaining module 310, configured to obtain the text content of a question and the text content of an answer, perform word segmentation on each, and obtain the domain words of the question and the domain words of the answer, where the question corresponds to the answer and a domain word is a semantic unit in the vocabulary domain; and a topic weight determining module 320, configured to use a pre-established domain word dictionary to determine the topic weights of the domain words of the question in each category of the question domain word dictionary and the topic weights of the domain words of the answer in each...
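The domain word extraction attributed to module 310 can be illustrated with a minimal sketch. The text above does not say which segmentation algorithm is used; forward maximum matching against a domain word list is assumed here purely for illustration, and the function name `extract_domain_words` and the sample vocabulary are hypothetical.

```python
def extract_domain_words(text, domain_words, max_len=None):
    """Greedy forward maximum matching (an assumed technique, not the
    patent's stated one): at each position take the longest dictionary
    entry that matches, otherwise advance by one character.
    Matching is character-based, so it does not respect word boundaries."""
    if max_len is None:
        max_len = max((len(w) for w in domain_words), default=0)
    found = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in domain_words:
                match = candidate
                break
        if match:
            found.append(match)
            i += len(match)
        else:
            i += 1
    return found


# Hypothetical domain vocabulary for a question text.
DOMAIN_WORDS = {"智能电表", "电表", "停电"}
question_words = extract_domain_words("智能电表停电了", DOMAIN_WORDS)
```

Because the longest entry wins at each position, "智能电表" is returned as one domain word rather than being split into the shorter entry "电表".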

Embodiment 3

[0100] FIG. 4 shows a device 400 for calculating topic similarity based on a domain word dictionary according to this embodiment; the device 400 corresponds to the method according to the first aspect of Embodiment 1. As shown in FIG. 4, the device 400 includes: a processor 410; and a memory 420, connected to the processor 410 and configured to provide the processor 410 with instructions for the following processing steps: obtaining the text content of a question and the text content of an answer, performing word segmentation on each, and obtaining the domain words of the question and the domain words of the answer, where the question corresponds to the answer and a domain word is a semantic unit in the vocabulary domain; and using the pre-established domain word dictionary to determine the topic weights of the domain words of the question in each category of the qu...

Abstract

The invention discloses a method and device for calculating topic similarity based on a domain word dictionary. The method comprises: obtaining the text content of a question and the text content of an answer, performing word segmentation on each, and obtaining the domain words of the question and the domain words of the answer, wherein the question corresponds to the answer and the domain words are semantic units in the vocabulary domain; using a pre-established domain word dictionary, which comprises a domain word dictionary of questions and a domain word dictionary of answers, to determine the topic weights of the question's domain words in each category of the question dictionary and the topic weights of the answer's domain words in each category of the answer dictionary; and determining the topic similarity between the question and the answer according to the topic weights of the question and the topic weights of the answer.
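The weighting and comparison steps summarized in the abstract can be sketched as follows. The abstract does not state how topic weights are aggregated or which similarity measure is used, so this sketch assumes that per-category weights are summed and normalized and that cosine similarity compares the two weight vectors. The dictionary layout (word mapped to category weights), the function names, and the sample entries are all hypothetical.

```python
import math
from collections import defaultdict


def topic_weights(domain_words, dictionary):
    """Sum the per-category weights of the given domain words and
    normalize them to sum to 1. `dictionary` maps word -> {category: weight}
    (an assumed layout)."""
    totals = defaultdict(float)
    for word in domain_words:
        for category, weight in dictionary.get(word, {}).items():
            totals[category] += weight
    norm = sum(totals.values())
    return {c: w / norm for c, w in totals.items()} if norm else {}


def topic_similarity(q_weights, a_weights):
    """Cosine similarity between two category-weight vectors
    (an assumed measure; the source does not name one)."""
    dot = sum(w * a_weights.get(c, 0.0) for c, w in q_weights.items())
    nq = math.sqrt(sum(w * w for w in q_weights.values()))
    na = math.sqrt(sum(w * w for w in a_weights.values()))
    return dot / (nq * na) if nq and na else 0.0


# Separate hypothetical dictionaries for questions and answers.
QUESTION_DICT = {"outage": {"faults": 1.0}, "bill": {"billing": 1.0}}
ANSWER_DICT = {"repair": {"faults": 0.8, "service": 0.2},
               "refund": {"billing": 1.0}}

q = topic_weights(["outage"], QUESTION_DICT)
a = topic_weights(["repair"], ANSWER_DICT)
score = topic_similarity(q, a)
```

In this toy data the question and the answer share no words, yet they receive a high score because both map mostly to the same category, which is the effect the dictionary-based approach is meant to achieve.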

Description

Technical field

[0001] This application relates to the field of artificial intelligence, in particular to a method and device for calculating topic similarity.

Background

[0002] At present, many application scenarios of question-answering products involve calculating the similarity of text fragments, for example judging the rank value of a question/answer pair or recommending questions. Topic analysis requires the support of a large domain word dictionary and relies on characteristic words (bigram and trigram phrases) that strongly represent the text content to distinguish categories or topics and to index subject words or keywords. Most existing topic similarity calculation methods are based on the vector space model, in which each dimension of the vector is represented by a word (term). Such methods encounter serious vocabulary mismatch when applied to similarity calculation for short text segments. P...
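The vocabulary mismatch problem described in the background can be seen in a small example. This is a minimal sketch of a term-based vector space comparison, assuming whitespace tokenization and raw term counts; the function name `bow_cosine` and the sample sentences are illustrative and do not come from the patent.

```python
import math
from collections import Counter


def bow_cosine(text_a, text_b):
    """Cosine similarity over bag-of-words term counts,
    with one vector dimension per distinct term."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0


# Two short texts on the same topic that share no terms.
question = "why did my electricity go off"
answer = "a power outage was reported nearby"
mismatch = bow_cosine(question, answer)
```

The two sentences are clearly related, but because no term appears in both, the dot product is zero and the term-based similarity is zero as well. Short texts make this outcome likely, since there are few terms available to overlap.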

Application Information

IPC(8): G06F40/194, G06F40/216, G06F40/242, G06F40/284, G06F40/30
CPC: G06F40/194, G06F40/216, G06F40/242, G06F40/284, G06F40/30
Inventor: 尚怀嬴刘岩郑安刚张琪任民
Owner: CHINA ELECTRIC POWER RES INST