A question similarity calculation method combining a synonym dictionary and a word embedding vector

A similarity calculation technology, applied in computing, semantic analysis, and special-purpose data processing, that addresses problems such as inaccurate word similarity and missing rare words.

Active Publication Date: 2019-04-09
INSPUR FINANCIAL INFORMATION TECH CO LTD
2 Cites, 5 Cited by

AI Technical Summary

Problems solved by technology

The disadvantage of the dictionary method: the synonym dictionary "Synonyms Ci Lin" (Tongyici Cilin) is compiled manually, and most of its entries are everyday words. Professional or uncommon words from the banking field are often missing.
The disadvantage of the word vector method: because the word vectors are generated automatically by an algorithm, the word similarity it estimates is less accurate than that of the manually compiled dictionary.
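
To make the word vector method concrete, here is a minimal sketch of estimating word similarity from embeddings. It assumes cosine similarity as the measure, which is a common choice but is not stated in the visible text; the three-dimensional vectors and the English words are invented for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors (an assumption)
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, invented for this example; real word vectors
# would be learned from a corpus and have far more dimensions.
embeddings = {
    "loan": [0.9, 0.1, 0.3],
    "mortgage": [0.8, 0.2, 0.4],
    "weather": [0.1, 0.9, 0.0],
}

sim_close = cosine_similarity(embeddings["loan"], embeddings["mortgage"])
sim_far = cosine_similarity(embeddings["loan"], embeddings["weather"])
```

Any word that has a vector gets a score this way, which is why the approach covers rare terms; the score is only as good as the learned vectors, which matches the accuracy concern raised above.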



Examples


Embodiment Construction

[0044] The present invention is described below with reference to the accompanying drawings.

[0045] As shown in Figures 1 and 2, the present invention provides a method for calculating question similarity that combines a synonym dictionary with word embedding vectors. It comprises a sentence-level similarity fusion method and a word-level similarity fusion method.

[0046] (1) Sentence-level similarity fusion method:

[0047] Let the two questions whose similarity is to be calculated be $S_1$ and $S_2$. Performing word segmentation on them gives $S_1 = \{w^1_1, w^1_2, \ldots, w^1_m\}$ and $S_2 = \{w^2_1, w^2_2, \ldots, w^2_n\}$, where $m$ and $n$ are the numbers of words contained in $S_1$ and $S_2$, and $w^p_q$ denotes the $q$th word in the $p$th question.

[0048] The first step is to calculate the dictionary similarity $Sim_{dict}(S_1, S_2)$ between the questions. For any pair of words, one from $S_1$ and one from $S_2$, query the synonym dictionary and calculate the similarity of the pair. The dictionary simila...
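
The first step can be sketched as follows. Because the source text is cut off before the aggregation formula, the way pairwise scores are combined here (best match per word, averaged over both questions) is an assumption and not the patent's formula. The binary `word_dict_similarity`, the `synonym_groups` table, and the English tokens are likewise hypothetical stand-ins; a real synonym dictionary would give graded scores.

```python
def word_dict_similarity(w1, w2, synonym_groups):
    """Toy dictionary similarity: 1.0 for identical words or words that
    share a synonym group, otherwise 0.0 (a simplifying assumption)."""
    if w1 == w2:
        return 1.0
    g1 = synonym_groups.get(w1)
    g2 = synonym_groups.get(w2)
    if g1 is not None and g1 == g2:
        return 1.0
    return 0.0


def sentence_dict_similarity(s1, s2, synonym_groups):
    """Assumed aggregation: score every word pair, keep each word's best
    match in the other question, and average over all m + n words."""
    if not s1 or not s2:
        return 0.0
    best_1 = [max(word_dict_similarity(a, b, synonym_groups) for b in s2) for a in s1]
    best_2 = [max(word_dict_similarity(a, b, synonym_groups) for a in s1) for b in s2]
    return (sum(best_1) + sum(best_2)) / (len(s1) + len(s2))


# Hypothetical synonym groups and already-segmented questions.
synonym_groups = {"open": "g1", "create": "g1", "account": "g2"}
s1 = ["how", "open", "account"]
s2 = ["how", "create", "account", "online"]

score = sentence_dict_similarity(s1, s2, synonym_groups)
```

In this toy case every word except "online" finds a full match, so six of the seven best-match scores are 1.0.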


Abstract

The invention discloses a question similarity calculation method combining a synonym dictionary and word embedding vectors. The method comprises a sentence-level similarity fusion method and a word-level similarity fusion method, which are combined for calculation. The method has the following advantages. Compared with using word vectors alone, it makes full use of the manually compiled synonym dictionary, which safeguards the accuracy of word similarity calculation. For popular words and professional vocabulary missing from the dictionary, it calculates similarity with word vectors, which avoids the failure that occurs when the dictionary method is used alone and a word is missing. Because it fuses the two similarity calculations, synonym dictionary and word vector, it takes more factors into account and produces a more accurate result.
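
The fallback behavior described in the abstract can be sketched at the word level as follows. The exact fusion rule is not given in the visible text, so the rule here (use the dictionary score when both words are covered, otherwise the word vector score) is an assumption. All names, tables, and scores below are hypothetical.

```python
from typing import Callable, Optional

SimFn = Callable[[str, str], Optional[float]]


def fused_word_similarity(w1: str, w2: str, dict_sim: SimFn, vec_sim: SimFn) -> float:
    """Prefer the dictionary score; fall back to the word vector score when
    the dictionary does not cover both words (assumed fusion rule)."""
    score = dict_sim(w1, w2)
    if score is not None:
        return score
    score = vec_sim(w1, w2)
    return score if score is not None else 0.0


# Toy lookup tables standing in for a real dictionary and real embeddings.
DICT_VOCAB = {"deposit", "savings", "money"}
DICT_SCORES = {frozenset(["deposit", "savings"]): 0.9}
VEC_SCORES = {frozenset(["deposit", "escrow"]): 0.6}


def toy_dict_sim(w1: str, w2: str) -> Optional[float]:
    """Return None when either word is missing from the dictionary."""
    if w1 not in DICT_VOCAB or w2 not in DICT_VOCAB:
        return None
    if w1 == w2:
        return 1.0
    return DICT_SCORES.get(frozenset([w1, w2]), 0.0)


def toy_vec_sim(w1: str, w2: str) -> Optional[float]:
    """Return a precomputed vector similarity, or None if unavailable."""
    return VEC_SCORES.get(frozenset([w1, w2]))


covered = fused_word_similarity("deposit", "savings", toy_dict_sim, toy_vec_sim)
fallback = fused_word_similarity("deposit", "escrow", toy_dict_sim, toy_vec_sim)
```

Passing the two similarity functions as parameters keeps the fusion rule separate from how each score is computed, so either source can be swapped without touching the fallback logic.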

Description

Technical field

[0001] The invention relates to automatic question answering for service robots in the financial field, and in particular to a question similarity calculation method combining a synonym dictionary and word embedding vectors.

Background art

[0002] As artificial intelligence technology is applied more deeply in financial self-service, more and more banks use robots based on voice interaction technology to assist staff with consultation and business handling. Voice interaction technology recognizes the user's speech, converts it into the corresponding text, analyzes the semantics of that text, and retrieves the answer closest to the user's question by searching the bank's internal question bank. Finally, the answer is converted into a speech signal through speech synthesis technology (TTS), sent to the robot, and played through the speaker.

[0003] Among them, the understanding of user question...


Application Information

IPC(8): G06F17/27
CPC: G06F40/247, G06F40/30, Y02D10/00
Inventor: 张家重, 赵亚欧, 王玉奎, 付宪瑞, 张金清
Owner: INSPUR FINANCIAL INFORMATION TECH CO LTD