Question and answer matching method and device, computer device and storage medium

By screening for candidate questions that share characters with the user question and adjusting their semantic similarity according to the semantic difference between the user word and the candidate word, the method makes the semantic similarity of the candidate questions more reasonable. This addresses the matching errors that existing technology makes on questions with the same literal content but different semantics, and improves the accuracy of question-answer matching.

CN116361433B · Active · Publication Date: 2026-06-26 · INDUSTRIAL AND COMMERCIAL BANK OF CHINA +1


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: INDUSTRIAL AND COMMERCIAL BANK OF CHINA
Filing Date: 2023-02-02
Publication Date: 2026-06-26


IPC: G06F16/3329; G06F16/334; G06F16/335; G06F40/30

CPC: G06F16/3329; G06F16/3344; G06F16/335; G06F40/30; Y02D10/00

AI Tagging

Technology Topics

Questions and answers; Degree of similarity

Technical Efficacy Phrases

improve rationality


AI Technical Summary

Technical Problem

Existing question-answering systems cannot accurately match user intent when a user question and a candidate question are literally similar but semantically different, resulting in inaccurate answers.

Method used

A user question and its candidate questions are acquired, and the target candidate questions, namely those sharing at least one character with the user question, are selected. The semantic similarity of each target candidate question is then adjusted according to the semantic difference between the user word and the candidate word. Finally, the target question that matches the user question is selected.

Benefits of technology

It improves the accuracy of question-and-answer matching, ensures that the answer matches the intent of the user's question, and solves the problem of incorrect matching between questions with the same literal content but different semantics.

✦ Generated by Eureka AI based on patent content.


Abstract

The application relates to a question and answer matching method and device, computer equipment, a storage medium and a computer program product. The method comprises the following steps: obtaining a user question and a plurality of candidate questions corresponding to the user question; screening target candidate questions from the candidate questions; the target candidate question is a candidate question having at least one same character with the user question; for any same character corresponding to any target candidate question, determining a user word in which the same character is located in the user question, and determining a candidate word in which the same character is located in the target candidate question; in the case that the user word and the candidate word are different, adjusting the semantic similarity according to the difference between the semantic represented by the user word and the semantic represented by the candidate word; and screening a target question matched with the user question from the candidate questions according to the new semantic similarity corresponding to each target candidate question. The method can improve the accuracy of question and answer matching.


Description

Technical Field

[0001] This application relates to the field of artificial intelligence technology, and in particular to a question-and-answer matching method, apparatus, computer device, storage medium, and computer program product.

Background Technology

[0002] With the development of natural language processing and artificial intelligence technologies, people are increasingly able to use machines to process unstructured natural language data to complete complex tasks, such as question-answering systems. Question-answering systems mainly solve the problems of analyzing the true intent of questions, matching the relationship between questions and answers, understanding user questions described in natural language, and returning the correct matching answer.

[0003] The key technology for question-answering systems to answer user questions is question matching. However, the ambiguity of natural language makes question matching difficult, because questions whose literal content overlaps may still differ in meaning. For example, a user's question and a candidate question may share literal content while expressing different meanings. In such cases, the question-answering matching methods used in related technologies may still calculate a high matching degree for the user's question and the candidate question, potentially leading to errors in subsequent semantic determination and failing to provide an answer that accurately reflects the user's intent.

[0004] Therefore, the related technologies suffer from low accuracy in question-answering matching.

Summary of the Invention

[0005] Therefore, it is necessary to provide a question-and-answer matching method, apparatus, computer equipment, computer-readable storage medium, and computer program product that can improve the accuracy of question-and-answer matching in response to the above-mentioned technical problems.

[0006] Firstly, this application provides a question-and-answer matching method. The method includes:

[0007] Obtain the user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition;

[0008] Target candidate questions are selected from the candidate questions; the target candidate questions are those that have at least one character in common with the user questions.

[0009] For any identical character corresponding to any of the target candidate questions, determine the word in which the identical character is located in the user question as the user word, and determine the word in which the identical character is located in any of the target candidate questions as the candidate word;

[0010] When the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, so as to obtain a new semantic similarity corresponding to any target candidate question;

[0011] Based on the new semantic similarity corresponding to each of the target candidate questions, a target question that matches the user question is selected from the candidate questions; the answer corresponding to the target question is the answer that matches the user question.

[0012] In one embodiment, when the user word and the candidate word are different, adjusting the semantic similarity between the target candidate question and the user question based on the difference between the semantics represented by the user word and the semantics represented by the candidate word, to obtain a new semantic similarity corresponding to the target candidate question, includes:

[0013] Based on the candidate words and the user words, determine the word tuple to be queried;

[0014] If the word tuple to be queried is not found in the thesaurus, it is determined that the semantics represented by the user word and the semantics represented by the candidate word are different.

[0015] When the semantics represented by the user word and the semantics represented by the candidate word are different, the semantic similarity corresponding to any target candidate question is adjusted to obtain a new semantic similarity corresponding to any target candidate question.
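The thesaurus lookup in [0013] to [0015] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `semantics_differ`, the representation of the thesaurus as a set of unordered word pairs, and the sample entry are all assumptions. The text does not say whether the word tuple is ordered, so an unordered pair is assumed here.

```python
from typing import FrozenSet, Set


def semantics_differ(user_word: str, candidate_word: str,
                     thesaurus: Set[FrozenSet[str]]) -> bool:
    """Return True when the two words are treated as semantically different.

    Assumption: the thesaurus is a set of unordered synonym pairs, and a
    pair that is absent from it means the semantics differ.
    """
    if user_word == candidate_word:
        # Identical words need no lookup; no adjustment is triggered.
        return False
    # Build the word tuple to be queried from the two words.
    word_tuple = frozenset((user_word, candidate_word))
    return word_tuple not in thesaurus


# Hypothetical thesaurus with a single synonym pair.
thesaurus = {frozenset(("open", "launch"))}
```

With this sample thesaurus, the pair ("open", "clock in") is not found and is treated as semantically different, while ("launch", "open") is found and is not.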

[0016] In one embodiment, when the semantics represented by the user word and the semantics represented by the candidate word are different, adjusting the semantic similarity corresponding to any target candidate question to obtain a new semantic similarity corresponding to any target candidate question includes:

[0017] Among the identical characters corresponding to any target candidate question, those whose corresponding user word differs from the corresponding candidate word, and whose user word and candidate word represent different semantics, are taken as the target identical characters between that target candidate question and the user question;

[0018] The total number of identical characters between any target candidate question and the user question is determined as the first character count, and the number of characters in the user question is determined as the second character count;

[0019] Based on the ratio of the second character count to the first character count, and a preset similarity adjustment parameter, the semantic similarity corresponding to any target candidate question is adjusted to obtain a new semantic similarity corresponding to that target candidate question; the new semantic similarity is less than the original semantic similarity.

[0020] In one embodiment, the similarity adjustment parameter is a constant greater than zero; adjusting the semantic similarity of any target candidate question based on the ratio of the second character count to the first character count and the preset similarity adjustment parameter, to obtain a new semantic similarity for that target candidate question, includes:

[0021] Determine the ratio of the second character count to the first character count, and add it to the preset similarity adjustment parameter to obtain a sum;

[0022] Divide the semantic similarity corresponding to any target candidate question by the sum to obtain the new semantic similarity corresponding to that target candidate question; the new semantic similarity increases with the first character count.
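The adjustment in [0021] and [0022] amounts to dividing the original similarity by (second count / first count + parameter). The sketch below assumes that reading of the ratio; the function name `adjust_similarity` and the default parameter value 0.5 are hypothetical, since the text only requires the parameter to be a constant greater than zero.

```python
def adjust_similarity(similarity: float, first_count: int,
                      second_count: int, alpha: float = 0.5) -> float:
    """Return similarity / (second_count / first_count + alpha).

    first_count:  number of identical characters (first character count).
    second_count: number of characters in the user question.
    alpha:        preset similarity adjustment parameter (assumed value).
    """
    if first_count <= 0:
        raise ValueError("first_count must be positive")
    if alpha <= 0:
        raise ValueError("alpha must be greater than zero")
    ratio = second_count / first_count
    return similarity / (ratio + alpha)


original = 0.9
# A user question of 5 characters with 1 or 3 identical characters.
penalised_small_overlap = adjust_similarity(original, first_count=1, second_count=5)
penalised_large_overlap = adjust_similarity(original, first_count=3, second_count=5)
```

Because the first count cannot exceed the second count, the ratio is at least 1 and the divisor exceeds 1, so the result is always below the original similarity. A larger first count gives a smaller divisor and therefore a smaller penalty.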

[0023] In one embodiment, determining, for any identical character corresponding to any target candidate question, the word containing the identical character in the user question as the user word, and the word containing the identical character in that target candidate question as the candidate word, includes:

[0024] The user question and any target candidate question are segmented into words respectively to obtain the segmented user question and the segmented target candidate question.

[0025] In the user's question after word segmentation, determine the word containing any identical character to obtain the user's word;

[0026] Furthermore, the word containing any identical character is determined in the target candidate question after word segmentation, thereby obtaining the candidate word.

[0027] In one embodiment, the step of filtering out the target candidate question from each of the candidate questions includes:

[0028] Among the candidate questions, incomplete matching candidate questions are selected; the incomplete matching candidate questions are those whose corresponding strings are not exactly equal to the strings corresponding to the user questions.

[0029] Among the incompletely matched candidate questions, those that share the same characters as the user's question are selected as the target candidate question.
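The two-stage screening in [0028] and [0029] can be sketched as below. The function name `select_target_candidates` and the placeholder strings are assumptions made for illustration; each letter stands in for one character of a question.

```python
from typing import List


def select_target_candidates(user_question: str,
                             candidates: List[str]) -> List[str]:
    """Keep candidates that are not exact matches but share a character."""
    user_chars = set(user_question)
    # Stage 1: keep only incompletely matched candidates, i.e. those whose
    # string is not exactly equal to the user question.
    incomplete = [c for c in candidates if c != user_question]
    # Stage 2: among those, keep candidates sharing at least one character.
    return [c for c in incomplete if user_chars & set(c)]


# Placeholder questions: "abcd" is an exact match, "wxyz" shares nothing.
targets = select_target_candidates("abcd", ["abcd", "abxy", "wxyz", "dq"])
```

An exact match is excluded from adjustment because its word pairs are necessarily identical, so it keeps its original semantic similarity.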

[0030] In one embodiment, the step of filtering out the target question that matches the user question from the candidate questions based on the new semantic similarity corresponding to each of the target candidate questions includes:

[0031] Based on the new semantic similarity corresponding to each target candidate question and the semantic similarity corresponding to each of the remaining candidate questions, the candidate question with the highest corresponding semantic similarity is selected from each of the candidate questions and used as the candidate question to be compared; each of the remaining candidate questions refers to the candidate questions other than each target candidate question.

[0032] If the semantic similarity of the candidate question to be compared is greater than a preset similarity threshold, the candidate question to be compared is taken as the target question.
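The final selection in [0031] and [0032] can be sketched as follows. The function name `pick_target_question`, the dictionary representation of scores, and the sample values are assumptions; returning `None` when no candidate passes the threshold is also an assumption, since the text does not describe that case.

```python
from typing import Dict, Optional


def pick_target_question(adjusted: Dict[str, float],
                         remaining: Dict[str, float],
                         threshold: float) -> Optional[str]:
    """Pick the highest-scoring candidate if it exceeds the threshold.

    adjusted:  new semantic similarity of each target candidate question.
    remaining: original semantic similarity of the other candidates.
    """
    scores = {**remaining, **adjusted}
    if not scores:
        return None
    # The candidate question to be compared is the one with the top score.
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Hypothetical scores: "q1" was adjusted downward, "q2" was left unchanged.
chosen = pick_target_question({"q1": 0.4}, {"q2": 0.8}, threshold=0.6)
rejected = pick_target_question({"q1": 0.4}, {"q2": 0.8}, threshold=0.9)
```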

[0033] Secondly, this application also provides a question-and-answer matching device. The device includes:

[0034] The acquisition module is used to acquire a user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition;

[0035] The first filtering module is used to filter out target candidate questions from each of the candidate questions; the target candidate question is a candidate question that has at least one character in common with the user question.

[0036] The determination module is used to determine, for any identical character corresponding to any target candidate question, the word in the user question containing the identical character as the user word, and to determine the word in the target candidate question containing the identical character as the candidate word;

[0037] The adjustment module is used to adjust the semantic similarity between any target candidate question and the user question based on the difference between the semantics represented by the user word and the semantics represented by the candidate word when the user word and the candidate word are different, so as to obtain a new semantic similarity corresponding to any target candidate question;

[0038] The second filtering module is used to filter out target questions that match the user's question from among the candidate questions based on the new semantic similarity corresponding to each target candidate question; the answer corresponding to the target question is the answer that matches the user's question.

[0039] Thirdly, this application also provides a computer device. The computer device includes a memory and a processor, the memory storing a computer program, and the processor executing the computer program to perform the following steps:

[0040] Obtain the user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition;

[0041] Target candidate questions are selected from the candidate questions; the target candidate questions are those that have at least one character in common with the user questions.

[0042] For any identical character corresponding to any of the target candidate questions, determine the word in which the identical character is located in the user question as the user word, and determine the word in which the identical character is located in any of the target candidate questions as the candidate word;

[0043] When the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, so as to obtain a new semantic similarity corresponding to any target candidate question;

[0044] Based on the new semantic similarity corresponding to each of the target candidate questions, a target question that matches the user question is selected from the candidate questions; the answer corresponding to the target question is the answer that matches the user question.

[0045] Fourthly, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program thereon, which, when executed by a processor, performs the following steps:

[0046] Obtain the user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition;

[0047] Target candidate questions are selected from the candidate questions; the target candidate questions are those that have at least one character in common with the user questions.

[0048] For any identical character corresponding to any of the target candidate questions, determine the word in which the identical character is located in the user question as the user word, and determine the word in which the identical character is located in any of the target candidate questions as the candidate word;

[0049] When the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, so as to obtain a new semantic similarity corresponding to any target candidate question;

[0050] Based on the new semantic similarity corresponding to each of the target candidate questions, a target question that matches the user question is selected from the candidate questions; the answer corresponding to the target question is the answer that matches the user question.

[0051] Fifthly, this application also provides a computer program product. The computer program product includes a computer program that, when executed by a processor, performs the following steps:

[0052] Obtain the user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition;

[0053] Target candidate questions are selected from the candidate questions; the target candidate questions are those that have at least one character in common with the user questions.

[0054] For any identical character corresponding to any of the target candidate questions, determine the word in which the identical character is located in the user question as the user word, and determine the word in which the identical character is located in any of the target candidate questions as the candidate word;

[0055] When the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, so as to obtain a new semantic similarity corresponding to any target candidate question;

[0056] Based on the new semantic similarity corresponding to each of the target candidate questions, a target question that matches the user question is selected from the candidate questions; the answer corresponding to the target question is the answer that matches the user question.

[0057] The aforementioned question-and-answer matching method, apparatus, computer equipment, storage medium, and computer program product acquire a user question and multiple candidate questions corresponding to the user question; the literal matching degree between each candidate question and the user question meets a preset matching degree condition; target candidate questions are selected from each candidate question; a target candidate question is a candidate question that has at least one identical character with the user question; for any identical character corresponding to any target candidate question, the word containing that identical character in the user question is determined as the user word, and the word containing that identical character in any target candidate question is determined as the candidate word; when the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, resulting in a new semantic similarity corresponding to any target candidate question; based on the new semantic similarity corresponding to each target candidate question, a target question matching the user question is selected from each candidate question; the answer corresponding to the target question is the answer matching the user question.

[0058] Thus, after multiple candidate questions corresponding to the user's question are determined through literal matching, the target candidate questions, namely those having at least one character in common with the user's question, are selected from among the candidate questions. A target candidate question and the user's question may overlap in characters, for example through single-character overlap or consecutive-character overlap. Due to the ambiguity of natural language, question matching faces the challenge of identical characters carrying different semantics: the semantics represented by the identical characters in the target candidate question and in the user's question may differ, so that the two questions are not semantically similar. Therefore, when the user word corresponding to an identical character in the user's question differs from the candidate word corresponding to that character in the target candidate question, the semantics represented by the user word and the candidate word are compared. Adjusting the semantic similarity between the target candidate question and the user question according to the difference between them makes the semantic similarity corresponding to the target candidate question more reasonable and yields a new semantic similarity for that target candidate question. This allows the target question that matches the user question to be selected accurately from among the candidate questions based on the new semantic similarity, making the answer determined from the target question more consistent with the user's intent. This addresses the problem in related technologies where a high semantic similarity is assigned when the user question and the candidate question overlap in single or consecutive characters but are not semantically similar, which leads to inaccurate semantic judgment of the user question and an inability to determine the answer that matches the user's intent. The accuracy of question-and-answer matching is thereby effectively improved.

Attached Figure Description

[0059] Figure 1 is a flowchart illustrating a question-and-answer matching method in one embodiment;

[0060] Figure 2 is a flowchart illustrating the steps for adjusting semantic similarity in one embodiment;

[0061] Figure 3 is a flowchart illustrating another question-and-answer matching method in one embodiment;

[0062] Figure 4 is a flowchart illustrating a question-and-answer matching method in another embodiment;

[0063] Figure 5 is a structural block diagram of a question-and-answer matching device in one embodiment;

[0064] Figure 6 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0065] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0066] It should be noted that the terms "first," "second," etc., used in the specification, claims, and accompanying drawings of this disclosure are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this disclosure described herein can be implemented in orders other than those illustrated or described herein. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this disclosure as detailed in the appended claims.

[0067] In one embodiment, as shown in Figure 1, a question-and-answer matching method is provided. It is understood that this method can also be applied to terminals, and also to systems including terminals and servers, where it is implemented through the interaction between the terminal and the server. For example, the system can be an automatic question-and-answer system. The server can be a standalone server or a server cluster composed of multiple servers. This embodiment illustrates the application of this method to a FAQ (Frequently Asked Questions) question-and-answer system including terminals and servers. The method includes the following steps:

[0068] Step S110: Obtain the user's question and multiple candidate questions corresponding to the user's question.

[0069] Here, the literal matching degree between each candidate question and the user question meets the preset matching degree condition.

[0070] In practice, users can input questions through the terminal. The questions input by the user on the terminal are called user questions. The terminal can respond to the user's question input operation, obtain the user questions, and send the user questions to the server so that the server can obtain the user questions.

[0071] The server's question-and-answer database stores a large number of frequently asked standard questions and their corresponding standard answers. It also stores the different questions, or similar questions, that correspond to each standard question. When a user's question matches a standard or similar question in the database, the server can directly provide the corresponding standard answer to the user.

[0072] Thus, when the server receives a user's question, it can perform a coarse-grained literal matching process, comparing the user's question with a large number of standard questions and their corresponding similar questions in the question-and-answer database. This process filters out standard or similar questions whose literal match with the user's question meets a preset matching threshold, which are then used as candidate questions for the user's question. For example, the server can filter out standard or similar questions whose literal match with the user's question is greater than a preset matching threshold, as candidate questions for the user's question.
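The coarse-grained literal matching in [0072] can be sketched as below. The patent text does not specify how the literal matching degree is computed, so the character-set Jaccard measure used here is purely an assumption, as are the function names and the threshold value 0.3.

```python
from typing import List


def literal_match_degree(a: str, b: str) -> float:
    """Assumed measure: Jaccard overlap of the two character sets."""
    chars_a, chars_b = set(a), set(b)
    if not chars_a or not chars_b:
        return 0.0
    return len(chars_a & chars_b) / len(chars_a | chars_b)


def coarse_candidates(user_question: str, stored_questions: List[str],
                      threshold: float = 0.3) -> List[str]:
    """Keep stored standard or similar questions above the threshold."""
    return [q for q in stored_questions
            if literal_match_degree(user_question, q) > threshold]


# Placeholder strings; each letter stands in for one character.
candidates = coarse_candidates("abcd", ["abce", "wxyz", "abcd"])
```

Any other literal measure, such as edit distance or n-gram overlap, could be substituted without changing the later steps, which only rely on the resulting candidate list.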

[0073] Then, the server uses a deep model-based FAQ (Frequently Asked Questions) semantic matching ranking method to calculate the semantic similarity matching score between the user's question and each candidate question, thus obtaining the semantic similarity between the user's question and each candidate question.

[0074] Step S120: Select the target candidate question from among the candidate questions.

[0075] Here, the target candidate question is a candidate question that has at least one character in common with the user question.

[0076] In practice, the server can filter out candidate questions from among the candidate questions that have at least one character in common with the user's question, and use them as target candidate questions.

[0077] When a candidate question has at least one character in common with the user question, it is determined that there is character overlap between the candidate question and the user question. Character overlap includes single-character overlap and consecutive-character overlap. Single-character overlap means that the literal content shared by the user question and the candidate question consists of discontinuous single characters. For example, for the user question "fitness card" and the candidate question "check health card", the shared literal content is the two discontinuous single characters rendered here as "fit" and "card". Consecutive-character overlap means that the literal content shared by the user question and the candidate question consists of consecutive characters. For example, for the user question "open QR code" and the candidate question "punch in graphic code", the shared literal content is the consecutive characters rendered here as "code" and "punch".
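One way to distinguish the two kinds of overlap described in [0077] is to measure the longest run of characters that appears contiguously in both questions. This is only one possible reading of the definitions: the function names, the return labels, and the rule that a shared run of two or more characters counts as consecutive overlap are all assumptions.

```python
def longest_common_run(a: str, b: str) -> int:
    """Length of the longest substring that occurs in both strings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous character pair.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def overlap_kind(user_question: str, candidate: str) -> str:
    """Classify overlap as 'none', 'single' or 'consecutive' (assumed rule)."""
    run = longest_common_run(user_question, candidate)
    if run == 0:
        return "none"
    return "single" if run == 1 else "consecutive"


# Placeholder strings; each letter stands in for one character.
kinds = [overlap_kind("abc", "xaybz"),   # shares a and b, never adjacent
         overlap_kind("abcd", "xbcy"),   # shares the run "bc"
         overlap_kind("abc", "xyz")]     # shares nothing
```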

[0078] Step S130: For any identical character corresponding to any target candidate question, determine the word in which the identical character is located in the user question as the user word, and determine the word in which the identical character is located in that target candidate question as the candidate word.

[0079] Specifically, this includes: performing word segmentation on the user question and on the target candidate question to obtain the segmented user question and the segmented target candidate question; determining the word containing the identical character in the segmented user question to obtain the user word; and determining the word containing the identical character in the segmented target candidate question to obtain the candidate word.

[0080] In a specific implementation, for any target candidate question, the server can perform word segmentation on the user question and on that target candidate question to obtain the segmented user question and the segmented target candidate question. For any one of the identical characters shared by that target candidate question and the user question, the server can determine the word containing the character in the segmented user question as the user word corresponding to the character, and determine the word containing the character in the segmented target candidate question as the candidate word corresponding to the character.

[0081] For example, if the user's question is "Open XX Application" and the target candidate question is "Clock in through YY Application", then the segmented user question is "XX / Application / Open" and the segmented target candidate question is "Through / YY / Application / Clock in". The identical characters between the user question and the target candidate question include "ying", "yong", and "da". In the segmented user question, the identical characters "ying" and "yong" are both located in the word "Application", so the user word corresponding to each of them is "Application"; in the segmented target candidate question they are likewise both located in the word "Application", so the candidate word corresponding to each of them is also "Application". The identical character "da", by contrast, is located in the word "Open" in the segmented user question and in the word "Clock in" in the segmented target candidate question, so its user word is "Open" and its candidate word is "Clock in".
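The mapping from shared characters to user words and candidate words in this example can be sketched as follows. The token lists are placeholders in which each letter stands in for one character: "ab" plays the role of "Application", "cd" of "Open", and "ce" of "Clock in". The function names are hypothetical, the questions are assumed to be already segmented, and taking the first token that contains a character is an assumption for characters that occur more than once.

```python
from typing import Dict, List, Optional, Tuple


def word_containing(tokens: List[str], char: str) -> Optional[str]:
    """Return the first segmented word that contains the character."""
    for token in tokens:
        if char in token:
            return token
    return None


def word_pairs(user_tokens: List[str],
               candidate_tokens: List[str]
               ) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Map each shared character to its (user word, candidate word) pair."""
    shared = set("".join(user_tokens)) & set("".join(candidate_tokens))
    return {ch: (word_containing(user_tokens, ch),
                 word_containing(candidate_tokens, ch))
            for ch in sorted(shared)}


user_tokens = ["xx", "ab", "cd"]              # "XX / Application / Open"
candidate_tokens = ["th", "yy", "ab", "ce"]   # "Through / YY / Application / Clock in"
pairs = word_pairs(user_tokens, candidate_tokens)
# Characters whose user word and candidate word differ.
differing = {ch: p for ch, p in pairs.items() if p[0] != p[1]}
```

Here the characters "a" and "b" map to the same word on both sides, while "c" maps to "cd" in the user question and "ce" in the candidate question, mirroring the "da" case in the example.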

[0082] Step S140, when the user word and the candidate word are different, adjust the semantic similarity between any one of the target candidate questions and the user question according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, and obtain the new semantic similarity corresponding to any one of the target candidate questions.

[0083] In specific implementation, the server can compare the user word corresponding to each common character with the corresponding candidate word. When the two are not equal, the server determines that the user word and the candidate word are different, that is, the common character belongs to different words in the user question and in the target candidate question. In the above example, the user word "Open" corresponding to the common character "da" differs from the corresponding candidate word "Clock in"; the server can then adjust the semantic similarity between the target candidate question and the user question according to the difference between the semantics represented by that user word and the semantics represented by that candidate word, and obtain the new semantic similarity corresponding to the target candidate question. When, for every common character between the target candidate question and the user question, the user word and the corresponding candidate word are the same, no adjustment is needed, and the new semantic similarity of the target candidate question equals its original semantic similarity. In this way, the server can determine the new semantic similarity between each target candidate question and the user question.

[0084] Step S150: Based on the new semantic similarity corresponding to each target candidate question, select the target question that matches the user's question from among the candidate questions.

[0085] The answer to the target question is the answer that matches the user's question.

[0086] In practice, the server can select candidate questions that semantically match the user's question based on the new semantic similarity of each target candidate question, and use them as target questions. The server can then return the standard answer corresponding to the target question as the answer that matches the user's question to the terminal for the user to view.

[0087] In the above question-and-answer matching method, the user's question and multiple candidate questions are first obtained, where the literal matching degree between each candidate question and the user's question meets a preset matching degree condition. Second, target candidate questions are selected from among the candidate questions; a target candidate question is one that shares at least one character with the user's question. Third, for any identical character of any target candidate question, the word containing that character in the user's question is determined as the user word, and the word containing that character in the target candidate question is determined as the candidate word. Fourth, when the user word and the candidate word differ, the semantic similarity between the target candidate question and the user's question is adjusted based on the difference between the semantics represented by the user word and the semantics represented by the candidate word, resulting in a new semantic similarity for that target candidate question. Finally, based on the new semantic similarity of each target candidate question, a target question matching the user's question is selected from among the candidate questions; the answer corresponding to the target question is the answer matching the user's question.

[0088] Thus, after multiple candidate questions corresponding to the user's question are determined through literal matching, the target candidate questions, which share at least one identical character with the user's question, are selected from among them. A target candidate question and the user's question may overlap in a single character or in consecutive characters. Due to the ambiguity of natural language, question matching faces the challenge of identical characters carrying different semantics: the semantics represented by the identical characters in the target candidate question and in the user's question may differ, so that the two questions are not semantically similar. Therefore, when the user word corresponding to an identical character in the user's question differs from the candidate word corresponding to that character in the target candidate question, the semantics represented by the user word and the candidate word are compared. Adjusting the semantic similarity between the target candidate question and the user's question based on the difference between them makes that similarity more reasonable and yields a new semantic similarity for the target candidate question. The target question that matches the user's question can then be selected accurately from among the candidate questions based on the new semantic similarity, making the answer determined from the target question more consistent with the user's intent. This addresses the problem in related technologies where a high semantic similarity is assigned when the user's question and a candidate question overlap in a single character or in consecutive characters but are not semantically similar, which leads to inaccurate semantic judgment of the user's question and an inability to determine the answer that matches the user's intent. The accuracy of question-and-answer matching is thereby improved.

[0089] In one embodiment, when the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted based on the difference between the semantics represented by the user word and the semantics represented by the candidate word to obtain a new semantic similarity corresponding to any target candidate question. This includes: determining the query word pair based on the candidate word and the user word; if the query word pair is not found in the thesaurus, determining that the semantics represented by the user word and the semantics represented by the candidate word are different; and adjusting the semantic similarity corresponding to any target candidate question to obtain a new semantic similarity corresponding to any target candidate question when the semantics represented by the user word and the semantics represented by the candidate word are different.

[0090] In specific implementation, when the user word corresponding to any identical character differs from the corresponding candidate word, the server can, in the process of adjusting the semantic similarity between the target candidate question and the user question to obtain the new semantic similarity, construct a word pair from the user word corresponding to that identical character and the corresponding candidate word, and use it as the query word pair.

[0091] The synonym database in the server includes a synonym list, which is collected and sorted in advance according to common sense knowledge. The synonym list includes a large number of synonym pairs. Each synonym pair includes two words with the same semantics. The format of the synonym list is, for example, [(meal allowance, dining allowance),...,(house, building)]. The server can use the query term pair as the target to query the synonym database to check if there is a synonym pair identical to the query term pair. If a synonym pair identical to the query term pair is found in the synonym list, it is determined that the semantics represented by the user term corresponding to any of the identical characters are the same as the semantics represented by the corresponding candidate term, and the user term and the corresponding candidate term corresponding to any of the identical characters are synonyms of each other; if no synonym pair identical to the query term pair is found in the synonym list, it is determined that the semantics represented by the user term corresponding to any of the identical characters are different from the semantics represented by the corresponding candidate term, and the user term and the corresponding candidate term corresponding to any of the identical characters are not synonyms of each other.
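The synonym query described above can be sketched as a set-membership test. This is an illustrative sketch under stated assumptions: the names `SYNONYM_LIST`, `SYNONYM_PAIRS`, and `same_semantics` are hypothetical, and the lookup is assumed to be order-insensitive (the source does not say whether "(house, building)" also covers "(building, house)").

```python
# Example synonym list in the format given in the text.
SYNONYM_LIST = [
    ("meal allowance", "dining allowance"),
    ("house", "building"),
]

# Assumption: a pair matches regardless of word order, so each pair is
# stored as a frozenset for constant-time, order-insensitive lookup.
SYNONYM_PAIRS = {frozenset(pair) for pair in SYNONYM_LIST}


def same_semantics(user_word, cand_word):
    """Return True if the two words are identical or listed as synonyms."""
    if user_word == cand_word:
        return True
    return frozenset((user_word, cand_word)) in SYNONYM_PAIRS


print(same_semantics("house", "building"))   # listed pair
print(same_semantics("Open", "Clock in"))    # not listed, so treated as different
```

Storing pairs as frozensets is one design choice among several; a sorted tuple per pair would work equally well.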

[0092] For example, in the above example, the user term corresponding to the identical character "da" is "Open", and the corresponding candidate term is "Clock in". The query term pair constructed from the user term "Open" and the candidate term "Clock in" is "(Open, Clock in)". Querying the synonym database finds no synonym pair identical to "(Open, Clock in)". Therefore, the semantics represented by the user term "Open" differ from the semantics represented by the candidate term "Clock in", and the two are not synonyms of each other (the synonym list is collected and sorted based on common sense knowledge, and this query term pair is not synonymous in common sense, so it is not included in the synonym list).

[0093] Thus, when the semantics represented by the user term corresponding to any of the identical characters are different from the semantics represented by the corresponding candidate term, the server can adjust the semantic similarity corresponding to any of the target candidate questions to obtain a new semantic similarity corresponding to any of the target candidate questions. When, for each identical character between any of the target candidate questions and the user question, the corresponding user term and the corresponding candidate term are the same, or the corresponding user term and the corresponding candidate term are synonyms of each other, there is no need to adjust the semantic similarity corresponding to any of the target candidate questions, and the new semantic similarity corresponding to any of the target candidate questions is equal to the original semantic similarity.

[0094] The technical solution of this embodiment determines the query word tuple based on candidate words and user words. If the query word tuple is not found in the thesaurus, it is determined that the semantics represented by the user words and the candidate words are different. If the semantics represented by the user words and the candidate words are different, the semantic similarity corresponding to any target candidate question is adjusted to obtain a new semantic similarity for that target candidate question. Thus, by searching the thesaurus for query word tuples composed of candidate words and user words with the same characters in the thesaurus, it is possible to accurately determine whether the semantics represented by the candidate words and user words are the same. This allows for adjustment of the semantic similarity between the target candidate question and the user question even when the semantics represented by the candidate words and user words with the same characters are different, optimizing the rationality of the semantic similarity. This solves the problem of inaccurate semantic determination of user questions when there is character overlap between user questions and candidate questions, effectively improving the accuracy of semantic determination of user questions and further enhancing the accuracy of question-answering matching.

[0095] In one embodiment, as shown in Figure 2, when the semantics represented by the user word and the semantics represented by the candidate word are different, adjusting the semantic similarity corresponding to any target candidate question to obtain a new semantic similarity corresponding to that target candidate question specifically includes:

[0096] Step S210: Among the identical characters corresponding to any target candidate question, take each identical character whose corresponding user word differs from the corresponding candidate word, and whose user word represents semantics different from those of the candidate word, as a target identical character between that target candidate question and the user question.

[0097] In specific implementation, when the semantics represented by the user word corresponding to any identical character are different from the semantics represented by the corresponding candidate word, in the process of adjusting the semantic similarity of any target candidate question to obtain a new semantic similarity of any target candidate question, the server may first take the identical character among at least one identical character between any target candidate question and user question, where the corresponding user word and the corresponding candidate word are different, and the semantics represented by the corresponding user word and the corresponding candidate word are different, as the target identical character between any target candidate question and user question.

[0098] Step S220: Determine the total number of target identical characters between any target candidate question and the user question as the first character count, and determine the number of characters in the user question as the second character count.

[0099] In this way, the server obtains both counts needed for the adjustment: the first character count for the target candidate question and the second character count for the user question.

[0100] Step S230: Based on the ratio of the second character count to the first character count and a preset similarity adjustment parameter, adjust the semantic similarity of any target candidate question to obtain a new semantic similarity for that target candidate question.

[0101] Here, the new semantic similarity is less than the original semantic similarity.

[0102] The similarity adjustment parameter is a constant with a corresponding value greater than zero.

[0103] In practice, after determining the first character count and the second character count, the server can adjust the semantic similarity of any target candidate question according to the ratio of the second character count to the first character count and the preset similarity adjustment parameter, so as to obtain a new semantic similarity for that target candidate question.

[0104] Specifically, the similarity adjustment parameter is a constant with a value greater than zero. The server can first compute the sum of the ratio of the second character count to the first character count and the preset similarity adjustment parameter, and then divide the semantic similarity of the target candidate question by this sum; the quotient is the new semantic similarity of the target candidate question. For a given user question, the new semantic similarity increases with the first character count.
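The quotient described in step S230 can be sketched as a small function. The function name, the argument names, and the default value of the adjustment parameter are illustrative assumptions; the source only requires the parameter to be a constant greater than zero and says its value is set by experiment.

```python
def adjust_similarity(similarity, first_count, second_count, a=1.0):
    """Divide `similarity` by (second_count / first_count + a).

    first_count:  number of target identical characters (first character count)
    second_count: number of characters in the user question (second character count)
    a:            similarity adjustment parameter, must be > 0 (default is an assumption)
    """
    if a <= 0:
        raise ValueError("the similarity adjustment parameter must be greater than zero")
    if first_count <= 0:
        # No target identical characters: nothing to adjust.
        return similarity
    return similarity / (second_count / first_count + a)


# Hypothetical numbers: a 5-character user question with 1 or 2 target identical characters.
one_char = adjust_similarity(0.9, first_count=1, second_count=5, a=1.0)
two_chars = adjust_similarity(0.9, first_count=2, second_count=5, a=1.0)
print(one_char, two_chars)
```

With these hypothetical inputs both adjusted values fall below 0.9, and the value for two target identical characters is larger than the value for one, in line with the behavior stated in the text.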

[0105] The technical solution of this embodiment takes, among the identical characters corresponding to any target candidate question, those whose corresponding user word differs from the corresponding candidate word and represents different semantics, as the target identical characters between the target candidate question and the user question. The total number of target identical characters between the target candidate question and the user question is determined as the first character count, and the number of characters in the user question is determined as the second character count. Based on the ratio of the second character count to the first character count and a preset similarity adjustment parameter, the semantic similarity of the target candidate question is adjusted to obtain a new semantic similarity. The new semantic similarity is less than the original semantic similarity. The similarity adjustment parameter is a constant with a value greater than zero. Specifically, the sum of that ratio and the preset similarity adjustment parameter is determined, and the semantic similarity of the target candidate question is divided by this sum to obtain the new semantic similarity. The new semantic similarity increases with the first character count.

[0106] Thus, this method determines a new semantic similarity corresponding to the target candidate question. When the same characters in the user question and the target candidate question belong to different words and the words represent different meanings, the new semantic similarity corresponding to the target candidate question is less than the original semantic similarity. When there is single-character overlap or continuous character overlap between the user question and the target candidate question but the meanings are not similar, the rationality of the semantic similarity between the target candidate question and the user question is optimized. This method can improve the accuracy of semantic determination of user questions and further improve the accuracy of question-answering matching when there is single-character overlap or continuous character overlap between the user question and the target candidate question.

[0107] In one embodiment, selecting a target candidate question from each candidate question includes: selecting incompletely matching candidate questions from each candidate question; an incompletely matching candidate question is a candidate question whose corresponding string is not exactly equal to the string corresponding to the user question; among each incompletely matching candidate question, selecting incompletely matching candidate questions that have the same characters as the user question, and using them as the target candidate question.

[0108] In specific implementation, when filtering target candidate questions from the candidate questions, the server can first select the candidate questions whose corresponding strings are not exactly equal to the string corresponding to the user question as incompletely matching candidate questions. The server can then select, from these incompletely matching candidate questions, those that share at least one identical character with the user question as the target candidate questions corresponding to the user question.
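The two-stage filter can be sketched as follows. The function name is hypothetical, and the Chinese strings are assumed stand-ins for a gym card question, a health card viewing question, and a trip information question; they are used because character overlap is only meaningful on the original characters, not on an English translation.

```python
def select_target_candidates(user_question, candidates):
    """Keep candidates that are not string-equal to the user question
    and share at least one character with it."""
    user_chars = set(user_question)
    return [
        cand
        for cand in candidates
        if cand != user_question and user_chars & set(cand)
    ]


# Hypothetical strings, assumed for illustration only.
user_question = "健身卡"
candidates = ["健身卡", "健康卡查看", "行程信息"]

targets = select_target_candidates(user_question, candidates)
print(targets)
```

In this sketch the exact match and the candidate with no shared character are both left out, so only the partially overlapping candidate remains for adjustment.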

[0109] In this way, the server can filter out target candidate questions that have character overlap with the user's question, and adjust the semantic similarity between the target candidate questions and the user's question based on word segmentation matching and synonym matching between the user's question and the target candidate questions.

[0110] The technical solution of this embodiment filters out incompletely matching candidate questions from each candidate question. An incompletely matching candidate question is one whose corresponding string is not exactly equal to the string corresponding to the user question. From these incompletely matching candidate questions, those that share characters with the user question are selected as target candidate questions. In this way, target candidate questions with character overlap with the user question can be accurately selected from multiple candidate questions corresponding to the user question. This allows for adjustment of the semantic similarity of the target candidate questions when there is character overlap (e.g., single-character overlap or consecutive character overlap) but the semantics are not similar.

[0111] In one embodiment, the process of selecting a target question that matches the user's question from among the candidate questions based on the new semantic similarity corresponding to each target candidate question includes: selecting the candidate question with the highest corresponding semantic similarity from among the candidate questions based on the new semantic similarity corresponding to each target candidate question and the semantic similarity corresponding to each of the remaining candidate questions in each candidate question, and using it as the candidate question to be compared; the remaining candidate questions are the candidate questions other than each target candidate question; and if the semantic similarity corresponding to the candidate question to be compared is greater than a preset similarity threshold, the candidate question to be compared is used as the target question.

[0112] In the specific implementation, during the process of filtering out the target question that matches the user's question from among the candidate questions based on the new semantic similarity corresponding to each target candidate question, the server can filter out the candidate question with the highest corresponding semantic similarity from among the candidate questions, based on the new semantic similarity corresponding to each target candidate question and the semantic similarity corresponding to the other candidate questions in each candidate question, and use it as the candidate question to be compared; compare the candidate question to be compared with a preset similarity threshold, and if the semantic similarity corresponding to the candidate question to be compared is greater than the preset similarity threshold, then the candidate question to be compared is used as the target question.
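The selection with a threshold can be sketched as follows. The function name, the score values, and the threshold of 0.5 are hypothetical; the scores are assumed to already hold the new semantic similarity for target candidate questions and the original semantic similarity for the remaining candidate questions.

```python
def select_target_question(scores, threshold):
    """Return the candidate with the highest similarity if that similarity
    exceeds `threshold`; otherwise return None."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] > threshold else None


# Hypothetical similarities after adjustment.
scores = {
    "How do I open my electronic health card": 0.62,
    "View health card": 0.21,
}

print(select_target_question(scores, threshold=0.5))
print(select_target_question(scores, threshold=0.7))
```

Returning None when no candidate clears the threshold is one possible convention; the source does not describe what the server does in that case.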

[0113] The technical solution of this implementation selects the candidate question with the highest semantic similarity from multiple candidate questions corresponding to the user's question. If the semantic similarity of the candidate question to be compared is greater than a preset similarity threshold, the candidate question to be compared is used as the target question to match the user's question. This can improve the matching degree between the target question and the user's question, so that the answer corresponding to the returned target question matches the user's question intent, and further improve the accuracy of question-answer matching.

[0114] In one embodiment, another question-and-answer matching method is provided, illustrated by applying the method to the aforementioned FAQ question-and-answer system that includes a terminal and a server. As shown in Figure 3, the method includes the following steps:

[0115] Step S310: Using a deep model-based FAQ (Frequently Asked Questions) semantic matching and ranking method, calculate the semantic similarity matching score between the user's question and each candidate question to obtain the semantic similarity between the user's question and each candidate question.

[0116] Specifically, the server can obtain the text q_u of one user question, a list consisting of the texts q_i (1 <= i <= N) of N candidate questions, and a trained semantic similarity calculation model. User questions can be, for example, "I want to open my health card" or "gym card," and candidate questions can be, for example, "How do I open my electronic health card" or "View health card." The trained semantic similarity calculation model calculates semantic similarity scores using an existing deep model-based FAQ semantic matching and ranking method, for example a question semantic matching and ranking method based on a pre-trained dual encoder. The semantic similarity calculation model (referred to as model M_D) consists of a user question encoding module, a candidate question encoding module, and an encoding matching module.

[0117] Then, the server can output the semantic similarity matching scores S between the one user question and the N candidate questions. Specifically, the server can use model M_D to calculate the vector encoding E_u corresponding to the user question text q_u and the vector encodings E_c corresponding to the candidate question texts q_i (1 <= i <= N), where E_u is a 1*h-dimensional real vector and E_c is an N*h-dimensional real matrix. E_u and E_c are then used to calculate the semantic similarity matching scores, which are stored in S, where S is a 1*N-dimensional real vector and S_i is the matching score between q_u and q_i (1 <= i <= N). The specific process is as follows:

[0118] Step 1: Input the text q_u of the user question into the user question encoding module of model M_D for vector encoding to obtain the vector-encoded representation of the entire user question, which serves as the vector encoding E_u corresponding to the user question.

[0119] For example, the user question encoding module of model M_D uses the BERT (Bidirectional Encoder Representations from Transformers) model to encode the user question as follows:

[0120] 1) Concatenate the user's question into a string: "[CLS]Text of user's question[SEP]";

[0121] 2) Input the above string into the BERT model for vector encoding, and take the output encoding at the [CLS] position as the vector encoding corresponding to the user's question. The dimension is h, and h is usually 768 or 1024.

[0122] For example, the text of the user question "I want to open my health card" is concatenated into the string "[CLS]I want to open my health card[SEP]" and input into the BERT model for vector encoding, and the output encoding at the [CLS] position is taken as the vector encoding E_u corresponding to the user question.

[0123] Step 2: Input the text q_i of each candidate question into the candidate question encoding module of model M_D for vector encoding to obtain the vector-encoded representation of each candidate question, which is stored in E_c as the vector encoding corresponding to that candidate question.

[0124] For example, the candidate question encoding module of model M_D uses the BERT model to encode each candidate question as follows:

[0125] 1) Concatenate the text of each candidate question into a string: "[CLS] text of candidate questions [SEP]", where [CLS] represents the start character of the string and [SEP] represents the end character of the string;

[0126] 2) Input the string into the BERT model for vector encoding, and take the output encoding at the [CLS] position as the vector encoding corresponding to the candidate question. The dimension is h, and h is usually 768 or 1024.

[0127] Based on this BERT text encoder, the texts of the candidate questions "How to open the electronic health card" and "View health card" are each concatenated into a string according to step 1) above, and the h-dimensional vector encoding output according to step 2) above is saved to E_c.

[0128] Step 3: Input the vector encoding E_u corresponding to the user question and the vector encodings E_c corresponding to the candidate questions into the encoding matching module of model M_D for encoding matching to obtain the semantic similarity matching scores between the vector encoding of the user question and the vector encoding of each candidate question, and store them in S, where S_i is the semantic similarity matching score between q_u and q_i (1 <= i <= N).

[0129] For example, the inner product of E_u with the transpose of E_c is calculated, yielding a 1*N-dimensional real vector. Softmax (the normalized exponential function) is then applied over the N dimensions to normalize this vector and obtain the semantic similarity probability distribution over the N candidate questions. This distribution serves as the semantic similarity matching scores and is stored in S, where S_i is the semantic similarity matching score between q_u and q_i (1 <= i <= N).
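The encoding matching in Step 3 can be sketched in plain Python. This is an illustration only: the function name is hypothetical, the vectors are tiny hand-written stand-ins rather than real BERT encodings (h = 2 here instead of 768 or 1024), and subtracting the maximum before exponentiation is a standard numerical safeguard that does not change the softmax result.

```python
import math


def matching_scores(e_u, e_c):
    """Inner product of e_u (length h) with every row of e_c (N rows of length h),
    followed by softmax over the N results."""
    logits = [sum(u * c for u, c in zip(e_u, row)) for row in e_c]
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]


# Hypothetical encodings, not produced by any model.
e_u = [1.0, 0.0]
e_c = [
    [2.0, 0.0],  # candidate 1: aligned with the user encoding
    [0.0, 3.0],  # candidate 2: orthogonal to the user encoding
]

S = matching_scores(e_u, e_c)
print(S)
```

Because the scores are a probability distribution over the N candidates, they sum to one and each S_i depends on the other candidates in the list.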

[0130] Thus, by matching the semantic similarity scores between the vector encoding corresponding to the user's question and the vector encoding corresponding to each candidate question, the semantic similarity between the user's question and each candidate question can be determined.

[0131] Step S320: In cases where there is character overlap (single character overlap or continuous character overlap) between the user's question and the candidate question, adjust the semantic similarity matching score based on word segmentation matching and synonym matching.

[0132] Specifically, the server can take as input the text q_u of one user question, the list of texts q_i (1 <= i <= N) of N candidate questions, the synonym list, and the semantic similarity matching scores S between the one user question and the N candidate questions output in step S310, where S is a 1*N-dimensional real vector and S_i is the semantic similarity matching score between q_u and q_i (1 <= i <= N). The server outputs the adjusted semantic similarity matching scores S_r between the one user question and the N candidate questions, where S_r is a 1*N-dimensional real vector and S_ri is the adjusted semantic similarity matching score between q_u and q_i (1 <= i <= N).

[0133] The adjustment process includes, for the text q_i of each candidate question (1 <= i <= N):

[0134] 1) If q_u and q_i are not a complete match, proceed to step 2); otherwise, q_u and q_i match completely, the semantic similarity matching score is not adjusted, S_ri = S_i is set, and the processing for q_i ends;

[0135] Specifically, whether q_u and q_i are a complete match is determined by whether their corresponding strings are exactly equal.

[0136] 2) If q_u and q_i have identical characters, proceed to step 3); otherwise, q_u and q_i have no identical characters, no adjustment is needed, S_ri = S_i is set, and the processing for q_i ends;

[0137] Specifically, to judge whether q_u and q_i have identical characters, the server can first obtain the deduplicated character set of q_u and that of q_i, and then take the intersection of the two sets. Let L be the number of elements in the intersection. If L > 0, then q_u and q_i have identical characters. For example, the user question "gym card" and the candidate question "health card view" have identical characters, while the user question "trip information" and the candidate question "health card view" do not.

[0138] 3) If an identical character between q_u and q_i lies in a word of q_u and a word of q_i that are different and are not synonyms, adjust the semantic similarity matching score S_i of q_u and q_i to S_ri; otherwise, no adjustment is needed, S_ri = S_i, and the processing of q_i ends. In this way, candidate questions that share at least one character with the user question are selected as target candidate questions, and the other candidate questions are the remaining candidate questions.

[0139] The methods for determining the word of q_u and the word of q_i that contain an identical character, and for determining whether the two words are synonyms, are described in the above embodiments and are not repeated here.

[0140] The semantic similarity matching score S_i of q_u and q_i is adjusted to S_ri using the formula S_ri = S_i / (len(q_u) / L_ci + A), where len(q_u) is the total number of characters in the user question text q_u (the second character count), L_ci is the number of identical characters of q_u and q_i that lie in different, non-synonymous words (the first character count), and A > 0 is a preset similarity adjustment parameter. Because L_ci <= len(q_u) and A > 0, the denominator is greater than 1, so for a positive S_i the adjusted score S_ri is strictly less than S_i; and the smaller the first character count between q_u and q_i, the smaller the adjusted score S_ri. A is used to bring S_ri into a specific range of values; its specific value is determined through experiments in the specific scenario.
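The formula in [0140] can be written as a small function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and returning the score unchanged when the first character count is zero reflects the "no adjustment" branch of step 3) rather than anything the formula itself defines.

```python
def adjust_similarity(score: float, second_char_count: int,
                      first_char_count: int, a: float) -> float:
    """Compute S_ri = S_i / (len(q_u) / L_ci + A).

    second_char_count is len(q_u); first_char_count is L_ci; a is A (> 0).
    """
    if a <= 0:
        raise ValueError("the similarity adjustment parameter A must be > 0")
    if first_char_count <= 0:
        # No qualifying identical characters: leave the score unadjusted.
        return score
    return score / (second_char_count / first_char_count + a)
```

For example, with S_i = 0.9, len(q_u) = 6, L_ci = 2 and an assumed A = 0.5, the denominator is 6 / 2 + 0.5 = 3.5, so the adjusted score is 0.9 / 3.5; raising L_ci to 3 shrinks the denominator to 2.5 and gives a larger adjusted score.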

[0141] Thus, from the adjusted semantic similarity matching score S_ri corresponding to each candidate question, the new semantic similarity between each target candidate question and the user question, as well as the semantic similarity between each remaining candidate question and the user question, can be determined.

[0142] Step S330: Semantic similarity matching score sorting and semantic determination of user questions.

[0143] Specifically, the server takes as input the adjusted semantic similarity matching score S_r between the user question and the N candidate questions output in step S320, where S_r is a 1*N dimensional real vector and S_ri is the adjusted semantic similarity matching score between q_u and q_i (1 <= i <= N), and outputs the target question matched to the user question.

[0144] The calculation process includes: sort S_r from largest to smallest and select the candidate question with the highest score as the candidate question to be compared. If the semantic similarity matching score of the candidate question to be compared is greater than the preset similarity threshold Theta, the candidate question to be compared is returned to the terminal as the target question matching the user question; otherwise, an empty string is returned.
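Step S330 can be sketched as below. The names are hypothetical, and taking the maximum is used here in place of a full descending sort, which yields the same top-scoring candidate.

```python
from typing import List


def select_target_question(candidates: List[str],
                           adjusted_scores: List[float],
                           theta: float) -> str:
    """Return the top-scoring candidate if its score exceeds Theta, else ""."""
    if not candidates:
        return ""
    # Index of the largest adjusted score S_ri.
    best = max(range(len(candidates)), key=lambda i: adjusted_scores[i])
    return candidates[best] if adjusted_scores[best] > theta else ""
```

The comparison is strict, so a top score exactly equal to Theta returns the empty string, matching the "greater than" wording of the description.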

[0145] In another embodiment, as shown in Figure 4, a question-and-answer matching method is provided. Taking the application of this method to the aforementioned FAQ question-and-answer system, which includes a terminal and a server, as an example, the method includes the following steps:

[0146] Step S410: Obtain the user's question and multiple candidate questions corresponding to the user's question.

[0147] Step S420: Select the target candidate question from among the candidate questions.

[0148] Step S430: For any identical character corresponding to any target candidate question, determine the word in the user question containing the identical character as the user word, and determine the word in any target candidate question containing the identical character as the candidate word.

[0149] Step S440: If the user word and the candidate word are different, determine the query word tuple based on the candidate word and the user word.

[0150] Step S450: If the query word tuple is not found in the thesaurus, take those identical characters of the target candidate question whose corresponding user word and candidate word are different, and whose corresponding user word and candidate word represent different semantics, as the target identical characters between the target candidate question and the user question.

[0151] Step S460: Determine the total number of target identical characters between any target candidate question and the user question as the first character count, and determine the number of characters in the user question as the second character count.

[0152] Step S470: Based on the ratio of the second character count to the first character count and the preset similarity adjustment parameter, adjust the semantic similarity of any target candidate question to obtain a new semantic similarity for that target candidate question.

[0153] Step S480: Based on the new semantic similarity corresponding to each target candidate question, select the target question that matches the user's question from among the candidate questions.

[0154] It should be noted that the specific limitations of the above steps can be found in the limitations of the question-and-answer matching method described above.
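Steps S410 to S480 can be strung together in one compact sketch. Several things here are assumptions for illustration only: questions arrive already segmented into word lists (the segmentation tool is not specified), the thesaurus is a set of unordered word pairs, the first word containing a character is taken as "the word in which the character is located", and all function names are hypothetical.

```python
from typing import FrozenSet, List, Optional, Set


def word_of(tokens: List[str], char: str) -> Optional[str]:
    """Return the first segmented word containing `char` (assumed tie-break)."""
    return next((token for token in tokens if char in token), None)


def first_character_count(user_tokens: List[str], candidate_tokens: List[str],
                          thesaurus: Set[FrozenSet[str]]) -> int:
    """Count shared characters whose words differ and are not synonyms (S430-S460)."""
    shared = set("".join(user_tokens)) & set("".join(candidate_tokens))
    count = 0
    for char in shared:
        user_word = word_of(user_tokens, char)
        candidate_word = word_of(candidate_tokens, char)
        is_synonym = frozenset((user_word, candidate_word)) in thesaurus
        if user_word != candidate_word and not is_synonym:
            count += 1
    return count


def match_question(user_tokens: List[str], candidates: List[List[str]],
                   scores: List[float], thesaurus: Set[FrozenSet[str]],
                   a: float, theta: float) -> str:
    """Adjust each candidate's score (S470) and pick the target question (S480)."""
    user_text = "".join(user_tokens)
    best_text, best_score = "", float("-inf")
    for tokens, score in zip(candidates, scores):
        text = "".join(tokens)
        if text != user_text:  # complete matches are never adjusted
            first = first_character_count(user_tokens, tokens, thesaurus)
            if first > 0:
                score = score / (len(user_text) / first + a)
        if score > best_score:
            best_text, best_score = text, score
    return best_text if best_score > theta else ""
```

With the placeholder user question ["ab", "cd"], candidates ["ab", "ce"] (score 0.9) and ["xy", "z"] (score 0.6), an empty thesaurus, A = 0.5 and Theta = 0.5, the character "c" sits in the different words "cd" and "ce", so the first candidate drops to 0.9 / (4 / 1 + 0.5) = 0.2 and "xyz" is returned. If the thesaurus lists "cd" and "ce" as synonyms, no adjustment is made and "abce" is returned.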

[0155] It should be understood that although the steps in the flowcharts of the embodiments described above are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the embodiments described above may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0156] Based on the same inventive concept, this application also provides a question-and-answer matching device for implementing the question-and-answer matching method described above. The solution provided by this device is similar to the solution described in the above method; therefore, the specific limitations in one or more question-and-answer matching device embodiments provided below can be found in the limitations of the question-and-answer matching method described above, and will not be repeated here.

[0157] In one embodiment, as shown in Figure 5, a question-and-answer matching device is provided, including: an acquisition module 510, a first filtering module 520, a determination module 530, an adjustment module 540, and a second filtering module 550, wherein:

[0158] The acquisition module 510 is used to acquire the user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question satisfies a preset matching degree condition.

[0159] The first filtering module 520 is used to filter out target candidate questions from each of the candidate questions; the target candidate question is a candidate question that has at least one character in common with the user question.

[0160] The determining module 530 is configured to, for any identical character corresponding to any target candidate question, determine the word in the user question containing the identical character as a user word, and determine the word in the target candidate question containing the identical character as a candidate word.

[0161] The adjustment module 540 is used to adjust the semantic similarity between any target candidate question and the user question based on the difference between the semantics represented by the user word and the semantics represented by the candidate word when the user word and the candidate word are different, so as to obtain a new semantic similarity corresponding to the target candidate question.

[0162] The second filtering module 550 is used to filter out target questions that match the user question from among the candidate questions based on the new semantic similarity corresponding to each target candidate question; the answer corresponding to the target question is the answer that matches the user question.

[0163] In one embodiment, the adjustment module 540 is specifically configured to: determine a query word pair based on the candidate words and the user words; if the query word pair is not found in the thesaurus, determine that the semantics represented by the user words and the semantics represented by the candidate words are different; if the semantics represented by the user words and the semantics represented by the candidate words are different, adjust the semantic similarity corresponding to any target candidate question to obtain a new semantic similarity corresponding to any target candidate question.
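The thesaurus lookup in [0163] can be sketched as follows. The storage format of the thesaurus is not described in this section, so representing it as a set of unordered pairs is an assumption, as are the function names and the English placeholder words.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def build_thesaurus(pairs: Iterable[Tuple[str, str]]) -> Set[FrozenSet[str]]:
    """Store each synonym pair as a frozenset so lookup ignores word order."""
    return {frozenset(pair) for pair in pairs}


def semantics_differ(user_word: str, candidate_word: str,
                     thesaurus: Set[FrozenSet[str]]) -> bool:
    """True when the words differ and the query word pair is not in the thesaurus."""
    if user_word == candidate_word:
        return False
    return frozenset((user_word, candidate_word)) not in thesaurus
```

Using an unordered pair means the query (user word, candidate word) finds an entry regardless of the order in which the synonyms were recorded.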

[0164] In one embodiment, the adjustment module 540 is specifically configured to: take, among the identical characters corresponding to any target candidate question, those whose corresponding user word and candidate word are different and represent different semantics as the target identical characters between the target candidate question and the user question; determine the total number of the target identical characters between the target candidate question and the user question as a first character count, and determine the number of characters in the user question as a second character count; adjust the semantic similarity of the target candidate question according to the ratio of the second character count to the first character count and a preset similarity adjustment parameter, to obtain a new semantic similarity for the target candidate question; the new semantic similarity is less than the original semantic similarity.

[0165] In one embodiment, the similarity adjustment parameter is a constant with a value greater than zero; the adjustment module 540 is specifically used to determine the ratio of the second character count to the first character count, and the sum of the ratio and the preset similarity adjustment parameter; and to determine the quotient of the semantic similarity corresponding to any target candidate question and the sum, to obtain a new semantic similarity corresponding to that target candidate question; the new semantic similarity is positively correlated with the first character count.

[0166] In one embodiment, the determining module 530 is specifically configured to perform word segmentation processing on the user question and any target candidate question respectively, to obtain the segmented user question and the segmented target candidate question; to determine the word containing any identical character in the segmented user question, to obtain the user word; and to determine the word containing any identical character in the segmented target candidate question, to obtain the candidate word.

[0167] In one embodiment, the first filtering module 520 is specifically used to filter out incompletely matching candidate questions from each of the candidate questions; the incompletely matching candidate questions are candidate questions whose corresponding strings are not completely equal to the strings corresponding to the user questions; among the incompletely matching candidate questions, incompletely matching candidate questions that have the same characters as the user questions are filtered out as the target candidate questions.

[0168] In one embodiment, the second filtering module 550 is specifically configured to, based on the new semantic similarity corresponding to each of the target candidate questions and the semantic similarity corresponding to each of the remaining candidate questions in each of the candidate questions, select the candidate question with the highest corresponding semantic similarity as the candidate question to be compared; each of the remaining candidate questions refers to the candidate questions other than each of the target candidate questions; if the semantic similarity corresponding to the candidate question to be compared is greater than a preset similarity threshold, the candidate question to be compared is selected as the target question.

[0169] The modules in the aforementioned question-and-answer matching device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the operations corresponding to each module.

[0170] In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 6. This computer device includes a processor, memory, input / output (I / O) interfaces, and a communication interface. The processor, memory, and I / O interfaces are connected via a system bus, and the communication interface is also connected to the system bus via the I / O interfaces. The processor provides computational and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides the environment for running the operating system and computer programs stored in the non-volatile storage media. The database stores standard questions, similar questions, standard answers, and a list of synonyms. The I / O interfaces are used for exchanging information between the processor and external devices. The communication interface is used for communicating with external terminals via a network connection. When executed by the processor, the computer program implements a question-and-answer matching method.

[0171] Those skilled in the art will understand that the structure shown in Figure 6 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0172] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.

[0173] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the steps in the above method embodiments.

[0174] In one embodiment, a computer program product is provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.

[0175] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data shall comply with the relevant laws, regulations and standards of the relevant countries and regions.

[0176] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0177] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0178] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A question-and-answer matching method, characterized in that, The method includes: Obtain the user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition; Target candidate questions are selected from the candidate questions; the target candidate questions are those that have at least one character in common with the user questions. For any identical character corresponding to any of the target candidate questions, determine the word in which the identical character is located in the user question as the user word, and determine the word in which the identical character is located in any of the target candidate questions as the candidate word; When the user word and the candidate word are different, the semantic similarity between any target candidate question and the user question is adjusted according to the difference between the semantics represented by the user word and the semantics represented by the candidate word, so as to obtain a new semantic similarity corresponding to any target candidate question; Based on the new semantic similarity corresponding to each of the target candidate questions, a target question that matches the user question is selected from the candidate questions; the answer corresponding to the target question is the answer that matches the user question. 
Wherein, when the user word and the candidate word are different, adjusting the semantic similarity between the target candidate question and the user question based on the difference between the semantics represented by the user word and the semantics represented by the candidate word, to obtain a new semantic similarity corresponding to the target candidate question, includes: Based on the candidate words and the user words, determine the word tuple to be queried; If the query term pair is not found in the thesaurus, it is determined that the semantics represented by the user term and the semantics represented by the candidate term are different. Among the identical characters corresponding to any target candidate question, those whose corresponding user words and corresponding candidate words are different, and whose semantic representations are different from those of the corresponding candidate words, are taken as the target identical characters between any target candidate question and the user question; The total number of identical characters between any target candidate question and the user question is determined as the first character count, and the number of characters in the user question is determined as the second character count; The ratio between the number of the second character and the number of the first character is determined, and the sum of this ratio and the preset similarity adjustment parameter is calculated; the similarity adjustment parameter is a constant with a corresponding value greater than zero. The quotient between the semantic similarity corresponding to any target candidate question and the sum is determined to obtain a new semantic similarity corresponding to any target candidate question; the new semantic similarity is less than the semantic similarity, and the new semantic similarity is proportional to the number of the first character.

2. The method according to claim 1, characterized in that, The step of determining, for any identical character corresponding to any of the target candidate questions, the word in which the identical character appears in the user question as a user word, and determining the word in which the identical character appears in any of the target candidate questions as a candidate word, includes: The user question and any target candidate question are segmented into words respectively to obtain the segmented user question and the segmented target candidate question. In the user's question after word segmentation, determine the word containing any identical character to obtain the user's word; Furthermore, the word containing any identical character is determined in the target candidate question after word segmentation, thereby obtaining the candidate word.

3. The method according to claim 1, characterized in that, The step of filtering out the target candidate question from each of the candidate questions includes: Among the candidate questions, incomplete matching candidate questions are selected; the incomplete matching candidate questions are those whose corresponding strings are not exactly equal to the strings corresponding to the user questions. Among the incompletely matched candidate questions, those that share the same characters as the user's question are selected as the target candidate question.

4. The method according to claim 1, characterized in that, The step of filtering out the target question that matches the user's question from among the candidate questions based on the new semantic similarity corresponding to each of the target candidate questions includes: Based on the new semantic similarity corresponding to each target candidate question and the semantic similarity corresponding to each of the remaining candidate questions, the candidate question with the highest corresponding semantic similarity is selected from each of the candidate questions and used as the candidate question to be compared; each of the remaining candidate questions refers to the candidate questions other than each target candidate question. If the semantic similarity of the candidate question to be compared is greater than a preset similarity threshold, the candidate question to be compared is taken as the target question.

5. A question-and-answer matching device, characterized in that, The device includes: The acquisition module is used to acquire a user's question and multiple candidate questions corresponding to the user's question; the literal matching degree between each candidate question and the user's question meets a preset matching degree condition; The first filtering module is used to filter out target candidate questions from each of the candidate questions; the target candidate question is a candidate question that has at least one character in common with the user question. The determination module is used to determine, for any identical character corresponding to any target candidate question, the word in the user question containing the identical character as the user word, and to determine the word in the target candidate question containing the identical character as the candidate word; The adjustment module is used to adjust the semantic similarity between any target candidate question and the user question based on the difference between the semantics represented by the user word and the semantics represented by the candidate word when the user word and the candidate word are different, so as to obtain a new semantic similarity corresponding to any target candidate question; The second filtering module is used to filter out target questions that match the user's question from among the candidate questions based on the new semantic similarity corresponding to each target candidate question; the answer corresponding to the target question is the answer that matches the user's question; The adjustment module is specifically used for: Based on the candidate words and the user words, determine the word tuple to be queried; If the query term pair is not found in the thesaurus, it is determined that the semantics represented by the user term and the semantics represented by the candidate term are different. 
Among the identical characters corresponding to any target candidate question, those whose corresponding user words and corresponding candidate words are different, and whose semantic representations are different from those of the corresponding candidate words, are taken as the target identical characters between any target candidate question and the user question; The total number of identical characters between any target candidate question and the user question is determined as the first character count, and the number of characters in the user question is determined as the second character count; The ratio between the number of the second character and the number of the first character is determined, and the sum of this ratio and the preset similarity adjustment parameter is calculated; the similarity adjustment parameter is a constant with a corresponding value greater than zero. The quotient between the semantic similarity corresponding to any target candidate question and the sum is determined to obtain a new semantic similarity corresponding to any target candidate question; the new semantic similarity is less than the semantic similarity, and the new semantic similarity is proportional to the number of the first character.

6. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, When the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 4.

7. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 4.

8. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 4.

Citation Information

Patent Citations

CN111259144A: Multi-model fusion text matching method and device, equipment and storage medium
CN111898643A: Semantic matching method and device
CN114936272A: Question and answer method and system
