Question-answering method and apparatus, electronic device, and storage medium based on multi-source knowledge
By using a multi-source knowledge question-answering method that combines high-dimensional vector retrieval with answer generation by a large language model, the trade-off between flexibility and cost in in-vehicle assistant systems is resolved, yielding a fast-response, low-cost question-answering system.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHONGQING SELIS PHOENIX INTELLIGENT INNOVATION TECH CO LTD
- Filing Date
- 2026-02-26
- Publication Date
- 2026-06-30
AI Technical Summary
Existing car assistant systems cannot combine flexibility and low system cost, cannot cope with user personalization or long-tail issues, and have high computing power consumption and high response latency, making it difficult to adapt to vehicle function iterations or changes in user needs.
We employ a question-answering method based on multi-source knowledge. By converting the query text into a high-dimensional vector, we first search in the question-answering vector library. If no match is found, we perform a mixed search in the document fragment library and combine it with a large language model to generate the answer.
It achieves low system cost with fast response in the question-answering vector library, and retrieves answers from the document fragment library when no match can be found, achieving a combination of flexibility and low cost.
Figure CN122309652A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of intelligent interaction technology, and in particular to a question-answering method and apparatus, electronic device and storage medium based on multi-source knowledge.
Background Technology
[0002] With the increasing intelligence and connectivity of automobiles, users' consultation needs for car-related issues are becoming increasingly diversified, covering multiple scenarios such as vehicle operation, function instructions, troubleshooting, and maintenance. Existing car assistants typically employ two methods: First, retrieval-based question-and-answer (QA) systems, which construct fixed "question-answer" pairs through manual compilation or expert experience. After a user asks a question, the system directly returns a preset answer through keyword or simple semantic matching, primarily used for basic function consultation scenarios and brand-customized interaction scenarios. Second, document retrieval-enhanced QA systems, which mainly fragment and vectorize user manuals, troubleshooting guides, and other documents to build a knowledge base. When a user asks a question, relevant document fragments are first retrieved, and then the information is input into a large language model to generate an answer, covering more complex consultation needs.
[0003] However, existing car assistant technology has the following drawbacks: (1) Retrieval-based question answering built on pre-constructed question-answer pairs: knowledge coverage is limited, and the system can only respond to high-frequency questions preset by humans. It cannot cope with users' personalized or long-tail questions. New questions must be added manually one by one, which makes it difficult to keep up with vehicle function iterations or changes in user needs, so flexibility is poor. (2) Question answering based on document retrieval enhancement: it suffers from high computing power consumption, high response latency, and unstable answers. Moreover, the generation model must still be called repeatedly for high-frequency questions, resulting in high system cost and making the approach unsuitable for in-vehicle real-time interaction scenarios. Therefore, the related art has the technical problem that flexibility and low system cost cannot be achieved at the same time.
Summary of the Invention
[0004] This application provides a question-answering method and apparatus, electronic device and storage medium based on multi-source knowledge, to at least solve the technical problem in the related art that it is impossible to have both flexibility and low system cost.
[0005] According to one aspect of the embodiments of this application, a question-answering method based on multi-source knowledge is provided, including: obtaining query text from the target user; converting the query text into a high-dimensional vector representation to obtain the query vector; using the query vector to search sequentially in the question-and-answer vector library and the document fragment library to obtain the target answer corresponding to the query text, wherein the question-and-answer vector library and the document fragment library are knowledge retrieval objects from different sources; and returning the target answer to the target user.
[0006] Optionally, as described above, the step of sequentially searching the question-and-answer vector library and the document fragment library using the query vector to obtain the target answer corresponding to the query text includes: based on the query vector, performing similarity matching retrieval in the constructed question-and-answer vector library; if a target question-and-answer pair with a similarity greater than or equal to a preset threshold is found in the question-and-answer vector library, taking the answer of the target question-and-answer pair as the target answer; if no target question-and-answer pair with a similarity greater than or equal to the preset threshold is found in the question-and-answer vector library, performing a hybrid search in the document fragment library based on the query text and the query vector to obtain the target answer.
[0007] Optionally, as described above, the step of performing a mixed retrieval in the document fragmentation database based on the query text and the query vector to obtain the target answer includes: Using the query text and the query vector as input, a mixed search is performed in the document segmentation library to obtain the first correlation between each candidate document segment in the document segmentation library and the query text, and the second correlation between each candidate document segment and the query vector; Based on the first correlation and the second correlation, the K target document fragments with the highest correlation to the query text are determined from all candidate document fragments; The query text and the K target document fragments are input as prompts into the target large language model to obtain the target answer corresponding to the query text.
[0008] Optionally, as described above, the method further includes: The target document is parsed to obtain a parsing result containing the logical hierarchy representation of the target document, wherein the target document is an unstructured document related to the target product; The parsing result is divided into multiple candidate document fragments according to the logical hierarchy representation and the preset fragment length. Semantically encode the text content of each candidate document fragment using the target language model to obtain a document fragment vector corresponding to each candidate document fragment; The candidate document fragments and their corresponding document fragment vectors are stored in a vector database. The step of converting the query text into a high-dimensional vector representation to obtain the query vector includes: The query text is converted into a high-dimensional vector representation using the target language model to obtain the query vector.
[0009] Optionally, as described above, the method further includes: The target document is parsed to obtain a parsing result containing the logical hierarchy representation of the target document, wherein the target document is an unstructured document related to the target product; According to the logical hierarchy and the preset segment length, the parsing result is segmented to obtain multiple semantic segment texts; Each semantic segment text is segmented into words to obtain a segmented word sequence corresponding to each semantic segment text, wherein each segmented word sequence contains multiple word terms; The word segmentation sequences corresponding to the multiple semantically segmented texts are combined to obtain a global corpus set; Based on the global corpus, the keywords of each semantic text segment are determined; Based on the keywords of each semantic text segment, generate usable question-answer pairs.
[0010] Optionally, as described above, determining the keywords for each semantic segment of text based on the global corpus includes: Calculate the term frequency and inverse document frequency for each term; The TF-IDF value of each term is determined based on the term frequency and inverse document frequency corresponding to each term. All terms in each semantic segment text are sorted according to the TF-IDF value of each term in each semantic segment text, and the K terms with the highest TF-IDF values are determined, where K is an integer greater than or equal to 1; The K terms are determined as keywords for each semantic segment of text.
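The keyword selection described above can be illustrated with a minimal Python sketch. It assumes each semantic segment has already been tokenized into a list of terms, and it uses the plain variant idf = log(N / df), which the application does not specify; the function name `top_k_keywords` and the sample terms are hypothetical.

```python
import math
from collections import Counter


def top_k_keywords(segments, k=3):
    """Return the k highest TF-IDF terms for each tokenized segment.

    segments: list of token lists, one list per semantic segment.
    Assumption: idf = log(N / df); other smoothing variants are equally valid.
    """
    n = len(segments)
    # Document frequency: number of segments that contain each term.
    df = Counter()
    for tokens in segments:
        df.update(set(tokens))

    keywords = []
    for tokens in segments:
        tf = Counter(tokens)
        total = len(tokens)
        scores = {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
        # Sort by descending score; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        keywords.append(ranked[:k])
    return keywords


segments = [
    ["seat", "heating", "seat", "button"],
    ["tire", "pressure", "button"],
    ["tire", "pressure", "warning", "tire"],
]
keywords = top_k_keywords(segments, k=2)
```

In this toy corpus, terms that appear in only one segment (such as "seat" or "warning") outrank terms shared across segments, which is the behaviour the TF-IDF ranking is meant to provide.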
[0011] Optionally, as described above, generating usable question-answer pairs based on keywords of each semantically segmented text includes: A prompt word template for generating question-answer pairs is constructed; The semantic segmented text, the keywords of the semantic segmented text, the logical hierarchy of the semantic segmented text in the target document, and the prompt word template are combined to obtain the assembled information; The assembly information is input into the target large language model to obtain all candidate question-answer pairs corresponding to the semantic segmented text output by the target large language model; The candidate question-answer pairs are deduplicated to obtain the usable question-answer pairs.
[0012] Optionally, as described above, the method further includes: Obtain the user's interaction log, wherein the interaction log includes: the original natural language historical query text and the historical feedback corresponding to the historical query text; All historical query texts are clustered according to semantics to obtain at least one semantic cluster, wherein the question intent corresponding to each historical query text in each semantic cluster meets the preset similarity requirements; Generate the target answer corresponding to each semantic cluster; For each semantic cluster, the standard question template corresponding to the semantic cluster and the target answer corresponding to the semantic cluster are used to generate the available question-answer pair corresponding to the semantic cluster.
[0013] According to another aspect of the embodiments of this application, a question-answering device based on multi-source knowledge is also provided, characterized in that it includes: The acquisition module is used to acquire query text from the target user; The conversion module is used to convert the query text into a high-dimensional vector representation to obtain the query vector. The retrieval module is used to sequentially search in the question-and-answer vector library and the document fragment library using the query vector to obtain the target answer corresponding to the query text, wherein the question-and-answer vector library and the document fragment library are knowledge retrieval objects from different sources; The return module is used to return the target answer to the target user.
[0014] According to another aspect of the embodiments of this application, an electronic device is also provided, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; wherein the memory is used to store a computer program; and the processor is used to execute the method steps of any of the above embodiments by running the computer program stored in the memory.
[0015] According to another aspect of the embodiments of this application, a computer-readable storage medium is also provided, wherein a computer program is stored therein, wherein the computer program is configured to execute the method steps of any of the above embodiments when running.
[0016] In this embodiment, the following approach is adopted: obtaining query text from a target user; converting the query text into a high-dimensional vector representation to obtain a query vector; using the query vector, sequentially searching in a question-and-answer vector library and a document fragmentation library to obtain the target answer corresponding to the query text, wherein the question-and-answer vector library and the document fragmentation library are knowledge retrieval objects from different sources; and returning the target answer to the target user. By first searching in the question-and-answer vector library and then in the document fragmentation library to obtain the answer, the aim of low system cost can be achieved when the answer can be obtained from the question-and-answer vector library, and the answer can be obtained from the document fragmentation library when the answer cannot be obtained from the question-and-answer vector library, thereby achieving the goal of flexibility. This achieves the technical effect of combining flexibility and low system cost, thus solving the technical problem in related technologies that cannot simultaneously achieve flexibility and low system cost.
Attached Figure Description
[0017] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.
[0018] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, for those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0019] Figure 1 is a schematic diagram of the hardware environment for an optional question-answering method based on multi-source knowledge according to an embodiment of this application;
Figure 2 is a flowchart of an optional question-answering method based on multi-source knowledge according to an embodiment of this application;
Figure 3 is a flowchart of another optional question-answering method based on multi-source knowledge according to an embodiment of this application;
Figure 4 is a flowchart of another optional question-answering method based on multi-source knowledge according to an embodiment of this application;
Figure 5 is a structural block diagram of an optional question-answering device based on multi-source knowledge according to an embodiment of this application;
Figure 6 is a structural block diagram of an optional electronic device according to an embodiment of this application.
Detailed Implementation
[0020] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present application, and not all embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort should fall within the scope of protection of the present application.
[0021] It should be noted that the terms "first," "second," etc., in the specification, claims, and accompanying drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that such data can be interchanged where appropriate so that the embodiments of this application described herein can be implemented in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having," and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or apparatus that comprises a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to such processes, methods, products, or apparatus.
[0022] According to one aspect of the embodiments of this application, a question-answering method based on multi-source knowledge is provided. Optionally, in this embodiment, the method can be applied to the hardware environment shown in Figure 1, which consists of terminal 1402 and server 1404. As shown in Figure 1, server 1404 is connected to terminal 1402 via a network and can be used to provide services (such as game services, application services, etc.) to the terminal or to clients installed on the terminal. A database can be set up on the server, or independently of the server, to provide data storage services for server 1404.
[0023] The aforementioned network may include, but is not limited to, at least one of the following: wired network, wireless network. The aforementioned wired network may include, but is not limited to, at least one of the following: wide area network, metropolitan area network, local area network. The aforementioned wireless network may include, but is not limited to, at least one of the following: Wi-Fi (Wireless Fidelity), Bluetooth. The terminal may be, but is not limited to, a PC, mobile phone, tablet computer, etc.
[0024] The question-answering method based on multi-source knowledge in this application can be executed by a server, a terminal, or both. Specifically, the execution of the question-answering method based on multi-source knowledge in this application can also be performed by a client installed on the terminal.
[0025] Taking execution by the client as an example, Figure 2 is a flowchart of a question-answering method based on multi-source knowledge provided by an embodiment of this application. The method includes the following steps: Step S202: Obtain the query text from the target user.
[0026] The question-answering method based on multi-source knowledge in this embodiment can be applied to scenarios on the vehicle side where it is necessary to respond to user questions and provide specific answers, such as scenarios where it is necessary to answer questions about how to use a certain device, troubleshooting a certain device, or maintaining a certain device. It can also be used to answer other types of questions, which will not be listed here.
[0027] Specifically, when a target user initiates a natural language query to the car assistant, the query text can be obtained through an interactive interface or voice capture. In other words, this query text is text information used for natural language queries.
[0028] Step S204: Convert the query text into a high-dimensional vector representation to obtain the query vector.
[0029] Specifically, after obtaining the query text, it can be cleaned and normalized, for example, including but not limited to removing irrelevant characters, correcting obvious spelling errors, and performing word segmentation. Then, using the same target language model as when building the document fragmentation library, the preprocessed query text is converted into a high-dimensional vector representation to obtain the query vector corresponding to the query text.
[0030] Step S206: Using the query vector, the system sequentially searches the question-and-answer vector library and the document fragmentation library to obtain the target answer corresponding to the query text.
[0031] Specifically, the question-and-answer vector library and the document fragment library are knowledge retrieval objects from different sources. After obtaining the query vector, a search can first be performed in the question-and-answer vector library using the query vector. If a question-and-answer pair with a similarity greater than or equal to a preset threshold is found, the answer in that pair is used as the target answer corresponding to the query text. If no question-and-answer pair with a similarity greater than or equal to the preset threshold is found, a search is performed in the document fragment library based on the query vector to obtain the target answer corresponding to the query text.
[0032] Step S208: Return the target answer to the target user.
[0033] Specifically, after obtaining the target answer, it can be returned to the target user through display on the interactive interface or voice playback.
[0034] In this embodiment, the method involves obtaining query text from a target user; converting the query text into a high-dimensional vector representation to obtain a query vector; sequentially searching the question-and-answer vector library and the document fragment library using the query vector to obtain the target answer corresponding to the query text; and returning the target answer to the target user. By first searching the question-and-answer vector library and then the document fragment library to obtain the answer, this approach achieves low system cost when the answer can be obtained from the question-and-answer vector library, and obtains the answer from the document fragment library when the answer cannot be obtained from the question-and-answer vector library, thus achieving flexibility. This approach achieves the technical effect of combining flexibility and low system cost, thereby solving the technical problem in related technologies that cannot simultaneously achieve flexibility and low system cost.
[0035] As an optional implementation, the method described above can be implemented through the following steps: Step S206 involves sequentially searching the question-answer pair library and the document fragment library using the query vector to obtain the target answer corresponding to the query text. Based on the query vector, similarity matching is performed in the pre-built question-and-answer vector library. If a target question-and-answer pair with a similarity greater than or equal to a preset threshold is selected from the library, its answer is taken as the target answer. In other words, this step involves searching within the question-and-answer vector library (i.e., the QA library). Based on the pre-processed query vector, similarity matching is performed in the pre-built library. QA pairs with a similarity ≥ a preset threshold (e.g., 0.9) are selected as exact matches. Furthermore, if multiple question-and-answer pairs have a similarity ≥ the preset threshold, the pair with the highest similarity is selected as the target pair. After identifying the target pair, its answer is taken as the target answer.
[0036] If no target question-answer pair with a similarity greater than or equal to a preset threshold is found in the question-answer vector library, a hybrid search is performed in the document fragment library based on the query text and query vector to obtain the target answer. In other words, if no question-answer pair has a similarity greater than or equal to the preset threshold, the document fragment library is searched next. Using the aforementioned query text and the query vector obtained from it, a hybrid search is performed in the document fragment library to obtain the target answer.
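The two-stage lookup described in the preceding paragraphs can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes cosine similarity as the similarity measure, a brute-force scan instead of a vector index, and a caller-supplied `fallback` function standing in for the hybrid search plus large language model; the dictionary keys `vector` and `answer` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def answer_query(query_vec, qa_library, fallback, threshold=0.9):
    """Return (answer, source). The QA library is tried first.

    qa_library: list of {"vector": [...], "answer": str} entries (hypothetical layout).
    fallback:   zero-argument callable used when no pair reaches the threshold.
    """
    best_score, best_answer = -1.0, None
    for entry in qa_library:
        score = cosine(query_vec, entry["vector"])
        if score > best_score:  # keep the single most similar pair
            best_score, best_answer = score, entry["answer"]
    if best_answer is not None and best_score >= threshold:
        return best_answer, "qa_library"
    return fallback(), "document_fragments"


qa_library = [
    {"vector": [1.0, 0.0], "answer": "Press the seat-heating button."},
    {"vector": [0.0, 1.0], "answer": "Check the tire pressure page."},
]
hit = answer_query([0.99, 0.05], qa_library, fallback=lambda: "generated answer")
miss = answer_query([1.0, 1.0], qa_library, fallback=lambda: "generated answer")
```

Because the fallback is only invoked on a miss, high-frequency questions that match a stored pair never reach the generation model, which is where the cost saving described above comes from.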
[0037] As an optional implementation, the method described above can perform a hybrid retrieval in the document fragment library based on the query text and query vector to obtain the target answer through the following steps: Using the query text and query vector as input, a hybrid retrieval is performed in the document fragment library to obtain the first relevance between each candidate document fragment and the query text, and the second relevance between each candidate document fragment and the query vector. Specifically, using the query text and its corresponding vector as input, a hybrid retrieval based on sparse and dense retrieval techniques is performed in the document fragment library to return a list of the document fragments most relevant to the query. Sparse retrieval is based on the BM25 algorithm and uses the query text to search over the fragment texts; dense retrieval is based on approximate nearest neighbor search and uses the query vector to search over the fragment vectors. Through sparse and dense retrieval, the first relevance between each candidate document fragment and the query text, and the second relevance between each candidate document fragment and the query vector, can be obtained.
[0038] Based on the first and second relevance, the K target document fragments with the highest relevance to the query text are identified from all candidate document fragments. Specifically, the first and second relevance are expressed in the same representation so that they are comparable, and then all first and second relevances are sorted to obtain the top-K document fragments with the highest relevance, which are used as the target document fragments.
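One possible way to combine the two relevance scores is sketched below. The application only states that the two relevances are put into the same representation and sorted; the min-max normalization, the weighted sum, and the weight `alpha` used here are assumptions chosen for illustration, and the fragment identifiers are hypothetical.

```python
def min_max(scores):
    """Rescale a {fragment_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {cid: 1.0 for cid in scores}
    return {cid: (v - lo) / (hi - lo) for cid, v in scores.items()}


def fuse_top_k(sparse, dense, k, alpha=0.5):
    """Return the ids of the k fragments with the highest fused relevance.

    sparse: {fragment_id: BM25-style score} (first relevance)
    dense:  {fragment_id: vector similarity} (second relevance)
    alpha:  assumed weight of the sparse score in the weighted sum.
    """
    s = min_max(sparse) if sparse else {}
    d = min_max(dense) if dense else {}
    fused = {
        cid: alpha * s.get(cid, 0.0) + (1 - alpha) * d.get(cid, 0.0)
        for cid in set(s) | set(d)
    }
    # Sort by descending fused score; ties are broken by id for determinism.
    return sorted(fused, key=lambda cid: (-fused[cid], cid))[:k]


sparse = {"c1": 12.0, "c2": 3.0, "c3": 0.0}
dense = {"c1": 0.2, "c2": 0.8, "c4": 0.5}
top = fuse_top_k(sparse, dense, k=2)
```

Rank-based schemes such as reciprocal rank fusion are a common alternative when the raw score distributions of the two retrievers are hard to compare.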
[0039] The query text and K target document fragments are input as prompts into the target large language model to obtain the target answer corresponding to the query text.
[0040] Specifically, the query text and the retrieved document fragments can be organized into a prompt, which is then input into the instruction-fine-tuned target large language model (LLM). The answer generated by the target large language model is then returned to the user.
[0041] Optionally, a basic prompt template design example is as follows:
You are a professional in-car assistant. Please answer the user's questions based on the following reference information.
[0042] If the information provided is sufficient to answer the question, please generate an accurate, concise, and user-friendly answer.
[0043] If the reference information is insufficient or vague, please clearly inform the user that the exact answer cannot be found from the known materials, and suggest that they consult the official manual or contact customer service.
[0044] User question: {query text}.
[0045] Reference information: {Document Segment List}
Please generate an answer based on the above reference information.
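Filling the template above with a query and the retrieved fragments can be sketched as follows. The placeholder names `query_text` and `fragments`, the function `build_prompt`, and the numbering of fragments are illustrative assumptions; the template wording follows the example above.

```python
PROMPT_TEMPLATE = (
    "You are a professional in-car assistant. Please answer the user's "
    "questions based on the following reference information.\n"
    "If the information provided is sufficient to answer the question, please "
    "generate an accurate, concise, and user-friendly answer.\n"
    "If the reference information is insufficient or vague, please clearly "
    "inform the user that the exact answer cannot be found from the known "
    "materials, and suggest that they consult the official manual or contact "
    "customer service.\n"
    "User question: {query_text}\n"
    "Reference information:\n{fragments}\n"
    "Please generate an answer based on the above reference information."
)


def build_prompt(query_text, fragments):
    """Insert the query and the K retrieved fragments into the template."""
    # Numbering the fragments is an assumption that keeps them distinguishable.
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(fragments, start=1))
    return PROMPT_TEMPLATE.format(query_text=query_text, fragments=numbered)


prompt = build_prompt(
    "How do I turn on seat heating?",
    ["Seat heating is controlled from the climate panel.", "Heating has three levels."],
)
```

The resulting string would then be passed to whichever large language model interface is in use; that call is omitted here because the application does not name one.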
[0046] As shown in Figure 3, as an optional implementation, the method described above further includes the following steps: Step S302: Parse the target document to obtain a parsing result containing the logical hierarchy representation of the target document, wherein the target document is an unstructured document related to the target product.
[0047] Specifically, the target documents parsed using the document parsing library can include, but are not limited to, unstructured documents such as user manuals, function specifications, and maintenance guides. After parsing the target document, the chapter structure, heading levels, paragraph content, and related metadata are extracted to form a logical hierarchical representation of the document. This enhanced metadata preserves contextual semantics during subsequent sharding and retrieval processes.
[0048] Step S304: The parsing result is divided into multiple candidate document fragments according to the logical hierarchy and the preset fragment length.
[0049] Optionally, before segmenting the parsed results, pre-cleaning processing can be performed, such as removing headers, footers, watermarks, blank lines, and special symbols (e.g., ©, ®, etc.) from the document, and standardizing the text encoding to UTF-8. After pre-cleaning, the parsed results can be segmented according to the aforementioned logical hierarchy and preset segment length. The logical hierarchy can be the aforementioned chapter structure, heading level, or paragraph content, and the preset segment length can be a pre-defined maximum token limit or maximum character count, etc.
[0050] Specifically, a hybrid approach can be adopted, primarily using recursive sharding and supplemented by multiple strategies, to control the shard length while ensuring semantic integrity. The specific process is as follows: Step 1: First-level segmentation (i.e., segmenting according to chapters in the chapter structure). Using the parsed heading levels, segment the document by chapter. If the chapter length is less than the maximum token limit (i.e., one of the preset segment lengths), retain the complete chapter as a candidate document segment. If the chapter is too long (i.e., the chapter length is greater than or equal to the maximum token limit), proceed to second-level segmentation.
[0051] Step 2: Second-level segmentation (i.e., segmenting according to paragraphs within the paragraph content). Long chapters are divided by paragraphs or subheadings, and the length of each paragraph (number of characters or tokens) is calculated. Paragraphs exceeding the threshold (i.e., the preset segment length) are further split into sub-paragraphs or sliding window segments (which can overlap by 50 to 100 tokens to ensure contextual continuity). If the paragraph length is less than the preset segment length, the paragraph is taken as a candidate document segment.
[0052] Step 3: Third-level segmentation (sentence / sliding window). If a single paragraph still exceeds the preset segment length, sentence-based sliding window segmentation is used. The window length is kept within the model input limit. The token overlap between each candidate document segment and its preceding and following segments can be set (e.g., 10% to 20%) to maintain semantic continuity.
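The three-level process above can be sketched in simplified form. The sketch makes several assumptions that are not in the application: tokens are approximated by whitespace-separated words, paragraphs are separated by blank lines, and the overlap is expressed as one whole sentence rather than a token count or percentage. The function names are hypothetical.

```python
import re


def n_tokens(text):
    """Assumed token count: whitespace-separated words (a real tokenizer would differ)."""
    return len(text.split())


def sliding_window(sentences, max_tokens, overlap_sentences=1):
    """Level 3: pack whole sentences into windows that overlap by a few sentences."""
    chunks, start = [], 0
    while start < len(sentences):
        end, size = start, 0
        # Always take at least one sentence, even if it alone exceeds the limit.
        while end < len(sentences) and (
            end == start or size + n_tokens(sentences[end]) <= max_tokens
        ):
            size += n_tokens(sentences[end])
            end += 1
        chunks.append(" ".join(sentences[start:end]))
        if end >= len(sentences):
            break
        start = max(end - overlap_sentences, start + 1)  # step back to create overlap
    return chunks


def segment_chapter(chapter_text, max_tokens):
    """Level 1: keep short chapters whole. Level 2: split long ones by paragraph."""
    if n_tokens(chapter_text) < max_tokens:
        return [chapter_text]
    chunks = []
    for para in (p.strip() for p in chapter_text.split("\n\n")):
        if not para:
            continue
        if n_tokens(para) < max_tokens:
            chunks.append(para)
        else:
            # Split after sentence-ending punctuation (ASCII or full-width).
            parts = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", para)
            chunks.extend(sliding_window([s for s in parts if s], max_tokens))
    return chunks


chapter = "Short para one.\n\nAlpha beta gamma. Delta epsilon zeta. Eta theta iota."
chunks = segment_chapter(chapter, max_tokens=7)
```

In this example the second paragraph is too long for one segment, so it is split into two windows that share the middle sentence, which preserves context across the segment boundary.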
[0053] Step S306: Semantically encode the text content of each candidate document segment using the target language model to obtain the document segment vector corresponding to each candidate document segment.
[0054] Specifically, after obtaining the candidate document fragments of the target document, rich metadata can be attached to each fragment to facilitate filtering and tracing. This mainly includes: chunk_id (unique identifier of the candidate document fragment), doc_id (ID of the original document to which the candidate document fragment belongs, which in this embodiment is the ID of the target document), and parent_headings (path of the chapter to which the candidate document fragment belongs).
[0055] A pre-trained target language model based on Transformer (such as BERT, ERNIE-3.0, etc., adapted to Chinese semantic understanding) can be used to semantically encode the text content of each candidate document segment, generating a document segment vector corresponding to each candidate document segment.
[0056] Step S308: Store the candidate document fragments and their corresponding document fragment vectors into the vector database.
[0057] In other words, for any candidate document fragment, after obtaining the document fragment vector corresponding to the candidate document fragment, the generated document fragment vector, the candidate document fragment, and the aforementioned enhanced metadata are all stored in the vector database.
[0058] For example:
{
  "chunk_id": "3b2236e6-3c4a-475f-bce5-1f38fb1e963a_0",
  "doc_id": "3b2236e6-3c4a-475f-bce5-1f38fb1e963a",
  "doc_name": "L1 User Manual",
  "text": "Installation Precautions\nBefore officially installing the L1, please carefully read the following precautions:\n1. Before installation, please clean the optical window with alcohol or a cleaning cloth. During use, also ensure the optical window is clean, as dust or other dirt may affect the L1's scanning performance.\n2. During installation, please be careful not to obstruct its FOV. Even installing a transparent glass plate on the optical window will affect the L1's performance.\n3. The L1 can be installed in any orientation through the mounting holes at the bottom.\n4. The L1's mounting structure only ensures its own reliability; the camera body cannot withstand additional loads.\n5. Please leave sufficient space on all four sides during installation to prevent poor airflow from affecting heat dissipation. ……",
  "vector": [0.06304931640625, 0.039337158203125, -0.01727294921875, 0.01123809814453125, 0.0162811279296875, 0.07147216796875, 0.014556884765625, 0.010650634765625, 0.0086517333984375, 0.008880615234375, -0.060089111328125, 0.00806427001953125, 0.026947021484375, -0.01180267333984375, -0.0205230712890625, -0.042694091796875, 0.07354736328125…],
  "parent_headings": "User Manual -> Installation -> Installation Notes"
}
The aforementioned step S204 converts the query text into a high-dimensional vector representation to obtain the query vector, including: converting the query text into a high-dimensional vector representation using a target language model to obtain the query vector. In other words, the same target language model used to semantically encode the text content of candidate document fragments is used to convert the query text into a high-dimensional vector representation to obtain the query vector.
[0059] As shown in Figure 4, as an optional implementation, the method described above further includes the following steps: Step S402: Parse the target document to obtain a parsing result containing the logical hierarchy representation of the target document, where the target document is an unstructured document related to the target product. Specifically, step S402 of this embodiment can be implemented with reference to the method described in step S302 above, and will not be repeated here.
[0060] Step S404 involves segmenting the parsing result according to the logical hierarchy and the preset segment length to obtain multiple semantic text segments. Specifically, step S404 of this embodiment can be implemented with reference to the method described in step S304 above, and will not be repeated here.
[0061] Step S406: Perform word segmentation on each semantic segment text to obtain a word segmentation sequence corresponding to each semantic segment text, wherein each word segmentation sequence contains multiple word terms.
[0062] Specifically, for each semantic text segment, before word segmentation, newline characters, tabs, special symbols, and garbled characters can be removed to unify the text encoding format, resulting in processed text segments. Then, a Chinese word segmentation algorithm is used to segment the processed text segments. During segmentation, a pre-built domain dictionary is introduced to avoid splitting key domain terms into multiple meaningless terms. After word segmentation and stop word removal, the word segmentation sequence corresponding to each semantic text segment is output.
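The cleaning and dictionary-aware segmentation described above can be sketched as follows. This is only an illustration under stated assumptions: it works on whitespace-separated English text rather than Chinese, and the domain dictionary `DOMAIN_TERMS`, the stop word list `STOP_WORDS`, and the greedy longest-match strategy are hypothetical stand-ins for the Chinese word segmentation algorithm and pre-built domain dictionary that the text refers to.

```python
import re

# Hypothetical domain dictionary: multi-word terms that must stay intact.
DOMAIN_TERMS = {("zoned", "air", "conditioning"), ("tire", "pressure")}
# Hypothetical stop word list.
STOP_WORDS = {"the", "a", "of", "and", "can", "be", "for", "is"}
MAX_TERM_LEN = max(len(term) for term in DOMAIN_TERMS)


def clean(text):
    """Remove newlines, tabs and special symbols, then normalize spacing and case."""
    text = re.sub(r"[\r\n\t]+", " ", text)
    text = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def segment(text):
    """Greedy longest-match segmentation that keeps domain terms whole."""
    words = clean(text).split()
    terms, i = [], 0
    while i < len(words):
        # Try the longest possible domain term first (at least two words).
        for n in range(min(MAX_TERM_LEN, len(words) - i), 1, -1):
            if tuple(words[i:i + n]) in DOMAIN_TERMS:
                terms.append(" ".join(words[i:i + n]))
                i += n
                break
        else:
            # No domain term starts here: keep the single word unless it is a stop word.
            if words[i] not in STOP_WORDS:
                terms.append(words[i])
            i += 1
    return terms
```

Because the dictionary lookup runs before stop word removal, a domain term is emitted as one unit even if some of its component words would otherwise be filtered.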
[0063] Step S408 involves combining all word segmentation sequences corresponding to multiple semantic text segments to obtain a global corpus. In other words, the word segmentation sequences of all semantic text segments are combined to form a global corpus.
[0064] Step S410: Determine the keywords for each semantic segment of text based on the global corpus set.
[0065] As an optional implementation, step S410 of determining the keywords of each semantic segment text based on the global corpus can be implemented through the following steps: The term frequency and inverse document frequency of each term are calculated; the TF-IDF value of each term is then determined from the term frequency and inverse document frequency corresponding to that term. Specifically, the term frequency (TF) and inverse document frequency (IDF) of each term can be calculated as follows:
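[0066] $$\mathrm{TF}(t_i, d_j) = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
[0067] $$\mathrm{IDF}(t_i) = \log\frac{1 + N}{1 + \mathrm{df}(t_i)} + 1$$
[0068] Here, $t_i$ denotes the $i$-th term, $d_j$ denotes the $j$-th semantic segment, $n_{i,j}$ denotes the number of times term $t_i$ appears in semantic segment $d_j$, $\sum_{k} n_{k,j}$ denotes the total number of terms in semantic segment $d_j$, $N$ denotes the total number of semantic segments in the corpus, and $\mathrm{df}(t_i)$ denotes the number of semantic segments in the corpus that contain term $t_i$. The TF-IDF value of a term is the product $\mathrm{TF}(t_i, d_j) \times \mathrm{IDF}(t_i)$.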
[0069] Example: The semantic segmented text reads: "The zoned air conditioning supports independent temperature settings for the driver and passenger seats, and the temperature of each zoned air conditioning unit can be adjusted separately." After word segmentation and removal of stop words, we get: Zoned air conditioning / support / driver and passenger / independent / temperature / setting / zoned air conditioning / separate / adjust / temperature.
[0070] The term frequency (TF) values are shown in Table 1 below (total number of words in this segment = 10):
[0071] Table 1
Term | Occurrences | TF
zoned air conditioning | 2 | 0.2
temperature | 2 | 0.2
support | 1 | 0.1
driver and passenger | 1 | 0.1
independent | 1 | 0.1
setting | 1 | 0.1
separate | 1 | 0.1
adjust | 1 | 0.1
If the total number of semantic segment texts is N = 1000 and the term "zoned air conditioning" appears in 10 of them, then IDF(zoned air conditioning) = log((1+1000) / (1+10)) + 1 = ln 91 + 1 ≈ 4.51 + 1 = 5.51, and the TF-IDF value of "zoned air conditioning" in the above semantic segment text is TF × IDF = 0.2 × 5.51 ≈ 1.102.
[0072] For each semantic segment text, all terms are sorted according to their TF-IDF values, and the K terms with the highest TF-IDF values are identified, where K is an integer greater than or equal to 1. These K terms are then designated as the keywords for each semantic segment text. In other words, for each semantic segment text, the TF-IDF values of all terms in that segment are obtained, sorted from highest to lowest, and the top K terms are selected as the keywords for that segment.
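A minimal sketch of the keyword selection above, assuming the smoothed IDF form log((1 + N) / (1 + df)) + 1 with a natural logarithm, as used in the worked example. The function names `tf`, `idf`, and `top_k_keywords` are illustrative and not taken from the application; ties are broken alphabetically, which is also an assumption.

```python
import math
from collections import Counter


def tf(term_seq):
    """Term frequency: occurrences of each term divided by the segment length."""
    counts = Counter(term_seq)
    total = len(term_seq)
    return {term: count / total for term, count in counts.items()}


def idf(n_segments, doc_freq):
    """Smoothed inverse document frequency (natural log, assumed)."""
    return math.log((1 + n_segments) / (1 + doc_freq)) + 1


def top_k_keywords(segments, k):
    """Return the K highest-scoring terms for each word segmentation sequence."""
    n = len(segments)
    # Document frequency: number of segments that contain each term.
    df = Counter(term for seg in segments for term in set(seg))
    keywords = []
    for seg in segments:
        scores = {term: freq * idf(n, df[term]) for term, freq in tf(seg).items()}
        # Sort by descending TF-IDF; alphabetical order breaks ties.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords.append(ranked[:k])
    return keywords
```

Here `segments` plays the role of the global corpus: a list in which each element is the word segmentation sequence of one semantic segment text.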
[0073] Step S412: Generate usable question-answer pairs based on the keywords of each semantic segment text.
[0074] Specifically, questions and answers can be generated based on the keywords of each semantic segment text, thereby obtaining usable question-and-answer pairs corresponding to each semantic segment text.
[0075] As an optional implementation, step S412 of generating usable question-answer pairs based on the keywords of each semantic segment text can be implemented through the following steps: A prompt template for generating question-answer pairs is constructed. Specifically, the prompt template can be built from the following prompt information: role definition, text fragment content, generation requirements, and output format.
[0076] Example:
# Role
You are a professional QA generation expert, skilled at extracting or generating accurate, comprehensive, and valuable question-and-answer pairs from given document fragments.
[0077] # Document snippet content
[Document Title]: {Insert document title here}
[Chapter Information]: {Insert chapter information here}
[Core Keywords]: {Insert keywords extracted from this segment here}
[Document Content]: {Insert text snippet here}
# Generation Requirements
1. Question Types: Please generate the following three types of questions:
- Factual: Addresses facts, parameters, steps, and definitions explicitly stated in the document. For example: "What is the default value for X?" or "What are the steps to change X?"
- Explanatory: Requires understanding and explaining concepts, principles, and reasons. For example: "Why is it necessary to perform X?" or "How does X work?"
[0078] - Inferential (optional): Simple logical inferences can be made only when there is sufficient information in the document. For example: "What if...?"
[0079] 2. Question Quality: Each question must be a clear, complete, and unambiguous interrogative sentence. Avoid using words with vague references such as "it" or "this".
[0080] 3. Answer Quality: Answers should be accurate, concise, and complete, directly addressing the corresponding question. Answers must be strictly based on the content of the provided document and must not be fabricated.
[0081] 4. Quantity and Distribution: Generate 3-5 question-answer pairs for a given segment. Ensure coverage of different question types, prioritizing content related to the [core keywords].
[0082] # Output Format
Please strictly follow the JSON format in the following example when outputting results; no additional explanation is required.
[ {{ "Question": "What is a latent fault?", "Answer": "Latent faults refer to faults that do exist in the engine computer control system, but there are no obvious symptoms, and the cause is difficult to determine. Its symptoms are that the fault characteristics in the engine computer control system are not obvious; it is usually a hidden state of the fault, deeply concealed, and difficult to detect under normal circumstances. Symptoms typically only appear under specific conditions, therefore, special attention should be paid to the vehicle's routine and secondary maintenance performance checks." }}, {{ "Question": "How many air conditioning vents does the X5 have?", "Answer": "6." }}, {{ "Question": "If the CO / HC concentration does not meet the standard, how should I troubleshoot the problem?", "Answer": "If the CO / HC concentration does not meet the standard, the oxygen sensor should be checked first to ensure it is functioning correctly. Identify the possible causes and perform appropriate checks and repairs. This may involve checking multiple aspects such as the fuel system, ignition system, and intake system to pinpoint the exact location of the fault." }} ]
[0083] The semantic segment text, its keywords, its logical hierarchy within the target document, and the prompt template are combined to obtain the assembled information. In other words, for any given semantic segment text, the semantic segment text, the keywords extracted from it, its logical hierarchy (i.e., the document title and chapter information to which the semantic segment text belongs), and the aforementioned prompt template are combined in sequence into a complete user message, which is recorded as the assembled information.
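The assembly step can be illustrated with Python string formatting, where single braces mark placeholders and doubled braces produce literal braces in the output format example. The abbreviated `PROMPT_TEMPLATE`, the placeholder names, and the `assemble` function are assumptions made for this sketch, not the template of the application.

```python
# Abbreviated, hypothetical template; {name} is a placeholder, {{ }} a literal brace.
PROMPT_TEMPLATE = (
    "# Role\n"
    "You are a professional QA generation expert.\n"
    "# Document snippet content\n"
    "[Document Title]: {doc_title}\n"
    "[Chapter Information]: {chapter}\n"
    "[Core Keywords]: {keywords}\n"
    "[Document Content]: {content}\n"
    "# Output Format\n"
    '[ {{ "Question": "...", "Answer": "..." }} ]\n'
)


def assemble(segment_text, keywords, doc_title, chapter):
    """Combine a segment, its keywords and its hierarchy with the prompt template."""
    return PROMPT_TEMPLATE.format(
        doc_title=doc_title,
        chapter=chapter,
        keywords=", ".join(keywords),
        content=segment_text,
    )
```

Substituted values are not re-parsed by `str.format`, so braces that appear inside the segment text itself pass through unchanged.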
[0084] The assembled information is input into the target large language model (LLM) to obtain all candidate question-answer pairs corresponding to the semantic segment text, as output by the target LLM. In other words, the assembled information is sent to the target LLM and the returned result is obtained. The LLM is expected to return a JSON string; responses that cannot be parsed into the specified JSON format or that have missing fields are filtered out. The remaining results are parsed, for example with `json.loads()`, to extract the QA pair list, which represents all candidate question-answer pairs corresponding to that semantic segment text.
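The filtering and parsing of the LLM response might look like the sketch below. It assumes the response is a string containing a JSON array of objects with the fields "Question" and "Answer"; the field names and the function `parse_qa_pairs` are illustrative.

```python
import json

REQUIRED_FIELDS = ("Question", "Answer")  # assumed field names


def parse_qa_pairs(raw_response):
    """Parse an LLM response into a list of QA pairs, discarding malformed output."""
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        return []  # not valid JSON: filter out the whole response
    if not isinstance(data, list):
        return []  # the specified format is a JSON array of QA objects
    pairs = []
    for item in data:
        if not isinstance(item, dict):
            continue
        # Keep only items in which every required field is a non-empty string.
        if all(isinstance(item.get(f), str) and item[f].strip() for f in REQUIRED_FIELDS):
            pairs.append({f: item[f].strip() for f in REQUIRED_FIELDS})
    return pairs
```

This sketch drops individual items with missing fields and keeps the rest of the response; rejecting the entire response when any item is malformed would be an equally plausible reading of the text.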
[0085] All candidate question-answer pairs are deduplicated to obtain usable pairs. In other words, after all candidate question-answer pairs of the semantic segment text are obtained, deduplication can be performed on them so that only one question-answer pair is retained for each question, thus obtaining the usable pairs. Specifically, candidate question-answer pairs from different channels, or different candidate question-answer pairs from the same channel, may be cases where "the expressions are different but the semantics are the same" or "the questions are the same but the answers differ." Therefore, a multi-level deduplication strategy can be adopted to filter out redundant candidate question-answer pairs step by step. A question-answer vector library is then constructed based on the deduplicated usable pairs.
[0086] (1) First, near-duplicate candidate question-answer pairs can be preliminarily screened based on MinHash. Question text segmentation: the normalized question text of each candidate question-answer pair is segmented into words to generate a term set, which serves as the initial feature set.
[0087] MinHash fingerprint generation: k different hash functions are selected (e.g., k = 64), the minimum hash value of the feature set of each question text is calculated under each hash function, and the k values are combined to form a fingerprint array (the MinHash signature). Examples are shown in Table 2 below:
[0088] Table 2
Locality-sensitive hashing (LSH) candidate clustering: the MinHash signature is divided into b segments of r rows each (satisfying b × r = k), and each segment is hashed to a hash bucket. QA questions that fall into the same bucket are considered near-duplicate candidates, and QA questions in different buckets do not need to be compared. Example bucket grouping (assuming 4 segments of 16 rows each) is shown in Table 3 below:
[0089] Table 3
Initial screening (MinHash similarity threshold): the MinHash similarity is calculated for QA questions within the same bucket. If the similarity is greater than or equal to the question similarity threshold (e.g., 0.8), the pair is marked as a near-duplicate candidate. Examples are shown in Table 4 below:
[0090] Table 4
(2) Semantic precision deduplication: using a pre-trained semantic vector model, the questions of the near-duplicate candidate pairs are converted into vectors, and the cosine similarity between them is calculated to obtain an accurate semantic similarity score. If the similarity is greater than a given threshold (e.g., 0.85), the questions are considered duplicates.
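The MinHash and LSH pre-screening in (1) can be sketched in pure Python as follows, using k = 64, b = 4 and r = 16 from the examples above. Deriving the k hash functions from seeded BLAKE2b digests is an assumption of this sketch, as are all function names; term sets are assumed to be non-empty.

```python
import hashlib
from collections import defaultdict
from itertools import combinations

K = 64               # number of hash functions (signature length)
BANDS, ROWS = 4, 16  # b segments of r rows, with b * r = k


def _hash(seed, term):
    """One of K hash functions, derived by seeding BLAKE2b (an assumption)."""
    digest = hashlib.blake2b(f"{seed}:{term}".encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash(terms):
    """MinHash signature: the minimum hash value under each hash function."""
    return [min(_hash(seed, term) for term in terms) for seed in range(K)]


def estimated_similarity(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / K


def near_duplicate_candidates(term_sets, threshold=0.8):
    """Return index pairs that share an LSH bucket and reach the MinHash threshold."""
    signatures = [minhash(terms) for terms in term_sets]
    buckets = defaultdict(set)
    for idx, sig in enumerate(signatures):
        for band in range(BANDS):
            key = (band, tuple(sig[band * ROWS:(band + 1) * ROWS]))
            buckets[key].add(idx)
    pairs = set()
    for members in buckets.values():
        # Only questions in the same bucket are compared with each other.
        for i, j in combinations(sorted(members), 2):
            if estimated_similarity(signatures[i], signatures[j]) >= threshold:
                pairs.add((i, j))
    return sorted(pairs)
```

One design point worth checking when choosing b and r: by the standard banding estimate 1 - (1 - s^r)^b, two questions with similarity s = 0.8 share at least one bucket with a probability of only about 11% when b = 4 and r = 16, so a configuration with more segments of fewer rows may be needed if pairs near the 0.8 threshold should be caught reliably.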
[0091] For duplicate questions, if there is a conflict in the answers, the answer from the corresponding source will be retained based on the preset authority priority (manually constructed question-answer pairs > automatically mined question-answer pairs based on user behavior logs > pre-generated question-answer pairs based on unstructured documents).
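A small sketch of the semantic deduplication with priority-based conflict resolution. It assumes each candidate is a dictionary with hypothetical keys `source`, `vector`, `question` and `answer`, and that the question vectors have already been produced by the semantic vector model; the source labels in `PRIORITY` are illustrative names for the three channels.

```python
import math

# Authority priority of the three sources (higher wins); labels are hypothetical.
PRIORITY = {"manual": 3, "user_log": 2, "document": 1}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def deduplicate(candidates, threshold=0.85):
    """Keep one QA pair per question, preferring the most authoritative source."""
    kept = []
    # Visit candidates from highest to lowest priority (the sort is stable).
    for cand in sorted(candidates, key=lambda c: -PRIORITY[c["source"]]):
        # A candidate is a duplicate if its question is too similar to a kept one.
        if all(cosine(cand["vector"], k["vector"]) <= threshold for k in kept):
            kept.append(cand)
    return kept
```

For brevity this compares every candidate against all retained pairs; in the multi-level strategy described above, the cosine comparison would be limited to the near-duplicate candidates produced by the MinHash pre-screening.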
[0092] Furthermore, question-answer pairs can be manually constructed using the method described below: For high-frequency core usage scenarios, core functions, safety precautions, regulatory requirements, and customized interaction needs, business experts and the technical team collaborate to build a standardized "question-standard answer" QA set, which serves as the "foundational knowledge base" of the system.
[0093] Example:
Operation type: "How to turn on the dual-zone climate control?" → "Click on the 'Air Conditioning' interface on the central control screen, select 'Dual-zone Control', and you can independently adjust the temperature of the driver's seat / passenger seat."
Warning type: "What should I do if the tire pressure warning light comes on?" → "Please slow down and drive to a safe area, check the tire pressure, and do not continue driving if the tire pressure is abnormal."
Brand customization type: "What is your brand's intelligent interaction concept?" → "Our brand adheres to the intelligent interaction concept of 'safety, convenience, and personalization,' and through extremely simple operation logic, we make in-vehicle interaction more in line with the needs of driving scenarios."
[0094] As an optional implementation, the method described above further includes the following steps: The system acquires user interaction logs, which include historical query texts in raw natural language and corresponding historical responses. Specifically, during the operation of the in-vehicle assistant, the system logs the user's interactions with the question-and-answer system. This collection process primarily involves gathering the user's raw natural language questions (i.e., historical query texts) and the system's response results (i.e., historical responses).
[0095] All historical query texts are clustered according to semantics to obtain at least one semantic cluster, wherein the question intent corresponding to each historical query text in each semantic cluster meets the preset similarity requirement.
[0096] Specifically, before the historical query texts are clustered, the logs can be pre-filtered to remove invalid historical query texts and obtain standardized user questions. The main filtering rules may include: Questions that are too short or lack semantic meaning: remove questions whose character length is less than the preset character length limit (e.g., 3), whose number of valid tokens is less than the preset token limit (e.g., 2), whose number of semantic words (nouns / verbs) is 0, or whose subject / object is a pronoun; Operation commands: filter out operation commands, such as "increase volume", based on the system response results.
[0097] Examples are shown in Table 5 below:
[0098] Table 5
The cleaned question set is clustered by semantic similarity to merge questions that are expressed differently but share the same intent. The specific process is as follows: The standardized user questions are vectorized using the same pre-trained semantic vector model as described above, resulting in a question vector corresponding to each standardized user question.
[0099] Efficient clustering algorithms (such as density-based DBSCAN) are employed to cluster these question vectors. The algorithm parameters can be adaptively adjusted according to the data distribution to ensure that semantically similar questions (such as differently worded versions of "How do I turn on the air conditioner?") are grouped into the same semantic cluster.
[0100] For each semantic cluster, the sum of the frequencies of all standardized user questions within the semantic cluster is calculated as the aggregate popularity of the "question intent" represented by that semantic cluster.
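The clustering and aggregation steps can be illustrated with a compact, pure-Python DBSCAN over cosine distance. This is a sketch only: the parameter values `eps` and `min_samples` are arbitrary, the quadratic neighbor search would not scale to large logs, and a production system would more likely rely on an existing clustering library.

```python
import math
from collections import Counter


def cosine_distance(u, v):
    """1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0


def dbscan(vectors, eps=0.1, min_samples=2):
    """Minimal DBSCAN: returns one cluster label per vector, -1 meaning noise."""
    n = len(vectors)
    neighbors = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


def cluster_popularity(labels, frequencies):
    """Aggregate popularity: sum of question frequencies per semantic cluster."""
    popularity = Counter()
    for label, freq in zip(labels, frequencies):
        if label != -1:
            popularity[label] += freq
    return popularity
```

Questions labeled -1 belong to no semantic cluster and therefore contribute to no question intent in this sketch.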
[0101] Generate the target answer corresponding to each semantic cluster. Specifically, for the extracted question, the target answer corresponding to each semantic cluster can be automatically generated based on unstructured document retrieval and large model generation. The specific answer generation method can be implemented by referring to the target answer generation method in the previous embodiment of "inputting the query text and K target document fragments as prompts into the target large language model to obtain the target answer corresponding to the query text", which will not be elaborated here.
[0102] For each semantic cluster, the standard question template corresponding to the semantic cluster and the target answer corresponding to the semantic cluster are used to generate a usable question-answer pair for the semantic cluster.
[0103] Furthermore, high-popularity semantic clusters exceeding a preset threshold or TOP-P (e.g., the top 50) can be selected for subsequent processes. For each high-popularity semantic cluster, a standard question template representing a specific type of question corresponding to that cluster is generated. This standard question template can be generated by selecting the original question from the cluster center or by synthesizing it through text summarization / generation techniques.
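Selecting the most popular clusters and choosing a standard question from the cluster center might be sketched as follows. Taking the question whose vector is nearest to the centroid under Euclidean distance is an assumption; the text equally allows the standard question to be synthesized by summarization or generation. All function names are illustrative.

```python
import math


def select_hot_clusters(popularity, top_p):
    """Return the ids of the top_p clusters by aggregate popularity."""
    ranked = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [cluster_id for cluster_id, _ in ranked[:top_p]]


def centroid(vectors):
    """Component-wise mean of the question vectors in a cluster."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def representative_question(questions, vectors):
    """Pick the original question whose vector is closest to the cluster center."""
    center = centroid(vectors)
    best = min(range(len(vectors)), key=lambda i: math.dist(vectors[i], center))
    return questions[best]
```

A cluster's representative question, paired with the target answer generated for that cluster, would then form the usable question-answer pair.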
[0104] Therefore, the method in this embodiment can convert real user-generated questions into standard question-answer pairs, thereby eliminating the need for the system to rely on RAG inference in the long term, effectively reducing response costs and improving stability.
[0105] According to another aspect of the embodiments of this application, an electronic device is also provided, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus; wherein the memory is used to store a computer program; and the processor is used to execute the method steps of any of the above embodiments by running the computer program stored in the memory.
[0106] According to another aspect of the embodiments of this application, a computer-readable storage medium is also provided, wherein a computer program is stored therein, wherein the computer program is configured to execute the method steps of any of the above embodiments when running.
[0107] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this application is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions and modules involved are not necessarily essential to this application.
[0108] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods according to the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM (Read-Only Memory) / RAM (Random Access Memory), magnetic disk, optical disk), and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, or network device, etc.) to execute the methods described in the various embodiments of this application.
[0109] According to another aspect of the embodiments of this application, a multi-source knowledge-based question-answering apparatus is also provided for implementing the above-described multi-source knowledge-based question-answering method. Figure 5 is a structural block diagram of an optional question-answering apparatus based on multi-source knowledge according to an embodiment of this application. As shown in Figure 5, the apparatus may include: The acquisition module 51, which is used to acquire query text from the target user.
[0110] The conversion module 52 is used to convert the query text into a high-dimensional vector representation to obtain the query vector.
[0111] The retrieval module 53 is used to sequentially search in the question-and-answer vector library and the document fragment library using the query vector to obtain the target answer corresponding to the query text, wherein the question-and-answer vector library and the document fragment library are knowledge retrieval objects from different sources.
[0112] Return module 54 is used to return the target answer to the target user.
[0113] It should be noted that the acquisition module 51 in this embodiment can be used to perform the above step S202, the conversion module 52 in this embodiment can be used to perform the above step S204, the retrieval module 53 in this embodiment can be used to perform the above step S206, and the return module 54 in this embodiment can be used to perform the above step S208.
[0114] In addition to the modules described above, the apparatus in this embodiment may also include modules that execute any method as described in any of the foregoing embodiments of the question-answering method based on multi-source knowledge.
[0115] It should be noted that the examples and application scenarios implemented by the above modules are the same as those of the corresponding steps, but are not limited to the content disclosed in the above embodiments. It should also be noted that the above modules, as part of the apparatus, can run in the hardware environment shown in Figure 1 and can be implemented through software or hardware, where the hardware environment includes a network environment.
[0116] According to another aspect of the embodiments of this application, an electronic device for implementing the above-described question-answering method based on multi-source knowledge is also provided. The electronic device may be a server, a terminal, or a combination thereof.
[0117] According to another embodiment of this application, an electronic device is also provided. As shown in Figure 6, the electronic device may include: a processor 1501, a communication interface 1502, a memory 1503, and a communication bus 1504, wherein the processor 1501, the communication interface 1502, and the memory 1503 communicate with each other through the communication bus 1504.
[0118] The memory 1503 is used to store a computer program; when the processor 1501 executes the program stored in the memory 1503, it performs the following steps: Step S202: Obtain the query text from the target user.
[0119] Step S204: Convert the query text into a high-dimensional vector representation to obtain the query vector.
[0120] Step S206: Using the query vector, the question-and-answer vector library and the document fragment library are searched sequentially to obtain the target answer corresponding to the query text. The question-and-answer vector library and the document fragment library are knowledge retrieval objects from different sources.
[0121] Step S208: Return the target answer to the target user.
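Steps S202 to S208 can be summarized in one function, shown here as a sketch under several assumptions: both libraries are in-memory lists of dictionaries, `embed` and `generate` are caller-supplied stand-ins for the target language model and the target large language model, keyword overlap serves as the text relevance, and the two relevance scores are fused by a weighted sum with weight `alpha`. None of these choices is prescribed by the application.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def answer_query(query_text, embed, qa_library, fragment_library, generate,
                 qa_threshold=0.9, top_k=3, alpha=0.5):
    """Search the QA vector library first, then fall back to the fragment library."""
    query_vec = embed(query_text)  # S204: query text -> query vector

    # S206, stage 1: similarity matching in the question-and-answer vector library.
    best = max(qa_library, key=lambda qa: cosine(query_vec, qa["vector"]), default=None)
    if best is not None and cosine(query_vec, best["vector"]) >= qa_threshold:
        return best["answer"]  # direct hit, no LLM call needed

    # S206, stage 2: mixed search in the document fragment library.
    query_terms = set(query_text.lower().split())

    def relevance(fragment):
        fragment_terms = set(fragment["text"].lower().split())
        # First relevance: keyword overlap between query text and fragment text.
        keyword = len(query_terms & fragment_terms) / len(query_terms) if query_terms else 0.0
        # Second relevance: similarity between query vector and fragment vector.
        semantic = cosine(query_vec, fragment["vector"])
        return alpha * keyword + (1 - alpha) * semantic  # assumed weighted fusion

    top_fragments = sorted(fragment_library, key=relevance, reverse=True)[:top_k]
    # The query text and the K target fragments are passed to the LLM as the prompt.
    return generate(query_text, [f["text"] for f in top_fragments])  # S208
```

The early return in stage 1 is what keeps cost and latency low: the large language model is invoked only when no stored question-answer pair reaches the threshold.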
[0122] Optionally, in this embodiment, the communication bus can be a PCI (Peripheral Component Interconnect) bus or an EISA (Extended Industry Standard Architecture) bus, etc. This communication bus can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is used to represent it in the figure, but this does not mean that there is only one bus or one type of bus. The communication interface is used for communication between the aforementioned electronic device and other devices.
[0123] The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk storage device. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
[0124] As an example, the memory 1503 described above may include, but is not limited to, the acquisition module 51, conversion module 52, retrieval module 53, and return module 54 of the question-answering device based on multi-source knowledge. Furthermore, it may include, but is not limited to, other module units of the question-answering device based on multi-source knowledge, which will not be elaborated upon in this example.
[0125] The processor mentioned above can be a general-purpose processor, including but not limited to: CPU (Central Processing Unit), NP (Network Processor), etc.; it can also be DSP (Digital Signal Processor), ASIC (Application Specific Integrated Circuit), FPGA (Field-Programmable Gate Array) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
[0126] This application also provides a computer-readable storage medium, which includes a stored program, wherein the program executes the method steps of the above method embodiments when it runs.
[0127] Optionally, in this embodiment, the storage medium may include, but is not limited to, various media capable of storing program code, such as USB flash drives, ROMs, RAMs, portable hard drives, magnetic disks, or optical disks.
[0128] The sequence numbers of the embodiments in this application are for descriptive purposes only and do not represent the superiority or inferiority of the embodiments.
[0129] If the integrated units in the above embodiments are implemented as software functional units and sold or used as independent products, they can be stored in the aforementioned computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause one or more computer devices (which may be personal computers, servers, or network devices, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
[0130] In the above embodiments of this application, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.
[0131] In the several embodiments provided in this application, it should be understood that the disclosed client can be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of units is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be through some interfaces, indirect coupling or communication connection between units or modules, and may be electrical or other forms.
[0132] The units described as separate components may or may not be physically separate. The components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of the solution provided in this embodiment, depending on actual needs.
[0133] Furthermore, the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit. The integrated unit can be implemented in hardware or as a software functional unit.
[0134] The above description is only a preferred embodiment of this application. It should be noted that for those skilled in the art, several improvements and modifications can be made without departing from the principle of this application, and these improvements and modifications should also be considered within the scope of protection of this application.
Claims
1. A question-answering method based on multi-source knowledge, characterized in that, include: Retrieve query text from the target user; The query text is converted into a high-dimensional vector representation to obtain the query vector; Using the query vector, the system sequentially searches the question-and-answer vector library and the document fragmentation library to obtain the target answer corresponding to the query text. The question-and-answer vector library and the document fragmentation library are knowledge retrieval objects from different sources. The target answer is returned to the target user.
2. The method according to claim 1, characterized in that, The step of retrieving the target answer corresponding to the query text by sequentially searching the question-answer pair database and the document fragment database using the query vector includes: Based on the query vector, a similarity matching retrieval is performed in the constructed question-and-answer vector library; if a target question-and-answer pair with a similarity greater than or equal to a preset threshold is selected from the question-and-answer vector library, the answer of the target question-and-answer pair is taken as the target answer; If no target question-answer pair with a similarity greater than or equal to a preset threshold is found in the question-answer vector library, a mixed search is performed in the document fragmentation library based on the query text and the query vector to obtain the target answer.
3. The method according to claim 2, characterized in that, The process of performing a mixed search in the document fragmentation database based on the query text and the query vector to obtain the target answer includes: Using the query text and the query vector as input, a mixed search is performed in the document segmentation library to obtain the first correlation between each candidate document segment in the document segmentation library and the query text, and the second correlation between each candidate document segment and the query vector; Based on the first relevance and the second relevance, the K target document fragments with the highest relevance to the query text are determined from all candidate document fragments; The query text and the K target document fragments are input as prompts into the target large language model to obtain the target answer corresponding to the query text.
4. The method according to claim 1, characterized in that, The method further includes: The target document is parsed to obtain a parsing result containing the logical hierarchy representation of the target document, wherein the target document is an unstructured document related to the target product; The parsing result is divided into multiple candidate document fragments according to the logical hierarchy representation and the preset fragment length; Semantically encode the text content of each candidate document fragment using the target language model to obtain a document fragment vector corresponding to each candidate document fragment; The candidate document fragments and their corresponding document fragment vectors are stored in a vector database. The step of converting the query text into a high-dimensional vector representation to obtain the query vector includes: The query text is converted into a high-dimensional vector representation using the target language model to obtain the query vector.
5. The method according to claim 1, characterized in that, The method further includes: The target document is parsed to obtain a parsing result containing the logical hierarchy representation of the target document, wherein the target document is an unstructured document related to the target product; According to the logical hierarchy and the preset segment length, the parsing result is segmented to obtain multiple semantic segment texts; Each semantic segment text is segmented into words to obtain a segmented word sequence corresponding to each semantic segment text, wherein each segmented word sequence contains multiple word terms; The word segmentation sequences corresponding to the multiple semantically segmented texts are combined to obtain a global corpus set; The keywords for each semantic text segment are determined based on the global corpus set; Based on the keywords of each semantic text segment, generate usable question-answer pairs.
6. The method according to claim 5, characterized in that the step of determining the keywords of each semantic segment text based on the global corpus set comprises: calculating the term frequency and the inverse document frequency of each term; determining the TF-IDF value of each term based on the term frequency and the inverse document frequency corresponding to that term; sorting all terms in each semantic segment text by their TF-IDF values in that semantic segment text and determining the K terms with the highest TF-IDF values, where K is an integer greater than or equal to 1; and determining the K terms as the keywords of that semantic segment text.
7. The method according to claim 5, characterized in that the step of generating usable question-answer pairs based on the keywords of each semantic segment text comprises: constructing a prompt template for generating question-answer pairs; combining the semantic segment text, the keywords of the semantic segment text, the logical hierarchy of the semantic segment text in the target document, and the prompt template to obtain assembled information; inputting the assembled information into the target large language model to obtain all candidate question-answer pairs that the target large language model outputs for the semantic segment text; and deduplicating the candidate question-answer pairs to obtain the usable question-answer pairs.
8. The method according to claim 1, characterized in that the method further comprises: obtaining the user's interaction log, wherein the interaction log includes original natural-language historical query texts and the historical feedback corresponding to each historical query text; clustering all historical query texts by semantics to obtain at least one semantic cluster, wherein the question intents corresponding to the historical query texts in each semantic cluster meet a preset similarity requirement; generating the target answer corresponding to each semantic cluster; and, for each semantic cluster, generating the usable question-answer pair corresponding to the semantic cluster from the standard question template corresponding to the semantic cluster and the target answer corresponding to the semantic cluster.
9. A question-answering device based on multi-source knowledge, characterized by comprising: an acquisition module configured to acquire query text from the target user; a conversion module configured to convert the query text into a high-dimensional vector representation to obtain the query vector; a retrieval module configured to search the question-and-answer vector library and the document fragment library in sequence using the query vector to obtain the target answer corresponding to the query text, wherein the question-and-answer vector library and the document fragment library are knowledge retrieval objects from different sources; and a return module configured to return the target answer to the target user.
10. An electronic device comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other via the communication bus, characterized in that: the memory is used to store a computer program; and the processor is configured to perform the method of any one of claims 1 to 8 by running the computer program stored in the memory.
11. A computer-readable storage medium, characterized in that the storage medium stores a computer program, wherein the computer program is configured to execute the method of any one of claims 1 to 8 when run on a processor.
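By way of non-limiting illustration, the mixed search of claim 3 (often called hybrid search) may be sketched as follows. The claims do not specify how the first and second relevance are computed or combined, so everything here is an assumption: the first relevance is taken as a simple token-overlap score, the second relevance as cosine similarity, and the two are fused by a weighted sum with a hypothetical weight `alpha`. The names `Fragment`, `mixed_search`, and `build_prompt` are illustrative only.

```python
import math
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str
    vector: list[float]  # precomputed document fragment vector


def lexical_relevance(query_text: str, fragment_text: str) -> float:
    """First relevance (assumed): share of query tokens found in the fragment."""
    query_tokens = set(query_text.lower().split())
    fragment_tokens = set(fragment_text.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & fragment_tokens) / len(query_tokens)


def cosine_relevance(a: list[float], b: list[float]) -> float:
    """Second relevance (assumed): cosine similarity of the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mixed_search(query_text, query_vector, fragments, k=2, alpha=0.5):
    """Return the K fragments with the highest fused relevance."""
    scored = []
    for fragment in fragments:
        first = lexical_relevance(query_text, fragment.text)
        second = cosine_relevance(query_vector, fragment.vector)
        # Weighted-sum fusion is an assumption, not taken from the claims.
        scored.append((alpha * first + (1 - alpha) * second, fragment))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fragment for _, fragment in scored[:k]]


def build_prompt(query_text, fragments):
    """Combine the query text and the K target fragments into one prompt."""
    context = "\n".join(f"[{i}] {f.text}" for i, f in enumerate(fragments, 1))
    return (
        "Answer using only the context.\n\n"
        f"Context:\n{context}\n\nQuestion: {query_text}"
    )
```

The string returned by `build_prompt` would then be passed to the target large language model; that call is omitted because the claims do not name a model or interface.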
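As a further non-limiting illustration, the division step of claim 4 may be sketched as follows. The sketch assumes the parsing result is already available as a list of `(heading path, text)` pairs and that the preset fragment length is measured in characters; both are assumptions, as are the names `split_section` and `split_document`. Each fragment keeps its heading path so that the logical hierarchy remains attached to the text.

```python
def split_section(path, text, max_len):
    """Split one section into fragments of at most max_len characters.

    Splits fall on word boundaries; a single word longer than max_len
    is kept whole rather than cut.
    """
    fragments = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_len and current:
            fragments.append({"path": path, "text": current})
            current = word
        else:
            current = candidate
    if current:
        fragments.append({"path": path, "text": current})
    return fragments


def split_document(sections, max_len=40):
    """Divide a parsed document, section by section, into candidate fragments."""
    fragments = []
    for path, text in sections:
        # Fragments never span two sections, so hierarchy boundaries are kept.
        fragments.extend(split_section(path, text, max_len))
    return fragments
```

Each resulting fragment would then be encoded by the target language model and stored together with its vector; the encoding and storage calls are left out because the claims do not identify a specific model or vector database.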
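The keyword determination of claim 6 may likewise be sketched as follows. The claims do not fix a particular TF-IDF formula, so this sketch assumes term frequency normalized by segment length and a smoothed inverse document frequency, `log((1 + N) / (1 + df)) + 1`, where N is the number of semantic segment texts and df is the number of segments containing the term. The function name `top_k_keywords` and the alphabetical tie-breaking rule are illustrative assumptions.

```python
import math
from collections import Counter


def top_k_keywords(segments, k=2):
    """Return the K highest TF-IDF terms for each word segmentation sequence.

    `segments` is the global corpus set: one list of terms per
    semantic segment text.
    """
    n = len(segments)
    document_frequency = Counter()
    for tokens in segments:
        document_frequency.update(set(tokens))

    keywords = []
    for tokens in segments:
        term_counts = Counter(tokens)
        scores = {
            term: (count / len(tokens))
            * (math.log((1 + n) / (1 + document_frequency[term])) + 1)
            for term, count in term_counts.items()
        }
        # Sort by descending score; ties are broken alphabetically (assumed).
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords.append(ranked[:k])
    return keywords
```

Terms that occur in every segment receive the lowest inverse document frequency, so segment-specific terms tend to rank first.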
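Finally, the assembly and deduplication steps of claim 7 may be sketched as follows. The prompt template wording, the `question || answer` output format, and the rule of treating two pairs as duplicates when their normalized questions match are all assumptions made for illustration; the claims do not prescribe them. The call to the target large language model is omitted, and `raw_output` in the test stands in for its response.

```python
PROMPT_TEMPLATE = (
    "You write question-answer pairs for a vehicle assistant.\n"
    "Section path: {hierarchy}\n"
    "Keywords: {keywords}\n"
    "Text: {segment}\n"
    "Return one pair per line as: question || answer"
)


def assemble(segment, keywords, hierarchy):
    """Combine segment text, keywords, hierarchy, and template into one prompt."""
    return PROMPT_TEMPLATE.format(
        hierarchy=" > ".join(hierarchy),
        keywords=", ".join(keywords),
        segment=segment,
    )


def parse_pairs(raw_output):
    """Parse 'question || answer' lines; lines without the separator are skipped."""
    pairs = []
    for line in raw_output.splitlines():
        if "||" not in line:
            continue
        question, answer = line.split("||", 1)
        if question.strip() and answer.strip():
            pairs.append((question.strip(), answer.strip()))
    return pairs


def deduplicate(pairs):
    """Keep the first pair for each normalized question."""
    seen = set()
    unique = []
    for question, answer in pairs:
        # Normalize case, whitespace, and trailing punctuation (assumed rule).
        key = " ".join(question.lower().split()).rstrip("?.! ")
        if key not in seen:
            seen.add(key)
            unique.append((question, answer))
    return unique
```

A production system might instead compare question embeddings to catch paraphrases; exact matching on normalized text is used here only to keep the sketch self-contained.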