Artificial intelligence-based question processing method and device, computer device and medium

By receiving question texts, querying relevant documents, extracting information, and performing reasoning processing in a local knowledge base question-and-answer system, the problem of low answer accuracy has been solved, achieving efficient and accurate question-and-answer processing in the medical field.

CN119202155B · Active · Publication Date: 2026-06-19 · PING AN TECH (SHENZHEN) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
PING AN TECH (SHENZHEN) CO LTD
Filing Date
2024-08-21
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing local knowledge base question answering systems generate answers with low accuracy, and the traditional system processing methods increase the burden on the model, resulting in lengthy and unfocused answers that affect the user experience.

Method used

By receiving the question text input by the user, preprocessing it, querying relevant documents from a pre-set medical knowledge base, using a target language model for information extraction and reasoning, generating the answer text, and optimizing it to improve accuracy.

Benefits of technology

A local knowledge-based question-and-answer system for the medical field was built, which improved the accuracy and processing efficiency of the answers and ensured that the generated answers were concise and met the user's needs.

Abstract

This application belongs to the fields of artificial intelligence and digital healthcare, and relates to an artificial intelligence-based question processing method, apparatus, computer device, and storage medium. The method includes: receiving question text input by a user; preprocessing the question text to obtain target question text; retrieving target documents from a medical knowledge base that match the target question text; generating a first model input based on the target question text and the target documents, and extracting target information from the first model input using a target language model; generating a second model input based on the target question text and the target information, and performing reasoning processing on the second model input using the target language model to obtain answer text; optimizing the answer text to obtain target answer text; and returning the target answer text to the user. Furthermore, this application also relates to blockchain technology, allowing the target answer text to be stored in the blockchain. This application improves the processing efficiency and accuracy of medical question answering.

Description

Technical Field

[0001] This application relates to the fields of artificial intelligence development technology and digital healthcare, and in particular to an artificial intelligence-based question processing method, apparatus, computer device, and storage medium.

Background Technology

[0002] With the rapid development of artificial intelligence technology, local knowledge base question-answering systems based on large models have become an important means to improve the quality and efficiency of information services, especially in the medical field, where accurate and efficient information acquisition has immeasurable value for clinical decision-making, patient education, and scientific research. However, user privacy protection has become a core issue that cannot be ignored when building such systems. While advanced large language models such as ChatGPT on the market possess powerful natural language processing and knowledge reasoning capabilities, their high deployment costs, the complexity of data privacy handling, and dependence on external data pose numerous challenges to their direct application in local knowledge base question-answering systems.

[0003] First, open-source or proprietary models have limited capabilities and fall short when dealing with complex problems in the medical field. Medical data is highly specialized, private, and time-sensitive, so a model must not only possess deep medical knowledge but also perform efficient and accurate information extraction and reasoning while protecting privacy. Currently available models often fail to meet practical needs for automatic summarization, reasoning, and personalized responses over massive amounts of medical data.

[0004] Secondly, the workflow of traditional local knowledge base question answering systems has significant flaws. These systems typically rely on simple search recall mechanisms, directly concatenating relevant documents and questions as input to a large language model. This approach not only increases the processing burden on the model, requiring it to possess extremely high text understanding and generation capabilities, but also easily leads to lengthy, unfocused, and inaccurate answers, thus negatively impacting the user experience.

Summary of the Invention

[0005] The purpose of this application is to propose an artificial intelligence-based question processing method, apparatus, computer device, and storage medium, so as to solve the technical problem of low accuracy of generated answers in existing local knowledge base question-answering systems.

[0006] To address the aforementioned technical problems, this application provides an artificial intelligence-based question processing method, employing the following technical solution:

[0007] Receive question text input from the user;

[0008] The question text is preprocessed to obtain the corresponding target question text;

[0009] Retrieve target documents from a pre-defined medical knowledge base that match the pre-defined relevance to the target question text;

[0010] A first model input is generated based on the target question text and the target document, and information is extracted from the first model input using a preset target language model to extract target information related to the question text from the target document.

[0011] Based on the target question text and the target information, a corresponding second model input is generated, and the target language model is used to perform reasoning processing on the second model input to obtain the corresponding answer text.

[0012] The answer text is optimized to obtain the corresponding target answer text;

[0013] The target answer text is returned to the user.

[0014] Furthermore, the step of retrieving target documents from a preset medical knowledge base that match a preset relevance to the target question text specifically includes:

[0015] Call the pre-trained document vector representation model;

[0016] Based on the vector representation model, the question text is converted into a corresponding question vector;

[0017] Calculate the similarity between the question vector and the vector of each document in the medical knowledge base;

[0018] All the documents are sorted in descending order of similarity to obtain the corresponding sorted document list;

[0019] Select the target number of first documents from the document sorting list;

[0020] The first document is processed to extract key content, resulting in the corresponding second document;

[0021] The second document is used as the target document.

[0022] Furthermore, the step of optimizing the answer text to obtain the corresponding target answer text specifically includes:

[0023] The answer text is processed to remove redundancy, resulting in the corresponding first answer text;

[0024] The first answer text is subjected to grammatical correction to obtain the corresponding second answer text;

[0025] The second answer text is used as the target answer text.

[0026] Furthermore, the step of returning the target answer text to the user specifically includes:

[0027] Obtain the target display format;

[0028] Based on the target display format, the target answer text is converted to obtain the converted target answer text.

[0029] Call the preset user interface;

[0030] The target answer text is displayed to the user based on the user interface.

[0031] Furthermore, the step of preprocessing the question text to obtain the corresponding target question text specifically includes:

[0032] The question text is processed to remove irrelevant information, resulting in the corresponding first question text;

[0033] Get the preset standard format;

[0034] Based on the standard format, the first question text is converted to obtain the corresponding second question text;

[0035] The second question text is used as the target question text.
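
The preprocessing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation claimed in this application: the function name `preprocess_question`, the set of characters treated as irrelevant symbols, and the `standard_format` parameter are all hypothetical.

```python
import re


def preprocess_question(question: str, standard_format: str = "lower") -> str:
    """Turn raw question text into target question text (illustrative only)."""
    # Step 1: remove irrelevant information. Assumption: anything that is not
    # a word character, whitespace, or basic punctuation is an irrelevant symbol.
    first_question = re.sub(r"[^\w\s?.,'-]", " ", question)
    first_question = re.sub(r"\s+", " ", first_question).strip()

    # Step 2: convert to the preset standard format (uniform lower or upper case).
    if standard_format == "lower":
        second_question = first_question.lower()
    elif standard_format == "upper":
        second_question = first_question.upper()
    else:
        raise ValueError(f"unsupported standard format: {standard_format}")

    # Step 3: the second question text is used as the target question text.
    return second_question
```

Because `\w` matches Unicode word characters in Python 3, the same sketch leaves non-English question text intact while still stripping stray symbols.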

[0036] Furthermore, before the step of generating a corresponding first model input based on the target question text and the target document, and extracting information from the first model input using a preset target language model to extract target information related to the question text from the target document, the method further includes:

[0037] Constructing medical question-answer pairs based on a pre-defined large language model;

[0038] The medical question-and-answer pairs are labeled to obtain the corresponding medical sample data;

[0039] The medical sample data is divided into a training set and a test set;

[0040] Invoke the preset language model;

[0041] Obtain the preset improved cross-entropy loss function;

[0042] The language model is fine-tuned using the improved cross-entropy loss function and the training set to obtain the corresponding specified language model;

[0043] Performance testing of the specified language model is performed based on the test set.

[0044] If the specified language model passes the performance test, then the specified language model will be used as the target language model.
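
The data split and the acceptance check in the steps above might look like the sketch below. The fine-tuning step itself is omitted. The names `split_samples` and `passes_performance_test`, the 80/20 split, the exact-match metric, and the 0.8 threshold are assumptions made for illustration; this application does not specify them.

```python
import random
from typing import Callable, List, Tuple

QAPair = Tuple[str, str]  # (question, labeled answer)


def split_samples(samples: List[QAPair], test_ratio: float = 0.2,
                  seed: int = 0) -> Tuple[List[QAPair], List[QAPair]]:
    """Shuffle the labeled medical samples and split them into train and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


def passes_performance_test(model: Callable[[str], str],
                            test_set: List[QAPair],
                            threshold: float = 0.8) -> bool:
    """Accept the fine-tuned model only if exact-match accuracy reaches the threshold."""
    if not test_set:
        raise ValueError("test set must not be empty")
    correct = sum(1 for question, answer in test_set
                  if model(question).strip() == answer.strip())
    return correct / len(test_set) >= threshold
```

Exact match is a deliberately strict stand-in; a real evaluation of generated medical answers would more likely use a softer text-similarity or human-rated metric.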

[0045] Furthermore, the step of obtaining the preset improved cross-entropy loss function specifically includes:

[0046] Obtain the cross-entropy loss function;

[0047] Obtain the preset weight adjustment strategy;

[0048] The cross-entropy loss function is adjusted based on the weight adjustment strategy to obtain the adjusted cross-entropy loss function.

[0049] The adjusted cross-entropy loss function is used as the improved cross-entropy loss function.
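
One common way to adjust a cross-entropy loss with weights is to scale each sample's loss by a weight attached to its target class. The sketch below shows that variant only as an example; the weight adjustment strategy actually used in this application is not described here, and the function name `weighted_cross_entropy` and the `class_weights` parameter are assumptions.

```python
import math
from typing import Sequence


def weighted_cross_entropy(probs: Sequence[Sequence[float]],
                           targets: Sequence[int],
                           class_weights: Sequence[float]) -> float:
    """Weighted mean of -w[y] * log(p[y]) over all samples.

    probs[i][c] is the predicted probability of class c for sample i,
    targets[i] is the index of the correct class for sample i.
    """
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, targets):
        w = class_weights[y]
        # Clamp the probability to avoid log(0).
        total += -w * math.log(max(p[y], 1e-12))
        weight_sum += w
    return total / weight_sum
```

With all weights equal to 1 this reduces to the ordinary mean cross-entropy, so the weighting can be introduced without changing the baseline behavior.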

[0050] To address the aforementioned technical problems, this application also provides an artificial intelligence-based question processing device, employing the following technical solution:

[0051] The receiving module is used to receive the question text input by the user;

[0052] The preprocessing module is used to preprocess the question text to obtain the corresponding target question text;

[0053] The query module is used to retrieve target documents from a preset medical knowledge base that match the preset relevance to the target question text;

[0054] The extraction module is used to generate a corresponding first model input based on the target question text and the target document, and to extract information from the first model input using a preset target language model, so as to extract target information related to the question text from the target document;

[0055] The reasoning module is used to generate a corresponding second model input based on the target question text and the target information, and to perform reasoning processing on the second model input through the target language model to obtain the corresponding answer text.

[0056] The optimization module is used to optimize the answer text to obtain the corresponding target answer text;

[0057] The return module is used to return the target answer text to the user.

[0058] To address the aforementioned technical problems, this application also provides a computer device that employs the following technical solution:

[0059] Receive question text input from the user;

[0060] The question text is preprocessed to obtain the corresponding target question text;

[0061] Retrieve target documents from a pre-defined medical knowledge base that match the pre-defined relevance to the target question text;

[0062] A first model input is generated based on the target question text and the target document, and information is extracted from the first model input using a preset target language model to extract target information related to the question text from the target document.

[0063] Based on the target question text and the target information, a corresponding second model input is generated, and the target language model is used to perform reasoning processing on the second model input to obtain the corresponding answer text.

[0064] The answer text is optimized to obtain the corresponding target answer text;

[0065] The target answer text is returned to the user.

[0066] To address the aforementioned technical problems, this application also provides a computer-readable storage medium, employing the technical solution described below:

[0067] Receive question text input from the user;

[0068] The question text is preprocessed to obtain the corresponding target question text;

[0069] Retrieve target documents from a pre-defined medical knowledge base that match the pre-defined relevance to the target question text;

[0070] A first model input is generated based on the target question text and the target document, and information is extracted from the first model input using a preset target language model to extract target information related to the question text from the target document.

[0071] Based on the target question text and the target information, a corresponding second model input is generated, and the target language model is used to perform reasoning processing on the second model input to obtain the corresponding answer text.

[0072] The answer text is optimized to obtain the corresponding target answer text;

[0073] The target answer text is returned to the user.

[0074] Compared with the prior art, the embodiments of this application have the following main advantages:

[0075] This application first receives a question text input by a user; then preprocesses the question text to obtain a corresponding target question text; next, it queries a preset medical knowledge base to find target documents that match the target question text with a preset correlation; then, it generates a corresponding first model input based on the target question text and the target document, and extracts information from the first model input using a preset target language model to extract target information related to the question text from the target document; subsequently, it generates a corresponding second model input based on the target question text and the target information, and performs reasoning processing on the second model input using the target language model to obtain a corresponding answer text; further, it optimizes the answer text to obtain a corresponding target answer text; finally, it returns the target answer text to the user. This application constructs a local knowledge question answering system for the medical field based on a preset target language model. By using the local knowledge question answering system, it can accurately answer the question text input by the user, improving the processing efficiency of medical question answering and the accuracy of the generated target answer text.

Attached Figure Description

[0076] To more clearly illustrate the solutions in this application, the accompanying drawings used in the description of the embodiments of this application will be briefly introduced below. Obviously, the accompanying drawings described below are some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0077] Figure 1 is an exemplary system architecture diagram to which this application can be applied;

[0078] Figure 2 is a flowchart of an embodiment of the artificial intelligence-based question processing method according to this application;

[0079] Figure 3 is a schematic structural diagram of an embodiment of the artificial intelligence-based question processing device according to this application;

[0080] Figure 4 is a schematic structural diagram of an embodiment of the computer device according to this application.

Detailed Implementation

[0081] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application pertains; the terminology used herein in the specification of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having," and any variations thereof, in the specification, claims, and foregoing drawings of this application, are intended to cover non-exclusive inclusion. The terms "first," "second," etc., in the specification, claims, or foregoing drawings of this application are used to distinguish different objects, not to describe a particular order.

[0082] In this document, the term "embodiment" means that a particular feature, structure, or characteristic described in connection with an embodiment may be included in at least one embodiment of this application. The appearance of this phrase in various places throughout the specification does not necessarily refer to the same embodiment, nor is it a separate or alternative embodiment mutually exclusive with other embodiments. It will be explicitly and implicitly understood by those skilled in the art that the embodiments described herein can be combined with other embodiments.

[0083] To enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.

[0084] As shown in Figure 1, system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. Network 104 serves as the medium for providing communication links between terminal devices 101, 102, and 103 and server 105. Network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

[0085] Users can use terminal devices 101, 102, and 103 to interact with server 105 via network 104 to receive or send messages, etc. Various communication client applications can be installed on terminal devices 101, 102, and 103, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social media platform software, etc.

[0086] Terminal devices 101, 102, and 103 can be various electronic devices with displays and support web browsing, including but not limited to smartphones, tablets, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptops, and desktop computers, etc.

[0087] Server 105 can be a server that provides various services, such as a backend server that supports the pages displayed on terminal devices 101, 102, and 103.

[0088] It should be noted that the artificial intelligence-based question processing method provided in this application is generally executed by a server / terminal device, and correspondingly, the artificial intelligence-based question processing device is generally set in the server / terminal device.

[0089] It should be understood that the number of terminal devices, networks, and servers shown in Figure 1 is merely illustrative. Depending on implementation needs, any number of terminal devices, networks, and servers can be included.

[0090] Continuing to refer to Figure 2, a flowchart of an embodiment of the artificial intelligence-based question processing method according to this application is shown. The order of steps in the flowchart can be changed, and some steps can be omitted, depending on different needs. The method provided in this application can be applied to any scenario requiring medical knowledge question answering, and therefore to products in these scenarios, such as medical knowledge question answering in the digital healthcare field. The artificial intelligence-based question processing method includes the following steps:

[0091] Step S201: Receive the question text input by the user.

[0092] In this embodiment, the electronic device on which the artificial intelligence-based question processing method runs (e.g., the server / terminal device shown in Figure 1) can acquire the question text via a wired or wireless connection. It should be noted that the aforementioned wireless connection methods may include, but are not limited to, 3G / 4G / 5G connections, Wi-Fi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra-wideband) connections, and other currently known or future wireless connection methods. This application can be applied to medical knowledge question answering scenarios in the field of digital healthcare, and the implementing entity of this application can be a medical knowledge question answering system. Specifically, a user-friendly interface is designed in advance in the medical knowledge question answering system, allowing users to input questions. The system receives the question text that the user enters in this interface through relevant interfaces or front-end form submission mechanisms. The question text is a medical-related question, such as: "The pregnancy test shows one very clear line and one very faint line, am I pregnant?".

[0093] Step S202: Preprocess the question text to obtain the corresponding target question text.

[0094] In this embodiment, the specific implementation process of preprocessing the question text to obtain the corresponding target question text will be further described in detail in subsequent specific embodiments of this application, and will not be elaborated on here.

[0095] Step S203: Query the preset medical knowledge base to find the target document that matches the preset correlation with the target question text.

[0096] In this embodiment, the aforementioned medical knowledge base is a local medical knowledge base. The process of constructing the medical knowledge base includes: data collection: collecting medical-related documents from various sources (such as medical books, research papers, medical websites, government reports, etc.); cleaning and organizing: cleaning the collected documents, including removing irrelevant information and standardizing the format; and storage: storing the cleaned and organized documents in a local database to ensure fast access. The specific implementation process of retrieving target documents from the preset medical knowledge base that match the preset relevance to the target question text will be further described in detail in subsequent embodiments of this application and will not be elaborated upon here.
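
The collect, clean, and store sequence described above could be prototyped with Python's built-in `sqlite3` module, as in the sketch below. The table layout, the tag-stripping cleaning rule, and the names `clean_document` and `build_knowledge_base` are assumptions for illustration; this application does not specify the database or the cleaning rules.

```python
import re
import sqlite3
from typing import Iterable, Tuple


def clean_document(text: str) -> str:
    """Remove markup-like residue and normalize whitespace (assumed cleaning rules)."""
    text = re.sub(r"<[^>]+>", " ", text)      # strip HTML-like tags
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


def build_knowledge_base(docs: Iterable[Tuple[str, str]],
                         db_path: str = ":memory:") -> sqlite3.Connection:
    """Store cleaned (source, body) documents in a local SQLite database."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, source TEXT NOT NULL, body TEXT NOT NULL)"
    )
    cleaned = [(source, clean_document(raw)) for source, raw in docs]
    # Skip documents that are empty after cleaning.
    conn.executemany(
        "INSERT INTO documents (source, body) VALUES (?, ?)",
        [row for row in cleaned if row[1]],
    )
    conn.commit()
    return conn
```

Passing a file path instead of `":memory:"` keeps the knowledge base on local disk, which matches the local-storage intent of the description.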

[0097] Step S204: Generate a first model input corresponding to the target question text and the target document, and extract information from the first model input using a preset target language model to extract target information related to the question text from the target document.

[0098] In this embodiment, the target question text and the target document are concatenated into a format that the target language model can understand, and the result is used as the first model input. The target language model then processes the first model input in an extractive manner, extracting the information related to the target question text. The extracted information is organized into a structured form (such as JSON, XML, etc.) to obtain the target information. The model construction process of the target language model will be described in further detail in subsequent embodiments and will not be elaborated upon here.

[0099] Step S205: Generate a corresponding second model input based on the target question text and the target information, and perform reasoning processing on the second model input through the target language model to obtain the corresponding answer text.

[0100] In this embodiment, the target question text and the target information are concatenated into a format that the target language model can understand, and the result is used as the second model input. The target language model is then used again, this time in a generative manner, to produce a coherent and accurate answer text based on the second model input.
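
Steps S204 and S205 together form an extract-then-reason flow, which can be sketched as below. The language model is represented by any callable that maps a prompt string to an output string. The prompt wording, the JSON key `facts`, and all function names are hypothetical and are not taken from this application.

```python
import json
from typing import Callable, Dict, List

LanguageModel = Callable[[str], str]  # prompt in, generated text out


def build_first_input(question: str, documents: List[str]) -> str:
    """Concatenate the question and target documents into the first model input."""
    joined = "\n---\n".join(documents)
    return ("Extract the information relevant to the question as JSON.\n"
            f"Question: {question}\nDocuments:\n{joined}")


def extract_target_info(model: LanguageModel, question: str,
                        documents: List[str]) -> Dict:
    """Stage 1: extraction, with the result organized into a structured form."""
    raw = model(build_first_input(question, documents))
    try:
        info = json.loads(raw)
    except json.JSONDecodeError:
        info = None
    if not isinstance(info, dict):
        # Fall back to wrapping the raw output when it is not a JSON object.
        info = {"facts": [raw.strip()]}
    return info


def build_second_input(question: str, info: Dict) -> str:
    """Concatenate the question and the target information into the second model input."""
    return ("Answer the question using only the extracted information.\n"
            f"Question: {question}\n"
            f"Information: {json.dumps(info, ensure_ascii=False)}")


def answer_question(model: LanguageModel, question: str,
                    documents: List[str]) -> str:
    """Stage 2: reasoning over the extracted information to produce the answer text."""
    info = extract_target_info(model, question, documents)
    return model(build_second_input(question, info))
```

The point of the split is that the second call sees only the question and the compact extracted information, not the full documents, so the generation step works from a much shorter input.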

[0101] Step S206: Optimize the answer text to obtain the corresponding target answer text.

[0102] In this embodiment, the specific implementation process of optimizing the answer text to obtain the corresponding target answer text will be further described in detail in subsequent specific embodiments of this application, and will not be elaborated on here.

[0103] Step S207: Return the target answer text to the user.

[0104] In this embodiment, the specific implementation process of returning the target answer text to the user will be described in more detail in subsequent specific embodiments of this application, and will not be elaborated on here.

[0105] This application first receives a question text input by a user; then preprocesses the question text to obtain a corresponding target question text; next, it queries a preset medical knowledge base to find target documents that match the target question text with a preset correlation; then, it generates a corresponding first model input based on the target question text and the target document, and extracts information from the first model input using a preset target language model to extract target information related to the question text from the target document; subsequently, it generates a corresponding second model input based on the target question text and the target information, and performs reasoning processing on the second model input using the target language model to obtain a corresponding answer text; further, it optimizes the answer text to obtain a corresponding target answer text; finally, it returns the target answer text to the user. This application constructs a local knowledge question answering system for the medical field based on a preset target language model. By using the local knowledge question answering system, it can accurately answer the question text input by the user, improving the processing efficiency of medical question answering and the accuracy of the generated target answer text.

[0106] In some alternative implementations, step S203 includes the following steps:

[0107] Call the pre-trained document vector representation model.

[0108] In this embodiment, a document vector representation model capable of converting documents into high-dimensional vectors can be constructed by selecting a subset of documents from a medical knowledge base as a training dataset and training a pre-selected initial model. The initial model is a Transformer-based model, such as BERT or RoBERTa.

[0109] The question text is converted into a corresponding question vector based on the vector representation model.

[0110] In this embodiment, the question text is input into the vector representation model, so that the question text is converted into a corresponding question vector by the vector representation model.

[0111] Calculate the similarity between the question vector and the vector of each document in the medical knowledge base.

[0112] In this embodiment, a similarity algorithm can be used to calculate the similarity between the question vector and the vector of each document in the medical knowledge base. The selection of the similarity algorithm is not specifically limited; for example, cosine similarity, Euclidean distance, etc., can be used.

[0113] All documents are sorted in descending order of similarity to obtain the corresponding sorted document list.

[0114] In this embodiment, all documents can be sorted in descending order of similarity, and the document with the highest similarity score is most relevant to the user's input question.

[0115] Select the target number of first documents from the document sorting list.

[0116] In this embodiment, the top m most relevant documents (doc_1, doc_2, ..., doc_m) are selected in order from the sorted document list as the first documents. The value of the target number m is not specifically limited and can be set according to actual business needs; for example, it can be set to 5.

[0117] The first document is processed to extract key content, resulting in the corresponding second document.

[0118] In this embodiment, by performing key content extraction processing on the first document, key paragraphs in the first document are extracted, thereby obtaining the aforementioned second document. This reduces the workload of data processing during subsequent information extraction, thereby improving the efficiency of information extraction.

[0119] The second document is used as the target document.

[0120] This application utilizes a pre-trained document vector representation model to convert the question text into a corresponding question vector. It then calculates the similarity between the question vector and the vector of each document in the medical knowledge base. Following this, all documents are sorted in descending order of similarity to obtain a sorted document list. A target number of documents are then sequentially selected from the sorted list. Key content extraction is performed on these first documents to obtain corresponding second documents. Finally, the second documents are used as the target documents. This application, based on the use of a document vector representation model and the vector similarity calculation method, performs document query processing between the medical knowledge base and the target question text. This enables rapid and accurate retrieval of the most relevant target documents from the medical knowledge base, improving query efficiency and ensuring the accuracy of the obtained target document data.
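
The similarity ranking and top-m selection described above can be sketched with cosine similarity, one of the algorithms named in this embodiment. The sketch assumes the question and document vectors have already been produced by the vector representation model, which is not shown; the function names and the dictionary layout of `doc_vectors` are illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_m_documents(question_vector: Sequence[float],
                    doc_vectors: Dict[str, Sequence[float]],
                    m: int = 5) -> List[Tuple[str, float]]:
    """Rank all documents by similarity (descending) and keep the first m."""
    scored = [(doc_id, cosine_similarity(question_vector, vec))
              for doc_id, vec in doc_vectors.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:m]
```

Scoring every document is linear in the size of the knowledge base; for large collections, an approximate nearest-neighbor index would typically replace the full scan.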

[0121] In some optional implementations of this embodiment, step S206 includes the following steps:

[0122] Redundancy is removed from the answer text to obtain the corresponding first answer text.

[0123] In this embodiment, irrelevant content, such as interjections and stop words, is removed from the answer text to remove redundancy and obtain the corresponding first answer text.

[0124] The first answer text is grammatically corrected to obtain the corresponding second answer text.

[0125] In this embodiment, the above-mentioned grammatical correction processing refers to correcting grammatical errors in the first answer text, thereby ensuring the grammatical accuracy of the obtained second answer text.

[0126] The second answer text is used as the target answer text.

[0127] This application removes redundancy from the answer text to obtain a first answer text; then, it corrects the grammar of the first answer text to obtain a second answer text; subsequently, the second answer text is used as the target answer text. After obtaining the corresponding answer text by reasoning over the second model input using the target language model, this application further removes redundancy from the answer text and corrects its grammar to obtain the corresponding target answer text, effectively ensuring the conciseness and accuracy of the target answer text. This facilitates a better user experience when the target answer text is subsequently returned to the user.
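
A rough sketch of the two optimization passes follows. It is only a stand-in: the interjection list is hypothetical, redundancy removal is limited to dropping interjections and repeated sentences, and the grammar pass merely capitalizes sentence starts and adds missing end punctuation, whereas a real system would use a proper grammar-correction component.

```python
import re

INTERJECTIONS = {"um", "uh", "well", "hmm"}  # hypothetical word list
_SENTENCE_SPLIT = re.compile(r"(?<=[.!?])\s+")


def remove_redundancy(answer: str) -> str:
    """Drop interjections and repeated sentences, keeping the first occurrence."""
    seen, kept = set(), []
    for sentence in _SENTENCE_SPLIT.split(answer.strip()):
        words = [w for w in sentence.split()
                 if w.strip(",.!?").lower() not in INTERJECTIONS]
        cleaned = " ".join(words)
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return " ".join(kept)


def correct_grammar(answer: str) -> str:
    """Placeholder grammar pass: capitalize sentences and ensure end punctuation."""
    fixed = []
    for sentence in _SENTENCE_SPLIT.split(answer.strip()):
        if not sentence:
            continue
        sentence = sentence[0].upper() + sentence[1:]
        if sentence[-1] not in ".!?":
            sentence += "."
        fixed.append(sentence)
    return " ".join(fixed)


def optimize_answer(answer: str) -> str:
    """First answer text (deduplicated), then second answer text (grammar-fixed)."""
    return correct_grammar(remove_redundancy(answer))
```

Stop-word removal is intentionally left out of the sketch, since stripping stop words from a generated answer would usually make it harder to read.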

[0128] In some alternative implementations, step S207 includes the following steps:

[0129] Obtain the target display format.

[0130] In this embodiment, the selection of the above-mentioned target display format is not specifically limited, and can be any suitable display format, such as HTML, Markdown, etc.

[0131] The target answer text is converted based on the target display format to obtain the converted target answer text.

[0132] In this embodiment, the converted target answer text is obtained by converting the target answer text into a format that matches the target display format.

[0133] Call the preset user interface.

[0134] In this embodiment, the aforementioned user interface is a pre-built interface for data interaction with the user.

[0135] The target answer text is displayed to the user based on the user interface.

[0136] In this embodiment, the target answer text is displayed to the user through the aforementioned user interface. This may include displaying it directly in the client or notifying the user via email, SMS, or other means.

[0137] This application obtains a target display format; then converts the target answer text based on the target display format to obtain a converted target answer text; subsequently, it calls a preset user interface; and finally, it displays the target answer text to the user based on the user interface. After optimizing the answer text to obtain the corresponding target answer text, this application intelligently converts the target answer text based on the target display format to obtain a converted target answer text, thus converting the target answer text into a format suitable for display. Subsequently, it displays the target answer text to the user based on the preset user interface, thereby enabling the user to comfortably view the target answer text, improving the user experience, and enhancing the intelligence of the target answer text display.
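A minimal sketch of the format conversion step, assuming the target answer text is plain text with blank lines between paragraphs. The function name `to_display_format` and the two supported format names are illustrative choices, not taken from the text; `html.escape` is from the Python standard library.

```python
import html

def to_display_format(answer, display_format):
    """Convert plain answer text into the requested display format."""
    paragraphs = [p.strip() for p in answer.split("\n\n") if p.strip()]
    if display_format == "html":
        # Escape special characters so the text cannot break the markup.
        return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    if display_format == "markdown":
        # Markdown paragraphs are separated by a blank line.
        return "\n\n".join(paragraphs)
    raise ValueError(f"unsupported display format: {display_format}")
```

The converted text would then be handed to the preset user interface for display.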

[0138] In some alternative implementations, step S202 includes the following steps:

[0139] Irrelevant information is removed from the question text to obtain the corresponding first question text.

[0140] In this embodiment, the aforementioned irrelevant information may specifically include irrelevant symbols.

[0141] Get the preset standard format.

[0142] In this embodiment, the aforementioned standard format can be a pre-set text standard format, such as a uniform lowercase format or a uniform uppercase format.

[0143] The first question text is converted based on the standard format to obtain the corresponding second question text.

[0144] In this embodiment, the first question text is converted into a format that matches the aforementioned standard format to obtain the corresponding second question text.

[0145] The second question text is used as the target question text.

[0146] This application obtains a first question text by removing irrelevant information from the question text; then, it acquires a preset standard format; subsequently, it performs format conversion processing on the first question text based on the standard format to obtain a second question text; finally, it uses the second question text as the target question text. This application achieves rapid and intelligent preprocessing of the question text by removing irrelevant information and performing format conversion based on a standard format, ensuring the accuracy of the obtained target question text.
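The preprocessing described above can be sketched as follows. Which symbols count as irrelevant is not specified in the text, so the character class below is an assumption, as is the function name `preprocess_question`.

```python
import re

def preprocess_question(question, standard_format="lower"):
    """Remove irrelevant symbols, then convert to the preset standard format."""
    # First question text: keep word characters, whitespace, basic punctuation.
    first = re.sub(r"[^\w\s?.,'-]", " ", question)
    first = re.sub(r"\s+", " ", first).strip()
    # Second question text: uniform lowercase or uniform uppercase.
    if standard_format == "lower":
        return first.lower()
    if standard_format == "upper":
        return first.upper()
    raise ValueError(f"unsupported standard format: {standard_format}")
```

Because `\w` matches Unicode word characters in Python 3, non-English question text passes through the symbol filter unchanged.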

[0147] In some optional implementations of this embodiment, before step S204, the electronic device may further perform the following steps:

[0148] Medical question-answer pairs are constructed based on a pre-defined large language model.

[0149] In this embodiment, the aforementioned large language model can specifically be a large language model such as ChatGPT. The process of constructing medical question-answer pairs includes: designing a series of medical-related question templates based on actual medical treatment needs, or directly using the ChatGPT model to automatically generate medical questions. Then, using a basic dataset or external knowledge base (such as online medical forums or question-answer communities) to answer the aforementioned medical questions, corresponding medical question-answer pairs are generated.
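The template-based branch of this construction process can be sketched as follows. The templates, the `answer_source` callable, and the function names are hypothetical; in practice the answers would come from a basic dataset or an external knowledge base, and the questions could instead be generated by a large language model.

```python
from itertools import product

# Hypothetical question templates; real ones would reflect actual clinical needs.
TEMPLATES = [
    "What are the common symptoms of {condition}?",
    "How is {condition} usually diagnosed?",
]

def generate_questions(conditions, templates=TEMPLATES):
    """Fill every template with every condition name."""
    return [t.format(condition=c) for c, t in product(conditions, templates)]

def build_qa_pairs(questions, answer_source):
    """Pair each question with an answer; skip questions with no answer."""
    pairs = []
    for question in questions:
        answer = answer_source(question)
        if answer:
            pairs.append({"question": question, "answer": answer})
    return pairs
```

The resulting list of dictionaries is the raw material that would then be reviewed and labeled.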

[0150] The medical question-and-answer pairs are labeled to obtain the corresponding medical sample data.

[0151] In this embodiment, the generated medical question-and-answer pairs can be manually reviewed and labeled by medical experts or trained data labelers to ensure their accuracy, relevance, and rationality, thereby obtaining the corresponding initial medical sample data. Furthermore, the labeled initial medical sample data is then formatted into a suitable format for model training to obtain the aforementioned medical sample data.

[0152] The medical sample data is divided into a training set and a test set.

[0153] In this embodiment, the medical sample data can be divided into a training set and a test set according to a preset division ratio. The specific value of the division ratio is not limited; for example, a 7:3 ratio can be used.
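A minimal sketch of the division step using the 7:3 example ratio. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the text only specifies a preset ratio.

```python
import random

def split_samples(samples, train_ratio=0.7, seed=0):
    """Shuffle a copy of the samples and cut it at the preset division ratio."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train_set, test_set = split_samples(range(10))
```

With ten samples and a ratio of 0.7, seven samples go to the training set and three to the test set.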

[0154] Call the preset language model.

[0155] In this embodiment, the language model can specifically be BERT, GPT, Transformer, or other language models. Based on the actual model building requirements, the learning rate, batch size, training epochs, and other hyperparameters of the language model are set. An optimizer (such as Adam or AdamW) and a scheduler (such as Warmup-Cosine Annealing) are also configured. Fine-tuning typically uses a smaller dataset, thus employing a smaller batch size and more training epochs. Furthermore, the model architecture can be adjusted as needed, such as adding task-specific layers (such as classification layers, decoding layers, etc.) to support medical question-answering tasks.
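The warmup plus cosine annealing scheduler mentioned above can be sketched as a pure function of the training step. The linear warmup shape, the parameter names, and the example values are assumptions; the text names the scheduler family but does not give its formula.

```python
import math

def warmup_cosine_lr(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay."""
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Cosine annealing from base_lr (progress 0) down to min_lr (progress 1).
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Example hyperparameters (illustrative values only).
schedule = [warmup_cosine_lr(s, total_steps=100, warmup_steps=10, base_lr=1e-4)
            for s in range(101)]
```

The rate rises during the first ten steps, peaks at `base_lr`, and decays to `min_lr` by the final step.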

[0156] Obtain the preset improved cross-entropy loss function.

[0157] In this embodiment, the specific implementation process of obtaining the preset improved cross-entropy loss function will be further described in detail in subsequent specific embodiments of this application, and will not be elaborated on here.

[0158] The language model is fine-tuned using the improved cross-entropy loss function and the training set to obtain the corresponding specified language model.

[0159] In this embodiment, the language model learns from manually labeled answers during training to achieve better generation results. This step uses the manually labeled answers as the standard, and the training objective is to increase the probability of the language model generating the sentence, or in other words, to maximize the log-likelihood function of the response under the language model. Specifically, the language model is fine-tuned using a training set and an improved cross-entropy loss function. During training, metrics such as loss value and accuracy are monitored, and parameters are adjusted or the model is optimized as needed. Early stopping techniques are employed to prevent overfitting, thereby training the specified language model.
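The early stopping technique mentioned above can be illustrated with a patience-based rule. This is one common variant, assumed here for illustration; the text does not state which stopping criterion is used, and the function name and parameters are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the epoch index at which training stops.

    Training stops once the monitored loss has failed to improve by more
    than min_delta for `patience` consecutive epochs; if that never
    happens, the last epoch index is returned.
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1
```

In a real training loop the same counter would be updated after each epoch, and the checkpoint with the best monitored loss would be kept.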

[0160] The performance of the specified language model is tested based on the test set.

[0161] In this embodiment, the test set is input into the specified language model for testing, and the model performance (question-answering accuracy and generation quality) of the fine-tuned specified language model is evaluated on the test set. If the obtained question-answering accuracy is greater than the preset accuracy threshold and the generation quality meets the preset quality standard, the specified language model is determined to have passed the performance test; otherwise, the specified language model is determined to have failed the performance test.
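The pass/fail decision can be sketched as below. How accuracy and generation quality are measured is not specified in the text, so this sketch assumes case-insensitive exact match for accuracy and an externally supplied per-answer quality score in [0, 1]; the threshold values are placeholders.

```python
def passes_performance_test(predictions, references, quality_scores,
                            accuracy_threshold=0.8, quality_threshold=0.7):
    """True if accuracy exceeds its threshold and mean quality meets its standard."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    if not quality_scores:
        raise ValueError("quality_scores must be non-empty")
    correct = sum(p.strip().lower() == r.strip().lower()
                  for p, r in zip(predictions, references))
    accuracy = correct / len(references)
    mean_quality = sum(quality_scores) / len(quality_scores)
    return accuracy > accuracy_threshold and mean_quality >= quality_threshold
```

Note that the accuracy comparison is strict ("greater than the preset accuracy threshold"), following the wording of the paragraph above.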

[0162] If the specified language model passes the performance test, then the specified language model will be used as the target language model.

[0163] In this embodiment, if the specified language model passes the performance test, it is determined that the predictive ability of the specified language model meets the construction requirements, and thus the specified language model is used as the target language model. Subsequently, the trained target language model is deployed to the production environment for use, and the model performance of the target language model is monitored in real time, user feedback is collected, and continuous optimization is carried out.

[0164] This application constructs medical question-answer pairs based on a pre-defined large language model; then, it annotates the medical question-answer pairs to obtain corresponding medical sample data, and divides the medical sample data into training and testing sets; subsequently, it calls the pre-defined language model and obtains a pre-defined improved cross-entropy loss function; subsequently, it uses the improved cross-entropy loss function and the training set to fine-tune the language model to obtain a corresponding specified language model; further, it performs performance testing on the specified language model based on the testing set; if the specified language model passes the performance test, it is used as the target language model. This application constructs medical question-answer pairs based on a pre-defined large language model, annotates the medical question-answer pairs to obtain corresponding medical sample data, and then uses the improved cross-entropy loss function and the medical sample data to fine-tune and test the pre-defined language model, thereby achieving rapid and intelligent construction of a target language model that meets the requirements, improving the construction efficiency of the target language model, and ensuring the model effect of the obtained target language model. In addition, by fine-tuning the language model using an improved cross-entropy loss function, the overfitting phenomenon of the language model can be alleviated during training, the training effect of the language model can be enhanced, and thus the performance of the language model in question answering tasks can be improved.

[0165] In some optional implementations of this embodiment, obtaining the preset improved cross-entropy loss function includes the following steps:

[0166] Obtain the cross-entropy loss function.

[0167] In this embodiment, the cross-entropy loss function is specifically: $L = -\sum_i t_i \log(p_i)$, where $L$ is the cross-entropy loss function; $t_i$ takes the value 0 or 1, with 1 indicating that the actual label is equal to $i$; and $p_i$ is the probability that the model predicts for label $i$.

[0168] Obtain the preset weight adjustment strategy.

[0169] In this embodiment, the cross-entropy loss function assigns the same weight to every token. This can easily lead to the following situation: the model quickly fits the predictions of simple tokens, rapidly driving the loss to a very low level and becoming trapped in a local optimum, while its predictions for difficult tokens remain poor. Therefore, this application proposes the following improvement. The weight adjustment strategy is as follows: for each predicted token $i$, let the probability distribution predicted by the model be $P_i$, and calculate the entropy of that distribution: $E_i = \mathrm{Entropy}(P_i)$. A distribution with higher entropy has greater uncertainty, indicating that the model lacks confidence in its prediction, so the loss for this token is given greater weight: $\beta_i = a \cdot E_i + b$, where $a$ is a positive number that controls the degree of influence of entropy on the weight, and $b$ is a constant that ensures the weight is not zero even when the entropy is very small.
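The weight adjustment strategy can be illustrated numerically. This sketch assumes Shannon entropy with the natural logarithm and uses illustrative values for the positive factor `a` and the constant `b`; the function names are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def token_weight(probs, a=1.0, b=1.0):
    """Weight for one token: entropy scaled by a, plus the constant b."""
    return a * entropy(probs) + b

confident = [0.98, 0.01, 0.01]   # low entropy: the model is sure
uncertain = [0.34, 0.33, 0.33]   # high entropy: the model is unsure
w_confident = token_weight(confident)
w_uncertain = token_weight(uncertain)
```

The uncertain distribution receives the larger weight, and a fully confident one-hot distribution still receives weight `b` rather than zero.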

[0170] The cross-entropy loss function is adjusted based on the weight adjustment strategy to obtain the adjusted cross-entropy loss function.

[0171] In this embodiment, the cross-entropy loss function can be adjusted according to the aforementioned weight adjustment strategy to obtain the adjusted cross-entropy loss function. Specifically, the adjusted cross-entropy loss function, i.e., the improved cross-entropy loss function, is: $L = -\sum_i \beta_i \, t_i \log(p_i)$, where $L$ is the improved cross-entropy loss function; $\beta_i$ is the weight of each predicted token; $t_i$ takes the value 0 or 1, with 1 indicating that the actual label is equal to $i$; and $p_i$ is the probability that the model predicts for label $i$.
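A minimal pure-Python sketch of the entropy-weighted loss over a short token sequence. Assumptions: the usual negative log-likelihood sign convention, a one-hot target (so the inner sum reduces to the log-probability of the true label), natural-log Shannon entropy, and a small clamp `eps` to avoid `log(0)`. The names `improved_cross_entropy`, `a`, `b`, and `eps` are illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def improved_cross_entropy(pred_dists, target_ids, a=1.0, b=1.0, eps=1e-12):
    """Sum over tokens of beta * (-log p_target), with beta = a * entropy + b."""
    total = 0.0
    for probs, target in zip(pred_dists, target_ids):
        beta = a * entropy(probs) + b          # per-token weight
        total += -beta * math.log(max(probs[target], eps))
    return total

dists = [[0.5, 0.5], [0.9, 0.1]]   # predicted distributions for two tokens
targets = [0, 0]                   # index of the true label for each token
plain = improved_cross_entropy(dists, targets, a=0.0, b=1.0)
weighted = improved_cross_entropy(dists, targets, a=1.0, b=1.0)
```

Setting `a=0` and `b=1` recovers the unweighted cross-entropy, which makes the effect of the weighting easy to compare. Whether the weight is treated as a constant during backpropagation is not stated in the text and would need to be decided in a framework implementation.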

[0172] The adjusted cross-entropy loss function is used as the improved cross-entropy loss function.

[0173] In this embodiment, an improved cross-entropy loss function is introduced, which weights the loss of each token according to its prediction uncertainty, in order to improve the model's prediction ability on difficult tokens.

[0174] This application obtains a cross-entropy loss function; then obtains a preset weight adjustment strategy; subsequently, it adjusts the cross-entropy loss function based on the weight adjustment strategy to obtain an adjusted cross-entropy loss function; and finally, it uses the adjusted cross-entropy loss function as the improved cross-entropy loss function. This application achieves intelligent generation of the required improved cross-entropy loss function by adjusting the cross-entropy loss function using a preset weight adjustment strategy. This allows for the subsequent fine-tuning of the language model using the improved cross-entropy loss function and the training set, mitigating overfitting during language model training, enhancing the training effect, and ultimately improving the language model's performance in question-answering tasks.

[0175] It should be understood that the sequence number of each step in the above embodiments does not imply the order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.

[0176] It should be emphasized that, to further ensure the privacy and security of the aforementioned target answer text, the target answer text can also be stored in a node of a blockchain.

[0177] The blockchain referred to in this application is a novel application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. Essentially, a blockchain is a decentralized database, a chain of data blocks linked together using cryptographic methods. Each data block contains information about a batch of network transactions, used to verify the validity of the information (anti-counterfeiting) and generate the next block. A blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer.
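The "chain of data blocks linked together using cryptographic methods" can be illustrated with a toy hash chain that stores answer texts. This is only a single-process sketch of the linking idea; it has no consensus mechanism, peer-to-peer transmission, or encryption, and the block layout and function names are assumptions.

```python
import hashlib
import json

def _digest(payload):
    """SHA-256 over a canonical JSON encoding of the block payload."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()

def make_block(index, answer_text, prev_hash):
    """Create a block whose hash covers its content and the previous hash."""
    payload = {"index": index, "answer": answer_text, "prev_hash": prev_hash}
    return {**payload, "hash": _digest(payload)}

def verify_chain(chain):
    """Check every block's own hash and its link to the preceding block."""
    for i, block in enumerate(chain):
        payload = {k: block[k] for k in ("index", "answer", "prev_hash")}
        if block["hash"] != _digest(payload):
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = [make_block(0, "genesis", "0" * 64)]
chain.append(make_block(1, "Target answer text", chain[0]["hash"]))
```

Changing the stored answer in any block invalidates that block's hash, which is what allows later tampering to be detected.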

[0178] The embodiments of this application can acquire and process relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technology, and application system that uses digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.

[0179] Foundational technologies for artificial intelligence generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating / interactive systems, and mechatronics. AI software technologies mainly encompass computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning / deep learning.

[0180] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions that direct the relevant hardware. These computer-readable instructions can be stored in a computer-readable storage medium and, when executed, can include the processes of the method embodiments described above. The aforementioned storage medium can be a non-volatile storage medium such as a magnetic disk, optical disk, or read-only memory (ROM), or it can be random access memory (RAM).

[0181] It should be understood that although the steps in the flowcharts of the accompanying figures are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the accompanying figures may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times, and their execution order is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.

[0182] With further reference to Figure 3, as an implementation of the method shown in Figure 2 above, this application provides an embodiment of an artificial intelligence-based problem-solving device. This device embodiment corresponds to the method embodiment shown in Figure 2, and the device can be applied to various electronic devices.

[0183] As shown in Figure 3, the artificial intelligence-based problem-solving device 300 described in this embodiment includes: a receiving module 301, a preprocessing module 302, a query module 303, an extraction module 304, a reasoning module 305, an optimization module 306, and a return module 307. Wherein:

[0184] The receiving module 301 is used to receive the question text input by the user;

[0185] Preprocessing module 302 is used to preprocess the question text to obtain the corresponding target question text;

[0186] The query module 303 is used to query, from a preset medical knowledge base, target documents that have a preset relevance relationship with the target question text;

[0187] The extraction module 304 is used to generate a corresponding first model input based on the target question text and the target document, and to extract information from the first model input using a preset target language model, so as to extract target information related to the question text from the target document;

[0188] The reasoning module 305 is used to generate a corresponding second model input based on the target question text and the target information, and to perform reasoning processing on the second model input through the target language model to obtain the corresponding answer text.

[0189] Optimization module 306 is used to optimize the answer text to obtain the corresponding target answer text;

[0190] The return module 307 is used to return the target answer text to the user.

[0191] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the artificial intelligence-based problem-solving method in the aforementioned implementation method, and will not be repeated here.
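The way the modules hand data to one another, in particular the two separate model inputs, can be sketched as follows. The prompt wording, the function names, and the `retrieve` and `language_model` callables are all hypothetical stand-ins; preprocessing and optimization are reduced to whitespace trimming to keep the sketch short.

```python
from typing import Callable, List

def build_extraction_input(question: str, documents: List[str]) -> str:
    """First model input: target question text plus the target documents."""
    joined = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(documents))
    return (f"Documents:\n{joined}\n\nQuestion: {question}\n"
            "Extract the information relevant to the question.")

def build_reasoning_input(question: str, target_info: str) -> str:
    """Second model input: target question text plus the extracted information."""
    return (f"Relevant information:\n{target_info}\n\nQuestion: {question}\n"
            "Answer concisely.")

def answer_question(question: str,
                    retrieve: Callable[[str], List[str]],
                    language_model: Callable[[str], str]) -> str:
    target_question = question.strip()                    # preprocessing (302)
    documents = retrieve(target_question)                 # query (303)
    target_info = language_model(                         # extraction (304)
        build_extraction_input(target_question, documents))
    answer = language_model(                              # reasoning (305)
        build_reasoning_input(target_question, target_info))
    return answer.strip()                                 # optimization (306)
```

The same language model is called twice: the first call only sees the retrieved documents and extracts what matters, and the second call reasons over that shorter extract rather than over the full documents.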

[0192] In some optional implementations of this embodiment, the query module 303 includes:

[0193] The first calling submodule is used to call the pre-trained document vector representation model;

[0194] The first conversion submodule is used to convert the question text into a corresponding question vector based on the vector representation model.

[0195] The calculation submodule is used to calculate the similarity between the question vector and the vector of each document in the medical knowledge base;

[0196] The sorting submodule is used to sort all the documents in descending order of similarity to obtain the corresponding document sorting list;

[0197] The filtering submodule is used to sequentially filter out a target number of first documents from the document sorting list;

[0198] The extraction submodule is used to extract key content from the first document to obtain the corresponding second document;

[0199] The first determining submodule is used to select the second document as the target document.

[0200] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the artificial intelligence-based problem-solving method in the aforementioned implementation method, and will not be repeated here.

[0201] In some optional implementations of this embodiment, the optimization module 306 includes:

[0202] The first processing submodule is used to remove redundancy from the answer text to obtain the corresponding first answer text.

[0203] The second processing submodule is used to perform grammatical correction processing on the first answer text to obtain the corresponding second answer text.

[0204] The second determining submodule is used to use the second answer text as the target answer text.

[0205] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the problem-solving method based on artificial intelligence in the aforementioned implementation method, and will not be repeated here.

[0206] In some optional implementations of this embodiment, the return module 307 includes:

[0207] The first acquisition submodule is used to acquire the target display format;

[0208] The second conversion submodule is used to convert the target answer text based on the target display format to obtain the converted target answer text.

[0209] The second calling submodule is used to call the preset user interface;

[0210] The display submodule is used to display the target answer text to the user based on the user interface.

[0211] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the artificial intelligence-based problem-solving method in the aforementioned implementation method, and will not be repeated here.

[0212] In some optional implementations of this embodiment, the preprocessing module 302 includes:

[0213] The third processing submodule is used to remove irrelevant information from the question text to obtain the corresponding first question text.

[0214] The second acquisition submodule is used to acquire preset standard formats;

[0215] The third conversion submodule is used to perform format conversion processing on the first question text based on the standard format to obtain the corresponding second question text.

[0216] The third determining submodule is used to use the second question text as the target question text.

[0217] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the artificial intelligence-based problem-solving method in the aforementioned implementation method, and will not be repeated here.

[0218] In some optional implementations of this embodiment, the artificial intelligence-based problem-solving device further includes:

[0219] The building module is used to construct medical question-answer pairs based on a pre-defined large language model;

[0220] The annotation module is used to annotate the medical question-and-answer pairs to obtain corresponding medical sample data.

[0221] A partitioning module is used to divide the medical sample data into a training set and a test set;

[0222] The calling module is used to invoke the preset language model;

[0223] The acquisition module is used to acquire a preset improved cross-entropy loss function;

[0224] The fine-tuning module is used to fine-tune the language model using the improved cross-entropy loss function and the training set to obtain the corresponding specified language model.

[0225] The testing module is used to perform performance testing on the specified language model based on the test set.

[0226] The determination module is used to select the specified language model as the target language model if the specified language model passes the performance test.

[0227] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the artificial intelligence-based problem-solving method in the aforementioned implementation method, and will not be repeated here.

[0228] In some optional implementations of this embodiment, the acquisition module includes:

[0229] The third acquisition submodule is used to obtain the cross-entropy loss function;

[0230] The fourth acquisition submodule is used to acquire the preset weight adjustment strategy;

[0231] The adjustment submodule is used to adjust the cross-entropy loss function based on the weight adjustment strategy to obtain the adjusted cross-entropy loss function.

[0232] The fourth determining submodule is used to use the adjusted cross-entropy loss function as the improved cross-entropy loss function.

[0233] In this embodiment, the operations performed by the above modules or units correspond one-to-one with the steps of the artificial intelligence-based problem-solving method in the aforementioned implementation method, and will not be repeated here.

[0234] To address the aforementioned technical problems, embodiments of this application also provide a computer device. Please refer to Figure 4, which is a basic structural block diagram of the computer device in this embodiment.

[0235] The computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are interconnected via a system bus. It should be noted that the figure shows only the computer device 4 with components 41-43; however, it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device described here is a device capable of automatically performing numerical calculations and / or information processing according to pre-set or stored instructions. Its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.

[0236] The computer device can be a desktop computer, laptop, handheld computer, or cloud server, etc. The computer device can interact with the user via a keyboard, mouse, remote control, touchpad, or voice control.

[0237] The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as the hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card, or flash card fitted to the computer device 4. Of course, the memory 41 may also include both the internal storage unit and the external storage device of the computer device 4. In this embodiment, the memory 41 is typically used to store the operating system and various application software installed on the computer device 4, such as the computer-readable instructions of the artificial intelligence-based problem-solving method. In addition, the memory 41 can also be used to temporarily store various types of data that have been output or will be output.

[0238] In some embodiments, the processor 42 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or other data processing chip. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is used to execute computer-readable instructions stored in the memory 41 or to process data, for example, to execute computer-readable instructions of the artificial intelligence-based problem-solving method.

[0239] The network interface 43 may include a wireless network interface or a wired network interface, which is typically used to establish communication connections between the computer device 4 and other electronic devices.

[0240] Compared with the prior art, the embodiments of this application have the following main advantages:

[0241] In this embodiment, the system first receives a question text input by a user and preprocesses it to obtain a corresponding target question text. It then queries a preset medical knowledge base for target documents that have a preset relevance relationship with the target question text. Next, a first model input is generated based on the target question text and the target document, and information extraction is performed on the first model input using a preset target language model to extract target information related to the question text from the target document. A second model input is then generated based on the target question text and the target information, and reasoning processing is performed on the second model input using the target language model to obtain a corresponding answer text. The answer text is further optimized to obtain a corresponding target answer text, which is finally returned to the user. This application constructs a local knowledge question answering system for the medical field based on a preset target language model. With this system, the question text input by a user can be answered accurately, improving the processing efficiency of medical question answering and the accuracy of the generated target answer text.

[0242] This application also provides another embodiment, namely, providing a computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor to cause the at least one processor to perform the steps of the artificial intelligence-based problem-solving method described above.

[0243] Compared with the prior art, the embodiments of this application have the following main advantages:

[0244] In this embodiment, the system first receives a question text input by a user and preprocesses it to obtain a corresponding target question text. It then queries a preset medical knowledge base for target documents that have a preset relevance relationship with the target question text. Next, a first model input is generated based on the target question text and the target document, and information extraction is performed on the first model input using a preset target language model to extract target information related to the question text from the target document. A second model input is then generated based on the target question text and the target information, and reasoning processing is performed on the second model input using the target language model to obtain a corresponding answer text. The answer text is further optimized to obtain a corresponding target answer text, which is finally returned to the user. This application constructs a local knowledge question answering system for the medical field based on a preset target language model. With this system, the question text input by a user can be answered accurately, improving the processing efficiency of medical question answering and the accuracy of the generated target answer text.

[0245] Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus necessary general-purpose hardware platforms. Of course, they can also be implemented by hardware, but in many cases the former is a better implementation method. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. This computer software product is stored in a storage medium (such as ROM / RAM, magnetic disk, optical disk), and includes several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, or network device, etc.) to execute the methods described in the various embodiments of this application.

[0246] Obviously, the embodiments described above are only some of the embodiments of this application, not all of them. The accompanying drawings show preferred embodiments of this application, but they do not limit its patent scope. This application can be implemented in many different forms; these embodiments are provided so that the disclosure of this application can be understood more thoroughly and comprehensively. Although this application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific embodiments, or make equivalent substitutions for some of their technical features. Any equivalent structure made using the content of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims

1. A problem-solving method based on artificial intelligence, characterized in that it includes the following steps: receiving a question text input by a user; preprocessing the question text to obtain a corresponding target question text; retrieving, from a preset medical knowledge base, target documents that match the target question text with a preset relevance; generating a corresponding first model input based on the target question text and the target document, and performing information extraction on the first model input using a preset target language model, so as to extract target information related to the question text from the target document; generating a corresponding second model input based on the target question text and the target information, and performing reasoning processing on the second model input using the target language model to obtain a corresponding answer text; optimizing the answer text to obtain a corresponding target answer text; and returning the target answer text to the user; wherein, prior to the step of generating a corresponding first model input based on the target question text and the target document, and performing information extraction on the first model input using a preset target language model so as to extract target information related to the question text from the target document, the method further includes: constructing medical question-answer pairs based on a preset large language model; labeling the medical question-answer pairs to obtain corresponding medical sample data; dividing the medical sample data into a training set and a test set; invoking a preset language model; obtaining a cross-entropy loss function; obtaining a preset weight adjustment strategy; adjusting the cross-entropy loss function based on the weight adjustment strategy to obtain an adjusted cross-entropy loss function; using the adjusted cross-entropy loss function as an improved cross-entropy loss function; fine-tuning the language model using the improved cross-entropy loss function and the training set to obtain a corresponding specified language model; performing a performance test on the specified language model based on the test set; and, if the specified language model passes the performance test, using the specified language model as the target language model; wherein the weight adjustment strategy includes: for each predicted token, calculating the entropy of the probability distribution predicted by the model, where the greater the entropy, the greater the weight assigned to the loss of that token.
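
As a non-limiting illustration, the weight adjustment strategy recited above can be sketched in plain Python. The claim states only that a larger predictive entropy leads to a larger weight on that token's loss; the specific weight formula `1 + alpha * H / log(V)`, the parameter `alpha`, the normalization by the sum of weights, and all function names below are assumptions of this sketch. The sketch also assumes a vocabulary of at least two tokens.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_weighted_cross_entropy(
    token_logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    alpha: float = 1.0,
) -> float:
    """Cross-entropy in which high-entropy (uncertain) tokens weigh more.

    Assumed weight per token: 1 + alpha * H / log(V), where H is the entropy
    of the predicted distribution and V is the vocabulary size.
    """
    if not token_logits or len(token_logits) != len(targets):
        raise ValueError("token_logits and targets must be non-empty and equal in length")

    weighted_sum = 0.0
    weight_total = 0.0
    for logits, target in zip(token_logits, targets):
        probs = softmax(logits)
        # Standard per-token cross-entropy, clamped to avoid log(0).
        ce = -math.log(max(probs[target], 1e-12))
        # Entropy normalized to [0, 1] by its maximum value log(V).
        normalized_entropy = entropy(probs) / math.log(len(logits))
        weight = 1.0 + alpha * normalized_entropy
        weighted_sum += weight * ce
        weight_total += weight
    # Weighted mean, so alpha = 0 reduces to the ordinary mean cross-entropy.
    return weighted_sum / weight_total
```

With `alpha = 0` every token receives weight 1 and the function returns the ordinary mean cross-entropy, so `alpha` controls how strongly uncertain tokens are emphasized. In a gradient-based fine-tuning framework the weights would typically be treated as constants during backpropagation; that choice is likewise an assumption and is not stated in the claim.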

2. The problem-solving method based on artificial intelligence according to claim 1, characterized in that the step of retrieving, from a preset medical knowledge base, target documents that match the target question text with a preset relevance specifically includes: calling a pre-trained document vector representation model; converting the target question text into a corresponding question vector based on the document vector representation model; calculating the similarity between the question vector and the vector of each document in the medical knowledge base; sorting all the documents in descending order of similarity to obtain a corresponding sorted document list; selecting a target number of first documents from the sorted document list; performing key content extraction on the first documents to obtain corresponding second documents; and using the second documents as the target documents.
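
As a non-limiting illustration, the similarity calculation, descending sort, and selection steps of this claim can be sketched as follows. The claim does not name a similarity measure; cosine similarity is assumed here, as is taking the selected documents from the head of the sorted list. The vectors are assumed to come from the document vector representation model, which is not shown, and the key content extraction step is likewise omitted. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_first_documents(
    question_vector: Sequence[float],
    document_vectors: Sequence[Sequence[float]],
    target_number: int,
) -> List[Tuple[int, float]]:
    """Return (document index, similarity) pairs for the most similar documents."""
    scored = [
        (index, cosine_similarity(question_vector, vector))
        for index, vector in enumerate(document_vectors)
    ]
    # Sort in descending order of similarity to form the sorted document list.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Keep the target number of documents from the head of the list.
    return scored[:target_number]
```

For a knowledge base of realistic size, an approximate nearest-neighbor index would usually replace the exhaustive scan shown here; the scan is kept only because it mirrors the claim wording step by step.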

3. The problem-solving method based on artificial intelligence according to claim 1, characterized in that the step of optimizing the answer text to obtain a corresponding target answer text specifically includes: performing redundancy removal on the answer text to obtain a corresponding first answer text; performing grammatical correction on the first answer text to obtain a corresponding second answer text; and using the second answer text as the target answer text.
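
As a non-limiting illustration, the two optimization steps of this claim can be sketched as follows. The claim does not say how redundancy is detected or how grammar is corrected. Here redundancy removal is assumed to mean dropping repeated sentences, and `correct_grammar` is only a placeholder that normalizes whitespace, capitalization, and final punctuation; a real system would presumably call a dedicated grammar correction component. All names are hypothetical.

```python
import re
from typing import List, Set

# Split after sentence-ending punctuation followed by whitespace.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def remove_redundancy(answer_text: str) -> str:
    """Drop sentences that repeat an earlier sentence (ignoring case and punctuation)."""
    sentences = [s.strip() for s in _SENTENCE_BREAK.split(answer_text) if s.strip()]
    seen: Set[str] = set()
    kept: List[str] = []
    for sentence in sentences:
        key = re.sub(r"\W+", " ", sentence).strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(sentence)
    return " ".join(kept)


def correct_grammar(answer_text: str) -> str:
    """Placeholder correction: tidy whitespace, capitalization, and final punctuation."""
    text = re.sub(r"\s+", " ", answer_text).strip()
    sentences = [s for s in _SENTENCE_BREAK.split(text) if s]
    result = " ".join(s[0].upper() + s[1:] for s in sentences)
    if result and result[-1] not in ".!?":
        result += "."
    return result


def optimize_answer(answer_text: str) -> str:
    """First answer text: redundancy removed. Second answer text: grammar corrected."""
    first_answer_text = remove_redundancy(answer_text)
    second_answer_text = correct_grammar(first_answer_text)
    return second_answer_text
```

The order matches the claim: redundancy is removed first, so the correction step only processes text that will actually be returned.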

4. The problem-solving method based on artificial intelligence according to claim 1, characterized in that the step of returning the target answer text to the user specifically includes: obtaining a target display format; converting the target answer text based on the target display format to obtain a converted target answer text; calling a preset user interface; and displaying the target answer text to the user based on the user interface.

5. The problem-solving method based on artificial intelligence according to claim 1, characterized in that the step of preprocessing the question text to obtain a corresponding target question text specifically includes: removing irrelevant information from the question text to obtain a corresponding first question text; obtaining a preset standard format; converting the first question text based on the standard format to obtain a corresponding second question text; and using the second question text as the target question text.
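
As a non-limiting illustration, the preprocessing steps of this claim can be sketched as follows. The claim does not define what counts as irrelevant information or what the standard format is. This sketch assumes that markup tags, extra whitespace, and leading greeting words are irrelevant, and that the standard format is a single capitalized sentence ending in one question mark. The filler word list and all names are hypothetical.

```python
import re

# Hypothetical list of leading filler words treated as irrelevant information.
_FILLER = re.compile(r"^(hi|hello|hey|please|excuse me)[\s,!.]+", re.IGNORECASE)


def remove_irrelevant(question_text: str) -> str:
    """Strip markup tags, collapse whitespace, and drop leading filler words."""
    text = re.sub(r"<[^>]+>", " ", question_text)
    text = re.sub(r"\s+", " ", text).strip()
    previous = None
    # Repeat until stable so that several filler words in a row are all removed.
    while previous != text:
        previous = text
        text = _FILLER.sub("", text)
    return text


def to_standard_format(question_text: str) -> str:
    """Assumed standard format: capitalized, ending in exactly one question mark."""
    text = question_text.rstrip(" ?!.")
    if not text:
        return ""
    return text[0].upper() + text[1:] + "?"


def preprocess_question(question_text: str) -> str:
    """First question text: irrelevant information removed. Second: standardized."""
    first_question_text = remove_irrelevant(question_text)
    second_question_text = to_standard_format(first_question_text)
    return second_question_text
```

For example, `preprocess_question("Hello,  please <b>what</b> causes  anemia ??")` returns `"What causes anemia?"` under these assumptions.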

6. A problem-solving device based on artificial intelligence, characterized in that it includes: a receiving module, used to receive a question text input by a user; a preprocessing module, used to preprocess the question text to obtain a corresponding target question text; a query module, used to retrieve, from a preset medical knowledge base, target documents that match the target question text with a preset relevance; an extraction module, used to generate a corresponding first model input based on the target question text and the target document, and to perform information extraction on the first model input using a preset target language model, so as to extract target information related to the question text from the target document; a reasoning module, used to generate a corresponding second model input based on the target question text and the target information, and to perform reasoning processing on the second model input using the target language model to obtain a corresponding answer text; an optimization module, used to optimize the answer text to obtain a corresponding target answer text; and a return module, used to return the target answer text to the user; wherein the problem-solving device based on artificial intelligence further includes: a building module, used to construct medical question-answer pairs based on a preset large language model; an annotation module, used to annotate the medical question-answer pairs to obtain corresponding medical sample data; a partitioning module, used to divide the medical sample data into a training set and a test set; a calling module, used to invoke a preset language model; an acquisition module, used to acquire a preset improved cross-entropy loss function; a fine-tuning module, used to fine-tune the language model using the improved cross-entropy loss function and the training set to obtain a corresponding specified language model; a testing module, used to perform a performance test on the specified language model based on the test set; and a determination module, used to take the specified language model as the target language model if the specified language model passes the performance test; wherein the acquisition module includes: a third acquisition submodule, used to obtain a cross-entropy loss function; a fourth acquisition submodule, used to acquire a preset weight adjustment strategy; an adjustment submodule, used to adjust the cross-entropy loss function based on the weight adjustment strategy to obtain an adjusted cross-entropy loss function; and a fourth determining submodule, used to take the adjusted cross-entropy loss function as the improved cross-entropy loss function; wherein the weight adjustment strategy includes: for each predicted token, calculating the entropy of the probability distribution predicted by the model, where the greater the entropy, the greater the weight assigned to the loss of that token.

7. A computer device, characterized in that it includes a memory and a processor, wherein the memory stores computer-readable instructions, and the processor, when executing the computer-readable instructions, implements the steps of the problem-solving method based on artificial intelligence as described in any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the problem-solving method based on artificial intelligence as described in any one of claims 1 to 5.