Medical field dialogue method and device based on weak supervision and thought chain and readable medium
Training a small-scale large language model with weak supervision and thought chain methods addresses two problems that large language models face in task-oriented dialogue scenarios: the difficulty of generating personalized responses and the high parameter count. The result is efficient and accurate customer service dialogue.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- XIAMEN KUAISHANGTONG TECH CORP LTD
- Filing Date
- 2025-07-15
- Publication Date
- 2026-06-26
AI Technical Summary
Existing large language models cannot effectively generate targeted and personalized responses in open-domain task-oriented dialogue scenarios, and the large number of parameters in large models leads to low efficiency.
By employing a weakly supervised approach and a thought chain method, a small-scale large language model is trained by constructing thought chain prompts and first-order logic rules to generate customer service responses that meet business needs.
It enables the efficient generation of accurate customer service intent and responses with small amounts of data, reducing model costs and training volume, and improving the efficiency and accuracy of dialogue services.
[Figure: abstract drawing, CN120996180B_ABST]
Abstract
Description
Technical Field
[0001] This invention relates to dialogue systems in the medical field, and more specifically to a method, device, and readable medium for medical field dialogue based on weak supervision and thought chain.
Background Technology
[0002] Large language models are a type of self-supervised generative pre-trained language model (PLM). The core idea is to learn general language representations from massive amounts of text through self-supervised learning, and then fine-tune the model to adapt it to downstream tasks. Existing large models such as DeepSeek and Qwen are autoregressive generative models based on unidirectional Transformers. Because an autoregressive model models p(x) through the conditional probabilities $p(x_t \mid x_{<t})$, it predicts the next token and generates text step by step, which makes it well suited to question-answer dialogue.
[0003] However, existing technologies directly generate answers, which are responses to questions in the dialogue history. In open-domain, task-oriented dialogue scenarios, such as customer service or legal consultation, they cannot effectively provide responses that are meaningful to the task based on the existing dialogue environment. They are either too specific or fail to include specific personalized information, which cannot meet the needs of current practical applications.
[0004] Another issue is that the large models that perform well at thought-chain-based text understanding have a large number of parameters, which causes efficiency problems in applications. How to achieve the capabilities of large models with smaller models is therefore also an important question.
Summary of the Invention
[0005] The purpose of this application is to propose a dialogue method, device, and readable medium in the medical field based on weak supervision and thought chain to address the aforementioned technical problems.
[0006] In a first aspect, the present invention provides a dialogue method in the medical field based on weak supervision and thought chain, comprising the following steps:
[0007] Obtain a dialogue dataset containing several dialogue samples, each including visitor statements and customer service statements; construct a first thought chain prompt word to generate corresponding dialogue information summaries and customer service intentions based on the input dialogue samples; input each dialogue sample in the dialogue dataset and the first thought chain prompt word into a trained first large language model to obtain the dialogue information summary and customer service intention corresponding to each dialogue sample. The dialogue information summary includes the symptoms described by the visitor, the visitor's intention, and the visitor's purpose; construct a first training dataset based on the dialogue dataset and the dialogue information summary corresponding to each dialogue sample; construct a second training dataset based on the dialogue dataset and the customer service intention corresponding to each dialogue sample.
[0008] The dialogue information summaries in the first training dataset are passed through first-order logic rules to obtain first prompt words used to guide the generation of customer service intent. A third training dataset is constructed from the dialogue samples in the first training dataset and their corresponding first prompt words. The third training dataset is mixed with the second training dataset to obtain a mixed training dataset. The trained second language model is fine-tuned on the mixed training dataset using the DoRA training method to obtain the fine-tuned second language model.
[0009] Acquire the dialogue history to be replied to; input it into the fine-tuned second language model to obtain the corresponding predicted customer service intent; input the dialogue history to be replied to, together with a second thought chain prompt word used to generate the corresponding dialogue information summary, into the trained first large language model to obtain the dialogue information summary corresponding to the dialogue history to be replied to; pass that dialogue information summary and the predicted customer service intent through first-order logic rules to obtain a second prompt word used to guide reply generation; and input the dialogue history to be replied to and the second prompt word into the dialogue model to generate the corresponding customer service reply statement.
[0010] As a preferred approach, the first thought chain prompt word includes:
[0011] First, identify the symptoms described by the visitor and the visitor's intent based on the input dialogue sample;
[0012] Identify visitor purpose based on the symptoms described by the visitor, visitor intent, and input dialogue samples;
[0013] Based on the symptoms described by the visitor, the visitor's intent, the visitor's purpose, and the input dialogue sample, infer the dialogue logic that meets the business requirements, and map at least one customer service intent based on the dialogue logic.
[0014] As a preferred approach, passing the dialogue information summaries in the first training dataset through first-order logic rules to obtain the first prompt words used to guide the generation of customer service intent specifically includes:
[0015] At least one of the symptoms, intentions, and purposes described by the visitor in the dialogue information summary of the first training dataset is connected through corresponding logical gates to obtain several first thought chain statements. All the first thought chain statements are combined to obtain the first prompt word.
[0016] As a preferred approach, passing the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent through first-order logic rules to obtain the second prompt word used to guide reply generation specifically includes:
[0017] Connect at least one of the following from the summary of dialogue information corresponding to the dialogue history to be replied to: the visitor's described symptoms, visitor intent, visitor purpose, and predicted customer service intent. This will result in several second thought chain statements. Combining all the second thought chain statements will yield the second prompt word.
[0018] Preferably, during the fine-tuning process of the trained second language model, the dialogue samples and their corresponding first prompt words in the mixed training dataset are input into the trained second language model, or the dialogue samples in the mixed training dataset are input into the trained second language model to obtain the predicted customer service intent corresponding to each dialogue sample.
[0019] As preferred, the first trained language model includes the Qwen2.5-72B model, and the second trained language model includes the Qwen2.5-0.5B-instruct model.
[0020] In a second aspect, the present invention provides a medical field dialogue device based on weak supervision and thought chain, comprising:
[0021] The dataset construction module is configured to acquire a dialogue dataset containing several dialogue samples, each including visitor statements and customer service statements; construct a first thought chain prompt word to generate corresponding dialogue information summaries and customer service intentions based on the input dialogue samples; input each dialogue sample in the dialogue dataset and the first thought chain prompt word into a trained first large language model to obtain the dialogue information summary and customer service intention corresponding to each dialogue sample. The dialogue information summary includes the symptoms described by the visitor, the visitor's intention, and the visitor's purpose; construct a first training dataset based on the dialogue dataset and the dialogue information summary corresponding to each dialogue sample; and construct a second training dataset based on the dialogue dataset and the customer service intention corresponding to each dialogue sample.
[0022] The fine-tuning module is configured to summarize the dialogue information in the first training dataset and obtain the first prompt word for guiding customer service intent generation through first-order logic rules; construct a third training dataset based on the dialogue samples in the first training dataset and their corresponding first prompt words; mix the third training dataset with the second training dataset to obtain a mixed training dataset; and fine-tune the trained second language model using the mixed training dataset and the DoRA training method to obtain the fine-tuned second language model.
[0023] The response module is configured to: acquire the dialogue history to be responded to; input the dialogue history to be responded to into a finely tuned second language model to obtain the corresponding predicted customer service intent; input the dialogue history to be responded to and the second thought chain prompt words used to generate the corresponding dialogue information summary into a trained first language model to obtain the dialogue information summary corresponding to the dialogue history to be responded to; use the dialogue information summary corresponding to the dialogue history to be responded to and the predicted customer service intent through first-order logic rules to obtain the second prompt words used to guide the generation of the response; input the dialogue history to be responded to and the second prompt words into the dialogue model to generate the corresponding customer service response statement.
[0024] In a third aspect, the present invention provides an electronic device including one or more processors and a storage device for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect.
[0025] In a fourth aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, implements the method described in any implementation of the first aspect.
[0026] In a fifth aspect, the present invention provides a computer program product including a computer program that, when executed by a processor, implements the method described in any implementation of the first aspect.
[0027] Compared with the prior art, the present invention has the following beneficial effects:
[0028] (1) The medical field dialogue method proposed in this invention, based on weak supervision and thought chain, analyzes the dialogue information summary and customer service intent of each dialogue sample in the training dataset through the first trained language model, and constructs the first prompt word to guide the generation of customer service intent by combining the first-order logic rules. The second trained language model is then fine-tuned using the dialogue sample and its corresponding first prompt word or customer service intent, so that it can quickly meet the dialogue service needs of customers in any limited scenario and accurately generate and predict customer service intent. The fine-tuning process adopts weak supervision, so the data requirement is not high, and the effect of efficient application at low cost is achieved.
[0029] (2) The medical field dialogue method based on weak supervision and thought chain proposed in this invention can accurately predict the customer service intent using the trained second large language model. This can be achieved by a large language model with 0.5B-3B parameters, without requiring the multi-stage thought chain output of a large language model with 72B parameters, thus reducing the cost of using the model and the amount of training.
[0030] (3) The medical field dialogue method based on weak supervision and thought chain proposed in this invention takes the customer service intent that the fine-tuned second language model generates for the dialogue history to be replied to, and combines it with the dialogue information summary through first-order logic rules to obtain the second prompt word for guiding reply generation. The second prompt word further guides the dialogue model to generate customer service reply statements that fit the specific task scenario and the accurate intent.
Attached Figure Description
[0031] To more clearly illustrate the technical solutions in the embodiments of the present invention, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0032] Figure 1 is a flowchart of a medical field dialogue method based on weak supervision and thought chain according to an embodiment of this application;
[0033] Figure 2 is a schematic diagram of a medical field dialogue device based on weak supervision and thought chain according to an embodiment of this application;
[0034] Figure 3 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present invention.
Detailed Implementation
[0035] To make the objectives, technical solutions, and advantages of this invention clearer, the invention will be further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of this invention, and not all of them. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this invention.
[0036] Figure 1 illustrates a medical field dialogue method based on weak supervision and thought chain according to an embodiment of this application, comprising the following steps:
[0037] S1. Obtain a dialogue dataset containing several dialogue samples, each including visitor statements and customer service statements; construct a first thought chain prompt word to generate corresponding dialogue information summaries and customer service intentions based on the input dialogue samples; input each dialogue sample in the dialogue dataset and the first thought chain prompt word into the trained first large language model to obtain the dialogue information summary and customer service intention corresponding to each dialogue sample. The dialogue information summary includes the symptoms described by the visitor, the visitor's intention, and the visitor's purpose; construct a first training dataset based on the dialogue dataset and the dialogue information summary corresponding to each dialogue sample; construct a second training dataset based on the dialogue dataset and the customer service intention corresponding to each dialogue sample.
[0038] In a specific embodiment, the thought chain prompts include:
[0039] First, identify the symptoms described by the visitor and the visitor's intent based on the input dialogue sample;
[0040] Identify visitor purpose based on the symptoms described by the visitor, visitor intent, and input dialogue samples;
[0041] Based on the symptoms described by the visitor, the visitor's intent, the visitor's purpose, and the input dialogue sample, infer the dialogue logic that meets the business requirements, and map at least one customer service intent based on the dialogue logic.
[0042] In a specific embodiment, the first trained large language model includes the Qwen2.5-72B model, and the second trained large language model includes the Qwen2.5-0.5B-instruct model.
[0043] Specifically, embodiments of this application model medical dialogue logic using weakly supervised training combined with chain of thought (CoT), which helps generate, from the dialogue history, responses that better meet specific business needs. The initial dialogue dataset is $D = \{d_1, d_2, d_3, \ldots, d_n\}$, where each dialogue sample $d_k$ contains visitor statements and customer service statements.
[0044] A first thought chain prompt word, $\mathrm{Prompt}_{cot}$, is constructed so that the trained first large language model generates, from an input dialogue sample, the dialogue information summary and the customer service intent for that sample. The dialogue information summary includes the symptoms described by the visitor, the visitor intent, and the visitor purpose. $\mathrm{Prompt}_{cot}$ directs the model through three steps. The first step is to identify the symptoms described by the visitor and the visitor intent. The second step is to determine the visitor purpose from the symptoms, the visitor intent, and the dialogue sample. The final step is to infer the dialogue logic that meets the business requirements from the symptoms, the visitor intent, the visitor purpose, and the dialogue sample, and to map that logic to at least one customer service intent. The dialogue logic can be output at the same time, which makes it easier to verify that the customer service intent meets the requirements. Visitor intent refers to the intent expressed in the visitor's statements in the dialogue sample, while customer service intent refers to the intent of the statement with which the customer service representative needs to respond. In one example, the trained first large language model is the Qwen2.5-72B model; in other examples, another large language model with strong reasoning capabilities can be selected.
[0045] The above method yields the data for weakly supervised training. The training data consists of two parts: the first training dataset $D_{train1}$ and the second training dataset $D_{train2}$. The former includes dialogue samples and their corresponding dialogue information summaries, while the latter includes only dialogue samples and their customer service intents.
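The split into the two training datasets can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `WeakLabel` container, the `build_training_sets` function, and the `fake_annotate` stand-in for the call to the first large language model are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class WeakLabel:
    """Weak label produced by the first large language model for one sample."""
    summary: Dict[str, str]      # symptom, visitor_intent, visitor_purpose
    service_intents: List[str]   # at least one customer service intent


def build_training_sets(
    dialogues: List[str],
    annotate: Callable[[str], WeakLabel],
) -> Tuple[List[Tuple[str, Dict[str, str]]], List[Tuple[str, List[str]]]]:
    """Split the weak labels into D_train1 (summaries) and D_train2 (intents)."""
    d_train1, d_train2 = [], []
    for dialogue in dialogues:
        label = annotate(dialogue)
        d_train1.append((dialogue, label.summary))
        d_train2.append((dialogue, label.service_intents))
    return d_train1, d_train2


def fake_annotate(dialogue: str) -> WeakLabel:
    """Hypothetical stand-in for prompting the first model; returns fixed labels."""
    return WeakLabel(
        summary={
            "symptom": "toothache",
            "visitor_intent": "ask about price",
            "visitor_purpose": "learn treatment cost",
        },
        service_intents=["ask for contact"],
    )


dialogues = ["Visitor: My tooth hurts, how much is a filling?"]
d_train1, d_train2 = build_training_sets(dialogues, fake_annotate)
```

Both datasets share the same dialogue samples and differ only in the label attached to each sample.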
[0046] As an example, embodiments of this application propose one of the first thought chain prompts, as shown below:
[0047] Roles and tasks:
[0048] As a senior customer service representative for the domain, you will receive patient inquiries online, build trust through professional consultations, and guide visitors to leave their contact information at appropriate times in the conversation for follow-up services.
[0049] The task is to first identify, from the conversation history between the visitor and customer service, the symptoms expressed by the visitor and the intent of the visitor's messages; then determine the visitor's main purpose in light of the conversation history; and finally infer the customer service representative's next intent from the symptoms expressed by the visitor, the intent of the visitor's messages, the visitor's main purpose, the conversation history, and the number of conversation rounds.
[0050] Task rules and strategies:
[0051] 1. Ensure that the intent description must be selected from the intent list and cannot be generated arbitrarily. The intent list is: [actions];
[0052] 2. Visitor intent: If a visitor's symptoms are identified, a description can be provided in conjunction with those symptoms.
[0053] 3. When assessing a visitor's intent, it's necessary to determine the visitor's status, whether they have explicitly agreed to leave contact or come to the hospital, and whether the visitor has already been diagnosed or received treatment.
[0054] 4. If the visitor describes their medical history, it can be inferred that the customer service's intention includes "asking whether they have a history of treatment at this hospital".
[0055] 5. Do not repeat questions that have been asked in the previous conversation history, or that have already been mentioned or answered by the visitor, and do not generate content that has been answered before.
[0056] Steps:
[0057] 1. Symptom identification: Analyze the conversation history between visitors and customer service to determine the symptoms described by visitors. If the visitor inquires about medication, infer the symptoms based on the medication's efficacy.
[0058] 2. Identify visitor intent: Based on the identified visitor symptoms and the visitor's conversation history with customer service, identify the visitor's intent.
[0059] 3. Determine the visitor's purpose: Consider why the visitor came to the hospital for consultation and determine the visitor's main purpose, such as wanting to know about prices, pain levels, treatment methods, etc.
[0060] 4. Inferring Customer Service Intent: Based on the identified symptoms, visitor intent, and visitor purpose, combined with the dialogue history and the number of rounds in the dialogue history, infer the customer service intent.
[0061] Requirements:
[0062] 1. Do not output explanations such as "e.g., ...";
[0063] 2. The intent description should not exceed 20 characters;
[0064] 3. In multiple cases, visitor intent and customer service intent should be separated by commas;
[0065] 4. Your output format must be: Symptom: ...\nVisitor Intent: ...\nVisitor Purpose: ...\nCustomer Intent: ...\n\n
[0066] The above first thought chain prompts are just one example. They can be further optimized and adjusted according to specific requirements to guide the generation of more suitable and accurate dialogue information summaries and customer service intentions.
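As a rough sketch of how output in the format required by the example prompt could be consumed, the code below parses the four labeled fields and filters the customer service intents against the intent list and the 20-character limit. The function names and the sample intent list are assumptions made for illustration; the source does not describe a parser.

```python
import re
from typing import Dict, List, Set

# Field labels taken from the output format in the example prompt.
FIELD_NAMES = ["Symptom", "Visitor Intent", "Visitor Purpose", "Customer Intent"]


def parse_annotation(text: str) -> Dict[str, str]:
    """Read 'Label: value' lines and return the four expected fields."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() in FIELD_NAMES:
            fields[name.strip()] = value.strip()
    missing = [name for name in FIELD_NAMES if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return fields


def split_intents(value: str) -> List[str]:
    """Split a comma-separated intent field (ASCII or full-width commas)."""
    return [part.strip() for part in re.split(r"[,，]", value) if part.strip()]


def validate_intents(intents: List[str], allowed: Set[str], max_len: int = 20) -> List[str]:
    """Keep only intents that are in the intent list and within the length limit."""
    return [i for i in intents if i in allowed and len(i) <= max_len]


raw = (
    "Symptom: toothache\n"
    "Visitor Intent: ask about price\n"
    "Visitor Purpose: learn treatment cost\n"
    "Customer Intent: ask for contact, sing a song\n\n"
)
fields = parse_annotation(raw)
intents = validate_intents(
    split_intents(fields["Customer Intent"]),
    allowed={"ask for contact", "explain treatment"},  # hypothetical intent list
)
```

Here the intent "sing a song" is dropped because it is not in the intent list, which mirrors rule 1 of the example prompt.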
[0067] S2. The dialogue information summaries in the first training dataset are passed through first-order logic rules to obtain first prompt words used to guide the generation of customer service intent. A third training dataset is constructed from the dialogue samples in the first training dataset and their corresponding first prompt words. The third training dataset is mixed with the second training dataset to obtain a mixed training dataset. The trained second language model is fine-tuned on the mixed training dataset using the DoRA training method to obtain the fine-tuned second language model.
[0068] In a specific embodiment, passing the dialogue information summaries in the first training dataset through first-order logic rules to obtain the first prompt words used to guide the generation of customer service intent specifically includes:
[0069] At least one of the symptoms, intentions, and purposes described by the visitor in the dialogue information summary of the first training dataset is connected through corresponding logical gates to obtain several first thought chain statements. All the first thought chain statements are combined to obtain the first prompt word.
[0070] In a specific embodiment, during the fine-tuning process of the trained second language model, dialogue samples and their corresponding first prompt words from the mixed training dataset are input into the trained second language model, or dialogue samples from the mixed training dataset are input into the trained second language model to obtain the predicted customer service intent corresponding to each dialogue sample.
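The two input forms can be sketched as a single mixed list of supervised samples. This sketch assumes that every sample, with or without a first prompt word, is trained to output the customer service intent, and that the prompt word is simply prepended to the dialogue text. Both points, along with the function and field names, are illustrative assumptions and are not specified in the source.

```python
import random
from typing import Dict, List, Tuple


def build_mixed_dataset(
    d_train3: List[Tuple[str, str, List[str]]],  # (dialogue, first prompt word, intents)
    d_train2: List[Tuple[str, List[str]]],       # (dialogue, intents)
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Mix prompt-guided samples with plain samples and shuffle them."""
    samples: List[Dict[str, str]] = []
    for dialogue, prompt_word, intents in d_train3:
        samples.append({
            "input": f"{prompt_word}\n{dialogue}",
            "target": ", ".join(intents),
        })
    for dialogue, intents in d_train2:
        samples.append({"input": dialogue, "target": ", ".join(intents)})
    random.Random(seed).shuffle(samples)  # fixed seed keeps the order reproducible
    return samples


mixed = build_mixed_dataset(
    d_train3=[("Visitor: My tooth hurts.", "The visitor describes a symptom.", ["ask for contact"])],
    d_train2=[("Visitor: How much is a filling?", ["explain treatment", "ask for contact"])],
)
```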
[0071] Specifically, before the model is trained, the dialogue information summaries in the first training dataset $D_{train1}$ are passed through first-order logic (FOL) rules to generate different first thought chain statements, which are combined into the first prompt word. The first-order logic rules are compositions of AND, OR, and NOT logic gates. When the first prompt word is constructed, at least one of the symptoms described by the visitor, the visitor intent, and the visitor purpose in $D_{train1}$ is used to build each first thought chain statement through the first-order logic rules, and all the first thought chain statements are combined to form the first prompt word. The dialogue samples in the first training dataset and their corresponding first prompt words are then used to construct the third training dataset, which is mixed with the second training dataset to obtain the mixed training dataset. The mixed training dataset is used to fine-tune the trained second language model. The fine-tuning process combines weak supervision with thought chains and uses the DoRA training method, so that only a very small number of parameters are trained. DoRA is a weight-decomposed variant of LoRA, and its specific training process is not detailed here. In one example, the trained second language model uses the Qwen2.5-0.5B-instruct model as the base model; in other examples, another large language model with a small number of parameters can be chosen. The embodiments of this application can achieve a 5-10 times efficiency improvement over multi-step thought chain generation, because multi-step thought chains require at least a 32B model to perform relatively well. After fine-tuning, the second language model becomes a dedicated model capable of generating the corresponding predicted customer service intent for each dialogue history.
The trained second language model is fine-tuned under weak supervision on the mixed training dataset, which mixes dialogue texts paired with first prompt words and dialogue texts labeled with customer service intents. The first prompt words, formed by combining first thought chain statements, carry the reasoning required for customer service intent recognition. Furthermore, the use of the DoRA training method significantly reduces the amount of training data required.
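Since the source does not detail DoRA, the following is a small pure-Python sketch of its weight reparameterization as generally described for weight-decomposed low-rank adaptation: the frozen weight plus a low-rank update gives a direction, each column is normalized, and a trainable magnitude vector rescales it. The matrix sizes, values, and helper names are illustrative assumptions, and real training would use a deep learning framework instead of nested lists.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (d x r) and b (r x k)."""
    return [
        [sum(a[i][p] * b[p][j] for p in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def column_norms(w: Matrix) -> List[float]:
    """Euclidean norm of each column of w."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def dora_weight(w0: Matrix, lora_b: Matrix, lora_a: Matrix, m: List[float]) -> Matrix:
    """Return m * (W0 + B A) / ||W0 + B A||, normalised column by column."""
    delta = matmul(lora_b, lora_a)
    v = [[w0[i][j] + delta[i][j] for j in range(len(w0[0]))] for i in range(len(w0))]
    norms = column_norms(v)
    return [[m[j] * v[i][j] / norms[j] for j in range(len(v[0]))] for i in range(len(v))]


w0 = [[3.0, 0.0], [4.0, 2.0]]   # frozen pretrained weight (toy values)
m = column_norms(w0)            # magnitude initialised to the column norms of W0
zero_b = [[0.0], [0.0]]         # rank-1 update, B initialised to zero
lora_a = [[0.0, 1.0]]

at_init = dora_weight(w0, zero_b, lora_a, m)               # equals W0 before training
updated = dora_weight(w0, [[1.0], [0.0]], lora_a, m)       # direction changed by B A
```

Only `m`, `lora_b`, and `lora_a` would be trained, which is why the number of trainable parameters stays small relative to the full weight matrix.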
[0072] S3: Obtain the dialogue history to be replied to; input the dialogue history to be replied to into the finely tuned second language model to obtain the corresponding predicted customer service intent; input the dialogue history to be replied to and the second thought chain prompt words used to generate the corresponding dialogue information summary into the trained first language model to obtain the dialogue information summary corresponding to the dialogue history to be replied to; use the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent through first-order logic rules to obtain the second prompt words used to guide the generation of the reply; input the dialogue history to be replied to and the second prompt words into the dialogue model to generate the corresponding customer service reply statement.
[0073] In a specific embodiment, passing the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent through first-order logic rules to obtain the second prompt word used to guide reply generation specifically includes:
[0074] Connect at least one of the following from the summary of dialogue information corresponding to the dialogue history to be replied to: the visitor's described symptoms, visitor intent, visitor purpose, and predicted customer service intent. This will result in several second thought chain statements. Combining all the second thought chain statements will yield the second prompt word.
[0075] Specifically, after fine-tuning the second language model, it can be directly applied to the dialogue system. This fine-tuned second language model generates the corresponding predicted customer service intent based on the dialogue history to be responded to. Furthermore, based on the first thought chain prompts in step S1, a second thought chain prompt is constructed to generate the corresponding dialogue information summary. The dialogue history to be responded to and the second thought chain prompts are input into the trained first language model to obtain the dialogue information summary corresponding to the dialogue history to be responded to. Then, the dialogue information summary corresponding to the dialogue history to be responded to and the predicted customer service intent are processed through first-order logic rules to obtain the second prompts used to guide the response generation. This ensures that the model output is regular natural text, facilitating direct text generation in the next step.
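The order of operations in step S3 can be sketched as a small pipeline in which each model is passed in as a callable. All four callables below are hypothetical stubs with fixed outputs; they stand in for the fine-tuned second language model, the first large language model with the second thought chain prompt word, the first-order logic rules, and the dialogue model.

```python
from typing import Callable, Dict, List

Summary = Dict[str, str]


def generate_reply(
    history: str,
    predict_intent: Callable[[str], List[str]],
    summarize: Callable[[str], Summary],
    build_prompt: Callable[[Summary, List[str]], str],
    dialogue_model: Callable[[str, str], str],
) -> str:
    """Run the four stages of S3 in order and return the reply."""
    intents = predict_intent(history)             # fine-tuned second language model
    summary = summarize(history)                  # first model + second thought chain prompt
    prompt_word = build_prompt(summary, intents)  # first-order logic rules
    return dialogue_model(history, prompt_word)   # dialogue model generates the reply


# Hypothetical stubs so the sketch runs without any model.
def stub_predict_intent(history: str) -> List[str]:
    return ["ask for contact"]


def stub_summarize(history: str) -> Summary:
    return {"symptom": "toothache", "visitor_purpose": "learn treatment cost"}


def stub_build_prompt(summary: Summary, intents: List[str]) -> str:
    return f'Address "{summary["visitor_purpose"]}" and "{intents[0]}".'


def stub_dialogue_model(history: str, prompt_word: str) -> str:
    return f"[reply guided by: {prompt_word}]"


reply = generate_reply(
    "Visitor: My tooth hurts, how much is a filling?",
    stub_predict_intent,
    stub_summarize,
    stub_build_prompt,
    stub_dialogue_model,
)
```

The intent prediction and the summary do not depend on each other, so the first two calls could be issued in either order.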
[0076] In one example, the process of obtaining the second cue word through first-order logic rules is as follows:
[0077] Construct a second thought chain statement by connecting the visitor intent and the visitor purpose in the dialogue information summary corresponding to the dialogue history to be replied to;
[0078] Construct a second thought chain statement by connecting, through AND, OR, and NOT relationships, the symptoms described by the visitor and the visitor purpose in the dialogue information summary corresponding to the dialogue history to be replied to;
[0079] Construct a second thought chain statement by connecting the visitor intent in the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent;
[0080] Construct a second thought chain statement by connecting the visitor purpose in the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent.
[0081] By combining the above second thought chain statements, we obtain the second prompt word.
[0082] As an example, the following is a combination of statements in the second thought chain:
[0083] a+b: The response needs to combine the symptoms described by the visitor in the conversation history "{topic}" and guide the response to the main purpose of the visitor's consultation in the conversation history "{target}".
[0084] b+c: The response should be tailored to the main purpose of the visitor's inquiry in the conversation history, "{target}", and should be based on the visitor's intent, "{action}".
[0085] The second thought chain statements of the two first-order logic rules mentioned above are combined to form the second prompt word.
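A minimal sketch of this combination is given below, using shortened paraphrases of the two example statements and the placeholder names `{topic}`, `{target}`, and `{action}` from the example. The template wording, the function name, and the choice to apply each rule only when both of its fields are non-empty (an AND gate) are assumptions made for illustration.

```python
from typing import List

# Shortened paraphrases of the a+b and b+c example statements.
TEMPLATE_AB = (
    'The response needs to combine the symptoms "{topic}" '
    'with the main purpose of the consultation "{target}".'
)
TEMPLATE_BC = (
    'The response should serve the main purpose "{target}" '
    'and be based on the intent "{action}".'
)


def second_prompt_word(topic: str, target: str, action: str) -> str:
    """Build the second prompt word from the statements whose rules apply."""
    statements: List[str] = []
    if topic and target:    # rule a+b applies only when both fields are present
        statements.append(TEMPLATE_AB.format(topic=topic, target=target))
    if target and action:   # rule b+c applies only when both fields are present
        statements.append(TEMPLATE_BC.format(target=target, action=action))
    return " ".join(statements)


both = second_prompt_word("toothache", "learn treatment cost", "ask for contact")
no_symptom = second_prompt_word("", "learn treatment cost", "ask for contact")
```

When no symptom is available, only the b+c statement is emitted, so the prompt word never contains an empty placeholder.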
[0086] Furthermore, by inputting the dialogue history to be replied to and the second prompt word into the dialogue model, the corresponding customer service reply can be generated and output. With the second prompt word, a 14B dialogue model can achieve the generation capability of a 72B dialogue model. For large-parameter models with reasoning capabilities, such as deepseek-32B / r1, outputting the thought process is far less efficient than the method proposed in the embodiments of this application.
[0087] With further reference to Figure 2, as an implementation of the method shown in the above figures, this application provides an embodiment of a medical field dialogue device based on weak supervision and thought chain. This device embodiment corresponds to the method embodiment shown in Figure 1, and the device can be applied to various electronic devices.
[0088] This application provides a medical field dialogue device based on weak supervision and thought chain, including:
[0089] Dataset construction module 1 is configured to acquire a dialogue dataset containing several dialogue samples, each including visitor statements and customer service statements; construct a first thought chain prompt word to generate corresponding dialogue information summaries and customer service intentions based on the input dialogue samples; input each dialogue sample in the dialogue dataset and the first thought chain prompt word into a trained first large language model to obtain the dialogue information summary and customer service intention corresponding to each dialogue sample, the dialogue information summary including the symptoms described by the visitor, the visitor's intention, and the visitor's purpose; construct a first training dataset based on the dialogue dataset and the dialogue information summary corresponding to each dialogue sample, and construct a second training dataset based on the dialogue dataset and the customer service intention corresponding to each dialogue sample.
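The data flow of module 1 can be sketched as follows. This is a simplified illustration under stated assumptions: the call to the trained first large language model is replaced by a stand-in function `fake_labeler`, and the field names (`symptoms`, `visitor_intent`, `visitor_purpose`, `customer_service_intent`) and record layout are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Assumed field names for the dialogue information summary.
SUMMARY_KEYS = ("symptoms", "visitor_intent", "visitor_purpose")


def build_training_sets(
    dialogues: List[str],
    cot_prompt: str,
    label_with_llm: Callable[[str, str], Dict[str, str]],
) -> Tuple[List[dict], List[dict]]:
    """Weakly label each dialogue, then split the labels into two datasets."""
    first_set, second_set = [], []
    for dialogue in dialogues:
        labels = label_with_llm(cot_prompt, dialogue)
        summary = {key: labels[key] for key in SUMMARY_KEYS}
        # First training dataset: dialogue + dialogue information summary.
        first_set.append({"dialogue": dialogue, "summary": summary})
        # Second training dataset: dialogue + customer service intent.
        second_set.append(
            {
                "dialogue": dialogue,
                "customer_service_intent": labels["customer_service_intent"],
            }
        )
    return first_set, second_set


def fake_labeler(cot_prompt: str, dialogue: str) -> Dict[str, str]:
    """Stand-in for the first large language model (returns fixed labels)."""
    return {
        "symptoms": "recurring headache",
        "visitor_intent": "ask about treatment",
        "visitor_purpose": "book a consultation",
        "customer_service_intent": "ask for contact details",
    }


first_set, second_set = build_training_sets(
    ["Visitor: I keep getting headaches. Agent: How long has this lasted?"],
    "first thought chain prompt word (placeholder)",
    fake_labeler,
)
```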
[0090] Fine-tuning module 2 is configured to process the dialogue information summaries in the first training dataset through first-order logic rules to obtain the first prompt words used to guide customer service intent generation; construct a third training dataset based on the dialogue samples in the first training dataset and their corresponding first prompt words; mix the third training dataset with the second training dataset to obtain a mixed training dataset; and fine-tune the trained second language model using the mixed training dataset and the DoRA training method to obtain the fine-tuned second language model.
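A rough sketch of how the mixed training dataset might be assembled is shown below. The rule wording in `first_prompt_word`, the record layout, and the use of the customer service intent as the training target for both subsets are assumptions made for illustration; the DoRA fine-tuning step itself is outside this sketch.

```python
import random
from typing import Dict, List


def first_prompt_word(summary: Dict[str, str]) -> str:
    """Combine summary fields with AND-style rules (hypothetical wording)."""
    symptoms = summary.get("symptoms", "")
    intent = summary.get("visitor_intent", "")
    purpose = summary.get("visitor_purpose", "")
    statements = []
    if symptoms and purpose:
        statements.append(
            f'The visitor describes "{symptoms}" and mainly wants to "{purpose}".'
        )
    if intent and purpose:
        statements.append(
            f'The visitor intent is "{intent}", in service of "{purpose}".'
        )
    return " ".join(statements)


def build_mixed_dataset(
    first_set: List[dict], second_set: List[dict], seed: int = 0
) -> List[dict]:
    """Build the third dataset (prompt + dialogue) and mix it with the second."""
    intent_of = {row["dialogue"]: row["customer_service_intent"] for row in second_set}
    third_set = [
        {
            "input": first_prompt_word(row["summary"]) + "\n" + row["dialogue"],
            "target": intent_of[row["dialogue"]],
        }
        for row in first_set
    ]
    plain_set = [
        {"input": row["dialogue"], "target": row["customer_service_intent"]}
        for row in second_set
    ]
    mixed = third_set + plain_set
    random.Random(seed).shuffle(mixed)  # fixed seed for reproducibility
    return mixed


# Hypothetical single-sample datasets.
dialogue = "Visitor: I keep getting headaches. Agent: How long has this lasted?"
first_set = [
    {
        "dialogue": dialogue,
        "summary": {
            "symptoms": "recurring headache",
            "visitor_intent": "ask about treatment",
            "visitor_purpose": "book a consultation",
        },
    }
]
second_set = [{"dialogue": dialogue, "customer_service_intent": "ask for contact details"}]
mixed = build_mixed_dataset(first_set, second_set)
```

Mixing prompted and unprompted samples lets the same model predict the customer service intent with or without the first prompt word.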
[0091] Response module 3 is configured to acquire the dialogue history to be responded to, input the dialogue history to be responded to into the fine-tuned second language model to obtain the corresponding predicted customer service intent; input the dialogue history to be responded to and the second thought chain prompt words used to generate the corresponding dialogue information summary into the trained first language model to obtain the dialogue information summary corresponding to the dialogue history to be responded to; use the dialogue information summary corresponding to the dialogue history to be responded to and the predicted customer service intent through first-order logic rules to obtain the second prompt words used to guide the generation of the response; input the dialogue history to be responded to and the second prompt words into the dialogue model to generate the corresponding customer service response statement.
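The four steps of the response module can be read as a simple pipeline. The sketch below wires them together with stand-in callables; the function `respond`, its parameter names, and the stub models are hypothetical and only illustrate the order of the calls.

```python
from typing import Callable, Dict


def respond(
    history: str,
    predict_intent: Callable[[str], str],
    summarize: Callable[[str], Dict[str, str]],
    build_second_prompt: Callable[[Dict[str, str], str], str],
    dialogue_model: Callable[[str, str], str],
) -> str:
    """Run the response steps in the order described for module 3."""
    # 1. Fine-tuned second language model predicts the customer service intent.
    predicted_intent = predict_intent(history)
    # 2. First language model produces the dialogue information summary.
    summary = summarize(history)
    # 3. First-order logic rules turn summary + intent into the second prompt word.
    second_prompt = build_second_prompt(summary, predicted_intent)
    # 4. Dialogue model generates the customer service reply.
    return dialogue_model(history, second_prompt)


# Stub components standing in for the real models (assumptions).
reply = respond(
    history="Visitor: I keep getting headaches.",
    predict_intent=lambda h: "ask for contact details",
    summarize=lambda h: {
        "symptoms": "recurring headache",
        "visitor_intent": "ask about treatment",
        "visitor_purpose": "book a consultation",
    },
    build_second_prompt=lambda s, i: f"purpose={s['visitor_purpose']}; intent={i}",
    dialogue_model=lambda h, p: f"[reply guided by: {p}]",
)
print(reply)
```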
[0092] Figure 3 is a schematic diagram of the hardware structure of an electronic device provided in an embodiment of the present invention. As shown in Figure 3, the electronic device in this embodiment includes a processor 301 and a memory 302, wherein the memory 302 is used to store computer execution instructions, and the processor 301 is used to execute the computer execution instructions stored in the memory 302 to implement the various steps performed by the electronic device in the above embodiment. For details, please refer to the relevant descriptions in the foregoing method embodiments.
[0093] Alternatively, the memory 302 can be either standalone or integrated with the processor 301.
[0094] When the memory 302 is set up independently, the electronic device also includes a bus 303 for connecting the memory 302 and the processor 301.
[0095] This invention also provides a computer storage medium storing computer execution instructions, which, when executed by processor 301, implement the above method.
[0096] This invention also provides a computer program product, including a computer program that, when executed by a processor 301, implements the above-described method.
[0097] In the embodiments provided by this invention, it should be understood that the disclosed devices and methods can be implemented in other ways. For example, the device embodiments described above are merely illustrative; for instance, the division of modules is only a logical functional division, and in actual implementation, there may be other division methods. For example, multiple modules may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or other forms.
[0098] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to implement the solution of this embodiment according to actual needs.
[0099] Furthermore, the functional modules in the various embodiments of this invention can be integrated into one processing unit, or each module can exist physically separately, or two or more modules can be integrated into one unit. The unit formed by the above modules can be implemented in hardware or in the form of hardware plus software functional units.
[0100] The integrated modules implemented as software functional modules described above can be stored in a computer-readable storage medium. These software functional modules, stored in a storage medium, include several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) or processor 301 to execute some steps of the methods of the various embodiments of this application.
[0101] It should be understood that the processor 301 described above can be a Central Processing Unit (CPU) or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be executed directly by the hardware processor 301, or by a combination of hardware and software modules within the processor 301.
[0102] The memory 302 may include high-speed RAM memory, and may also include non-volatile memory (NVM), such as at least one disk storage device, and may also be a USB flash drive, portable hard drive, read-only memory, disk or optical disc, etc.
[0103] Bus 303 can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Bus 303 can be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the bus 303 in the accompanying drawings of this application is not limited to only one bus or one type of bus.
[0104] The aforementioned storage medium can be implemented from any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic storage, flash memory, magnetic disk, or optical disk. The storage medium can be any available medium accessible to general-purpose or special-purpose computers.
[0105] An exemplary storage medium is coupled to a processor 301, enabling the processor 301 to read information from and write information to the storage medium. Alternatively, the storage medium can be an integral part of the processor 301. The processor 301 and the storage medium can reside in an application-specific integrated circuit (ASIC). Alternatively, the processor 301 and the storage medium can exist as discrete components in an electronic device or a host device.
[0106] Those skilled in the art will understand that all or part of the steps of the above-described method embodiments can be implemented by hardware related to program instructions. The aforementioned program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the above-described method embodiments; and the aforementioned storage medium includes various media capable of storing program code, such as ROM, RAM, magnetic disks, or optical disks.
[0107] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.
Claims
1. A dialogue method in the medical field based on weak supervision and thought chains, characterized in that it includes the following steps:
Obtain a dialogue dataset containing several dialogue samples, each including a visitor statement and a customer service statement; construct a first thought chain prompt word used to generate a corresponding dialogue information summary and customer service intent based on an input dialogue sample; input each dialogue sample in the dialogue dataset and the first thought chain prompt word into a trained first large language model to obtain the dialogue information summary and customer service intent corresponding to each dialogue sample, wherein the dialogue information summary includes the symptoms described by the visitor, the visitor's intent, and the visitor's purpose; construct a first training dataset based on the dialogue dataset and the dialogue information summary corresponding to each dialogue sample, and construct a second training dataset based on the dialogue dataset and the customer service intent corresponding to each dialogue sample;
Process the dialogue information summaries in the first training dataset through first-order logic rules to obtain the first prompt words used to guide the generation of customer service intent; construct a third training dataset based on the dialogue samples in the first training dataset and their corresponding first prompt words; mix the third training dataset with the second training dataset to obtain a mixed training dataset; fine-tune the trained second language model using the mixed training dataset and the DoRA training method to obtain the fine-tuned second language model;
Obtain the dialogue history to be replied to, and input the dialogue history to be replied to into the fine-tuned second language model to obtain the corresponding predicted customer service intent; input the dialogue history to be replied to and the second thought chain prompt word used to generate the corresponding dialogue information summary into the trained first language model to obtain the dialogue information summary corresponding to the dialogue history to be replied to; process the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent through first-order logic rules to obtain the second prompt word used to guide the generation of the reply; input the dialogue history to be replied to and the second prompt word into the dialogue model to generate the corresponding customer service reply statement.
2. The medical field dialogue method based on weak supervision and thought chain as described in claim 1, characterized in that, The thought chain prompts include: First, identify the symptoms described by the visitor and the visitor's intent based on the input dialogue sample; The visitor's purpose is identified based on the symptoms described by the visitor, the visitor's intent, and the input dialogue sample; Based on the symptoms described by the visitor, the visitor's intent, the visitor's purpose, and the input dialogue sample, a dialogue logic that meets business needs is inferred, and at least one customer service intent is mapped based on the dialogue logic.
3. The medical field dialogue method based on weak supervision and thought chain as described in claim 1, characterized in that processing the dialogue information summaries in the first training dataset through first-order logic rules to obtain the first prompt words used to guide the generation of customer service intent specifically includes: connecting at least one of the symptoms described by the visitor, the visitor's intent, and the visitor's purpose in the dialogue information summary of the first training dataset through corresponding logic gates to obtain several first thought chain statements; and combining all the first thought chain statements to obtain the first prompt word.
4. The medical field dialogue method based on weak supervision and thought chain as described in claim 1, characterized in that processing the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent through first-order logic rules to obtain the second prompt word used to guide the generation of the reply specifically includes: connecting the symptoms described by the visitor, the visitor's intent, the visitor's purpose, and the predicted customer service intent in the dialogue information summary corresponding to the dialogue history to be replied to through corresponding logic gates to obtain several second thought chain statements; and combining all the second thought chain statements to obtain the second prompt word.
5. The medical field dialogue method based on weak supervision and thought chain as described in claim 1, characterized in that, During the fine-tuning process of the trained second language model, the dialogue samples and their corresponding first prompt words in the mixed training dataset are input into the trained second language model, or the dialogue samples in the mixed training dataset are input into the trained second language model to obtain the predicted customer service intent corresponding to each dialogue sample.
6. The medical field dialogue method based on weak supervision and thought chain as described in claim 1, characterized in that, The first trained language model includes the Qwen2.5-72B model, and the second trained language model includes the Qwen2.5-0.5B-instruct model.
7. A dialogue device in the medical field based on weak supervision and thought chain, characterized in that it includes:
A dataset construction module, configured to acquire a dialogue dataset containing several dialogue samples, each including a visitor statement and a customer service statement; construct a first thought chain prompt word used to generate a corresponding dialogue information summary and customer service intent based on an input dialogue sample; input each dialogue sample in the dialogue dataset and the first thought chain prompt word into a trained first large language model to obtain the dialogue information summary and customer service intent corresponding to each dialogue sample, wherein the dialogue information summary includes the symptoms described by the visitor, the visitor's intent, and the visitor's purpose; construct a first training dataset based on the dialogue dataset and the dialogue information summary corresponding to each dialogue sample, and construct a second training dataset based on the dialogue dataset and the customer service intent corresponding to each dialogue sample;
A fine-tuning module, configured to process the dialogue information summaries in the first training dataset through first-order logic rules to obtain the first prompt words used to guide the generation of customer service intent; construct a third training dataset based on the dialogue samples in the first training dataset and their corresponding first prompt words; mix the third training dataset with the second training dataset to obtain a mixed training dataset; and fine-tune the trained second language model using the mixed training dataset and the DoRA training method to obtain the fine-tuned second language model;
A response module, configured to obtain the dialogue history to be replied to, and input the dialogue history to be replied to into the fine-tuned second language model to obtain the corresponding predicted customer service intent; input the dialogue history to be replied to and the second thought chain prompt word used to generate the corresponding dialogue information summary into the trained first language model to obtain the dialogue information summary corresponding to the dialogue history to be replied to; process the dialogue information summary corresponding to the dialogue history to be replied to and the predicted customer service intent through first-order logic rules to obtain the second prompt word used to guide the generation of the reply; and input the dialogue history to be replied to and the second prompt word into the dialogue model to generate the corresponding customer service reply statement.
8. An electronic device, comprising: One or more processors; Storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method as described in any one of claims 1-6.
9. A computer-readable storage medium having a computer program stored thereon, characterized in that, When the program is executed by the processor, it implements the method as described in any one of claims 1-6.
10. A computer program product, comprising a computer program, characterized in that, When the computer program is executed by a processor, it implements the method as described in any one of claims 1-6.