Bad case mining method, device and storage medium
By using large language models for real-time online training and data cleaning, the problem of inefficient bad case discovery in existing technologies has been solved, achieving efficient and accurate bad case discovery and improving user experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- APOLLO INTELLIGENT CONNECTIVITY (BEIJING) TECH CO LTD
- Filing Date
- 2023-05-11
- Publication Date
- 2026-06-23
AI Technical Summary
In existing technologies, manual exploration and offline model training methods are inefficient and incomplete in bad case discovery, resulting in an inability to effectively improve the user experience.
We employ a method of real-time online training and data cleaning of large language models. By acquiring instruction data and classification results from interactive devices, we use large language models for AI classification to identify bad cases.
It achieves efficient and accurate bad case discovery, reduces labor costs, improves user experience, and solves the problems of limited sources of model training data and untimely parameter updates.
Figure CN116610740B_ABST
Abstract
Description
Technical Field
[0001] This disclosure relates to the field of artificial intelligence technology, specifically to technologies such as autonomous driving, vehicle networking, and voice semantics, and in particular to a bad case mining method, apparatus, and storage medium.
Background Technology
[0002] When users interact with interactive devices, there are often situations where the device cannot correctly parse the user's input commands (such as voice input). These types of input examples are called bad cases.
[0003] In related technologies, it is usually necessary to identify bad cases so that the device's ability to parse input commands can be further optimized based on the identified bad cases, thereby improving the user experience.
Summary of the Invention
[0004] This disclosure provides a bad case mining method and apparatus.
[0005] According to a first aspect of this disclosure, a bad case mining method is provided, comprising:
[0006] Acquire data samples, the data samples including at least one instruction data historically input to the interactive device and the online classification result of the instruction data determined by the interactive device;
[0007] Determine instruction classification information, which is used to classify instruction data;
[0008] The at least one instruction data and the instruction classification information are input into a large language model so that the large language model can determine the AI classification results corresponding to the at least one instruction data based on the instruction classification information.
[0009] Bad cases are mined based on the AI classification results of the instruction data and the corresponding online classification results of the instruction data.
[0010] According to a second aspect of this disclosure, a bad case mining apparatus is provided, comprising:
[0011] The acquisition module is used to acquire data samples, the data samples including at least one instruction data historically input to the interactive device and the online classification result of the instruction data determined by the interactive device;
[0012] A determination module is used to determine instruction classification information, which is used to classify instruction data;
[0013] An input module is used to input the at least one instruction data and the instruction classification information into a large language model, so that the large language model can determine the AI classification results corresponding to the at least one instruction data based on the instruction classification information.
[0014] The mining module is used to mine bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data.
[0015] According to a third aspect of this disclosure, an electronic device is proposed, comprising at least one processor, and
[0016] A memory that is communicatively connected to at least one processor; wherein,
[0017] The memory stores instructions that can be executed by at least one processor, which enables the at least one processor to perform the bad case mining method according to the first aspect of this disclosure.
[0018] According to a fourth aspect of this disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, wherein the computer instructions are used to cause a computer to execute a bad case mining method according to an embodiment of a first aspect of this disclosure.
[0019] According to a fifth aspect of this disclosure, a computer program product is proposed, comprising a computer program that, when executed by a processor, implements the steps of the bad case mining method of the first aspect of this disclosure.
[0020] It should be understood that the description in this section is not intended to identify key or essential features of the embodiments of this disclosure, nor is it intended to limit the scope of this disclosure. Other features of this disclosure will become readily apparent from the following description.
Attached Figure Description
[0021] The accompanying drawings are provided to better understand this solution and do not constitute a limitation of this disclosure. Wherein:
[0022] Figure 1 is a flowchart of a bad case mining method according to an embodiment of the present disclosure;
[0023] Figure 2 is a flowchart of a bad case mining method according to an embodiment of the present disclosure;
[0024] Figure 3 is a flowchart of a bad case mining method according to an embodiment of the present disclosure;
[0025] Figure 4 is a structural diagram of a bad case mining apparatus according to an embodiment of the present disclosure;
[0026] Figure 5 is a block diagram of an electronic device used to implement embodiments of the present disclosure.
Detailed Implementation
[0027] The exemplary embodiments of this disclosure are described below with reference to the accompanying drawings, including various details of the embodiments to aid understanding, and should be considered merely exemplary. Therefore, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of this disclosure. Similarly, for clarity and brevity, descriptions of well-known functions and structures are omitted in the following description.
[0028] Artificial intelligence (AI) is a technical science that studies and develops theories, methods, technologies, and application systems to simulate, extend, and expand human intelligence. Currently, AI technology has the advantages of high automation, high accuracy, and low cost, and has been widely applied.
[0029] Data processing (DP) encompasses the acquisition, storage, retrieval, processing, transformation, and transmission of data. The fundamental purpose of data processing is to extract and derive valuable and meaningful data from large volumes of potentially chaotic and incomprehensible data. Data processing is a fundamental component of systems engineering and automatic control. It permeates all areas of social production and life. The development of data processing technology and the breadth and depth of its applications have profoundly influenced the progress of human society.
[0030] Deep learning (DL) is a new research direction in the field of machine learning (ML). It learns the inherent patterns and hierarchical representations of sample data. The information gained during this learning process greatly aids in interpreting data such as text, images, and sound. Its ultimate goal is to enable machines to possess analytical and learning capabilities like humans, capable of recognizing data such as text, images, and sound. Specific research areas mainly include neural network systems based on convolutional operations, i.e., convolutional neural networks; autoencoder neural networks based on multi-layered neurons; and deep belief networks that are pre-trained using multi-layered autoencoder neural networks and then further optimized by incorporating discriminative information. Deep learning has achieved significant results in search technology, data mining, machine learning, machine translation, natural language processing, multimedia learning, speech recognition, recommendation and personalization technologies, and other related fields. Deep learning enables machines to mimic human activities such as sight, hearing, and thinking, solving many complex pattern recognition problems and leading to significant advancements in artificial intelligence-related technologies.
[0031] Intelligent Traffic System (ITS), also known as Intelligent Transportation System, effectively integrates advanced science and technology (information technology, computer technology, data communication technology, sensor technology, electronic control technology, automatic control theory, operations research, artificial intelligence, etc.) into transportation, service control, and vehicle manufacturing. It strengthens the connection between vehicles, roads, and users, thereby forming a comprehensive transportation system that ensures safety, improves efficiency, improves the environment, and saves energy.
[0032] Optionally, the bad case mining solutions in related technologies include the following two:
[0033] Option 1: Manual exploration, i.e., internal and external personnel manually explore bad cases, and then provide feedback and follow-up;
[0034] Option 2: Train an offline model that can identify bad cases, and then use the trained offline model to identify bad cases.
[0035] However, Option 1 above relies on manual, subjective exploration, so relatively few bad cases can be identified and efficiency is low. In Option 2 above, offline model training depends on a large amount of data in the early stages, resulting in a long training cycle and low efficiency. Furthermore, the limited sources of training data for offline models can make the data homogeneous, so the model's classification results tend to be consistent and potential bad cases stay hidden, i.e., bad case mining is incomplete.
[0036] Based on this, this disclosure provides a bad case mining method.
[0037] Specifically, the execution entity of the bad case mining method in this embodiment of the disclosure can be the bad case mining device provided in this embodiment of the disclosure. The bad case mining device can be a hardware device with data processing capabilities and / or the necessary software to drive the hardware device. Optionally, the execution entity may include a workstation, server, computer, user terminal, and other devices. The user terminal includes, but is not limited to, mobile phones, computers, intelligent voice interaction devices, smart home appliances, and vehicle terminals.
[0038] Figure 1 is a flowchart of a bad case mining method according to an embodiment of this disclosure. As shown in Figure 1, the method includes the following steps:
[0039] Step S101: Obtain data samples.
[0040] Optionally, the data sample may include at least one instruction data historically input to the interactive device and the online classification results of the instruction data determined by the interactive device.
[0041] Optionally, the "at least one instruction data historically input to the interactive device" can be, for example, instruction data historically input by the user to the interactive device during human-computer interaction. The "online classification result of the instruction data determined by the interactive device" can be, for example, the classification result parsed by the interactive device based on the instruction data historically input by the user.
[0042] Optionally, the aforementioned interactive device may be a voice interaction device, such as an in-vehicle voice assistant.
[0043] Optionally, in some embodiments, the method by which the bad case mining device acquires data samples may include the following steps:
[0044] Step 1: Obtain raw data from the data log.
[0045] Optionally, the data log can be the data log of any interactive device. Optionally, the raw data can be data that has already been anonymized.
[0046] Step 2: Clean the raw data to obtain at least one instruction data.
[0047] Optionally, the above-mentioned "data cleaning of raw data" may include at least one of the following:
[0048] Instruction deduplication;
[0049] Length filtering; for example, it can filter out data that is too short or too long.
[0050] Test data filtering; optionally, the test data can be data used during internal testing of the interactive device;
[0051] Specific data filtering; optionally, in some embodiments, the specific data can be fallback data. Optionally, when the interactive device fails to parse the user's command, it activates a fallback mode, such as a chat mode. Data handled by the interactive device in this fallback mode can be treated as the specific data mentioned above.
[0052] Step 3: Select the online classification results corresponding to each instruction data from the original data.
[0053] Step 4: Determine at least one instruction data and the corresponding online classification result as a data sample.
[0054] As can be seen from the above, this embodiment of the present disclosure performs data cleaning (such as data deduplication, redundancy filtering, and unusable data filtering) on the original data in the data log by executing the above steps 1-4 to obtain the data sample in step S101. This ensures that the obtained data sample does not contain redundant, duplicate, or unusable data. Therefore, when performing bad case mining based on the data sample, the situation of "invalid mining based on redundant, duplicate, and unusable data" can be avoided, ensuring the accuracy and efficiency of bad case mining.
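The cleaning in steps 1-4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of this disclosure: the `LogRecord` fields, the `is_test` and `is_fallback` flags, and the length thresholds are all hypothetical, since the disclosure does not specify a log format or concrete limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    """One raw log entry; every field name here is a hypothetical assumption."""
    text: str                  # instruction text input by the user
    online_label: str          # category determined online by the interactive device
    is_test: bool = False      # produced during internal testing of the device
    is_fallback: bool = False  # handled by a fallback mode such as chat


def clean_records(records, min_len=2, max_len=50):
    """Apply the cleaning filters and return (instruction, online_label) samples."""
    seen = set()
    samples = []
    for rec in records:
        text = rec.text.strip()
        if not (min_len <= len(text) <= max_len):  # length filtering
            continue
        if rec.is_test or rec.is_fallback:         # test data / specific data filtering
            continue
        if text in seen:                           # instruction deduplication
            continue
        seen.add(text)
        samples.append((text, rec.online_label))
    return samples
```

Each returned pair corresponds to one data sample of step S101: an instruction together with its online classification result.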
[0055] Step S102: Determine the instruction classification information.
[0056] Optionally, this instruction classification information can be used to classify instruction data.
[0057] Optionally, the instruction classification information may include at least one of the following:
[0058] At least one instruction category;
[0059] The category description corresponding to each instruction category;
[0060] At least one example instruction corresponding to each instruction category.
[0061] Optionally, the above-mentioned instruction category can refer to the category corresponding to different instruction data. For example, the instruction category may include: control instructions and navigation instructions, etc.
[0062] Optionally, the category descriptions corresponding to the above-mentioned instruction categories can be used to describe the characteristics of instruction data belonging to that instruction category. For example, the category descriptions corresponding to control instructions can be: control instructions for vehicle hardware (air conditioning, windows, etc.), control instructions for vehicle infotainment systems (brightness, volume, etc.), etc. When classifying a certain instruction data based on the category descriptions corresponding to the instruction categories, the semantics of the instruction data can be analyzed first, and then the category descriptions of which instruction category the semantics of the instruction data belong to can be determined, thereby determining the instruction category corresponding to the instruction data.
[0063] Optionally, at least one example instruction corresponding to the aforementioned instruction category specifically refers to an example instruction belonging to that instruction category. For instance, example instructions corresponding to control instructions could be: open the car window, open the sunshade, turn up the volume, etc. Furthermore, when classifying instruction data based on example instructions corresponding to an instruction category, the example instruction most semantically similar to the instruction data can be identified, and the instruction category corresponding to this identified example instruction is the instruction category corresponding to the instruction data.
[0064] Optionally, Table 1 is a schematic table of instruction classification information provided in an embodiment of this disclosure.
[0065] Table 1
[0066]
[0067]
[0068] Referring to Table 1, the category description of control commands can include: 1. control commands for vehicle hardware (air conditioning, windows, etc.); 2. control commands for the vehicle's infotainment system (brightness, volume, etc.); 3. control commands for music playback (next track, previous track, etc.). The category description of navigation commands can include: 1. querying locations and initiating navigation by voice, including queries for attractions, restaurants, hotels, parking lots, and gas stations, where query conditions support location name, address, type, distance, rating (for attractions, restaurants, parking lots, and gas stations), price (for attractions, restaurants, parking lots, and gas stations), cuisine (for restaurants), and combinations of these conditions; 2. querying the distance between two locations. Example commands for control commands can include: play FM99.8, play my favorite songs, etc. Example commands for navigation commands can include: navigate to the Summer Palace via Tsinghua University, gas stations along the way, etc. Suppose the command category of the command data "I want to go to Shenzhen North" is to be determined. Since "I want to go to Shenzhen North" is a voice query for a location (i.e., Shenzhen North) that initiates a navigation request, it matches the category description of navigation commands. Furthermore, "I want to go to Shenzhen North" is similar to a navigation example command such as "I want to go to Tiananmen Square". Therefore, the command category of "I want to go to Shenzhen North" is determined to be: navigation command.
[0069] As described above, the instruction classification information includes at least one instruction category, a category description for each instruction category, and at least one example instruction for each instruction category. Based on this classification information, the instruction category corresponding to each instruction data can be accurately determined. Specifically, the semantics of the instruction data can be analyzed first. Then, based on the semantics of the instruction data and the category descriptions corresponding to each instruction category, it can be determined which instruction category the semantics of the instruction data belongs to, thus determining the instruction category corresponding to the instruction data. Simultaneously, by comparing the instruction data with example instructions from which instruction category it is most similar to, the instruction category corresponding to the instruction data can also be determined. This achieves accurate classification of instruction categories, ensuring the accuracy of bad example detection when subsequently mining bad examples based on instruction categories.
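One possible way to represent the instruction classification information (categories, category descriptions, and example instructions) is sketched below. The class name `InstructionCategory`, the function `render_classification_info`, and the text layout are assumptions made for illustration; the disclosure does not prescribe a data structure or a serialization format.

```python
from dataclasses import dataclass, field


@dataclass
class InstructionCategory:
    """Hypothetical container for one row of instruction classification information."""
    name: str
    descriptions: list = field(default_factory=list)  # category descriptions
    examples: list = field(default_factory=list)      # example instructions


def render_classification_info(categories):
    """Serialize the categories into plain text that can be passed to a language model."""
    lines = []
    for cat in categories:
        lines.append(f"Category: {cat.name}")
        for i, desc in enumerate(cat.descriptions, start=1):
            lines.append(f"  Description {i}: {desc}")
        if cat.examples:
            lines.append("  Examples: " + "; ".join(cat.examples))
    return "\n".join(lines)


# Content taken from the description of Table 1 (abbreviated).
TABLE_1 = [
    InstructionCategory(
        name="control command",
        descriptions=[
            "control commands for vehicle hardware (air conditioning, windows, etc.)",
            "control commands for the vehicle's infotainment system (brightness, volume, etc.)",
        ],
        examples=["play FM99.8", "play my favorite songs"],
    ),
    InstructionCategory(
        name="navigation command",
        descriptions=["querying the distance between two locations"],
        examples=["gas stations along the way"],
    ),
]
```

Calling `render_classification_info(TABLE_1)` yields one text block per category, with descriptions and examples indented beneath the category name.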
[0070] Step S103: Input at least one instruction data and instruction classification information into the large language model.
[0071] Optionally, by inputting at least one instruction data and instruction classification information into a large language model, the large language model can determine the AI classification result corresponding to at least one instruction data based on the instruction classification information.
[0072] For example, the instruction classification information shown in Table 1 above, as well as the following instruction data, can be input into a large language model:
[0073] 1. Could you open my car window for me?
[0074] 2. I want to go to Shenzhen North.
[0075] 3. Turn off the lights;
[0076] 4. The volume is too low.
[0077] Based on this, the AI classification results output by the large language model can be, for example, the following:
[0078] 1. Control commands;
[0079] 2. Navigation commands;
[0080] 3. Control commands;
[0081] 4. Control commands.
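The numbered input and output above suggest a simple request and parsing step, sketched here under stated assumptions. The prompt wording, the function names `build_prompt` and `parse_numbered_output`, and the expectation that the model answers with one numbered category per line are all hypothetical; the disclosure does not fix a prompt or an output format, and no real model API is called.

```python
import re


def build_prompt(info_text, instructions):
    """Combine classification information and numbered instructions into one prompt."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(instructions, start=1))
    return (
        "Classify each instruction into one of the categories below.\n"
        f"{info_text}\n"
        "Instructions:\n"
        f"{numbered}\n"
        "Answer with one numbered category per line."
    )


# Matches lines such as "1. Control commands;" and captures the number and the label.
_LINE = re.compile(r"^\s*(\d+)\.\s*(.+?)\s*[;.]?\s*$")


def parse_numbered_output(output, count):
    """Return a list of `count` labels; positions the model did not answer stay None."""
    labels = [None] * count
    for line in output.splitlines():
        match = _LINE.match(line)
        if match:
            index = int(match.group(1)) - 1
            if 0 <= index < count:
                labels[index] = match.group(2)
    return labels
```

Leaving unanswered positions as `None` keeps each AI classification result aligned with its instruction even if the model skips a line.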
[0082] As another example, when the instruction data input to the large language model are "Open the car window for me," "I want to go to Shenzhen North," "Turn off the lights," and "The volume is too low," the output of the large language model could be:
[0083] According to the instruction classification information, "Open the car window for me" is a control command because it involves operating vehicle hardware (air conditioning, windows, etc.), and such hardware is usually operated by voice or by a manual controller.
[0084] "I want to go to Shenzhen North" is a navigation command because it describes a specific destination, and destination commands are usually used to guide users to a specific location or destination.
[0085] "Turn off the lights" is a control command because it involves controlling the interior lighting of the vehicle.
[0086] "The volume is too low" is a control command because it involves controlling the volume inside the car.
[0087] Optionally, in embodiments of this disclosure, the large language model can be used to provide online Artificial Intelligence Generated Content (AIGC) services.
[0088] Optionally, the training data of the large language model can be updated online in real time, and the model parameters of the large language model can be updated based on the updated training data. This avoids the situation where "the source of training data for the large language model is limited or the model parameters of the large language model are not updated in a timely manner", and thus solves the technical problem of "incomplete bad case mining due to limited source of training data or untimely update of model parameters".
[0089] Meanwhile, since the large language model in this embodiment is online in real time, it can respond in real time to the instruction data that the bad case mining device inputs into it and return the AI classification result. The bad case mining device can therefore mine bad cases quickly based on the AI classification result, which ensures the efficiency of bad case mining.
[0090] Furthermore, it should be noted that in some embodiments, once the bad case mining device has input the instruction classification information into the large language model, it does not need to input that information again each time it inputs instruction data to obtain AI classification results, as long as the instruction classification information has not been updated. Only when the instruction classification information is updated does the bad case mining device input the updated instruction classification information into the large language model.
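The "input once, resend only on update" behavior can be sketched as below. This assumes a conversational session in which the model retains previously sent classification information; `ClassificationSession` and the `send` callable are hypothetical stand-ins, not an interface defined by this disclosure or by any specific model service.

```python
import hashlib


class ClassificationSession:
    """Resends classification information only when its content has changed."""

    def __init__(self, send):
        self._send = send          # hypothetical callable: send(message: str) -> str
        self._info_digest = None   # digest of the last classification info sent

    def classify(self, info_text, instructions_text):
        digest = hashlib.sha256(info_text.encode("utf-8")).hexdigest()
        if digest != self._info_digest:
            # First call, or the classification information was updated: send it.
            self._send(info_text)
            self._info_digest = digest
        # The instruction data is sent on every call.
        return self._send(instructions_text)
```

Comparing a digest rather than the full text is only a convenience; a direct string comparison would serve the same purpose.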
[0091] Step S104: Based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data, bad cases are identified.
[0092] Optionally, in some embodiments, the method for mining bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data in step S104 above may include the following steps:
[0093] It is determined whether the AI classification result of the instruction data matches the corresponding online classification result. If the AI classification result matches the online classification result, the online classification result is correct, meaning the interactive device correctly parsed the instruction data. The instruction data is therefore considered a non-bad case and is initially placed in the non-bad case set, which can then undergo further sampling and quality inspection. If the AI classification result does not match the online classification result, the online classification result is incorrect, meaning the interactive device did not correctly parse the instruction data. In this case, the instruction data is considered a bad case and is initially placed in the bad case set, so that subsequent optimization can be carried out based on the bad cases in this set.
[0094] Mining bad cases by comparing whether the AI classification results of the instruction data are consistent with the online classification results ensures the accuracy of bad case mining, and the operation process is relatively simple.
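The comparison in step S104 can be sketched as follows. The function names and the tuple layout are assumptions chosen for illustration, and the random sampling of the non-bad case set is only one possible reading of "further sampling and quality inspection".

```python
import random


def mine_bad_cases(samples, ai_labels):
    """Split samples into a bad case set and a non-bad case set.

    samples: list of (instruction, online_label) pairs.
    ai_labels: AI classification results, in the same order as samples.
    """
    if len(samples) != len(ai_labels):
        raise ValueError("each sample needs exactly one AI classification result")
    bad_set, non_bad_set = [], []
    for (instruction, online_label), ai_label in zip(samples, ai_labels):
        if ai_label == online_label:
            non_bad_set.append(instruction)  # online result agrees with the AI result
        else:
            bad_set.append((instruction, online_label, ai_label))
    return bad_set, non_bad_set


def sample_for_inspection(non_bad_set, k, seed=0):
    """Draw up to k non-bad cases for manual quality inspection (assumed procedure)."""
    rng = random.Random(seed)
    return rng.sample(non_bad_set, min(k, len(non_bad_set)))
```

Keeping both labels alongside each bad case makes it easier to see how the online result and the AI result disagreed when the set is reviewed later.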
[0095] In summary, the bad case mining method provided in this disclosure first acquires a data sample, which includes at least one instruction data historically input to an interactive device and the online classification result of the instruction data determined by the interactive device. Then, instruction classification information is determined, which is used to classify the instruction data. Next, the at least one instruction data and the instruction classification information are input into a large language model, so that the large language model determines the AI classification result corresponding to each of the at least one instruction data based on the instruction classification information. Finally, bad cases are mined based on the AI classification result of the instruction data and the corresponding online classification result. Therefore, the bad case mining method of this disclosure requires no manual operation, which can minimize the data volume limitations caused by labor costs, realize large-scale data annotation and mining, improve the efficiency and accuracy of bad case mining, and also enhance the user experience.
[0096] Meanwhile, the large language model in this embodiment can realize online real-time updates of model training data, and can update the model parameters of the large language model based on the updated model training data, thus avoiding the situation of "limited sources of model training data for the large language model or untimely updates of model parameters for the large language model", thereby solving the technical problem of "incomplete bad case mining due to limited sources of model training data or untimely updates of model parameters".
[0097] Furthermore, the bad case mining method of this disclosure can be quickly integrated into existing projects that test for and mine bad cases, to improve the user experience in human-computer interaction scenarios (the "human-computer interaction scenario" may include, for example, offline / online semantic parsing and data classification scenarios of engines such as in-vehicle intelligent voice assistants and smart speakers).
[0098] Figure 2 is a flowchart of a bad case mining method according to an embodiment of this disclosure. As shown in Figure 2, the method includes the following steps:
[0099] Step S201: Obtain data samples.
[0100] Step S202: Determine the instruction classification information.
[0101] Step S203: Input at least one instruction data and instruction classification information into the large language model.
[0102] For a detailed description of steps S201-S203, please refer to the above embodiment.
[0103] Step S204: Convert the online classification results and AI classification results into recognizable characters.
[0104] Optionally, the bad case mining device may not be able to recognize the content of the online classification results and AI classification results mentioned above. Based on this, in some embodiments, the online classification results and AI classification results can be converted into recognizable characters (i.e., characters that can be recognized by the bad case mining device), thereby ensuring that the bad case mining device can subsequently mine bad cases based on the recognizable characters corresponding to the online classification results and the recognizable characters corresponding to the AI classification results.
[0105] Step S205: Based on the recognizable characters corresponding to the AI classification results of the instruction data and the recognizable characters corresponding to the online classification results of the instruction data, mine bad cases.
[0106] Optionally, it can be determined whether the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data. If they are consistent, it means that the online classification result is correct, that is, the interactive device has correctly parsed the instruction data, thus determining that the instruction data is a non-bad example. If they are inconsistent, it means that the online classification result is incorrect, that is, the interactive device has not correctly parsed the instruction data, thus determining that the instruction data is a bad example.
[0107] Mining bad cases by comparing whether the AI classification results of the instruction data are consistent with the online classification results ensures the accuracy of bad case mining, and the operation process is relatively simple.
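Steps S204 and S205 can be illustrated with the sketch below. The disclosure does not define what form the "recognizable characters" take, so this example assumes they are canonical category codes looked up from an alias table; `ALIASES`, `to_recognizable`, and `is_bad_case` are hypothetical names.

```python
# Hypothetical alias table mapping raw label text to a canonical code.
ALIASES = {
    "control": "CONTROL",
    "control commands": "CONTROL",
    "navi": "NAVIGATION",
    "navigation commands": "NAVIGATION",
}


def to_recognizable(label, aliases=ALIASES):
    """Normalize case, whitespace, and trailing punctuation, then look up the code."""
    key = " ".join(label.lower().split()).rstrip(";. ")
    return aliases.get(key, "UNKNOWN")


def is_bad_case(online_label, ai_label, aliases=ALIASES):
    """A sample is a bad case when the two recognizable codes differ."""
    return to_recognizable(online_label, aliases) != to_recognizable(ai_label, aliases)
```

With this table, an online result of `"control"` and an AI result of `"Control commands;"` map to the same code and are treated as consistent, while `"control"` and `"Navigation commands;"` are not. Labels missing from the table all map to `"UNKNOWN"`, so a practical implementation would need to decide how such samples are handled.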
[0108] In summary, the bad case mining method disclosed herein requires no manual operation, which can minimize the data volume limitation caused by labor costs, realize large-scale data annotation and mining, improve the efficiency and accuracy of bad case mining, and also enhance the user experience.
[0109] Meanwhile, the large language model in this embodiment can realize online real-time updates of model training data, and can update the model parameters of the large language model based on the updated model training data, thus avoiding the situation of "limited sources of model training data for the large language model or untimely updates of model parameters for the large language model", thereby solving the technical problem of "incomplete bad case mining due to limited sources of model training data or untimely updates of model parameters".
[0110] Figure 3 is a flowchart of a bad case mining method according to an embodiment of this disclosure. As shown in Figure 3, the method includes:
[0111] Obtain raw data from the data log;
[0112] The raw data is cleaned to obtain at least one instruction data.
[0113] Select the online classification results corresponding to each instruction data from the original data, and convert the online classification results into recognizable characters to obtain the recognizable characters corresponding to the online classification results;
[0114] Determine the instruction classification information, and input the instruction classification information and at least one instruction data into the large language model to obtain the AI classification result;
[0115] Convert the AI classification results into recognizable characters to obtain the recognizable characters corresponding to the AI classification results;
[0116] The accuracy of online classification is judged based on the online classification results and AI classification results of instruction data;
[0117] When it is determined that the online classification is accurate (i.e., the online classification result is correct), the instruction data is filtered into the non-bad example set for further sampling quality inspection.
[0118] When it is determined that the online classification is inaccurate (i.e., the online classification result is incorrect), the instruction data is filtered into the bad case set, and the bad cases in the bad case set are converted into R&D requirements so that the bad cases can be optimized.
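The flow of Figure 3 can be sketched end to end as follows. This is a simplified illustration under stated assumptions: the log entry keys `text` and `online_label`, the alias table, and the `classify` callable (a stand-in for the large language model) are all hypothetical, and cleaning is reduced to dropping empty and duplicate instructions.

```python
def run_pipeline(raw_logs, classify, aliases):
    """Return (bad_set, non_bad_set) for the given raw log entries.

    raw_logs: list of dicts with hypothetical keys 'text' and 'online_label'.
    classify: hypothetical callable mapping a list of instructions to a list of labels.
    aliases: dict mapping lowercase label text to a recognizable code.
    """
    # Clean the raw data: drop empty texts and duplicates, keeping first occurrences.
    seen, samples = set(), []
    for entry in raw_logs:
        text = entry["text"].strip()
        if text and text not in seen:
            seen.add(text)
            samples.append((text, entry["online_label"]))

    def normalize(label):
        # Convert a classification result into a recognizable code.
        return aliases.get(label.strip().lower(), "UNKNOWN")

    # Obtain AI classification results for all cleaned instructions.
    ai_labels = classify([text for text, _ in samples])

    # Judge the online classification against the AI classification.
    bad_set, non_bad_set = [], []
    for (text, online_label), ai_label in zip(samples, ai_labels):
        if normalize(online_label) == normalize(ai_label):
            non_bad_set.append(text)   # candidate for sampling quality inspection
        else:
            bad_set.append(text)       # candidate for conversion into R&D requirements
    return bad_set, non_bad_set
```

In this sketch two labels that are both missing from the alias table compare as equal, which a real deployment would likely want to treat separately.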
[0119] Figure 4 is a structural diagram of a bad case mining apparatus according to an embodiment of this disclosure. As shown in Figure 4, the bad case mining apparatus includes:
[0120] The acquisition module is used to acquire data samples, the data samples including at least one instruction data historically input to the interactive device and the online classification result of the instruction data determined by the interactive device;
[0121] A determination module is used to determine instruction classification information, which is used to classify instruction data;
[0122] An input module is used to input the at least one instruction data and the instruction classification information into a large language model, so that the large language model can determine the AI classification results corresponding to the at least one instruction data based on the instruction classification information.
[0123] The mining module is used to mine bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data.
[0124] The bad case mining device disclosed herein can mine bad cases without manual operation, which can minimize the data volume limitation caused by labor costs, realize large-scale data annotation and mining, improve the efficiency and accuracy of bad case mining, and enhance the user experience.
[0125] Meanwhile, the large language model in this embodiment can realize online real-time updates of model training data, and can update the model parameters of the large language model based on the updated model training data, thus avoiding the situation of "limited sources of model training data for the large language model or untimely updates of model parameters for the large language model", thereby solving the technical problem of "incomplete bad case mining due to limited sources of model training data or untimely updates of model parameters".
[0126] Optionally, the training data of the large language model can be updated in real time.
[0127] Optionally, the instruction classification information includes at least one of the following:
[0128] At least one instruction category;
[0129] A category description corresponding to each instruction category, the category description being used to describe the characteristics of the instruction data belonging to that instruction category;
[0130] At least one example instruction corresponding to each instruction category.
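For illustration only, the instruction classification information listed above could be held in a small data structure and rendered into text that accompanies the instruction data when it is input to the large language model. The disclosure does not prescribe any particular format; the class `InstructionCategory`, the function `render_classification_info`, and the wording of the rendered text are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InstructionCategory:
    name: str                                          # instruction category
    description: str = ""                              # category description (optional)
    examples: List[str] = field(default_factory=list)  # example instructions (optional)


def render_classification_info(categories: List[InstructionCategory]) -> str:
    """Render the categories, descriptions, and examples as plain text."""
    lines = ["Classify each instruction into exactly one of these categories:"]
    for cat in categories:
        line = f"- {cat.name}"
        if cat.description:
            line += f": {cat.description}"
        lines.append(line)
        for example in cat.examples:
            lines.append(f"  e.g. {example}")
    return "\n".join(lines)


categories = [
    InstructionCategory("navigation", "asks for a route or destination", ["navigate home"]),
    InstructionCategory("music"),
]
info_text = render_classification_info(categories)
```

Because the description and examples are optional fields, the same structure covers any combination of the three items enumerated above.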
[0131] Optionally, the mining module is further configured to:
[0132] Determine whether the AI classification result of the instruction data is consistent with the online classification result corresponding to the instruction data;
[0133] If the AI classification result of the instruction data is consistent with the online classification result corresponding to the instruction data, the instruction data is determined to be a non-bad case;
[0134] If the AI classification result of the instruction data is inconsistent with the online classification result corresponding to the instruction data, the instruction data is determined to be a bad case.
[0135] Optionally, the acquisition module is further configured to:
[0136] Obtain raw data from the data log;
[0137] The raw data is cleaned to obtain at least one instruction data.
[0138] Select the online classification results corresponding to each instruction data from the raw data;
[0139] The at least one instruction data and the corresponding online classification results are determined as data samples.
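The acquisition steps above may be sketched as follows, under assumptions that are not stated in this disclosure: the data log is taken to be a sequence of JSON lines, the instruction text is taken to be stored under a hypothetical field `query`, and the online classification result under a hypothetical field `online_label`. Cleaning is reduced here to dropping malformed, empty, and duplicate records; an actual cleaning step could differ.

```python
import json
from typing import Iterable, List, Tuple


def acquire_samples(raw_lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Clean raw log lines into (instruction data, online classification result) pairs."""
    samples: List[Tuple[str, str]] = []
    seen = set()
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # drop log lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        query = record.get("query")          # hypothetical field name
        label = record.get("online_label")   # hypothetical field name
        if not isinstance(query, str) or not isinstance(label, str):
            continue  # drop records missing either field
        query = query.strip()
        if not query or (query, label) in seen:
            continue  # drop empty instructions and exact duplicates
        seen.add((query, label))
        samples.append((query, label))
    return samples


raw_log = [
    '{"query": " play jazz ", "online_label": "music"}',
    "not json",
    '{"query": "", "online_label": "music"}',
    '{"query": "play jazz", "online_label": "music"}',
    '{"query": "navigate home"}',
]
data_samples = acquire_samples(raw_log)
```

Only the first record survives in this example: the others are malformed, empty, duplicated after whitespace trimming, or lack an online classification result.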
[0140] Optionally, the apparatus is further configured to:
[0141] Convert the online classification results and the AI classification results into recognizable characters, respectively.
[0142] Optionally, the mining module is further configured to:
[0143] Determine whether the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data;
[0144] If the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data, the instruction data is determined to be a non-bad case;
[0145] If the recognizable character corresponding to the AI classification result of the instruction data is inconsistent with the recognizable character corresponding to the online classification result of the instruction data, the instruction data is determined to be a bad case.
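This disclosure does not specify how classification results are converted into recognizable characters. One possible reading, offered only as a sketch, is that both results are normalized to a common text form before comparison, so that differences in encoding, case, or surrounding punctuation do not produce false bad cases. The mapping `ONLINE_CODE_TO_LABEL` and the functions `to_recognizable` and `is_bad_case` are hypothetical.

```python
import unicodedata

# Hypothetical mapping from numeric online category codes to readable labels.
ONLINE_CODE_TO_LABEL = {1: "navigation", 2: "music"}


def to_recognizable(result) -> str:
    """Convert a classification result into normalized, comparable text."""
    if isinstance(result, int):
        result = ONLINE_CODE_TO_LABEL.get(result, str(result))
    # NFKC folds full-width and other compatibility forms into their plain equivalents.
    text = unicodedata.normalize("NFKC", str(result))
    # Trim whitespace and stray quotes or periods that a model may add around a label.
    return text.strip().strip(".\"'").casefold()


def is_bad_case(online_result, ai_result) -> bool:
    """A sample is a bad case when the normalized results are inconsistent."""
    return to_recognizable(online_result) != to_recognizable(ai_result)
```

For example, `is_bad_case(2, " Music. ")` is false because both sides normalize to `music`, while `is_bad_case(1, "music")` is true.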
[0146] According to embodiments of this disclosure, this disclosure also provides an electronic device, a readable storage medium, and a computer program product.
[0147] Figure 5 shows a schematic block diagram of an example electronic device 500 that can be used to implement embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely illustrative and are not intended to limit the implementation of the present disclosure described and/or claimed herein.
[0148] As shown in Figure 5, device 500 includes a computing unit 501, which can perform various appropriate actions and processes based on a computer program stored in read-only memory (ROM) 502 or a computer program loaded from storage unit 508 into random access memory (RAM) 503. RAM 503 may also store various programs and data required for the operation of device 500. The computing unit 501, ROM 502, and RAM 503 are interconnected via bus 504. Input/output (I/O) interface 505 is also connected to bus 504.
[0149] Multiple components in device 500 are connected to I/O interface 505, including: input unit 506, such as a keyboard or mouse; output unit 507, such as various types of monitors and speakers; storage unit 508, such as a magnetic disk or optical disk; and communication unit 509, such as a network card, modem, or wireless transceiver. Communication unit 509 allows device 500 to exchange information/data with other devices through computer networks such as the Internet and/or various telecommunications networks.
[0150] The computing unit 501 can be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various special-purpose artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as the bad case mining method of the first aspect embodiment or the bad case mining method of the second aspect embodiment. For example, in some embodiments, the bad case mining method of the first aspect embodiment or of the second aspect embodiment can be implemented as a computer software program that is tangibly embodied in a machine-readable medium, such as storage unit 508. In some embodiments, part or all of the computer program can be loaded and/or installed on device 500 via ROM 502 and/or communication unit 509. When the computer program is loaded into RAM 503 and executed by the computing unit 501, one or more steps of the bad case mining method of the first aspect embodiment or of the second aspect embodiment described above can be performed. Alternatively, in other embodiments, the computing unit 501 may be configured by any other suitable means (e.g., by means of firmware) to perform the bad case mining method of the first aspect embodiment or of the second aspect embodiment.
[0151] Various embodiments of the systems and techniques described above herein can be implemented in digital electronic circuit systems, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems-on-a-chip (SoCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementations in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a dedicated or general-purpose programmable processor, capable of receiving data and instructions from a storage system, at least one input device, and at least one output device, and transmitting data and instructions to the storage system, the at least one input device, and the at least one output device.
[0152] The program code used to implement the methods of this disclosure may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus, such that when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may be executed entirely on a machine, partially on a machine, as a standalone software package partially on a machine and partially on a remote machine, or entirely on a remote machine or server.
[0153] In the context of this disclosure, a machine-readable medium can be a tangible medium that may include or store a program for use by or in conjunction with an instruction execution system, apparatus, or device. A machine-readable medium can be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium can be, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
[0154] To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having: a display device for displaying information to the user (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor); and a keyboard and pointing device (e.g., a mouse or trackball) through which the user provides input to the computer. Other types of devices can also be used to provide interaction with the user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form (including sound input, voice input, or tactile input).
[0155] The systems and technologies described herein can be implemented in computing systems that include backend components (e.g., as a data server), or computing systems that include middleware components (e.g., an application server), or computing systems that include frontend components (e.g., a user computer with a graphical user interface or web browser through which a user can interact with embodiments of the systems and technologies described herein), or any combination of such backend, middleware, or frontend components. The components of the system can be interconnected via digital data communication of any form or medium (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
[0156] Computer systems can include clients and servers. Clients and servers are generally located far apart and typically interact via communication networks. Client-server relationships are created by computer programs running on the respective computers and having a client-server relationship with each other. Servers can be cloud servers, servers in distributed systems, or servers incorporating blockchain technology.
[0157] It should be understood that steps may be reordered, added, or deleted using the various forms of processes shown above. For example, the steps described in this disclosure can be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed herein can be achieved; no limitation is imposed in this respect.
[0158] The specific embodiments described above do not constitute a limitation on the scope of protection of this disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modifications, equivalent substitutions, and improvements made within the spirit and principles of this disclosure should be included within the scope of protection of this disclosure.
Claims
1. A bad case mining method, characterized in that it comprises: acquiring data samples, the data samples including at least one instruction data historically input to an interactive device and the online classification result of the instruction data determined by the interactive device; determining instruction classification information, which is used to classify instruction data, wherein the instruction classification information includes a category description corresponding to each instruction category, and the category description is used to describe the characteristics of the instruction data belonging to the instruction category; inputting the at least one instruction data and the instruction classification information into a large language model so that the large language model determines the AI classification results corresponding to the at least one instruction data based on the instruction classification information, wherein the large language model is online in real time, and when the instruction classification information input into the large language model has not been updated, the instruction classification information need not be input again when instruction data is input into the large language model to obtain AI classification results; and mining bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data; wherein mining bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data includes: determining whether the AI classification result of the instruction data is consistent with the online classification result corresponding to the instruction data; if the AI classification result of the instruction data is consistent with the online classification result corresponding to the instruction data, determining the instruction data to be a non-bad case; and if the AI classification result of the instruction data is inconsistent with the online classification result corresponding to the instruction data, determining the instruction data to be a bad case.
2. The method according to claim 1, characterized in that the training data for the large language model can be updated in real time.
3. The method according to claim 1 or 2, characterized in that the instruction classification information further includes at least one of the following: at least one instruction category; at least one example instruction corresponding to each instruction category.
4. The method according to claim 1 or 2, characterized in that acquiring the data samples includes: obtaining raw data from a data log; cleaning the raw data to obtain at least one instruction data; selecting the online classification result corresponding to each instruction data from the raw data; and determining the at least one instruction data and the corresponding online classification results as the data samples.
5. The method according to claim 4, characterized in that the method further includes: converting the online classification results and the AI classification results into recognizable characters, respectively.
6. The method according to claim 5, characterized in that mining bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data includes: determining whether the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data; if the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data, determining the instruction data to be a non-bad case; and if the recognizable character corresponding to the AI classification result of the instruction data is inconsistent with the recognizable character corresponding to the online classification result of the instruction data, determining the instruction data to be a bad case.
7. A bad case mining apparatus, characterized in that it comprises: an acquisition module used to acquire data samples, the data samples including at least one instruction data historically input to an interactive device and the online classification result of the instruction data determined by the interactive device; a determination module used to determine instruction classification information, which is used to classify instruction data, wherein the instruction classification information includes a category description corresponding to each instruction category, and the category description is used to describe the characteristics of the instruction data belonging to the instruction category; an input module used to input the at least one instruction data and the instruction classification information into a large language model, so that the large language model determines the AI classification results corresponding to the at least one instruction data based on the instruction classification information, wherein the large language model is online in real time, and when the instruction classification information input into the large language model has not been updated, the instruction classification information need not be input again when instruction data is input into the large language model to obtain AI classification results; and a mining module used to mine bad cases based on the AI classification results of the instruction data and the online classification results corresponding to the instruction data; wherein the mining module is further used to: determine whether the AI classification result of the instruction data is consistent with the online classification result corresponding to the instruction data; if the AI classification result of the instruction data is consistent with the online classification result corresponding to the instruction data, determine the instruction data to be a non-bad case; and if the AI classification result of the instruction data is inconsistent with the online classification result corresponding to the instruction data, determine the instruction data to be a bad case.
8. The apparatus according to claim 7, characterized in that the training data for the large language model can be updated in real time.
9. The apparatus according to claim 7 or 8, characterized in that the instruction classification information further includes at least one of the following: at least one instruction category; at least one example instruction corresponding to each instruction category.
10. The apparatus according to claim 7 or 8, characterized in that the acquisition module is further used to: obtain raw data from a data log; clean the raw data to obtain at least one instruction data; select the online classification result corresponding to each instruction data from the raw data; and determine the at least one instruction data and the corresponding online classification results as the data samples.
11. The apparatus according to claim 10, characterized in that the apparatus is further used to: convert the online classification results and the AI classification results into recognizable characters, respectively.
12. The apparatus according to claim 11, characterized in that the mining module is further used to: determine whether the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data; if the recognizable character corresponding to the AI classification result of the instruction data is consistent with the recognizable character corresponding to the online classification result of the instruction data, determine the instruction data to be a non-bad case; and if the recognizable character corresponding to the AI classification result of the instruction data is inconsistent with the recognizable character corresponding to the online classification result of the instruction data, determine the instruction data to be a bad case.
13. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-6.
15. A computer program product comprising a computer program that, when executed by a processor, implements the steps of the method according to any one of claims 1-6.