Order assigning intelligent elimination method and system based on large language model

By converting work order information into query vectors using a large language model and retrieving relevant clauses, and combining this with a distillation model to generate structured output results, the problem of low accuracy in judging work orders is solved, and efficient and accurate intelligent work order dispatch and rejection is achieved.

CN122242694APending Publication Date: 2026-06-19BEIJING XINGTIANDI INFORMATION TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING XINGTIANDI INFORMATION TECH CO LTD
Filing Date
2026-05-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

The low accuracy rate of judgment in the government service system leads to low judgment efficiency and inconsistent standards, which affects work efficiency.

Method used

A dispatch intelligent rejection method based on a large language model is adopted. The work order information is converted into query vectors through a vector embedding model, relevant terms are retrieved using a vector database, and structured output results are generated by combining a distillation model to realize automatic analysis and intelligent judgment of the appeal work orders.

Benefits of technology

It improved the accuracy and efficiency of judging work orders, reduced the ineffective use of resources, and enhanced the standardization and consistency of work order rejection.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure CN122242694A_ABST
    Figure CN122242694A_ABST
Patent Text Reader

Abstract

This application provides a method and system for intelligent task rejection based on a large language model, relating to the field of data processing technology for government service systems. After acquiring task information, the method uses a first model to convert the task information into a query vector, and then retrieves relevant clauses from a vector database based on the query vector. A dialogue context is then generated based on the relevant clauses and task information. Finally, a second model is used to generate a structured output result including result fields, reason fields, and basis fields based on the dialogue context. This method can perform intelligent task rejection based on a large language model. By combining vector retrieval and deep learning models, it achieves automatic analysis and intelligent judgment of task content, improving the accuracy and efficiency of task rejection, thereby solving the problem of low accuracy in judging request tasks in government service systems.
Need to check novelty before this filing date? Find Prior Art

Description

Technical Field

[0001] This application relates to the field of data processing technology for government service systems, and in particular to a method and system for intelligent order rejection based on a large language model. Background Technology

[0002] The government service system processes a large number of service requests during its operation. These requests originate from various channels, including government service hotlines, the government service platform's online platform, and mobile applications (APPs), resulting in diverse sources and complex content. Service requests must be dispatched to the relevant authorities for processing according to current regulations. However, some service requests that meet the exclusion criteria can be removed from the dispatch process without being forwarded to the relevant authorities.

[0003] To determine whether a petition needs to be removed from the system, staff in the government service system can manually read and categorize each petition. However, this manual process is inefficient and inconsistent in its interpretation of removal criteria among staff, leading to a backlog of petitions and impacting government efficiency. Therefore, artificial intelligence technology can be used to automatically assess petition content.

[0004] For example, a multi-agent collaborative architecture can be used to break down the processing flow of complaint tickets into stages such as acceptance, classification, dispatch, handling, verification, and evaluation, thereby achieving automatic flow of complaint tickets. Specifically, when classifying complaint tickets, rule engines or machine learning methods can be used to extract key information from the complaint tickets and make classification judgments based on this information. However, the above-mentioned complaint ticket processing method is difficult to adapt to the dynamic updates of exclusion clauses, resulting in low accuracy in judgment. Summary of the Invention

[0005] In view of this, embodiments of this application provide a method and system for intelligent rejection of dispatch orders based on a large language model, in order to solve the problem of low accuracy in judging work orders in government service systems.

[0006] According to a first aspect of this application, a method for intelligent order rejection based on a large language model is provided, the method comprising: Obtain work order information, which includes the text content of the work orders to be processed received through the information sending and receiving interface of the government service system; The work order information is converted into a query vector using a first model, which is a vector embedding model based on a large language model. Relevant clauses are retrieved from a vector database based on the query vector. The vector database stores knowledge entries corresponding to the clauses to be removed. The knowledge entries are vector representations generated by vectorizing the clauses to be removed using the first model. The relevant clauses are retrieved based on the semantic similarity between the knowledge entries in the vector database and the query vector. A dialogue context is generated based on the relevant terms and the work order information, and the dialogue context is injected with preset system prompt words; The second model is used to generate a structured output result based on the dialogue context. The second model is a distillation model based on a large language model. The structured output result includes a result field, a reason field, and a basis field. The result field is used to indicate whether to dispatch the work order corresponding to the work order information.

[0007] In some embodiments, the method further includes: Obtain the exclusion clause document, which is used to store the exclusion clauses; Read the clause text from the removal clause document; The clause text is preprocessed according to preset preprocessing items to obtain clause input data; the preprocessing items include at least one of text cleaning, format conversion, and character processing. The first model is used to perform vectorization processing on the input data of the clause to generate knowledge entries corresponding to the clause to be removed; The knowledge entries are stored in the vector database.

[0008] In some embodiments, storing the knowledge entries in the vector database includes: A dataset is created in the vector database, and the dataset is defined with field dimensions; the field dimensions include a primary key ID, a vector field, and a scalar field; the vector field is used to store the knowledge entries; the scalar field is used to store the clause text; Insert the knowledge entries and the clause text into the data set; The index type and distance metric are specified for the knowledge entries in the dataset. The index type is used to characterize the retrieval methods supported by the knowledge entries. The distance metric includes a preset similarity threshold.

[0009] In some embodiments, the method further includes: Get the text of the new terms; The first model is used to vectorize the newly added clause text into a vector representation to obtain the new knowledge entry. Search the vector database for similar terms based on the newly added knowledge entries; Based on the query results of the similar terms, the vector database is updated using the newly added knowledge entry; wherein, when the similar terms are found, the knowledge entry of the similar terms is replaced with the newly added knowledge entry; when no similar terms are found, the newly added knowledge entry is added to the vector database.

[0010] In some embodiments, retrieving relevant terms from a vector database based on the query vector includes: Calculate the cosine similarity between the query vector and the knowledge entry representation in the vector database; Candidate clauses are determined based on a preset similarity threshold and the cosine similarity, wherein the candidate clauses are the removal clauses corresponding to the knowledge entries whose cosine similarity is greater than or equal to the similarity threshold; The candidate clauses are arranged in descending order of cosine similarity to obtain a sequence of candidate clauses. Extract the relevant clause from the sequence of alternative clauses.

[0011] In some embodiments, a dialogue context is generated based on the relevant terms and the work order information, including: Combine the relevant terms and the work order information into a query context; Obtain preset system prompts, including basis prompts, intent prompts, and output method prompts; The system prompt word is injected into the query context to obtain the dialogue context.

[0012] In some embodiments, the system prompt word is injected into the query context to obtain the dialogue context, including: Read the work order information and related terms from the query context; Based on the work order information and the relevant terms, the injection parameters of the query context are detected. The injection parameters include the injectable position and the prompt word type corresponding to the injectable position. The system prompt is injected into the query context according to the injection parameters to obtain the dialogue context.

[0013] In some embodiments, a second model is used to generate structured output results based on the dialogue context, including: Set the temperature parameters of the second model, wherein the temperature parameters are positive numbers less than or equal to a preset temperature threshold; The dialogue context is input into the second model to calculate the logical value of the dialogue context relative to the classification label using the second model; The logic value is corrected using the temperature parameter, and the classification probability is calculated based on the corrected logic value. The structured output result is generated based on the classification probability.

[0014] In some embodiments, generating the structured output result based on the classification probability includes: The result label is extracted according to the classification probability, and the result label is the classification label with the highest classification probability; The result field is generated based on the result tag; The basis field is generated based on the relevant clauses, and the reason field is generated based on the basis field and the work order information; Obtain the output format of the structured output result; The result field, the reason field, and the basis field are combined into the structured output result according to the output format.

[0015] According to a second aspect of this application, a dispatch intelligent rejection system based on a large language model is provided, the system comprising: The work order acquisition module is used to acquire work order information, which includes the text content of the work orders to be processed received through the information sending and receiving interface of the government service system. The query vector module is used to convert the work order information into query vectors using a first model, wherein the first model is a vector embedding model based on a large language model. The retrieval module is used to retrieve relevant clauses from a vector database based on the query vector. The vector database stores knowledge entries corresponding to the clauses to be removed. The knowledge entries are vector representations generated by vectorizing the clauses to be removed using the first model. The relevant clauses are retrieved based on the semantic similarity between the knowledge entries in the vector database and the query vector. The context generation module is used to generate a dialogue context based on the relevant terms and the work order information, and the dialogue context is injected with preset system prompt words; The result output module is used to generate structured output results using a second model based on the dialogue context. The second model is a distillation model based on a large language model. The structured output results include a result field, a reason field, and a basis field. The result field is used to indicate whether to dispatch the work order corresponding to the work order information.

[0016] According to a third aspect of this application, a computer device is provided, including a storage medium, a processor, and a computer program stored on the storage medium and executable on the processor, wherein the processor executes the program to implement the above-described intelligent order rejection method based on a large language model.

[0017] According to a fourth aspect of this application, a storage medium is provided on which a computer program is stored, which, when executed by a processor, implements the above-described intelligent order rejection method based on a large language model.

[0018] By employing the above technical solutions, embodiments of this application provide a method and system for intelligent work order rejection based on a large language model. After obtaining work order information, the method uses a first model to convert the work order information into a query vector, and then retrieves relevant clauses from a vector database based on the query vector. A dialogue context is then generated based on the relevant clauses and work order information. Finally, a second model is used to generate a structured output result including result fields, reason fields, and basis fields based on the dialogue context. This method can perform intelligent work order rejection based on a large language model. By combining vector retrieval and deep learning models, it achieves automatic analysis and intelligent judgment of work order content, improving the accuracy and efficiency of work order rejection, thereby solving the problem of low accuracy in judging work orders in government service systems.

[0019] The above description is only an overview of the technical solution of this application. In order to better understand the technical means of this application and to implement it in accordance with the contents of the specification, and to make the above and other objects, features and advantages of this application more obvious and understandable, the following are specific embodiments of this application. Attached Figure Description

[0020] The accompanying drawings, which are included to provide a further understanding of this application and form part of this application, illustrate exemplary embodiments and are used to explain this application, but do not constitute an undue limitation of this application. In the drawings: Figure 1 A schematic diagram of the intelligent order rejection method based on a large language model provided in this application embodiment; Figure 2 This is a schematic diagram of the overall process of the intelligent order rejection method provided in the embodiments of this application; Figure 3 This is a schematic diagram of the work order processing flow provided in the embodiments of this application; Figure 4 This is a schematic diagram illustrating the principle of vector semantic retrieval provided in the embodiments of this application; Figure 5 This is a schematic diagram of the process for generating structured output results provided in an embodiment of this application; Figure 6 This is a schematic diagram of the vector database update process provided in an embodiment of this application; Figure 7 This is a schematic diagram of the intelligent order rejection system based on a large language model provided in an embodiment of this application. Detailed Implementation

[0021] The present application will be described in detail below with reference to the accompanying drawings and embodiments. It should be noted that, unless otherwise specified, the embodiments and features described in the embodiments of the present application can be combined with each other.

[0022] In this embodiment, the intelligent rejection method for dispatching requests based on a large language model can be applied to government service systems. These systems process a large number of service requests during operation. These requests originate from various channels, including government service hotlines, the government service platform's online platform, and mobile applications (APPs), resulting in diverse sources and complex content. Service requests need to be dispatched to relevant authorities for processing according to current regulations. However, some service requests that meet the rejection criteria can be rejected without being dispatched to relevant authorities.

[0023] For example, in police work scenarios, some work orders received by the police dispatch system should be transferred to relevant agencies for processing according to current regulations. However, according to relevant exclusion rules, work orders that are recurring complaints about noise pollution, animal issues, dissatisfaction with enforcement or handling, inquiries about case progress, or commendations within the same area in a short period of time actually fall into the category of non-police-related or do not require police intervention and can be excluded from dispatch. However, a large number of these excluding work orders still flood into the police dispatch system, resulting in the ineffective use and waste of police resources.

[0024] To determine whether a petition needs to be removed from the system, staff in the government service system can manually read and categorize each petition. However, this manual process is inefficient and inconsistent in its interpretation of removal criteria among staff, leading to a backlog of petitions and impacting government efficiency. Therefore, artificial intelligence technology can be used to automatically assess petition content.

[0025] For example, a multi-agent collaborative architecture can be used to break down the processing flow of complaint tickets into stages such as acceptance, classification, dispatch, handling, verification, and evaluation, thereby achieving automatic flow of complaint tickets. Specifically, when classifying complaint tickets, rule engines or machine learning methods can be used to extract key information from the complaint tickets and make classification judgments based on this information. However, the above-mentioned complaint ticket processing method is difficult to adapt to the dynamic updates of exclusion clauses, resulting in low accuracy in judgment.

[0026] To address the issue of low accuracy in judging petitions in government service systems, this application provides a method for intelligent petition rejection based on a large language model in some embodiments. This method utilizes a Retrieval-Augmented Generation (RAG) architecture, combining vector retrieval, a vector database, and a large language model to perform intelligent petition rejection. This allows for automatic analysis and judgment of various petitions, achieving intelligent identification and rejection of petitions, significantly improving the accuracy and timeliness of petition processing, reducing ineffective resource consumption, and enhancing the standardization and consistency of petition rejection work.

[0027] The method can be applied to government service systems or electronic devices that establish a communication connection with the government service system and have data processing capabilities. These electronic devices include, but are not limited to, computers, servers, mobile terminals, smart wearable devices, and industrial control machines. For ease of description, this application embodiment uses a government service system as the executing entity of the method. It should be understood that the method can also be applied to other types of executing entities, which are not illustrated in this application embodiment. Figure 1 As shown, the method includes: S101. Obtain work order information.

[0028] When performing intelligent rejection of dispatched work orders, the work order information can be obtained first. This work order information includes the text content of the work orders to be processed received through the government service system's information sending and receiving interface. The work order information can be the original text information of the request work orders received by the government service system through the information sending and receiving interface, or it can be the text information obtained by extracting information from the request work orders.

[0029] To obtain work order information, in some embodiments, the government service system's message sending and receiving interface can be monitored periodically to actively retrieve request work order data. The dispatch status of each request work order in the data is then read, and pending work orders are extracted based on the dispatch status. A list of pending work orders is then generated based on the order of receipt time or the urgency of the request. Finally, text content is extracted from each request work order according to the order in the pending work order list to obtain the work order information.

[0030] For example, for government service systems used in police work, the police work system can receive work order data pushed by external systems or superior platforms through the Representational State Transfer Application Programming Interface (RESTful API) or Message Queue (MQ). The data interaction format can adopt the JavaScript Object Notation (JSON) standard.

[0031] When obtaining work order information, work order data can be actively retrieved through periodic polling. Then, the JSON structured work order data can be parsed to extract key text information of the request work order. Key text information may include work order number, request title, request content, source channel, urgency level, processing time limit, reporter information, attachments, etc.

[0032] S102. Use the first model to convert the work order information into a query vector.

[0033] like Figure 2 As shown, after obtaining the work order information, it can be input as text into the first model, which then converts the work order information into a query vector. The first model is a vector embedding model based on a large language model. Vector embedding models can convert text into vector representations, ensuring that semantically similar texts are close in the vector space. Therefore, the first model can vectorize the input text information to generate corresponding high-dimensional vector representations.

[0034] In some embodiments, as a vector embedding model, the first model may include an input layer, an embedding layer, an encoder, and an output layer. The input layer translates the raw input text into numerical IDs that the vector embedding model can initially understand. Specifically, the input layer can perform tokenization, breaking down the raw input text into multiple tokens, then indexing them, and using a vocabulary to map each token to an integer ID.

[0035] The embedding layer can convert discrete lexical symbols into continuous initial vectors. The embedding layer can then use the integer IDs generated by the input layer to look up the vector in the row corresponding to the integer ID in a pre-constructed lookup table, thus obtaining the initial word embedding vector. The lookup table is a matrix of vocabulary size multiplied by the vector dimension.

[0036] The encoder, through a sophisticated attention mechanism, allows the initial word embedding vectors generated by the embedding layer to interact and incorporate contextual semantics, generating deeper, high-dimensional vector representations. Therefore, the encoder can generate query, key, and value vectors through a self-attention mechanism module. Based on the query vector corresponding to a word, it matches it with the key vectors of all words in the sentence to calculate an attention score. Then, the attention scores are used to weight and sum the value vectors of all words to obtain a vector representation containing contextual information. Finally, a feedforward network performs nonlinear transformations and feature extraction on the vector representation output by the self-attention module, outputting a high-dimensional vector representation.

[0037] The output layer can use pooling to take an average of all the token vectors output by the encoder, and use this averaged vector as the final semantic representation of the entire sentence.

[0038] As can be seen, based on the vectorization processing function of the first model, by inputting work order information into the first model, the vectorized processing result output by the first model can be obtained and used as the query vector for the Retrieval Enhancement Generation (RAG) process. The Retrieval Enhancement Generation process improves the accuracy and interpretability of the answer by combining an external knowledge base with a language model and referencing accurate knowledge sources when answering questions.

[0039] The first model can be a vector embedding model built for government service systems, or it can be a vector embedding model formed by reusing a large language model. For example, the first model can be the QW3-Embedding model, which is a vector embedding model trained on the QW3 large language model and specifically used for text representation, semantic retrieval, and ranking tasks.

[0040] To generate query vectors, after obtaining the work order information, the work order information can be input into the Qwen3-Embedding model, which will then convert the text content corresponding to the work order information into query vectors.

[0041] S103. Retrieve relevant terms from the vector database based on the query vector.

[0042] like Figure 3 As shown, after converting work order information into query vectors, relevant clauses can be retrieved from the vector database based on the query vectors. The vector database stores knowledge entries corresponding to the clauses to be removed. The vector database can efficiently store and retrieve high-dimensional vector data, and quickly find the document fragments most relevant to the query content through similarity calculation.

[0043] The knowledge entries stored in the vector database are vector representations generated by vectorizing the exclusion clauses using the first model. That is, for the first model, when different types of text information are input, vectorization can yield vector representations with different functions. In some embodiments, the first model can also be used to generate vector representations of exclusion clauses for constructing the vector database. Then, in the vector database, the vector representations of exclusion clauses can serve as knowledge entries for enhanced retrieval generation.

[0044] In order to build a vector database, the government service system can first obtain a document of exclusion clauses, wherein the document of exclusion clauses is used to store exclusion clauses, and each exclusion case is stored as an independent knowledge entry by reading the document of exclusion clauses.

[0045] The clause text is then read from the removed clause document and preprocessed according to preset preprocessing items to obtain clause input data. The preprocessing items include at least one of text cleaning, format conversion, and character processing.

[0046] For example, before embedding the clause text, text cleaning can be performed to remove irrelevant characters and prevent the introduction of meaningless vector dimensions. This includes removing HTML tags, Markdown marks, standardizing full-width and half-width characters, and removing invisible control characters. After text cleaning, the clause text can be formatted correctly, such as standardizing capitalization for English content to prevent the model from treating the same content as different words. Then, redundant whitespace can be removed by merging consecutive spaces into one, removing leading and trailing spaces, and removing full-width spaces to reduce invalid tokens and save computational resources. When the clause text contains special characters, the decision to retain or delete these special characters can be made based on the scenario.

[0047] In addition to the preprocessing content mentioned above, word processing can also be performed according to specific business areas, such as removing stop words, stemming, word form restoration, sentence or paragraph segmentation, and retaining special tags, so as to process the clause text into a text data form that the first model can recognize.

[0048] Then, the first model is used to perform vectorization processing on the clause input data to generate knowledge entries corresponding to the removed clauses, and the knowledge entries are then stored in the vector database. When storing knowledge entries in the vector database, a data set can be created in the vector database, wherein the data set is defined with field dimensions; the field dimensions include primary key ID, vector fields, and scalar fields; the vector fields are used to store knowledge entries; and the scalar fields are used to store clause text.

[0049] Then, the knowledge entries and clause texts are inserted into the dataset, and an index type and distance metric are specified for the knowledge entries in the dataset. The index type is used to characterize the retrieval methods supported by the knowledge entries; the distance metric includes a preset similarity threshold.

[0050] For example, vector databases such as Milvus are open-source vector databases specifically designed for storing, indexing, and retrieving vector data generated by embedding models. To store the vector representations corresponding to the generated clause texts in the vector database and establish a searchable vector index, the government service system can, after ensuring the pymilvus library is installed, first connect to the vector database server and then use a Python client to connect to the Milvus instance. Next, a collection is created. When creating the collection, a data schema needs to be defined, specifying the field types and dimensions. Through field planning, the field dimensions are determined to include a primary key ID, vector fields, and scalar fields. Vector fields are used to store the embedding vector representations generated by the first model, serving as knowledge entries. Scalar fields are used to store the original text or metadata corresponding to the clause texts.

[0051] Next, the vector representations of the clause texts generated by the first model, along with the original text information of the clause texts, are inserted into the dataset to obtain multiple knowledge entries. Each knowledge entry is a list of dictionaries, and each dictionary contains predefined fields such as (id, vector, text). After inserting the data into the dataset, the index type and loading method need to be specified for the vector database. This involves specifying the index type and distance metric by creating an index, and loading the index into memory for high-speed retrieval. The index type (index_type) determines the retrieval speed and accuracy; examples include brute-force search, clustering-based indexes, and graph-based indexes. The distance metric (metric_type) serves as a standard for measuring vector similarity; preset similarity thresholds can include cosine distance (COSINE) threshold, Euclidean distance (L2) threshold, and inner product (IP) threshold.

[0052] As can be seen, the method can construct a vector database for the removal clauses as a dedicated knowledge base. It can build more than 500 removal clauses into a searchable structured knowledge base, and use text vectorization technology to independently encode and store each removal case. It establishes a semantic retrieval mechanism based on cosine similarity, enabling the system to understand work order content that is semantically similar but expressed differently, thus overcoming the limitations of keyword matching.

[0053] Relevant terms retrieved from a vector database based on a query vector can be obtained by using the semantic similarity between the knowledge entries in the vector database and the query vector. That is, for example... Figure 4 As shown, in some embodiments, when retrieving relevant terms from a vector database based on a query vector, the cosine similarity between the query vector and the knowledge entry representation in the vector database can be calculated first.

[0054] For example, cosine similarity can be used to calculate the semantic similarity between text vectors. For an n-dimensional query vector... A = ( a 1, a 2, ..., a n ) and n-dimensional knowledge entry vector B = ( b 1, b 2, ..., b n The formula for calculating cosine similarity is:

[0055] in, A Represents the query vector; Indicates the magnitude of the query vector; B Represents a knowledge entry vector; This represents the magnitude of the knowledge entry vector. The cosine similarity value ranges from [-1, 1], and a higher cosine similarity value indicates that the two texts are more semantically similar.

[0056] After calculating the cosine similarity, a preset similarity threshold can be obtained, and candidate clauses can be determined based on the preset similarity threshold and the cosine similarity. The candidate clauses are the removal clauses corresponding to knowledge entries whose cosine similarity is greater than or equal to the similarity threshold.

[0057] The candidate clauses are then sorted in descending order of cosine similarity to obtain a sequence of candidate clauses. Relevant clauses are then extracted from this sequence, where the relevant clauses are either the highest-ranking candidate clauses or the first predetermined number of candidate clauses in the sequence.

[0058] For example, by setting the chat mode to query mode and using the Qwen3-Embedding model to convert the work order content into query vectors, the most relevant exclusion clauses can be retrieved from the Milvus vector database. In the vector retrieval process, the Qwen3-Embedding model can first be used to convert the work order text into query vectors. A And calculate the query vector in the Milvus vector database. A With all knowledge entry vectors in the vector database Bi The cosine similarity is then used to rank the knowledge entries in descending order, and the Top-K (e.g., K=5) most relevant knowledge entries are selected as the search results to obtain relevant terms.

[0059] By employing context building and knowledge referencing mechanisms within the query mode, a query mode can be used instead of a chat mode during the vector retrieval stage, ensuring retrieval accuracy. Furthermore, the retrieved Top-K relevant exclusion clauses can be combined with the original work order content to construct a complete query context, enabling the model to accurately reference specific legal bases during inference, thus addressing the issues of lack of professional knowledge support and poor interpretability in general-purpose large language models.

[0060] S104. Generate a dialogue context based on relevant terms and work order information.

[0061] After retrieving and identifying relevant terms, context construction can be performed, that is, generating a dialogue context based on the relevant terms and work order information. This dialogue context is injected with preset system prompts. During context construction, the retrieved relevant terms and work order information are first combined to construct a complete query context. Then, preset system prompts are injected into the query context to obtain the dialogue context.

[0062] To generate a dialogue context, in some embodiments, when generating the dialogue context based on relevant terms and work order information, the relevant terms and work order information can first be combined into a query context, and then preset system prompts can be obtained. These system prompts include basis prompts, intent prompts, and output method prompts. The system prompts are then injected into the query context to obtain the dialogue context.

[0063] Furthermore, when injecting system prompts into the query context, the work order information and related terms in the query context can be read first, and the injection parameters of the query context can be detected based on the work order information and related terms. These injection parameters include injectable positions and the prompt type corresponding to the injectable positions. Then, the system prompts are injected into the query context according to the injection parameters to obtain the dialogue context.

[0064] For example, when generating dialogue context, the police work system can, in conjunction with work order information, generate query context for clauses such as "the incident belongs to noise disturbance, canine issues, dissatisfaction with the crackdown or handling, inquiry about case progress, and commendation-related situations can be directly excluded." It can also obtain preset system prompts, including: basis prompts such as "the knowledge base shows the basis for…" and "if…, the above types of basis correspond to the situation"; intent prompts such as "remove the dispatch order" and "analyze whether the incident can be excluded based on the knowledge base"; and output method prompts such as "output according to…".

[0065] Then, based on the position of relevant clauses and work order information in the query context, determine the injectable positions and the corresponding prompt word types. For example, a basis prompt word can be injected in the vicinity of the relevant clauses; an intent prompt word can be injected within the basis prompt word; and an output method prompt word can be injected after the intent prompt word. Therefore, the content obtained is: "The knowledge base contains grounds for removing work orders. Based on the knowledge base analysis, whether an event can be removed is determined. If the event falls under the categories of noise pollution, canine issues, dissatisfaction with enforcement or handling, querying case progress, or commendation, it can be directly removed. The above types of grounds correspond to the respective situations. Output only three fields: result, reason, and basis, in JSON format: {"Result":<true|false> The dialogue context of ", reason":<remove reason>, basis":<remove basis>}.

[0066] It is evident that by using scenario-specific prompt word engineering and structured output design, dedicated system prompt words can be designed for dispatch rejection scenarios, clearly guiding the second model to identify rejection situations. Furthermore, by constraining the structured output format, the model output can include fields such as "result" (Boolean value), "reason" (rejection reason), and "basis" (corresponding legal clauses), achieving machine-readable and interpretable judgment.

[0067] S105. Use the second model to generate structured output results based on the dialogue context.

[0068] like Figure 5 As shown, after generating the dialogue context, a second model can be used to generate structured output results based on the dialogue context. This second model is a distillation model based on a large language model. The second model can perform model inference, that is, based on the input dialogue context, it infers whether the work order information meets the relevant terms, and outputs structured output results based on the inference result.

[0069] Similar to the first model, the second model can also be a reasoning model specifically built for government service systems, or a distilled model formed by knowledge distillation or reuse of a large language model. For example, the second model is the DEEPSEEK-R1-DISTILL-QWEN:32B model. The DEEPSEEK-R1-DISTILL-QWEN:32B model is a distilled model in the DeepSeek-R1 series. It can inherit the strong reasoning capabilities of a large language model at a scale of approximately 32 billion parameters through knowledge distillation techniques, achieving a balance between performance and resource consumption.

[0070] It is evident that the government service system can achieve intelligent dispatch rejection based on the RAG architecture and dual-model collaboration. By constructing a knowledge-based driven architecture specifically for government dispatch rejection scenarios, semantic encoding is performed using the Qwen3-Embedding model, efficient similarity retrieval is achieved by combining the Milvus vector database, and reasoning and judgment are performed based on the retrieval results using the DEEPSEEK-R1-DISTILL-QWEN:32B large language model, forming a complete technical closed loop of vector retrieval and knowledge enhancement generation.

[0071] In some embodiments, the second model is based on the large language model and adopts the same model architecture as the large language model. It performs knowledge distillation through supervised fine-tuning (SFT) to construct a lightweight distillation model. The second model uses the large language model as the teacher model and trains the student model with 800,000 inference samples with detailed thought chains generated by the DS-R1 model through supervised fine-tuning, thereby obtaining the lightweight distillation model, which serves as the second model.

[0072] The structured output generated by the second model based on the dialogue context can include a result field, a reason field, and a basis field. The result field indicates whether to dispatch a work order corresponding to the work order information.

[0073] For example, structured output can be formatted as JSON and includes three fields: result, reason, and basis. Therefore, the content of structured output would be: {"Result":}<true|false> "Reason": <Reason removed>, "Basis": <Basis removed>}.

[0074] In some embodiments, when using the second model to generate structured output results based on the dialogue context, the temperature parameter of the second model can be set first, wherein the temperature parameter is a positive number less than or equal to a preset temperature threshold.

[0075] The temperature parameter (Temperature) can be used to control the randomness of the second model's output. The relationship between the temperature parameter T and the classification probability is as follows:

[0076] in, Indicates selecting the first The probability of classifying each token. Indicates the first The logit value of a token, which is the raw score of the second model's output layer before softmax normalization, reflects the second model's confidence in that token; T represents the temperature parameter. When the temperature parameter T approaches 0, the second model tends to select the word with the highest probability, and the output is more certain; when the temperature parameter T increases, the output of the second model becomes more random.

[0077] Since the temperature parameter controls the sharpness of the probability distribution, when T approaches 0, the softmax output approaches a one-hot vector, indicating complete determinism; while when T approaches infinity, the softmax output approaches a uniform distribution, indicating complete randomness. Therefore, the temperature parameter of the second model can be set to T=0.1, which is a low-temperature setting, making the probability distribution of the second model's output more concentrated and ensuring the stability and consistency of the judgment results. Correspondingly, order rejection is a classification judgment task that requires deterministic output. Low temperature ensures that the same input produces the same output, guaranteeing judgment consistency. Therefore, setting the temperature parameter to 0.1 can avoid the problem of inconsistent judgment results for the same work order due to randomness.

[0078] After setting the temperature parameters for the second model, the dialogue context is input into the second model to calculate the logical value of the dialogue context relative to the classification label. Then, the temperature parameters are used to correct the logical value, and the classification probability is calculated based on the corrected logical value. Finally, a structured output result is generated based on the classification probability.

[0079] In some embodiments, when generating structured output results based on classification probabilities, result tags can be extracted first according to the classification probabilities, wherein the result tags are the classification tags with the highest classification probabilities. Then, result fields are generated based on the result tags. Next, a basis field is generated based on relevant clauses, and a reason field is generated based on the basis field and work order information. By obtaining the output format of the structured output results, the result fields, reason fields, and basis fields are combined into a structured output result according to the output format.

[0080] For example, a government service system receives a work order stating, "Residents complain about noise pollution from construction sites at night." RAG (Rich Internet Query Tool) searches the system and identifies the relevant clause as: "Noise pollution complaints can be directly deleted." Therefore, after combining the work order information and relevant clauses into a dialogue context and inputting it into a second model, the second model can perform large-scale inference, determining whether the work order falls into the category of removable requests based on rules, and generating a JSON output. The JSON output, as structured data from the model, can contain three fields: result (true), indicating removal; reason (explaining the decision, "Noise pollution complaints can be directly deleted"); and basis (citing the basis for removal, "Noise pollution can be removed"). This allows the large-scale model to automatically identify and filter out specific types of work orders, reducing manual processing.

[0081] By applying the technical solutions of the above embodiments, the intelligent order rejection method based on a large language model described in the above embodiments can improve the accuracy of order rejection judgment. By introducing a vector database as a domain knowledge base, the general capabilities of the large language model are combined with specific business rules. Compared with rule engines or statistical patterns that simply rely on keyword matching, the method can handle complex cases with similar semantics but different business meanings, and effectively suppresses model illusions and reduces the false judgment rate through retrieval enhancement.

[0082] The method also ensures the stability of the judgment process. Since a general-purpose large language model may produce different results when reasoning about the same work order multiple times due to variations in temperature parameter settings, this method uses knowledge base retrieval results as contextual constraints, combined with a deterministic retrieval mechanism. Therefore, it can ensure consistent output for the same input, meeting the deterministic requirements of the business process.

[0083] The method can also enhance the interpretability of the model output. By providing clear knowledge base references for each exclusion clause, such as specific legal clauses and historical case numbers, it can alleviate the pain point of black-box decision-making in deep learning methods and facilitate business personnel review and system iteration optimization.

[0084] The method can also maintain a balance between accuracy and efficiency. Although the introduction of vector retrieval increases the number of calculation steps, the index optimization and caching mechanism can significantly improve the processing speed while maintaining a high accuracy rate. The processing time for a single work order is reduced from minutes to seconds, which is far more efficient than manual review, thus achieving a balance between accuracy and efficiency.

[0085] In some embodiments, as a refinement and extension of the specific implementation of the above embodiments, and to fully illustrate the specific implementation process of this embodiment, some embodiments of this application also provide a method for intelligent order rejection based on a large language model. The difference between this method and the above embodiments is that it can update the vector database according to newly added terms, such as... Figure 6 As shown, the method includes: S201. Obtain the text of the newly added clause; S202. Use the first model to vectorize the newly added clause text into a vector representation to obtain the new knowledge entries; S203. Search for similar terms in the vector database according to the newly added knowledge entries; S204. Based on the query results for similar terms, update the vector database using the newly added knowledge entries.

[0086] During the maintenance of the vector database, the government service system can receive new clause documents sent by the operations and maintenance terminal and read the new clause text from these documents. After preprocessing, the new clause text can be input into the first model, which then vectorizes the text into a vector representation to obtain the new knowledge entry.

[0087] Next, similar clauses are searched in the vector database for each newly added knowledge entry. Similar clauses are those with content identical or similar to the newly added clause. Therefore, similar clauses can also be found by calculating the semantic similarity between the newly added knowledge entry and knowledge entries in the vector database, and based on a preset semantic similarity threshold for the newly added clause. Knowledge entries with a semantic similarity greater than or equal to the semantic similarity threshold are identified as similar clauses. For example, if the newly added clause states "Noise disturbance complaints in the same area within one week can be directly deleted," then by calculating the similarity between the newly added knowledge entry and knowledge entries in the vector database, the knowledge entry containing "Noise disturbance complaints can be directly deleted" is determined to be the most semantically similar to the newly added clause, and therefore, the similar clause is determined to be "Noise disturbance complaints can be directly deleted."

[0088] Then, based on the search results for similar terms, the vector database is updated using newly added knowledge entries. Specifically, when similar terms are found, the knowledge entries for the similar terms are replaced with newly added knowledge entries; when no similar terms are found, the newly added knowledge entries are added to the vector database.

[0089] For example, for a new clause stating "Noise disturbance complaints in the same area within one week can be directly deleted," if a knowledge entry corresponding to this clause is found in the vector database, the new knowledge entry can replace the knowledge entry for a similar clause, thus updating the individual knowledge entry. Conversely, if no removal clause related to noise disturbance is found in the vector database, the new knowledge entry can be added to the vector database, thus updating the vector database.

[0090] By applying the technical solutions of the above embodiments, the intelligent order rejection method based on a large language model described in the above embodiments can achieve a zero-sample adaptive update mechanism without the need for labeled data. System capabilities are expanded through a pure knowledge base update method; when adding rejection clauses, only a vectorized document needs to be appended to the knowledge base for it to take effect, without the need to retrain the model or prepare a large amount of labeled data, significantly reducing system maintenance costs and dependence on historical data.

[0091] In some embodiments, as a specific implementation of the intelligent order rejection method based on a large language model described in the above embodiments, some embodiments of this application also provide an intelligent order rejection system based on a large language model, such as... Figure 7 As shown, the system includes: The work order acquisition module is used to acquire work order information, which includes the text content of the work orders to be processed received through the information sending and receiving interface of the government service system. The query vector module is used to convert the work order information into query vectors using a first model, wherein the first model is a vector embedding model based on a large language model. The retrieval module is used to retrieve relevant clauses from a vector database based on the query vector. The vector database stores knowledge entries corresponding to the clauses to be removed. The knowledge entries are vector representations generated by vectorizing the clauses to be removed using the first model. The relevant clauses are retrieved based on the semantic similarity between the knowledge entries in the vector database and the query vector. The context generation module is used to generate a dialogue context based on the relevant terms and the work order information, and the dialogue context is injected with preset system prompt words; The result output module is used to generate structured output results using a second model based on the dialogue context. The second model is a distillation model based on a large language model. The structured output results include a result field, a reason field, and a basis field. The result field is used to indicate whether to dispatch the work order corresponding to the work order information.

[0092] By applying the technical solutions of the above embodiments, the intelligent dispatch rejection system based on a large language model described in the above embodiments, after obtaining work order information, can use a first model to convert the work order information into query vectors, and retrieve relevant clauses in the vector database based on the query vectors. Then, a dialogue context is generated based on the relevant clauses and work order information, and a second model is used to generate a structured output result including result fields, reason fields, and basis fields based on the dialogue context. The system can perform intelligent dispatch rejection based on a large language model. By combining vector retrieval and deep learning models, it achieves automatic analysis and intelligent judgment of work order content, improving the accuracy and efficiency of dispatch rejection, thereby solving the problem of low accuracy in judging request work orders in government service systems.

[0093] It should be noted that other corresponding descriptions of the functional units involved in the intelligent order rejection system based on a large language model provided in the embodiments of this application can be found in the corresponding descriptions in the intelligent order rejection method based on a large language model provided in the above embodiments, and will not be repeated here.

[0094] This application also provides a computer device, specifically a personal computer, server, network device, etc. The computer device includes a bus, processor, memory, and communication interface, and may also include input / output interfaces and a display device. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium. The database of the computer device stores location information. The network interface of the computer device is used for communication with external terminals via a network connection. When the computer program is executed by the processor, it implements the steps in the various method embodiments.

[0095] Those skilled in the art will understand that the structure of the computer device described above is only a partial structure related to the solution of this application, and does not constitute a limitation on the computer device to which the solution of this application is applied. A specific computer device may include more or fewer components, or combine certain components, or have different component arrangements.

[0096] In one embodiment, a computer-readable storage medium is also provided, which may be non-volatile or volatile, and a computer program is stored thereon, which, when executed by a processor, implements the steps in the above method embodiments.

[0097] In one embodiment, a computer program product is also provided, including a computer program that, when executed by a processor, implements the steps in the above method embodiments.

[0098] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties.

[0099] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium. When the computer program is executed, it can include the processes of the embodiments of the above methods.

[0100] Any references to memory, database, or other media used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc.

[0101] Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).

[0102] The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, graphics processors, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0103] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0104] The embodiments described above are merely examples of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these modifications and improvements all fall within the protection scope of this application.

Claims

1. A method for intelligent order rejection based on a large language model, characterized in that, The method includes: Obtain work order information, which includes the text content of the work orders to be processed received through the information sending and receiving interface of the government service system; The work order information is converted into a query vector using a first model, which is a vector embedding model based on a large language model. Relevant clauses are retrieved from a vector database based on the query vector. The vector database stores knowledge entries corresponding to the clauses to be removed. The knowledge entries are vector representations generated by vectorizing the clauses to be removed using the first model. The relevant clauses are retrieved based on the semantic similarity between the knowledge entries in the vector database and the query vector. A dialogue context is generated based on the relevant terms and the work order information, and the dialogue context is injected with preset system prompt words; The second model is used to generate a structured output result based on the dialogue context. The second model is a distillation model based on a large language model. The structured output result includes a result field, a reason field, and a basis field. The result field is used to indicate whether to dispatch the work order corresponding to the work order information.

2. The method according to claim 1, characterized in that, The method further includes: Obtain the exclusion clause document, which is used to store the exclusion clauses; Read the clause text from the removal clause document; The clause text is preprocessed according to preset preprocessing items to obtain clause input data; the preprocessing items include at least one of text cleaning, format conversion, and character processing. The first model is used to perform vectorization processing on the input data of the clause to generate knowledge entries corresponding to the clause to be removed; The knowledge entries are stored in the vector database.

3. The method according to claim 2, characterized in that, Storing the knowledge entries in the vector database includes: A dataset is created in the vector database, and the dataset is defined with field dimensions; the field dimensions include a primary key ID, a vector field, and a scalar field; the vector field is used to store the knowledge entries; the scalar field is used to store the clause text; Insert the knowledge entries and the clause text into the data set; The index type and distance metric are specified for the knowledge entries in the dataset. The index type is used to characterize the retrieval methods supported by the knowledge entries. The distance metric includes a preset similarity threshold.

4. The method according to claim 3, characterized in that, The method further includes: Get the text of the new terms; The first model is used to vectorize the newly added clause text into a vector representation to obtain the new knowledge entry. Search the vector database for similar terms based on the newly added knowledge entries; Based on the query results of the similar terms, the vector database is updated using the newly added knowledge entry; wherein, when the similar terms are found, the knowledge entry of the similar terms is replaced with the newly added knowledge entry; when no similar terms are found, the newly added knowledge entry is added to the vector database.

5. The method according to claim 1, characterized in that, Retrieving relevant terms from the vector database based on the query vector includes: Calculate the cosine similarity between the query vector and the knowledge entry representation in the vector database; Candidate clauses are determined based on a preset similarity threshold and the cosine similarity, wherein the candidate clauses are the removal clauses corresponding to the knowledge entries whose cosine similarity is greater than or equal to the similarity threshold; The candidate clauses are arranged in descending order of cosine similarity to obtain a sequence of candidate clauses. Extract the relevant clause from the sequence of alternative clauses.

6. The method according to claim 1, characterized in that, Generate a dialogue context based on the relevant terms and the work order information, including: Combine the relevant terms and the work order information into a query context; Obtain preset system prompts, including basis prompts, intent prompts, and output method prompts; The system prompt word is injected into the query context to obtain the dialogue context.

7. The method according to claim 6, characterized in that, Injecting the system prompt word into the query context to obtain the dialogue context includes: Read the work order information and related terms from the query context; Based on the work order information and the relevant terms, the injection parameters of the query context are detected. The injection parameters include the injectable position and the prompt word type corresponding to the injectable position. The system prompt is injected into the query context according to the injection parameters to obtain the dialogue context.

8. The method according to claim 1, characterized in that, The second model generates structured output results based on the dialogue context, including: Set the temperature parameters of the second model, wherein the temperature parameters are positive numbers less than or equal to a preset temperature threshold; The dialogue context is input into the second model to calculate the logical value of the dialogue context relative to the classification label using the second model; The logic value is corrected using the temperature parameter, and the classification probability is calculated based on the corrected logic value. The structured output result is generated based on the classification probability.

9. The method according to claim 8, characterized in that, The structured output result is generated based on the classification probability, including: The result label is extracted according to the classification probability, and the result label is the classification label with the highest classification probability; The result field is generated based on the result tag; The basis field is generated based on the relevant clauses, and the reason field is generated based on the basis field and the work order information; Obtain the output format of the structured output result; The result field, the reason field, and the basis field are combined into the structured output result according to the output format.

10. A dispatching intelligent rejection system based on a large language model, characterized in that, The system includes: The work order acquisition module is used to acquire work order information, which includes the text content of the work orders to be processed received through the information sending and receiving interface of the government service system. The query vector module is used to convert the work order information into query vectors using a first model, wherein the first model is a vector embedding model based on a large language model. The retrieval module is used to retrieve relevant clauses from a vector database based on the query vector. The vector database stores knowledge entries corresponding to the clauses to be removed. The knowledge entries are vector representations generated by vectorizing the clauses to be removed using the first model. The relevant clauses are retrieved based on the semantic similarity between the knowledge entries in the vector database and the query vector. The context generation module is used to generate a dialogue context based on the relevant terms and the work order information, and the dialogue context is injected with preset system prompt words; The result output module is used to generate structured output results using a second model based on the dialogue context. The second model is a distillation model based on a large language model. The structured output results include a result field, a reason field, and a basis field. The result field is used to indicate whether to dispatch the work order corresponding to the work order information.