Method and system for generating official document based on administrative constraint and streaming verification

The method parses administrative hierarchy relationships and applies dynamic tone constraints, combines streaming generation with parallel verification, and uses a knowledge graph for real-time verification and correction, so that official documents generated by a large language model use an appropriate tone and cite authentic content. This addresses the inappropriate tone and inaccurate citations of existing technologies, and improves generation efficiency and system transparency.

CN122197832A · Pending · Publication Date: 2026-06-12 · BEIJING THUNISOFT INFORMATION TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
BEIJING THUNISOFT INFORMATION TECH
Filing Date
2026-03-18
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing large language models cannot accurately understand administrative hierarchy when generating official documents, resulting in inappropriate phrasing, inaccurate citations that are difficult to correct, and a lack of flexibility and real-time performance.

Method used

By using methods such as administrative hierarchy relationship parsing, dynamic linguistic constraints, streaming generation, and parallel verification, combined with knowledge graphs for real-time verification and correction, the appropriateness of the linguistic style and the authenticity of the cited content in official documents are ensured.

Benefits of technology

It ensures the accuracy of voice and the correctness of hierarchical relationships in the document generation process, guarantees the authenticity and timeliness of cited content, and improves generation efficiency and system transparency.

✦ Generated by Eureka AI based on patent content.


Abstract

The application provides a document generation method and system based on administrative constraints and streaming verification, belonging to the technical field of intelligent document generation, aiming to solve the problems of improper tone and inaccurate reference when a large language model generates a document. The method comprises: receiving a document writing instruction, analyzing the hierarchical relationship of the document sender and the document receiver; according to the hierarchical relationship, dynamically imposing tone constraints on the text generation process of the large language model; instructing the large language model to generate text in a streaming manner, while monitoring the references to external files in the text stream in parallel; when a reference is detected, the generation is paused, and a preset knowledge graph is queried to verify the validity of the reference; if the verification is invalid, an automatic correction operation is triggered; the application ensures tone compliance through administrative constraints, and ensures that the reference is real and effective through streaming verification and correction, and can generate a document with proper tone and accurate reference.

Description

Technical Field

[0001] This application relates to the field of artificial intelligence technology, and in particular to a method and system for generating official documents based on administrative constraints and streaming verification.

Background Technology

[0002] With the rapid development of large language model technology, using artificial intelligence to assist in drafting official documents, such as notices, requests for instructions, and letters, has become an important direction for government office automation. Existing technologies typically employ retrieval-augmented generation: similar historical documents are retrieved from a local template library based on the topic entered by the user and provided to the large language model as background information, so that it generates new documents by imitating the style and format of the templates.

[0003] However, such methods have significant drawbacks. First, official document writing is highly standardized and formal, especially with respect to the relationship between authority and responsibility and the tone of the text. General language models lack an understanding of administrative organizational structures and hierarchical relationships and fail to distinguish the wording appropriate to upward, downward, and parallel documents, which often results in tone misalignment, such as using a commanding tone when addressing superiors. Second, general language models suffer from the "hallucination" phenomenon: they may fabricate non-existent policies and regulations or cite expired or invalid documents as evidence, severely undermining the authority and legal validity of official documents. Even existing technologies that attempt to introduce compliance verification often rely on keyword matching and highlighting after the full text has been generated, which provides no automatic correction and requires the entire verification logic to be restructured whenever regulations are updated, so they lack flexibility and real-time capability. While some technologies propose detection and correction during streaming generation, they primarily address the general hallucination problem; they provide no specific verification and automatic correction mechanism for the validity of policies and regulations in the official document field, nor do they solve the fundamental problem of inappropriate tone. Therefore, how to ensure that large language models both adhere to strict administrative hierarchical language when generating official documents and guarantee the authenticity and validity of the cited content is a major challenge for current technology.

Summary of the Invention

[0004] The purpose of this application is to provide a method and system for generating official documents based on administrative constraints and streaming verification. It aims to solve the technical problem that existing large language models cannot simultaneously meet the dual constraints of hierarchical authority and content compliance when generating official documents due to a lack of understanding of the rules specific to the administrative field. This will overcome the defects of existing technologies, such as inappropriate phrasing, inaccurate citations, and difficulty in correction.

[0005] To achieve the above objectives, this application provides a document generation method based on administrative constraints and streaming verification, comprising: an administrative hierarchy relationship parsing step, used to receive a document drafting instruction, extract the issuing subject and the receiving object from the instruction, and determine the hierarchical relationship between the issuing subject and the receiving object based on a preset administrative rank system; a dynamic voice constraint application step, used to dynamically apply voice constraints to the text generation process of a large language model according to the hierarchical relationship; a streaming generation and parallel verification step, used to instruct the large language model to stream-generate text as a token sequence based on the voice constraints and, while the large language model generates text, to monitor in parallel whether references to external files appear in the generated text stream; and a reference validity verification and correction step, used, when a reference is detected, to pause text generation, query a preset knowledge graph to verify the validity of the reference, and, if the verification result is that the reference is invalid, trigger an automatic correction operation for the reference.

[0006] Optionally, the process of determining the hierarchical relationship between the issuing entity and the receiving entity in the administrative hierarchy relationship parsing step specifically includes: assigning quantitative values to the administrative levels of the issuing entity and the receiving entity; and calculating the administrative distance representing the hierarchical relationship between the issuing entity and the receiving entity by performing calculations on the quantitative values.

[0007] Optionally, the dynamic voice constraint application step includes: loading a modest voice library when the administrative distance indicates an upward relationship; loading a guiding voice library when the administrative distance indicates a downward relationship; and loading a negotiating voice library when the administrative distance indicates a parallel relationship.
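The library selection in [0007] can be sketched as a small dispatch on the sign of the administrative distance. This is an illustrative sketch only: the function name and the returned library labels are hypothetical, and the sign convention (positive for upward, negative for downward, zero for parallel) is taken from the values D = 2, D = -2, and D = 0 used in the embodiments.

```python
def select_voice_library(administrative_distance: int) -> str:
    """Map the sign of the administrative distance D to a voice library label.

    Assumed convention (from the embodiments): D > 0 is an upward document,
    D < 0 is a downward document, and D == 0 is a parallel document.
    """
    if administrative_distance > 0:
        return "modest"       # upward: addressed to a higher-level unit
    if administrative_distance < 0:
        return "guiding"      # downward: addressed to a lower-level unit
    return "negotiating"      # parallel: addressed to a unit of the same level
```

Only the sign of D is used here; whether the magnitude should also influence the constraints is not stated in the text.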

[0008] Optionally, the process of verifying the validity of the reference in the reference validity verification and correction step includes: extracting the entity to be verified from the reference; calculating the vector cosine similarity between the entity to be verified and the entity in the knowledge graph; and judging the validity of the reference based on the comparison result of the vector cosine similarity value and a preset threshold.
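As a rough illustration of the similarity test in [0008], the sketch below compares an embedding of the entity to be verified against embeddings of knowledge graph entities. How the vectors are produced is not specified in the text, so they are passed in as plain lists; the function names and the default threshold of 0.9 are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(query_vector, graph_vectors, threshold=0.9):
    """Return (entity_name, score) for the most similar graph entity,
    or None when no entity reaches the threshold.

    graph_vectors maps entity names to their vectors; the threshold value
    is a hypothetical default, not one given in the source.
    """
    best_name, best_score = None, -1.0
    for name, vector in graph_vectors.items():
        score = cosine_similarity(query_vector, vector)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, best_score
    return None
```

A `None` result would correspond to a reference that matches no known document; a match would still need its validity status checked before the reference is accepted.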

[0009] Optionally, the automatic correction operation includes: instructing the large language model to revert to the position before the reference; and injecting the correct file information obtained from the knowledge graph into the generation context to regenerate the content.

[0010] Optionally, in the streaming generation and parallel verification steps, the parallel monitoring of whether there are references to external files in the generated text stream is achieved by detecting whether book titles appear in the text stream, or whether preset keywords selected from "according to" or "in accordance with" appear.
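The detection rule in [0010] can be illustrated with a regular expression for book title marks and a substring check for the preset keywords. This is a minimal sketch: the names are hypothetical, and the keywords are the English renderings used in this text, whereas a deployed system would presumably match the original-language terms.

```python
import re

# Text enclosed in book title marks, e.g. 《Some Regulation》.
TITLE_MARK_PATTERN = re.compile(r"《([^《》]+)》")

# Preset citation keywords as rendered in this text (assumed English forms).
CITATION_KEYWORDS = ("according to", "in accordance with")


def find_references(text):
    """Return (titles, keywords): titles found inside book title marks and
    the preset citation keywords that occur in the text."""
    titles = TITLE_MARK_PATTERN.findall(text)
    lowered = text.lower()
    keywords = [kw for kw in CITATION_KEYWORDS if kw in lowered]
    return titles, keywords
```

In a streaming setting the same check would be run on the accumulated text each time a new chunk arrives, so that a title is only reported once its closing mark has been generated.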

[0011] Optionally, the method further includes: before the dynamic voice constraint application step, decoupling the document to be generated into at least two semantic blocks among cause, basis, matter, and requirement, and configuring independent voice constraints for different semantic blocks; and the dynamic voice constraint application step specifically includes: combining the voice constraints determined according to the hierarchical relationship with the independent voice constraints configured for each semantic block to form the final voice constraints applied to each semantic block.
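One possible reading of the combination in [0011] is sketched below. The text does not say how hierarchy-level and block-level constraints are merged, so the rule used here (union of both sets, with a prohibition overriding a recommendation) is purely an assumption, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceConstraints:
    recommended: frozenset = frozenset()
    prohibited: frozenset = frozenset()


def combine(hierarchy, block):
    """Merge hierarchy-level and block-level constraints.

    Assumed rule: take the union of both sets; a word prohibited by either
    side is removed from the recommended set.
    """
    prohibited = hierarchy.prohibited | block.prohibited
    recommended = (hierarchy.recommended | block.recommended) - prohibited
    return VoiceConstraints(recommended, prohibited)


def constraints_per_block(hierarchy, block_constraints):
    """Return the final constraints for each semantic block
    (for example cause, basis, matter, requirement)."""
    return {name: combine(hierarchy, c) for name, c in block_constraints.items()}
```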

[0012] This application also provides a document generation system based on administrative constraints and streaming verification, comprising: a large language model; an intent and hierarchy parsing module, used to receive document drafting instructions, extract the issuing subject and receiving object from the instructions, and determine the hierarchical relationship between the issuing subject and the receiving object based on a preset administrative rank system; a dynamic constraint construction module, used to dynamically construct voice constraints for the text generation process of the large language model according to the hierarchical relationship; a large language model calling module, used to instruct the large language model to stream-generate text as a token sequence based on the voice constraints, and to perform automatic correction of the reference in response to a correction instruction from the streaming compliance verification module; a streaming compliance verification module, configured to run in parallel while the large language model generates text and used to monitor whether references to external files appear in the generated text stream, wherein, when a reference is detected, it sends a pause instruction to the large language model calling module to pause text generation and queries a preset knowledge graph to verify the validity of the reference, and, if the verification result is that the reference is invalid, it sends a correction instruction to the large language model calling module; and a knowledge graph, used to store external files and their validity status for the streaming compliance verification module to query.

[0013] Optionally, the intent and hierarchy parsing module is specifically used to: assign quantitative values to the administrative levels of the issuing entity and the receiving entity; and calculate the administrative distance representing the hierarchical relationship between the issuing entity and the receiving entity by performing calculations on the quantitative values.

[0014] Optionally, to implement the automatic correction operation, the streaming compliance verification module is further configured to: instruct the large language model calling module to revert the generation process to the position before the reference, and inject the correct file information obtained from the knowledge graph into the generation context to regenerate the content.

[0015] Compared with the prior art, this application has the following beneficial effects:

[0016] 1. Ensuring appropriate voice and hierarchical compliance. By calculating administrative distance and dynamically applying voice constraints, this application enables large language models to follow administrative hierarchical norms, solving the problem that general models cannot accurately grasp the wording of official documents, and ensuring the accuracy of the voice and the correctness of the hierarchical relationship in official documents.

[0017] 2. Ensuring the authenticity and timeliness of cited content. This application's streaming verification mechanism, linked to a policy and regulation knowledge graph that is updated in real time, performs real-time checks and automatic corrections on citations during the generation process. This helps avoid the "hallucination" problem of models fabricating regulations or citing outdated or invalid ones, improving the authority and legal validity of the generated documents.

3. Achieving efficient and traceable local correction. When verification detects a problem, this application can block, revert, and perform local automatic correction in real time, rather than requiring full rewriting or manual modification as in existing technologies. This not only improves generation efficiency but also ensures that key information in the documents has a clear knowledge graph source, achieving structured traceability and enhancing the system's transparency and credibility.

Description of the Drawings

[0018] To more clearly illustrate the technical solutions in the embodiments of this application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0019] Figure 1 is a schematic diagram of the structure of a document generation system provided in an embodiment of this application;

[0020] Figure 2 is a flowchart of a document generation method provided in an embodiment of this application;

[0021] Figure 3 is a signaling interaction timing diagram for a document generation method provided in an embodiment of this application.

[0022] The main reference numerals in the attached drawings are explained as follows: 10 - User Interface; 20 - Intent and Hierarchy Parsing Module; 30 - Dynamic Constraint Prompt Construction Module; 40 - Large Language Model Calling Module; 50 - Streaming Compliance Verification Module; 60 - Policy and Regulation Knowledge Graph; S1 - Administrative Hierarchy Relationship Parsing Step; S2 - Dynamic Prompt Construction Step; S3 - Real-time Compliance Verification Step; D - Administrative Distance.

Detailed Implementation

[0023] To make the objectives, technical solutions, and advantages of this application clearer, the application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the specific embodiments described herein are for illustrative purposes only and are not intended to limit the scope of protection of this application.

[0024] Example 1

[0025] This application provides a method and system for generating official documents based on administrative constraints and streaming verification. In this embodiment, the core technical solution of this application will be described in conjunction with the generation scenarios of parallel and downward-directed documents.

[0026] Please refer to Figure 1, which is a schematic diagram of the structure of a document generation system provided in an embodiment of this application. As an optional implementation, the system can be deployed on computing devices such as servers, cloud computing platforms, or personal computers. Its hardware foundation may include conventional computer components such as processors, memory, and network interfaces. At the software level, the system mainly includes a user interface 10, an intent and hierarchy parsing module 20, a dynamic constraint prompt word construction module 30, a large language model calling module 40, a streaming compliance verification module 50, and a policy and regulation knowledge graph 60.

[0027] User interface 10, as the front end for user interaction with the system, can be a graphical user interface or a command-line interface, used to receive document writing instructions in natural language form input by the user, and finally present the generated document to the user.

[0028] The intent and hierarchy parsing module 20 performs the preliminary parsing on which voice control is based. Specifically, after receiving the raw instructions from the user interface 10, this module performs deep semantic analysis on them. For example, this module can integrate a natural language understanding model and use technologies such as named entity recognition to extract the issuing subject and receiving object from the instruction text. After extraction, the module further queries a pre-built database or configuration file that stores various administrative units and their corresponding levels, thereby assigning quantitative values of administrative levels to the issuing subject and receiving object. Accordingly, the module calculates the administrative distance representing the hierarchical relationship between the two. Finally, the module outputs a structured parsing result, which may include the issuing subject, the receiving object, and the calculated administrative distance value, for use by subsequent modules.

[0029] The dynamic constraint prompt word construction module 30 is used to construct prompts for text generation based on the semantic understanding results. This module receives structured results from the intent and hierarchy parsing module 20, especially the administrative distance value. Its core function is to dynamically construct a targeted, composite prompt that guides the large language model in generating text based on the administrative distance value. In one embodiment of this application, this module can maintain multiple voice libraries, such as a modest voice library, a guiding voice library, and a negotiating voice library, corresponding to upward, downward, and parallel relationships, respectively. Each voice library may include recommended vocabulary, sentence templates, and a list of prohibited words. When the administrative distance value is received, the module selects and loads the corresponding voice library and integrates the constraints therein (such as vocabulary weight adjustment and prohibited word penalties) into a structured constraint set. Finally, the module merges the user's original intent, the selected voice template, the hierarchy constraints, and background information obtained from other sources such as knowledge graphs into a customized final prompt and sends it to the large language model calling module 40.

[0030] The large language model calling module 40, as the core execution unit for text generation, encapsulates calling interfaces for one or more large language models. This module receives the final prompt word from the dynamic constraint prompt word construction module 30 and instructs the large language model to initiate the streaming text generation process accordingly. It is understood that streaming generation means that the text is not output all at once, but rather generated and returned piecemeal or segment-by-segment in the form of a sequence of lexical units. Furthermore, this module is responsible for managing the state of the generation process and can respond to control commands from the streaming compliance verification module 50, such as pausing generation, resuming generation, or performing a rollback operation (i.e., discarding the already generated portion of the lexical unit sequence and restarting from a specified position).
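The state management described in [0030] (pause, resume, and rollback over the generated token sequence) might look like the following sketch. The class and method names are hypothetical, and a real calling module would also have to reset the model's own decoding state, which is outside the scope of this example.

```python
class GenerationState:
    """Tracks the generated token sequence and supports the control commands
    named in the text: pause, resume, and rollback."""

    def __init__(self):
        self.tokens = []
        self.paused = False

    def append(self, token):
        """Record a newly generated token; refused while generation is paused."""
        if self.paused:
            raise RuntimeError("generation is paused")
        self.tokens.append(token)

    def pause(self):
        self.paused = True

    def resume(self):
        self.paused = False

    def rollback(self, position):
        """Discard every token from `position` onward and return the discarded
        part, so that generation can restart from that position."""
        if not 0 <= position <= len(self.tokens):
            raise ValueError("rollback position out of range")
        discarded = self.tokens[position:]
        del self.tokens[position:]
        return discarded
```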

[0031] The streaming compliance verification module 50, as a key monitoring unit to ensure the authenticity and validity of the generated content, runs in parallel with the large language model invocation module 40 to form a real-time feedback closed loop. This module continuously receives and analyzes the lexical stream generated by the large language model invocation module 40, and incorporates specific pattern recognition logic to detect whether references to external files appear in the text stream. Once a reference is detected, the module sends a pause command to the large language model invocation module 40. Subsequently, it extracts the name of the referenced file (i.e., the entity to be verified) from the text stream and initiates a query request to the policy and regulation knowledge graph 60. After receiving the verification result from the knowledge graph, the module makes a decision based on preset business logic: if the reference is valid, it sends a continue command to the large language model invocation module 40; if the reference is invalid (e.g., the file is obsolete or does not exist), it sends a correction command, which may include the specific rollback position and the correct information to be injected.

[0032] The policy and regulatory knowledge graph 60 forms the data foundation of this application. As a structured database, it can be implemented using a graph database or a relational database, storing a large number of external documents such as policies, regulations, standards, and ordinances as entities. It should be noted that each entity not only includes its standard name but also a series of important metadata, especially its "validity status" (e.g., effective, repealed, pending effectiveness, revised) and a link to the latest version (if any). To ensure the timeliness of information, this knowledge graph needs to be maintained and updated regularly. It provides fast and accurate query services for the streaming compliance verification module 50, thereby providing data support for the citation validity verification and correction functions.
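The entity structure described in [0032] (a standard name, a validity status, and an optional link to the latest version) could be modeled in memory as follows. This is a sketch under stated assumptions: the class names, the in-memory dictionary, and the sample entries "Old Plan" and "New Plan" are hypothetical, whereas the text envisages a graph or relational database.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ValidityStatus(Enum):
    EFFECTIVE = "effective"
    REPEALED = "repealed"
    PENDING = "pending effectiveness"
    REVISED = "revised"


@dataclass
class DocumentEntity:
    name: str
    status: ValidityStatus
    latest_version: Optional[str] = None  # name of the superseding document, if any


class PolicyKnowledgeGraph:
    def __init__(self, entities):
        self._by_name = {e.name: e for e in entities}

    def lookup(self, name):
        """Return the entity with this standard name, or None if unknown."""
        return self._by_name.get(name)

    def resolve_latest(self, name):
        """Follow latest-version links until an effective entity is found.
        Returns None if the chain ends or loops without reaching one."""
        seen = set()
        entity = self.lookup(name)
        while entity is not None and entity.name not in seen:
            if entity.status is ValidityStatus.EFFECTIVE:
                return entity
            seen.add(entity.name)
            entity = self.lookup(entity.latest_version) if entity.latest_version else None
        return None


# Hypothetical sample data for illustration only.
graph = PolicyKnowledgeGraph([
    DocumentEntity("Old Plan", ValidityStatus.REPEALED, latest_version="New Plan"),
    DocumentEntity("New Plan", ValidityStatus.EFFECTIVE),
])
```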

[0033] The method provided in this embodiment is described below through two specific scenarios, with reference to the method flowchart shown in Figure 2 and the signaling interaction timing diagram shown in Figure 3.

[0034] Scenario 1: Generating parallel business negotiation letters

[0035] In a specific application scenario, users can input commands through user interface 10, such as: "Write a letter from the Municipal Education Bureau to the Municipal Finance Bureau to discuss funding for school building repairs."

[0036] First, in the administrative hierarchy relationship parsing step (corresponding to S1 in Figure 2), after receiving the instruction, the intent and hierarchy parsing module 20 identifies the "issuing entity" as "Municipal Education Bureau" and the "recipient" as "Municipal Finance Bureau." Subsequently, this module queries the preset administrative rank quantification system. It can be understood that quantified rank values for different administrative units can be preset in this system, for example: national level is 10, provincial/ministerial level is 9, municipal (departmental) level is 7, county level is 5, and township/section level is 3. Accordingly, the module assigns a quantified value of 7 to both "Municipal Education Bureau" and "Municipal Finance Bureau," and then calculates the administrative distance D as the level value of the receiving object minus the level value of the issuing subject. In this scenario, the calculation result is D = 7 - 7 = 0, which indicates that the two are parallel.
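The calculation in this step can be sketched with the example rank values listed above. The formula itself is not reproduced in this text; the subtraction order used below (receiver minus issuer) is inferred from the values D = 0, D = -2, and D = 2 that the embodiments report, and the dictionary keys and function name are hypothetical.

```python
# Example rank values from the embodiment; the key names are illustrative.
LEVEL_VALUES = {
    "national": 10,
    "provincial": 9,
    "municipal": 7,
    "county": 5,
    "township": 3,
}


def administrative_distance(issuer_level: str, receiver_level: str) -> int:
    """Administrative distance D = receiver level value - issuer level value.

    Inferred convention: D > 0 means the receiver ranks higher (upward
    document), D < 0 means downward, and D == 0 means parallel.
    """
    return LEVEL_VALUES[receiver_level] - LEVEL_VALUES[issuer_level]
```

For the three scenarios in the embodiments, this gives 0 for municipal to municipal, -2 for municipal to county, and 2 for county to municipal.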

[0037] Next, in the dynamic voice constraint application step (corresponding to S2 in Figure 2), the dynamic constraint prompt word construction module 30 automatically loads the negotiating voice library based on the result of administrative distance D=0. This library can contain recommended words (such as "negotiation," "letter of notification," "hopefully," "please provide support") and a list of prohibited words (such as "requirement," "command," "instruction"), and sets extremely high negative generation weights for the prohibited words. Accordingly, the module integrates these constraints with the user's intent into a composite prompt, for example: "Please draft an official letter on behalf of the Municipal Education Bureau for the Municipal Finance Bureau. The document type is letter, and the core content is to negotiate the funding required for the repair of school buildings. Please note that the tone of the document should be negotiation-oriented, using words such as 'request' and 'letter of notification,' and absolutely prohibiting the use of directive words such as 'requirement' and 'command.'"
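A minimal sketch of how such a prompt could be assembled from the parsed fields and the loaded library is given below. The function name, parameter names, and the exact wording of the template are assumptions; the text only gives one example of the resulting prompt.

```python
def build_prompt(issuer, receiver, doc_type, core_content, tone,
                 recommended, prohibited):
    """Assemble a prompt from the parsed instruction and the voice constraints."""
    recommended_text = ", ".join(f"'{word}'" for word in recommended)
    prohibited_text = ", ".join(f"'{word}'" for word in prohibited)
    return (
        f"Please draft an official {doc_type} on behalf of the {issuer} "
        f"for the {receiver}. The core content is: {core_content}. "
        f"The tone of the document should be {tone}. "
        f"Prefer words such as {recommended_text}. "
        f"Do not use directive words such as {prohibited_text}."
    )
```

A prompt instruction alone does not guarantee that prohibited words are avoided, which is presumably why the text also mentions negative generation weights and real-time filtering.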

[0038] Then, in the streaming generation and parallel verification step, the aforementioned prompt is sent to the large language model calling module 40, and the large language model begins to generate text token by token. Suppose that during generation the model, due to its inherent training bias, attempts to generate the sentence: "...Therefore, we require your bureau to allocate the relevant funds as soon as possible...". At this time, the pre-applied voice constraints (which can be implemented before generation by the dynamic constraint prompt word construction module 30, or monitored by a real-time voice filter similar to the streaming compliance verification module 50) will detect the prohibited word "require". The system then intervenes, for example, by forcing the model to revert to the position after "therefore" and increasing the generation probability of words such as "request", thereby guiding the model to generate the corrected sentence: "...Therefore, we request your bureau to allocate the relevant funds as soon as possible...". In this way, the final generated document conforms to the writing norms between parallel units in terms of voice.
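The real-time filter mentioned in this paragraph can be approximated by scanning the generated tokens for prohibited words and truncating at the first hit. This sketch assumes word-level tokens and hypothetical function names, and it rolls back only to just before the offending token, whereas the example in the text rolls back further, to the position after "therefore".

```python
def first_prohibited_index(tokens, prohibited):
    """Return the index of the first token that is a prohibited word, or None.
    Surrounding punctuation and letter case are ignored."""
    banned = {word.lower() for word in prohibited}
    for index, token in enumerate(tokens):
        if token.strip(".,;:!?\"'").lower() in banned:
            return index
    return None


def apply_filter(tokens, prohibited):
    """Return (kept_tokens, rolled_back). When a prohibited word is found,
    everything from that token onward is discarded so that generation can
    restart from the truncation point."""
    index = first_prohibited_index(tokens, prohibited)
    if index is None:
        return list(tokens), False
    return list(tokens[:index]), True
```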

[0039] Scenario 2: Generating downward notification and performing regulatory verification

[0040] In another application scenario, a user inputs a command through user interface 10: "Draft a notice on carrying out a major safety inspection in the name of the municipal government."

[0041] In the administrative hierarchy parsing step S1, the intent and hierarchy parsing module 20 identifies the "issuing entity" as the "municipal government" (level value 7), and the "recipients" as the subordinate district and county governments and municipal departments (level value usually 5). Based on this, the administrative distance is calculated as D = 5 - 7 = -2. The negative value indicates that this is a downward relationship.

[0042] Entering step S2 of applying dynamic voice constraints, the dynamic constraint prompt word construction module 30 loads the "guidance voice library" based on the result of D=-2. This library can contain words such as "notification," "requirement," "deployment," "practical," and "strict" to enhance the guidance and authority of the document.

[0043] Subsequently, the system enters the streaming generation and parallel verification step. The large language model calling module 40 starts generating the notification content, while the streaming compliance verification module 50, running in parallel, monitors the output token stream in real time. Suppose the large language model generates the following content: "...In order to implement the spirit of the '2020 National Special Rectification Action Plan for Work Safety,' the municipal government has decided after research...".

[0044] At this point, the streaming compliance verification module 50 captures the citation through pattern matching (e.g., detecting the preset citation format of book title marks "《》") and immediately performs the citation validity verification and correction step (corresponding to S3 in Figure 2). The detailed interaction is shown in the timing diagram of Figure 3:

a. The module first sends a pause command to the large language model calling module 40 (message 5 in Figure 3) to temporarily freeze text generation.

b. Next, it extracts the entity to be verified from the text stream: "2020 National Special Rectification Action Plan for Work Safety".

c. The module initiates a query request to the policy and regulation knowledge graph 60 (message 6 in Figure 3) to verify the validity of the document.

d. After searching, the policy and regulation knowledge graph 60 finds that the "validity status" metadata of the document entity is "repealed" and that a pointer to a new document exists. The knowledge graph returns this verification result to the streaming compliance verification module 50 (message 7 in Figure 3); the result contains an "invalid" status and the correct latest file name: "National Three-Year Action Plan for Addressing the Root Causes of Work Safety (2024-2026)".

e. Based on the verification result indicating an invalid reference, the streaming compliance verification module 50 triggers an automatic correction operation. It sends a correction instruction to the large language model calling module 40 (message 8 in Figure 3). This instruction can contain two parts: first, an instruction for the model to revert to the position before the reference (i.e., after "...to implement"); second, the correct file name obtained from the knowledge graph, "National Three-Year Action Plan for Addressing the Root Causes of Work Safety (2024-2026)," which is injected into the generation context as necessary information.

f. The large language model calling module 40 executes this instruction, causing the large language model to regenerate this part of the content based on the updated context and output the correct text stream (message 9 in Figure 3): "...to implement the spirit of the National Three-Year Action Plan for Addressing the Root Causes of Work Safety (2024-2026), after research and decision by the municipal government...".
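The pause, query, rollback, and inject sequence of steps a to f can be sketched as a single loop. Everything here is a simplified stand-in: `model` is any callable that takes the text so far plus a list of injected titles and returns an iterator of text chunks, the knowledge graph is a plain dictionary mapping a title to a (status, replacement) pair, "pausing" is modeled by simply not pulling the next chunk, and raising an error when no replacement is known is an assumption of this example.

```python
import re

REFERENCE = re.compile(r"《([^《》]+)》")


def generate_with_verification(model, knowledge_graph, max_corrections=3):
    """Stream text from `model`, verifying each completed 《...》 reference."""
    text, hints, corrections = "", [], 0
    checked_upto = 0                      # references before this index are verified
    stream = model(text, hints)
    while True:
        chunk = next(stream, None)
        if chunk is None:
            return text                   # generation complete
        text += chunk
        match = REFERENCE.search(text, checked_upto)
        if match is None:
            continue
        # A reference was just completed; no further chunk is pulled until it is checked.
        status, replacement = knowledge_graph.get(match.group(1), ("unknown", None))
        if status == "effective":
            checked_upto = match.end()    # valid reference: resume generation
            continue
        if replacement is None or corrections >= max_corrections:
            raise LookupError(f"cannot correct reference: {match.group(1)}")
        corrections += 1
        text = text[:match.start()]       # roll back to just before the reference
        checked_upto = len(text)
        hints.append(replacement)         # inject the correct title into the context
        stream = model(text, hints)       # regenerate from the rollback point


def scripted_model(prefix, hints):
    """Stand-in for an LLM: cites the latest injected title, else a repealed one,
    and continues from `prefix` in 4-character chunks."""
    title = hints[-1] if hints else "Old Plan"
    full = f"To implement 《{title}》, the municipal government has decided"
    rest = full[len(prefix):]
    for i in range(0, len(rest), 4):
        yield rest[i:i + 4]


SAMPLE_GRAPH = {"Old Plan": ("repealed", "New Plan"), "New Plan": ("effective", None)}
```

With the scripted model and sample graph above, the first pass cites the repealed title, the loop rolls back to "To implement " and regenerates with the replacement title injected.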

[0045] Once the verification is successful, the pause is lifted, and the large language model continues to generate subsequent content until the entire document is completed (corresponding to the "Generation complete?" decision in Figure 2 being judged as "Yes"), at which point a compliant official document with appropriate phrasing and accurate citations is output through the user interface 10.

[0046] In summary, this embodiment demonstrates the technical solution of this application, which dynamically controls the voice by parsing administrative hierarchy relationships and verifies and corrects external file references in real time through streaming generation and parallel verification mechanisms. This helps to solve problems such as inappropriate voice and inaccurate citations that may exist in the prior art.

[0047] Example 2

[0048] This embodiment focuses on the specific implementation of this application in upward document scenarios, in order to further explain the details and effects of the dynamic voice constraint application step. The system structure used in this embodiment can be the same as that of Embodiment 1, that is, consistent with the system shown in Figure 1.

[0049] Imagine an upward communication scenario where a user inputs the following command through user interface 10: "Write a report for the county environmental protection bureau to the city environmental protection bureau, reporting on the progress of water pollution control work in the first quarter."

[0050] First, in the administrative hierarchy relationship parsing step S1, the intent and hierarchy parsing module 20 extracts the "issuing entity" as "county environmental protection bureau" and the "receiving entity" as "municipal environmental protection bureau" from the instruction. Based on the administrative rank quantification system defined in Embodiment 1, the level value of "county environmental protection bureau" is 5, and the level value of "municipal environmental protection bureau" is 7. The module then calculates the administrative distance accordingly: D = (level value of the receiving entity) - (level value of the issuing entity) = 7 - 5 = 2. A positive administrative distance indicates that this document is in an upward relationship.
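For illustration only, the distance calculation described above could be sketched in Python as below. The table `RANK` and the function names are hypothetical. Only the level values 5 and 7 and the resulting D=2 are taken from this embodiment, and the sign convention (receiving level minus issuing level) is inferred from those values.

```python
# Hypothetical excerpt of the administrative rank table. Only the two level
# values quoted in this embodiment (5 and 7) come from the text.
RANK = {
    "county environmental protection bureau": 5,
    "municipal environmental protection bureau": 7,
}


def administrative_distance(issuer: str, receiver: str) -> int:
    """Receiving level minus issuing level (sign convention inferred from D=2)."""
    return RANK[receiver] - RANK[issuer]


def relationship(distance: int) -> str:
    """Map the sign of the administrative distance to a document direction."""
    if distance > 0:
        return "upward"
    if distance < 0:
        return "downward"
    return "parallel"


d = administrative_distance(
    "county environmental protection bureau",
    "municipal environmental protection bureau",
)
print(d, relationship(d))
```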

[0051] Next, in step S2 of applying dynamic voice constraints, after receiving the signal that the administrative distance D=2, the dynamic constraint prompt word construction module 30 will precisely call the "modest voice library" preset for the upward communication scenario. This voice library is the core of realizing the voice control of upward communication. Specifically, the library may contain the following:

[0052] Recommended vocabulary and sentence structures: such as "report", "request", "submit", "please approve", "please instruct", "please advise", etc.; these words and sentence structures are given higher generation weights to guide the model to select them first.

[0053] Taboo word list: This is a key constraint that includes words that should not appear in the upward text, such as "requirement", "order", "approval", "deployment", "notification", etc. During the generation process, these words will be given a very high negative weight or will be completely blocked.

[0054] Sentence structure templates: For example, you can pre-set the opening template for an upward report: "The progress of our bureau's work on [topic] during [time period] is as follows:...", and the closing template: "The above report is submitted for your review by the Municipal Bureau. Please approve the subsequent work plan."

[0055] The dynamic constraint prompt word construction module 30 combines all the above constraints with the user's core intent "reporting on the progress of water pollution control" to form a highly structured final prompt word, which is then sent to the large language model calling module 40.
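A minimal sketch of how such a voice library and the resulting structured prompt might be represented is given below. The class `VoiceLibrary`, the function `build_prompt`, and the prompt layout are assumptions made for this example; the vocabulary lists and templates are the ones quoted in this embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceLibrary:
    """Hypothetical container for one voice library."""
    name: str
    recommended: tuple
    taboo: tuple
    opening_template: str
    closing_template: str


MODEST_VOICE = VoiceLibrary(
    name="modest voice library",
    recommended=("report", "request", "submit", "please approve",
                 "please instruct", "please advise"),
    taboo=("requirement", "order", "approval", "deployment", "notification"),
    opening_template=("The progress of our bureau's work on {topic} "
                      "during {period} is as follows:"),
    closing_template=("The above report is submitted for your review by the "
                      "Municipal Bureau. Please approve the subsequent work plan."),
)


def build_prompt(intent: str, library: VoiceLibrary, topic: str, period: str) -> str:
    """Combine the user's intent with the constraints of one voice library."""
    opening = library.opening_template.format(topic=topic, period=period)
    lines = [
        f"Task: {intent}",
        f"Voice: {library.name}",
        "Prefer these expressions: " + ", ".join(library.recommended),
        "Never use these words: " + ", ".join(library.taboo),
        f"Open with: {opening}",
        f"Close with: {library.closing_template}",
    ]
    return "\n".join(lines)


prompt = build_prompt(
    "reporting on the progress of water pollution control",
    MODEST_VOICE,
    topic="water pollution control",
    period="the first quarter",
)
```

Here the constraints are expressed purely as prompt text. Applying generation weights or blocking words at decoding time, as mentioned above, would require access to the model's decoding interface and is not shown.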

[0056] In the streaming generation and parallel verification steps, the large language model generates reports under the guidance of these strong constraints, thus naturally outputting text that conforms to the specifications, such as: "The progress of our bureau's water pollution control work in the first quarter of 2024 is reported as follows:...".

[0057] To further illustrate the intervention and correction capabilities of this embodiment, suppose that when generating the ending portion, the large language model deviates from the intended voice due to its inherent language habits and generates: "...the follow-up work plan, please approve it as soon as possible." At this point, the system's voice constraint mechanism (which could be a constraint applied in the prompt before generation or a real-time filter applied after generation) recognizes that the expression "please approve it as soon as possible" carries an urging and commanding tone, which contradicts the humble tone expected in upward communication. The system then triggers an intervention, forcing the model to roll back and correct the text based on the taboo word list and the recommended sentence structures. The corrected result might be: "...the follow-up work plan, please approve it." or "...request the Municipal Bureau to provide guidance on the follow-up plan." This correction transforms the inappropriate, commanding request into a humble, compliant expression that asks for a superior's decision or guidance.
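The real-time filter variant mentioned above could, as one possible sketch, scan the generated text for taboo words or urging expressions and report where to roll back. The list `URGING_PHRASES` and the function `find_violation` are hypothetical; the text does not specify how the urging tone is recognized, so simple phrase matching is assumed here.

```python
import re
from typing import Optional, Sequence, Tuple

# Taboo words quoted in this embodiment.
TABOO_WORDS = ["requirement", "order", "approval", "deployment", "notification"]
# Hypothetical extra list of urging expressions (an assumption of this sketch).
URGING_PHRASES = ["as soon as possible"]


def find_violation(text: str, phrases: Sequence[str]) -> Optional[Tuple[int, str]]:
    """Return (start_index, phrase) of the earliest violation, or None."""
    earliest = None
    for phrase in phrases:
        match = re.search(r"\b" + re.escape(phrase) + r"\b", text, flags=re.IGNORECASE)
        if match and (earliest is None or match.start() < earliest[0]):
            earliest = (match.start(), phrase)
    return earliest


generated = "...the follow-up work plan, please approve it as soon as possible."
hit = find_violation(generated, TABOO_WORDS + URGING_PHRASES)
if hit is not None:
    rollback_text = generated[: hit[0]]  # text kept before regeneration
```

The returned index marks the position to which generation could be rolled back before the model is asked to regenerate the ending under the recommended sentence structures.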

[0058] As this embodiment shows, the technical solution of this application can exercise fine-grained control over the generation of upward documents by quantifying the administrative distance and matching a specific voice library. This effectively avoids the voice mismatch that may occur in documents sent from lower levels to higher levels, and ensures that the generated official documents are humble in wording, appropriate in tone, and consistent with the writing norms of the administrative system.

[0059] Example 3

[0060] This embodiment focuses on illustrating how the streaming compliance verification mechanism of this application addresses the problem of large language models fabricating non-existent regulations (i.e., the "hallucination" problem) and performs effective detection and correction. This embodiment can also employ the system structure shown in Figure 1.

[0061] Suppose a user enters a broad instruction through user interface 10: "Draft a management regulation to strengthen internal data security." Such an instruction lacks a specific legal basis, which increases the risk that the large language model produces hallucinations.

[0062] The system first performs the steps of parsing administrative hierarchy relationships and applying dynamic voice constraints. Since these are internal company regulations, there may not be a clearly defined administrative hierarchy, or the language may be set to a neutral voice. The dynamic constraint prompt word construction module 30 then constructs the prompt words accordingly.

[0063] In the streaming generation and parallel verification step, the large language model calling module 40 instructs the large language model to start generating the content of the regulation. Suppose that when generating a certain text paragraph, the model fabricates a regulation to increase the "authority" of the text, generating the following content: "...the transmission of all core business data shall be encrypted in accordance with the relevant provisions of the 'National Core Data Encryption Transmission Regulations'...".

[0064] The streaming compliance verification module 50, which runs in parallel with the large language model generation process, detects this citation during real-time monitoring by detecting the book title marks "《》". Subsequently, the module executes the citation validity verification and correction step S3, which in this embodiment is used specifically for hallucination detection.

[0065] The hallucination detection and verification process may include the following steps:

[0066] A1: The streaming compliance verification module 50 first sends a pause command to the large language model calling module 40.

[0067] A2: The module extracts the entity to be verified, Q: "National Core Data Encryption Transmission Regulations", from the text stream.

[0068] A3: As an optional implementation, to perform semantic-level matching rather than simple string matching, the module can first call a text vectorization model to convert the entity Q to be verified into a high-dimensional floating-point vector $V_Q$.

[0069] A4: The module initiates a query to the policy and regulation knowledge graph 60. This query is not a simple keyword search; it requests that the knowledge graph return the vector representations of all regulatory entities in its database, or of those in related fields (such as "data security" and "cybersecurity"). Assuming the knowledge graph stores actual regulations such as the "Cybersecurity Law of the People's Republic of China" and the "Data Security Law of the People's Republic of China," their vector representations are denoted $V_1, V_2, \ldots, V_n$.

[0070] A5: The streaming compliance verification module 50 then performs the core calculation, namely computing the cosine similarity between the vector $V_Q$ of the entity to be verified and each entity vector $V_i$ in the knowledge graph, using the following formula:

$$\mathrm{sim}(V_Q, V_i) = \frac{V_Q \cdot V_i}{\lVert V_Q \rVert \, \lVert V_i \rVert}$$

This formula calculates the cosine of the angle between the two vectors, with the result in the range [-1, 1]. The closer the value is to 1, the more similar the semantics.

[0071] A6: The module compares all the calculated similarity values with a preset matching threshold $\theta$ (for example, $\theta$ can be set to 0.85). If, after calculation, the module finds that the highest cosine similarity between $V_Q$ and all real regulatory entity vectors is only 0.4, which is far below the threshold of 0.85, a corresponding judgment can be made.

[0072] Based on the above calculation results, the streaming compliance verification module 50 determines that, since no valid entity with high semantic similarity can be found in the knowledge graph, the "National Core Data Encryption Transmission Regulations" is a fabricated, non-existent "hallucinated" reference.
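The threshold test of steps A3 to A6 can be sketched in plain Python as follows. The three-dimensional toy vectors are invented purely for illustration (real embeddings would come from a text vectorization model and have far more dimensions), and the function names are hypothetical. The threshold 0.85 is the example value given above.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, in the range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def is_hallucinated(
    query_vec: Sequence[float],
    entity_vecs: Dict[str, Sequence[float]],
    threshold: float = 0.85,
) -> Tuple[bool, float]:
    """Flag the citation when even the best match stays below the threshold."""
    best = max(
        (cosine_similarity(query_vec, vec) for vec in entity_vecs.values()),
        default=-1.0,
    )
    return best < threshold, best


# Toy vectors, invented for this sketch only.
query = [1.0, 0.0, 0.0]
entities = {
    "Cybersecurity Law of the People's Republic of China": [0.0, 0.0, 1.0],
    "Data Security Law of the People's Republic of China": [0.4, 0.9165, 0.0],
}
flagged, best_score = is_hallucinated(query, entities)
```

With these toy vectors the best similarity is close to 0.4, so the citation is flagged, which corresponds to the judgment described in this embodiment.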

[0073] Accordingly, the system triggers an automatic correction operation:

[0074] B1: The system marks this event as a "hallucination alert" and sends a correction instruction to the large language model calling module 40.

[0075] B2: This instruction first requires the large language model to roll back to the position before the reference, that is, after "...shall be encrypted in accordance with".

[0076] B3: As a preferred correction strategy, the streaming compliance verification module 50 can trigger a supplementary search process. It can use keywords from the currently generated context (such as "data security," "encryption," and "transmission") to perform a reverse semantic search within the policy and regulation knowledge graph 60 and find the most relevant authentic regulations. The search results may locate relevant chapters in the "Data Security Law of the People's Republic of China" concerning "data classification and hierarchical protection" and "data transmission security."

[0077] B4: The module injects the actual and relevant legal title, "The Data Security Law of the People's Republic of China," as the correction information into the generation context of the large language model.

[0078] B5: The large language model calling module 40 instructs the model to regenerate this paragraph based on this new, authentic regulatory information. Ultimately, the model may output: "...the transmission of all core business data must be encrypted in accordance with the requirements for data classification and hierarchical protection in the 'Data Security Law of the People's Republic of China'...".
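Steps B2 to B4 could be sketched as below. Note that this sketch substitutes a simple keyword-overlap score for the reverse semantic search described in B3. The index `REGULATION_KEYWORDS`, its keyword sets, and the function `plan_correction` are hypothetical and exist only for this example. Step B5, the regeneration itself, is left to the large language model.

```python
from typing import Dict, Optional, Set

# Hypothetical keyword index standing in for the policy and regulation
# knowledge graph 60; the keyword sets are invented for this example.
REGULATION_KEYWORDS: Dict[str, Set[str]] = {
    "Cybersecurity Law of the People's Republic of China":
        {"cybersecurity", "network operation"},
    "Data Security Law of the People's Republic of China":
        {"data security", "data classification", "transmission"},
}


def plan_correction(
    text: str, citation_start: int, context_keywords: Set[str]
) -> Dict[str, Optional[str]]:
    """Build the rollback text and the context to inject before regeneration."""
    # B2: keep only the text generated before the fabricated citation.
    prefix = text[:citation_start]
    # B3: supplementary retrieval, here approximated by keyword overlap.
    scores = {
        title: len(keywords & context_keywords)
        for title, keywords in REGULATION_KEYWORDS.items()
    }
    best_title = max(scores, key=scores.get)
    if scores[best_title] == 0:
        return {"rollback_text": prefix, "injected_context": None}
    # B4: inject the authentic title so the model can cite it when regenerating.
    return {
        "rollback_text": prefix,
        "injected_context": f"Verified regulation to cite: 《{best_title}》",
    }


draft = ("...shall be encrypted in accordance with "
         "《National Core Data Encryption Transmission Regulations》")
plan = plan_correction(
    draft, draft.index("《"), {"data security", "encryption", "transmission"}
)
```

When no regulation overlaps with the context at all, the sketch injects nothing, leaving it to the caller to decide whether the sentence should be regenerated without a citation.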

[0079] The detailed process of this embodiment shows how this application combines vector cosine similarity with a preset threshold to identify "hallucinated" citations produced by large language models and, through the mechanisms of rollback, supplementary retrieval, and information injection, automatically replaces false information with a true and accurate legal basis, thereby helping to enhance the seriousness and credibility of the generated official documents.

[0080] Example 4

[0081] This embodiment illustrates a preferred implementation of the present application, namely, a method for decoupling and generating official document elements. This method aims to further improve the overall logic and quality of the final output document by structurally decomposing the document content and applying different, more refined generation control strategies to different parts. As an optional implementation, the system in this embodiment can, based on the structure shown in Figure 1, expand the functionality of the dynamic constraint prompt word construction module 30, for example, by adding a "structure planning" sub-module.

[0082] For example, in Scenario 2 of Example 1, the user instruction is: "Draft a notice on carrying out a major safety inspection in the name of the municipal government."

[0083] Upon receiving the instruction, the intent and hierarchy parsing module 20 operates as usual and parses out the downward relationship (D=-2).

[0084] Before proceeding to the regular dynamic voice constraint application step, the newly added "Structure Planning" submodule is activated. This submodule, based on its understanding of the document's style (in this example, a "notice"), logically decouples the document to be generated into multiple core semantic blocks. For a notice, a typical structure can be broken down into at least the following four parts:

[0085] [Reason] block: Explains the background and reason for issuing this notice;

[0086] [Basis] block: Clearly states the laws, regulations, or directives from higher authorities on which this action is based;

[0087] [Specific Items] block: Lists in detail the specific content, scope, timing, and methods of the major inspection;

[0088] [Work Requirements] block: Provides specific directives that subordinate units must comply with and implement.

[0089] Next, the system does not generate the full text all at once, but generates each of the above semantic blocks in segments, and in the process, it applies specific constraints to each semantic block that best match its function.

[0090] Generating the [Basis] block: For this semantic block, accuracy is the primary objective. Therefore, the system can prioritize performing a mandatory knowledge graph retrieval. Specifically, the dynamic constraint prompt word construction module 30 can retrieve all relevant regulatory documents that are currently in effect from the policy and regulation knowledge graph 60, based on the keyword "major work safety inspection" in the user's intent. The search results (e.g., the "Work Safety Law of the People's Republic of China" and the "National Three-Year Action Plan for Addressing the Root Causes of Work Safety (2024-2026)") are used as strong constraints and injected into the prompt words specifically constructed for generating the [Basis] block. This ensures that the legal basis of the official document is accurate at the source, rather than relying on the model's own memory or invention.

[0091] Generating the [Work Requirements] block: For this semantic block, the authority of the voice is crucial. Therefore, the system will focus on applying the results of administrative distance calculation (D=-2, downward text). The dynamic constraint prompt word construction module 30 will load the "guidance voice library" and configure a strong imperative style for the prompt words generated for this block, such as using a large number of phrases like "all units must attach great importance to...", "must ensure...", and "strictly implement..." to reflect the clear deployment and requirements of superiors to subordinates.

[0092] Generating the [Reason] and [Specific Items] blocks: These two semantic blocks primarily consist of factual statements and specific arrangements, with relatively low requirements for voice. Therefore, the system can employ more general generation strategies. For example, a retrieval-augmented generation method based on historical examples can be applied to extract content templates from similar earlier notices; alternatively, relatively neutral prompts can be used to allow the large language model to generate content under basic fluency and consistency constraints.

[0093] After all semantic blocks have been generated independently, the system enters the content reorganization stage. The system splices these independently generated text blocks according to the standard logical order of official documents (usually: reason → basis → specific items → work requirements). During splicing, the system can also add the necessary transitional words and sentences, such as "Therefore, ...", "The specific arrangements are as follows:", and "To ensure that the work is implemented effectively, the following requirements are hereby proposed:", so that the final text is logically coherent and seamless.
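The per-block strategy selection and the final reorganization could be sketched as follows. The function names, the strategy labels, and the mapping of transitional phrases to particular blocks are assumptions of this sketch; the block order and the phrases themselves are taken from this embodiment.

```python
from typing import Dict

# Standard logical order of the semantic blocks of a notice.
BLOCK_ORDER = ("reason", "basis", "specific items", "work requirements")

# Transitional phrases quoted in this embodiment; which block each phrase
# introduces is an assumption of this sketch.
TRANSITIONS = {
    "specific items": "The specific arrangements are as follows:",
    "work requirements": ("To ensure that the work is implemented effectively, "
                          "the following requirements are hereby proposed:"),
}


def strategy_for(block: str, distance: int) -> str:
    """Pick a generation strategy for one semantic block."""
    if block == "basis":
        return "knowledge graph retrieval"  # accuracy first
    if block == "work requirements" and distance < 0:
        return "guidance voice library"     # downward document, imperative style
    return "neutral prompt"                 # factual blocks


def assemble(blocks: Dict[str, str]) -> str:
    """Splice independently generated blocks in the standard order."""
    parts = []
    for name in BLOCK_ORDER:
        if name not in blocks:
            continue
        transition = TRANSITIONS.get(name)
        parts.append(f"{transition} {blocks[name]}" if transition else blocks[name])
    return "\n".join(parts)


document = assemble({
    "work requirements": "All units must attach great importance to...",
    "reason": "Recently, ...",
    "basis": "In accordance with ...",
    "specific items": "1. Scope ...",
})
```

Because `assemble` iterates over `BLOCK_ORDER` rather than over the input dictionary, the blocks can be generated in any order, or concurrently, without affecting the final layout.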

[0094] The above description is merely a preferred embodiment of this application and is not intended to limit the scope of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the protection scope of this application.

Claims

1. A document generation method based on administrative constraints and streaming verification, characterized in that it comprises: an administrative hierarchy relationship parsing step, used to receive a document drafting instruction, extract the issuing entity and the receiving entity from the instruction, and determine the hierarchical relationship between the issuing entity and the receiving entity based on a preset administrative rank system; a dynamic voice constraint application step, used to dynamically apply voice constraints to the text generation process of a large language model according to the hierarchical relationship; a streaming generation and parallel verification step, used to instruct the large language model to generate text in a streaming manner in the form of a word sequence based on the voice constraints and, while the large language model generates text, to monitor in parallel whether the generated text stream contains a citation of an external file; and a citation validity verification and correction step, used to pause text generation and query a preset knowledge graph to verify the validity of the citation when the citation is detected and, if the verification result is that the citation is invalid, to trigger an automatic correction operation for the citation.

2. The method according to claim 1, characterized in that determining the hierarchical relationship between the issuing entity and the receiving entity in the administrative hierarchy relationship parsing step specifically comprises: assigning quantified values to the administrative levels of the issuing entity and the receiving entity; and performing a calculation on the quantified values to obtain an administrative distance that represents the hierarchical relationship between the issuing entity and the receiving entity.

3. The method according to claim 2, characterized in that the dynamic voice constraint application step comprises: when the administrative distance indicates an upward relationship, loading a modest voice library; when the administrative distance indicates a downward relationship, loading a guidance voice library; and when the administrative distance indicates a parallel relationship, loading a negotiation voice library.

4. The method according to claim 1, characterized in that verifying the validity of the citation in the citation validity verification and correction step comprises: extracting the entity to be verified from the citation; calculating the vector cosine similarity between the entity to be verified and the entities in the knowledge graph; and determining the validity of the citation by comparing the value of the vector cosine similarity with a preset threshold.

5. The method according to claim 1, characterized in that the automatic correction operation comprises: instructing the large language model to roll back to the position before the citation; and injecting the correct file information obtained from the knowledge graph into the generation context to regenerate the content.

6. The method according to claim 1, characterized in that, in the streaming generation and parallel verification step, the parallel monitoring of whether the generated text stream contains a citation of an external file is achieved by detecting whether book title marks appear in the text stream, or whether a preset keyword selected from "according to" and "in accordance with" appears.

7. The method according to claim 1, characterized in that it further comprises: before the dynamic voice constraint application step, decoupling the document to be generated into at least two semantic blocks among reason, basis, items, and requirements, and configuring independent voice constraints for different semantic blocks; and the dynamic voice constraint application step specifically comprises: combining the voice constraints determined according to the hierarchical relationship with the independent voice constraints configured for each semantic block to form the final voice constraints applied to each semantic block.

8. A document generation system based on administrative constraints and streaming verification, characterized in that it comprises: a large language model; an intent and hierarchy parsing module, used to receive a document drafting instruction, extract the issuing entity and the receiving entity from the instruction, and determine the hierarchical relationship between the issuing entity and the receiving entity based on a preset administrative rank system; a dynamic constraint construction module, used to dynamically construct voice constraints for the text generation process of the large language model based on the hierarchical relationship; a large language model calling module, used to instruct the large language model to generate text in a streaming manner in the form of a word sequence based on the voice constraints, and to respond to correction instructions from the streaming compliance verification module by performing automatic correction operations on citations; a streaming compliance verification module, configured to run in parallel while the large language model generates text and used to monitor whether the generated text stream contains a citation of an external file, wherein, when the citation is detected, it sends a pause command to the large language model calling module to pause text generation and queries a preset knowledge graph to verify the validity of the citation, and, if the verification result is that the citation is invalid, it sends a correction instruction to the large language model calling module; and the knowledge graph, used to store external files and their validity status for querying by the streaming compliance verification module.

9. The system according to claim 8, characterized in that the intent and hierarchy parsing module is specifically used to: assign quantified values to the administrative levels of the issuing entity and the receiving entity; and perform a calculation on the quantified values to obtain an administrative distance that represents the hierarchical relationship between the issuing entity and the receiving entity.

10. The system according to claim 8, characterized in that, to implement the automatic correction operation, the streaming compliance verification module is further configured to: instruct the large language model calling module to roll back the generation process to the position before the citation and to inject the correct file information obtained from the knowledge graph into the generation context to regenerate the content.