Rule element-based content generation method and device, electronic equipment and readable medium
By introducing a rule meta-module into the large model, the end-to-end rule discrimination and correction of the large model output is realized, which solves the problem of uncontrollable output of the large model and improves the controllability and reliability of the output.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- BEIJING VIMICRO ARTIFICIAL INTELLIGENCE CHIP TECH CO LTD
- Filing Date
- 2026-03-16
- Publication Date
- 2026-06-30
AI Technical Summary
Large models lack an understanding of physical rules and knowledge during content generation. They struggle to establish causal relationships and to perform complex logical reasoning, which leads to hallucinations in reasoning and to uncontrollable, unreliable outputs.
A rule meta module is introduced, including a rule meta gating unit, a rule meta review unit, and a rule meta correction unit. Through a phased dynamic constraint mechanism, the output of the large model is guided and controlled, thus constructing a full-link rule discrimination and correction mechanism.
It improves the controllability and reliability of large model output, making its output safer, more controllable, and compliant, and solves the problem of uncontrollability of large models in content generation.
Figure CN122309658A_ABST
Abstract
Description
Technical Field
[0001] Embodiments of this disclosure relate to the field of computer technology, and more specifically to a rule element-based content generation method, apparatus, electronic device, and readable medium.
Background Technology
[0002] Large model technology has entered a critical stage of large-scale deployment, and its iteration is accelerating. It is extending from general-purpose models to scenario-specific and specialized models, making breakthroughs in natural language processing, multimodal generation, and complex reasoning, and becoming a core productivity tool driving digital transformation across industries. However, current large model technology has limitations in its underlying design. It typically relies on a single, big-data-driven approach to probabilistic modeling, with "Next Token Prediction" as its core mechanism. It lacks an understanding of physical rules and knowledge and has fundamental limitations such as difficulty in establishing causal relationships and performing complex logical reasoning. In practical applications it therefore suffers from inherent problems such as hallucinations in reasoning and highly uncontrollable output, which pose serious challenges to content security, factual accuracy, and ethical compliance and have become a bottleneck restricting the reliable application of large models. The uncontrollable output of large models manifests specifically as follows: after receiving human instructions, even if the input content is compliant and the instructions are clear, the model's output may still deviate from the preset direction, exceed normative boundaries, and even contain potentially harmful content.
[0003] The information disclosed in this background section is only intended to enhance the understanding of the background of the inventive concept, and therefore may contain information that does not constitute prior art known to those skilled in the art.
Summary of the Invention
[0004] The summary portion of this disclosure is intended to provide a brief overview of the concepts, which will be described in detail in the detailed description portion. This summary portion is not intended to identify key or essential features of the claimed technical solutions, nor is it intended to limit the scope of the claimed technical solutions.
[0005] Some embodiments of this disclosure propose a rule element-based content generation method, apparatus, electronic device, and computer-readable medium to address one or more of the technical problems mentioned in the background section above.
[0006] In a first aspect, some embodiments of this disclosure provide a rule element-based content generation method. The method includes: receiving input text from a user; inputting the input text into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta-module, which includes a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit; in response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the chain-of-thought reasoning stage, inputting the intermediate reasoning text obtained by chain-of-thought reasoning into the rule meta-review unit; in response to determining that the review result of the rule meta-review unit does not meet preset conditions, correcting the intermediate reasoning text to obtain corrected reasoning text; and in response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the main content generation stage, sending the corrected reasoning text into the question-answering large model to generate output content, wherein, during output content generation, the rule meta-correction unit is used for rule discrimination and correction.
[0007] In a second aspect, some embodiments of this disclosure provide a rule element-based content generation apparatus. The apparatus includes: a receiving unit configured to receive input text from a user; a first input unit configured to input the input text into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta-module, which includes a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit; a second input unit configured to, in response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the chain-of-thought reasoning stage, input the intermediate reasoning text obtained by chain-of-thought reasoning into the rule meta-review unit; a correction unit configured to, in response to determining that the review result of the rule meta-review unit does not meet preset conditions, correct the intermediate reasoning text to obtain corrected reasoning text; and a third input unit configured to, in response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the main content generation stage, send the corrected reasoning text into the question-answering large model to generate output content, wherein, during output content generation, the rule meta-correction unit is used for rule discrimination and correction.
[0008] In a third aspect, some embodiments of this disclosure provide an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon, wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect above.
[0009] In a fourth aspect, some embodiments of this disclosure provide a computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method described in any implementation of the first aspect above.
[0010] The above embodiments of this disclosure have the following beneficial effects: by constraining the output of a large model with rule elements through the rule element-based content generation method of some embodiments of this disclosure, the output of the large model can be guided and controlled at the source while its expressive power is maintained, which improves the controllability of the output and makes it safer, more reliable, and compliant. Specifically, the output is highly uncontrollable because the large model lacks an understanding of physical rules and knowledge and has fundamental limitations such as difficulty in establishing causal relationships and performing complex logical reasoning. Based on this, the rule element-based content generation method of some embodiments of this disclosure first receives input text from the user. Second, the input text is input into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta-module, which includes a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit. Third, in response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the chain-of-thought reasoning stage, the intermediate reasoning text obtained by chain-of-thought reasoning is input into the rule meta-review unit. Fourth, in response to determining that the review result of the rule meta-review unit does not meet the preset conditions, the intermediate reasoning text is corrected to obtain corrected reasoning text. Finally, in response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the main content generation stage, the corrected reasoning text is sent into the question-answering large model to generate output content. During output content generation, the rule meta-correction unit is used for rule discrimination and correction. This method overcomes the lag of traditional post-processing filtering and constructs a full-link, phased dynamic constraint mechanism: first, the rule meta-gating unit identifies and decouples the two stages of "chain-of-thought reasoning" and "main content generation," enabling fine-grained insight into and intervention in the model's cognitive process; second, the rule meta-review unit and the rule meta-correction unit in the rule meta-module perform violation discrimination and correction on the output of the two stages, respectively. Thus, while the expressive power of the large model is maintained, its output is guided and controlled at the source, which improves the controllability of the output and makes it safer, more reliable, and compliant.
Attached Figure Description
[0011] The above and other features, advantages, and aspects of the embodiments of this disclosure will become more apparent from the accompanying drawings and the following detailed description. Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic, and elements are not necessarily drawn to scale.
[0012] Figure 1 is a flowchart of some embodiments of the rule element-based content generation method according to this disclosure; Figure 2 is a schematic diagram of the structure of the question-answering large model used in the rule element-based content generation method according to some embodiments of this disclosure; Figure 3 is a schematic diagram of the structure of some embodiments of the rule element-based content generation apparatus according to this disclosure; Figure 4 is a schematic diagram of the structure of an electronic device suitable for implementing some embodiments of this disclosure.
Detailed Implementation
[0013] Embodiments of this disclosure will now be described in more detail with reference to the accompanying drawings. While some embodiments of this disclosure are shown in the drawings, it should be understood that this disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided to provide a more thorough and complete understanding of this disclosure. It should be understood that the accompanying drawings and embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of protection of this disclosure.
[0014] It should also be noted that, for ease of description, only the parts relevant to the invention are shown in the accompanying drawings. Unless otherwise specified, the embodiments and features described in this disclosure can be combined with each other.
[0015] It should be noted that the concepts of "first" and "second" mentioned in this disclosure are used only to distinguish different devices, modules or units, and are not used to limit the order of functions performed by these devices, modules or units or their interdependencies.
[0016] It should be noted that the terms "a" and "a plurality of" used in this disclosure are illustrative rather than restrictive, and those skilled in the art should understand that, unless otherwise expressly indicated in the context, they should be understood as "one or more".
[0017] The names of messages or information exchanged between multiple devices in the embodiments of this disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
[0018] Before performing any of the operations involving the collection, storage, or use of user personal information (such as input text) disclosed in this disclosure, the relevant organizations or individuals shall fulfill their obligations, including conducting personal information security impact assessments, informing personal information subjects, and obtaining prior authorization and consent from personal information subjects.
[0019] This disclosure will now be described in detail with reference to the accompanying drawings and embodiments.
[0020] Figure 1 shows a flow 100 of some embodiments of the rule element-based content generation method according to this disclosure. The method includes the following steps:
Step 101: Receive the input text from the user.
[0021] In some embodiments, the executor of the rule-based content generation method (e.g., a computing device) can receive input text from a user. This input text can be a question the user asks the large model, or text requiring a response from the large model.
[0022] Step 102: Input the input text into the pre-trained question-answering model.
[0023] In some embodiments, the aforementioned execution entity can input the input text into a pre-trained question-answering model. This question-answering model can be a pre-trained large language model used to respond to the input text. For example, the basic model structure of the question-answering model can include, but is not limited to, Qwen3 and DeepSeek V3. The question-answering model can include a rule meta-module. This rule meta-module can include a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit. The rule meta-gating unit can act as a state perceiver in the generation process, responsible for real-time parsing of the generation context, determining whether the current stage is the Chain-of-Thought (CoT) reasoning stage or the main content generation stage, and triggering differentiated constraint strategies accordingly. The rule meta-review unit can be used to discriminate and correct the output of the Chain-of-Thought reasoning stage. The rule meta-correction unit can be used to discriminate and correct the output of the main content stage. Rule meta-elements can be a type of basic meta-information in a predefined meta-computation paradigm, and can be explicitly encoded logical rules, physical rules, ethical guidelines, business constraints, or compliance requirements—meta-representations that can be used to guide or correct model behavior. Meta-computing paradigms can also include data elements, knowledge elements, perceptual elements, and memory elements. Data elements, as meta-representations of the basic data for model reasoning and output, represent contextual semantics and response task objectives, and can correspond to tokens in a large model. Knowledge elements can be meta-representations of structured or semi-structured static knowledge, including common sense knowledge, domain dictionaries, entity attributes (such as material and physical properties), causal relationships, etc. 
Perceptual elements can be meta-representations of multimodal perceptual information, covering space (distance, orientation, scale, etc.), time, and multiple sensory dimensions such as vision, hearing, touch, smell, and taste. Memory elements can be meta-representations that simulate human memory, used to continuously record, organize, retrieve, and fuse short-term (working memory) and long-term (experience / personalized) memory during reasoning, ensuring that the AI system possesses contextual consistency, task continuity, personalized adaptation, and self-correction capabilities.
[0024] Optionally, the aforementioned rule meta-module may also include a rule base. The rule base stores rule information corresponding to each constraint dimension. This rule information represents the constraints that the stored model generation must adhere to. These constraint dimensions may include, but are not limited to: security filtering, ethical alignment, privacy protection, content compliance, and technical operation. The rule information for the security filtering dimension can be used to define the blocking standards for violence, hate speech, illegal activities, and fraudulent information. The rule information for the ethical alignment dimension can be used to define the neutrality principles that the model should follow in its output, explicitly prohibiting the generation of content with gender, race, or regional discrimination, while also including sensitive expression norms from different cultural backgrounds to ensure that the model's values are consistent with human ethics. The rule information for the privacy protection dimension can be used to stipulate the bottom-line principles of data processing, including specific implementation instructions for "data minimization" and "anonymization," and can also explicitly prohibit the model from storing or analyzing users' personally identifiable information (PII), and includes built-in compliance check clauses for global data regulations (such as GDPR). Content compliance rules can be used to set fine-grained constraints for specific vertical fields (such as politics, healthcare, and education). For example, neutrality requirements for political topics, safety verification of health advice, and appropriateness standards for educational content. 
Technical operation rules involve underlying stability constraints on model operation, which may include dynamic input length limits, standardized output format templates (such as JSON and Markdown specifications), and self-restraint mechanisms to prevent the model from getting stuck in infinite loops or experiencing hallucinations.
[0025] The rule information stored in the aforementioned rule base is stored in text formats including semantic-level and keyword-level formats. Semantic-level formats can represent complex rule descriptions, employing long text paragraphs written in natural language or logical decision trees to handle complex contextual scenarios. The system can parse semantic-level rule information through a semantic understanding module to determine whether user intent or generated content violates deep logical constraints. For example, a semantic-level rule might be "Specific medication advice cannot be given without providing a medical disclaimer." Keyword-level formats can represent blacklists, employing structured word lists, regular expressions, or phrase sets for rapid interception of high-frequency, clearly defined violations.
[0026] As examples, rule information corresponding to violent content may include "Prohibiting the generation of content that promotes, glorifies, or provides specific steps for carrying out serious personal injury" (semantic level), and may also include "weapon manufacturing, methods of lethal injury, details of torture" (keyword level). Rule information corresponding to hate speech may include "Prohibiting content that denigrates, dehumanizes, or incites discrimination against specific groups based on protected characteristics such as race, religion, gender, sexual orientation, or disability" (semantic level). Rule information that explicitly prohibits the model from storing or analyzing users' personally identifiable information may include "The model is strictly prohibited from processing, storing, remembering, or outputting any identity information (including name, ID number, biometrics, precise address, contact information, etc.) that can directly or indirectly identify a specific natural person."
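The rule base described in paragraphs [0024] to [0026] can be pictured as a small data structure. The following is a minimal sketch rather than the disclosed implementation: the class names `Rule` and `RuleBase`, the field names, and the level strings are assumptions introduced for illustration, while the rule texts are adapted from the examples above.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical names: Rule, RuleBase, and their fields are illustrative only.
@dataclass(frozen=True)
class Rule:
    dimension: str  # constraint dimension, e.g. "security filtering"
    level: str      # "semantic" or "keyword"
    text: str       # natural-language rule, or a keyword / phrase entry


@dataclass
class RuleBase:
    rules: List[Rule] = field(default_factory=list)

    def add(self, rule: Rule) -> None:
        if rule.level not in ("semantic", "keyword"):
            raise ValueError(f"unknown rule level: {rule.level}")
        self.rules.append(rule)

    def by_level(self, level: str) -> List[Rule]:
        """Return all rules stored in the given format level."""
        return [r for r in self.rules if r.level == level]


rule_base = RuleBase()
rule_base.add(Rule(
    "security filtering", "semantic",
    "Prohibit content that promotes, glorifies, or provides specific steps "
    "for carrying out serious personal injury.",
))
for phrase in ("weapon manufacturing", "methods of lethal injury", "details of torture"):
    rule_base.add(Rule("security filtering", "keyword", phrase))

keyword_rules = rule_base.by_level("keyword")
```

Keeping the level as an explicit field makes it straightforward to hand only the keyword-level entries to a later embedding step while leaving semantic-level rules for prompt-based review.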
[0027] As an example, for the model structure of the question-answering large model, see Figure 2. In Figure 2, "Text input" refers to the input text. "Vocab Embedding" refers to vocabulary embedding, used to map discrete tokens to vector representations. "MLA" stands for Multi-head Latent Attention. "FFN" stands for Feed-Forward Network. "MoE" stands for Mixture of Experts. "N×Decoder" refers to N decoders. "Decode sampling" refers to decoding sampling, which determines how the next token is selected from the model output. "Output token" refers to the output token, i.e., the model's output content.
[0028] The above rule meta-gating unit is configured to perform the following steps: The first step is to scan the tokens in the token stream output from the sampling layer. The token stream can be a stream of tokens that is input to the rule meta-gating unit in real time.
[0029] The second step is, in response to determining that a reasoning start marker has been hit, to determine that the current content generation process has entered the chain-of-thought reasoning stage. The reasoning start marker can be a predefined marker indicating the start of the chain-of-thought reasoning stage, and it can have a corresponding reasoning end marker. The reasoning end marker is a predefined marker indicating the end of the chain-of-thought reasoning stage. The reasoning start marker and the reasoning end marker constitute a chain-of-thought marker pair. For example, a chain-of-thought marker pair could be <thought> and </thought>. As another example, a chain-of-thought marker pair could be <think> and </think>. Therefore, when the reasoning start marker is hit, it can be determined that the current content generation process has entered the chain-of-thought reasoning stage.
[0030] The third step is, in response to determining that the reasoning end marker has been hit, to determine that the current content generation process has entered the main content generation stage. Therefore, when the reasoning end marker is hit, it can be determined that the current content generation process has entered the main content generation stage.
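The three gating steps amount to a small state machine over the token stream. The sketch below is illustrative only: the names `Stage`, `RuleMetaGate`, and `label_stream` are hypothetical, the `PRE_REASONING` state is an assumption (the source does not name the state before the start marker), and each marker is assumed to arrive as a single token.

```python
from enum import Enum
from typing import Iterable, Iterator, Tuple


class Stage(Enum):
    PRE_REASONING = "pre_reasoning"  # assumed state before any start marker
    COT_REASONING = "cot_reasoning"  # chain-of-thought reasoning stage
    MAIN_CONTENT = "main_content"    # main content generation stage


class RuleMetaGate:
    """Tracks the generation stage by scanning tokens for a marker pair."""

    def __init__(self, start_marker: str = "<think>", end_marker: str = "</think>"):
        self.start_marker = start_marker
        self.end_marker = end_marker
        self.stage = Stage.PRE_REASONING

    def observe(self, token: str) -> Stage:
        # Assumption: each marker is emitted as exactly one token.
        if token == self.start_marker:
            self.stage = Stage.COT_REASONING
        elif token == self.end_marker:
            self.stage = Stage.MAIN_CONTENT
        return self.stage


def label_stream(tokens: Iterable[str]) -> Iterator[Tuple[str, Stage]]:
    """Pair every token with the stage that is active after observing it."""
    gate = RuleMetaGate()
    for tok in tokens:
        yield tok, gate.observe(tok)


labels = list(label_stream(["<think>", "step", "</think>", "answer"]))
```

If a tokenizer splits a marker across several tokens, the gate would need to match against a short rolling buffer of decoded text instead of single tokens.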
[0031] Step 103: In response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the chain-of-thought reasoning stage, input the intermediate reasoning text obtained by chain-of-thought reasoning into the rule meta-review unit.
[0032] In some embodiments, the execution entity may, in response to determining that the content generation process corresponding to the input text identified by the rule meta gating unit has entered the thought chain reasoning stage, input the intermediate reasoning text obtained from the thought chain reasoning to the rule meta review unit. The intermediate reasoning text obtained from the thought chain reasoning may be the natural language reasoning step text generated by the model's autoregression. The intermediate reasoning text from the thought chain reasoning may be a structured natural language reasoning step actively output by the model to improve the quality of the answer.
[0033] Step 104: In response to determining that the review result of the rule meta-review unit does not meet the preset conditions, correct the intermediate reasoning text to obtain corrected reasoning text.
[0034] In some embodiments, the aforementioned executing entity may, in response to determining that the review result of the aforementioned rule meta-review unit does not meet the preset conditions, modify the aforementioned intermediate reasoning text to obtain modified reasoning text. The preset conditions can be the review result representing a rule violation. In practice, a preset correction prompt can be used to enable the aforementioned question-and-answer model to modify the aforementioned intermediate reasoning text to obtain modified reasoning text. For example, the correction prompt could be: "If non-compliant content is found, the entire thought process will be corrected and modified to ensure that the new content is compliant, and a new, reasonable answer will be given, clearly indicating a refusal to answer." The rule meta-review unit can determine whether a rule is violated through a prompt. For example, the prompt used by the rule meta-review unit to determine whether a rule is violated could be set as: "The following is a dialogue between the user and the model, including the user's question and the model's thought process. Please review the compliance of the thought process. The rules include rule information from the rule base. Based on whether the above rules are met, the review conclusion, i.e., 'compliant' or 'non-compliant,' should be output."
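The prompt-driven review and correction described above can be sketched as follows. This is a hedged illustration, not the disclosed implementation: `review_and_correct` and `fake_llm` are hypothetical names, the prompt strings paraphrase the examples in this paragraph, the model is abstracted as any callable from prompt text to response text, and treating every verdict other than an explicit "compliant" as a violation is an assumption.

```python
from typing import Callable

# Prompt texts paraphrase the examples in [0034]; all names are hypothetical.
REVIEW_PROMPT = (
    "The following is a dialogue between the user and the model, including the "
    "user's question and the model's thought process. Please review the "
    "compliance of the thought process against these rules:\n{rules}\n"
    "Output the review conclusion: 'compliant' or 'non-compliant'.\n"
    "Question: {question}\nThought process: {reasoning}"
)
CORRECTION_PROMPT = (
    "Non-compliant content was found. Correct the entire thought process so "
    "that the new content is compliant.\nThought process: {reasoning}"
)


def review_and_correct(question: str, reasoning: str, rules: str,
                       llm: Callable[[str], str]) -> str:
    """Return the reasoning unchanged if it passes review, else a corrected text."""
    verdict = llm(REVIEW_PROMPT.format(
        rules=rules, question=question, reasoning=reasoning)).strip().lower()
    if verdict == "compliant":
        return reasoning
    # Assumption: any verdict other than "compliant" triggers correction.
    return llm(CORRECTION_PROMPT.format(reasoning=reasoning))


def fake_llm(prompt: str) -> str:
    # Stand-in for the question-answering model so the sketch runs offline.
    if prompt.startswith("Non-compliant"):
        return "I should decline and explain why."
    return "non-compliant" if "torture" in prompt else "compliant"


kept = review_and_correct("q", "Add 2 and 2.", "No violent content.", fake_llm)
fixed = review_and_correct("q", "Describe torture details.", "No violent content.", fake_llm)
```

In a real deployment the `llm` callable would wrap the same question-answering model, and the `rules` string would be filled from the rule base.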
[0035] Step 105: In response to determining, through the rule meta-gating unit, that the content generation process corresponding to the input text has entered the main content generation stage, send the corrected reasoning text into the question-answering large model to generate output content. During output content generation, the rule meta-correction unit is used for rule discrimination and correction.
[0036] In some embodiments, the execution entity may, in response to determining that the content generation process corresponding to the input text identified by the rule meta-gating unit has entered the main content generation stage, send the revised reasoning text into the question-answering big model to generate output content. The main content generation stage may be a display output stage for confirming candidate meta-elements. During the output content generation process, the rule meta-correction unit is used for rule discrimination and correction.
[0037] Optionally, the above-mentioned rule meta-correction unit can be configured to perform the following steps: The first step is to generate a context vector based on the context of the reasoning process. This context can include the generated historical text sequence (i.e., the determined token sequence). In practice, this context can be input into an embedding model to obtain the context vector. For example, the embedding model could be BGE-M3.
[0038] The second step involves retrieving each rule vector in the rule metaspace included in the aforementioned rule meta-module based on the aforementioned context vector, obtaining the retrieval results. In practice, either a brute-force search or a fast search method can be used to retrieve each rule vector in the rule metaspace included in the aforementioned rule meta-module. The retrieval results can represent whether a rule was hit or not. The brute-force search method can directly calculate the similarity between the context vector and each rule vector (such as cosine similarity, Euclidean distance, dot product), sort them, and take the top N similarity scores. If the similarity score is greater than a preset threshold, the rule is considered hit. The fast search method uses the Approximate Nearest Neighbor (ANN) algorithm, which sacrifices a small amount of precision through strategies such as "spatial partitioning" and "quantization" to achieve an order-of-magnitude improvement in retrieval speed. Examples include HNSW and KD-Tree. The algorithm can sort the rules and take the top N similarity scores; if the similarity score is greater than a preset threshold, the rule is considered hit.
[0039] The third step is, in response to determining that the retrieval result indicates that a rule has been hit, to reject the current candidate. The candidate can be a candidate token.
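The brute-force variant of the retrieval and rejection steps can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `cosine`, `retrieve`, and `accept_candidate` are hypothetical, cosine similarity is chosen from the similarity measures listed above, and the values of `top_n` and `threshold` are placeholders rather than values given in the source.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(context_vec: Sequence[float],
             rule_vecs: Sequence[Sequence[float]],
             top_n: int = 3,
             threshold: float = 0.8) -> List[Tuple[int, float]]:
    """Brute-force search: score every rule, keep top-N entries above the threshold."""
    scored = sorted(
        ((i, cosine(context_vec, v)) for i, v in enumerate(rule_vecs)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return [(i, s) for i, s in scored[:top_n] if s > threshold]


def accept_candidate(context_vec: Sequence[float],
                     rule_vecs: Sequence[Sequence[float]],
                     **kwargs) -> bool:
    """A candidate token is rejected as soon as any rule is hit."""
    return not retrieve(context_vec, rule_vecs, **kwargs)


rule_vecs = [[1.0, 0.0], [0.0, 1.0]]
hits = retrieve([0.9, 0.1], rule_vecs)
```

For a large rule metaspace, the scoring loop could be replaced by an approximate nearest neighbor index while the threshold test and the rejection decision stay the same.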
[0040] Optionally, the aforementioned rule meta-correction unit can also be configured to trigger the resampling process of the sampling layer in the aforementioned question-answering model, so that the generation of output content is constrained again through the aforementioned rule meta-module until output content that conforms to the rule meta-constraints is generated. For example, the sampling layer can be the "Decode sampling" layer in Figure 2.
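One way to picture the resampling trigger is a rejection loop around the sampling layer. The sketch below makes several assumptions that are not in the source: sampling is greedy (the most probable remaining token is tried first), a rejected token is excluded from later attempts, and a hypothetical `max_attempts` cap replaces the open-ended "until" so that the loop always terminates.

```python
from typing import Callable, Dict, Optional


def sample_with_rejection(probs: Dict[str, float],
                          violates: Callable[[str], bool],
                          max_attempts: int = 8) -> Optional[str]:
    """Try the most likely token; if it hits a rule, drop it and resample."""
    remaining = dict(probs)
    for _ in range(max_attempts):
        if not remaining:
            break
        candidate = max(remaining, key=remaining.get)
        if not violates(candidate):
            return candidate
        del remaining[candidate]  # rejected token is excluded from the next attempt
    return None  # no compliant token found; the caller decides how to fall back


token = sample_with_rejection(
    {"bomb": 0.6, "cake": 0.3, "tea": 0.1},
    violates=lambda t: t == "bomb",
)
```

Here `violates` stands in for the rule hit test performed by the rule meta-correction unit; with stochastic sampling, the same loop would renormalize the remaining probabilities before drawing again.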
[0041] Optionally, the aforementioned rule metaspace can be pre-constructed through the following steps: The first step is to extract the rule information set corresponding to the keyword-level format from the rule base.
[0042] The second step is to input the aforementioned rule information set into a preset embedding model to obtain a rule vector set. The preset embedding model can be a pre-trained embedding model, for example, BGE-M3.
[0043] The third step is to construct the rule metaspace based on the aforementioned rule vector set. In practice, the aforementioned rule vector set can be used as the rule metaspace.
[0044] Optionally, when generating the context vector based on the context during the reasoning process, the aforementioned context can be input into the aforementioned preset embedding model to obtain the context vector. Thus, the embedding model used in the process of constructing the rule metaspace can be used to generate the context vector.
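The construction steps can be sketched as follows. Because a real embedding model such as BGE-M3 cannot be loaded in a short self-contained example, the sketch substitutes `toy_embed`, a hypothetical hashed bag-of-words function; `build_rule_metaspace` is also a hypothetical name. What the sketch illustrates is the point of paragraph [0044]: the same embedding function is used for the rules and for the generation context, so both live in one vector space.

```python
import hashlib
from typing import Callable, List

Vector = List[float]


def toy_embed(text: str, dim: int = 8) -> Vector:
    """Stand-in for a real embedding model (hashed bag of words, not semantic)."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


def build_rule_metaspace(keyword_rules: List[str],
                         embed: Callable[[str], Vector]) -> List[Vector]:
    """Embed each keyword-level rule; the resulting vector set is the metaspace."""
    return [embed(rule) for rule in keyword_rules]


keyword_rules = ["weapon manufacturing", "methods of lethal injury", "details of torture"]
metaspace = build_rule_metaspace(keyword_rules, toy_embed)

# The same embedding function is reused for the generation context.
context_vector = toy_embed("details of torture")
```

Swapping `toy_embed` for a call into an actual embedding model would leave `build_rule_metaspace` unchanged, since it only depends on the callable's signature.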
[0045] In practice, deploying the question-answering large model on edge devices (such as in-vehicle terminals, mobile phones, and smart bracelets) often presents the following technical problem: rule discrimination and correction using the rule meta-module requires more computation than a general question-answering large model, which adds to the load of an edge device that is already under higher load and leads to system lag and accelerated hardware wear (technical problem two). To allow the output of the large model to remain constrained by the rule meta-module even when the edge device is under higher load, the following solution can be adopted. The aforementioned execution entity may also perform the following steps: The first step is, in response to determining that the currently running device is an edge device, to monitor the device's operational resource information. This operational resource information may include, but is not limited to, available memory percentage, CPU utilization, and disk queue depth. Edge device types may include, but are not limited to, in-vehicle terminals, mobile phones, and smart bracelets.
[0046] The second step involves determining the response mode of the aforementioned question-and-answer model as a secondary rule constraint mode in response to the determination that the above-mentioned operational resource information meets the preset load conditions. The preset load conditions can be preset conditions used to determine whether the device is under slightly high load. For example, preset load conditions could be that the available memory percentage is less than a preset percentage, or the CPU utilization rate is greater than a preset ratio, or the disk queue depth is greater than a preset value. Specific preset percentages, ratios, and values can be preset according to different edge device types. The sub-conditions corresponding to the available memory percentage, CPU utilization rate, and disk queue depth can have a logical relationship of "AND" or "OR," without specific limitations. The secondary rule constraint mode can represent a rule constraint force that is less than the standard rule constraint mode. The standard rule constraint mode can be a mode in which both the rule element review unit and the rule element correction unit participate in the constraint, i.e., the mode corresponding to steps 101-105.
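The load check and mode selection can be sketched as a pair of small functions. The following is illustrative only: the names `ResourceInfo`, `LoadThresholds`, `meets_load_condition`, and `select_mode` are hypothetical, and the threshold values are placeholders, since the source leaves them to be preset per edge device type. The `combine` parameter reflects the statement that the sub-conditions may be joined by "AND" or "OR".

```python
from dataclasses import dataclass


@dataclass
class ResourceInfo:
    available_memory_pct: float  # percentage of memory still available
    cpu_utilization: float       # fraction between 0.0 and 1.0
    disk_queue_depth: int


@dataclass
class LoadThresholds:
    # Placeholder values; the source leaves them to be preset per device type.
    min_available_memory_pct: float = 20.0
    max_cpu_utilization: float = 0.85
    max_disk_queue_depth: int = 4


def meets_load_condition(info: ResourceInfo, th: LoadThresholds,
                         combine: str = "or") -> bool:
    """Evaluate the three sub-conditions and join them with AND or OR."""
    checks = [
        info.available_memory_pct < th.min_available_memory_pct,
        info.cpu_utilization > th.max_cpu_utilization,
        info.disk_queue_depth > th.max_disk_queue_depth,
    ]
    return all(checks) if combine == "and" else any(checks)


def select_mode(is_edge_device: bool, info: ResourceInfo, th: LoadThresholds,
                combine: str = "or") -> str:
    """Return "secondary" only for an edge device that meets the load condition."""
    if is_edge_device and meets_load_condition(info, th, combine):
        return "secondary"
    return "standard"


busy = ResourceInfo(available_memory_pct=15.0, cpu_utilization=0.5, disk_queue_depth=1)
mode = select_mode(True, busy, LoadThresholds())
```

With the "or" combination, low available memory alone is enough to switch modes; with "and", all three sub-conditions must hold at once.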
[0047] The third step is, in the secondary rule constraint mode, to disable the rule meta-review unit in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the thought chain reasoning stage. That is, in the secondary rule constraint mode, the rule meta-review unit does not process the intermediate reasoning text obtained from thought chain reasoning, while the rule meta-correction unit remains available as normal.
[0048] The fourth step is, in the secondary rule constraint mode, to generate a context vector according to the context in the reasoning process in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the main content generation stage.
[0049] The fifth step is to search, based on the context vector, the rule vectors in the rule metaspace included in the rule meta-module to obtain a retrieval result.
[0050] The sixth step is to reject the current candidate token in response to determining that the retrieval result hits a rule. In practice, steps four through six can be implemented as described for the steps executed by the rule meta-correction unit. Thus, in the secondary rule constraint mode, rule constraint and correction are performed only by the rule meta-correction unit during the main content generation stage.
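Steps four through six can be illustrated with a minimal Python sketch. The disclosure does not state how the rule vectors are searched or what counts as a hit, so cosine similarity, the 0.9 threshold, and the names `cosine`, `hit_rule`, and `accept_candidate` are all assumptions made for illustration. The vectors are assumed to come from the same embedding model.

```python
import math
from typing import List, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hit_rule(
    context_vec: Sequence[float],
    rule_vecs: List[Sequence[float]],
    threshold: float = 0.9,  # assumed value; the source gives no threshold
) -> Optional[int]:
    """Return the index of the most similar rule vector whose similarity
    reaches the threshold, or None when no rule is hit."""
    best_idx: Optional[int] = None
    best_sim = threshold
    for idx, rule_vec in enumerate(rule_vecs):
        sim = cosine(context_vec, rule_vec)
        if sim >= best_sim:
            best_idx, best_sim = idx, sim
    return best_idx


def accept_candidate(
    context_vec: Sequence[float],
    rule_vecs: List[Sequence[float]],
    threshold: float = 0.9,
) -> bool:
    """The current candidate is rejected (False) when the retrieval hits a rule."""
    return hit_rule(context_vec, rule_vecs, threshold) is None
```

A real rule metaspace would more likely sit behind a vector index than a linear scan; the linear scan only keeps the sketch self-contained.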
[0051] Steps one through six above, as an inventive point of this disclosure, solve technical problem two: "rule discrimination and correction by the rule meta-module requires more computation than a general question-answering large model, which increases the load on edge devices and leads to system lag and accelerated hardware wear." The factors behind the lag and wear are often as follows: the rule meta-module adds computation on top of the general question-answering large model, and when an edge device is already under slightly high load this added computation pushes the load higher. Addressing these factors alleviates system lag and hardware wear. To this end, when this disclosure determines that the currently running device is an edge device, it monitors the operating resource information in real time to determine whether the device is under slightly high load. Under slightly high load, the question-answering large model enters the secondary rule constraint mode, in which the rule meta-review unit does not process the intermediate reasoning text obtained from thought chain reasoning, and rule constraint and correction are performed only by the rule meta-correction unit during the main content generation stage. This reduces the inference computation of the rule meta-module and the question-answering large model, which alleviates the load on the edge device and, in turn, system lag and hardware wear, while still allowing the rule meta-module to constrain the output of the large model under slightly high load.
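The mode selection described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: the threshold values (20%, 80%, 4) are placeholders, since the disclosure only says they are preset per edge device type, and the names `ResourceInfo`, `LoadThresholds`, `select_mode`, and `enabled_units` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class ConstraintMode(Enum):
    STANDARD = "standard"    # review unit and correction unit both active
    SECONDARY = "secondary"  # correction unit only


@dataclass(frozen=True)
class ResourceInfo:
    available_memory_pct: float
    cpu_utilization_pct: float
    disk_queue_depth: float


@dataclass(frozen=True)
class LoadThresholds:
    # Placeholder values; the source leaves them to be preset per device type.
    min_available_memory_pct: float = 20.0
    max_cpu_utilization_pct: float = 80.0
    max_disk_queue_depth: float = 4.0


def meets_load_condition(
    info: ResourceInfo, th: LoadThresholds, combine: str = "or"
) -> bool:
    """Evaluate the three sub-conditions and combine them with AND or OR."""
    checks = [
        info.available_memory_pct < th.min_available_memory_pct,
        info.cpu_utilization_pct > th.max_cpu_utilization_pct,
        info.disk_queue_depth > th.max_disk_queue_depth,
    ]
    if combine == "and":
        return all(checks)
    if combine == "or":
        return any(checks)
    raise ValueError("combine must be 'and' or 'or'")


def select_mode(
    is_edge_device: bool,
    info: ResourceInfo,
    th: LoadThresholds = LoadThresholds(),
    combine: str = "or",
) -> ConstraintMode:
    """Secondary mode only on an edge device whose load condition is met."""
    if is_edge_device and meets_load_condition(info, th, combine):
        return ConstraintMode.SECONDARY
    return ConstraintMode.STANDARD


def enabled_units(mode: ConstraintMode) -> Dict[str, bool]:
    """The review unit is disabled in secondary mode; correction stays on."""
    return {"review": mode is ConstraintMode.STANDARD, "correction": True}
```

How the resource readings are collected is left out on purpose, because it depends on the operating system of the edge device.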
[0052] In addressing the above technical problem, deploying the question-answering large model in the educational tutoring field often raises the following difficulty: the level of detail of the model's answers matches user needs poorly. For example, a calculus-based solution might be given to a younger user, while a detailed solution to a linear equation might be given to an older user (technical problem three). To meet the requirement of this application scenario, namely that rule constraints for the educational tutoring field can be added directly using the existing rule meta-module, the following solution is adopted. In some optional implementations of certain embodiments, the aforementioned execution entity may further perform the following steps: The first step is to determine the educational background information of the user. The educational background information can represent the user's academic level and/or learning ability. For example, the educational background information could be "first grade of elementary school" or "master's degree in statistics".
[0053] The second step is to determine, based on the educational background information, the initial education prompt corresponding to the thought chain reasoning stage. Each piece of educational background information can correspond to a preset prompt for the thought chain reasoning stage. For example, the prompt for "first grade" could be: "the current user's educational level is first grade, and the model's response needs to be consistent with the knowledge background of first grade." In practice, the preset prompt that corresponds to both the educational background information and the thought chain reasoning stage can be used as the initial education prompt.
[0054] The third step is to obtain intermediate reasoning text through thought chain reasoning based on the initial education prompt, in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the thought chain reasoning stage. In practice, the initial education prompt can be combined with the prompt used in the thought chain reasoning stage to obtain the intermediate reasoning text.
[0055] The fourth step is to determine, from the rule base, the rule information that corresponds to the education dimension and matches the educational background information. The rule base can pre-store rule information for the education dimension, and each piece of rule information for the education dimension can correspond to specific educational background information. For example, the rule information corresponding to "first grade" might include: the solution method for word problems cannot exceed the first grade curriculum. In practice, the rule information in the rule base that belongs to the education dimension and whose associated educational background information equals the user's educational background information can be taken as the matching rule information.
[0056] The fifth step is to correct the intermediate reasoning text based on the matched rule information, so as to update the intermediate reasoning text. In practice, the matched rule information can be used directly as a correction prompt, so that the question-answering large model corrects the intermediate reasoning text during the thought chain reasoning stage and produces the corrected reasoning text.
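The prompt selection and rule matching in the steps above can be sketched as follows. The sketch only builds the prompt strings and does not call any model. The `Rule` record, the `RULE_BASE` contents beyond the first-grade example, the wording of the correction prompt, and all function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    dimension: str   # constraint dimension, e.g. "education"
    background: str  # educational background the rule applies to
    text: str        # the rule information itself


RULE_BASE: List[Rule] = [
    Rule(
        "education",
        "first grade",
        "The solution method for word problems cannot exceed the first grade curriculum.",
    ),
    # Hypothetical entry from another dimension, present only to show it is skipped.
    Rule("safety", "", "Placeholder rule from a non-education dimension."),
]


def initial_education_prompt(background: str) -> str:
    """Preset prompt for the thought chain reasoning stage (second step)."""
    return (
        f"The current user's educational level is {background}, and the model's "
        f"response needs to be consistent with the knowledge background of {background}."
    )


def match_education_rules(rule_base: List[Rule], background: str) -> List[str]:
    """Rule information in the education dimension for this background (fourth step)."""
    return [
        rule.text
        for rule in rule_base
        if rule.dimension == "education" and rule.background == background
    ]


def build_correction_prompt(intermediate_text: str, matched_rules: List[str]) -> str:
    """Use the matched rule information directly as a correction prompt (fifth step)."""
    rules = "\n".join(f"- {rule}" for rule in matched_rules)
    return (
        "Revise the reasoning below so that it satisfies these rules:\n"
        f"{rules}\n\nReasoning:\n{intermediate_text}"
    )
```

Exact string matching on the background is the simplest reading of "matching"; a deployed rule base might instead normalize or group backgrounds.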
[0057] The first to fifth steps described above, as an inventive aspect of this disclosure, address technical problem three: "the model's level of detail in the answer has a low degree of matching with user needs." Factors leading to this low matching degree often include: the large question-and-answer model lacks constraints related to the user's educational background. Solving these factors can improve the matching degree between the model's level of detail and user needs. To achieve this, this disclosure directly extends the rule information of the education dimension to the already constructed rule base. During the reasoning stage of the thought chain, the matched rule information of the education dimension can be used to correct the intermediate reasoning text obtained from the reasoning, ensuring that the corrected intermediate reasoning text conforms to the rule constraints of the corresponding user's educational background under the education dimension. Therefore, the already constructed rule meta-module can be used to directly extend the rule constraints for the educational tutoring field, improving the matching degree between the model's level of detail and user needs.
[0058] In some optional implementations of certain embodiments, the aforementioned execution entity may further perform the following steps: The first step is to determine the basic educational background of the users mentioned above. This background can be pre-defined by the users. For example, the basic educational background could be "first grade of primary school".
[0059] The second step involves generating a dynamic educational background based on the user's question-and-answer records corresponding to the aforementioned question-and-answer model. These records can include individual question-and-answer pairs. Each pair can consist of the user's question text and the answer provided by the question-and-answer model. The dynamic educational background can be used to characterize the user's learning ability level. For example, a dynamic educational background could indicate strong learning ability.
[0060] The third step is to define the aforementioned basic educational background and dynamic educational background as the user's educational background information, and then update this information. In practice, the update frequency for educational background information can be preset. For example, a user's educational background information can be updated every three months.
[0061] In some optional implementations of certain embodiments, the aforementioned execution entity can generate the dynamic educational background based on the user's question-and-answer records for the question-answering large model through the following steps: The first step is to extract the most recent preset number of question texts from the question-and-answer records and take them as the initial question text set. For example, the preset number can be 100.
[0062] The second step is to clean the initial question text set to obtain the question text set. In practice, the question text set can be obtained by removing from the initial question text set every initial question text with fewer than a preset number of characters. For example, this preset number of characters can be 3.
[0063] The third step is to determine the total number of words in the questions included in the above question text set.
[0064] The fourth step is to generate the average question length based on the total number of words in the questions and the preset quantity. In practice, the ratio of the total number of words in the questions to the preset quantity can be used to determine the average question length.
[0065] The fifth step is to determine the high-order word density by the number of times the above question text set hits high-order words. In practice, a high-order word set can be pre-defined. For example, the high-order word set can include, but is not limited to: why, how, design, verify, compare, and limitations.
[0066] The sixth step is to determine the percentage of questions that conform to the question format in the aforementioned question text set as the follow-up question activity level. It is understandable that a single question text may contain multiple questions.
[0067] The seventh step is to determine the knowledge relevance by the number of times the above question text set matches the relevant words. In practice, a set of relevant words can be pre-defined. For example, the set of relevant words can include, but is not limited to: connection, combination, before, similar.
[0068] The eighth step is to generate a question score based on the average question length, high-order word density, follow-up question activity level, and knowledge relevance. In practice, a score mapping table can be pre-configured for each of these four dimensions, and the score for each dimension is determined from its own table. For example, the score mapping table for average question length can include: fewer than 8 characters = 0 points; 8 to 15 characters = 0.5 points; more than 15 characters = 1 point. Finally, the weighted sum of the four scores is used as the question score. There are no specific limitations on the weighting coefficients. For example, the weighting coefficients for average question length, high-order word density, follow-up question activity level, and knowledge relevance can be 0.3, 0.4, 0.2, and 0.1, respectively.
[0069] The ninth step is to generate the user's dynamic educational background based on the question score. In practice, a preset grading table can be used to determine the learning ability level corresponding to the question score. The grading table can include score ranges and the learning ability level corresponding to each score range. Learning ability levels can be differentiated by grade. The determined learning ability level is then used as the user's dynamic educational background. In this way, the user's educational background can be determined dynamically from the user's historical question records with the question-answering large model, allowing the model's responses to match the user's needs dynamically.
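The nine steps can be sketched end to end in Python. Only the word lists, the length mapping table, the weights 0.3/0.4/0.2/0.1, and the example presets (100 texts, 3 characters) come from the text above. Everything else is an assumption for illustration: length is counted in characters, the three ratio dimensions are normalized per question text and share one placeholder mapping table (`ratio_score`), a text counts as a question if it contains "?", word hits use naive substring matching, and the grading table in `ability_level` is invented.

```python
from typing import Iterable, List

HIGH_ORDER_WORDS = ("why", "how", "design", "verify", "compare", "limitations")
RELEVANT_WORDS = ("connection", "combination", "before", "similar")
WEIGHTS = (0.3, 0.4, 0.2, 0.1)  # length, density, follow-up, relevance


def length_score(avg_len: float) -> float:
    """Mapping table for average question length, as given in the text."""
    if avg_len < 8:
        return 0.0
    if avg_len <= 15:
        return 0.5
    return 1.0


def ratio_score(ratio: float) -> float:
    """Placeholder mapping table for the other three dimensions (assumed)."""
    if ratio < 0.2:
        return 0.0
    if ratio <= 0.5:
        return 0.5
    return 1.0


def count_hits(texts: Iterable[str], vocabulary: Iterable[str]) -> int:
    """Naive substring hit count, so 'how' also matches inside 'show'."""
    vocab = tuple(vocabulary)
    return sum(text.lower().count(word) for text in texts for word in vocab)


def question_score(
    question_texts: List[str], preset_number: int = 100, min_chars: int = 3
) -> float:
    recent = question_texts[-preset_number:]                 # step 1
    cleaned = [q for q in recent if len(q) >= min_chars]     # step 2
    if not cleaned:
        return 0.0
    total_chars = sum(len(q) for q in cleaned)               # step 3
    avg_len = total_chars / preset_number                    # step 4, divisor as in the text
    n = len(cleaned)
    density = count_hits(cleaned, HIGH_ORDER_WORDS) / n      # step 5
    follow_up = sum(1 for q in cleaned if "?" in q) / n      # step 6
    relevance = count_hits(cleaned, RELEVANT_WORDS) / n      # step 7
    scores = (
        length_score(avg_len),
        ratio_score(density),
        ratio_score(follow_up),
        ratio_score(relevance),
    )
    return sum(w * s for w, s in zip(WEIGHTS, scores))       # step 8


def ability_level(score: float) -> str:
    """Invented grading table mapping a score to a learning ability level (step 9)."""
    if score < 0.4:
        return "basic"
    if score < 0.7:
        return "intermediate"
    return "strong"
```

Because the average divides by the preset number rather than by the number of texts that survive cleaning, a user with few records gets a lower length score; whether that is intended is not stated in the text.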
[0070] The above embodiments of this disclosure have the following beneficial effects: by constraining the output content of a large model with rule meta-elements through the rule element-based content generation method of some embodiments of this disclosure, the output of the large model can be guided and controlled at the source while its expressive power is maintained, which improves the controllability of the output and makes the output content safer, more reliable, and compliant. Specifically, the output is highly uncontrollable because the large model lacks cognition of physical rules and knowledge and has fundamental limitations such as difficulty in establishing causal relationships and performing complex logical reasoning. Based on this, the rule element-based content generation method of some embodiments of this disclosure first receives input text from the user. Second, the input text is input into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta-module, which includes a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit. Third, in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the thought chain reasoning stage, the intermediate reasoning text obtained by thought chain reasoning is input into the rule meta-review unit. Fourth, in response to determining that the review result of the rule meta-review unit does not meet the preset conditions, the intermediate reasoning text is corrected to obtain the corrected reasoning text. Finally, in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the main content generation stage, the corrected reasoning text is sent to the question-answering large model for output content generation, and during output content generation the rule meta-correction unit performs rule discrimination and correction. This method overcomes the lag of traditional post-processing filtering and constructs a full-link, phased dynamic constraint mechanism: first, the rule meta-gating unit identifies and decouples the two stages of "thought chain reasoning" and "main content generation", enabling fine-grained observation of and intervention in the model's generation process; second, the rule meta-review unit and the rule meta-correction unit in the rule meta-module perform violation discrimination and correction on the output of the two stages respectively. Thus, while the expressive power of the large model is maintained, its output is guided and controlled at the source, which improves controllability and makes the output content safer, more reliable, and compliant.
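The stage decoupling performed by the rule meta-gating unit can be sketched as a small state machine over the token stream. The marker strings `<think>` and `</think>` are assumptions, since the disclosure speaks only of a reasoning start marker and a reasoning end marker, and the `PREAMBLE` state for tokens seen before any marker is likewise an assumption.

```python
from enum import Enum
from typing import Iterable, Iterator, List, Tuple


class Stage(Enum):
    PREAMBLE = "preamble"    # before any marker has been seen (assumed state)
    REASONING = "reasoning"  # thought chain reasoning stage
    MAIN = "main"            # main content generation stage


REASONING_START = "<think>"  # assumed marker
REASONING_END = "</think>"   # assumed marker


def gate(tokens: Iterable[str]) -> Iterator[Tuple[Stage, str]]:
    """Scan the token stream and tag each non-marker token with its stage."""
    stage = Stage.PREAMBLE
    for token in tokens:
        if token == REASONING_START:
            stage = Stage.REASONING
            continue
        if token == REASONING_END:
            stage = Stage.MAIN
            continue
        yield stage, token


def split_stages(tokens: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Collect reasoning tokens (for the review unit) and main-content tokens
    (for the correction unit). Preamble tokens are dropped in this sketch."""
    reasoning: List[str] = []
    main: List[str] = []
    for stage, token in gate(tokens):
        if stage is Stage.REASONING:
            reasoning.append(token)
        elif stage is Stage.MAIN:
            main.append(token)
    return reasoning, main
```

In the secondary rule constraint mode described earlier, the reasoning tokens would simply not be passed to the review unit, while the main-content tokens would still go through the correction unit.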
[0071] With further reference to Figure 3, as an implementation of the methods shown in the above figures, this disclosure provides some embodiments of a rule element-based content generation apparatus. These apparatus embodiments correspond to the method embodiments shown in Figure 1, and the apparatus can be applied to various electronic devices.
[0072] As shown in Figure 3, the rule element-based content generation apparatus 300 in some embodiments includes: a receiving unit 301, a first input unit 302, a second input unit 303, a correction unit 304, and a third input unit 305. The receiving unit 301 is configured to receive input text from a user; the first input unit 302 is configured to input the input text into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta-module, which includes a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit; the second input unit 303 is configured to, in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the thought chain reasoning stage, input the intermediate reasoning text obtained by thought chain reasoning into the rule meta-review unit; the correction unit 304 is configured to, in response to determining that the review result of the rule meta-review unit does not meet the preset conditions, correct the intermediate reasoning text to obtain corrected reasoning text; the third input unit 305 is configured to, in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the main content generation stage, send the corrected reasoning text into the question-answering large model for output content generation, wherein, during the output content generation process, the rule meta-correction unit is used for rule discrimination and correction.
[0073] It can be understood that the units described in the apparatus 300 correspond to the steps of the method described with reference to Figure 1. Therefore, the operations, features, and beneficial effects described above for the method also apply to the apparatus 300 and the units contained therein, and will not be repeated here.
[0074] Referring now to Figure 4, a schematic structural diagram of an electronic device 400 (e.g., a server or terminal device) suitable for implementing some embodiments of this disclosure is shown. The electronic device shown in Figure 4 is merely an example and should not be construed as limiting the functionality and scope of the embodiments of this disclosure.
[0075] As shown in Figure 4, the electronic device 400 may include a processing unit 401 (e.g., a central processing unit, a graphics processor, etc.), which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 408 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the electronic device 400. The processing unit 401, ROM 402, and RAM 403 are interconnected via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
[0076] Typically, the following devices can be connected to the I/O interface 405: input devices 406 including, for example, touchscreens, touchpads, keyboards, mice, cameras, microphones, accelerometers, gyroscopes, etc.; output devices 407 including, for example, liquid crystal displays (LCDs), speakers, vibrators, etc.; storage devices 408 including, for example, magnetic tapes, hard disks, etc.; and communication devices 409. The communication device 409 allows the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although Figure 4 shows an electronic device 400 with various devices, it should be understood that it is not required to implement or possess all of the devices shown. More or fewer devices may alternatively be implemented or possessed. Each box shown in Figure 4 can represent one device or multiple devices as needed.
[0077] In particular, according to some embodiments of this disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, some embodiments of this disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program can be downloaded and installed from a network via the communication device 409, or installed from the storage device 408, or installed from the ROM 402. When the computer program is executed by the processing unit 401, it performs the functions defined above in the methods of some embodiments of this disclosure.
[0078] It should be noted that, in some embodiments of this disclosure, the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium, or any combination thereof. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In some embodiments of this disclosure, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In some embodiments of this disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals may take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wires, optical fibers, RF (radio frequency), etc., or any suitable combination thereof.
[0079] In some implementations, clients and servers can communicate using any currently known or future-developed network protocol, such as HTTP (Hypertext Transfer Protocol), and can interconnect with digital data communication (e.g., a communication network) of any form or medium. Examples of communication networks include local area networks (“LANs”), wide area networks (“WANs”), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future-developed networks.
[0080] The aforementioned computer-readable medium may be included in the aforementioned electronic device; or it may exist independently and not assembled into the electronic device. The aforementioned computer-readable medium carries one or more programs, which, when executed by the electronic device, cause the electronic device to: receive input text from a user; input the input text into a pre-trained question-answering model, wherein the question-answering model includes a rule meta-module, the rule meta-module including a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit; in response to determining that the content generation process corresponding to the input text identified by the rule meta-gating unit has entered the thought chain reasoning stage, input the intermediate reasoning text obtained from the thought chain reasoning into the rule meta-review unit; in response to determining that the review result of the rule meta-review unit does not meet preset conditions, correct the intermediate reasoning text to obtain corrected reasoning text; in response to determining that the content generation process corresponding to the input text identified by the rule meta-gating unit has entered the main content generation stage, send the corrected reasoning text into the question-answering model for output content generation, wherein, during the output content generation process, the rule meta-correction unit is used for rule discrimination and correction.
[0081] Computer program code for performing operations of some embodiments of this disclosure can be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partially on the user's computer, as a standalone software package, partially on the user's computer and partially on a remote computer, or entirely on a remote computer or server. In cases involving remote computers, the remote computer can be connected to the user's computer via any type of network—including a local area network (LAN) or a wide area network (WAN)—or can be connected to an external computer (e.g., via the Internet using an Internet service provider).
[0082] The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of this disclosure. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions indicated in the blocks may occur in a different order than those indicated in the drawings. For example, two consecutively indicated blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and / or flowcharts, and combinations of blocks in the block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or operation, or using a combination of dedicated hardware and computer instructions.
[0083] The units described in some embodiments of this disclosure can be implemented in software or hardware. The described units can also be housed in a processor; for example, a processor may be described as including a receiving unit, a first input unit, a second input unit, a correction unit, and a third input unit. The names of these units do not necessarily limit the specific unit; for example, a receiving unit may also be described as "a unit that receives input text from a user."
[0084] The functions described above in this document can be performed at least in part by one or more hardware logic components. For example, exemplary types of hardware logic components that can be used, without limitation, include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), system-on-a-chip (SoCs), complex programmable logic devices (CPLDs), and so on.
[0085] Some embodiments of this disclosure also provide a computer program product, including a computer program that, when executed by a processor, implements any of the rule-based content generation methods described above.
[0086] The above description is merely a selection of preferred embodiments of this disclosure and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the embodiments of this disclosure is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the above-described inventive concept. For example, it covers technical solutions formed by replacing the above-described features with technical features having similar functions disclosed in (but not limited to) the embodiments of this disclosure.
Claims
1. A content generation method based on rule meta-elements, comprising: receiving input text from a user; inputting the input text into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta-module, and the rule meta-module includes a rule meta-gating unit, a rule meta-review unit, and a rule meta-correction unit; in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the thought chain reasoning stage, inputting the intermediate reasoning text obtained by thought chain reasoning into the rule meta-review unit; in response to determining that the review result of the rule meta-review unit does not meet preset conditions, correcting the intermediate reasoning text to obtain corrected reasoning text; in response to determining that the rule meta-gating unit identifies that the content generation process corresponding to the input text has entered the main content generation stage, sending the corrected reasoning text into the question-answering large model for output content generation, wherein, during the output content generation process, the rule meta-correction unit is used to perform rule discrimination and correction.
2. The method according to claim 1, wherein, The rule meta-correction unit is configured to perform the following steps: Generate a context vector based on the context of the reasoning process; Based on the context vector, the rule vectors in the rule metaspace included in the rule meta module are retrieved to obtain the retrieval results; In response to determining the hit rules for the retrieval results, the current candidate terminology is rejected.
3. The method according to claim 2, wherein, The rule meta-correction unit is also configured to perform the following steps: Trigger the resampling process of the sampling layer in the question-answering model to constrain the generation of output content again through the rule meta module.
4. The method according to claim 2, wherein, The rule meta module also includes a rule base, which stores rule information corresponding to each constraint dimension. The text storage format of the rule information stored in the rule base includes semantic level and keyword level.
5. The method according to claim 4, wherein the rule metaspace is pre-constructed through the following steps: extracting, from the rule base, the set of rule information stored in the keyword-level format; inputting the rule information set into a preset embedding model to obtain a rule vector set; and constructing the rule metaspace based on the rule vector set.
6. The method according to claim 5, wherein generating the context vector based on the context of the reasoning process comprises: inputting the context into the preset embedding model to obtain the context vector.
7. The method according to claim 1, wherein the rule meta gating unit is configured to perform the following steps: scanning the token stream output by the sampling layer; in response to detecting a reasoning start marker, determining that the current content generation process has entered the chain-of-thought reasoning stage; and in response to detecting a reasoning end marker, determining that the current content generation process has entered the main content generation stage.
8. A content generation apparatus based on rule meta-elements, comprising: a receiving unit configured to receive input text from a user; a first input unit configured to input the input text into a pre-trained question-answering large model, wherein the question-answering large model includes a rule meta module, and the rule meta module includes a rule meta gating unit, a rule meta review unit, and a rule meta correction unit; a second input unit configured to, in response to determining, by means of the rule meta gating unit, that the content generation process corresponding to the input text has entered a chain-of-thought reasoning stage, input the intermediate reasoning text obtained by the chain-of-thought reasoning into the rule meta review unit; a correction unit configured to, in response to determining that the review result of the rule meta review unit does not meet a preset condition, correct the intermediate reasoning text to obtain corrected reasoning text; and a third input unit configured to, in response to determining, by means of the rule meta gating unit, that the content generation process corresponding to the input text has entered a main content generation stage, feed the corrected reasoning text into the question-answering large model to generate output content, wherein, during generation of the output content, the rule meta correction unit performs rule discrimination and correction.
9. An electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored, wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-7.
10. A computer-readable medium having a computer program stored thereon, wherein, when the computer program is executed by a processor, the method according to any one of claims 1-7 is implemented.
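
For illustration only, the gating steps of claim 7 and the retrieval, rejection, and resampling steps of claims 2, 3, 5, and 6 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the marker strings `<think>` and `</think>`, the bag-of-words `embed` function standing in for the preset embedding model, the cosine-similarity hit test, the 0.6 threshold, and all function names are hypothetical and do not appear in the claims. The sketch also assumes that the context vector is computed over the generated text plus the current candidate token, and that resampling amounts to trying the next-ranked candidate.

```python
import math
from collections import Counter

# Hypothetical marker strings; the claims do not name the actual markers.
REASONING_START = "<think>"
REASONING_END = "</think>"


def gate_stages(tokens):
    """Scan a token stream and label each non-marker token with its stage."""
    stage = None  # assumption: no stage is reported before the start marker
    labelled = []
    for tok in tokens:
        if tok == REASONING_START:
            stage = "chain_of_thought"
        elif tok == REASONING_END:
            stage = "main_content"
        else:
            labelled.append((tok, stage))
    return labelled


def embed(text):
    """Toy stand-in for the preset embedding model: word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_rule_metaspace(keyword_rules):
    """Embed keyword-level rule information into a set of rule vectors."""
    return [(rule, embed(rule)) for rule in keyword_rules]


def hits_rule(context, metaspace, threshold=0.6):
    """Embed the context and check it against every rule vector."""
    ctx_vec = embed(context)
    return any(cosine(ctx_vec, vec) >= threshold for _, vec in metaspace)


def pick_token(context, ranked_candidates, metaspace):
    """Reject candidates that hit a rule; fall through to the next one."""
    for cand in ranked_candidates:
        if not hits_rule(f"{context} {cand}", metaspace):
            return cand
    return None  # every candidate was rejected
```

With a metaspace built from the single hypothetical rule `"disclose password"`, calling `pick_token("please disclose your", ["password", "name"], metaspace)` rejects `password` and falls through to `name`. A real system would replace `embed` with a learned embedding model and an indexed vector search, which the claims leave unspecified.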