Business specification generation method and system based on specific language rules and thought chains
By building a lightweight database based on ANTLR4 and Jinja2 DSL rule sets, the problem of low efficiency in business database storage and logical calls is solved, and efficient and standardized business content output is achieved, which is suitable for complex engineering fields such as automotive electronics and aerospace.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHANGSHA KUAIZI TECHNOLOGY CO LTD
- Filing Date
- 2026-04-24
- Publication Date
- 2026-06-19
AI Technical Summary
Existing technologies suffer from large storage volumes, low logic call efficiency, and insufficient output content standardization. Especially in complex engineering fields such as automotive electronics and aerospace, traditional database solutions fail to effectively separate content from logic, resulting in high storage costs, slow call response, and output that does not conform to business specifications.
We adopt a method based on ANTLR4 semantic rules and Jinja2 language template rules. We construct a syntax parser and a thought chain reasoning step template through the DSL rule set, filter non-compliant content, generate a lightweight database that only stores logical relationship information, and achieve high-standard output through the lightweight database.
It achieves a 99.7% reduction in database size, a 99% reduction in inference output size, an 80% increase in response speed, and a 100% standardization rate for output content, supporting flexible output with both automated and manual review.
[Figure: CN122240625A_ABST]
Description
Technical Field
[0001] This invention relates to the field of domain-specific language applications and database construction technology, and in particular to a method and system for generating business specifications based on specific language rules and thought chains.
Background Technology
[0002] The core pain points in the current business domain's database construction and content output are concentrated in the data storage and logical retrieval stages.
[0003] First, traditional business databases need to store all business data. In complex engineering fields such as automotive electronics and aerospace, the storage scale of all business data can easily reach hundreds of GB or even TB, resulting in high storage costs and slow response times. Moreover, the specific content in the database is deeply coupled with business logic, making it difficult to quickly filter out effective information from massive amounts of content when it is necessary to extract core business logic.
[0004] Secondly, the Chain of Thought (COT) is not constrained by business specifications. When processing business data, the general COT has a large number of redundant steps in the reasoning path, and the length of the thought chain is too long. This results in the reasoning results being mixed with a lot of natural language descriptions that are not related to business logic. Not only does it fail to accurately abstract the core logic, but it also further increases the data volume. The volume of the reasoning output is often larger than the original data.
[0005] Third, existing lightweight database solutions only perform simple data compression or index optimization, such as reducing physical storage space usage through dictionary compression and deduplication, but they do not effectively separate "content" from "logic." The database still stores a large amount of actual text content, failing to fundamentally solve the core problems of "large size, slow access, and difficulty in standardizing output."
[0006] Fourth, the output stage lacks unified constraints based on the business logic library. Existing solutions either rely on Large Language Models (LLMs) to freely generate content, resulting in deviations between the output results and business specifications, and the standardization rate cannot be guaranteed; or they call the entire database for retrieval and generation, resulting in low calling efficiency, excessively long response time, and an inability to balance output quality and generation efficiency.
[0007] From the perspective of the overall technology chain, the three stages of DSL (Domain-Specific Language) rule design, COT (Chain of Thought) application, and database construction in existing technologies are isolated from one another and fail to form a closed loop of "DSL constraints → COT simplification → logical abstraction → lightweight database → standardized output". In particular, there is a technological gap in the key stage of "generating a lightweight database based on COT-abstracted business logic", and no existing technical solution has moved business data storage from "full content storage" to "logical storage only".
Summary of the Invention
[0008] This invention proposes a method and system for generating business specifications based on specific language rules and thought chains, aiming to solve the technical problems of large business database storage volume, low logic call efficiency, and insufficient output content standardization in existing technologies.
[0009] In a first aspect, the present invention provides a method for generating business specifications based on specific language rules and thought chains, including:
S1. Obtain a domain-specific language (DSL) rule set for the target business domain, wherein the DSL rule set includes ANTLR4 semantic rules and Jinja2 language template rules.
S2. Based on the ANTLR4 semantic rules, construct a syntax parser, use the syntax parser to perform semantic compliance verification on all input business data, filter out content that does not conform to the ANTLR4 semantic rules, and obtain the abstract syntax tree nodes that pass the verification.
S3. Based on the Jinja2 language template rules, construct a thought chain reasoning step template. The thought chain reasoning step template limits the thought chain reasoning path to extracting business logic and business relationships only.
S4. Using the thought chain reasoning step template, perform simplified thought chain reasoning on the verified abstract syntax tree nodes, and output structured logical data containing only logical types, associated objects, and association rules.
S5. Convert the structured logical data into a lightweight database, which stores only logical constraints; the logical constraints include business logic and logical association information.
S6. Based on the logical constraints stored in the lightweight database, respond to user output requirements and generate target output content that conforms to business specifications.
[0010] The technical effects of the business specification generation method based on specific language rules and thought chains disclosed in this invention are as follows: This invention generates a simplified COT thought chain through pre-defined DSL rule constraints, abstracting massive business data into a lightweight database containing only logical association information, and achieving high-standard output based on this database. Its technical effects include: First, it achieves a fundamental breakthrough in business data storage, moving from "full content storage" to "logic-only storage," compressing the database volume by more than 99.7% compared to a full database, significantly reducing storage costs; Second, through ANTLR4 syntax validation and Jinja2 template constraints, redundant content is filtered and the thought chain length is compressed from the source, reducing the inference output volume by more than 99% and improving processing efficiency; Third, the logical information stored in the lightweight database perfectly matches the business specifications, resulting in a 100% standardization rate for the output content and a response speed more than 80% faster than traditional methods; Fourth, this method has good scenario adaptability, supporting both LLM combined with database output and direct database output, balancing automation and manual review needs.
[0011] Furthermore, S2 specifically includes: Use the ANTLR4 tool to define the lexical rule file and syntax rule file for the target business domain, and generate the corresponding lexical analyzer and syntax analyzer; The parser parses the input business data fragments and generates an abstract syntax tree; Only the abstract syntax tree nodes that pass the syntax validation are used as input to the thought chain reasoning engine.
[0012] Furthermore, S3 specifically includes: Based on the Jinja2 language template rules, a thought chain reasoning step template including logical extraction condition judgment and loop traversal is designed. In the thought chain reasoning step template, corresponding reasoning steps are defined for different types of business logic. The types of reasoning steps include reference relationship extraction, parameter constraint extraction, module association extraction, and normative mapping extraction. Each reasoning step outputs a structured result in a preset format, which includes logical type fields, source object fields, target object fields, and association rule fields.
[0013] Furthermore, the structured logical data in S4 is organized in a data serialization format, and each node is a structured result in the preset format, containing the following fields: A logical unique identifier field, used to uniquely identify a logical record; Logical type fields can take values including at least one of the following: reference, constraint, association, and mapping. The source object field is used to identify the starting point of a logical relationship, which may include a document, module, or parameter; The target object field is used to identify the endpoint of the logical relationship; The logical rules field is used to describe the core logical rules between the source object and the target object; The business specification field is used to record the business specification identifier associated with this logic.
[0014] Furthermore, in step S5, the structured logical data is converted into a lightweight database, specifically including: Import the structured logical data in the data serialization format into a relational database or graph database; In the relational database or graph database, construct a logical node table, an association rule table, and a specification mapping table. The logical node table stores the unique identifier, type, and business domain of a document, module, or parameter. The association rule table stores the association type, logical rules, and association strength between nodes. The specification mapping table stores the mapping relationship between nodes and business specifications. The logical node table, the association rule table, and the specification mapping table do not have fields for storing specific text content.
[0015] Furthermore, S6 specifically includes: Receive user output requirements, call the query interface of the lightweight database, and extract all business logic records related to the user output requirements; The extracted business logic records are converted into prompt constraints of the large language model. The prompt constraints include document reference requirements, parameter constraint requirements, and specification compliance requirements. The prompt constraints are input into the large language model, so that the large language model generates content that conforms to business specifications under the constraints of the prompt constraints.
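The conversion of extracted logic records into prompt constraints can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the patented implementation: the record shape, the `CONSTRAINT_LABELS` mapping, the function name `build_prompt_constraints`, and the second sample record (a 120 km/h upper bound) are all hypothetical, and the call to the large language model itself is left out.

```python
from typing import Dict, List

# Hypothetical record shape: logic_type, source, target, rule.
Record = Dict[str, str]

# Hypothetical mapping from logic type to the kind of prompt constraint it yields.
CONSTRAINT_LABELS = {
    "reference": "Document reference requirement",
    "constraint": "Parameter constraint requirement",
    "mapping": "Specification compliance requirement",
}


def build_prompt_constraints(records: List[Record]) -> str:
    """Turn extracted logic records into a constraint block for an LLM prompt."""
    lines = ["Follow every constraint below when generating the content:"]
    for rec in records:
        label = CONSTRAINT_LABELS.get(rec["logic_type"])
        if label is None:
            continue  # logic types without a label are not turned into constraints
        lines.append(f"- {label}: {rec['source']} -> {rec['target']} ({rec['rule']})")
    return "\n".join(lines)


records = [
    {"logic_type": "reference", "source": "ECU design document",
     "target": "BMS test report", "rule": "corresponding to section 3.2"},
    # Hypothetical constraint record, used only to show a second label.
    {"logic_type": "constraint", "source": "VCU maximum speed",
     "target": "VCU", "rule": "value <= 120 km/h"},
]
prompt = build_prompt_constraints(records)
print(prompt)
```

The resulting text would be prepended to the user's request before it is sent to the model, so that the model sees only logic records and never the full business data.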
[0016] Furthermore, S6 specifically includes: Receive user output requirements, call the query interface of the lightweight database, and extract the logical rules and corresponding Jinja2 template identifiers related to the user output requirements; The Jinja2 template engine is invoked to render a standardized output template based on the extracted logical rules, generating an empty template containing parameter name placeholders, constraint rule placeholders, and specification basis placeholders. The system receives specific content from users or automated tools to fill in each placeholder in the empty template, and obtains the final output content that conforms to the logical constraints of the lightweight database.
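The two-phase flow of this output mode (render an empty template, then fill the placeholders) can be sketched as follows. Plain string replacement stands in for the Jinja2 template engine here, so this is not Jinja2 code; the `<<...>>` placeholder syntax, the function names, and the sample rule and values are hypothetical.

```python
from typing import Dict, List

# Hypothetical placeholder keys and their display labels.
PLACEHOLDER_LABELS = {
    "param_name": "Parameter name",
    "constraint_rule": "Constraint rule",
    "spec_basis": "Specification basis",
}


def render_empty_template(rules: List[Dict[str, str]]) -> str:
    """Render one section per logic rule, leaving all three placeholders unfilled."""
    sections = []
    for i, rule in enumerate(rules, start=1):
        lines = [f"Section {i} ({rule['logic_type']}): {rule['rule']}"]
        for key, label in PLACEHOLDER_LABELS.items():
            lines.append(f"  {label}: <<{key}_{i}>>")
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


def fill_template(template: str, values: Dict[str, str]) -> str:
    """Replace each <<key>> placeholder and fail loudly if any is left unfilled."""
    for key, value in values.items():
        template = template.replace(f"<<{key}>>", value)
    if "<<" in template:
        raise ValueError("unfilled placeholders remain")
    return template


# Hypothetical rule record and fill-in values.
empty = render_empty_template([{"logic_type": "constraint", "rule": "upper bound on speed"}])
filled = fill_template(empty, {
    "param_name_1": "VCU maximum speed",
    "constraint_rule_1": "value <= 120 km/h",
    "spec_basis_1": "ISO 26262",
})
print(filled)
```

Rejecting a template with leftover placeholders is one simple way to keep manually supplied content inside the structure dictated by the stored logic rules.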
[0017] Furthermore, the target business domain includes the automotive electronics domain, and the ANTLR4 semantic rules include the semantic verification grammar of electronic control unit parameters. The semantic verification grammar of electronic control unit parameters is defined as a combination rule of parameter name, value and unit.
[0018] Furthermore, the storage format of the full set of business data includes at least one of text documents, code files, and configuration files; the business logic and logical association information includes at least one of document reference relationships, parameter logic constraints, inter-module association rules, and business specification mapping relationships.
[0019] In a second aspect, this invention provides a business specification generation system based on specific language rules and thought chains, the system being used to execute the method and comprising:
The DSL rule acquisition module, used to acquire a domain-specific language rule set for the target business domain. The DSL rule set includes ANTLR4 semantic rules and Jinja2 language template rules.
The syntax verification module, used to build a syntax parser based on the ANTLR4 semantic rules, perform semantic compliance verification on all input business data, filter out content that does not conform to the ANTLR4 semantic rules, and obtain the abstract syntax tree nodes that pass the verification.
The thought chain customization module, used to construct a thought chain reasoning step template based on the Jinja2 language template rules, and to perform simplified thought chain reasoning on the verified abstract syntax tree nodes using that template, outputting structured logical data containing only logical types, associated objects, and association rules.
The database construction module, used to convert the structured logical data into a lightweight database. The lightweight database stores only business logic and logical relationship information, and does not store the specific content of the full business data.
The standardized output module, used to respond to user output requirements based on the logical constraints stored in the lightweight database and generate target output content that conforms to business specifications.
[0020] The technical advantages of the system disclosed in this invention are as follows: The system consists of a DSL rule acquisition module, a syntax verification module, a thought chain customization module, a database construction module, and a standardized output module. These modules work together to execute the steps of the method. The system achieves a complete closed loop of "DSL constraints → COT simplification → logical abstraction → lightweight database → standardized output," and has the following technical advantages: First, the system filters non-compliant content through the syntax verification module, reducing the computational load of subsequent modules. Second, the system generates highly structured logical data through the thought chain customization module, providing standardized input for database construction. Third, the lightweight database constructed by the system stores only logical association information, achieving extreme compression while maintaining the integrity of business logic, with a single logical query response time ≤10ms. Fourth, the system supports flexible output methods, with output content strictly adhering to the logical constraints in the database, achieving a 100% standardization rate. Fifth, the system is scalable; when business specifications are added or modified, only the logical rule records need to be adjusted, without reconstructing the entire database, reducing expansion costs by more than 90%.
Description of the Drawings
[0021] Figure 1 is a flowchart of the business specification generation method based on specific language rules and thought chains proposed in an embodiment of the present invention; Figure 2 is a detailed flowchart of syntax validation and filtering provided in an embodiment of the present invention; Figure 3 is a detailed flowchart of the high-specification output method based on a lightweight database provided in an embodiment of the present invention; Figure 4 is a schematic diagram of the core data table structure of the lightweight database provided in an embodiment of the present invention.
Detailed Implementation
[0022] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0023] Traditional databases require storing all content, resulting in terabyte-level storage overhead and minute-level response latency; general-purpose thought chain reasoning is redundant and leads to bloated output size; existing lightweight solutions fail to effectively separate content from logic; and the output stage lacks unified constraints based on a logic library, resulting in a lack of standardization. This invention aims to provide a method and system capable of compressing massive amounts of business data into a lightweight database containing only logical relationships, and achieving fast, highly standardized output based on this database.
[0024] This invention belongs to the technical fields of Domain-Specific Language (DSL) applications, thought chain reasoning, and lightweight database construction. Specifically, it relates to a method and system for generating business specifications based on DSL rules and thought chains. Pre-defined DSL rule constraints are used to generate a simplified COT thought chain, which abstracts business data into a lightweight database containing only logical relationships; business content is then output from this database in a highly standardized and efficient manner. This invention is applicable to database construction and standardized output scenarios in various business domains, such as software repositories, code repositories, and document repositories.
[0025] Example 1: This embodiment provides a method for generating business specifications based on specific language rules and thought chains, using the automotive electronics field as an example. It should be understood that the automotive electronics field is merely illustrative; the method of this invention is equally applicable to any field with standardized business data, such as aerospace, medical devices, industrial control, and software development.
[0026] Step S1: Obtain the DSL rule set for the target business domain.
[0027] The DSL rule set for the target business domain consists of two parts: ANTLR4 semantic rules and Jinja2 language template rules. The DSL rule set can be generated in advance by integrating ANTLR4 semantic rules with Jinja2 language templates and combining them with business specifications.
[0028] In the field of automotive electronics, the ANTLR4 semantic rules define the semantic verification grammar for Electronic Control Unit (ECU) parameters. For example, for the ECU parameter "VCU maximum speed 120 km/h", its semantic verification grammar can be defined as: ECU Parameter → Parameter Name + Value + Unit.
[0029] The parameter name is a string, the value is an integer or floating-point number, and the unit is drawn from a preset set of physical units (such as km/h, V, A, and kW). Jinja2 language template rules define the standardized structure of business documents, such as the document templates specified in the ISO 26262 functional safety standard.
[0030] Step S2: Perform syntax validation and filtering based on ANTLR4 semantic rules.
[0031] Based on the ANTLR4 semantic rules in the DSL rule set, a parser is constructed. Specifically, the ANTLR4 tool is used to define the lexical rule file and grammar rule file for the target business domain, and the corresponding lexer and parser are generated using the ANTLR4 tool.
[0032] When parsing the full input business data, the lexical analyzer first converts the input character stream into a lexical unit stream, and the syntax analyzer then constructs an Abstract Syntax Tree (AST) from the lexical unit stream based on the context-free grammar. Only the AST nodes that pass the syntax check are used as input to the subsequent thought chain inference engine, filtering out content that does not conform to the ANTLR4 semantic rules, thus reducing the thought chain inference load from the source.
[0033] For example, when the input ECU raw data fragment is "VCU maximum speed 120 km/h", the parser verifies that it conforms to the semantic rule of "parameter name + value + unit", generates the corresponding abstract syntax tree node, and retains it. When the input fragment is an incomplete expression such as "VCU maximum speed undetermined" or "VCU maximum speed", the parser determines that it does not conform to the semantic rule and discards it without further processing. In this embodiment, the ANTLR4 parser is used to verify 300GB of ECU raw data, filtering out approximately 80% of the data as redundant or non-compliant content.
[0034] Step S3: Construct a template for the reasoning steps of the thought chain.
[0035] Based on the Jinja2 language template rules in the DSL rule set, a thought chain reasoning step template is constructed. This thought chain reasoning step template limits the thought chain reasoning path to revolving only around extracting business logic and business relationships, avoiding irrelevant reasoning steps and compressing the length of the thought chain.
[0036] Step S4: Using the thought chain reasoning step template, perform simplified thought chain reasoning on the verified abstract syntax tree node.
[0037] The specific implementation involves designing a Jinja2-formatted thought chain reasoning step template, including a template structure for logical extraction, conditional judgment, and loop traversal. Different reasoning steps are defined for different types of business logic, including reference relationship extraction, parameter constraint extraction, module association extraction, and specification mapping extraction.
[0038] The thought chain reasoning engine strictly follows this template to perform reasoning, and each step of the reasoning outputs a structured result in a preset format. This preset format includes logical type fields, source object fields, target object fields, and association rule fields. For example, a typical reasoning output is: {Logical type: reference, source object: ECU design document, target object: BMS test report, association rule: corresponding to section 3.2}.
[0039] Actual testing shows that the thought chain based on the Jinja2 inference template is about 60% shorter than that of the general COT, and the volume of the inference output is reduced by more than 99% compared to the general COT.
[0040] Step S5: Convert the structured logical data into a lightweight database.
[0041] The thought chain reasoning engine is invoked to traverse all business data, strictly adhering to the structured format of "logical type - associated object - association rule," extracting only core logical information and completely stripping away specific text and code content. The structured logical results output by the thought chain are converted into JSON files using Python's json library.
[0042] The JSON file is imported into a relational database (such as PostgreSQL) or a graph database (such as Neo4j). Only three core tables are created in the database, and none of them has a field for storing actual content. These three core tables are the logical node table (t_logic_node), the association rule table (t_relation_rule), and the specification mapping table (t_standard_mapping). Their field definitions are consistent with the structure shown in Figure 4.
[0043] In this embodiment, the logical results output by the thought chain are converted into a JSON file containing 100,000 logical nodes, with an original file size of approximately 200MB. After import into a PostgreSQL database and index optimization, the final database size is 900MB. Through the design of "content stripping + structured storage + index optimization," 300GB of full business data is compressed into a 900MB lightweight database. The compression ratio C is calculated as follows: $C = \left(1 - \frac{V_{\text{light}}}{V_{\text{full}}}\right) \times 100\%$, where $V_{\text{light}}$ is the volume of the lightweight database and $V_{\text{full}}$ is the volume of the full business data. The core feature of the lightweight database is that it stores the concise logical thought chains of the business domain, fully matching business specifications and standards, with a single logical query response time of ≤10ms.
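The compression figure can be checked against the sizes quoted in this embodiment. The short Python sketch below assumes the ratio is defined as one minus the lightweight database size divided by the full data size, and assumes decimal units (1 GB = 1000 MB); the function name is illustrative.

```python
def compression_ratio(light_bytes: float, full_bytes: float) -> float:
    """Return the percentage reduction from the full data volume to the lightweight database."""
    if full_bytes <= 0:
        raise ValueError("full_bytes must be positive")
    return (1.0 - light_bytes / full_bytes) * 100.0


MB = 10**6  # decimal megabyte (assumption)
GB = 10**9  # decimal gigabyte (assumption)

# 900MB lightweight database versus 300GB of full business data.
ratio = compression_ratio(900 * MB, 300 * GB)
print(f"{ratio:.1f}%")  # prints 99.7%
```

With binary units (1 GB = 1024 MB) the result differs only in the second decimal place, so the stated 99.7% holds under either convention.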
[0044] Step S6: Achieve high-standard output based on a lightweight database.
[0045] Based on the logical constraints stored in a lightweight database, this invention responds to user output requests and generates target output content that conforms to business specifications. This invention provides two output methods, which will be described in detail in subsequent embodiments.
[0046] Example 2: Method for constructing a syntax parser.
[0047] This embodiment details the construction process of the syntax parser. First, the ANTLR4 tool is used to write a lexical rule file (.g4 file) for the target business domain, defining all legal lexical units in the business domain. In the automotive electronics field, lexical rules include parameter names (consisting of letters, numbers, and underscores), numerical values (integers or floating-point numbers), units (km/h, V, A, kW, etc.), and separators.
[0048] Second, the syntax rules are written in the same .g4 file to define the context-free grammar of the business data. For example, the syntax rules for ECU parameters can be defined as follows:
    ecu_param: param_name value unit;
    param_name: IDENTIFIER;
    value: INT | FLOAT;
    unit: 'km/h' | 'V' | 'A' | 'kW';
Third, the ANTLR4 tool compiles the .g4 file to generate code in the target programming language (such as Java or Python), resulting in a lexical analyzer and a syntax analyzer. Fourth, at runtime, the lexical analyzer receives the input business data character stream, then identifies and outputs a stream of lexical units; the syntax analyzer receives the lexical unit stream and constructs an abstract syntax tree based on the grammar rules. Fifth, only nodes that pass syntax verification (i.e., successfully construct an abstract syntax tree) are used as input to the thought chain inference engine; data fragments that fail verification are discarded directly. This method filters out more than 80% of redundant and non-compliant content at the source, significantly reducing the computational load on the thought chain inference engine.
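The lexer-then-parser flow of this embodiment can be illustrated without the ANTLR4 toolchain. The sketch below is a hand-written stand-in built on Python's `re` module, not ANTLR4-generated code and not the ANTLR4 runtime API; the token names follow the grammar rules for ECU parameters, while the input `VCU_max_speed 120 km/h`, the `EcuParam` class, and the function names are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (token type, matched text)

# Lexical rules. Order matters: FLOAT before INT, UNIT before IDENTIFIER.
# The lookahead keeps a unit such as "V" from matching the start of "VCU_max_speed".
TOKEN_SPEC = [
    ("FLOAT", r"\d+\.\d+"),
    ("INT", r"\d+"),
    ("UNIT", r"(?:km\s*/\s*h|kW|V|A)(?![A-Za-z0-9_])"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("WS", r"\s+"),
]
LEXER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text: str) -> Optional[List[Token]]:
    """Convert the character stream into a token stream; None on a lexical error."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        match = LEXER.match(text, pos)
        if match is None:
            return None  # character not covered by any lexical rule
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


@dataclass
class EcuParam:
    """AST node for the rule: ecu_param: param_name value unit;"""
    param_name: str
    value: float
    unit: str


def parse_ecu_param(text: str) -> Optional[EcuParam]:
    """Build an AST node when the fragment matches the grammar, else return None."""
    tokens = tokenize(text)
    if tokens is None:
        return None
    kinds = [kind for kind, _ in tokens]
    if kinds not in (["IDENTIFIER", "INT", "UNIT"], ["IDENTIFIER", "FLOAT", "UNIT"]):
        return None  # fragment is filtered out
    (_, name), (_, number), (_, unit) = tokens
    return EcuParam(name, float(number), re.sub(r"\s+", "", unit))


print(parse_ecu_param("VCU_max_speed 120 km/h"))
print(parse_ecu_param("VCU_max_speed undetermined"))
```

Fragments that return `None` correspond to the discarded data; only the `EcuParam` nodes would be handed on to the reasoning stage.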
[0049] Example 3: Method for constructing a template for reasoning steps in a thought chain.
[0050] This embodiment details the construction of a thought chain reasoning step template. First, based on Jinja2 language template rules, the overall structure of the template is designed, including loop traversal of abstract syntax tree nodes and conditional judgments for different logical types. An example of a thought chain reasoning step template in the automotive electronics field is as follows:
    {% for node in ast_nodes %}
    {% if node.type == "reference" %}
    Extract the reference relationship between {{node.source}} and {{node.target}}, and associate it with the location: {{node.location}}
    {% elif node.type == "parameter constraint" %}
    Extract the logical constraints of parameter {{node.param_name}}: {{node.constraint_expr}}, based on: {{node.basis}}
    {% elif node.type == "module association" %}
    Extract the association rule between {{node.module_A}} and {{node.module_B}}: {{node.rule_desc}}
    {% elif node.type == "canonical mapping" %}
    Extract the mapping relationship between {{node.content}} and the specification {{node.standard_id}}
    {% endif %}
    {% endfor %}
[0051] Second, for reference relationship types, a reasoning step template is defined to extract the reference relationship and associated position between the source object and the target object. Third, for parameter constraint types, a reasoning step template is defined to extract the parameter name, logical constraint expression, and constraint basis. Fourth, for module association types, a reasoning step template is defined to extract the names of the two associated modules and the description of the association rule. Fifth, for canonical mapping types, a reasoning step template is defined to extract the mapping relationship between content nodes and canonical identifiers. Each reasoning step outputs a unified structured format: logical type field, source object field, target object field, and association rule field. Through the precise constraint of the thought chain reasoning path using Jinja2 templates, the thought chain length is compressed by 60% compared to the general COT, and the size of the reasoning output result is reduced by more than 99%.
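The branching of the reasoning step template and its unified four-field output can be mirrored in plain Python. This is a sketch of the dispatch logic only, not a Jinja2 rendering and not the reasoning engine itself; the way each node attribute is assigned to `source`, `target`, and `rule` (for example, placing the constraint basis in `target`) and the fixed rule text for mappings are assumptions made for illustration.

```python
from typing import Dict, List, Optional

Node = Dict[str, str]


def extract_step(node: Node) -> Optional[Dict[str, str]]:
    """Apply the reasoning step that matches the node type; None for unknown types."""
    kind = node.get("type")
    if kind == "reference":
        return {"logic_type": "reference", "source": node["source"],
                "target": node["target"], "rule": node["location"]}
    if kind == "parameter constraint":
        # Assumption: the constraint basis is recorded as the target object.
        return {"logic_type": "constraint", "source": node["param_name"],
                "target": node["basis"], "rule": node["constraint_expr"]}
    if kind == "module association":
        return {"logic_type": "association", "source": node["module_A"],
                "target": node["module_B"], "rule": node["rule_desc"]}
    if kind == "canonical mapping":
        # Assumption: a fixed rule text, since the mapping node carries no rule field.
        return {"logic_type": "mapping", "source": node["content"],
                "target": node["standard_id"], "rule": "maps to specification"}
    return None


def run_chain(ast_nodes: List[Node]) -> List[Dict[str, str]]:
    """Loop over the AST nodes and keep only the structured results."""
    results = []
    for node in ast_nodes:
        result = extract_step(node)
        if result is not None:
            results.append(result)
    return results


nodes = [
    {"type": "reference", "source": "ECU design document",
     "target": "BMS test report", "location": "corresponding to section 3.2"},
    {"type": "free text"},  # hypothetical node of an unhandled type, dropped
]
print(run_chain(nodes))
```

Because every branch returns the same four keys, downstream serialization does not need to know which reasoning step produced a record.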
[0052] Example 4: Structured Logical Data Serialization Format.
This embodiment describes in detail the data serialization format of structured logical data. In step S4, the structured logical results output by the thought chain reasoning are converted into a JSON file. Each node is a structured result in the preset format, containing the following fields:
The `logic_id` field: a unique identifier for the logical record, which can be a Universally Unique Identifier (UUID) or an auto-incrementing sequence.
The `logic_type` field: the logical type, with values including "reference," "constraint," "association," and "mapping."
The `source` field: the source object, used to identify the starting point of the logical relationship; it can be a document name, module name, or parameter name.
The `target` field: the target object, used to identify the ending point of the logical relationship; it can be a document name, module name, parameter name, or specification identifier.
The `rule` field: the core logical rule, using structured or semi-structured text to describe the logical relationship between the source and target objects, without containing specific text content.
The `business_standard` field: the associated business specification identifier, such as "ISO 26262-6" or "IEC 61508".
A typical node example is as follows: {"logic_id": "L001", "logic_type": "reference", "source": "VCU design document", "target": "MC_V3.2 Motor Control Module Parameter Table", "rule": "Voltage parameter constraints must be referenced in the Functional Safety section", "business_standard": "ISO 26262-6"}.
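A record in this serialization format can be validated and round-tripped through Python's standard `json` module, as in the minimal sketch below. The `LogicRecord` class and the `ALLOWED_TYPES` check are illustrative assumptions; the field names and the sample values are taken from the node example in this embodiment.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_TYPES = {"reference", "constraint", "association", "mapping"}


@dataclass
class LogicRecord:
    logic_id: str
    logic_type: str
    source: str
    target: str
    rule: str
    business_standard: str

    def __post_init__(self) -> None:
        # Reject records whose logical type is outside the four listed values.
        if self.logic_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown logic_type: {self.logic_type}")


record = LogicRecord(
    logic_id="L001",
    logic_type="reference",
    source="VCU design document",
    target="MC_V3.2 Motor Control Module Parameter Table",
    rule="Voltage parameter constraints must be referenced in the Functional Safety section",
    business_standard="ISO 26262-6",
)

# Serialize a list of records to JSON text, then load it back.
payload = json.dumps([asdict(record)], ensure_ascii=False, indent=2)
restored = [LogicRecord(**item) for item in json.loads(payload)]
print(payload)
```

Validating `logic_type` at construction time means malformed records fail before they reach the database import step.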
[0053] A unified data serialization format facilitates data parsing, transmission, and import into databases, achieving a high degree of structure and lightweighting of logical information.
[0054] Example 5: Lightweight Database Conversion Method.
[0055] This embodiment describes in detail the conversion process of the lightweight database. First, the generated JSON file is parsed to extract all logical node records, association rule records, and canonical mapping records.
[0056] Second, create three core data tables in a relational database (such as PostgreSQL). The field definitions of these three tables are exactly the same as those shown in Figure 4. The logical node table (t_logic_node) contains the fields node_id (unique node identifier, primary key), node_type (node type, with values such as document, module, parameter), node_name (node name), business_domain (business domain), and created_at (creation time). This table is used to store the unique identifier, type, and business domain of a document, module, or parameter.
[0057] The association rule table (t_relation_rule) contains the following fields: `relation_id` (unique identifier for the association, primary key), `source_node_id` (source node ID, foreign key to `node_id` in the `t_logic_node` table), `target_node_id` (target node ID, foreign key to `node_id` in the `t_logic_node` table), `relation_type` (association type, values such as: reference, constraint, association, mapping), `logic_rule` (logical rule description text), and `relation_strength` (association strength, a floating-point number between 0 and 1). This table stores the association type, logical rule, and association strength between nodes.
[0058] The specification mapping table (t_standard_mapping) contains the fields mapping_id (unique mapping identifier, primary key), node_id (node ID, foreign key related to node_id in the t_logic_node table), standard_id (specification identifier, such as ISO26262), standard_chapter (specification chapter number), and mapping_desc (mapping description text). This table is used to store the mapping relationship between nodes and business specifications.
[0059] None of the three tables mentioned above have any fields for storing specific text content (such as full-text storage fields of type BLOB, CLOB, or TEXT), and only store structured logical metadata.
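The three tables described above can be sketched as follows. This is an illustrative sketch only: it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the column types, lengths, and CHECK constraints are assumptions, since the text gives only the field names and their meanings.

```python
import sqlite3

# Column types and lengths are assumptions; only the table and field names
# are taken from the description.
SCHEMA = """
CREATE TABLE t_logic_node (
    node_id         INTEGER PRIMARY KEY,
    node_type       VARCHAR(32)  NOT NULL,  -- e.g. document, module, parameter
    node_name       VARCHAR(255) NOT NULL,
    business_domain VARCHAR(64),
    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE t_relation_rule (
    relation_id       INTEGER PRIMARY KEY,
    source_node_id    INTEGER NOT NULL REFERENCES t_logic_node(node_id),
    target_node_id    INTEGER NOT NULL REFERENCES t_logic_node(node_id),
    relation_type     VARCHAR(32) NOT NULL
        CHECK (relation_type IN ('reference', 'constraint', 'association', 'mapping')),
    logic_rule        VARCHAR(1024),
    relation_strength REAL CHECK (relation_strength BETWEEN 0 AND 1)
);
CREATE TABLE t_standard_mapping (
    mapping_id       INTEGER PRIMARY KEY,
    node_id          INTEGER NOT NULL REFERENCES t_logic_node(node_id),
    standard_id      VARCHAR(64) NOT NULL,
    standard_chapter VARCHAR(32),
    mapping_desc     VARCHAR(1024)
);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the three logical tables with foreign-key enforcement enabled."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

The foreign keys and the range check on `relation_strength` let the database itself reject relations that point at unknown nodes or carry an out-of-range strength.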
[0060] Third, the data import operation is performed, inserting the records from the JSON file into the corresponding data tables. Fourth, indexes are created for each data table to optimize query performance. For example, a full-text index is created on the `node_name` field of the logical node table, B-tree indexes on the `source_node_id` and `target_node_id` fields of the association rule table, and a hash index on the `standard_id` field of the specification mapping table. With a data model that stores only logical association information rather than specific content, combined with a reasonable index optimization strategy, 300 GB of full business data was compressed into a lightweight database of 900 MB, with a single logical query response time of ≤10 ms.
[0061] Example 6: Output method combining a large language model with the lightweight database.
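The compression figure quoted above can be checked with simple arithmetic. The short sketch below computes the relative size reduction from 300 GB to 900 MB; the function name `size_reduction` and the choice between decimal and binary units are illustrative assumptions.

```python
FULL_SIZE_GB = 300    # full business data, from the description
LIGHT_SIZE_MB = 900   # lightweight database, from the description


def size_reduction(full_gb: float, light_mb: float, mb_per_gb: int = 1000) -> float:
    """Return the fraction of storage saved, e.g. 0.997 for a 99.7% reduction."""
    return 1 - light_mb / (full_gb * mb_per_gb)


decimal_units = size_reduction(FULL_SIZE_GB, LIGHT_SIZE_MB)       # 1 GB = 1000 MB
binary_units = size_reduction(FULL_SIZE_GB, LIGHT_SIZE_MB, 1024)  # 1 GB = 1024 MB
print(f"{decimal_units:.1%}", f"{binary_units:.1%}")
```

Under either unit convention the result rounds to 99.7%, which agrees with the database size reduction stated in the summary.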
[0062] This embodiment describes in detail the output method of combining a large language model with a lightweight database. The first sub-step: Receiving user output requests. Users can input their requests via natural language, such as "Generate VCU 2024 version functional safety document".
[0063] The second sub-step involves calling the lightweight database's query interface. This parses keywords from the user's requirements (such as "VCU" and "functional safety"), constructs an SQL query, and extracts all business logic records relevant to the requirements from the logical node table, association rule table, and specification mapping table. Example query: SELECT r.* FROM t_relation_rule r JOIN t_logic_node n ON r.source_node_id = n.node_id WHERE n.node_name LIKE '%VCU%' AND r.relation_type IN ('reference', 'constraint', 'mapping').
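A minimal sketch of this query step is given below, assuming the keyword has already been extracted from the user's request. It uses Python's built-in `sqlite3` module in place of PostgreSQL, a reduced two-table schema, and made-up sample rows; the function name `find_relations` is hypothetical. Bound parameters are used instead of string concatenation so that user-supplied keywords cannot alter the SQL.

```python
import sqlite3


def find_relations(conn, keyword, relation_types):
    """Return association rules whose source node name contains the keyword."""
    placeholders = ", ".join("?" for _ in relation_types)
    sql = (
        "SELECT r.* FROM t_relation_rule r "
        "JOIN t_logic_node n ON r.source_node_id = n.node_id "
        f"WHERE n.node_name LIKE ? AND r.relation_type IN ({placeholders}) "
        "ORDER BY r.relation_id"
    )
    # Keyword and types are passed as bound parameters, never interpolated.
    return conn.execute(sql, [f"%{keyword}%", *relation_types]).fetchall()


# Illustrative sample data only; not taken from the disclosure.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_logic_node (node_id INTEGER PRIMARY KEY, node_name VARCHAR(255));
CREATE TABLE t_relation_rule (
    relation_id INTEGER PRIMARY KEY, source_node_id INTEGER, target_node_id INTEGER,
    relation_type VARCHAR(32), logic_rule VARCHAR(1024), relation_strength REAL);
INSERT INTO t_logic_node VALUES
    (1, 'VCU design document'), (2, 'MC_V3.2 parameter table'), (3, 'BMS design document');
INSERT INTO t_relation_rule VALUES
    (1001, 1, 2, 'reference', 'reference voltage parameter constraints', 0.9),
    (1002, 1, 2, 'association', 'shares voltage parameters', 0.5),
    (1003, 3, 2, 'constraint', 'cell voltage limit', 0.7);
""")

rows = find_relations(conn, "VCU", ["reference", "constraint", "mapping"])
```

With the sample rows above, only rule 1001 matches: rule 1002 has an excluded relation type and rule 1003 originates from a node whose name does not contain "VCU".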
[0064] The third sub-step involves converting the extracted business logic records into prompt constraints for the large language model. The prompt constraints are formatted in a structured manner, listing document reference requirements, parameter constraints, and specification compliance requirements in sequence. Each requirement is annotated with the specific record ID from the lightweight database. The format of the prompt constraints is as follows: 1. The document sections must reference the voltage parameter constraints of the motor control module MC_V3.2 (from DB association rule table ID: 1001); 2. Functional safety parameters must comply with the requirements of Chapter 6 of ISO 26262 (from DB specification mapping table ID: 2003); 3. The document structure must match the Jinja2 template: Chapter 1 - Basic Information, Chapter 2 - Parameter Constraints, Chapter 3 - Specification Compliance (from DB logical node table ID: 3005).
[0065] The fourth sub-step involves inputting the prompt constraints into the large language model as the system prompt. The large language model then generates specific content within the framework of the thought chain logical constraints. Because the output of the large language model is strictly limited to the preset logical constraints, the generated content will not have any missing specifications or logical errors. Actual testing shows that the output content of the large language model based on lightweight database constraints achieves a 100% standardization rate, and the response speed is more than 80% faster than calling the full database. In the automotive electronics VCU document generation scenario, the response time has been reduced from 5 minutes to less than 1 minute.
[0066] Example 7: Output method that calls the lightweight database directly.
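The conversion from database records to numbered prompt constraints can be sketched as follows. The `LogicRecord` class and the `build_prompt_constraints` function are hypothetical names used for illustration; the requirement texts and record IDs are the ones quoted in the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicRecord:
    """One extracted business logic record (hypothetical structure)."""
    table: str        # human-readable name of the originating table
    record_id: int    # record ID in the lightweight database
    requirement: str  # requirement text derived from the stored rule


def build_prompt_constraints(records):
    """Number each requirement and annotate it with its database record ID."""
    return "\n".join(
        f"{i}. {rec.requirement} (from DB {rec.table} ID: {rec.record_id})"
        for i, rec in enumerate(records, start=1)
    )


records = [
    LogicRecord("association rule table", 1001,
                "The document sections must reference the voltage parameter "
                "constraints of the motor control module MC_V3.2"),
    LogicRecord("specification mapping table", 2003,
                "Functional safety parameters must comply with the requirements "
                "of Chapter 6 of ISO 26262"),
]

prompt = build_prompt_constraints(records)
print(prompt)
```

Keeping the record ID next to each requirement makes every line of generated output traceable to a specific row in the lightweight database.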
[0067] This embodiment describes in detail the output method of directly calling the lightweight database. The first sub-step is: receiving the user's output request, calling the lightweight database's query interface, and extracting the logical rules related to the output request and the corresponding Jinja2 template identifiers.
[0068] The second sub-step involves calling the Jinja2 template engine, passing the extracted logic rules as template context variables, and rendering the standardized output template. The rendered result is an empty template containing placeholders, as shown in the example below: ECU configuration documentation; Parameter name: {{param_name}}; Constraint rule: {{constraint_rule}}; Standard basis: {{standard_basis}}.
[0069] The third sub-step involves presenting the empty template to the user. The user, or an automated tool, then fills in the placeholders according to the actual situation to obtain the final output content. This method is suitable for high-standard scenarios requiring manual review, such as generating airworthiness documents in the aerospace industry or functional safety documents in the automotive electronics industry. By combining template rendering with manual review, it ensures that the final output content not only conforms to the logical constraints of the lightweight database but has also been manually verified, making it suitable for scenarios with extremely high requirements for output specifications.
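The placeholder-filling step can be sketched as below. To stay self-contained, this sketch uses a regular expression from the Python standard library rather than the Jinja2 engine itself, so it only imitates the `{{name}}` placeholder syntax. The function names and the filled-in values are illustrative assumptions.

```python
import re

# Matches {{name}} placeholders, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

EMPTY_TEMPLATE = (
    "ECU configuration documentation; "
    "Parameter name: {{param_name}}; "
    "Constraint rule: {{constraint_rule}}; "
    "Standard basis: {{standard_basis}}."
)


def list_placeholders(template):
    """Return the placeholder names in order of appearance."""
    return PLACEHOLDER.findall(template)


def fill_placeholders(template, values):
    """Fill every placeholder; refuse to produce output if any value is missing."""
    missing = [name for name in list_placeholders(template) if name not in values]
    if missing:
        raise KeyError(f"unfilled placeholders: {missing}")
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), template)


# Illustrative values a reviewer might supply.
filled = fill_placeholders(EMPTY_TEMPLATE, {
    "param_name": "max_speed",
    "constraint_rule": "<= 120 km/h",
    "standard_basis": "ISO 26262-6",
})
```

Raising an error on missing values, rather than leaving a placeholder in place, is one way to ensure that a partially reviewed document is never emitted as final output.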
[0070] Example 8: Specific applications in the field of automotive electronics.
[0071] This embodiment uses the automotive electronics field as an example to illustrate the specific application parameters of the present invention. In the automotive electronics field, the ANTLR4 semantic rules define the semantic verification grammar for ECU parameters, expressed as: ECU parameter → parameter name + value + unit. The parameter name is an identifier composed of letters, digits, and underscores; the value is an integer or floating-point number; and the unit is drawn from a preset set of physical units (including km/h, V, A, kW, Nm, rpm, etc.). The Jinja2 language template rules are designed based on automotive electronics standards such as the ISO 26262 functional safety standard and the ISO 21434 cybersecurity standard. After the lightweight database is built, the stored thought chain logic fully matches the requirements of the ISO 26262 standard. When generating VCU functional safety documents, the large language model generates documents under the logical constraints of the lightweight database, achieving a 100% compliance rate.
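The ECU parameter rule above can be approximated with a short validator. This sketch uses a Python regular expression as a stand-in for an ANTLR4-generated parser, and it assumes a whitespace-separated input such as `max_speed 120 km/h`; the input layout, the optional sign on the value, and the function name `parse_ecu_param` are assumptions, since the text specifies only the three components and their character classes.

```python
import re

# Unit set listed in the description (the original list ends with "etc.").
UNITS = ("km/h", "V", "A", "kW", "Nm", "rpm")

ECU_PARAM = re.compile(
    r"(?P<name>[A-Za-z0-9_]+)\s+"            # letters, digits, underscores
    r"(?P<value>[+-]?\d+(?:\.\d+)?)\s*"      # integer or floating-point number
    r"(?P<unit>" + "|".join(re.escape(u) for u in UNITS) + r")"
)


def parse_ecu_param(text):
    """Return (name, value, unit) if the text is compliant, otherwise None."""
    match = ECU_PARAM.fullmatch(text.strip())
    if match is None:
        return None  # non-compliant content is filtered out
    return match["name"], float(match["value"]), match["unit"]


compliant = parse_ecu_param("max_speed 120 km/h")
rejected = parse_ecu_param("max_speed 120 mph")  # unit not in the preset set
```

A real ANTLR4 grammar would additionally produce abstract syntax tree nodes for later reasoning steps; the sketch only shows the accept-or-filter decision.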
[0072] Example 9: Data format and logical information type.
[0073] This embodiment illustrates the storage formats and logical information types of the full business data. The storage formats of the full business data include, but are not limited to: text documents (such as TXT, DOC, and PDF formats), code files (such as C, Python, and Java source code), and configuration files (such as JSON, XML, and YAML formats). Business logic and logical relationship information include the following four types: first, document reference relationships, such as "Chapter 5 of document A references Chapter 3 of document B"; second, parameter logical constraints, such as "VCU maximum speed ≤ 120 km/h"; third, inter-module association rules, such as "voltage parameters of the motor control module are associated with the battery management module"; fourth, business specification mapping relationships, such as "ECU configuration parameters must comply with Chapter 6 of ISO 26262". This invention is applicable to any field with standardized business data and is not limited to the automotive electronics field.
[0074] Example 10: A business specification generation system based on specific language rules and thought chains.
[0075] This embodiment provides a system for constructing a lightweight business logic database and producing standardized output, based on domain-specific language rules and simplified thought chains. The system includes the following modules: DSL rule acquisition module: used to acquire the DSL rule set of the target business domain, which includes ANTLR4 semantic rules and Jinja2 language template rules.
[0076] Syntax validation module: Connected to the DSL rule acquisition module, it is used to build a syntax parser based on the ANTLR4 semantic rules, perform semantic compliance validation on all input business data, filter out content that does not conform to the ANTLR4 semantic rules, and obtain the validated abstract syntax tree nodes.
[0077] Thought chain customization module: Connected to the syntax validation module, it is used to construct a thought chain reasoning step template based on the Jinja2 language template rules, and to perform simplified thought chain reasoning on the validated abstract syntax tree nodes using the thought chain reasoning step template, outputting structured logical data containing only logical types, associated objects, and association rules.
[0078] Database construction module: Connected to the thought chain customization module, it is used to convert the structured logical data into a lightweight database. The lightweight database only stores business logic and logical relationship information, and does not store the specific content of the full business data.
[0079] Standardized output module: Connected to the database construction module, it is used to respond to user output requirements based on the logical constraints stored in the lightweight database and generate target output content that conforms to business specifications.
[0080] This system can be deployed on a single server or using a distributed architecture. Data interaction between modules can be achieved through application programming interfaces (APIs) or message queues. The system can execute the complete process described in any of the aforementioned method embodiments, achieving the same technical effect: realizing a fundamental breakthrough in business data storage from "full content storage" to "logical storage only," achieving 100% output standardization, improving response speed by more than 80%, and possessing good scalability.
[0081] Example embodiments have been disclosed herein, and while specific terminology has been used, it is for illustrative purposes only and should be construed as such, and is not intended to be limiting. In some instances, it will be apparent to those skilled in the art that features, characteristics, and / or elements described in conjunction with particular embodiments may be used alone, or in combination with features, characteristics, and / or elements described in conjunction with other embodiments, unless otherwise expressly indicated. Therefore, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the invention as set forth in the appended claims.
Claims
1. A method for generating business specifications based on specific language rules and thought chains, characterized in that the method comprises:
S1. Obtaining the domain-specific language (DSL) rule set of the target business domain, which includes ANTLR4 semantic rules and Jinja2 language template rules;
S2. Based on the ANTLR4 semantic rules, constructing a syntax parser, using the syntax parser to perform semantic compliance verification on all input business data, filtering out content that does not conform to the ANTLR4 semantic rules, and obtaining the abstract syntax tree nodes that pass the verification;
S3. Based on the Jinja2 language template rules, constructing a thought chain reasoning step template, wherein the thought chain reasoning step template restricts the thought chain reasoning path to extracting business logic and business relationships only;
S4. Using the thought chain reasoning step template, performing simplified thought chain reasoning on the verified abstract syntax tree nodes, and outputting structured logical data containing only logical types, associated objects, and association rules;
S5. Converting the structured logical data into a lightweight database, which stores only logical constraints, the logical constraints including business logic and logical association information;
S6. Based on the logical constraints stored in the lightweight database, responding to user output requirements and generating target output content that conforms to business specifications.
2. The method according to claim 1, characterized in that, S2 specifically includes: Use the ANTLR4 tool to define the lexical rule file and syntax rule file for the target business domain, and generate the corresponding lexical analyzer and syntax analyzer; The parser parses the input business data fragments and generates an abstract syntax tree; Only the abstract syntax tree nodes that pass the syntax validation are used as input to the thought chain reasoning engine.
3. The method according to claim 1, characterized in that, S3 specifically includes: Based on the Jinja2 language template rules, a thought chain reasoning step template including logical extraction condition judgment and loop traversal is designed. In the thought chain reasoning step template, corresponding reasoning steps are defined for different types of business logic. The types of reasoning steps include reference relationship extraction, parameter constraint extraction, module association extraction, and specification mapping extraction. Each reasoning step outputs a structured result in a preset format, which includes a logical type field, a source object field, a target object field, and an association rule field.
4. The method according to claim 3, characterized in that, The structured logical data in S4 is organized in a data serialization format. Each node is a structured result in the preset format, containing the following fields: A logical unique identifier field, used to uniquely identify a logical record; Logical type fields can take values including at least one of the following: reference, constraint, association, and mapping. The source object field is used to identify the starting point of a logical relationship, which may include a document, module, or parameter; The target object field is used to identify the endpoint of the logical relationship; The logical rules field is used to describe the core logical rules between the source object and the target object; The business specification field is used to record the business specification identifier associated with this logic.
5. The method according to claim 4, characterized in that, In step S5, the structured logical data is converted into a lightweight database, specifically including: Import the structured logical data in the data serialization format into a relational database or graph database; In the relational database or graph database, construct a logical node table, an association rule table, and a specification mapping table. The logical node table stores the unique identifier, type, and business domain of a document, module, or parameter. The association rule table stores the association type, logical rules, and association strength between nodes. The specification mapping table stores the mapping relationship between nodes and business specifications. The logical node table, the association rule table, and the specification mapping table do not have fields for storing specific text content.
6. The method according to claim 1, characterized in that, S6 specifically includes: Receive user output requirements, call the query interface of the lightweight database, and extract all business logic records related to the user output requirements; The extracted business logic records are converted into prompt constraints of the large language model. The prompt constraints include document reference requirements, parameter constraint requirements, and specification compliance requirements. The prompt constraints are input into the large language model, so that the large language model generates content that conforms to business specifications under the constraints of the prompt constraints.
7. The method according to claim 1, characterized in that, S6 specifically includes: Receive user output requirements, call the query interface of the lightweight database, and extract the logical rules and corresponding Jinja2 template identifiers related to the user output requirements; The Jinja2 template engine is invoked to render a standardized output template based on the extracted logical rules, generating an empty template containing parameter name placeholders, constraint rule placeholders, and specification basis placeholders. The system receives specific content from users or automated tools to fill in each placeholder in the empty template, and obtains the final output content that conforms to the logical constraints of the lightweight database.
8. The method according to claim 1, characterized in that, The target business area includes the automotive electronics field, and the ANTLR4 semantic rules include the semantic verification grammar of electronic control unit parameters. The semantic verification grammar of electronic control unit parameters is defined as a combination rule of parameter name, value and unit.
9. The method according to claim 1, characterized in that, The storage format of the full set of business data includes at least one of text documents, code files, and configuration files; the business logic and logical association information includes at least one of document reference relationships, parameter logic constraints, inter-module association rules, and business specification mapping relationships.
10. A business specification generation system based on specific language rules and thought chains, characterized in that the system is used to perform the method according to any one of claims 1-9, the system comprising:
a DSL rule acquisition module, used to acquire a domain-specific language (DSL) rule set for the target business domain, the DSL rule set including ANTLR4 semantic rules and Jinja2 language template rules;
a syntax verification module, used to build a syntax parser based on the ANTLR4 semantic rules, perform semantic compliance verification on all input business data, filter out content that does not conform to the ANTLR4 semantic rules, and obtain the abstract syntax tree nodes that pass the verification;
a thought chain customization module, used to construct a thought chain reasoning step template based on the Jinja2 language template rules, and to perform simplified thought chain reasoning on the verified abstract syntax tree nodes using the thought chain reasoning step template, outputting structured logical data containing only logical types, associated objects, and association rules;
a database construction module, used to convert the structured logical data into a lightweight database, wherein the lightweight database only stores business logic and logical relationship information and does not store the specific content of the full business data;
a standardized output module, used to respond to user output requirements based on the logical constraints stored in the lightweight database and generate target output content that conforms to business specifications.