Engineering design quantity prediction method based on large language model and engineering semantic reasoning
By adopting a three-stage hybrid prediction architecture combining large language models and engineering semantic reasoning, the accuracy and adaptability issues of engineering design workload prediction are solved. This enables high-precision, interpretable quantitative assessment of engineering design tasks, supports continuous learning and system integration, and improves the automation and accuracy of engineering management.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications (China)
- Current Assignee / Owner
- CHINA RAILWAY SHANGHAI DESIGN INST GRP CO LTD
- Filing Date
- 2026-03-06
- Publication Date
- 2026-06-12
AI Technical Summary
Existing engineering design workload prediction methods suffer from insufficient accuracy, poor adaptability, low automation, and weak interpretability in the context of complex multidisciplinary collaboration and rapid technological evolution, especially when predicting new fields, new processes, and new tasks.
A three-stage hybrid prediction architecture based on large language models and engineering semantic reasoning is adopted, comprising retrieval-augmented generation (RAG), instruction-guided large language model reasoning, and engineering rule-driven post-processing. Combined with structured data, it achieves high-precision, interpretable, and evolvable quantitative evaluation of engineering design tasks.
It achieves high-precision prediction of engineering design tasks, improves the model's generalization ability and applicability, and provides interpretable prediction results. It also supports continuous learning with feedback optimization and seamless integration with mainstream engineering management systems, thereby improving the accuracy and automation of predictions.
Figure CN122198228A_ABST
Description
Technical Field
[0001] This invention lies at the intersection of engineering management and artificial intelligence, and in particular relates to a method for predicting engineering design quantities based on large language models and engineering semantic reasoning.
Background Technology
[0002] In engineering design, research and development, and technical service project management, estimating task workload (man-hours, man-days, or costs) scientifically and accurately is a core prerequisite for refined planning, resource optimization, cost control, and performance evaluation. Engineering companies have long relied on traditional work-hour quota assessment methods such as technical measurement, analogical comparison, statistical analysis, and empirical estimation. These methods were effective in their time, but their shortcomings are increasingly apparent as projects grow more complex, multidisciplinary collaboration deepens, and technology evolves rapidly. Technical measurement is costly and time-consuming, and it struggles to cover mental labor and implicit work. Analogical comparison relies on highly similar historical projects and fails when faced with innovative tasks. Statistical analysis reflects only historical efficiency and cannot account for the required quality level or technological progress. Empirical estimation is highly subjective, poorly standardized, and difficult to replicate at scale.
[0003] In the last decade, machine learning methods have emerged that use structured task labels and word embeddings as features, combined with models such as Random Forest and XGBoost, to predict workload. Some companies have managed to keep prediction errors below 8%. However, these methods still have significant limitations:
Superficial semantic understanding: word vectors are static representations that cannot capture the implicit constraints, collaboration costs, and dependencies in engineering statements.
[0004] Feature engineering disconnects semantics from structure: the coupling between text semantics and WBS, stages, and professions is not modeled.
[0005] Weak generalization and cold-start difficulty: prediction fails for new fields, new processes, and new tasks.
[0006] Lack of interpretability and continuous learning mechanisms: the methods cannot explain why a prediction was made and have no feedback loop.
[0007] Low process integration: the methods rely heavily on manual data processing and are difficult to integrate into project management systems for real-time forecasting and dynamic adjustment.
[0008] Although LLMs (such as GPT, Qwen, and Llama) possess powerful semantic understanding, reasoning, and few-shot learning capabilities, they lack mature applications in the field of engineering design workload prediction. Existing LLMs are mostly used for question answering or text generation and have not yet been transformed into engineering-specific intelligent models capable of parsing task complexity, performing workload reasoning, and fusing with structured data. They also lack domain adaptation and collaborative reasoning frameworks.
[0009] Therefore, a new prediction method is urgently needed: one that builds on a large language model and uses deep semantic understanding, domain knowledge modeling, and structured data fusion to deliver an interpretable, evolvable, end-to-end intelligent assessment system for engineering workload, breaking through the bottlenecks of traditional methods in accuracy, adaptability, and automation.
Summary of the Invention
[0010] The purpose of this invention is to address the shortcomings of the existing technology by providing a method for predicting engineering design workload based on a large language model and engineering semantic reasoning. The method targets engineering management scenarios in which plain-text task descriptions are the core input. It combines structured project metadata (such as professional categories, design stages, WBS coding, and planned durations) with a three-stage hybrid prediction architecture that integrates retrieval-augmented generation (RAG), instruction-guided large language model reasoning, and engineering rule-driven post-processing, to achieve a high-precision, interpretable, and evolvable quantitative assessment of engineering design task workload (typically measured in person-hours). The entire prediction process strictly follows the logical chain of "experience anchoring → semantic reasoning → numerical calibration," so that the output retains the generalization capability of AI while remaining reasonable and compliant with engineering practice.
The method breaks through the limitations of traditional word vectors and achieves deep engineering semantic understanding: leveraging the context awareness and semantic reasoning capabilities of the large language model, it accurately identifies the complexity factors implicit in the task description (such as technical difficulty, professional interfaces, constraints, and innovativeness), going beyond keyword matching to a structured analysis of the essential characteristics of the engineering task.
The method constructs a high-precision and robust workload prediction mechanism: by combining instruction tuning and retrieval-augmented generation (RAG), the model can both learn from the experience of similar historical tasks and reasonably generalize to new types of tasks, significantly improving prediction accuracy and applicability.
The method enhances model interpretability and user trust: while outputting the predicted workload, it automatically generates highly readable reasoning, supporting counterfactual analysis and expert review and forming transparent, credible decision support.
The method establishes a closed loop of continuous learning and feedback optimization: a human-in-the-loop mechanism allows users to correct prediction results and annotate the reasons, and the system automatically triggers incremental training or sample re-entry, realizing online model evolution and knowledge accumulation.
The method integrates seamlessly with engineering management systems: it abandons the inefficient mode of relying on manual Excel operations, provides standardized API interfaces or lightweight plugins, and supports direct connection with mainstream systems such as Primavera P6, Microsoft Project, and design quality management platforms, enabling automatic acquisition of task data, real-time write-back of forecast results, and closed-loop business processes.
[0011] The objective of this invention is achieved through the following technical solution: a method for predicting engineering design quantities based on a large language model and engineering semantic reasoning, the prediction method comprising the following steps:
S1: Obtain the structured information of the target task, which includes the task description text and structured metadata, and clean and preprocess the structured information.
S2: Input the preprocessed task description text into a large language model that has been instruction-tuned on an engineering domain corpus to generate the semantic vector and complexity factor of the target task; the complexity factor includes one or more of pressure rating, safety integrity level, technological novelty, number of interfaces, and project background information, and the project background information includes one or more of unit type, project year, and industry category.
S3: Using the semantic vector as a query, retrieve several historical task cases with the highest similarity from the preset historical task vector library. Each historical task case includes a historical task description, actual workload, and historical complexity factor; the historical complexity factor and the complexity factor have the same data structure.
S4: Concatenate the structured information of the target task, the complexity factor, and the retrieved historical task cases into a reasoning prompt according to the prompt word template, and input it into the large language model for contextual reasoning. Parse the model output to obtain the original prediction result containing the predicted workload, confidence interval, and reasoning basis.
S5: Perform regular parsing on the original prediction result and apply preset engineering business rules for numerical calibration to generate a standardized final prediction result. The final prediction result includes the predicted workload, confidence interval, inference summary, case IDs of the historical task cases, and a Boolean flag indicating whether a calibration rule was triggered.
S6: Receive user feedback on the final prediction result, add the feedback samples to the active learning queue, and, based on the samples in the queue, periodically perform LoRA incremental fine-tuning on the large language model, update the historical task vector library, and optimize the prompt word template.
[0012] In step S2, generating the semantic vector specifically includes: The hidden states of a specific network layer of the large language model are extracted and pooled to generate the semantic vector of the target task.
[0013] In step S3, retrieving historical task cases specifically includes: Using the semantic vector of the target task as a query, an approximate nearest neighbor search is performed in the historical task vector library deployed in the vector database to recall the historical task cases with the highest similarity.
[0014] In step S4, the prompt word template is a fixed format string that cannot be modified by the user. It is used to guide the large language model to perform structured reasoning and force it to output the prediction results in JSON format.
[0015] In step S5, the engineering business rules include one or more of the following: a minimum working-hours verification rule, which checks whether the original predicted workload is below the preset minimum threshold for a specific profession or task type and, if so, forcibly raises it to that threshold; a special-condition correction rule, which multiplies the original predicted workload by a preset adjustment coefficient based on the complexity factor of the target task; and a confidence interval normalization rule, which limits the confidence interval in the original prediction result to a predefined reasonable range.
[0016] Step S6 specifically includes: obtaining the actual workload consumed and the deviation reasons annotated by the user, and generating corrected samples; adding the corrected samples to a high-quality feedback pool to form an active learning queue; and periodically performing LoRA incremental fine-tuning on the large language model based on the samples in the queue, while adding newly completed, audited tasks to the historical task vector library and adjusting the instruction wording of the prompt word template according to the error patterns.
[0017] The prediction method is implemented through a prediction system, which includes a data access and preprocessing module, an engineering semantic understanding and feature extraction module, a historical case retrieval module, a large language model structured reasoning module, a numerical calibration and result generation module, and a feedback learning and model evolution module. The data access and preprocessing module is used to acquire structured information of the target task and clean and preprocess the structured information. The engineering semantic understanding and feature extraction module is used to input the preprocessed task description text into a large language model fine-tuned by instructions from an engineering domain corpus, and generate the semantic vector and complexity factor of the target task. The historical case retrieval module is used to retrieve the most similar historical task cases from the historical task vector library using the semantic vector as the query. The structured reasoning module of the large language model is used to concatenate the structured information of the target task, the complexity factor, and the retrieved historical task cases into reasoning prompts according to the prompt word template, input them into the large language model for contextual reasoning, and parse the model output to obtain the original prediction result containing the prediction workload, confidence interval, and reasoning basis. The numerical calibration and result generation module is used to perform regular parsing on the original prediction results and apply preset engineering business rules to perform numerical calibration to generate standardized final prediction results. 
The feedback learning and model evolution module is used to receive user feedback on the final prediction results, add feedback samples to the active learning queue, and periodically perform LoRA incremental fine-tuning on the large language model, update the historical task vector library, and optimize the prompt word template based on the samples in the queue.
[0018] The advantages of this invention are:
(1) It achieves the leap from shallow semantic matching to deep semantic reasoning and improves prediction accuracy: by introducing a large language model instruction-tuned for the engineering field, it performs professional semantic understanding and logical reasoning on task descriptions, identifies the implicit complexity factors they contain, and incorporates them into the prediction, thereby improving prediction reliability and stability.
(2) It supports zero-shot and few-shot prediction, breaking through the traditional cold-start limitation: by using the prior knowledge and analogical reasoning ability of the large language model, it maintains strong predictive ability even when high-quality historical samples are lacking, significantly enhancing adaptability to new tasks, new technology paths, and special working scenarios.
(3) It provides highly interpretable prediction results to support business decision-making: besides the predicted workload, it outputs structured reasoning, key complexity labels, and explanations of influencing factors, so that users can understand the prediction logic, which improves the transparency and credibility of the system in the management process.
(4) It forms a "human-in-the-loop" continuous learning mechanism for model self-evolution: a user feedback closed loop feeds manual review and deviation correction information into incremental learning, so that the system is continuously optimized in practical use and avoids the degradation over time seen in traditional static models.
(5) It integrates deeply with mainstream engineering management systems for end-to-end automation: it provides general interface capabilities and can connect directly with mainstream planning management and design collaboration systems to extract task data, predict, and write back results automatically, significantly reducing manual steps and raising the degree of process automation.
(6) It improves enterprise resource allocation efficiency and management capability: workload can be predicted and updated in real time throughout the project lifecycle, providing a reliable basis for manpower allocation, cost planning, and schedule management, and promoting refined, digital, and intelligent enterprise management.
(7) It has good system scalability and industry migration capability: the core architecture is general and can be extended to other knowledge-intensive workload assessment scenarios; only minor domain adjustments are needed for cross-business deployment and industry migration, giving it strong application flexibility and promotion value.
Attached Figure Description
[0019] Figure 1 is a flowchart of the steps of the engineering design quantity prediction method based on a large language model and engineering semantic reasoning according to the present invention; Figure 2 is a schematic diagram of the prediction system of the present invention; Figure 3 is a flowchart of the second stage of the present invention; Figure 4 is a schematic diagram of the prompt word template of the present invention.
Detailed Implementation
[0020] The features of the present invention and other related features are described in further detail below with reference to the accompanying drawings and an embodiment, to facilitate understanding by those skilled in the art.
Embodiment: As shown in Figure 1, this embodiment relates to a method for predicting engineering design quantities based on a large language model and engineering semantic reasoning. The prediction method mainly includes the following steps:
S1: Obtain the structured information of the target task, including the task description text and structured metadata, and clean and preprocess the structured information.
[0021] S2: Input the preprocessed task description text into a large language model that has been instruction-tuned on an engineering domain corpus to generate the semantic vector and complexity factor of the target task; the complexity factor includes one or more of the following: pressure rating, safety integrity level, technological novelty, number of interfaces, and project background information, and the project background information includes one or more of the following: unit type, project year, and industry category.
[0022] In this embodiment, generating semantic vectors specifically includes: The hidden states of specific network layers in a large language model are extracted and pooled to generate semantic vectors for the target task.
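The pooling step can be illustrated with a minimal sketch. It assumes the hidden states of the chosen layer are already available as one vector per token and that padding positions are marked by an attention mask; the function name `mean_pool` and the toy numbers are illustrative only and do not come from the source, which does not fix the pooling operator at this point in the text.

```python
from typing import List, Sequence


def mean_pool(hidden_states: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the token vectors of one layer, skipping padding tokens.

    hidden_states: one vector per token, taken from the chosen layer.
    attention_mask: 1 for a real token, 0 for padding.
    """
    kept = [vec for vec, keep in zip(hidden_states, attention_mask) if keep]
    if not kept:
        raise ValueError("no non-padding tokens to pool")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: 4 tokens with hidden size 3; the last token is padding.
states = [
    [1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0],
    [9.0, 9.0, 9.0],
]
mask = [1, 1, 1, 0]
embedding = mean_pool(states, mask)
print(embedding)
```

In a real deployment the same averaging would run over the model's actual hidden states, producing one fixed-length vector per task description.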
[0023] S3: Using semantic vectors as queries, retrieve several historical task cases with the highest similarity from the preset historical task vector library. Each historical task case includes a historical task description, actual workload, and historical complexity factor; the historical complexity factor and the complexity factor have the same data structure.
[0024] In this embodiment, retrieving historical task cases specifically includes: using the semantic vector of the target task as a query, an approximate nearest neighbor search is performed in the historical task vector library deployed in the vector database to recall the historical task cases with the highest similarity.
[0025] S4: The structured information of the target task, the complexity factor, and the retrieved historical task cases are concatenated into inference prompts according to the prompt word template, and then input into the large language model for contextual inference. The model output is then parsed to obtain the original prediction results, which include the prediction workload, confidence interval, and inference basis.
[0026] In this embodiment, the prompt word template is a fixed format string that cannot be modified by the user. It is used to guide the large language model to perform structured reasoning and force it to output the prediction results in JSON format.
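A fixed template of this kind might be assembled as in the sketch below. The template wording, the function `build_prompt`, and the sample case `H-001` are all hypothetical; the actual template is the one depicted in Figure 4, which is not reproduced in the text. Only the three JSON field names are taken from the description of the second stage.

```python
import json
from string import Template

# Hypothetical wording; the real template is the fixed string of Figure 4.
PROMPT_TEMPLATE = Template(
    "You are an engineering workload estimator.\n"
    "Current task: $task\n"
    "Profession: $profession; design stage: $stage\n"
    "Complexity factors: $factors\n"
    "Similar historical cases:\n$cases\n"
    "Reply with JSON only, using the keys predicted_hours, "
    "confidence_interval_percent and reasoning."
)


def build_prompt(task, profession, stage, factors, cases):
    """Fill the fixed template; callers cannot change its wording."""
    case_lines = "\n".join(
        f"- [{c['case_id']}] {c['description']} "
        f"(actual: {c['actual_hours']} man-hours)"
        for c in cases
    )
    return PROMPT_TEMPLATE.substitute(
        task=task,
        profession=profession,
        stage=stage,
        factors=json.dumps(factors, ensure_ascii=False),
        cases=case_lines,
    )


# Made-up historical case, for illustration only.
prompt = build_prompt(
    task="Complete the PID diagram design for the high-pressure reactor, "
         "operating pressure 15MPa, SIL3 certification required",
    profession="process",
    stage="detailed design",
    factors={"safety_integrity_level": "SIL3"},
    cases=[{"case_id": "H-001",
            "description": "PID design for a reactor at 12MPa, SIL2",
            "actual_hours": 210}],
)
print(prompt)
```

`string.Template` is used here instead of `str.format` so that literal braces in JSON fragments need no escaping.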
[0027] S5: Perform regular parsing on the original prediction results and apply preset engineering business rules for numerical calibration to generate standardized final prediction results. The final prediction results include prediction workload, confidence interval, inference summary, case ID of historical task cases, and a Boolean flag indicating whether the calibration rule has been triggered.
[0028] In this embodiment, the engineering business rules include one or more of the following: a minimum working-hours verification rule, which checks whether the original predicted workload is below the preset minimum threshold for a specific profession or task type and, if so, forcibly raises it to that threshold; a special-condition correction rule, which multiplies the original predicted workload by a preset adjustment coefficient based on the complexity factor of the target task; and a confidence interval normalization rule, which limits the confidence interval in the original prediction result to a predefined reasonable range.
[0029] S6: Receive user feedback on the final prediction results, add the feedback samples to the active learning queue, and periodically perform LoRA incremental fine-tuning on the large language model, update the historical task vector library, and optimize the prompt word template based on the samples in the queue.
[0030] In this embodiment, step S6 specifically includes: obtaining the actual workload consumed and the deviation reasons annotated by the user, and generating corrected samples; adding the corrected samples to a high-quality feedback pool to form an active learning queue; and periodically performing LoRA incremental fine-tuning on the large language model based on the samples in the queue, while adding newly completed, audited tasks to the historical task vector library and adjusting the instruction wording of the prompt word template according to the error patterns.
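The feedback side of S6 can be pictured as a simple queue that a periodic job empties. The sketch below is an assumption about one possible data layout: the class names, fields, and the sample numbers are hypothetical, and the fine-tuning, vector library update, and template adjustment themselves are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CorrectedSample:
    """One user correction: what was predicted, what it really took, and why."""
    task_description: str
    predicted_hours: int
    actual_hours: int
    deviation_reason: str

    @property
    def relative_error(self) -> float:
        return abs(self.predicted_hours - self.actual_hours) / self.actual_hours


@dataclass
class ActiveLearningQueue:
    """Holds corrected samples until the next periodic update collects them."""
    samples: List[CorrectedSample] = field(default_factory=list)

    def add(self, sample: CorrectedSample) -> None:
        self.samples.append(sample)

    def drain(self) -> List[CorrectedSample]:
        """Hand the pending batch to the update job and empty the queue."""
        batch, self.samples = self.samples, []
        return batch


queue = ActiveLearningQueue()
queue.add(CorrectedSample("Heater PID design", 200, 250, "late vendor data"))
batch = queue.drain()  # would be called by the scheduled update job
print(len(batch), f"{batch[0].relative_error:.0%}")
```

Keeping the deviation reason next to the numbers is what later allows error patterns to be grouped when the prompt wording is revised.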
[0031] In addition, as shown in Figure 2, the prediction method is implemented through a prediction system, which includes a data access and preprocessing module, an engineering semantic understanding and feature extraction module, a historical case retrieval module, a large language model structured reasoning module, a numerical calibration and result generation module, and a feedback learning and model evolution module.
[0032] The data access and preprocessing module is used to acquire structured information of the target task and clean and preprocess the structured information.
[0033] The engineering semantic understanding and feature extraction module is used to input the preprocessed task description text into a large language model fine-tuned by instructions from an engineering domain corpus, and generate the semantic vector and complexity factor of the target task.
[0034] The historical case retrieval module is used to retrieve the most similar historical task cases from the historical task vector library by using semantic vectors as the query.
[0035] The Large Language Model (LLM) structured reasoning module is used to concatenate the structured information of the target task, complexity factors, and retrieved historical task cases into reasoning prompts according to the prompt word template, input them into the Large Language Model for contextual reasoning, and parse the model output to obtain the raw prediction results containing the prediction workload, confidence interval, and reasoning basis.
[0036] The numerical calibration and result generation module is used to perform regularization parsing on the original prediction results and apply preset engineering business rules to perform numerical calibration, generating standardized final prediction results.
[0037] The feedback learning and model evolution module is used to receive user feedback on the final prediction results, add feedback samples to the active learning queue, and periodically perform LoRA incremental fine-tuning on the large language model, update the historical task vector library, and optimize the prompt word template based on the samples in the queue.
[0038] The modules exchange data through standardized interfaces (such as JSON, CSV, REST API), supporting both batch processing and real-time call modes.
[0039] The man-hour prediction in this embodiment is accomplished through three sequentially executed technical stages with strict data hand-off between them. The entire process does not invoke traditional regression models, statistical tools, or external forecasting services; all intelligent judgments are made collaboratively by internal system modules. The specific implementation of each stage is as follows:
(1) First stage: experience anchoring (RAG retrieval).
[0040] Input: The cleaned task description text (e.g., "Complete the PID diagram design for the high-pressure reactor, operating pressure 15MPa, SIL3 certification required"), together with structured metadata: specialization (such as "process"), design phase (such as "detailed design"), and WBS coding.
[0041] Processing steps: a) Call a large language model that has been fine-tuned on an engineering domain corpus containing 6.5 million real task descriptions, design specifications, change orders, etc. The base model is Qwen-7B, and a LoRA adapter is used for efficient fine-tuning.
[0042] b) Input the task description text into the model, extract the average pooling result of the last hidden state, and generate a 1024-dimensional dense vector as the semantic embedding of the task.
[0043] c) Submit the semantic embedding as a query vector to a vector database (Milvus 2.3 or a FAISS IVF-PQ index) and perform an approximate nearest neighbor (ANN) search in the historical task knowledge base.
[0044] d) Return the top 5 most similar historical task records, sorted from highest to lowest similarity.
[0045] Each historical task record contains the following fields: a) Original task description; b) Actual man-hours consumed (unit: man-hours, derived from audit data of completed projects); c) A set of complexity labels, including: pressure rating (e.g., ">10MPa"); safety integrity level (e.g., "SIL2", "SIL3"); technological novelty (e.g., "first application"); number of interfaces (e.g., "involving 3 disciplines"); project background information: unit type (e.g., "hydrogenation unit"), project year, industry category (e.g., "petrochemical").
[0046] Output: A structured list of cases for use in the next stage of constructing the Prompt.
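The first stage can be sketched as follows. This is an exact cosine-similarity scan in plain Python, used here only as a stand-in for the approximate nearest neighbor index (Milvus or FAISS) named above; the `HistoricalTask` layout, the `case_id` values, and the two-dimensional toy embeddings are assumptions for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class HistoricalTask:
    case_id: str
    description: str
    actual_hours: float
    complexity: Dict[str, str]
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], library: List[HistoricalTask],
          k: int = 5) -> List[Tuple[float, HistoricalTask]]:
    """Return the k most similar records, highest similarity first."""
    scored = [(cosine(query, task.embedding), task) for task in library]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy library with 2-dimensional embeddings (real ones are 1024-dimensional).
library = [
    HistoricalTask("A", "Reactor PID, 12MPa, SIL2", 210.0,
                   {"safety_integrity_level": "SIL2"}, [1.0, 0.0]),
    HistoricalTask("B", "Low-pressure tank layout", 60.0, {}, [0.0, 1.0]),
    HistoricalTask("C", "Heater PID, first application", 240.0,
                   {"technological_novelty": "first application"}, [1.0, 1.0]),
]
results = top_k([1.0, 0.1], library, k=2)
print([(task.case_id, round(score, 3)) for score, task in results])
```

An ANN index trades a small loss of recall for speed on large libraries; the ordering contract (most similar first, fixed k) stays the same.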
[0047] (2) Second stage, as shown in Figure 3: semantic reasoning (LLM structured reasoning).
[0048] Input: a) Current task information (description, specialty, stage); b) The top 5 historical case list output by the first stage.
[0049] Processing steps: a) The system loads a predefined Prompt template, which is a fixed string that cannot be modified by the user, to ensure inference consistency.
[0050] b) Combine the current task information with the historical cases in the format shown in Figure 4.
[0051] c) Input the concatenated complete Prompt into the same fine-tuned large language model (or dedicated inference instance), perform forward inference once, and generate a text response.
[0052] d) The system uses the regular expression r'\{.*"predicted_hours".*\}' to extract JSON substrings and validates the format using a standard JSON parser.
[0053] e) If the parsing is successful, extract the predicted_hours, confidence_interval_percent, and reasoning fields; if it fails, retry a maximum of 2 times (changing the sampling temperature), and if it still fails, mark it as an exception and return the default value (e.g., 100h ±20%).
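Steps d) and e) can be sketched as below. The regular expression and the three field names are those given in the text; the `re.DOTALL` flag, the temperature schedule, the `is_exception` marker, and the callable `generate` (a stand-in for one model call) are assumptions added for the illustration.

```python
import json
import re
from typing import Callable, Optional, Sequence

# Pattern from the text; DOTALL is an added assumption so that a JSON
# object spread over several lines is still matched.
JSON_PATTERN = re.compile(r'\{.*"predicted_hours".*\}', re.DOTALL)
REQUIRED_FIELDS = ("predicted_hours", "confidence_interval_percent", "reasoning")
# Default from the text: 100 h with a +/-20% interval, flagged as an exception.
DEFAULT_RESULT = {"predicted_hours": 100, "confidence_interval_percent": 20,
                  "reasoning": "", "is_exception": True}


def parse_response(text: str) -> Optional[dict]:
    """Extract and validate the JSON object; return None on any failure."""
    match = JSON_PATTERN.search(text)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or any(k not in data for k in REQUIRED_FIELDS):
        return None
    return data


def predict_with_retry(generate: Callable[[float], str],
                       temperatures: Sequence[float] = (0.2, 0.5, 0.8)) -> dict:
    """One attempt plus at most two retries, each at a different temperature."""
    for temperature in temperatures:
        parsed = parse_response(generate(temperature))
        if parsed is not None:
            return {**parsed, "is_exception": False}
    return dict(DEFAULT_RESULT)


# Fake model: the first reply is unusable, the second contains valid JSON.
replies = iter([
    "Sorry, I cannot answer.",
    'Answer: {"predicted_hours": 218, "confidence_interval_percent": 10, '
    '"reasoning": "high pressure"}',
])
result = predict_with_retry(lambda temperature: next(replies))
print(result)
```

Because the pattern is greedy, it spans from the first `{` to the last `}` in the reply, so the JSON parser remains the real validity check.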
[0054] (3) Third stage: numerical calibration (engineering rule post-processing).
[0055] Input: The raw prediction results output by the second stage, including the predicted man-hours, confidence interval, and reasoning explanation.
[0056] Processing steps: a) Load calibration rules: The system reads the pre-configured engineering calibration rule set (stored in YAML or JSON format, which can be updated by maintenance personnel) as the basis for subsequent calibration.
[0057] b) Minimum man-hour verification and special-condition correction: Perform minimum man-hour verification: for process engineering tasks involving PID or piping / instrumentation diagrams, the predicted value must not be less than 80 man-hours; for instrumentation tasks involving SIL3, not less than 60 man-hours; and for safety engineering tasks involving HAZOP, not less than 50 man-hours. If the original predicted value is below the threshold, it is forcibly raised to the threshold.
[0058] Adjust the predicted value according to special conditions: For tasks that meet both the "first application" and SIL3 conditions, multiply the original predicted value by 1.2; for tasks involving three or more professional interfaces, multiply by 1.15.
[0059] c) Confidence interval normalization and numerical standardization: Confidence intervals less than 8% are uniformly set to 8%, and those greater than 25% are uniformly limited to 25%.
[0060] The final predicted working hours are rounded down to remove the decimal part to ensure compatibility with systems such as Primavera P6.
[0061] Output: The calibrated standardized prediction results include predicted hours (integer, man-hours), confidence interval (±X%), inference summary (original reasoning field), list of reference case IDs (from Phase 1), and a Boolean flag indicating whether a calibration rule has been triggered, for audit trail purposes.
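The third stage can be sketched with the thresholds and coefficients quoted above (80/60/50 man-hours, factors 1.2 and 1.15, interval limits 8% and 25%). Several details are assumptions made for the illustration and are not stated in the source: rules are matched by keyword in the description, they are applied in the order listed, both multipliers compound when both conditions hold, and a clamped interval also sets the trigger flag. The `Task` layout and the output key names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Task:
    description: str
    profession: str              # e.g. "process", "instrumentation", "safety"
    sil: str = ""
    first_application: bool = False
    interface_count: int = 0


# (profession, keywords searched in the description, minimum man-hours)
MIN_HOURS = [
    ("process", ("PID", "piping"), 80),
    ("instrumentation", ("SIL3",), 60),
    ("safety", ("HAZOP",), 50),
]
CI_MIN, CI_MAX = 8, 25  # confidence interval limits in percent


def calibrate(task: Task, predicted_hours: float, ci_percent: float,
              reasoning: str, case_ids: Iterable[str]) -> dict:
    hours = float(predicted_hours)
    triggered = False
    # 1) Minimum man-hour verification.
    for profession, keywords, threshold in MIN_HOURS:
        matches = task.profession == profession and any(
            keyword in task.description for keyword in keywords)
        if matches and hours < threshold:
            hours, triggered = float(threshold), True
    # 2) Special-condition correction (assumed to compound).
    if task.first_application and task.sil == "SIL3":
        hours, triggered = hours * 1.2, True
    if task.interface_count >= 3:
        hours, triggered = hours * 1.15, True
    # 3) Confidence interval normalization.
    clamped = min(max(ci_percent, CI_MIN), CI_MAX)
    triggered = triggered or clamped != ci_percent
    return {
        "predicted_hours": math.floor(hours),  # round down to whole man-hours
        "confidence_interval_percent": clamped,
        "reasoning": reasoning,
        "reference_case_ids": list(case_ids),
        "calibration_triggered": triggered,
    }


task = Task("Complete the PID diagram design for the high-pressure reactor, "
            "operating pressure 15MPa, SIL3 certification required",
            profession="process", sil="SIL3")
final = calibrate(task, predicted_hours=72.6, ci_percent=5,
                  reasoning="raw model reasoning", case_ids=["H-001"])
print(final)
```

Keeping the rule values in a table rather than in code mirrors the statement that maintenance personnel can update the rule set stored in YAML or JSON.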
[0062] As a worked example, the system receives a task description from an engineering project, such as: "Complete the PID design of the reaction feed heater, operating pressure 15MPa, requiring HAZOP analysis." The task also carries structured metadata: a professional category of "process" and a design stage of "detailed design."
[0063] The system first uses the fine-tuned large language model to encode the task description, generating a 1024-dimensional semantic vector, which is then submitted to the historical task vector database for approximate nearest neighbor retrieval. The database returns several historical task records with the highest similarity, which must contain at least the following types of historical samples: a) Tasks with similar operating pressures but lower safety levels (e.g., 12MPa, SIL2); b) Tasks with the same safety level but where the technology is being used for the first time (e.g., SIL3, first use of a certain type of catalyst); c) Tasks with low pressure ratings and no need for HAZOP review.
[0064] The system sorts these historical cases by similarity and concatenates them, together with the current task information, into a predefined structured prompt template. The template requires the large language model to output its prediction in JSON format, with fields for the predicted hours, the confidence interval percentage, and an inference description in Chinese of no more than 100 characters.
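A prompt of this kind could be assembled roughly as shown below. The template wording, the JSON field names (`predicted_hours`, `confidence_pct`, `reasoning`) and the case record keys are all hypothetical; the source specifies only that cases are ordered by similarity and that the output must be JSON with the three fields.

```python
import json

# Hypothetical template text; the actual wording is not given in the source.
PROMPT_TEMPLATE = """You are an engineering design workload estimator.

Current task:
{task_block}

Reference cases (most similar first):
{case_block}

Respond with a single JSON object containing exactly these fields:
  "predicted_hours": number
  "confidence_pct": number
  "reasoning": string in Chinese, at most 100 characters
"""


def build_prompt(task, cases):
    """Fill the template with the task and the cases sorted by similarity."""
    ordered = sorted(cases, key=lambda c: c["similarity"], reverse=True)
    case_lines = [
        f"Case {i}: id={c['case_id']}, similarity={c['similarity']:.2f}, "
        f"actual_hours={c['actual_hours']}, description={c['description']}"
        for i, c in enumerate(ordered, start=1)
    ]
    task_block = json.dumps(task, ensure_ascii=False, indent=2)
    return PROMPT_TEMPLATE.format(task_block=task_block,
                                  case_block="\n".join(case_lines))
```

Numbering the cases in the prompt lets the model's reasoning refer back to them by position, as the inference summary in the example does.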
[0065] The large language model performs inference over the context in the prompt and outputs a structured response. After parsing the response, the system verifies the values against a pre-configured set of engineering calibration rules. These rules include, but are not limited to: minimum man-hour thresholds for PID tasks in the process discipline (e.g., 80 man-hours), minimum man-hour constraints for HAZOP-related tasks, and a permitted confidence interval range (8%–25%). If the original predicted value satisfies all rules, it is retained; otherwise, it is corrected according to the rules.
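Before any calibration rule can run, the model response has to be parsed and checked. The following is one possible sketch, assuming the hypothetical field names `predicted_hours`, `confidence_pct` and `reasoning`, and assuming the model may wrap the JSON object in extra text; the source does not describe the parser in this detail.

```python
import json

# Hypothetical field names and their accepted Python types.
REQUIRED_FIELDS = {
    "predicted_hours": (int, float),
    "confidence_pct": (int, float),
    "reasoning": str,
}


def parse_response(raw_text):
    """Extract and validate the JSON object from a model response."""
    start, end = raw_text.find("{"), raw_text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model response")
    data = json.loads(raw_text[start:end + 1])
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(data[name], bool) or not isinstance(data[name], expected):
            raise ValueError(f"field {name} has the wrong type")
    if data["predicted_hours"] <= 0:
        raise ValueError("predicted_hours must be positive")
    if len(data["reasoning"]) > 100:
        raise ValueError("reasoning exceeds 100 characters")
    return data
```

A response that fails validation would typically be retried or routed to manual review rather than passed on to calibration; which of these the system does is not stated in the source.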
[0066] Finally, the system generates structured results containing predicted working hours, confidence intervals, inference summaries, and reference case identifiers, and outputs them in one of the following ways: a) Automatically write back to the budget work hours field in the project management system; b) Generate an Excel file with the newly added columns for manual review; c) Display interactive reports on the web interface, supporting the tracing of reference case details.
[0067] In this implementation, the system predicts 218 person-hours with a confidence interval of ±10%. The inference summary states that "due to unusually high pressure and the need for a special HAZOP review, the figure is increased by 4% compared to Case 2." After project execution, the actual consumption was confirmed to be 225 person-hours, a relative prediction error of approximately 3.1%. This result shows that the three-stage hybrid inference mechanism can achieve high-precision prediction in a real-world engineering scenario.
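The error figure quoted above can be reproduced with a short calculation. The sketch assumes the relative error is taken against the actual value and that the ±10% interval is expressed relative to the predicted value; the function names are illustrative.

```python
def relative_error(predicted, actual):
    """Absolute deviation as a fraction of the actual value."""
    return abs(predicted - actual) / actual


def within_interval(predicted, confidence_pct, actual):
    """True if the actual value lies inside predicted ± confidence_pct percent."""
    half_width = predicted * confidence_pct / 100.0
    return predicted - half_width <= actual <= predicted + half_width


error = relative_error(218, 225)
print(f"{error:.1%}")                  # prints 3.1%
print(within_interval(218, 10, 225))   # prints True
```

Here 7 / 225 ≈ 0.0311, and the interval 196.2 to 239.8 person-hours contains the actual value of 225.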
[0068] In addition, the system integration and deployment methods are as follows. Input support: a) File import: users upload an Excel / CSV file that matches the template (fields include task ID, description, discipline, stage, etc.); b) API integration: the task list (JSON format) is retrieved in real time via a RESTful API or the Primavera P6 SDK.
[0069] Output support: a) Automatic write-back: write the "Forecasted Labor Hours" into the "Budgeted Labor Hours" field of Primavera P6 via the P6 SDK; b) File export: generate an Excel file with newly added "Forecast Hours", "Confidence Intervals", and "Inference Summary" columns; c) Web interface: provide interactive reports, allowing users to click "Reference Cases" to view details and download PDF evaluation forms.
Deployment environment: a) Local server: suitable for large engineering companies with high data security requirements; b) Private cloud / hybrid cloud: supports elastic scaling and high availability; c) Inference service: built on FastAPI, supports concurrent requests, with a response time under 2 seconds per task on an RTX 4090 GPU.
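The file import and export path can be sketched with the standard library. This example uses CSV rather than Excel to stay dependency-free, and the input column name `task_id` and the shape of the `predictions` mapping are assumptions; only the three added column names come from the text.

```python
import csv
import io

NEW_COLUMNS = ["Forecast Hours", "Confidence Intervals", "Inference Summary"]


def export_with_predictions(input_csv, predictions):
    """Return CSV text with the three prediction columns appended to each row.

    predictions maps a task ID to a dict with hypothetical keys
    "hours", "confidence_pct" and "reasoning".
    """
    reader = csv.DictReader(io.StringIO(input_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(reader.fieldnames) + NEW_COLUMNS,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        pred = predictions.get(row["task_id"])
        if pred is None:
            # No prediction for this task: leave the new columns empty.
            row.update(dict.fromkeys(NEW_COLUMNS, ""))
        else:
            row["Forecast Hours"] = pred["hours"]
            row["Confidence Intervals"] = f"±{pred['confidence_pct']:g}%"
            row["Inference Summary"] = pred["reasoning"]
        writer.writerow(row)
    return out.getvalue()
```

Rows without a prediction are kept with empty cells so that the exported file lines up one-to-one with the uploaded template for manual review.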
[0070] The continuous evolution mechanism is as follows. The system is designed with a closed-loop human-machine collaboration mechanism to continuously optimize prediction accuracy: a) users can annotate prediction deviations, recording the difference between the actual hours and the predicted value together with the reason (e.g., "The complex on-site civil engineering conditions led to multiple rounds of coordination for reserved openings"); b) these corrected samples (including original predictions, actual values, and cause labels) are added to a high-quality feedback pool for subsequent model training and optimization.
[0071] Weekly automated processes: a) analyze high-frequency error patterns and adjust the instruction wording in the prompt template accordingly to improve the model's understanding; b) apply the LoRA incremental fine-tuning strategy, which updates only the adapter parameters without changing the base model, thereby reducing training cost (training takes less than 4 hours on a single GPU); c) update the vector library automatically, incorporating newly completed and audited tasks to continuously enrich the RAG retrieval pool; d) use MLflow for version management, recording the model version, test-set MAE / RMSE metrics, and important rollback points for each iteration, and supporting A / B testing to facilitate performance comparison and selection of the best model.
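The MAE / RMSE comparison used to choose between model versions can be sketched as follows. Logging to MLflow is omitted here, and selecting the version with the lowest MAE is an assumption; the source states only that both metrics are recorded for comparison. The version names and numbers in the usage note are made up.

```python
import math


def mae(predicted, actual):
    """Mean absolute error over paired predictions and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error over paired predictions and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def select_version(candidates, actual):
    """Return (name, metrics) for the candidate with the lowest test-set MAE.

    candidates maps a version name to its list of predicted hours.
    """
    scored = {
        name: {"mae": mae(preds, actual), "rmse": rmse(preds, actual)}
        for name, preds in candidates.items()
    }
    best = min(scored, key=lambda name: scored[name]["mae"])
    return best, scored[best]
```

With actual hours `[100, 200, 300]`, a version predicting `[105, 195, 310]` would be preferred over one predicting `[110, 190, 320]`, since its MAE is about 6.7 hours against about 13.3.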
[0072] This continuous evolution mechanism enables the system to learn on its own. As time goes by and experience accumulates, its prediction accuracy will gradually improve, achieving effective accumulation and reuse of organizational knowledge.
[0073] It is worth noting, first, that regarding models and training strategies, there is no limitation on the specific model type; any open-source or commercial large language model can be used, and different fine-tuning paths can be selected according to available resources, including full-parameter training, parameter-efficient adaptation techniques, or in-context learning methods. Model capacity can be scaled flexibly with business scale to balance performance and cost.
[0074] Secondly, in terms of data utilization methods and information access channels, it can not only process task description text, but also extend to various unstructured text information such as contract terms, technical specifications, change records, and meeting minutes. It also allows access to multi-source corpora, which are fused and understood through a unified semantic extraction and vectorization mechanism, thereby forming a more comprehensive contextual cognitive structure.
[0075] Third, in terms of system interaction and user experience, the system not only provides numerical prediction results, but also outputs explanatory information, multi-granularity prediction structures, risk warnings and standardized reports as needed. It also supports natural language interaction and automated interfaces between systems, making it suitable both for engineers and managers and for automated scheduling processes.
[0076] Fourth, in terms of system deployment and operation architecture, it supports different operating environments, including local deployment, private cloud, hybrid cloud and edge execution, and can be integrated with existing enterprise systems through APIs, messaging mechanisms or process automation platforms to meet different enterprise requirements for security, compliance and scalability.
[0077] Finally, regarding continuous evolution and learning mechanisms, the system supports both manual feedback and automatic deviation correction, and can be combined with active learning or cross-organizational distributed collaborative training methods, enabling the system to continuously accumulate experience during long-term operation and achieve continuous performance improvement. Simultaneously, it reserves multimodal development capabilities, allowing for future integration with drawings, BIM models, and engineering visual information, realizing the evolution from pure text semantic reasoning to cross-modal semantic understanding.
[0078] The beneficial technical effects of this embodiment are as follows:
(1) Achieves the leap from shallow semantic matching to deep semantic reasoning and improves prediction accuracy: by introducing a large language model that has been instruction fine-tuned on the engineering domain, the system can perform professional semantic understanding and logical reasoning on task descriptions, identify the implicit complexity factors they contain, and incorporate them into the prediction, thereby improving prediction reliability and stability.
(2) Supports zero-shot and few-shot prediction, overcoming the traditional cold-start limitation: by using the prior knowledge and analogical reasoning ability of the large language model, the system retains strong predictive ability when high-quality historical samples are lacking, significantly enhancing its adaptability to new tasks, new technology paths and special working scenarios.
(3) Provides highly interpretable prediction results to support business decision-making: the system outputs not only the predicted workload but also structured reasoning, key complexity labels and explanations of influencing factors, enabling users to understand the prediction logic and improving the transparency and credibility of the system in the management process.
(4) Forms a "human-in-the-loop" continuous learning mechanism to realize model self-evolution: a closed user feedback loop introduces manual review and deviation correction information into the incremental learning process of the model, so that the system is continuously optimized in practical use and avoids the degradation over time seen in traditional static models.
(5) Integrates deeply with mainstream engineering management systems to achieve end-to-end automation: the system provides general interface capabilities and can connect directly to mainstream planning management and design collaboration systems, realizing automatic extraction of task data, prediction, and write-back of results, which significantly reduces manual steps and improves process automation.
(6) Improves enterprise resource allocation efficiency and management capability: real-time workload prediction and updates can be achieved throughout the entire project lifecycle, providing a reliable basis for manpower allocation, cost planning and schedule management, and promoting the evolution of enterprise management towards refinement, digitalization and intelligence.
(7) Possesses good system scalability and industry migration capability: the core architecture is general and can be extended to other knowledge-intensive workload assessment scenarios. Only minor domain adjustments are needed to achieve cross-business deployment and industry migration, giving the method strong application flexibility and promotion value.
Claims
1. A method for predicting engineering design quantities based on large language models and engineering semantic reasoning, characterized in that... The prediction method includes the following steps:
S1: Obtain the structured information of the target task, which includes task description text and structured metadata, and perform cleaning and preprocessing on the structured information.
S2: Input the preprocessed task description text into a large language model fine-tuned by instructions from an engineering domain corpus to generate the semantic vector and complexity factor of the target task; wherein the complexity factor includes one or more of pressure level, safety integrity level, technological novelty, number of interfaces, and project background information, and the project background information includes one or more of device type, project year, and industry category.
S3: Using the semantic vector as a query, retrieve several historical task cases with the highest similarity from the preset historical task vector library. Each historical task case includes a historical task description, actual workload, and historical complexity factor; wherein the historical complexity factor and the complexity factor have the same data structure.
S4: The structured information of the target task, the complexity factor, and the retrieved historical task cases are concatenated into reasoning prompts according to the prompt word template, and input into the large language model for contextual reasoning. The model output is then parsed to obtain the original prediction result containing the prediction workload, confidence interval, and reasoning basis.
S5: Perform regular parsing on the original prediction results and apply preset engineering business rules for numerical calibration to generate standardized final prediction results. The final prediction results include prediction workload, confidence interval, inference summary, case ID of historical task cases, and a Boolean flag indicating whether calibration rules are triggered.
S6: Receive user feedback on the final prediction result, add the feedback samples to the active learning queue, and periodically perform LoRA incremental fine-tuning on the large language model, update the historical task vector library, and optimize the prompt word template based on the samples in the queue.
2. The method for predicting engineering design quantities based on a large language model and engineering semantic reasoning as described in claim 1, characterized in that... In step S2, generating the semantic vector specifically includes: The hidden states of a specific network layer of the large language model are extracted and pooled to generate the semantic vector of the target task.
3. The method for predicting engineering design quantities based on a large language model and engineering semantic reasoning as described in claim 1, characterized in that... In step S3, retrieving historical task cases specifically includes: Using the semantic vector of the target task as a query, an approximate nearest neighbor search is performed in the historical task vector library deployed in the vector database to recall the historical task cases with the highest similarity.
4. The method for predicting engineering design quantities based on a large language model and engineering semantic reasoning as described in claim 1, characterized in that... In step S4, the prompt word template is a fixed format string that cannot be modified by the user. It is used to guide the large language model to perform structured reasoning and force it to output the prediction results in JSON format.
5. The method for predicting engineering design quantities based on a large language model and engineering semantic reasoning as described in claim 1, characterized in that... In step S5, the engineering business rules include one or more of the following rules: a minimum working hours verification rule, used to check whether the original predicted workload is lower than the preset minimum threshold corresponding to a specific profession or task type and, if so, to forcibly adjust it to that threshold; a special condition correction rule, used to multiply the original prediction workload by a preset adjustment coefficient based on the complexity factor of the target task; and a confidence interval normalization rule, used to limit the confidence interval in the original prediction results to a pre-defined reasonable range.
6. The method for predicting engineering design quantities based on a large language model and engineering semantic reasoning as described in claim 1, characterized in that... Step S6 specifically includes: obtaining the actual consumed workload and the reasons for deviations as annotated by the user, and generating corrected samples; adding the corrected samples to a high-quality feedback pool to form an active learning queue; periodically performing LoRA incremental fine-tuning on the large language model based on the samples in the queue; adding newly completed and audited tasks to the historical task vector library; and adjusting the instruction wording of the prompt word template according to the error patterns.
7. The method for predicting engineering design quantities based on a large language model and engineering semantic reasoning as described in claim 1, characterized in that... The prediction method is implemented through a prediction system, which includes a data access and preprocessing module, an engineering semantic understanding and feature extraction module, a historical case retrieval module, a large language model structured reasoning module, a numerical calibration and result generation module, and a feedback learning and model evolution module.
The data access and preprocessing module is used to acquire structured information of the target task and clean and preprocess the structured information.
The engineering semantic understanding and feature extraction module is used to input the preprocessed task description text into a large language model fine-tuned by instructions from an engineering domain corpus, and generate the semantic vector and complexity factor of the target task.
The historical case retrieval module is used to retrieve the most similar historical task cases from the historical task vector library using the semantic vector as the query.
The large language model structured reasoning module is used to concatenate the structured information of the target task, the complexity factor, and the retrieved historical task cases into reasoning prompts according to the prompt word template, input them into the large language model for contextual reasoning, and parse the model output to obtain the original prediction result containing the prediction workload, confidence interval, and reasoning basis.
The numerical calibration and result generation module is used to perform regular parsing on the original prediction results and apply preset engineering business rules to perform numerical calibration to generate standardized final prediction results.
The feedback learning and model evolution module is used to receive user feedback on the final prediction results, add feedback samples to the active learning queue, and periodically perform LoRA incremental fine-tuning on the large language model, update the historical task vector library, and optimize the prompt word template based on the samples in the queue.