Industrial model fine-tuning and inference method, apparatus and electronic device

By performing feature filtering and parameter updates of model active blocks on multi-source heterogeneous data on industrial edge devices, the problems of insufficient feature extraction and redundant information interference in existing technologies are solved. This enables efficient and real-time industrial quality inspection and fault prediction, reduces resource consumption, and is suitable for the real-time and detection accuracy requirements of smart manufacturing production lines.

CN121920544B · Active · Publication Date: 2026-06-16 · HAIER DIGITAL TECHNOLOGY (QINGDAO) CO LTD +2


Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: HAIER DIGITAL TECHNOLOGY (QINGDAO) CO LTD
Filing Date: 2026-03-25
Publication Date: 2026-06-16


IPC: G06N5/04; G06F18/10; G06F18/2433; G06F18/213; G06F18/22

CPC: G06N5/04; G06F18/10; G06F18/2433; G06F18/213; G06F18/22

AI Tagging

Application Domain

Inference methods


Abstract

Embodiments of the present application provide an industrial model fine-tuning and inference method, apparatus, and electronic device, relating to the technical field of industrial artificial intelligence. The method obtains multi-source heterogeneous data from an industrial edge device and performs feature filtering on the multi-source heterogeneous data to obtain feature data representing the multi-source heterogeneous data; establishes a mapping relationship between current process parameters and the functional attributes of each layer block of a pre-trained model, determines the model active blocks to be updated based on the mapping relationship, and updates the parameters of the model active blocks based on the feature data to obtain a target model adapted to the current process parameters; and performs inference on industrial data based on the target model to obtain at least one of an industrial quality inspection result, an equipment fault prediction result, or a process parameter optimization suggestion to guide the operation of an industrial production line. The method can significantly improve the adaptability and detection accuracy of an industrial model in scenarios where processes change frequently, and achieves real-time, accurate monitoring and closed-loop control of the production state.


Description

Technical Field

[0001] This application relates to the field of industrial artificial intelligence technology, and in particular to an industrial model fine-tuning and inference method, apparatus, and electronic device.

Background Technology

[0002] With the deepening of industrial and intelligent manufacturing, deep learning models are widely used in industrial quality inspection, equipment failure prediction, process optimization, and other scenarios. The demand for automation, intelligence, and real-time performance in industrial production is increasing, and pre-trained large models, fine-tuned to adapt to specific industrial tasks, have become a key means of improving production efficiency.

[0003] In existing industrial data processing and model application solutions, when faced with multi-source heterogeneous data of different types such as images, time series, and text, a unified data preprocessing process or a general feature extraction strategy is usually adopted for standardized processing. This ignores the differences in physical characteristics and spatiotemporal distribution of various types of data, and no differentiated feature extraction strategies are designed for their characteristics.

[0004] Because existing technologies do not design differentiated feature extraction strategies for multi-source industrial data such as images, time series, and text, core features are insufficiently extracted. A large amount of redundant information interferes with the model's capture of key process features, thereby reducing the accuracy and efficiency of subsequent tasks such as quality inspection and fault prediction.

Summary of the Invention

[0005] This application provides an industrial model fine-tuning and inference method, apparatus, and electronic device to solve the following technical problem in the prior art: because differentiated feature extraction strategies are lacking for multi-source heterogeneous data such as images, time series, and text, feature extraction is insufficient and redundant information degrades model accuracy, which in turn reduces the efficiency and accuracy of industrial quality inspection and fault prediction tasks.

[0006] In a first aspect, this application provides an industrial model fine-tuning and inference method, including:

[0007] Acquire multi-source heterogeneous data from industrial edge devices, perform feature filtering processing on the multi-source heterogeneous data, and obtain feature data characterizing the multi-source heterogeneous data;

[0008] Establish a mapping relationship between the current process parameters and the functional attributes of each layer block of the pre-trained model. Based on the mapping relationship, determine the active block of the model to be updated, and update the parameters of the active block of the model based on the feature data to obtain a target model that adapts to the current process parameters.

[0009] Based on the target model, inference is performed on industrial data to obtain at least one of the following: industrial quality inspection results, equipment failure prediction results, or process parameter optimization suggestions, in order to guide the operation of industrial production lines.

[0010] Optionally, the feature filtering process for the multi-source heterogeneous data includes:

[0011] Extract the dimensional and statistical features of the multi-source heterogeneous data to construct a feature vector;

[0012] The feature vectors are matched using preset rules to determine the data type identifier corresponding to the multi-source heterogeneous data;

[0013] Based on the data type identifier, the corresponding feature filtering rules are invoked to perform feature filtering processing on the multi-source heterogeneous data.
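
The three steps in [0011] to [0013] can be sketched as a small dispatch routine. This is only an illustrative sketch, not the patented implementation: the function names, the dictionary layout of the feature vector, and the rule that strings are text, 2-D arrays are images, and 1-D arrays are time series are all assumptions made for the example, and the registered filters are placeholders. The statistics are carried in the feature vector, but the toy rules below only use the dimensionality.

```python
from statistics import mean, pstdev
from typing import Any, Callable, Dict, Tuple


def build_feature_vector(sample: Any) -> Dict[str, Any]:
    """Summarize a non-empty raw sample by dimensional and statistical features."""
    if isinstance(sample, str):
        return {"is_text": True, "ndim": 0, "mean": None, "std": None}
    if isinstance(sample[0], (list, tuple)):
        values = [float(v) for row in sample for v in row]
        ndim = 2
    else:
        values = [float(v) for v in sample]
        ndim = 1
    return {"is_text": False, "ndim": ndim, "mean": mean(values), "std": pstdev(values)}


def identify_type(features: Dict[str, Any]) -> str:
    """Hypothetical preset rules mapping a feature vector to a data type identifier."""
    if features["is_text"]:
        return "text"
    return "image" if features["ndim"] == 2 else "timeseries"


# Placeholder filters (assumption); real per-type filtering rules would be registered here.
FILTER_RULES: Dict[str, Callable[[Any], Any]] = {
    "image": lambda image: image,
    "timeseries": lambda series: series,
    "text": lambda text: text.strip(),
}


def filter_features(sample: Any) -> Tuple[str, Any]:
    """Identify the data type, then invoke the matching filtering rule."""
    type_id = identify_type(build_feature_vector(sample))
    return type_id, FILTER_RULES[type_id](sample)
```

Keeping the rules in a lookup table means a new data type only needs a new identifier and a new entry, without touching the dispatch logic.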

[0014] Optionally, the step of invoking the corresponding feature filtering rule based on the data type identifier to perform feature filtering processing on the multi-source heterogeneous data includes at least one of the following:

[0015] If the data type is identified as image data, a mask matrix is generated based on a preset defect area coordinate library, and the original image pixels are weighted based on the mask matrix, retaining the pixel features of the area covered by the mask matrix as image feature data.

[0016] If the data type is identified as time series data, the length of the time window is determined based on the industrial process cycle, and the correlation between the data and process parameters within the time window is calculated. Time series segments with a correlation higher than a preset correlation threshold are retained as time series feature data.

[0017] If the data type is identified as text data, then the text content is matched based on a preset process keyword dictionary, and the matching keywords and their context vectors are extracted as text feature data.
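
As a rough illustration of the image and time-series rules above, the sketch below builds a binary mask from a list of defect boxes and keeps only the time windows whose Pearson correlation with a process parameter exceeds a threshold. The box format, the binary (0/1) weighting, the non-overlapping windows, and the use of signed Pearson correlation are assumptions; the source does not specify the weighting scheme or the correlation measure.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (row_start, col_start, row_end, col_end), end exclusive


def build_mask(height: int, width: int, defect_boxes: Sequence[Box]) -> List[List[float]]:
    """Mask matrix with weight 1.0 inside any known defect box and 0.0 elsewhere."""
    mask = [[0.0] * width for _ in range(height)]
    for r0, c0, r1, c1 in defect_boxes:
        for r in range(max(r0, 0), min(r1, height)):
            for c in range(max(c0, 0), min(c1, width)):
                mask[r][c] = 1.0
    return mask


def apply_mask(image: Sequence[Sequence[float]], mask: Sequence[Sequence[float]]) -> List[List[float]]:
    """Weight each pixel by the mask, keeping only pixels in covered regions."""
    return [[p * m for p, m in zip(row, mask_row)] for row, mask_row in zip(image, mask)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def filter_time_series(
    signal: Sequence[float],
    process_param: Sequence[float],
    cycle_len: int,
    threshold: float,
) -> List[Tuple[int, List[float]]]:
    """Split the signal into windows of one process cycle and keep correlated ones."""
    kept = []
    for start in range(0, len(signal) - cycle_len + 1, cycle_len):
        segment = list(signal[start:start + cycle_len])
        reference = process_param[start:start + cycle_len]
        if pearson(segment, reference) > threshold:
            kept.append((start, segment))
    return kept
```

If anti-correlated segments are also considered informative, the comparison could use the absolute value of the correlation instead; that choice is left open here.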

[0018] Optionally, establishing the mapping relationship between the current process parameters and the functional attributes of each layer block of the pre-trained model includes:

[0019] Extract the functional feature vectors of each layer block in the pre-trained model, and extract the process semantic vector of the process parameters at the current time.

[0020] Calculate the cosine similarity between the functional feature vector and the process semantic vector, and mark the blocks with a cosine similarity greater than a preset similarity threshold as candidate active blocks;

[0021] The average gradient magnitude of each candidate active block during the historical fine-tuning process is calculated, and a predetermined number of candidate active blocks with the highest average gradient magnitude are determined as the active blocks of the model to be updated.
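
The two-stage selection in [0019] to [0021] (similarity screening, then ranking by historical gradient magnitude) might look like the following sketch. The block names, the way functional feature vectors and process semantic vectors are obtained, and the layout of the gradient history are assumptions; only the cosine-similarity threshold and the top-k ranking come from the text.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_active_blocks(
    block_vectors: Dict[str, Sequence[float]],
    process_vector: Sequence[float],
    grad_history: Dict[str, Sequence[float]],
    sim_threshold: float,
    top_k: int,
) -> List[str]:
    """Pick the blocks to fine-tune for the current process parameters."""
    # Stage 1: blocks whose functional vector is similar enough to the process vector.
    candidates = [
        name
        for name, vector in block_vectors.items()
        if cosine_similarity(vector, process_vector) > sim_threshold
    ]

    # Stage 2: rank candidates by average gradient magnitude in past fine-tuning runs.
    def average_gradient(name: str) -> float:
        history = grad_history.get(name, [])
        return sum(history) / len(history) if history else 0.0

    return sorted(candidates, key=average_gradient, reverse=True)[:top_k]
```

Blocks with no recorded gradient history are ranked last in this sketch, which is a design choice rather than something the source prescribes.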

[0022] Optionally, updating the parameters of the active block of the model based on the feature data to obtain a target model adapted to the current process parameters includes:

[0023] Keeping the original weight parameters of the pre-trained model frozen, a first low-rank matrix and a second low-rank matrix are injected into the active block of the model to be updated. The product of the first low-rank matrix and the second low-rank matrix is used to represent the amount of the original weight parameters to be adjusted.

[0024] The feature data is input into the pre-trained model to obtain the prediction result of the current industrial task, and a loss function is constructed based on the difference between the prediction result and the expected real result of the industrial task.

[0025] The contribution of each element in the first low-rank matrix and the second low-rank matrix to the prediction error is calculated according to the loss function, and the adjustment direction and adjustment magnitude of each element are determined according to the contribution. The elements in the first low-rank matrix and the second low-rank matrix are then iteratively updated.

[0026] After the iteration is completed, the product of the updated first low-rank matrix and the second low-rank matrix is added to the corresponding original weight parameters to obtain the target model.
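
The low-rank update in [0023] to [0026] resembles what is commonly called low-rank adaptation (LoRA). The sketch below shows the idea for a single linear block in plain Python: the original weights W stay frozen, only the two low-rank matrices are trained by gradient descent on a squared-error loss, and their product is added to W at the end. The matrix shapes, the zero initialization of the first matrix, the squared-error loss, and the plain stochastic gradient step are assumptions chosen for the example, not details confirmed by the source.

```python
import random
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def lora_fine_tune(
    W: Matrix,
    samples: Sequence[Tuple[Vector, Vector]],
    rank: int = 1,
    lr: float = 0.05,
    epochs: int = 300,
    seed: int = 0,
) -> Tuple[Matrix, Matrix]:
    """Train low-rank matrices A (d_out x rank) and B (rank x d_in); W is never modified."""
    rng = random.Random(seed)
    d_out, d_in = len(W), len(W[0])
    A = [[0.0] * rank for _ in range(d_out)]  # zero init: A @ B starts at zero
    B = [[rng.gauss(0.0, 0.1) for _ in range(d_in)] for _ in range(rank)]
    for _ in range(epochs):
        for x, target in samples:
            bx = matvec(B, x)
            pred = [w + d for w, d in zip(matvec(W, x), matvec(A, bx))]
            err = [p - t for p, t in zip(pred, target)]
            # Contribution of each element to the error (gradient of 0.5 * ||err||^2),
            # computed with the current A before either matrix is updated.
            at_err = [sum(A[i][k] * err[i] for i in range(d_out)) for k in range(rank)]
            for i in range(d_out):
                for k in range(rank):
                    A[i][k] -= lr * err[i] * bx[k]
            for k in range(rank):
                for j in range(d_in):
                    B[k][j] -= lr * at_err[k] * x[j]
    return A, B


def merge_lora(W: Matrix, A: Matrix, B: Matrix) -> Matrix:
    """Add the learned low-rank product to the frozen weights to get the target weights."""
    delta = matmul(A, B)
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]
```

With rank r, the trainable parameter count for the block is r × (d_out + d_in) instead of d_out × d_in, which is where the memory saving on edge devices would come from.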

[0027] Optionally, the method further includes: during the process of updating the parameters of the active block of the model based on the feature data, adjusting the computational intensity of the parameter update according to the memory occupancy and power consumption of the industrial edge device.

[0028] Optionally, adjusting the computational intensity of parameter updates based on the memory occupancy and power consumption of the industrial edge device includes:

[0029] When the memory usage rate is detected to exceed the preset memory threshold, the filtering intensity when performing feature filtering on the multi-source heterogeneous data is increased, and the parameter storage accuracy of the pre-trained model is switched from high-precision format to low-precision format.

[0030] When the power consumption is detected to exceed the preset power consumption threshold, the freezing ratio of model layer blocks is increased, and the number of active model blocks participating in the calculation in a single iteration is reduced.

[0031] When the memory occupancy rate is detected to be lower than the preset memory threshold and the power consumption is lower than the preset power consumption threshold, the filtering intensity is reduced and the parameter storage precision is restored to the high-precision format to increase the computational intensity of parameter updates.
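
The three threshold rules in [0029] to [0031] can be expressed as a small policy function. The configuration fields, the step size of 0.1, and the use of "fp32"/"fp16" as the high- and low-precision formats are assumptions; the source only speaks of high-precision and low-precision formats and does not give step sizes.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TuningConfig:
    filter_strength: float       # 0.0 (keep everything) to 1.0 (most aggressive filtering)
    precision: str               # "fp32" (high precision) or "fp16" (low precision), assumed
    freeze_ratio: float          # fraction of model layer blocks kept frozen
    active_blocks_per_step: int  # active blocks computed in a single iteration


def adjust_intensity(
    cfg: TuningConfig,
    memory_usage: float,
    power_watts: float,
    memory_threshold: float,
    power_threshold: float,
    step: float = 0.1,
) -> TuningConfig:
    """Return a new config reflecting the device's current memory and power state."""
    if memory_usage < memory_threshold and power_watts < power_threshold:
        # Headroom on both resources: filter less and restore high precision.
        return replace(
            cfg,
            filter_strength=max(0.0, cfg.filter_strength - step),
            precision="fp32",
        )
    if memory_usage > memory_threshold:
        # Memory pressure: filter harder and store parameters in low precision.
        cfg = replace(
            cfg,
            filter_strength=min(1.0, cfg.filter_strength + step),
            precision="fp16",
        )
    if power_watts > power_threshold:
        # Power pressure: freeze more blocks and compute fewer active blocks per step.
        cfg = replace(
            cfg,
            freeze_ratio=min(1.0, cfg.freeze_ratio + step),
            active_blocks_per_step=max(1, cfg.active_blocks_per_step - 1),
        )
    return cfg
```

In this sketch the freeze ratio is not relaxed when resources recover, because the source only mentions restoring filtering intensity and storage precision in that case.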

[0032] Optionally, the method further includes:

[0033] After completing the parameter update of the active block of the model to obtain the target model, the target model is deployed as an online inference model, and the process parameters of the production line are monitored in real time during the operation of the online inference model.

[0034] When the difference between newly detected process parameters and the current process parameters corresponding to the online inference model is detected to be greater than a change threshold, the parameters of the model active blocks are fine-tuned with the new process parameters as the benchmark, starting from the parameters of the online inference model and/or the fine-tuned model parameters of historically similar process parameters, to obtain an updated model.

[0035] The current online inference model is switched to the updated model, and the switched model is used to infer industrial data to guide the operation of the production line.
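
The monitor, re-tune, and switch loop of [0033] to [0035] could be organized as in the sketch below. The class name, the relative-change metric, and the nearest-history warm start are hypothetical choices; the source states only that a change threshold triggers fine-tuning from the online model and/or historically similar models, followed by a model switch. The `fine_tune` callable stands in for the active-block update and is supplied by the caller.

```python
from typing import Any, Callable, Dict, List, Tuple

Params = Dict[str, float]


def parameter_change(new: Params, reference: Params) -> float:
    """Largest relative change across the reference's process parameters (assumed metric)."""
    return max(
        abs(new[name] - value) / max(abs(value), 1e-9)
        for name, value in reference.items()
    )


class OnlineModelManager:
    """Holds the online inference model and swaps it when the process drifts."""

    def __init__(
        self,
        model: Any,
        process_params: Params,
        fine_tune: Callable[[Any, Params], Any],
        change_threshold: float,
    ) -> None:
        self.model = model
        self.process_params = dict(process_params)
        self.fine_tune = fine_tune
        self.change_threshold = change_threshold
        self.history: List[Tuple[Params, Any]] = [(dict(process_params), model)]

    def observe(self, new_params: Params) -> bool:
        """Check new process parameters; re-tune and switch models if they drifted."""
        if parameter_change(new_params, self.process_params) <= self.change_threshold:
            return False
        # Warm start from the stored model whose process parameters are closest.
        _, warm_start = min(
            self.history, key=lambda entry: parameter_change(new_params, entry[0])
        )
        updated = self.fine_tune(warm_start, new_params)
        self.history.append((dict(new_params), updated))
        self.model, self.process_params = updated, dict(new_params)
        return True
```

Because the current online model is always in the history, the warm start falls back to it whenever no closer historical model exists.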

[0036] In a second aspect, this application provides an industrial model fine-tuning and inference apparatus, comprising:

[0037] The acquisition module is used to acquire multi-source heterogeneous data from industrial edge devices;

[0038] The processing module is used to perform feature filtering processing on the multi-source heterogeneous data to obtain feature data characterizing the multi-source heterogeneous data;

[0039] The processing module is also used to establish a mapping relationship between the current process parameters and the functional attributes of each layer block of the pre-trained model, determine the active block of the model to be updated based on the mapping relationship, and update the parameters of the active block of the model based on the feature data to obtain a target model that adapts to the current process parameters.

[0040] The processing module is also used to perform inference on industrial data based on the target model to obtain at least one of industrial quality inspection results, equipment failure prediction results, or process parameter optimization suggestions, so as to guide the operation of industrial production lines.

[0041] Optionally, the device further includes: a determining module;

[0042] The processing module is also used to extract the dimensional features and statistical features of the multi-source heterogeneous data and construct a feature vector;

[0043] The determining module is used to match the feature vector according to preset rules to determine the data type identifier corresponding to the multi-source heterogeneous data;

[0044] The processing module is also used to call the corresponding feature filtering rules according to the data type identifier to perform feature filtering processing on the multi-source heterogeneous data.

[0045] Optionally, the processing module is further configured to generate a mask matrix based on a preset defect area coordinate library if the data type is identified as image data, and to perform weighted processing on the original image pixels based on the mask matrix, retaining the pixel features of the area covered by the mask matrix as image feature data.

[0046] The processing module is further configured to, if the data type is identified as time series data, determine the length of the time window based on the industrial process cycle, calculate the correlation between the data and process parameters within the time window, and retain time series segments with a correlation higher than a preset correlation threshold as time series feature data.

[0047] The processing module is further configured to, if the data type is identified as text data, match the text content based on a preset process keyword dictionary and extract the matching keywords and their context vectors as text feature data.

[0048] Optionally, the device further includes: a computing module;

[0049] The processing module is also used to extract the functional feature vectors of each layer block in the pre-trained model and extract the process semantic vector of the process parameters at the current moment.

[0050] The calculation module is used to calculate the cosine similarity between the functional feature vector and the process semantic vector;

[0051] The processing module is also used to mark blocks with a cosine similarity greater than a preset similarity threshold as candidate active blocks;

[0052] The determining module is further configured to calculate the average gradient magnitude of each candidate active block during the historical fine-tuning process, and determine a preset number of candidate active blocks with the highest average gradient magnitude as the active blocks of the model to be updated.

[0053] Optionally, the processing module is further configured to keep the original weight parameters of the pre-trained model frozen, and inject a first low-rank matrix and a second low-rank matrix into the active block of the model to be updated, wherein the product of the first low-rank matrix and the second low-rank matrix is used to represent the amount to be adjusted of the original weight parameters.

[0054] The processing module is further configured to input the feature data into the pre-trained model to obtain the prediction result of the current industrial task, and to construct a loss function based on the difference between the prediction result and the expected real result of the industrial task;

[0055] The processing module is further configured to calculate the contribution of each element in the first low-rank matrix and the second low-rank matrix to the prediction error according to the loss function, and determine the adjustment direction and adjustment magnitude of each element according to the contribution, and iteratively update the elements in the first low-rank matrix and the second low-rank matrix.

[0056] The processing module is further configured to, after the iteration is completed, add the product of the updated first low-rank matrix and the second low-rank matrix to the corresponding original weight parameters to obtain the target model.

[0057] Optionally, the processing module is further configured to adjust the computational intensity of parameter updates according to the memory occupancy and power consumption of the industrial edge device during the process of updating the parameters of the active block of the model based on the feature data.

[0058] Optionally, the processing module is further configured to, when the memory occupancy rate is detected to exceed a preset memory threshold, increase the filtering intensity when performing feature filtering processing on the multi-source heterogeneous data, and switch the parameter storage precision of the pre-trained model from a high-precision format to a low-precision format;

[0059] The processing module is also used to increase the freezing ratio of model layer blocks and reduce the number of active model blocks participating in the calculation in a single iteration when the power consumption is detected to exceed a preset power consumption threshold.

[0060] The processing module is also used to reduce the filtering intensity and restore the parameter storage precision to the high-precision format when the memory occupancy rate is detected to be lower than the preset memory threshold and the power consumption is lower than the preset power consumption threshold, so as to increase the computational intensity of parameter updates.

[0061] In a third aspect, this application provides an electronic device, including: a processor, and a memory communicatively connected to the processor;

[0062] The memory stores computer-executable instructions;

[0063] The processor executes the computer-executable instructions stored in the memory to implement the industrial model fine-tuning and inference method as described in the first aspect and various possible implementations of the first aspect above.

[0064] In a fourth aspect, this application provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the industrial model fine-tuning and inference method as described in the first aspect and various possible implementations of the first aspect.

[0065] In a fifth aspect, this application provides a program product, including a computer program, which, when executed by a processor, implements the industrial model fine-tuning and inference method described above.

[0066] The industrial model fine-tuning and inference method, apparatus, and electronic device provided in this application are applicable to industrial edge computing scenarios. The method acquires multi-source heterogeneous data collected by industrial edge devices and performs differentiated feature filtering based on data type to eliminate redundant information. It establishes a mapping relationship between the current process parameters and the functions of the internal blocks of the pre-trained model, then targets and updates parameters only for the "model active blocks" strongly correlated with the current process, significantly reducing the number of updated parameters and the memory usage, and obtaining a target model adapted to the current production state. Based on the target model, it performs inference on industrial data and outputs industrial quality inspection results, equipment fault prediction results, or process parameter optimization suggestions to guide production line operation. This addresses the high computational load of traditional full fine-tuning and the loss of key features caused by generic data processing, so that the model maintains high detection accuracy even with frequent process changes while significantly reducing the memory and power consumption of edge devices, achieving real-time, accurate monitoring and closed-loop control of industrial production lines.

Brief Description of the Drawings

[0067] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.

[0068] Figure 1 is a flowchart of an industrial model fine-tuning and inference method provided in this application;

[0069] Figure 2 is a flowchart of an industrial model fine-tuning and inference method provided in this application;

[0070] Figure 3 is a flowchart of an industrial model fine-tuning and inference method provided in this application;

[0071] Figure 4 is a flowchart of an industrial model fine-tuning and inference method provided in this application;

[0072] Figure 5 is a flowchart of an industrial model fine-tuning and inference method provided in this application;

[0073] Figure 6 is a schematic diagram of the structure of an industrial model fine-tuning and inference apparatus provided in this application;

[0074] Figure 7 is a schematic diagram of the structure of an electronic device provided in this application.

[0075] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to specific embodiments.

Detailed Implementation

[0076] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.

[0077] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of the relevant data all comply with the relevant laws, regulations, and standards of the relevant regions, have taken necessary confidentiality measures, do not violate public order and good morals, and provide corresponding operation portals for users to choose to authorize or refuse.

[0078] With the deepening of industrial and intelligent manufacturing, deep learning models are widely used in industrial quality inspection, equipment failure prediction, process optimization, and other scenarios. The demand for automation, intelligence, and real-time performance in industrial production is increasing, and pre-trained large models, fine-tuned to adapt to specific industrial tasks, have become a key means of improving production efficiency.

[0079] Modern industrial production lines are often highly complex and dynamic. For example, the manufacturing mode of multiple varieties and small batches leads to frequent changes in product models and real-time adjustments to processing parameters. This requires models deployed at the edge of the production line to have strong adaptability, be able to process heterogeneous data from multiple sources such as vision cameras (images), vibration and temperature sensors (time-series data), and operation logs and maintenance records (text data), and complete a closed loop from data perception to decision output to ensure continuous and stable high-quality production.

[0080] In existing industrial data processing and model application solutions, when faced with multi-source heterogeneous data of different types, such as images, time series, and text, a unified data preprocessing process or a general feature extraction strategy is usually adopted for standardized processing. This ignores the differences in physical characteristics and spatiotemporal distribution of various types of data, and fails to design differentiated feature extraction strategies for their specific characteristics. For example, existing technical solutions often simply concatenate data of different modalities or encode them in a "one-size-fits-all" manner using uniform convolutional kernels and fully connected layers, failing to fully consider the spatial locality of defect areas in image data, the temporal correlation of process cycles in time series data, and the semantic sparsity of technical terms in text data.

[0081] However, existing technologies fail to design differentiated feature extraction strategies for multi-source industrial data such as images, time series, and text, resulting in insufficient extraction of core features. A large amount of redundant information interferes with the model's capture of key process features, thereby affecting the accuracy and efficiency of subsequent tasks such as quality inspection and fault prediction, making it difficult to meet the requirements of high-end intelligent manufacturing for extreme accuracy and real-time response.

[0082] To address the aforementioned issues, this application provides an industrial model fine-tuning and inference method, apparatus, and electronic device. By constructing a mapping relationship between process parameters and model layer block functions, it accurately locates the active blocks of the model to be updated. Furthermore, it employs differentiated feature selection strategies to extract core features for different types of data, such as images, time series, and text. This enables efficient local fine-tuning and real-time inference of the model on resource-constrained industrial edge devices. This application can be widely applied to industrial quality inspection, equipment fault prediction, and process parameter optimization scenarios involving multi-source data fusion, and is particularly suitable for intelligent manufacturing production lines with strict requirements for real-time performance, detection accuracy, and equipment resource consumption. The executing entity of this application can be an industrial edge computing gateway, an embedded industrial control computer, or a smart sensor with computing capabilities; this application does not limit this to any particular type.

[0083] It should be noted that the industrial model fine-tuning and inference method, device and electronic equipment provided in this application can be used in the field of industrial artificial intelligence technology, or in any field other than industrial artificial intelligence. This application does not limit the application field of the industrial model fine-tuning and inference method, device and electronic equipment.

[0084] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will now be described with reference to the accompanying drawings.

[0085] Figure 1 is a flowchart of an industrial model fine-tuning and inference method provided in an embodiment of this application. As shown in Figure 1, the industrial model fine-tuning and inference method provided in this embodiment includes:

[0086] S101. Acquire multi-source heterogeneous data from industrial edge devices, perform feature filtering on the multi-source heterogeneous data, and obtain feature data that characterizes the multi-source heterogeneous data.

[0087] It will be understood that industrial edge devices are computing devices deployed at the industrial production site, i.e., at the source of the data, such as robot controllers, embedded industrial PCs, and edge computing gateways. Industrial edge devices have a certain amount of computing power and can process industrial data in real time, but their resources (memory, computing power, and power budget) are relatively limited.

[0088] Multi-source heterogeneous data refers to multiple types of data that exist simultaneously in an industrial scenario, such as: product surface image data captured by industrial cameras (e.g., defect images, assembly images); equipment operation data monitored in real time by sensors (e.g., time-series signals such as temperature, vibration, and current); and process text data recorded by the production management system (e.g., process parameter documents, operation logs, quality inspection reports, etc.).

[0089] Feature filtering refers to the process of identifying and extracting core information relevant to the current industrial task from raw, multi-source, heterogeneous data, while removing redundant and noisy data. It aims to retain key features that characterize product quality, equipment status, or process characteristics, providing high-quality input data for subsequent model fine-tuning.

[0090] During the operation of industrial production lines, industrial edge devices acquire multi-source heterogeneous data in real time through various data acquisition interfaces (such as industrial cameras, sensor data acquisition cards, industrial Ethernet, etc.). Multi-source heterogeneous data contains a large amount of redundant information and environmental noise, such as background areas in image data, stationary segments in time series data, and irrelevant descriptions in text data. If directly used for model fine-tuning, it will not only increase the computational burden but may also interfere with the model's capture of key features.

[0091] Feature filtering is performed on the acquired multi-source heterogeneous data. Specifically, corresponding filtering strategies are adopted for different types of data: for image data, key areas that may contain defects are identified and retained; for time-series data, specific time windows related to the process cycle are focused on; for text data, keywords describing process parameters and their context are extracted. Through feature filtering, core feature data that can characterize product quality, equipment status, or process characteristics are extracted from massive amounts of raw data, serving as input for subsequent model fine-tuning.

[0092] Performing feature filtering on multi-source heterogeneous data reduces the amount of computation in the subsequent model fine-tuning process, which alleviates the resource burden on industrial edge devices. It also prevents redundant information from interfering with model training, laying the foundation for improving the accuracy of tasks such as industrial quality inspection and fault prediction.

[0093] S102. Establish the mapping relationship between the current process parameters and the functional attributes of each layer block of the pre-trained model. Based on the mapping relationship, determine the active block of the model to be updated, and update the parameters of the active block of the model based on the feature data to obtain the target model that adapts to the current process parameters.

[0094] Understandably, process parameters refer to variables used to control product quality and production processes during industrial production, including but not limited to processing temperature, pressure, production line speed, and equipment operating thresholds.

[0095] A pre-trained model refers to a deep learning model that has been pre-trained on a large-scale general dataset and possesses powerful general feature extraction capabilities. Layer block functional attributes refer to the roles and functions of each layer block in the feature extraction process. Different layer blocks may be responsible for capturing features of different granularities.

[0096] Active model blocks refer to the layers selected from the pre-trained model that are most relevant to the current process task. Active model blocks are activated and their parameters are updated during fine-tuning.

[0097] The target model refers to the industrial model obtained after fine-tuning by this method, which is adapted to the current specific process parameters and can perform accurate reasoning for the current production task.

[0098] One possible implementation involves establishing a mapping relationship between the current process parameters and the functional attributes of each layer of the pre-trained model after acquiring feature data. Specifically, the current process parameters are characterized, and the functional characteristics of each layer of the pre-trained model are quantitatively analyzed to calculate the correlation between the two, thus constructing a mapping relationship between the process parameters and the model layers. Based on this mapping relationship, layers with a high correlation to the current process parameters are selected from the pre-trained model and identified as "active model blocks" to be updated. Based on the feature data, parameters are updated only for the selected active model blocks. Specifically, the parameters of the remaining layers of the pre-trained model remain frozen, significantly reducing the number of parameters that need to be updated and the computational resource consumption. Through iterative optimization, the parameters of the active model blocks gradually adapt to the needs of the current process task, ultimately resulting in a target model that can serve the current production task.

[0099] By establishing a mapping relationship between process parameters and model layer block functions, and determining the active blocks of the model based on this mapping relationship, the process task requirements are aligned with the model structure. Updating parameters only for the active blocks reduces the number of parameters that need adjustment, which lowers computational resource consumption and memory usage during model fine-tuning. This enables resource-constrained industrial edge devices to complete model adaptation tasks efficiently. It also avoids the noise introduced by incorrectly adjusting irrelevant layers, improving the accuracy of the fine-tuned model on specific industrial tasks. The final target model retains the general knowledge of the pre-trained model while adapting to the requirements of the current process parameters.
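As a non-limiting illustration, the selection of active blocks and the frozen-parameter update described for S102 can be sketched as follows. The sketch assumes that the current process parameters and each layer block's functional attributes have already been encoded as vectors in a common space, and it uses cosine similarity as the correlation measure; the encoding, the similarity measure, the `top_k` cut-off, and the function names `select_active_blocks` and `masked_update` are all assumptions made for illustration and are not specified by this paragraph.

```python
import numpy as np

def select_active_blocks(process_vec, block_attrs, top_k=2):
    """Return indices of the top_k blocks most similar to the process vector."""
    p = process_vec / np.linalg.norm(process_vec)
    b = block_attrs / np.linalg.norm(block_attrs, axis=1, keepdims=True)
    scores = b @ p  # cosine similarity between each block and the process vector
    active = np.argsort(scores)[::-1][:top_k]
    return sorted(active.tolist()), scores

def masked_update(params, grads, active, lr=0.01):
    """Apply one gradient step to active blocks only; other blocks stay frozen."""
    return [w - lr * g if i in active else w
            for i, (w, g) in enumerate(zip(params, grads))]

# Hypothetical 4-block model with a 3-dimensional attribute space.
process_vec = np.array([0.9, 0.1, 0.0])
block_attrs = np.array([
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.8, 0.3, 0.0],
    [0.0, 1.0, 0.0],
])
active, scores = select_active_blocks(process_vec, block_attrs, top_k=2)

# Placeholder parameters and gradients, one small array per block.
params = [np.ones(2) for _ in range(4)]
grads = [np.full(2, 0.5) for _ in range(4)]
new_params = masked_update(params, grads, active)
print(active)
```

In this toy configuration only the two blocks whose attribute vectors point in nearly the same direction as the process vector are updated, while the parameters of the other two blocks are returned unchanged.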

[0100] S103. Based on the target model, reason about industrial data to obtain at least one of the following: industrial quality inspection results, equipment failure prediction results, or process parameter optimization suggestions, so as to guide the operation of industrial production lines.

[0101] Understandably, industrial data refers to data collected in real time from industrial production lines that requires inference and analysis, including but not limited to product images of the current production batch, real-time equipment operating status data, and current process parameter records.

[0102] Industrial quality inspection results refer to the quality inspection conclusions output by the target model after analyzing the product image, including information such as whether the product has defects, the type of defects (such as scratches, dirt, deformation), the coordinates of the defect location, and the severity level of the defects.

[0103] Equipment failure prediction results refer to the equipment status assessment output by the target model after analyzing the time-series data of the equipment sensors, including information such as the current health status of the equipment, the prediction of remaining service life, and the types of potential failures.

[0104] Process parameter optimization suggestions refer to process adjustment suggestions output by the target model after comprehensive analysis of current production data. These suggestions include the direction, magnitude, and expected optimization effects of key process parameters, and are used to guide the dynamic optimization of production line process parameters.

[0105] One possible implementation involves industrial edge devices continuously acquiring real-time industrial data from the production line, including images of the current batch of products, sensor data from equipment operation, and records of current process parameters. This real-time industrial data is then input into a target model. Based on its feature extraction capabilities learned during fine-tuning, the target model performs calculations on the input data and outputs corresponding inference results.

[0106] Specifically, the inference results output by the target model can take different forms: in quality inspection scenarios, the model outputs product defect detection results; in equipment predictive maintenance scenarios, the model outputs equipment fault prediction results; and in process optimization scenarios, the model outputs process parameter optimization suggestions. In some embodiments, the inference results are fed back to the production line control system or maintenance personnel in real time through a human-machine interface. Based on the inference results output by the target model, the production line control system or maintenance personnel take corresponding control measures or make adjustment decisions. For example, defective products are rejected based on quality inspection results; equipment maintenance is scheduled in advance based on fault prediction results; and production parameters are dynamically adjusted based on process optimization suggestions, thereby providing guidance for the entire production line operation process.

[0107] This embodiment provides an industrial model fine-tuning and inference method. This method acquires multi-source heterogeneous data collected from industrial edge devices and performs differentiated feature filtering based on data type to eliminate redundant information. It establishes a mapping relationship between current process parameters and the functions of internal blocks in the pre-trained model, locking and updating parameters only for "model active blocks" strongly related to the current process, significantly reducing the number of parameters and memory usage, and obtaining a target model adapted to the current production state. Based on the target model, it performs inference on industrial data, outputting industrial quality inspection results, equipment fault prediction results, or process parameter optimization suggestions, accurately guiding production line operation. This method solves the problems of high computational load and loss of key features due to general data processing in traditional full-scale fine-tuning, ensuring that the model maintains high-precision detection capabilities even with frequent process changes, while significantly reducing the memory and power consumption of edge devices, achieving real-time, accurate monitoring and closed-loop control of industrial production lines.

[0108] Figure 2 is a flowchart of an industrial model fine-tuning and inference method provided in an embodiment of this application. As shown in Figure 2, building on the embodiment of Figure 1, the industrial model fine-tuning and inference method is described in detail and includes:

[0109] S201. Obtain multi-source heterogeneous data from industrial edge devices, extract the dimensional features and statistical features of the multi-source heterogeneous data, and construct feature vectors.

[0110] Understandably, dimensional features refer to metadata features that describe the physical form and structural attributes of data. The specific meaning of dimensional features differs for different modalities of data. For example, image data is typically a 4-dimensional tensor (N, C, H, W), where N is the batch size (number of image samples), C is the number of channels (number of color channels in the image, each channel representing a color or feature component), H is the height (number of pixels in the vertical direction of the image, i.e., the number of rows in the image), and W is the width (number of pixels in the horizontal direction of the image, i.e., the number of columns in the image). Temporal data is typically a 3-dimensional tensor (N, T, D), where N is the batch size (number of data samples), T is the time step (number of time points contained in each temporal data segment, i.e., the sequence length of the sampling points), and D is the feature dimension (number of features collected at each time point). Text data is typically a 3-dimensional tensor (N, L, E), where N is the batch size (number of text samples input to the model), L is the sequence length (number of tokens contained in each text segment, i.e., the sequence length of the text), and E is the embedding dimension (the vector dimension of each token after embedding layer mapping).

[0111] Statistical features refer to characteristics that describe the numerical distribution of data, including pixel value range, mean, variance, time step, word embedding dimension, etc. For example, the pixel values of image data are usually distributed within [0, 255]; the sampling frequency and acquisition duration of time series data determine the time step; and the word embedding dimension of text data depends on the configuration of the pre-trained embedding model.

[0112] A feature vector F is a multidimensional vector composed of dimensional features and statistical features, F = [dim, range, stat], where dim represents the number of data dimensions, range represents the range of data values, and stat represents the mean and variance of the data. Statistical features and dimensional features together constitute the basis for determining data types.

[0113] During the operation of industrial production lines, industrial edge devices acquire multi-source heterogeneous data in real time through various data acquisition interfaces. This multi-source heterogeneous data may originate from different data sources and acquisition devices, and possess different formats and structures. To enable differentiated feature filtering strategies for different types of data, it is necessary to determine the data type.

[0114] A preliminary analysis is performed on the acquired multi-source heterogeneous data to extract its dimensional and statistical features. Dimensional features reflect the structural form of the data; for example, by examining the number of dimensions of a tensor, it can be determined whether the data is image data (4D), time-series data (3D), or text data (3D, but with specific ranges for sequence length and embedding dimension). Statistical features further characterize the numerical distribution of the data; for example, the pixel value range of image data is typically [0, 255], the number of sampling points in time-series data reflects the length of the time window, and the word embedding dimension of text data depends on the language model used. The extracted dimensional and statistical features are then fused to construct a feature vector. By combining the structural and numerical distribution information of the data, the multidimensional feature vector effectively characterizes the inherent characteristics of different data types, laying the foundation for subsequent data type identification.

[0115] One possible implementation of image data processing involves an industrial camera capturing images of the product surface at a resolution of 224×224 pixels with three RGB channels. The image tensor acquired by the edge device has shape [1, 3, 224, 224]. The dimensional feature dim = 4 is extracted, with pixel values in the range [0, 255]. The pixel mean $\mu$ is obtained by dividing the sum of all pixel values in the image by the total number of pixels, and the pixel variance $\sigma^2$ is the mean squared deviation from $\mu$, for example using the following formulas:

[0116] $$\mu = \frac{1}{C \cdot H \cdot W} \sum_{c=1}^{C} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{c,h,w}$$

[0117] $$\sigma^2 = \frac{1}{C \cdot H \cdot W} \sum_{c=1}^{C} \sum_{h=1}^{H} \sum_{w=1}^{W} \left(x_{c,h,w} - \mu\right)^2$$

[0118] where $\mu$ is the pixel mean; $x_{c,h,w}$ is the pixel value at each position in each channel; C is the number of channels (the number of color channels in the image, each channel representing a color or feature component); H is the height (the number of pixels in the vertical direction of the image, i.e., the number of rows in the image); W is the width (the number of pixels in the horizontal direction of the image, i.e., the number of columns in the image); and $\sigma^2$ is the pixel variance.

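[0119] For example, the pixel mean $\mu$ = 128.5 and the pixel variance $\sigma^2$ = 2450 are calculated from the current image data, meaning the feature vector of the image data is:

[0120] $$F = [\,\mathrm{dim} = 4,\ \mathrm{range} = [0, 255],\ \mathrm{stat} = (128.5,\ 2450)\,]$$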

[0121] One possible implementation of time-series data processing involves a vibration sensor acquiring equipment operation data at a sampling frequency of 1000 Hz with a sampling duration of 1 second, collecting 1000 time points each time. The time-series tensor acquired by the edge device has shape [1, 1000, 1]. The dimensional feature dim = 3 is extracted, with a time step T = 1000, and statistical calculations are performed on all sampling points in the time-series data. For example, the signal mean $\mu$ and the signal variance $\sigma^2$ are calculated using the following formulas:

[0122] $$\mu = \frac{1}{T} \sum_{t=1}^{T} x_t$$

[0123] $$\sigma^2 = \frac{1}{T} \sum_{t=1}^{T} \left(x_t - \mu\right)^2$$

[0124] where $\mu$ is the signal mean, reflecting the overall level of the vibration signal; $\sigma^2$ is the signal variance, reflecting the intensity of the vibration signal fluctuations; T is the time step; t is the index of the time point, ranging from 1 to T; and $x_t$ is the vibration amplitude collected at time point t.

[0125] For example, the signal mean $\mu$ = 0.02 and the signal variance $\sigma^2$ = 0.15 are calculated from the current time-series data, that is, the feature vector of the time-series data is:

[0126]

[0127] One possible implementation of text data processing involves the production line system uploading a process parameter document which, after word segmentation and embedding, yields 512 tokens with an embedding dimension of 768. The text tensor acquired by the edge device has shape [1, 512, 768]. The dimensional feature dim = 3, the sequence length L = 512, and the embedding dimension E = 768 are extracted, and statistical calculations are performed on all embedding vector elements in the text data. For example, the embedding mean $\mu$ and the embedding variance $\sigma^2$ are calculated using the following formulas:

[0128] $$\mu = \frac{1}{L \cdot E} \sum_{i=1}^{L} \sum_{j=1}^{E} e_{i,j}$$

[0129] $$\sigma^2 = \frac{1}{L \cdot E} \sum_{i=1}^{L} \sum_{j=1}^{E} \left(e_{i,j} - \mu\right)^2$$

[0130] where $\mu$ is the embedding mean, reflecting the overall distribution center of the entire text segment in the semantic space; $\sigma^2$ is the embedding variance, reflecting the semantic dispersion of each token embedding vector relative to that center; L is the sequence length; E is the embedding dimension; i is the token position index, ranging from 1 to L; j is the embedding dimension index, ranging from 1 to E; and $e_{i,j}$ is the embedding value of the token at position i on dimension j.

[0131] For example, the embedding mean $\mu$ = 0.01 and the embedding variance $\sigma^2$ = 0.08 are calculated from the current text data, meaning the feature vector of the text data is:

[0132]

[0133] By extracting dimensional and statistical features from multi-source heterogeneous data and constructing feature vectors, this method achieves quantitative characterization of industrial multi-source data, providing a unified input format for subsequent accurate data type identification. This method does not rely on specific data content, but only on the data's structure and statistical attributes for feature extraction. It features low computational overhead and is suitable for resource-constrained operating environments of industrial edge devices.
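As a non-limiting illustration, the feature vector construction of S201 can be sketched with NumPy as follows. The function name `build_feature_vector` and the dictionary layout are hypothetical; the tensor shape is carried alongside dim as an added convenience so that the time step or embedding dimension can be read off later, and the variance is the population variance (division by the number of elements).

```python
import numpy as np

def build_feature_vector(x):
    """Collect dimensional and statistical features of one data tensor."""
    x = np.asarray(x, dtype=np.float64)
    return {
        "dim": x.ndim,                                # number of tensor dimensions
        "shape": x.shape,                             # assumption: kept for T, L, E lookups
        "range": (float(x.min()), float(x.max())),    # value range
        "stat": (float(x.mean()), float(x.var())),    # mean and population variance
    }

# Synthetic stand-ins for the three modalities described above.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(1, 3, 224, 224))   # (N, C, H, W)
signal = np.zeros((1, 1000, 1))                       # (N, T, D)
text = rng.normal(0.0, 0.3, size=(1, 512, 768))       # (N, L, E)

fv_img = build_feature_vector(image)
fv_sig = build_feature_vector(signal)
fv_txt = build_feature_vector(text)
print(fv_img["dim"], fv_sig["dim"], fv_txt["dim"])
```

Only a handful of reductions over the tensor are needed, which is in keeping with the low computational overhead noted above.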

[0134] S202. Match feature vectors using preset rules to determine the data type identifiers corresponding to multi-source heterogeneous data.

[0135] Understandably, preset rules refer to pre-defined data type discrimination rules used to map feature vectors to corresponding data types. One possible implementation is that the preset rules are represented by a lightweight classifier, which is pre-trained based on a large number of multi-source industrial data samples and can output the corresponding data type identifier based on the input feature vector.

[0136] Data type identifiers are symbols used to distinguish different industrial data types, typically including image data identifiers (such as "image" or "I"), time-series data identifiers (such as "time-series" or "S"), and text data identifiers (such as "text" or "T"). Data type identifiers are used to subsequently invoke the corresponding feature filtering rules.

[0137] One possible implementation involves inputting the feature vector F constructed in step S201 into a pre-trained classifier. The classifier is trained on a large number of multi-source industrial data samples and has learned the distribution patterns of different data types in the feature space. After receiving the feature vector F, the classifier outputs the corresponding data type identifier through its internal computational logic (such as linear weighted summation and activation function mapping in logistic regression).

[0138] For example, when the input feature vector F has features such as dim=4 and range=[0, 255], the classifier outputs an image data identifier; when dim=3 and the time step T conforms to the characteristics of time-series data, the classifier outputs a time-series data identifier; when dim=3 and the embedding dimension E conforms to the characteristics of a language model, the classifier outputs a text data identifier. Through the classifier, the specific type of the currently processed multi-source heterogeneous data can be automatically identified, providing a basis for calling the corresponding feature selection rules in subsequent steps.

[0139] Specifically, a logistic regression classifier is pre-trained for data type discrimination. The training dataset for this classifier contains 100,000 labeled industrial data samples covering three types: images, time series, and text. Optionally, when processing a batch of product surface images, the feature vector F constructed in step S201 is input into the classifier, which calculates and outputs the data type identifier "image" (or "I"), indicating that the current data is image data;

[0140] Optionally, when processing a batch of vibration sensor data, the feature vector F constructed in step S201 is input into the classifier, which calculates and outputs the data type identifier "time-series" (or "S"), indicating that the current data is time-series data;

[0141] Optionally, when processing a batch of process parameter documents, the feature vector F constructed in step S201 is input into the classifier, which calculates and outputs the data type identifier "text" (or "T"), indicating that the current data is text data.

[0142] By matching feature vectors using pre-defined rules to determine data type identifiers, this approach fully leverages the dimensional, value range, and statistical characteristics of the data, comprehensively considering multiple aspects of the data and achieving higher accuracy. The pre-defined rules are based on a large number of training samples, adapting to the distribution characteristics of different types of industrial data and maintaining stable discrimination performance even when encountering new data samples. The discrimination process requires no manual intervention, making it suitable for the real-time data processing needs of industrial edge devices and providing accurate data type criteria for subsequent differentiated feature selection.
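As a non-limiting illustration, the matching of S202 can be sketched with hand-written rules standing in for the pre-trained classifier. This is a deliberate simplification: the embodiment describes a trained logistic regression classifier, whereas the sketch below hard-codes the decision logic. The function name `identify_data_type` and the set of embedding dimensions treated as typical of language models are assumptions made for illustration.

```python
def identify_data_type(dim, shape, value_range,
                       embedding_dims=(256, 384, 512, 768, 1024)):
    """Map structural features to a data type identifier: "I", "S", or "T"."""
    lo, hi = value_range
    # 4-D tensors with pixel-like values are treated as image data.
    if dim == 4 and lo >= 0 and hi <= 255:
        return "I"
    # 3-D tensors whose last axis looks like an embedding size are text data.
    if dim == 3 and shape[-1] in embedding_dims:
        return "T"
    # Remaining 3-D tensors are treated as time-series data.
    if dim == 3:
        return "S"
    return "unknown"

print(identify_data_type(4, (1, 3, 224, 224), (0, 255)))    # image tensor
print(identify_data_type(3, (1, 1000, 1), (-1.0, 1.0)))     # vibration signal
print(identify_data_type(3, (1, 512, 768), (-3.0, 3.0)))    # embedded document
```

A trained classifier would replace these fixed thresholds with learned decision boundaries, which is what allows it to remain stable on samples that the rules above would misclassify.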

[0143] S203. If the data type is identified as image data, a mask matrix is generated based on the preset defect area coordinate library, and the original image pixels are weighted based on the mask matrix, retaining the pixel features of the area covered by the mask matrix as image feature data.

[0144] Understandably, a defect area coordinate library refers to a pre-established database that stores historical product defect location information. The coordinate library is generated based on the annotation results of a large amount of historical quality inspection data, recording the coordinate regions of various product defects in images. Each record in the coordinate library contains information such as defect type, coordinate range (e.g., upper left and lower right corner coordinates), and confidence level.

[0145] A mask matrix is a binary matrix or weight matrix with the same dimensions as the original image, used to identify the importance of each pixel in the image. The dimensions of the mask matrix are the same as the dimensions of the image activation values, represented as [N, C, H, W] (N is the batch size, C is the number of channels, H is the height, and W is the width). Each element in the mask matrix corresponds to a pixel in the original image, and its value indicates whether the pixel belongs to a defective region that needs to be preserved.

[0146] Weighted processing refers to the operation of multiplying the pixels of the original image element-wise using a mask matrix to enhance defective areas and suppress background areas. Specifically, the pixel values of the original image are multiplied by the corresponding elements of the mask matrix, so that only pixels in defective areas are retained.

[0147] One possible implementation is that when step S202 determines that the data type being processed is image data, a differential feature filtering process for the image data is initiated. This involves utilizing prior knowledge of defect locations in industrial scenarios to accurately extract key areas related to product quality from the image, while suppressing redundant information such as background in the image.

[0148] Specifically, a pre-defined defect area coordinate library is invoked. This library is accumulated during historical production processes, such as through manually annotated historical defect samples, recording the location range of various product defects in images. The data in the coordinate library can be dynamically updated according to adjustments in production processes to ensure it always adapts to the current production task.

[0149] For the current batch of input images, a mask matrix M with the same dimensions as the image tensor is generated based on the defect coordinate ranges recorded in the coordinate library. The generation rule for the mask matrix is as follows: for pixel positions within defect regions identified in the coordinate library, the corresponding element in the mask matrix is set to 1 (or a larger weight value); for non-defect regions, it is set to 0 (or a smaller weight value). The original image tensor x is multiplied element-wise by the mask matrix M, and then multiplied by the Bernoulli sampling variable Z to obtain the filtered image feature data. The calculation is as follows:

[0150] $$\tilde{x}^{I}_{c,h,w} = x_{c,h,w} \cdot M_{c,h,w} \cdot Z^{I}_{c,h,w}$$

[0151] where $\tilde{x}^{I}_{c,h,w}$ is the image feature data obtained after filtering, i.e., the final output pixel value; $x_{c,h,w}$ is the pixel value of the original image at channel c, height h, and width w; $M_{c,h,w}$ is the element of the mask matrix generated from the defect area coordinate library, with a value of 0 or 1; $Z^{I}_{c,h,w}$ is a Bernoulli sampling variable taking the value 0 or 1, used to introduce random sparsity; and the superscript I is the identifier of the image data type.

[0152] The generated mask matrix is used to weight the pixels of the original image. This can be achieved, for example, through element-wise multiplication. Specifically, each pixel value in the original image tensor is multiplied by its corresponding element in the mask matrix. For defect region pixels with a mask value of 1, their pixel values are fully preserved; for background region pixels with a mask value of 0, their pixel values are set to zero, thus eliminating redundant information. The resulting image tensor after weighting is the image feature data after feature filtering. This data retains the information from the regions in the original image most relevant to the defect detection task, while significantly reducing the amount of data required for subsequent processing.

[0153] In one specific example, a production line uses industrial cameras to capture images of the surface of mobile phone casings in order to detect scratch defects. During the production line commissioning phase, process engineers collect 1000 historical sample images containing scratches and manually label the scratched area in each image, recording its coordinate range. Statistical analysis shows that scratch defects mainly appear at the four corners of the casing and around the center logo. The coordinate library stores records in the following format:

[0154] Defect type: Scratches;

[0155] Coordinate range: [(100,150,200,250),(300,50,400,150),...] (corresponding to the coordinates of the upper left and lower right corners of the scratch, respectively).

[0156] The current batch input is an RGB image with a resolution of 224×224 and a tensor shape of [1, 3, 224, 224]. A mask matrix M is generated based on the coordinates of the defect areas recorded in the coordinate library. For the four corner areas and the area around the center logo, the corresponding elements in M are set to 1; for other background areas, they are set to 0. The mask matrix M also has shape [1, 3, 224, 224], and the three channels share the same mask values.

[0157] At the same time, a Bernoulli sampling variable Z is introduced to achieve random sparsity. The probability of retaining a pixel is set to $P(Z = 1) = 1 - \rho$, where $\rho$ is a preset sparsity ratio. In this embodiment, a feature-region sparsity ratio $\rho_{defect}$ = 0.1 is set for the defect region, meaning that pixels in the defect region have a 90% probability of being preserved; for the background region, a background sparsity ratio $\rho_{bg}$ = 0.8 is set, meaning that the sampling variable preserves background pixels with a probability of only 20%.

[0158] The original image tensor x is multiplied element-wise with the mask matrix M, and then multiplied by the Bernoulli sampling variable Z to obtain the filtered image feature data. For example, for a defective pixel at coordinates (120, 160), the original pixel value is (128, 128, 128), the mask value M=1, and the Bernoulli sampling variable Z=1 (retained), then the filtered pixel value remains (128, 128, 128); for a background pixel at coordinates (50, 50), the original pixel value is (200, 200, 200), and the mask value M=0, regardless of the value of the sampling variable, the filtered pixel value is (0, 0, 0).

[0159] After the above processing, the filtered image feature data is obtained. Its dimensions are still [1,3,224,224], but most background pixels have been set to zero, and only pixel information near the defect area is retained. The filtered image feature data will be used as input for subsequent model fine-tuning to train the defect detection model.

[0160] By leveraging prior defect knowledge accumulated in industrial scenarios, key regions related to product quality in images can be accurately located, avoiding wasting computational resources on large background areas and significantly improving feature extraction efficiency. The introduction of a Bernoulli sampling mechanism, while preserving core defect features, introduces random sparsity, enhancing the model's robustness to noise and preventing overfitting to fixed defect location patterns. The filtered image feature data retains the most critical defect region information while significantly reducing the amount of data, lowering memory usage and computational overhead during subsequent model fine-tuning, enabling resource-constrained industrial edge devices to operate efficiently.
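As a non-limiting illustration, the mask generation and Bernoulli sampling of S203 can be sketched with NumPy as follows. The sketch assumes that each coordinate record is an (x1, y1, x2, y2) box giving the upper left and lower right corners, that boxes are clipped to the image bounds, and that one Bernoulli draw is made per pixel position and shared by all channels; the function names `build_mask` and `filter_image` and these conventions are assumptions made for illustration.

```python
import numpy as np

def build_mask(shape, boxes):
    """Binary mask of `shape` (N, C, H, W); boxes are (x1, y1, x2, y2)."""
    n, c, h, w = shape
    mask = np.zeros(shape, dtype=np.float64)
    for x1, y1, x2, y2 in boxes:
        # Clip each box to the image bounds before filling it with ones.
        x1, x2 = max(0, x1), min(w, x2)
        y1, y2 = max(0, y1), min(h, y2)
        if x1 < x2 and y1 < y2:
            mask[:, :, y1:y2, x1:x2] = 1.0
    return mask

def filter_image(x, mask, rho_defect=0.1, rho_background=0.8, rng=None):
    """Element-wise x * M * Z with region-dependent retention probability."""
    rng = np.random.default_rng() if rng is None else rng
    keep_prob = np.where(mask > 0, 1.0 - rho_defect, 1.0 - rho_background)
    # One Bernoulli draw per pixel position, broadcast across channels.
    draws = rng.random((x.shape[0], 1, x.shape[2], x.shape[3]))
    z = draws < keep_prob[:, :1]
    return x * mask * z

# Uniform grey test image and the two example boxes from the coordinate record.
x = np.full((1, 3, 224, 224), 128.0)
mask = build_mask(x.shape, [(100, 150, 200, 250), (300, 50, 400, 150)])
# rho_defect=0.0 keeps every defect pixel, which makes the result deterministic.
out = filter_image(x, mask, rho_defect=0.0, rng=np.random.default_rng(0))
print(out[0, :, 160, 120], out[0, :, 50, 50])
```

With these conventions the second example box lies entirely outside a 224×224 frame and is discarded by the clipping step, so only the first box contributes to the mask. The output keeps the input shape, and background positions are zero regardless of the sampling variable because the mask value there is 0.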

[0161] S204. If the data type is identified as time series data, the time window length is determined based on the industrial process cycle, and the correlation between the data and process parameters within the time window is calculated. Time series segments with a correlation higher than the preset correlation threshold are retained as time series feature data.

[0162] Understandably, the industrial process cycle refers to the time required for a complete process cycle in industrial production, such as a molding cycle of an injection molding machine or an assembly cycle of an assembly robot.

[0163] The time window length refers to the length of the segment extracted from continuous time-series data for analysis. The window length ensures that each time window exactly covers a complete process cycle, facilitating the analysis of the correlation between data and process parameters.

[0164] Process parameters refer to the key control variables of the current production task, such as target temperature, pressure setpoint, and operating speed. Changes in process parameters are directly reflected in the time-series data collected by sensors.

[0165] Correlation refers to the degree of correlation between a certain time window in time series data and the current process parameters. The value range is [-1, 1]. The closer the absolute value is to 1, the stronger the correlation.

[0166] One possible implementation is to initiate a differential feature filtering process for time-series data when step S202 determines that the data being processed is time-series data. This involves utilizing industrial process cycle knowledge to extract the time segments most relevant to the current process parameters from continuously acquired sensor data, while suppressing redundant data unrelated to the process.

[0167] Specifically, the time window length $N_w$ is calculated from the industrial process cycle $T$ of the current production task and the sensor sampling frequency $f_s$. The calculation formula is:

[0168] $$N_w = \mathrm{round}(T \cdot f_s)$$

[0169] where $\mathrm{round}(\cdot)$ denotes a rounding operation, ensuring that the window length is an integer number of sampling points. For example, with a process cycle $T$ given in seconds and a sampling frequency $f_s$ given in Hz, the window length $N_w$ is expressed in sampling points. The window length ensures that each time window covers exactly one complete process cycle, facilitating subsequent analysis of the overall correlation between the data within the window and the process parameters.
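
The window-length calculation can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names `window_length` and `split_windows` are hypothetical, Python's built-in `round` stands in for the unspecified rounding operation, and the 2 s cycle and 1000 Hz sampling rate are the values implied by the motor vibration example later in this section. The second helper shows one way the resulting length could be used to cut a sample sequence into windows.

```python
def window_length(cycle_seconds: float, sampling_hz: float) -> int:
    """Number of samples covering one process cycle, rounded to an integer."""
    return round(cycle_seconds * sampling_hz)


def split_windows(samples, length, step=None):
    """Split a sequence into windows of `length` samples.

    step == length gives non-overlapping windows; step < length gives
    partially overlapping ones. A trailing incomplete window is dropped
    (an assumption; the source does not say how a remainder is handled).
    """
    step = length if step is None else step
    return [samples[i:i + length]
            for i in range(0, len(samples) - length + 1, step)]


n_w = window_length(2.0, 1000.0)                 # 2 s cycle sampled at 1000 Hz
windows = split_windows(list(range(10_000)), n_w)
half_overlap = split_windows(list(range(10_000)), n_w, step=n_w // 2)
```

With these inputs, 10 seconds of data yield five non-overlapping windows of 2000 samples each; passing a smaller `step` produces overlapping windows instead.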

[0170] The continuous time series data are divided, according to the time window length $N_w$, into several non-overlapping or partially overlapping time windows $W_1, W_2, \ldots$, each containing $N_w$ consecutive sampling points. For any given time window $W_i$, its correlation degree $C_i$ with the current process parameter $P$ is calculated. Specifically, the formula for calculating the correlation degree is:

[0171] $$C_i = \frac{\mathrm{Cov}(W_i, P)}{\sqrt{\mathrm{Var}(W_i)\,\mathrm{Var}(P)}}$$

[0172] where $\mathrm{Cov}(W_i, P)$ is the covariance between the data sequence of window $W_i$ and the process parameter $P$, reflecting the trend of their common change; $\mathrm{Var}(W_i)$ is the variance of the data within window $W_i$, reflecting the degree of fluctuation of the data within that window; and $\mathrm{Var}(P)$ is the variance of the process parameter $P$, reflecting the stability of the process parameter.

[0173] The correlation $C_i$ takes values in [-1, 1]. A positive value indicates a positive correlation, meaning the window data increases as the process parameter increases; a negative value indicates a negative correlation, meaning the window data decreases as the process parameter increases. The closer the absolute value is to 1, the stronger the correlation between the window data and the process parameter.
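
The correlation described above (covariance divided by the square root of the product of the two variances) can be sketched as follows. The sketch assumes that the process parameter is available as a sequence aligned sample by sample with the window, which the source does not state explicitly. The name `pearson` and the choice to return 0.0 when either sequence is constant are illustrative assumptions.

```python
import math


def pearson(xs, ps):
    """Cov(x, p) / sqrt(Var(x) * Var(p)) using population statistics.

    Returns 0.0 when either sequence has zero variance, because the ratio
    is undefined in that case (an assumption made for this sketch).
    """
    n = len(xs)
    if n == 0 or n != len(ps):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_p = sum(ps) / n
    cov = sum((x - mean_x) * (p - mean_p) for x, p in zip(xs, ps)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_p = sum((p - mean_p) ** 2 for p in ps) / n
    if var_x == 0 or var_p == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_p)


rising = pearson([1, 2, 3, 4], [2, 4, 6, 8])     # moves with the parameter
falling = pearson([1, 2, 3], [3, 2, 1])          # moves against the parameter
flat = pearson([5, 5, 5], [1, 2, 3])             # constant window data
```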

[0174] Based on the correlation $C_i$, the sparsity $s_i$ of each window is calculated. The formula for calculating sparsity is:

[0175] $$s_i = \max\big(0.1,\ \min(0.8,\ 1 - |C_i|)\big)$$

[0176] The above formula uses the max and min functions to strictly limit the sparsity to the interval [0.1, 0.8], avoiding excessive or insufficient sparsification. That is: when $|C_i|$ is close to 1 (highly correlated), $1 - |C_i|$ is close to 0 and the sparsity $s_i$ is close to 0.1, indicating that the data in this window is important and most data points should be retained; when $|C_i|$ is close to 0 (unrelated), $1 - |C_i|$ is close to 1 and the sparsity $s_i$ is close to 0.8, indicating that the data in this window has high redundancy and can be significantly sparsified, retaining only a small number of data points.

[0177] Based on the sparsity $s_i$ of each window, the data within the window are sampled randomly or periodically so that a proportion $1 - s_i$ of the sampling points is retained. Specifically, each sampling point within window $W_i$ is retained with probability $1 - s_i$. The retained sampling points in each window are then recombined in chronological order to form the filtered time-series feature data.
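
A minimal sketch of the sparsity and retention step follows. It assumes that the sparsity is 1 minus the absolute correlation, clamped to [0.1, 0.8], which matches the behaviour described above (near 0.1 for highly correlated windows, near 0.8 for unrelated ones). The function names and the use of a seeded `random.Random` are illustrative choices, and only the random-sampling variant is shown.

```python
import random


def sparsity(corr, low=0.1, high=0.8):
    """Clamp 1 - |corr| to the interval [low, high]."""
    return max(low, min(high, 1.0 - abs(corr)))


def thin_window(window, s, rng):
    """Keep each sample independently with probability 1 - s, in order."""
    return [x for x in window if rng.random() < 1.0 - s]


rng = random.Random(0)                       # seeded for reproducibility
window = list(range(2000))
kept = thin_window(window, 0.135, rng)       # expected to keep about 86.5%
```

Because the list comprehension walks the window in order, the retained points stay in chronological order without a separate sort.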

[0178] Specifically, a factory uses vibration sensors to monitor the operating status of motors, with a sampling frequency of 1000 Hz and a motor operating cycle of 2 seconds, meaning one complete rotation cycle is completed every 2 seconds. The length of the time window calculated using the formula is 2 × 1000 = 2000 consecutive sampling points, covering a complete motor rotation cycle.

[0179] Vibration data was continuously collected for 10 seconds, totaling 10,000 sampling points. With a window length of 2000 sampling points, the data are divided into 5 non-overlapping time windows. The current process parameter P is the motor speed setpoint, set to 1500 rpm. Using the aforementioned correlation calculation formula, the correlation between each window and the process parameter P is calculated, and the results are as follows:

[0180] Window 1 is highly correlated with the speed, corresponding to the stable operation stage of the motor;

[0181] Window 2 is also highly correlated;

[0182] Window 3 has a weak correlation, corresponding to the period when the sensor is interfered with;

[0183] Window 4 has a high correlation;

[0184] Window 5 has a weak correlation, corresponding to the period when the motor is idling.

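[0185] The sparsity of each window is then calculated from its correlation using the sparsity formula. The calculation yields:

[0186] for window 1, a sparsity of 0.135;

[0187] for window 2, a sparsity of 0.184;

[0188] for window 3, a sparsity of 0.695;

[0189] for window 4, a sparsity of 0.156;

[0190] for window 5, a sparsity of 0.716.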

[0191] Data is retained based on the sparsity of each window, i.e.:

[0192] for window 1, the sparsity is 0.135, meaning that 86.5% of the sampling points are retained, approximately 1730 sampling points;

[0193] for window 2, the sparsity is 0.184, meaning that 81.6% of the sampling points are retained, approximately 1632 sampling points;

[0194] for window 3, the sparsity is 0.695, meaning that 30.5% of the sampling points are retained, approximately 610 sampling points;

[0195] for window 4, the sparsity is 0.156, meaning that 84.4% of the sampling points are retained, approximately 1688 sampling points;

[0196] for window 5, the sparsity is 0.716, meaning that 28.4% of the sampling points are retained, approximately 568 sampling points.

[0197] The sampling points retained in each window are combined in chronological order to obtain time-series feature data with a total length of approximately 1730 + 1632 + 610 + 1688 + 568 = 6228 sampling points. Compared to the original 10,000 sampling points, the data volume is reduced by about 37.7%, while retaining the key time segments most relevant to motor speed.
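
The arithmetic of this example can be checked with a few lines of Python. The sketch uses the expected retained count per window (window length times 1 minus sparsity); an actual random draw would scatter around these values. Variable names are illustrative.

```python
window_len = 2000
sparsities = [0.135, 0.184, 0.695, 0.156, 0.716]   # windows 1 to 5

# Expected number of retained samples per window.
retained = [round(window_len * (1 - s)) for s in sparsities]
total = sum(retained)
reduction = 1 - total / (window_len * len(sparsities))

print(retained, total, f"{reduction:.1%}")
```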

[0198] By quantifying the relevance of each time window to the current process task through correlation calculation, key time segments reflecting changes in process status can be automatically identified and retained, while suppressing redundant data unrelated to the current task, such as sensor noise and equipment idling. Based on the correlation, the sparsity rate is dynamically adjusted, ensuring that important windows retain more detail while significantly compressing unimportant windows. This maximizes data compression while maintaining the integrity of key information, significantly reducing the amount of data and computational overhead for subsequent model processing.

[0199] S205. If the data type is identified as text data, then the text content is matched based on the preset process keyword dictionary, and the matching keywords and their context vectors are extracted as text feature data.

[0200] Understandably, a process keyword dictionary refers to a pre-built dictionary containing core terms in the industrial field. The dictionary is generated by extracting keywords from historical process documents, operation manuals, quality inspection reports, and other textual materials, resulting in thousands of core process terms.

[0201] Context vectors refer to the embedding vectors corresponding to the keyword's location and the surrounding words within a certain range. The context range is usually preset according to the needs of the industrial task, for example, retaining the keyword and two to three words before and after it to capture the complete semantic environment of the keyword.

[0202] Text feature data refers to the text vector representations obtained after matching and extraction, which contain process keywords and their contextual information, and are used as input for subsequent model fine-tuning.

[0203] One possible implementation is to initiate a differential feature filtering process for text data when step S202 determines that the data being processed is text data. This process utilizes domain knowledge to extract key information most relevant to the current production task from the unstructured text data, while suppressing redundant text content unrelated to the process.

[0204] Specifically, a pre-defined process keyword dictionary D is loaded, and word segmentation is performed on the original text, dividing the continuous natural language text into a word sequence $w_1, w_2, \ldots, w_L$, where L is the sequence length. Each word is mapped to an E-dimensional embedding vector using a pre-trained embedding model, resulting in the text tensor $X$.

[0205] For each position j (j=1,2,…,L) in the word sequence, it is checked whether the word $w_j$ at that position exists in the process keyword dictionary D. Matching rules can include exact match, stemming match, or fuzzy match. The matching results generate a keyword position mask $m$:

[0206] $$m_j = \begin{cases} 1, & w_j \in D \\ 0, & w_j \notin D \end{cases}$$

[0207] The length of the mask vector $m$ is the same as the length L of the word sequence, where a value of 1 indicates that the word is a process keyword, and a value of 0 indicates that it is not a keyword.
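
The keyword position mask can be sketched with exact matching as follows. The token list is the segmented sentence from the heat-treatment example later in this section, and the dictionary here contains only the six sample entries quoted there, so it is a small stand-in for the full dictionary D. The function name `keyword_mask` is hypothetical, and stemming or fuzzy matching is not shown.

```python
def keyword_mask(tokens, dictionary):
    """Return 1 where the token is an exact dictionary match, else 0."""
    return [1 if tok in dictionary else 0 for tok in tokens]


# Stand-in dictionary: only the sample entries quoted in the text.
sample_dictionary = {"temperature", "pressure", "speed",
                     "heat treatment", "assembly", "calibration"}

tokens = ["heat treatment", "process", ":", "heating", "temperature", "850",
          "℃", ",", "holding", "time", "30", "minutes", ",", "cooling",
          "method", "oil cooling", "."]

mask = keyword_mask(tokens, sample_dictionary)
```

With this reduced dictionary only "heat treatment" and "temperature" match; a full dictionary would mark more positions.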

[0208] To further enhance the robustness of the model and control the amount of data, a Bernoulli sampling variable $b_j$ is introduced, and random sparsification is performed separately for keywords and non-keywords. The sparsification process can be represented as:

[0209] $$\tilde{X}_j = X_j \cdot b_j, \qquad b_j \sim \mathrm{Bernoulli}\big(m_j\, p_{kw} + (1 - m_j)\, p_{non}\big)$$

[0210] where $\tilde{X}_j$ is the text feature data obtained after filtering, $X_j$ is the embedding vector of the text tensor at position j, $m_j$ is the value of the keyword position mask at position j, and $b_j$ is the Bernoulli sampling variable, whose retention probability is $p_{kw}$ for positions where matching succeeds ($m_j = 1$) and $p_{non}$ for positions where matching fails ($m_j = 0$).

[0211] For example, the keyword sparsity can be set so that keywords have an 80% probability of being retained, and the non-keyword sparsity can be set so that non-keywords have only a 10% chance of being retained. This ensures that core keywords are retained with a high probability, while redundant text is significantly compressed.
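
The two-rate Bernoulli sampling can be sketched as follows, expressed directly in retention probabilities (0.8 for keywords and 0.1 for non-keywords, as in the example above). The function name, the parameter names, and the seeded generator are assumptions made for this illustration.

```python
import random


def bernoulli_keep(mask, keep_keyword, keep_other, rng):
    """Draw one 0/1 keep flag per position; the rate depends on the mask."""
    return [1 if rng.random() < (keep_keyword if m else keep_other) else 0
            for m in mask]


rng = random.Random(42)
mask = [1] * 1000 + [0] * 1000            # synthetic mask: 1000 of each kind
flags = bernoulli_keep(mask, 0.8, 0.1, rng)
kept_keywords = sum(flags[:1000])         # expected to be near 800
kept_others = sum(flags[1000:])           # expected to be near 100

# Degenerate rates: keep every keyword and drop everything else.
strict = bernoulli_keep([1, 0, 1, 0], 1.0, 0.0, random.Random(0))
```

Multiplying each embedding vector by its flag then zeroes out the dropped positions.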

[0212] After sparsification, the retained keyword positions are selected. For each retained keyword position j, its context vector is extracted. The context range is usually defined as the k words before and after the keyword position, for example, retaining the keyword and the embedding vectors of the two words before and after it. If the keyword is located at the text boundary, only the valid words within the boundary are retained. The embedding vectors of the keyword position j and the k positions before and after it are concatenated in sequence to form a (2k+1)×E-dimensional context representation. This representation includes both the semantic information of the keyword itself and its surrounding semantic environment, which can more accurately express the meaning of the keyword in the specific process description. All retained keyword context vectors are organized in the original text order to form the text feature data. If g keyword positions are retained in the original text, the dimension of the text feature data is [N, g, (2k+1)×E]. This feature data serves as the input for subsequent model fine-tuning and is used for industrial tasks such as process parameter understanding and fault diagnosis.
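
Context extraction around a retained keyword can be sketched as follows. The function name `context_vector` and the toy two-dimensional embeddings are illustrative. At a text boundary the sketch simply clips the range to valid positions, so the result is shorter than (2k+1)×E; whether the method pads such vectors to a fixed length is not stated in the source.

```python
def context_vector(embeddings, j, k):
    """Concatenate the embeddings at positions j-k .. j+k (0-based),
    clipped to the valid range of the sequence."""
    start = max(0, j - k)
    stop = min(len(embeddings), j + k + 1)
    flat = []
    for vec in embeddings[start:stop]:
        flat.extend(vec)
    return flat


# Toy sequence: 6 positions, embedding dimension E = 2.
embeddings = [[float(i), float(i) + 0.5] for i in range(6)]

middle = context_vector(embeddings, 3, 2)   # full window: 5 positions
edge = context_vector(embeddings, 0, 2)     # clipped at the left boundary
```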

[0213] Specifically, a factory needs to automatically analyze process parameter documents on its production line to extract key process information for model fine-tuning. 5000 historical process documents were pre-collected, and a dictionary D containing 3000 process keywords was constructed using an algorithm to extract core terms. Example entries in the dictionary include: "temperature," "pressure," "speed," "heat treatment," "assembly," and "calibration."

[0214] The input is a process parameter document with the content: "Heat treatment process: heating temperature 850℃, holding time 30 minutes, cooling method oil cooling." Word segmentation of this text yields the following sequence: ["heat treatment", "process", ":", "heating", "temperature", "850", "℃", ",", "holding", "time", "30", "minutes", ",", "cooling", "method", "oil cooling", "."], with a sequence length L=17.

[0215] Each term is mapped to a 768-dimensional embedding vector. Each term is traversed to determine whether it is in the process keyword dictionary D, yielding the keyword position mask. The keyword sparsity and the non-keyword sparsity are set, and Bernoulli sampling is performed for each position. In this example, after sparsification, all keywords are retained and all non-keywords are removed.

[0216] The context range is set to k = 2, that is, the keyword and the two word tokens before and after it are retained. Taking the word token "temperature" as an example, the two preceding word tokens are ":" and "heating", and the two following word tokens are "850" and "℃". The concatenated context vector contains embeddings at 5 positions, with a dimension of 5 × 768 = 3840.

[0217] For each retained keyword position, a 3840-dimensional context vector is extracted. The final dimension of the text feature data is [1, 8, 3840]. This feature data contains all the key terms and their semantic environments in the process document, while excluding irrelevant text and punctuation marks. In terms of data volume, the original text tensor has 17×768 = 13056 values. With context vectors, the feature data has 8×3840 = 30720 values, which is larger than the original because neighbouring context windows repeat embeddings; if only the keywords themselves are retained, it has 8×768 = 6144 values, a reduction of about 53%.
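
The shapes and value counts in this example can be verified with a short calculation. Variable names are illustrative.

```python
seq_len, emb_dim = 17, 768       # L and E in the example
k, kept_keywords = 2, 8          # context range and retained keyword count

context_dim = (2 * k + 1) * emb_dim
feature_shape = (1, kept_keywords, context_dim)

original_values = seq_len * emb_dim
context_values = kept_keywords * context_dim
keyword_only_values = kept_keywords * emb_dim

print(feature_shape, original_values, context_values, keyword_only_values)
```

The comparison shows that the context representation trades data volume for semantic context, while the keyword-only variant is the one that shrinks the input.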

[0218] Building the process keyword dictionary from industrial domain knowledge makes it possible to automatically identify the core terms most relevant to the current production task in the text, avoiding wasted computing resources on stop words such as "of", "is", and "in" or on common words irrelevant to the process, which significantly improves the pertinence of feature extraction. Introducing a keyword position mask and a random sparsification mechanism greatly compresses non-keywords while retaining core keywords, which not only ensures the integrity of key information but also effectively reduces the data volume and computational overhead. Keywords and non-keywords adopt different sparsity rates, so that core information is retained with a high probability and redundant information is effectively suppressed.

[0219] S206. Establish the mapping relationship between the process parameters at the current moment and the functional attributes of each layer block of the pre-trained model.

[0220] It can be understood that the process parameters at the current moment refer to the key control variables at the current operating moment of the industrial production line, such as temperature, pressure, tolerance, and speed, which reflect the specific requirements of the current production task.

[0221] The pre-trained model refers to a deep learning model pre-trained on a large-scale dataset. It is composed of multiple layer blocks, and each layer block is responsible for extracting features at a different granularity.

[0222] The functional attributes of the layer block refer to the functional characteristics of each layer block in the pre-trained model, including the input and output dimensions of the layer block, the receptive field size, and the activation value distribution, which reflect the role of the layer block in feature extraction.

[0223] The mapping relationship refers to the degree of association between the current process parameters and each layer block, which is used to measure the importance of each layer block to the current process task.

[0224] One possible approach involves establishing a mapping relationship between the current process parameters and each layer of the pre-trained model after acquiring feature data. This provides a basis for subsequently selecting active blocks of the model to be updated. Specifically, the process parameters at the current moment are quantified and transformed into a form that can be associated with model layers. Simultaneously, functional analysis is performed on each layer of the pre-trained model to extract attribute information that characterizes its functional positioning. Based on the quantified representation of the process parameters and the functional attributes of the layers, the degree of correlation between the two is calculated. Based on the calculated degree of correlation, a mapping relationship between the current process parameters and each layer is constructed. This mapping relationship reflects the importance of each layer to the current process task, laying the foundation for subsequently determining active blocks of the model to be updated.

[0225] Specifically, taking the task of detecting scratches on a mobile phone casing as an example, the current process parameters include the required detection accuracy and the production line speed. The pre-trained model contains multiple convolutional blocks. A mapping relationship is established between the current process parameters and each block. The scratch detection task focuses on minute defects on the product surface, and is highly correlated with shallow blocks responsible for extracting low-level features such as edges and textures, and less correlated with deep blocks responsible for extracting semantic features. This mapping relationship will be used to subsequently determine the active blocks of the model to be updated.

[0226] By establishing a mapping relationship between process parameters and model block functions, a basis is provided for selectively updating specific blocks in the future, avoiding blind adjustments to all parameters and improving the targeting and efficiency of model fine-tuning.

[0227] S207. Extract the functional feature vectors of each layer block in the pre-trained model, and extract the process semantic vector of the process parameters at the current moment.

[0228] S208. Calculate the cosine similarity between the functional feature vector and the process semantic vector, and mark the layer blocks with a cosine similarity greater than the preset similarity threshold as candidate active blocks.

[0229] Understandably, a functional feature vector is a vector obtained by quantizing the representation of each layer block in a pre-trained model, reflecting the functional characteristics of that layer block in feature extraction.

[0230] The process semantic vector is a vector obtained by vectorizing the process parameters at the current moment, representing the requirements of the current production task.

[0231] One possible implementation is to extract, for each layer block $l$ in the pre-trained model, a functional feature vector $F_l$ that integrates structural information (such as layer type and dimension) and functional positioning (such as shallow texture extraction and deep semantic abstraction) of the layer block. Simultaneously, a process semantic vector $S$ is extracted from the process parameters at the current moment: numerical parameters such as temperature and pressure are normalized and combined into vector form. The cosine similarity between the functional feature vector of each block and the process semantic vector is calculated using the following formula:

[0232] $$\mathrm{sim}(F_l, S) = \frac{F_l \cdot S}{\lVert F_l \rVert\, \lVert S \rVert}$$

[0233] Blocks with a cosine similarity greater than a preset similarity threshold are marked as candidate active blocks.

[0234] By calculating the cosine similarity between the functional feature vector and the process semantic vector, a quantitative assessment of the correlation between the layer block and the process task is achieved. Candidate active blocks are selected based on a preset threshold, ensuring relevance to the process task while effectively narrowing down the range of layers requiring further fine-tuning, thus improving the targeting and efficiency of model fine-tuning.
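
Candidate selection by cosine similarity can be sketched as follows. The block names, the two-dimensional vectors, and the 0.7 threshold are made-up values for illustration; the source does not specify how the functional feature vectors are encoded or what threshold is used. The sketch assumes both vectors share the same dimension.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def candidate_blocks(block_vectors, process_vector, threshold):
    """Names of blocks whose similarity exceeds the threshold."""
    return [name for name, vec in block_vectors.items()
            if cosine(vec, process_vector) > threshold]


# Hypothetical functional feature vectors for three layer blocks.
block_vectors = {"block1": [1.0, 0.0], "block2": [0.0, 1.0], "block3": [1.0, 1.0]}
process_vector = [1.0, 0.1]
candidates = candidate_blocks(block_vectors, process_vector, threshold=0.7)
```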

[0235] S209. Calculate the average gradient magnitude of each candidate active block during the historical fine-tuning process, and determine the preset number of candidate active blocks with the highest average gradient magnitude as the active blocks of the model to be updated.
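
Step S209 can be sketched as a top-k selection. The sketch assumes that a scalar gradient magnitude (for example, a gradient norm) was logged for each block at each historical fine-tuning step; the data layout, the function name, and the sample numbers are hypothetical.

```python
def select_active_blocks(grad_history, candidates, k):
    """Pick the k candidates with the highest mean gradient magnitude."""
    mean_grad = {
        name: sum(abs(g) for g in grad_history[name]) / len(grad_history[name])
        for name in candidates
    }
    ranked = sorted(candidates, key=lambda name: mean_grad[name], reverse=True)
    return ranked[:k]


# Hypothetical per-step gradient magnitudes from earlier fine-tuning runs.
grad_history = {
    "block1": [0.2, 0.4],
    "block3": [0.9, 0.7],
    "block5": [0.1, 0.1],
}
active = select_active_blocks(grad_history, ["block1", "block3", "block5"], k=2)
```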

[0236] S210. Update the parameters of the active block of the model based on the feature data to obtain the target model that adapts to the current process parameters.

[0237] Understandably, feature data refers to image, time series, or text feature data obtained after feature filtering processing.

[0238] Parameter update refers to iteratively optimizing the trainable parameters in the active block of the model so that the model can better adapt to the current process task.

[0239] The target model refers to the industrial model obtained after fine-tuning using this method, which is adapted to the specific process parameters and can make accurate inferences for the current production task.

[0240] In one embodiment, all original weight parameters in the pre-trained model are kept frozen, and a trainable low-rank parameter matrix is introduced only into the active blocks of the model to be updated. The number of parameters in the low-rank matrix is much smaller than that of the original weights, which can significantly reduce computational overhead while maintaining model capacity. Feature data is input into the pre-trained model for forward propagation to obtain the model's predicted output for the current industrial task. A loss function is constructed based on the difference between the predicted output and the true label, which quantifies the model's performance on the current task. The contribution of each element in the low-rank parameter matrix to the prediction error is calculated according to the loss function, and the adjustment direction and magnitude of each element are determined accordingly. Only the elements in these low-rank parameter matrices are iteratively updated, while the original weights remain unchanged. After iteration, the updated low-rank parameter matrix is combined with the corresponding original weight parameters to obtain the updated model active block parameters. All updated active blocks and the unupdated layers in the original model together constitute the target model adapted to the current process parameters.

[0241] By introducing low-rank parameters for updating only a small number of active blocks, the number of trainable parameters and computational resource consumption are significantly reduced, enabling resource-constrained industrial edge devices to quickly complete model fine-tuning.

[0242] In some embodiments, during the process of updating the parameters of the active blocks of the model based on feature data, the computational intensity of parameter updates is adjusted according to the memory usage and power consumption of the industrial edge devices.

[0243] Understandably, memory utilization refers to the ratio of currently used memory to the maximum memory of an industrial edge device, reflecting the degree of scarcity of the device's memory resources.

[0244] Power consumption refers to the real-time power consumption of industrial edge devices under current operating conditions, reflecting the energy consumption of the devices.

[0245] Computational intensity refers to the amount of computation performed per unit time during parameter updates. In this embodiment, computational intensity is dynamically controlled by adjusting feature filtering intensity, parameter storage precision, and model layer block freezing ratio.

[0246] One possible implementation involves collecting memory usage and power consumption data from industrial edge devices at a fixed frequency during parameter updates. For example, memory usage could be obtained by calling the GPU / CPU's memory read interface, and power consumption could be read through the device's power management interface.

[0247] Based on the collected memory usage and power consumption, determine whether the current device resource status exceeds the safety threshold:

[0248] If the memory usage rate is greater than the memory warning threshold, it indicates that memory resources are scarce and memory usage needs to be reduced.

[0249] If the power consumption is greater than the power consumption warning threshold, it means that the power consumption is close to the upper limit and the computing power consumption needs to be reduced.

[0250] If both memory usage and power consumption exceed the corresponding warning threshold or the weighted sum exceeds the comprehensive threshold, collaborative optimization is triggered.

[0251] Based on the triggered adjustment conditions, the corresponding resource adjustment strategy is executed. By monitoring the device status in real time and dynamically adjusting the computing intensity during parameter updates, adaptive balance control between memory and power consumption is achieved. The entire adjustment process is fully automated, requiring no manual intervention, adapting to the complex and ever-changing operating environment of industrial edge devices, and improving robustness and reliability.
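
The threshold logic in the preceding paragraphs can be sketched as a small decision function. All numeric thresholds and weights below are placeholders, and the sketch assumes that power consumption has been normalized to a ratio of the device's rated maximum so that it can be combined with memory usage in a weighted sum. The returned labels are hypothetical names for the three adjustment cases.

```python
def choose_adjustment(mem_ratio, power_ratio,
                      mem_warn=0.85, power_warn=0.90,
                      weights=(0.5, 0.5), combined_warn=0.85):
    """Map the current resource state to an adjustment strategy label."""
    over_mem = mem_ratio > mem_warn
    over_power = power_ratio > power_warn
    combined = weights[0] * mem_ratio + weights[1] * power_ratio
    if (over_mem and over_power) or combined > combined_warn:
        return "collaborative"       # both resources are under pressure
    if over_mem:
        return "reduce_memory"
    if over_power:
        return "reduce_power"
    return "none"


states = [(0.50, 0.50), (0.90, 0.50), (0.50, 0.95), (0.90, 0.95), (0.84, 0.89)]
decisions = [choose_adjustment(m, p) for m, p in states]
```

The last state exceeds neither individual threshold but still triggers collaborative optimization through the weighted sum.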

[0252] S211. Based on the target model, perform inference on industrial data to obtain at least one of the following: industrial quality inspection results, equipment failure prediction results, or process parameter optimization suggestions, so as to guide the operation of industrial production lines.

[0253] Step S211 is similar to step S103 above, and will not be repeated here.

[0254] This embodiment provides an industrial model fine-tuning and inference method. This method accurately identifies data types by performing dimensional and statistical feature analysis on multi-source heterogeneous data collected from industrial edge devices, and executes differentiated feature selection strategies accordingly. This eliminates redundant noise at the source while retaining core process features. A mapping relationship is constructed between the semantics of current process parameters and the functional attributes of pre-trained model blocks. Combined with historical gradient contribution, the method identifies the "active blocks" that need updating, iterates parameters only on the "active blocks," and freezes the rest, obtaining a target model adapted to the current production state. This target model is then used to infer from real-time industrial data to obtain quality inspection conclusions, fault warnings, or optimization strategies. This effectively solves the problems of key features being overwhelmed by noise in traditional general data processing, and the computational overload and response lag of full-scale fine-tuning in resource-constrained scenarios. It improves the model's feature capture capability for multi-source heterogeneous data and its adaptation speed to process changes, achieving real-time, high-precision monitoring and closed-loop control of industrial production lines.

[0255] Figure 3 is a flowchart of an industrial model fine-tuning and inference method provided in an embodiment of this application. As shown in Figure 3, on the basis of the embodiment of Figure 1, a possible implementation of updating the parameters of the active blocks of the model based on feature data to obtain a target model adapted to the current process parameters is described in detail, including:

[0256] S301. Keep the original weight parameters of the pre-trained model frozen. Inject the first low-rank matrix and the second low-rank matrix into the active block of the model to be updated. The product of the first low-rank matrix and the second low-rank matrix is used to represent the amount of the original weight parameters to be adjusted.

[0257] Understandably, freezing the original weight parameters means keeping the original weight parameters of the pre-trained model unchanged during model fine-tuning, without updating the parameters. This preserves the general knowledge learned by the pre-trained model on large-scale data.

[0258] A low-rank matrix is a trainable matrix injected into the active block of a model. It refers to a matrix whose rank (i.e., the maximum number of linearly independent rows or columns) is much smaller than its number of rows or columns.

[0259] One possible implementation is to keep the original weight matrix $W_0$ of each active block of the model to be updated completely frozen. During subsequent backpropagation, no gradient is calculated for the original weight matrix, and no numerical modification is made to it. The freeze operation ensures that the general feature extraction capabilities learned by the pre-trained model on large-scale data are fully preserved.

[0260] A trainable first and second low-rank matrix is injected into the bypass of the original weight matrix of the active block. The ranks of these two matrices are much smaller than the dimension of the original weight matrix. The product of the two low-rank matrices is used to represent the amount to be adjusted in the original weight parameters. Since both the first and second low-rank matrices are low-rank matrices, their product is also low-rank, thus restricting parameter updates to a low-dimensional space.

[0261] During forward propagation of the model, the actual weights used are the sum of the original weights and the weights to be adjusted. Since only the first and second low-rank matrices are trainable and the original weight matrix is kept frozen, the original knowledge is preserved while adjustments for the current process task are introduced through the low-rank matrices, which greatly reduces the number of parameters that need to be optimized.

[0262] Specifically, for example, the first layer block of the model contains a convolutional layer with an original weight matrix of dimension 256 × 256. The current task is to detect scratches on a mobile phone casing, and the first layer block has been identified as the active block for this round of fine-tuning.

[0263] The original weight matrix $W_0$ of the first layer block is set to the frozen state, and no gradients are calculated for it in subsequent training. Two low-rank matrices, the first low-rank matrix and the second low-rank matrix, are injected into the bypass of the first layer block. The rank of both matrices is 8, which is much smaller than the original dimension of 256. The number of trainable parameters introduced is 8 × 256 + 256 × 8 = 4096, while the number of parameters in the original weight matrix is 256 × 256 = 65536. By introducing the low-rank matrices, the number of trainable parameters is reduced to only 6.25% of the original number of parameters.

[0264] During model forward propagation, the actual output of the first layer block is $y = (W_0 + \Delta W)\,x$, where $x$ is the input of the layer block and $\Delta W$ is the product of the two low-rank matrices; its dimension is 256 × 256, the same as $W_0$. $\Delta W$ represents the amount of weight change that needs to be adjusted for the current scratch detection task.

[0265] By keeping the original weights frozen and injecting low-rank matrices into the active blocks, the general feature extraction capabilities learned by the pre-trained model on large-scale data are preserved, the number of trainable parameters is greatly reduced, and the computational resources and memory usage are reduced. The product form of the low-rank matrices allows the parameters to be adjusted to still express rich adaptability, adapt to the needs of various industrial process tasks, and realize efficient fine-tuning of the pre-trained model.
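
The low-rank bypass can be sketched in plain Python with tiny matrices. The names `a` (rank × input dimension) and `b` (output dimension × rank) are labels chosen for this sketch, and the source does not say which of the two corresponds to the "first" low-rank matrix. The layer is treated as a plain matrix multiplication, which simplifies the convolutional layer of the example, and the training loop is not shown.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(w0, a, b, x):
    """y = W0 x + B (A x); W0 is read but never modified."""
    base = matvec(w0, x)
    delta = matvec(b, matvec(a, x))
    return [p + q for p, q in zip(base, delta)]


def trainable_params(d_out, d_in, rank):
    """Parameter count of the two low-rank matrices."""
    return rank * d_in + d_out * rank


w0 = [[1.0, 0.0], [0.0, 1.0]]     # frozen 2 x 2 weight (identity)
a = [[1.0, 1.0]]                  # rank 1: shape 1 x 2
b = [[0.5], [0.0]]                # shape 2 x 1
y = lora_forward(w0, a, b, [1.0, 2.0])

count = trainable_params(256, 256, 8)
share = count / (256 * 256)
```

Computing `B (A x)` instead of forming the full product first keeps the extra cost proportional to the rank, which is the point of the bypass on edge devices.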

[0266] S302. Input the feature data into the pre-trained model to obtain the prediction result of the current industrial task, and construct a loss function based on the difference between the prediction result and the expected real result of the industrial task.

[0267] Understandably, the prediction result refers to the inference result output by the model after performing forward propagation calculations on the input feature data. Specifically, depending on the specific industrial task, the prediction result may be the probability of product defect categories, equipment failure types, predicted remaining service life, process parameter optimization suggestions, etc.

[0268] The true result refers to the actual label or expected output value corresponding to the input feature data, which usually comes from manual annotation, the actual state of the sensor, or historical production records. For example, in industrial quality inspection tasks, the true result is the manually labeled result indicating whether the product has defects.

[0269] A loss function is a mathematical function used to quantify the difference between a model's predictions and the actual results. The smaller the loss function value, the more accurate the model's predictions.

[0270] One possible implementation involves inputting the aforementioned feature data into a pre-trained model after injecting a low-rank matrix. The model performs forward propagation computation: the data sequentially passes through each layer block (including the frozen original weight layer and the injected low-rank matrix bypass), undergoes feature transformation layer by layer, and finally produces a prediction result for the current industrial task at the output layer.

[0271] In an active block into which the low-rank matrices are injected, forward propagation is calculated as Output = $W_0 x + \Delta W x$, where $x$ is the input of the block, $W_0$ is the frozen original weight, and $\Delta W$ is the amount to be adjusted, represented by the product of the low-rank matrices. In inactive blocks where no low-rank matrix is injected, forward propagation keeps the original computation.
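The forward computation of an active block described in [0271] can be illustrated with a minimal NumPy sketch. The matrix sizes, the zero initialization of A, and the multiplication order A @ B (first low-rank matrix times second low-rank matrix) are assumptions made for illustration; the patent text does not fix them here.

```python
import numpy as np

def active_block_forward(x, W0, A, B):
    """Forward pass of one active block: frozen path plus low-rank bypass."""
    frozen_out = W0 @ x          # original computation; W0 is never modified
    bypass_out = A @ (B @ x)     # low-rank adjustment, without forming A @ B explicitly
    return frozen_out + bypass_out

rng = np.random.default_rng(0)
d_out, d_in, r = 4, 8, 2         # hypothetical sizes, with r much smaller than d_out and d_in
W0 = rng.normal(size=(d_out, d_in))
A = np.zeros((d_out, r))         # assumed zero init: the block starts identical to the pre-trained one
B = rng.normal(size=(r, d_in))
x = rng.normal(size=d_in)

y = active_block_forward(x, W0, A, B)
```

With A initialized to zero, the bypass contributes nothing at the start of fine-tuning, so the active block initially reproduces the pre-trained output.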

[0272] Depending on the specific type of the current industrial task, the model outputs prediction results in different forms. In industrial quality inspection tasks, the output is the probability distribution of product defect categories, such as "defective" probability 0.92 and "no defect" probability 0.08; in equipment failure prediction tasks, the output is the predicted value of the equipment's remaining service life, such as "remaining service life 128 hours"; in process parameter optimization tasks, the output is a suggestion for adjusting process parameters, such as "increase the temperature by 5℃".

[0273] Obtain the actual results corresponding to the input feature data from the production system or labeled database. Construct a loss function based on the difference between the model's predictions and the actual results. The loss function quantifies the current model's performance on the industrial task; a smaller value indicates more accurate model predictions.

[0274] Specifically, taking the mobile phone casing scratch detection task as an example, the selected image feature data (224×224 images retaining the defect area, batch size of 1) is input into the model after injecting a low-rank matrix; the model performs forward propagation calculation, and after forward propagation through each layer block, the final output layer of the model produces the prediction result; the model output result is a scratch probability of 0.92 and a no-scratch probability of 0.08, that is, the model predicts that the current product has a scratch defect; the real label of the product is read from the quality inspection database. After manual re-inspection, it is confirmed that the product has a scratch defect; a loss function is constructed based on the difference between the model's predicted probability of the current product having a scratch and the actual result.
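As a hedged illustration of the scratch detection example, the loss for a single sample could be computed as follows. The patent text does not name a specific loss function; cross-entropy is assumed here because the model output is a probability distribution over defect categories.

```python
import math

def cross_entropy(probs, true_index):
    """Negative log-likelihood of the true class under the predicted distribution."""
    return -math.log(probs[true_index])

probs = [0.92, 0.08]                      # [scratch, no scratch], values from the example
loss_correct = cross_entropy(probs, 0)    # true label is "scratch": small loss
loss_wrong = cross_entropy(probs, 1)      # if the true label were "no scratch": large loss
```

The loss is small when the model assigns high probability to the true label and grows quickly when it does not, which is the quantity the subsequent gradient calculation acts on.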

[0275] By inputting feature data into a pre-trained model to obtain prediction results, and constructing a loss function based on the difference between the prediction results and the actual results, a quantitative evaluation of the model's performance on the current industrial task is achieved. The loss function, as the optimization objective, provides a clear direction for subsequent gradient calculations of the low-rank matrix and parameter updates, enabling the model to iteratively optimize towards improving task accuracy.

[0276] S303. Calculate the contribution of each element in the first low-rank matrix and the second low-rank matrix to the prediction error according to the loss function, and determine the adjustment direction and adjustment magnitude of each element according to the contribution, and iteratively update the elements in the first low-rank matrix and the second low-rank matrix.

[0277] As is understandable, contribution refers to the degree of influence of each element in the low-rank matrix on the current prediction error, and is usually measured by the partial derivative of the loss function with respect to that element (i.e., the gradient). The larger the gradient value, the greater the contribution of that element to the prediction error.

[0278] The adjustment direction refers to the positive or negative direction of the parameter update, determined by the sign of the gradient. A positive gradient indicates that increasing the value of the element will increase the loss, so the element should be decreased; a negative gradient indicates that increasing the value of the element will decrease the loss, so the element should be increased.

[0279] The adjustment magnitude refers to the size of the parameter update step, which is determined by both the gradient value and the learning rate: adjustment magnitude = learning rate × |gradient value|. The learning rate is a preset hyperparameter used to control the speed of parameter updates.

[0280] One possible implementation involves calculating, based on the loss function, the partial derivatives (i.e., gradients) of the loss with respect to each element in the first and second low-rank matrices. The adjustment direction and magnitude of each parameter are determined from the gradient values. The direction opposite to the gradient is the direction of steepest descent of the loss function; therefore, the parameter update direction is the negative direction of the gradient. The adjustment magnitude is determined by the absolute value of the gradient and the learning rate: adjustment magnitude = learning rate × |gradient value|. The elements in the low-rank matrices are then updated using a gradient descent algorithm, for example:

[0281] $A_{t+1} = A_t - \eta \, \nabla_A L(A_t, B_t)$

[0282] $B_{t+1} = B_t - \eta \, \nabla_B L(A_t, B_t)$

[0283] where $A_t$ is the first low-rank matrix at the t-th iteration, representing the current parameter value of the first low-rank matrix A in the current iteration round; $A_{t+1}$ is the first low-rank matrix at the (t+1)-th iteration, representing the new parameter value of A after the (t+1)-th update; $B_t$ is the second low-rank matrix at the t-th iteration, representing the current parameter value of the second low-rank matrix B in the current iteration round; and $B_{t+1}$ is the second low-rank matrix at the (t+1)-th iteration, representing the new parameter value of B after the (t+1)-th update. $\eta$ is the learning rate, a preset positive hyperparameter used to control the step size of parameter updates. A learning rate that is too large may lead to unstable training, while a learning rate that is too small will result in slow convergence. $\nabla_A L(A_t, B_t)$ is the gradient matrix of the loss function with respect to A, i.e. the matrix formed by the partial derivatives of the loss function with respect to each element of the first low-rank matrix A at the t-th iteration. $\nabla_B L(A_t, B_t)$ is the gradient matrix of the loss function with respect to B, i.e. the matrix formed by the partial derivatives of the loss function with respect to each element of the second low-rank matrix B at the t-th iteration.

[0284] The adjustment direction and magnitude are repeatedly determined, and parameter updates are performed. In each iteration: feature data is input into the model for forward propagation to obtain the prediction result; the loss between the predicted and actual results is calculated; the gradient of the low-rank matrix is calculated via backpropagation; and the elements in the low-rank matrix are updated based on the gradient. As the number of iterations increases, the loss function value gradually decreases, and the model's prediction accuracy gradually improves. Iteration stops when the loss function converges (e.g., the change is less than a threshold after multiple consecutive iterations) or when the preset maximum number of iterations is reached.
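The iteration described in [0284] can be sketched as a plain gradient descent loop over the two low-rank matrices. This is a minimal sketch under stated assumptions: a squared-error loss, synthetic data, the ordering A @ B for the adjustment, and a stopping rule that checks a single loss change rather than several consecutive ones. The function name `fine_tune_low_rank` and all sizes are hypothetical.

```python
import numpy as np

def fine_tune_low_rank(X, T, W0, A, B, lr=0.05, max_iters=500, tol=1e-8):
    """Gradient descent on A and B only; W0 stays frozen. Returns A, B, loss history."""
    n = X.shape[0]
    losses = []
    for _ in range(max_iters):
        # Forward propagation through the frozen weights plus the low-rank bypass.
        Y = X @ (W0 + A @ B).T
        err = Y - T
        losses.append(0.5 * np.sum(err ** 2) / n)
        # Stop once the loss change falls below the threshold (simplified convergence test).
        if len(losses) > 1 and abs(losses[-2] - losses[-1]) < tol:
            break
        # Backpropagation: gradient w.r.t. the adjustment, then w.r.t. each factor.
        G = err.T @ X / n
        grad_A = G @ B.T
        grad_B = A.T @ G
        # Step against the gradient, scaled by the learning rate.
        A = A - lr * grad_A
        B = B - lr * grad_B
    return A, B, losses

rng = np.random.default_rng(0)
d_out, d_in, r, n = 4, 8, 2, 64
W0 = rng.normal(size=(d_out, d_in))
W0_before = W0.copy()
true_shift = 0.3 * rng.normal(size=(d_out, r)) @ rng.normal(size=(r, d_in))
X = rng.normal(size=(n, d_in))               # synthetic feature data
T = X @ (W0 + true_shift).T                  # synthetic "actual results"
A0 = np.zeros((d_out, r))
B0 = 0.5 * rng.normal(size=(r, d_in))
A, B, losses = fine_tune_low_rank(X, T, W0, A0, B0)
```

Only `A` and `B` receive updates, which have `r * (d_out + d_in)` elements in total instead of the `d_out * d_in` elements of the full weight matrix.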

[0285] By calculating gradients and updating only a small number of elements in the low-rank matrix, the computational cost of backpropagation is significantly reduced, enabling industrial edge devices to fine-tune the model under resource constraints. Through multiple iterations, the loss function value is gradually reduced, continuously improving the model's adaptability to the current process task, ultimately achieving the target model that meets industrial accuracy requirements.

[0286] S304. After the iteration is completed, the product of the updated first low-rank matrix and the second low-rank matrix is added to the corresponding original weight parameters to obtain the target model.

[0287] Obtain the first and second low-rank matrices after the final iteration, and calculate their product to obtain the weight adjustment learned for the current process task. Because the matrices have undergone multiple rounds of gradient descent optimization, their product effectively reduces the loss function value for the current process task. Add the calculated weight adjustment to the frozen original weight matrix element by element to obtain the updated actual weight matrix, i.e. the weight parameters actually used by the active blocks of the model in subsequent inference.

[0288] For all active blocks of the model to be updated, repeat the above steps to obtain the updated weights for each active block. For other blocks not in the candidate active block set, keep the original pre-trained parameters unchanged. Combine these updated active blocks with the unupdated blocks to form a complete target model adapted to the current process parameters.
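The merge step of S304 can be sketched as follows, assuming each block is represented by a single weight matrix stored in a dictionary and that the adjustment is formed as A @ B. The block names and the function `merge_active_blocks` are hypothetical.

```python
import numpy as np

def merge_active_blocks(original_weights, low_rank_factors):
    """Return target-model weights: W0 + A @ B for active blocks, W0 otherwise."""
    merged = {}
    for name, W0 in original_weights.items():
        if name in low_rank_factors:
            A, B = low_rank_factors[name]
            merged[name] = W0 + A @ B     # element-wise addition of the learned adjustment
        else:
            merged[name] = W0.copy()      # inactive block keeps its pre-trained parameters
    return merged

rng = np.random.default_rng(1)
original = {"block_0": rng.normal(size=(4, 8)), "block_1": rng.normal(size=(4, 8))}
factors = {"block_1": (rng.normal(size=(4, 2)), rng.normal(size=(2, 8)))}
target = merge_active_blocks(original, factors)
```

After merging, inference uses a single matrix per block, so the low-rank bypass adds no extra computation at inference time.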

[0289] This embodiment provides an industrial model fine-tuning and inference method. This method injects a first low-rank matrix and a second low-rank matrix into the active block of the model to be updated, using their product to represent the adjustment amount of the original weight parameters. The original weight parameters of the pre-trained model are kept frozen. Feature data is input into the pre-trained model to obtain the prediction result of the current industrial task. A loss function is constructed based on the difference between the prediction result and the actual result. The contribution of each element in the two low-rank matrices to the prediction error is calculated according to the loss function, determining the adjustment direction and magnitude of each element. Only the elements in the low-rank matrices are iteratively updated. After iteration, the product of the updated low-rank matrices is added to the original weight parameters to obtain the target model adapted to the current process parameters. This method achieves efficient fine-tuning by introducing only a small number of trainable parameters into the active block of the model, while keeping the original model parameters unchanged. It significantly reduces the number of parameters and memory usage, solving the technical problems of traditional fine-tuning methods that require updating all parameters and consume large amounts of computational resources. This enables resource-constrained industrial edge devices to quickly adapt to different process tasks and improves model iteration efficiency.

[0290] Figure 4 is a flowchart of an industrial model fine-tuning and inference method provided in an embodiment of this application. As shown in Figure 4, on the basis of the embodiment of Figure 1, a possible implementation of adjusting the computational intensity of parameter updates according to the memory occupancy and power consumption of industrial edge devices is described in detail, including:

[0291] S401. When the memory usage rate is detected to exceed the preset memory threshold, the filtering intensity is increased when performing feature filtering on multi-source heterogeneous data, and the parameter storage precision of the pre-trained model is switched from a high-precision format to a low-precision format.

[0292] Understandably, parameter storage precision refers to the numerical precision format used when model parameters are stored in memory. High-precision formats are storage formats that can retain high numerical precision and are suitable for computational stages with high precision requirements; low-precision formats are formats that reduce parameter storage precision through quantization techniques, which can significantly reduce memory usage.

[0293] One possible implementation involves calling the system memory read interface at a fixed frequency during parameter updates to obtain the currently used memory and the device's maximum memory, thus calculating the real-time memory usage rate. For example, for an industrial edge device with 4GB of memory, if the currently used memory is 3.6GB, the memory usage rate would be 3.6 / 4.0 = 0.9. The calculated memory usage rate is then compared to a preset memory threshold. If the memory usage rate exceeds the preset threshold, it indicates memory resource shortage, requiring the initiation of a memory optimization process; otherwise, monitoring continues without performing any adjustments.
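The memory check in [0293] reduces to a ratio and a comparison, as in the following sketch. The threshold value 0.85 is a hypothetical default; the text only says that the threshold is preset. Reading the actual used and total memory from the system is left out.

```python
def memory_usage_rate(used_gb, total_gb):
    """Real-time memory usage rate: currently used memory over maximum memory."""
    if total_gb <= 0:
        raise ValueError("total memory must be positive")
    return used_gb / total_gb

def needs_memory_optimization(used_gb, total_gb, threshold=0.85):
    """True when the usage rate exceeds the preset threshold (0.85 is an assumed value)."""
    return memory_usage_rate(used_gb, total_gb) > threshold

rate = memory_usage_rate(3.6, 4.0)              # the 4GB device from the example
trigger = needs_memory_optimization(3.6, 4.0)
```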

[0294] Enhance the filtering strength when performing feature filtering on multi-source heterogeneous data. The methods for enhancing the filtering strength vary depending on the data type; for example:

[0295] For image data: Increase the sparsity of the mask matrix, for example, from 0.3 to 0.5, so that more background pixels are removed;

[0296] For time series data: Increase the sparsity of the window data, for example, from 0.2 to 0.4, so that each window retains fewer sampling points;

[0297] For text data: Increase the sparsity of non-keywords, for example, from 0.8 to 0.95, so that more non-keywords are removed.

[0298] The further the memory usage rate exceeds the threshold, the larger the increase in filtering intensity. Optionally, each individual increase should not exceed 0.1, to avoid excessive compression leading to the loss of key information. While increasing the filtering intensity, the parameter storage precision of the pre-trained model is switched from a high-precision format to a low-precision format, and the quantized low-precision parameters are stored.

[0299] After performing memory optimization, continue monitoring memory usage. If memory usage remains above the threshold, further increase the filtering intensity or consider more aggressive quantization strategies; if memory usage falls back to a safe range, maintain the current settings or gradually restore them. By increasing the filtering intensity and switching to low-precision storage when memory usage exceeds the limit, dynamic balancing control of memory resources can be achieved.
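The two memory-side adjustments of S401 can be sketched together. The step rule (proportional to the overshoot, capped at 0.1 per increase) and the symmetric int8 quantization scheme are assumptions chosen for illustration; the text specifies neither the proportionality constant nor the low-precision format.

```python
import numpy as np

MAX_STEP = 0.1   # cap on a single increase, as suggested in the text

def increase_sparsity(current, usage_rate, threshold, gain=1.0):
    """Raise the filtering sparsity in proportion to the overshoot, capped per step."""
    overshoot = max(0.0, usage_rate - threshold)
    step = min(MAX_STEP, gain * overshoot)      # gain is a hypothetical tuning constant
    return min(1.0, current + step)

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization (assumed low-precision format)."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Restore approximate float32 values from the int8 representation."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(16, 16)).astype(np.float32)
q, scale = quantize_int8(w)
new_sparsity = increase_sparsity(0.3, usage_rate=0.9, threshold=0.85)
```

Storing `q` instead of `w` uses one byte per parameter instead of four, at the cost of a rounding error of at most about half the scale per element.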

[0300] S402. When the power consumption is detected to exceed the preset power consumption threshold, increase the freezing ratio of model layer blocks and reduce the number of active model blocks participating in the calculation in a single iteration.

[0301] As is understandable, the model layer block freeze ratio refers to the proportion of model layer blocks that are frozen (not participating in parameter updates) out of all candidate active blocks during the current fine-tuning process. The higher the freeze ratio, the fewer layer blocks participate in the computation, and the lower the computational power consumption.

[0302] A single iteration refers to a complete parameter update cycle, including forward propagation, loss calculation, backpropagation, and parameter update.

[0303] One possible implementation involves reading real-time power consumption data at a fixed frequency through the device's power management interface during parameter updates. For devices containing both a GPU and a CPU, the total power consumption is the sum of both. At the same time, the device's rated maximum power consumption is obtained, and the current power consumption rate (current power consumption relative to the rated maximum) is calculated. The calculated power consumption rate is compared with a preset power consumption threshold. If the power consumption rate exceeds the preset threshold, the power consumption is close to the upper limit and the power optimization process needs to be initiated; otherwise, monitoring continues without performing any adjustments.

[0304] After power optimization is triggered, the required increase in the freezing ratio is calculated based on the degree of power consumption exceeding the limit. The greater the power consumption exceeding the limit, the greater the increase in the freezing ratio. Specifically, some layers are selected for freezing from the currently active model blocks participating in the calculation. Layers with lower average gradient magnitudes (i.e., layers with smaller historical contributions) are frozen first; layers with higher average gradient magnitudes are retained to continue participating in updates. Frozen layers do not participate in forward propagation calculations, backpropagation, or parameter updates in subsequent iterations; only parameter storage is retained, and calculation-related operations are paused.

[0305] Selected layers are frozen to reduce the number of active model blocks participating in computation in a single iteration. The optimized computational load is reduced, leading to a decrease in power consumption. After power optimization, device power consumption continues to be monitored. If power consumption remains above the threshold, the freezing ratio can be further increased; if power consumption falls back to a safe range, the freezing ratio can be gradually reduced, resuming computation for some layers. By increasing the freezing ratio of model layers when power consumption exceeds the threshold, dynamic balance control of computational power consumption is achieved.
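The freezing policy of S402 can be sketched as a ranking over historical average gradient magnitudes. The linear rule for raising the freeze ratio, the `gain` constant, and the block names are hypothetical; the text only requires that a larger overshoot leads to a larger increase and that low-gradient blocks are frozen first.

```python
import math

def next_freeze_ratio(current_ratio, power_rate, threshold, gain=1.0):
    """Raise the freeze ratio in proportion to how far power exceeds the threshold."""
    overshoot = max(0.0, power_rate - threshold)
    return min(1.0, current_ratio + gain * overshoot)

def select_blocks_to_freeze(avg_grad_magnitude, freeze_ratio):
    """Pick the blocks with the lowest average gradient magnitude to freeze."""
    if not 0.0 <= freeze_ratio <= 1.0:
        raise ValueError("freeze_ratio must lie in [0, 1]")
    n_freeze = math.floor(freeze_ratio * len(avg_grad_magnitude))
    ranked = sorted(avg_grad_magnitude, key=avg_grad_magnitude.get)   # ascending
    return set(ranked[:n_freeze])

history = {"b0": 0.9, "b1": 0.1, "b2": 0.5, "b3": 0.05}   # hypothetical gradient history
frozen = select_blocks_to_freeze(history, freeze_ratio=0.5)
ratio = next_freeze_ratio(0.25, power_rate=0.95, threshold=0.8)
```

Lowering the ratio again when power returns to a safe range would unfreeze the highest-ranked frozen blocks first under the same ordering.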

[0306] S403. When the memory usage rate is detected to be lower than the preset memory threshold and the power consumption is lower than the preset power consumption threshold, the filtering intensity is reduced and the parameter storage precision is restored to the high precision format to increase the computational intensity of parameter updates.

[0307] The system monitors the memory usage and power consumption of the data acquisition device in real time. If the memory usage rate and the power consumption are both below their respective preset thresholds, the device is considered to be in a "resource-sufficient" state, and the sparsity threshold for feature filtering is lowered. For example, in image processing, the coverage of the defect mask is expanded to retain more background context; in time series processing, the time window length is extended or the correlation threshold is lowered to include more potentially relevant signal segments, so that the feature data input to the model contains more comprehensive process information.

[0308] The model parameters are switched from low-precision storage to high-precision format, and the parameters loaded into the computation unit are restored from low-precision quantized values to high-precision floating-point values. This ensures that subsequent matrix multiplication and gradient accumulation are performed in high precision, reducing errors introduced by quantization noise. Based on the more complete feature data and higher-precision parameter representation, parameter update iterations are performed. Although this consumes more resources, it yields finer gradient directions and better convergence results.

[0309] When device resources permit, accuracy is no longer sacrificed for resource conservation. The high-precision format and the full feature set are restored, improving the detection accuracy and generalization ability of the fine-tuned model and avoiding performance bottlenecks caused by over-compression.

[0310] This embodiment provides an industrial model fine-tuning and inference method. This method dynamically executes an adaptive resource adjustment strategy by monitoring the memory usage and power consumption of industrial edge devices in real time: when memory is limited, it automatically increases the data feature filtering intensity and switches to a low-precision storage format to reduce GPU memory usage; when power consumption exceeds limits, it proactively increases the model layer block freezing ratio to reduce computational load; and when resources are abundant, it restores the high-precision format and performs full feature processing to maximize computational intensity. This effectively solves the problem of device overload or wasted computing power caused by fixed resource configurations, ensuring that the model fine-tuning process always operates safely and efficiently within the physical limits of the edge devices, achieving optimal performance under resource constraints.
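The three branches S401 to S403 can be combined into one decision function, sketched below. The `TuningConfig` fields, the fixed step of 0.1, and the threshold values in the usage lines are hypothetical; they stand in for whatever state and limits a concrete device would use.

```python
from dataclasses import dataclass

@dataclass
class TuningConfig:
    sparsity: float        # feature filtering intensity
    precision: str         # "high" or "low" parameter storage format
    freeze_ratio: float    # share of candidate active blocks that are frozen

def adjust(config, mem_rate, power_rate, mem_threshold, power_threshold, step=0.1):
    """Return a new configuration following the S401 / S402 / S403 rules."""
    sparsity, precision, freeze = config.sparsity, config.precision, config.freeze_ratio
    if mem_rate > mem_threshold:                                     # S401: memory is tight
        sparsity = min(1.0, sparsity + step)
        precision = "low"
    if power_rate > power_threshold:                                 # S402: power is too high
        freeze = min(1.0, freeze + step)
    if mem_rate < mem_threshold and power_rate < power_threshold:    # S403: resources suffice
        sparsity = max(0.0, sparsity - step)
        precision = "high"
    return TuningConfig(sparsity, precision, freeze)

start = TuningConfig(sparsity=0.3, precision="high", freeze_ratio=0.2)
tight_memory = adjust(start, mem_rate=0.9, power_rate=0.5, mem_threshold=0.85, power_threshold=0.8)
relaxed = adjust(tight_memory, mem_rate=0.6, power_rate=0.5, mem_threshold=0.85, power_threshold=0.8)
```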

[0311] Figure 5 is a flowchart of an industrial model fine-tuning and inference method provided in an embodiment of this application. As shown in Figure 5, on the basis of the embodiment of Figure 1, the fine-tuning and inference method of the industrial model is described in detail, including:

[0312] S501. After completing the parameter update of the active block of the model to obtain the target model, the target model is deployed as an online inference model, and the process parameters of the production line are monitored in real time during the operation of the online inference model.

[0313] Understandably, an online inference model refers to a model instance loaded into the inference engine of an industrial edge device, running in real time, directly receiving production line data and outputting quality inspection or control instructions.

[0314] One possible implementation is to load the generated target model weight file into the inference engine in memory after the model active block parameter update iteration is completed and converged; and to use a "double buffering" mechanism to seamlessly switch the original inference instance to the target model instance without interrupting data acquisition, ensuring a seamless upgrade on the production line.

[0315] While the online inference model performs forward propagation (i.e., processes image and time-series data for quality inspection / prediction), a separate process parameter monitoring thread is started. For example, the thread reads the current set of process parameters in real time through the industrial bus interface.

[0316] The read process parameters are accurately timestamped and associated with the "baseline process parameters" upon which the current inference model is based. This parameter set is continuously polled or updated in an event-triggered manner to form a continuous process state time series, providing real-time data input for subsequent determination of whether to trigger "secondary fine-tuning".

[0317] S502. When the difference between the new process parameters and the current process parameters corresponding to the online inference model is detected to be greater than the change threshold, the parameters of the active block of the model are fine-tuned based on the parameters of the online inference model and / or the fine-tuned model parameters of historically similar process parameters, and the updated new model is obtained.

[0318] Understandably, the change threshold refers to the preset tolerance for fluctuations in process parameters (such as cosine distance or the absolute difference of key parameters). When the difference between the old and new process parameters exceeds the change threshold, it is determined that a substantial change in operating conditions has occurred (such as product change or material replacement), and the original model is no longer applicable.

[0319] The baseline parameter refers to the starting point of the initial weights for secondary fine-tuning. This embodiment provides two strategies: online inference model parameters, which are the weights of the currently running model, suitable for continuous fine-tuning of small process evolutions; and fine-tuned model parameters of historically similar process parameters, which are the fine-tuned weights corresponding to the past processes most similar to the current new process retrieved from the historical knowledge base.

[0320] One possible implementation involves calculating, in real time, the difference between the new process parameter vector and the process parameter vector corresponding to the current model. If the difference exceeds the change threshold, the system immediately switches to a backup rule mode and triggers a secondary fine-tuning process. Specifically, a dual-model switching strategy is adopted, in which the "current online model" and the "background updated model" are stored in the device memory at the same time. Once the new model is validated, only the pointer or routing rule is switched to direct traffic to the new model, and the old model resources are then released. If the new process is highly similar to the old process (the difference exceeds the threshold, but both belong to the same family), the current parameters of the online inference model are used as the baseline, and the existing low-rank matrix state is retained. If the new process is of a completely new type, the historical record most similar to the semantic vector of the new process is retrieved from the historical knowledge base, and the corresponding fine-tuned model parameters (especially the low-rank matrix part) are loaded as the initial state.
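The trigger and baseline choice described in [0318] to [0320] can be sketched with cosine distance, which the text names as one possible difference measure. The second threshold `family_threshold` (separating "same family" from "completely new type"), the use of raw parameter vectors in place of semantic vectors, and the record keys are assumptions for illustration.

```python
import numpy as np

def cosine_distance(u, v):
    """1 minus cosine similarity of two non-zero vectors."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def choose_baseline(new_params, current_params, history, change_threshold, family_threshold):
    """Return ("keep", None), ("online", None) or ("history", key of the closest record)."""
    d = cosine_distance(new_params, current_params)
    if d <= change_threshold:
        return "keep", None                  # no secondary fine-tuning needed
    if d <= family_threshold:
        return "online", None                # same family: start from the online model
    # New type of process: retrieve the most similar historical record (history must be non-empty).
    best = min(history, key=lambda k: cosine_distance(new_params, history[k]))
    return "history", best

current = [1.0, 0.0]
history = {"p1": [0.1, 1.0], "p2": [1.0, 1.0]}     # hypothetical knowledge base
small = choose_baseline([1.0, 0.01], current, history, 0.01, 0.2)
same_family = choose_baseline([1.0, 0.5], current, history, 0.01, 0.2)
new_type = choose_baseline([0.0, 1.0], current, history, 0.01, 0.2)
```

In practice process parameters with different units would need normalization before a cosine distance is meaningful; that step is omitted here.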

[0321] Based on the new process parameters, the "establish mapping relationship" step is repeated to determine new active blocks of the model adapted to the new process. Using a small amount of newly collected new process sample data, starting from the selected baseline parameters, gradient updates are performed only on the low-rank matrix in the new active blocks. Since the baseline parameters already contain rich feature extraction capabilities, this process converges with very few iterations. The updated low-rank increments are then fused with the original weights (or historical increments) to generate a new, updated model adapted to the new process.

[0322] S503. Switch the current online inference model to the updated model, and use the switched model to infer industrial data to guide the operation of the production line.

[0323] One possible implementation involves maintaining two model instances simultaneously in the memory of the industrial edge device: an online inference model (currently running and handling real-time inference requests) and an offline fine-tuning model (a new model that has just completed fine-tuning and is awaiting switching). The two models share the outputs of the data preprocessing module and the device status monitoring module, but their model parameters are stored in independent memory areas, ensuring they do not interfere with each other.

[0324] Once the offline fine-tuning model is completed and validated, a switching signal is generated to initiate the model switching process. The switching signal includes information such as the new model's memory address, parameter dimensions, and input / output formats. The process involves pausing the current online inference model's processing requests for the new industrial data, while allowing ongoing inference tasks to complete; loading the parameters of the offline fine-tuning model from its independent memory area into the inference memory area; updating the model pointer so that the inference engine points to the new model; and resuming processing requests for the industrial data, using the new model for inference computation.
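The pointer-style switch between the two model instances can be sketched with a lock-protected reference. Models are represented as plain callables and `validate` is a caller-supplied check; both are assumptions, as is the class name `ModelSwitcher`. Instead of explicitly pausing new requests, this sketch lets each request capture the current reference under the lock, so requests already in flight finish on the model they started with.

```python
import threading

class ModelSwitcher:
    """Holds an online model and an optional validated standby model."""

    def __init__(self, online_model):
        self._lock = threading.Lock()
        self._online = online_model
        self._standby = None

    def stage(self, new_model, validate):
        """Store the fine-tuned model as standby only if validation passes."""
        if not validate(new_model):
            return False
        with self._lock:
            self._standby = new_model
        return True

    def switch(self):
        """Point inference at the standby model; the old reference is released."""
        with self._lock:
            if self._standby is None:
                return False
            self._online, self._standby = self._standby, None
            return True

    def infer(self, data):
        with self._lock:
            model = self._online     # capture the reference; the call runs outside the lock
        return model(data)

switcher = ModelSwitcher(lambda data: "old")
before = switcher.infer(None)
staged = switcher.stage(lambda data: "new", validate=lambda m: True)
during = switcher.infer(None)        # still served by the old model until switch()
switched = switcher.switch()
after = switcher.infer(None)
```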

[0325] After the switch is complete, the original online inference model goes into standby mode or is released from memory, and the new model begins to continuously receive real-time industrial data and perform inference. It receives product images captured by industrial cameras and outputs defect detection results; it receives equipment operation data collected by sensors and outputs fault prediction results; it receives current process parameters and outputs process optimization suggestions.

[0326] Based on the inference results of the new model, the production line control system or operators take corresponding control measures. Specifically, based on defect detection results, unqualified products are automatically rejected, or re-inspection prompts are issued to operators; based on equipment health status predictions, maintenance plans are arranged in advance to avoid sudden downtime; based on process parameter recommendations, production parameters such as temperature, pressure, and speed are dynamically adjusted to optimize product quality.

[0327] By rapidly switching the current online inference model to the updated model and using the new model to perform inference on industrial data to guide production line operation, a closed-loop linkage between model iteration and production operation is achieved. The model inference results are linked with the production line control system, enabling data-driven automated decision-making, reducing manual intervention, and improving production efficiency and quality stability.

[0328] This embodiment provides an industrial model fine-tuning and inference method. This method senses production line process fluctuations in real time during model deployment and operation. Once a change in process parameters exceeds a threshold, the old model service is immediately paused, and the current model parameters or fine-tuned parameters from historically similar processes are loaded as an initialization baseline. Secondary incremental fine-tuning is then performed on the active blocks of the model under the new process. The model seamlessly switches to the updated model to continue inference tasks. This process-aware dynamic continuous learning mechanism effectively solves the response lag and cold start problems caused by the need for retraining traditional models after production line changes or process adjustments. It also addresses the technical problems of large response delays and long full-scale fine-tuning times in existing technologies, achieving rapid model adaptation and seamless connection during production line process changes, significantly improving the continuity and response speed of industrial production.

[0329] Figure 6 is a schematic diagram of the structure of an industrial model fine-tuning and inference device provided in this application. As shown in Figure 6, this application provides an industrial model fine-tuning and inference device 600, which includes:

[0330] The acquisition module 601 is used to acquire multi-source heterogeneous data from industrial edge devices;

[0331] The processing module 602 is used to perform feature filtering on multi-source heterogeneous data to obtain feature data that characterizes the multi-source heterogeneous data;

[0332] The processing module 602 is also used to establish the mapping relationship between the current process parameters and the functional attributes of each layer block of the pre-trained model, determine the active block of the model to be updated based on the mapping relationship, and update the parameters of the active block of the model based on the feature data to obtain the target model that adapts to the current process parameters.

[0333] The processing module 602 is also used to perform inference on industrial data based on the target model to obtain at least one of industrial quality inspection results, equipment failure prediction results, or process parameter optimization suggestions to guide the operation of industrial production lines.

[0334] Optionally, the device may also include: a determination module 603;

[0335] The processing module 602 is also used to extract dimensional features and statistical features of multi-source heterogeneous data and construct feature vectors;

[0336] The determination module 603 is used to match feature vectors according to preset rules to determine the data type identifier corresponding to multi-source heterogeneous data;

[0337] The processing module 602 is also used to call the corresponding feature filtering rules according to the data type identifier to perform feature filtering processing on multi-source heterogeneous data.

[0338] Optionally, the processing module 602 is further configured to generate a mask matrix based on a preset defect area coordinate library if the data type is identified as image data, and to perform weighted processing on the original image pixels based on the mask matrix, retaining the pixel features of the area covered by the mask matrix as image feature data.

[0339] The processing module 602 is also used to determine the length of the time window based on the industrial process cycle if the data type is identified as time series data, and to calculate the correlation between the data and process parameters within the time window, and retain time series segments with a correlation higher than a preset correlation threshold as time series feature data.

[0340] The processing module 602 is also used to, if the data type identifier indicates text data, match the text content against a preset process keyword dictionary and extract the matched keywords and their context vectors as text feature data.
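
A minimal sketch of the keyword matching in paragraph [0340] follows. The specification does not define how a context vector is built, so the sketch assumes a simple bag-of-words count over a fixed vocabulary for the tokens surrounding each matched keyword. Whitespace tokenization, the `window` size, and the `vocab` argument are assumptions.

```python
from typing import List, Sequence, Set, Tuple


def extract_keyword_contexts(text: str, keyword_dict: Set[str],
                             vocab: Sequence[str],
                             window: int = 2) -> List[Tuple[str, List[int]]]:
    """Return (keyword, context vector) pairs for every dictionary hit in the text."""
    tokens = text.lower().split()
    index = {word: i for i, word in enumerate(vocab)}
    results = []
    for pos, token in enumerate(tokens):
        if token not in keyword_dict:
            continue
        # Context = up to `window` tokens on each side of the matched keyword.
        context = tokens[max(0, pos - window):pos] + tokens[pos + 1:pos + 1 + window]
        vector = [0] * len(vocab)
        for word in context:
            if word in index:
                vector[index[word]] += 1
        results.append((token, vector))
    return results
```

A production system would more likely use a learned embedding for the context, but the count vector keeps the example self-contained.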

[0341] Optionally, the device may also include a calculation module 604.

[0342] The processing module 602 is also used to extract the functional feature vectors of each layer block in the pre-trained model and extract the process semantic vector of the process parameters at the current moment.

[0343] The calculation module 604 is used to calculate the cosine similarity between the functional feature vector and the process semantic vector;

[0344] The processing module 602 is also used to mark blocks with a cosine similarity greater than a preset similarity threshold as candidate active blocks;

[0345] The determination module 603 is also used to calculate the average gradient magnitude of each candidate active block during historical fine-tuning, and to select a preset number of candidate active blocks with the highest average gradient magnitude as the active blocks of the model to be updated.
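
The two-stage selection in paragraphs [0342] to [0345] can be sketched as below: first a cosine-similarity threshold, then a top-k ranking by average historical gradient magnitude. How the functional feature vectors and the process semantic vector are produced is not covered here; they are taken as given inputs, and the names `select_active_blocks`, `grad_history`, and `top_k` are illustrative.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_active_blocks(block_vectors: Dict[str, Sequence[float]],
                         process_vector: Sequence[float],
                         grad_history: Dict[str, Sequence[float]],
                         sim_threshold: float, top_k: int) -> List[str]:
    """Pick the top_k candidate blocks ranked by average historical gradient magnitude."""
    # Stage 1: blocks whose functional vector is similar enough to the process vector.
    candidates = [name for name, vec in block_vectors.items()
                  if cosine(vec, process_vector) > sim_threshold]

    # Stage 2: rank candidates by mean gradient magnitude from past fine-tuning runs.
    def average_gradient(name: str) -> float:
        history = grad_history.get(name, [])
        return sum(history) / len(history) if history else 0.0

    return sorted(candidates, key=average_gradient, reverse=True)[:top_k]
```

Blocks with no recorded gradient history are treated as having an average magnitude of zero, which is a design choice of this sketch rather than something the specification states.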

[0346] Optionally, the processing module 602 is also used to keep the original weight parameters of the pre-trained model frozen, and to inject a first low-rank matrix and a second low-rank matrix into the active block of the model to be updated. The product of the first low-rank matrix and the second low-rank matrix represents the adjustment to be applied to the original weight parameters.

[0347] The processing module 602 is also used to input feature data into the pre-trained model to obtain the prediction result of the current industrial task, and to construct a loss function based on the difference between the prediction result and the expected real result of the industrial task.

[0348] The processing module 602 is also used to calculate the contribution of each element in the first low-rank matrix and the second low-rank matrix to the prediction error according to the loss function, and determine the adjustment direction and adjustment magnitude of each element according to the contribution, and iteratively update the elements in the first low-rank matrix and the second low-rank matrix.

[0349] The processing module 602 is also used to, after the iteration is completed, add the product of the updated first and second low-rank matrices to the corresponding original weight parameters, so as to obtain the target model.
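
Paragraphs [0346] to [0349] describe a low-rank adaptation scheme of the kind commonly known as LoRA. The following is a minimal sketch for a single linear layer, assuming a squared-error loss and plain stochastic gradient descent, where the "contribution of each element to the prediction error" is taken to be the gradient of the loss. The zero initialization of the first matrix, the constant initialization of the second, and all hyperparameters are assumptions chosen to keep the example deterministic.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def low_rank_finetune(weight: Matrix,
                      samples: Sequence[Tuple[Sequence[float], Sequence[float]]],
                      rank: int = 1, lr: float = 0.1, epochs: int = 1000,
                      init: float = 0.1) -> Matrix:
    """Train first (d_out x rank) and second (rank x d_in) with `weight` frozen, then merge."""
    d_out, d_in = len(weight), len(weight[0])
    first = [[0.0] * rank for _ in range(d_out)]    # zero start: the initial adjustment is zero
    second = [[init] * d_in for _ in range(rank)]   # small constant start (assumed)
    for _ in range(epochs):
        for x, target in samples:
            hidden = matvec(second, x)
            pred = [b + d for b, d in zip(matvec(weight, x), matvec(first, hidden))]
            err = [p - t for p, t in zip(pred, target)]
            # Gradients of 0.5 * ||err||^2 with respect to each low-rank element.
            grad_first = [[err[i] * hidden[k] for k in range(rank)] for i in range(d_out)]
            back = [sum(first[i][k] * err[i] for i in range(d_out)) for k in range(rank)]
            grad_second = [[back[k] * x[j] for j in range(d_in)] for k in range(rank)]
            for i in range(d_out):
                for k in range(rank):
                    first[i][k] -= lr * grad_first[i][k]
            for k in range(rank):
                for j in range(d_in):
                    second[k][j] -= lr * grad_second[k][j]
    # Merge: add the product first @ second to the frozen weights.
    return [[weight[i][j] + sum(first[i][k] * second[k][j] for k in range(rank))
             for j in range(d_in)] for i in range(d_out)]
```

Only `first` and `second` are ever modified during training; the original `weight` matrix is read but never written, which mirrors the frozen-weight requirement. The merged matrix is returned as a new object.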

[0350] Optionally, the processing module 602 is also used to adjust the computational intensity of parameter updates according to the memory usage and power consumption of industrial edge devices during the process of updating the parameters of the active blocks of the model based on feature data.

[0351] Optionally, the processing module 602 is also used to, when the memory usage is detected to exceed a preset memory threshold, increase the filtering intensity applied when performing feature filtering on the multi-source heterogeneous data, and switch the parameter storage precision of the pre-trained model from a high-precision format to a low-precision format.

[0352] The processing module 602 is also used to increase the freezing ratio of model layer blocks and reduce the number of active model blocks participating in the calculation in a single iteration when the power consumption is detected to exceed the preset power consumption threshold.

[0353] The processing module 602 is also used to reduce the filtering intensity and restore the parameter storage precision to the high-precision format when the memory usage is detected to be lower than the preset memory threshold and the power consumption is lower than the preset power consumption threshold, so as to increase the computational intensity of parameter updates.
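
The three threshold rules in paragraphs [0351] to [0353] can be sketched as a small configuration controller. The step size, the `fp32` and `fp16` labels standing in for the high-precision and low-precision formats, and the field names of `TuningConfig` are all assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass
class TuningConfig:
    filter_strength: float = 0.5      # higher means more aggressive feature filtering
    precision: str = "fp32"           # assumed label for the high-precision format
    frozen_ratio: float = 0.5         # share of model layer blocks kept frozen
    active_blocks_per_step: int = 4   # active blocks computed in a single iteration


def adjust_config(cfg: TuningConfig, mem_usage: float, power: float,
                  mem_threshold: float, power_threshold: float,
                  step: float = 0.25) -> TuningConfig:
    """Return a new config adapted to the current memory usage and power consumption."""
    new = replace(cfg)
    if mem_usage > mem_threshold:
        # Memory pressure: filter harder and store parameters in low precision.
        new.filter_strength = min(1.0, cfg.filter_strength + step)
        new.precision = "fp16"
    if power > power_threshold:
        # Power pressure: freeze more blocks and compute fewer per iteration.
        new.frozen_ratio = min(1.0, cfg.frozen_ratio + step)
        new.active_blocks_per_step = max(1, cfg.active_blocks_per_step - 1)
    if mem_usage < mem_threshold and power < power_threshold:
        # Headroom on both axes: relax filtering and restore high precision.
        new.filter_strength = max(0.0, cfg.filter_strength - step)
        new.precision = "fp32"
    return new
```

Returning a new configuration rather than mutating the old one makes it easy to log each adjustment and roll back if needed.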

[0354] The implementation principle and technical effects of the industrial model fine-tuning and inference device provided in this embodiment are similar to those of the corresponding parts of the aforementioned industrial model fine-tuning and inference method, and are not repeated here.

[0355] Figure 7 is a schematic diagram of the structure of an electronic device provided in this application. As shown in Figure 7, the electronic device 700 includes: a receiver 701, a transmitter 702, a processor 703, and a memory 704.

[0356] The receiver 701 is used to receive instructions and data;

[0357] The transmitter 702 is used to send instructions and data;

[0358] The memory 704 is used to store computer-executable instructions;

[0359] The processor 703 is used to execute the computer-executable instructions stored in the memory 704 to implement the steps of the industrial model fine-tuning and inference method in the above embodiments. For details, please refer to the relevant descriptions in the foregoing embodiments of the industrial model fine-tuning and inference method.

[0360] Optionally, the memory 704 can be either standalone or integrated with the processor 703.

[0361] When the memory 704 is provided separately, the electronic device also includes a bus for connecting the memory 704 and the processor 703.

[0362] The implementation principle and technical effects of the electronic device provided in this embodiment can be found in the foregoing embodiments, and will not be repeated here.

[0363] This application also provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the method of any of the foregoing embodiments.

[0364] This application also provides a computer program product, including a computer program that, when executed by a processor, implements the method of any of the foregoing embodiments.

[0365] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this application is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily essential to this application.

[0366] It should be further noted that although the steps in the flowchart are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowchart may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.

[0367] It should be understood that the above-described device embodiments are merely illustrative, and the device of this application can also be implemented in other ways. For example, the division of units/modules in the above embodiments is only a logical functional division, and there may be other division methods in actual implementation. For example, multiple units, modules, or components may be combined, or integrated into another system, or some features may be ignored or not executed.

[0368] Furthermore, unless otherwise specified, the functional units/modules in the various embodiments of this application can be integrated into one unit/module, or each unit/module can exist physically separately, or two or more units/modules can be integrated together. The integrated units/modules described above can be implemented in hardware or as software program modules.

[0369] In the above embodiments, the descriptions of each embodiment have their own emphasis. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments. The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as the combination of these technical features does not contradict each other, it should be considered within the scope of this specification.

[0370] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this application are indicated by the following claims.

[0371] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this application is limited only by the appended claims.

Claims

1. An industrial model fine-tuning and inference method, characterized in that the method comprises:
extracting dimensional features and statistical features from multi-source heterogeneous data of an industrial edge device, and constructing feature vectors;
matching the feature vectors against preset rules to determine the data type identifier corresponding to the multi-source heterogeneous data;
invoking, according to the data type identifier, the corresponding feature filtering rule to perform feature filtering processing on the multi-source heterogeneous data, so as to obtain feature data characterizing the multi-source heterogeneous data;
establishing a mapping relationship between the current process parameters and the functional attributes of each layer block of a pre-trained model, extracting the functional feature vector of each layer block of the pre-trained model, and extracting the process semantic vector of the current process parameters;
calculating the cosine similarity between the functional feature vector and the process semantic vector, and marking the blocks with a cosine similarity greater than a preset similarity threshold as candidate active blocks;
calculating the average gradient magnitude of each candidate active block during historical fine-tuning, and determining a preset number of candidate active blocks with the highest average gradient magnitude as the active blocks of the model to be updated;
updating the parameters of the active blocks of the model based on the feature data to obtain a target model adapted to the current process parameters;
performing inference on industrial data based on the target model to obtain at least one of an industrial quality inspection result, an equipment failure prediction result, or a process parameter optimization suggestion, so as to guide the operation of an industrial production line;
wherein invoking the corresponding feature filtering rule according to the data type identifier to perform feature filtering processing on the multi-source heterogeneous data includes at least one of the following:
if the data type identifier indicates image data, generating a mask matrix based on a preset defect area coordinate library, and weighting the original image pixels based on the mask matrix, retaining the pixel features of the area covered by the mask matrix as image feature data;
if the data type identifier indicates time series data, determining the length of the time window based on the industrial process cycle, calculating the correlation between the data and the process parameters within the time window, and retaining time series segments with a correlation higher than a preset correlation threshold as time series feature data;
if the data type identifier indicates text data, matching the text content against a preset process keyword dictionary, and extracting the matched keywords and their context vectors as text feature data.

2. The method according to claim 1, characterized in that updating the parameters of the active block of the model based on the feature data to obtain a target model adapted to the current process parameters includes:
keeping the original weight parameters of the pre-trained model frozen, and injecting a first low-rank matrix and a second low-rank matrix into the active block of the model to be updated, wherein the product of the first low-rank matrix and the second low-rank matrix represents the adjustment to be applied to the original weight parameters;
inputting the feature data into the pre-trained model to obtain the prediction result of the current industrial task, and constructing a loss function based on the difference between the prediction result and the expected real result of the industrial task;
calculating, according to the loss function, the contribution of each element in the first low-rank matrix and the second low-rank matrix to the prediction error, determining the adjustment direction and adjustment magnitude of each element according to the contribution, and iteratively updating the elements in the first low-rank matrix and the second low-rank matrix;
after the iteration is completed, adding the product of the updated first and second low-rank matrices to the corresponding original weight parameters to obtain the target model.

3. The method according to claim 1, characterized in that the method further includes: during the process of updating the parameters of the active block of the model based on the feature data, adjusting the computational intensity of parameter updates according to the memory usage and power consumption of the industrial edge device.

4. The method according to claim 3, characterized in that adjusting the computational intensity of parameter updates according to the memory usage and power consumption of the industrial edge device includes:
when the memory usage is detected to exceed a preset memory threshold, increasing the filtering intensity applied when performing feature filtering on the multi-source heterogeneous data, and switching the parameter storage precision of the pre-trained model from a high-precision format to a low-precision format;
when the power consumption is detected to exceed a preset power consumption threshold, increasing the freezing ratio of model layer blocks, and reducing the number of active model blocks participating in the calculation in a single iteration;
when the memory usage is detected to be lower than the preset memory threshold and the power consumption is lower than the preset power consumption threshold, reducing the filtering intensity and restoring the parameter storage precision to the high-precision format, so as to increase the computational intensity of parameter updates.

5. The method according to any one of claims 1 to 4, characterized in that the method further includes:
after completing the parameter update of the active block of the model to obtain the target model, deploying the target model as an online inference model, and monitoring the process parameters of the production line in real time during the operation of the online inference model;
when the difference between new process parameters and the current process parameters corresponding to the online inference model is detected to be greater than a change threshold, fine-tuning the parameters of the active block of the model with the new process parameters as the reference, based on the parameters of the online inference model and/or the fine-tuned model parameters of historically similar process parameters, to obtain an updated model;
switching the current online inference model to the updated model, and using the switched model to perform inference on industrial data to guide the operation of the production line.

6. An industrial model fine-tuning and inference device, characterized in that the device includes:
an acquisition module, used to acquire multi-source heterogeneous data from an industrial edge device;
a processing module, used to perform feature filtering processing on the multi-source heterogeneous data to obtain feature data characterizing the multi-source heterogeneous data;
the processing module is also used to establish a mapping relationship between the current process parameters and the functional attributes of each layer block of a pre-trained model, determine the active block of the model to be updated based on the mapping relationship, and update the parameters of the active block of the model based on the feature data to obtain a target model adapted to the current process parameters;
the processing module is also used to perform inference on industrial data based on the target model to obtain at least one of an industrial quality inspection result, an equipment failure prediction result, or a process parameter optimization suggestion, so as to guide the operation of an industrial production line;
the device also includes a determination module;
the processing module is also used to extract dimensional features and statistical features from the multi-source heterogeneous data and construct feature vectors;
the determination module is used to match the feature vectors against preset rules to determine the data type identifier corresponding to the multi-source heterogeneous data;
the processing module is also used to call the corresponding feature filtering rule according to the data type identifier to perform feature filtering processing on the multi-source heterogeneous data;
the device also includes a calculation module;
the processing module is also used to extract the functional feature vector of each layer block in the pre-trained model and to extract the process semantic vector of the process parameters at the current moment;
the calculation module is used to calculate the cosine similarity between the functional feature vector and the process semantic vector;
the processing module is also used to mark blocks with a cosine similarity greater than a preset similarity threshold as candidate active blocks;
the determination module is also used to calculate the average gradient magnitude of each candidate active block during historical fine-tuning, and to determine a preset number of candidate active blocks with the highest average gradient magnitude as the active blocks of the model to be updated;
the processing module is specifically used to:
if the data type identifier indicates image data, generate a mask matrix based on a preset defect area coordinate library, and weight the original image pixels based on the mask matrix, retaining the pixel features of the area covered by the mask matrix as image feature data;
if the data type identifier indicates time series data, determine the length of the time window based on the industrial process cycle, calculate the correlation between the data and the process parameters within the time window, and retain time series segments with a correlation higher than a preset correlation threshold as time series feature data;
if the data type identifier indicates text data, match the text content against a preset process keyword dictionary, and extract the matched keywords and their context vectors as text feature data.

7. An electronic device, characterized in that the electronic device includes: a processor, and a memory communicatively connected to the processor; the memory stores computer-executable instructions; the processor executes the computer-executable instructions stored in the memory to implement the method according to any one of claims 1 to 5.

8. A computer program product, characterized in that the computer program product includes a computer program that, when executed by a processor, implements the method according to any one of claims 1 to 5.