Domain augmentation methods, devices, equipment, and storage media for pre-trained models

By using self-supervised contrastive learning to enhance the source domain pre-trained model with unlabeled target domain data, the method addresses the performance degradation of pre-trained models under domain shift and improves feature extraction and parsing capabilities in vertical business scenarios, thereby improving the model's effectiveness in the target business.

CN115358410B · Active · Publication Date: 2026-06-30 · 传申弘安智能(深圳)有限公司

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: 传申弘安智能(深圳)有限公司
Filing Date: 2022-08-08
Publication Date: 2026-06-30

AI Technical Summary

Technical Problem

When the downstream business scenario differs significantly from the pre-training dataset scenario, the improvement effect of the existing pre-trained model drops sharply, resulting in a decrease in business performance.

Method used

By training with self-supervised contrastive learning, the source domain pre-trained model is augmented with unlabeled target domain data to obtain the target domain pre-trained model, thereby enhancing its feature extraction and parsing capabilities in vertical business scenarios.

Benefits of technology

Without increasing data collection and annotation costs, this method enhances the feature extraction and parsing capabilities of pre-trained models in target domains within vertical business scenarios, thereby improving the application effectiveness of the models in target businesses.


Abstract

This invention provides a method, apparatus, device, and storage medium for domain enhancement of a pre-trained model. The method includes: acquiring a source domain pre-trained model corresponding to a trained domain scene, and target domain data corresponding to a target domain scene; and performing self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data to obtain a target domain pre-trained model that is domain-enhanced for the target domain scene. When domain shift exists, the target domain data corresponding to the target domain scene can be used directly to perform self-supervised domain enhancement learning on the source domain pre-trained model. The domain-enhanced target domain pre-trained model thus retains the capabilities of the source domain pre-trained model while adding feature extraction and parsing capabilities for the environment and targets in vertical business scenarios, thereby improving the business performance of various target businesses in the target domain.

Description

Technical Field

[0001] This invention relates to the field of deep learning technology, and in particular to a domain augmentation method for a pre-trained model, a domain augmentation device for a pre-trained model, a corresponding electronic device, and a corresponding computer storage medium.

Background Technology

[0002] In practical applications, different business domains tend to have their own pre-trained models. Currently, the common model production method in deep learning is to take a pre-trained model with strong feature extraction and generalization capabilities and fine-tune it on target task data. This achieves good training results with less data during model production, leading to better business application outcomes.

[0003] A model produced by fine-tuning in this way may perform several times better than a model produced by direct training. However, a model produced by fine-tuning has a limited range of applicability. When the downstream business scenario differs greatly from the scenario of the pre-training dataset, the improvement contributed by the pre-trained model drops significantly, which in turn degrades the relevant business performance when the model is applied.

Summary of the Invention

[0004] In view of the above problems, embodiments of the present invention are proposed to provide a domain augmentation method for a pre-trained model, a domain augmentation device for a pre-trained model, a corresponding electronic device, and a corresponding computer storage medium to overcome or at least partially solve the above problems.

[0005] This invention discloses a domain augmentation method for a pre-trained model, the method comprising:

[0006] Obtain a source domain pre-trained model corresponding to a trained domain scenario, and target domain data corresponding to a target domain scenario; the source domain pre-trained model is generated based on the dataset of the trained domain scenario, and the dataset of the trained domain scenario and the target domain data do not belong to the same vertical business scenario;

[0007] The source domain pre-trained model is trained using self-supervised contrastive learning based on the target domain data to obtain a target domain pre-trained model with domain enhancement based on the target domain scene.

[0008] Optionally, the target domain data includes unlabeled target domain data from multiple target domain scenarios; the self-supervised contrastive learning training is implemented based on data reconstruction using a training model.

[0009] The step of performing self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data to obtain a target domain pre-trained model after domain augmentation based on the target domain scene includes:

[0010] Obtain the corresponding training framework based on the type of the source domain pre-trained model;

[0011] Within the training framework, based on the unlabeled target domain data from the multiple target domain scenarios, the source domain pre-trained model is trained to reconstruct the target domain data, resulting in a target domain pre-trained model with domain enhancement.

[0012] Optionally, the step of training the source domain pre-trained model to reconstruct the target domain data based on the multiple target domain scenarios and unlabeled target domain data to obtain a target domain pre-trained model with domain augmentation includes:

[0013] Partial occlusion is applied to data from multiple target domains to generate occluded data;

[0014] The source domain pre-trained model is trained to reconstruct the occlusion data to obtain the target domain pre-trained model.

[0015] Optionally, the target domain data includes image data; the step of partially occluding multiple target domain data to generate occlusion data includes:

[0016] Each of the multiple image data is split into at least one image block;

[0017] From the at least one image block, a target image block is randomly selected, and a random occlusion operation is performed on the target image block to obtain occlusion data.

[0018] Optionally, the step of training the source domain pre-trained model to reconstruct the occlusion data to obtain the target domain pre-trained model includes:

[0019] Encode the occluded data with missing information to obtain feature codes for the target domain scene;

[0020] Based on the feature encoding of the target domain scene, the occlusion data of the target domain scene is reconstructed to obtain reconstructed data with complete information;

[0021] Calculate the mean squared error loss of the reconstructed data and the target domain data, and update the source domain pre-trained model based on the mean squared error loss to obtain the updated target domain pre-trained model.
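For reference, the mean squared error loss in the step above can be written out as follows. The notation is introduced here for illustration only and does not come from the patent text: $x_i$ is one of $N$ target domain samples with $D$ elements, $\tilde{x}_i$ is its occluded version, $f_\theta$ is the encoder initialized from the source domain pre-trained model, and $g_\phi$ is the reconstruction step. The update rule assumes plain gradient descent with learning rate $\eta$; the text only says the model is updated based on the loss.

```latex
\mathcal{L}_{\mathrm{MSE}}(\theta, \phi)
  = \frac{1}{N D} \sum_{i=1}^{N}
    \left\| g_{\phi}\!\left( f_{\theta}(\tilde{x}_i) \right) - x_i \right\|_2^2,
\qquad
\theta \leftarrow \theta - \eta \, \nabla_{\theta} \mathcal{L}_{\mathrm{MSE}}
```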

[0022] Optionally, it also includes:

[0023] Obtain the pre-trained model for the target domain, and the labeled business data corresponding to the target business in the target domain scenario;

[0024] Based on the labeled business data, fine-tune the target domain pre-trained model to produce a business model corresponding to the target business.

[0025] Optionally, the trained domain scene dataset includes an image dataset of natural scenes, and the target domain data corresponding to the target domain scene includes image data of power grid scenes; the source domain pre-trained model includes a natural scene processing model corresponding to the image dataset of natural scenes, and the target domain pre-trained model after domain enhancement based on the target domain scene includes a power grid scene processing model after domain enhancement based on the power grid scene; the labeled business data includes transmission line defect labeled data, and the business model includes a detection model for transmission line defect detection.

[0026] This invention also discloses a domain augmentation device for a pre-trained model, the device comprising:

[0027] The source domain pre-trained model acquisition module is used to acquire a source domain pre-trained model corresponding to the trained domain scenario; the source domain pre-trained model is generated based on the dataset of the trained domain scenario, and the dataset of the trained domain scenario and the target domain data do not belong to the same vertical business scenario;

[0028] The target domain data acquisition module is used to acquire target domain data corresponding to the target domain scenario.

[0029] The domain enhancement module is used to perform self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data, so as to obtain a target domain pre-trained model after domain enhancement based on the target domain scene.

[0030] Optionally, the target domain data includes unlabeled target domain data from multiple target domain scenarios; the self-supervised contrastive learning training is implemented based on data reconstruction using a trained model; the domain enhancement module includes:

[0031] The training framework acquisition submodule is used to acquire the corresponding training framework according to the type of the source domain pre-trained model.

[0032] The data reconstruction training submodule is used to train the source domain pre-trained model to reconstruct the target domain data based on the multiple target domain scenarios and unlabeled target domain data under the training framework, thereby obtaining the target domain pre-trained model after domain enhancement.

[0033] Optionally, the data reconstruction training submodule includes:

[0034] The occlusion data generation unit is used to partially occlude data from multiple target domains to generate occlusion data.

[0035] The data reconstruction training unit is used to train the source domain pre-trained model to reconstruct the occluded data, thereby obtaining the target domain pre-trained model.

[0036] Optionally, the target domain data includes image data; the occlusion data generation unit includes:

[0037] The image block splitting subunit is used to split multiple image data into at least one image block;

[0038] The occlusion data generation subunit is used to randomly obtain a target image block from the at least one image block, and perform a random occlusion operation on the target image block to obtain occlusion data.

[0039] Optionally, the data reconstruction training unit includes:

[0040] The feature encoding acquisition subunit is used to encode the occluded data with missing information to obtain the feature encoding for the target domain scene;

[0041] The data reconstruction subunit is used to reconstruct the occlusion data of the target domain scene based on the feature encoding of the target domain scene, so as to obtain reconstructed data with complete information.

[0042] A target domain pre-trained model generation sub-unit is used to calculate the mean squared error loss of the reconstructed data and the target domain data, and to update the source domain pre-trained model based on the mean squared error loss to obtain the updated target domain pre-trained model.

[0043] Optionally, the device further includes:

[0044] The target domain pre-trained model acquisition submodule is used to acquire the target domain pre-trained model;

[0045] The annotation business data acquisition submodule is used to acquire annotation business data corresponding to the target business in the target domain scenario;

[0046] The business processing submodule is used to fine-tune the target domain pre-trained model based on the labeled business data and produce a business model corresponding to the target business.

[0047] Optionally, the trained domain scene dataset includes an image dataset of natural scenes, and the target domain data corresponding to the target domain scene includes image data of power grid scenes; the source domain pre-trained model includes a natural scene processing model corresponding to the image dataset of natural scenes, and the target domain pre-trained model after domain enhancement based on the target domain scene includes a power grid scene processing model after domain enhancement based on the power grid scene; the labeled business data includes transmission line defect labeled data, and the business model includes a detection model for transmission line defect detection.

[0048] This invention also discloses an electronic device, including: a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the steps of any of the domain augmentation methods for a pre-trained model described above.

[0049] This invention also discloses a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of any of the domain augmentation methods for a pre-trained model described above.

[0050] The embodiments of the present invention have the following advantages:

[0051] In this embodiment of the invention, when there is a domain shift between the target domain scene and the trained domain scene (i.e., the source domain), the target domain data corresponding to the target domain scene can be used directly to perform self-supervised domain enhancement learning on the source domain pre-trained model. The domain-enhanced target domain pre-trained model thus retains the capabilities of the source domain pre-trained model and adds the ability to extract and analyze features of the environment and targets in the vertical business scene. This improves the business performance of various target businesses in the target domain without additional data collection, labeling, and training costs.

Attached Figure Description

[0052] Figure 1 is a flowchart illustrating the steps of an embodiment of a domain augmentation method for a pre-trained model according to the present invention;

[0053] Figure 2 is a schematic diagram illustrating the domain enhancement process of the model provided in an embodiment of the present invention;

[0054] Figure 3 is a flowchart illustrating the steps of another embodiment of the domain augmentation method for a pre-trained model according to the present invention;

[0055] Figure 4 is a schematic diagram illustrating the process of performing self-supervised contrastive learning training provided in an embodiment of the present invention;

[0056] Figure 5 is a structural block diagram of an embodiment of a domain enhancement device for a pre-trained model according to the present invention.

Detailed Implementation

[0057] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0058] In practical applications, different business domains tend to have their own pre-trained models. However, training these pre-trained models requires a large amount of business data and computing power. Not every relatively segmented business domain has enough data accumulation and budget for model training.

[0059] Fine-tuning a pre-trained model has two limitations. First, the model's feature extraction and generalization capabilities are directly proportional to the amount of training data: the larger the amount of data used for fine-tuning and the richer the data scenarios, the better the pre-trained model performs. Second, a fine-tuned model has a limited range of applicability: when the downstream business scenario differs significantly from the pre-training dataset scenario (i.e., when there is a domain shift), the improvement contributed by the pre-trained model drops significantly, which limits the application of the business task and significantly reduces the relevant business performance when the model is applied.

[0060] One of the core ideas of this invention is to start from the source domain pre-trained model and fine-tune it with unlabeled target domain data through self-supervised training, to obtain a target domain pre-trained model that performs well in the target domain. Specifically, when there is a domain shift between the target domain scene and the trained domain scene (i.e., the source domain), unlabeled target domain data corresponding to the target domain scene can be used directly to perform self-supervised domain enhancement learning on the source domain pre-trained model. The domain-enhanced target domain pre-trained model retains the capabilities of the source domain pre-trained model while adding the ability to extract and analyze features of the environment and targets in vertical business scenarios. Without additional data collection, labeling, and training costs, the domain-enhanced target domain pre-trained model can be quickly applied to the production of various models in business scenarios, improving the business performance of various target businesses in the target domain. Furthermore, training the target domain pre-trained model once and applying it to multiple target businesses can significantly improve the model output rate, increase model accuracy, and reduce model output costs.

[0061] Referring to Figure 1, a flowchart of the steps of an embodiment of a domain augmentation method for a pre-trained model according to the present invention is shown. The method may specifically include the following steps:

[0062] Step 101: Obtain the source domain pre-trained model corresponding to the trained domain scene, and the target domain data corresponding to the target domain scene;

[0063] In this embodiment of the invention, based on the source domain pre-trained model, unlabeled target domain data is used, and the source domain pre-trained model is fine-tuned by using a self-supervised training method, so as to obtain a target domain pre-trained model that performs well in the target domain.

[0064] The source domain pre-trained model is generated based on the dataset of the trained domain scenario. The training method for this model is unrestricted; it can be supervised, unsupervised, or semi-supervised training. Supervised training refers to learning a function (the model parameters) from a given labeled training dataset, so that when new data is input, results can be predicted with this function. Unsupervised training refers to training the model by analyzing features, such as patterns, of unlabeled data, which can be achieved using probability density function estimation methods and/or methods based on sample similarity measurements. Semi-supervised training falls between the two: only a portion of the training set is labeled, and the model is trained through methods such as pseudo-label generation.

[0065] Specifically, fine-tuning the source domain pre-trained model through self-supervised training is mainly performed when there is a domain shift. Domain shift refers to the difference between the downstream business scenario and the pre-training dataset scenario; that is, the target domain scenario and the trained domain scenario (i.e., the source domain) do not belong to the same vertical business scenario. In this case, the dataset of the trained domain scenario and the target domain data are not scenario data of the same vertical business scenario, and the source domain pre-trained model corresponding to the trained domain scenario cannot be applied directly to scenario processing outside its own vertical business scenario.

[0066] In practical applications, when there is a domain shift between the target domain and the source domain, the source domain pre-trained model needs to be fine-tuned so that it can handle scenarios outside its own vertical business scenario. As shown in Figure 2, target domain data corresponding to the target domain scene can be obtained and collected into a target domain dataset; the source domain pre-trained model is then fine-tuned and updated based on this data to obtain the target domain pre-trained model that is domain-enhanced for the target domain scene.

[0067] The target domain dataset can be used to train a target domain pre-trained model corresponding to a target domain scenario. In this embodiment of the invention, it is used to fine-tune a source domain pre-trained model. Regarding the collection of the target domain dataset, since the collected target domain data can be unlabeled sample data, the target domain data included in the dataset can include target domain data from multiple target domain scenarios. These multiple target domain scenarios can be specific business scenarios (i.e., downstream business scenarios) under other upstream business scenarios that do not belong to the vertical business scenario of the source domain, such as transmission line scenarios and power station scenarios in the power grid scenario. This increases the diversity and richness of the domain scenarios, thereby improving the feature extraction and generalization capabilities of the pre-trained model. Furthermore, based on unlabeled target domain data, it reduces data collection and labeling costs as well as training costs.

[0068] Step 102: Perform self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data to obtain the target domain pre-trained model after domain enhancement based on the target domain scene.

[0069] The source domain pre-trained model corresponding to the trained domain scenario cannot be applied to scenario processing that does not belong to the vertical business scenario. After obtaining the target domain data corresponding to the target domain scenario, the source domain pre-trained model can be updated using the target domain data. This allows the domain-enhanced target domain pre-trained model to retain the source domain pre-trained model and add the ability to extract and analyze the features of the environment and targets in the vertical business scenario, thus obtaining the domain-enhanced target domain pre-trained model.

[0070] Updating the source domain pre-trained model is achieved primarily by using the target domain data to perform self-supervised contrastive learning training on it. For example, as shown in Figure 2, a self-supervised domain enhancement model is built and trained with the target domain data to update the source domain pre-trained model, thereby enhancing its feature extraction capability in the target domain and obtaining the target domain pre-trained model.

[0071] In constructing a domain-adaptive augmentation model, the self-supervised contrastive learning training paradigm can be used as the basic paradigm for domain-adaptive augmentation. Self-supervised contrastive learning training is a type of unsupervised learning paradigm.

[0072] In practical applications, when building the self-supervised domain augmentation model, the corresponding training framework can be obtained based on the type of the source domain pre-trained model. Specifically, a training framework can be selected according to the type of the source domain pre-trained model, e.g., the ResNet series (models built by stacking multiple residual structures) or the Transformer series (models originally proposed for natural language processing). The source domain pre-trained model can then be updated under the selected training framework. The training frameworks available for selection include, but are not limited to, the MoCo series (an unsupervised contrastive learning method applicable to visual models), the BYOL series (contrastive self-supervised learning without negative samples), and the MAE series (masked autoencoders, which learn by reconstructing masked portions of the input). This invention does not impose any limitations on these.
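As a rough illustration of choosing a training framework from the model type, the sketch below uses a simple lookup table. The function name `select_training_framework`, the table `FRAMEWORK_BY_MODEL_FAMILY`, and the specific pairings (ResNet with MoCo, Transformer with MAE, BYOL as a fallback) are assumptions made for this example; the text only states that the framework is selected according to the type of the source domain pre-trained model.

```python
# Hypothetical pairing of model families with self-supervised frameworks.
# These pairings are illustrative assumptions, not taken from the patent.
FRAMEWORK_BY_MODEL_FAMILY = {
    "resnet": "MoCo",
    "transformer": "MAE",
}


def select_training_framework(model_type: str, default: str = "BYOL") -> str:
    """Return a framework name for a model type string.

    The match is a case-insensitive substring test, so "ResNet-50" maps to
    the "resnet" family. Unknown model types fall back to `default`.
    """
    key = model_type.lower()
    for family, framework in FRAMEWORK_BY_MODEL_FAMILY.items():
        if family in key:
            return framework
    return default
```

A table keeps the selection explicit and easy to extend when further framework families are supported.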

[0073] During the process of model domain augmentation training and updating, under the selected training framework, the source domain pre-trained model can be trained based on multiple target domain scenarios and unlabeled target domain data to reconstruct the target domain data, resulting in the target domain pre-trained model after domain augmentation. Specifically, under the selected training framework, the obtained source domain pre-trained model can be loaded first, and then the target domain data can be used to update the source domain pre-trained model.

[0074] The self-supervised contrastive learning training is achieved primarily through data reconstruction by the model being trained. Unlabeled target domain data from the various target domain scenarios is partially occluded to generate occluded data. This occluded data serves as training data with missing information: the model is trained to fill in the missing information, which updates the source domain pre-trained model. Specifically, the model reconstructs, from the occluded data, the target domain data as it was before partial occlusion, thus producing the domain-enhanced target domain pre-trained model.

[0075] In one example, the target domain data may include image data, which may be unlabeled images belonging to multiple target domain scenes. To obtain occluded data with missing information, each image can be split into at least one image block. Target image blocks are then randomly selected from the image blocks and randomly occluded. The partially occluded image blocks serve as the occluded data with missing information.
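The block splitting and random occlusion described above might look like the following NumPy sketch. Several details are assumptions made for illustration: the image blocks are non-overlapping squares, occlusion is implemented by zero-filling, and the fraction of occluded blocks is controlled by a hypothetical `mask_ratio` parameter. The function names are likewise invented for this example.

```python
import numpy as np


def split_into_patches(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping (patch, patch, C) blocks.

    Returns an array of shape (num_patches, patch, patch, C), ordered row by
    row. H and W must be divisible by `patch`.
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be divisible by patch")
    blocks = image.reshape(h // patch, patch, w // patch, patch, c)
    return blocks.transpose(0, 2, 1, 3, 4).reshape(-1, patch, patch, c)


def occlude_random_patches(patches: np.ndarray, mask_ratio: float,
                           rng: np.random.Generator):
    """Zero out a randomly chosen subset of blocks (assumed occlusion rule).

    Returns the occluded copy and a boolean mask marking the occluded blocks.
    """
    n = patches.shape[0]
    num_masked = int(round(n * mask_ratio))
    masked_idx = rng.choice(n, size=num_masked, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[masked_idx] = True
    occluded = patches.copy()
    occluded[mask] = 0.0
    return occluded, mask
```

The returned mask records which blocks were hidden, which is useful if the reconstruction loss is later restricted to the occluded blocks.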

[0076] Therefore, when reconstructing occluded data using a pre-trained model in the source domain to fill in missing information, the pre-trained model can be updated based on the reconstructed data to obtain an updated pre-trained model in the target domain. This updated model has already undergone domain augmentation based on the target domain scene. This is primarily achieved by encoding the occluded data with missing information to obtain feature codes specific to the target domain scene. Based on these feature codes, the occluded data in the target domain scene is reconstructed to obtain fully reconstructed data. Then, the mean squared error loss between the reconstructed data and the target domain data is calculated, and the pre-trained model in the source domain is updated accordingly, thus obtaining the pre-trained model in the target domain. This process requires no additional manual data annotation or specific network structure design or optimization; it only uses unlabeled information and a fixed self-supervised training paradigm to enhance the performance of the pre-trained model in the target domain.
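The encode, reconstruct, and update cycle described above can be sketched with a deliberately tiny model. This is a toy under stated assumptions, not the patent's implementation: the encoder and the reconstruction step are single linear maps, each row of `target_data` stands in for one flattened image, occlusion zeroes individual elements, and the update is full-batch gradient descent. The names `domain_enhance`, `source_encoder`, `decoder`, `mask_ratio`, and `lr` are hypothetical.

```python
import numpy as np


def mse_loss(reconstructed: np.ndarray, target: np.ndarray) -> float:
    """Mean squared error between reconstruction and the unoccluded data."""
    return float(np.mean((reconstructed - target) ** 2))


def domain_enhance(source_encoder, decoder, target_data,
                   mask_ratio=0.5, lr=0.05, steps=300, seed=0):
    """Update a (toy, linear) source model on unlabeled target domain data.

    source_encoder: (d, k) weights standing in for the source domain model.
    decoder:        (k, d) weights used only for reconstruction.
    target_data:    (n, d) unlabeled target domain samples.
    Returns updated encoder, updated decoder, and the loss per step.
    """
    rng = np.random.default_rng(seed)
    enc, dec = source_encoder.copy(), decoder.copy()  # keep inputs intact
    n, d = target_data.shape
    history = []
    for _ in range(steps):
        # Occlude a random subset of elements (assumed zero-fill occlusion).
        keep = rng.random((n, d)) >= mask_ratio
        occluded = target_data * keep
        codes = occluded @ enc            # feature codes of occluded data
        recon = codes @ dec               # reconstruction of complete data
        history.append(mse_loss(recon, target_data))
        # Gradients of the MSE loss with respect to both weight matrices.
        grad_recon = 2.0 * (recon - target_data) / (n * d)
        grad_dec = codes.T @ grad_recon
        grad_enc = occluded.T @ (grad_recon @ dec.T)
        enc -= lr * grad_enc
        dec -= lr * grad_dec
    return enc, dec, history
```

In a real setting the encoder would be the loaded source domain network and the optimizer would come from a deep learning framework; the loop structure (occlude, encode, reconstruct, compare with the original, update) is the part this sketch is meant to show.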

[0077] After being updated with the target domain data, the model retains the capabilities of the original source domain pre-trained model and adds feature extraction and parsing capabilities for the environment and targets in vertical business scenarios. At this point, the updated source domain pre-trained model has undergone domain augmentation based on the target domain scenario and exhibits good feature extraction and generalization capabilities in the target domain, making it applicable to scenarios outside the source domain's vertical business scenario. In a preferred embodiment, after the target domain pre-trained model is obtained, the target business of the target domain scenario can also be processed based on the target domain pre-trained model.

[0078] As shown in Figure 2, the domain-enhanced source domain pre-trained model can be exported; this is the target domain pre-trained model. The target domain pre-trained model can then be used to process the target business of the target domain scenario. Since the target business is actually processed by a corresponding business model, business processing based on the target domain pre-trained model means applying the target domain pre-trained model to the production of the various business models in the business scenario.

[0079] Based on the pre-trained model in the target domain, various business models can be quickly trained and produced using a small amount of labeled business data in the target domain to handle target business. Specifically, labeled business data corresponding to the target business in the target domain scenario can be obtained, and then the pre-trained model in the target domain can be fine-tuned based on the labeled business data to generate a business model corresponding to the target business. This business model can then be used to process the target business accordingly. The domain-enhanced pre-trained model in the target domain can be seamlessly applied to various downstream target businesses, effectively improving the efficiency and accuracy of the model.
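The fine-tuning step can be sketched in the same toy setting. The assumptions here are substantial and are made only to keep the example small: the target domain pre-trained model is a single linear encoder, the business task is binary classification with a logistic head, and both the encoder and the head are updated by full-batch gradient descent. A real business model such as a defect detector would be far more complex. All names (`fine_tune`, `target_encoder`, `head_w`, `head_b`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    # Clip to avoid overflow in exp for very large magnitudes.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def bce_loss(prob, labels):
    eps = 1e-12
    return float(-np.mean(labels * np.log(prob + eps)
                          + (1.0 - labels) * np.log(1.0 - prob + eps)))


def fine_tune(target_encoder, inputs, labels, lr=0.1, steps=500):
    """Fine-tune a (toy, linear) pre-trained encoder plus a new logistic head.

    target_encoder: (d, k) weights of the target domain pre-trained model.
    inputs:         (n, d) labeled business samples.
    labels:         (n,) array of 0.0 / 1.0 labels.
    """
    enc = target_encoder.copy()
    head_w = np.zeros(enc.shape[1])   # task head starts from zero
    head_b = 0.0
    n = inputs.shape[0]
    losses = []
    for _ in range(steps):
        codes = inputs @ enc
        prob = sigmoid(codes @ head_w + head_b)
        losses.append(bce_loss(prob, labels))
        grad_logit = (prob - labels) / n          # d(loss) / d(logit)
        grad_w = codes.T @ grad_logit
        grad_b = grad_logit.sum()
        grad_enc = inputs.T @ np.outer(grad_logit, head_w)
        enc -= lr * grad_enc                      # encoder is updated too
        head_w -= lr * grad_w
        head_b -= lr * grad_b
    return enc, head_w, head_b, losses


def predict(enc, head_w, head_b, inputs):
    """Return 0/1 predictions from the fine-tuned business model."""
    return (sigmoid(inputs @ enc @ head_w + head_b) >= 0.5).astype(int)
```

Because the same pre-trained encoder can be copied and fine-tuned with a different head for each target business, this structure matches the idea of training the target domain model once and reusing it across businesses.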

[0080] It should be noted that the operations of enhancing the source domain pre-trained model for the target domain scenario with target domain data, and of generating business models for the various downstream target businesses in the target domain scenario based on the target domain pre-trained model, can be applied to a wide range of scenarios. The only requirements are that the target domain scenario and the trained domain scenario (i.e., the source domain) do not belong to the same vertical business scenario, and that the target domain scenario is a specific business scenario under another upstream business scenario outside the source domain's vertical business scenario. Furthermore, when business models are generated from the target domain pre-trained model, businesses that do not belong to the same vertical business scenario but do belong to similar business scenarios, such as transmission line scenarios and power station scenarios, may fine-tune the same target domain pre-trained model; that is, the target domain pre-trained model does not need to be retrained. The embodiments of the present invention do not impose any limitations in this respect.

[0081] For example, when domain augmentation is performed for the power grid domain to process related businesses in the power grid scenario, the dataset of the trained domain scenario can be an image dataset of natural scenes, and the target domain data corresponding to the target domain scenario includes image data of the power grid scenario. The source domain pre-trained model can include a natural scene processing model corresponding to the image dataset of natural scenes, and the domain-enhanced target domain pre-trained model can include a power grid scenario processing model that is domain-enhanced for the power grid scenario. When the domain-enhanced power grid scenario processing model is used for a target business, such as detecting transmission line defects, the labeled business data can be transmission line defect labeled data, and the business model can be a detection model for transmission line defects.

[0082] In this embodiment of the invention, when there is a domain shift between the target domain scene and the trained domain scene (i.e., the source domain), unlabeled target domain data corresponding to the target domain scene can be used directly to perform self-supervised domain enhancement learning on the source domain pre-trained model. The domain-enhanced target domain pre-trained model thereby retains the capabilities of the source domain pre-trained model and gains the ability to extract and analyze features of the environment and targets in the vertical business scene. Without additional data collection and labeling costs, the domain-enhanced target domain pre-trained model can be quickly applied to the production of various models in the business scene, thereby improving the business performance of various target businesses in the target domain.

[0083] Referring to Figure 3, a flowchart of another embodiment of the domain augmentation method for a pre-trained model according to the present invention is shown. This embodiment takes as an example domain augmentation for the power grid domain in order to process related services within a power grid scenario. Specifically, the method may include the following steps:

[0084] Step 301: Obtain the natural scene processing model corresponding to the natural scene, and the image data corresponding to the power grid scene;

[0085] In this embodiment of the invention, based on the source domain pre-trained model, unlabeled target domain data is used, and the source domain pre-trained model is fine-tuned by using a self-supervised training method, so as to obtain a target domain pre-trained model that performs well in the target domain.

[0086] The operations of enhancing the source domain pre-trained model toward the target domain scenario using target domain data, and of generating business models for the various downstream target businesses in the target domain scenario based on the target domain pre-trained model, can be applied to a wide range of scenarios. The only requirements are that the target domain scenario and the trained domain scenario (i.e., the source domain) do not belong to the same vertical business scenario, and that the target domain scenario is a specific business scenario under another upstream business scenario outside the vertical business scenarios of the source domain.

[0087] In practical applications, different business domains tend to have their own pre-trained models. In this embodiment of the invention, it is assumed that a downstream target business in a power grid scenario (i.e., the target domain) needs to be processed, such as the task of detecting defects in transmission lines. The source domain pre-trained model is assumed to be a natural scene processing model corresponding to the natural scene. The natural scene processing model is trained based on the image data of the natural scene. It does not belong to the vertical business scenario of the power grid scenario. Due to the domain difference, its accuracy will be significantly reduced compared to the model pre-trained directly using the power grid labeled data.

[0088] At this point, image data corresponding to the power grid scenario can be acquired, and the natural scene processing model can be trained by self-supervised contrastive learning on that image data. This updates the natural scene processing model and enhances its ability to extract and analyze environmental and target features in the power grid scenario.

[0089] In a specific implementation, the acquired image data corresponding to the power grid scenario can cover various downstream business scenarios and is not limited to image data related to the task of detecting defects in transmission lines. It can also include image data of transmission line scenarios, power station scenarios, and so on, which increases the diversity and richness of the domain scenarios and improves the feature extraction and generalization capabilities of the pre-trained model. The acquired image data corresponding to the power grid scenario is unlabeled image data, which reduces the cost of data collection and labeling.

[0090] Step 302: Perform self-supervised contrastive learning training on the natural scene processing model based on the image data of the power grid scene to obtain the power grid scene processing model after domain enhancement based on the power grid scene.

[0091] Updating the natural scene processing model based on image data of the power grid scene can take the form of training the natural scene processing model by self-supervised contrastive learning on that image data. Specifically, a self-supervised domain enhancement model is built and trained with image data of the power grid scene, and the natural scene processing model is updated accordingly. This enhances the feature extraction capability of the natural scene processing model in the power grid scene and yields the target domain pre-trained model, i.e., the power grid scene processing model.

[0092] In practical applications, the appropriate training framework can be selected according to the type of the natural scene processing model. Under the selected training framework, the natural scene processing model can be trained to perform data reconstruction on unlabeled image data of various power grid scenes, resulting in a domain-enhanced power grid scene processing model. Specifically, under the selected training framework, the acquired natural scene processing model is loaded first, and the natural scene processing model is then updated using image data of power grid scenes.
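The selection-then-load-then-update order described above can be sketched as follows. This is only an illustrative sketch: the registry `TRAINING_FRAMEWORKS`, the framework label, and the function names are hypothetical and do not come from the specification; the only pairing taken from the text is a Transformer model with a masked-reconstruction (MAE-style) paradigm.

```python
from typing import Any, Callable, Dict, Iterable

# Hypothetical registry: keys and labels are illustrative assumptions.
# Only the Transformer -> masked reconstruction pairing is suggested by the text.
TRAINING_FRAMEWORKS: Dict[str, str] = {
    "transformer": "masked-image-reconstruction",
}


def select_training_framework(model_type: str) -> str:
    """Return the training framework registered for a given model type."""
    key = model_type.strip().lower()
    if key not in TRAINING_FRAMEWORKS:
        raise ValueError(f"no training framework registered for {model_type!r}")
    return TRAINING_FRAMEWORKS[key]


def domain_enhance(
    model_type: str,
    load_pretrained: Callable[[], Any],
    train_step: Callable[[str, Any, Any], Any],
    batches: Iterable[Any],
) -> Any:
    """Select the framework, load the source model, then update it batch by batch."""
    framework = select_training_framework(model_type)
    model = load_pretrained()              # load the source domain model first
    for batch in batches:                  # then update it with target domain data
        model = train_step(framework, model, batch)
    return model
```

The caller supplies `load_pretrained` and `train_step`, so the sketch stays independent of any particular deep learning library.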

[0093] As one example, consider a Transformer pre-trained model and the MAE (masked autoencoder) training paradigm. The original Transformer pre-trained model can be a natural scene processing model corresponding to natural scenes, that is, a model pre-trained on the ImageNet dataset. The ImageNet dataset contains image data of natural scenes and does not include vertical domain scene data, such as image data of power grid scenes. In this case, after downloading the open-source pre-trained Transformer model and collecting image data of power grid scenes, the MAE training paradigm model framework can be built. The MAE training paradigm is a self-supervised contrastive learning model architecture that trains the model to reconstruct the information of occluded images, so that the model acquires general feature understanding and extraction capabilities.

[0094] The MAE model mainly consists of two parts: a feature encoder and a feature decoder. Both the feature encoder and decoder use the Transformer architecture. The encoder encodes features of the occluded image with missing information, and then the decoder reconstructs the image, filling in the occluded parts and completing the missing information.

[0095] Referring to Figure 4, a schematic diagram of the self-supervised contrastive learning training process provided in an embodiment of the present invention is shown. After the model is constructed and the acquired Transformer pre-trained model (i.e., the natural scene processing model) is loaded, partial occlusion can be applied to unlabeled image data of various power grid scenes to generate occluded data with missing information. This data is then used to train the natural scene processing model to reconstruct the occluded data, i.e., to perform fill-in training that supplements the missing information.

[0096] In practical applications, as shown in Figure 4, each unlabeled image (i.e., original image) from the multiple power grid scenarios can be split into at least one image block, such as local image block X1, ..., local image block Xn. Some of these image blocks are randomly selected as target image blocks and subjected to a random occlusion operation, such as setting them to zero, to obtain occluded training data with missing information, i.e., occluded data. This occluded training data can then be input into a Transformer encoder loaded with the pre-trained model for encoding, to obtain feature codes specific to the power grid scenario. These feature codes can then be input into a Transformer decoder to reconstruct the image, filling in the missing information to obtain a fully reconstructed image. The mean squared error (MSE) loss between the reconstructed image and the original image can then be calculated. The natural scene processing model is updated based on this MSE loss, and the fine-tuned model is exported. This is the pre-trained model after power grid target domain enhancement, i.e., the power grid scene processing model.
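The encode, decode, and MSE-update cycle described above can be illustrated with a deliberately small NumPy sketch. The assumptions here are substantial: two plain linear maps stand in for the Transformer encoder and decoder, randomly initialized `w_enc` stands in for loaded pre-trained weights, the patches are synthetic numbers rather than power grid images, and the 50% occlusion ratio and learning rate are arbitrary choices not given in the specification.

```python
import numpy as np


def train_step(w_enc, w_dec, patches, occluded, lr):
    """One gradient-descent step on the MSE between reconstruction and original."""
    codes = occluded @ w_enc                  # toy "encoder": feature codes
    recon = codes @ w_dec                     # toy "decoder": reconstructed patches
    err = recon - patches
    loss = float(np.mean(err ** 2))           # MSE against the original patches
    grad_recon = 2.0 * err / err.size         # dL/d(recon)
    grad_dec = codes.T @ grad_recon           # dL/d(w_dec)
    grad_enc = occluded.T @ (grad_recon @ w_dec.T)  # dL/d(w_enc)
    return w_enc - lr * grad_enc, w_dec - lr * grad_dec, loss


# Synthetic stand-in data: 16 flattened patches with 8 values each.
rng = np.random.default_rng(0)
patches = rng.random((16, 8))
target_idx = rng.choice(16, size=8, replace=False)   # assumed 50% occlusion ratio
occluded = patches.copy()
occluded[target_idx] = 0.0                           # occlusion by setting to zero

w_enc = rng.normal(scale=0.1, size=(8, 4))  # placeholder for loaded pre-trained weights
w_dec = rng.normal(scale=0.1, size=(4, 8))  # decoder trained from scratch in this toy

losses = []
for _ in range(200):
    w_enc, w_dec, loss = train_step(w_enc, w_dec, patches, occluded, lr=0.1)
    losses.append(loss)
```

In this toy each patch is processed independently, so a zeroed patch cannot actually be recovered; a Transformer can fill in occluded blocks because it attends across the visible blocks of the same image. The sketch is meant only to show the data flow and the loss-driven update.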

[0097] In this embodiment of the invention, no additional manual data annotation is required, nor is specific network structure design or optimization necessary. The natural scene processing model can be enhanced in power grid scenarios simply by using unlabeled information and a fixed self-supervised training paradigm.

[0098] Step 303: Obtain transmission line defect labeling data, fine-tune the power grid scenario processing model based on the transmission line defect labeling data, and generate a detection model for transmission line defect detection.

[0099] The power grid scene processing model, updated with image data of the power grid scenario, retains the capabilities of the natural scene processing model and adds the ability to extract and analyze features of the environment and targets in the vertical business scenario. At this point, the natural scene processing model has been enhanced based on the power grid scenario and has good feature extraction and generalization capabilities there, so it can be applied to scene processing in a vertical business scenario to which it did not originally belong.

[0100] At this point, the power grid scenario processing model obtained after domain augmentation can be used to process target services within the power grid scenario, such as power grid inspection. Suppose a downstream target service within the power grid scenario (i.e., the target domain) needs to be processed, such as the task of detecting transmission line defects. A model corresponding to the target service, such as a target detection model, can then be constructed. This can be done by loading the domain-augmented pre-trained model and then collecting labeled service data corresponding to the target service, such as a small amount of labeled transmission line defect data, to perform transfer learning on the detection model. By using the labeled transmission line defect data to fine-tune the power grid scenario processing model, a high-precision detection model for transmission line defect detection can be produced quickly.
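The fine-tuning step can be sketched in the same toy setting. The assumptions are again loud: a linear map `w_enc` stands in for the domain-augmented pre-trained model, a binary "defective / not defective" logistic head stands in for a real detection model, and the labeled data is synthetic. The function name `fine_tune` and its hyperparameters are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fine_tune(w_pretrained, x, y, lr=0.1, steps=300, seed=0):
    """Fine-tune a copy of the pre-trained encoder plus a new logistic head."""
    rng = np.random.default_rng(seed)
    w_enc = w_pretrained.copy()               # start from the pre-trained weights
    w_head = rng.normal(scale=0.01, size=w_enc.shape[1])  # new task-specific head
    bias = 0.0
    losses = []
    for _ in range(steps):
        feats = x @ w_enc
        p = sigmoid(feats @ w_head + bias)
        loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        losses.append(float(loss))
        g = (p - y) / len(y)                  # gradient of the loss w.r.t. the logits
        grad_head = feats.T @ g
        grad_enc = x.T @ np.outer(g, w_head)
        w_head = w_head - lr * grad_head
        bias = bias - lr * float(g.sum())
        w_enc = w_enc - lr * grad_enc
    return w_enc, w_head, bias, losses


# Synthetic "small labeled set": 20 samples, label depends on the first feature.
rng = np.random.default_rng(1)
x_labeled = rng.random((20, 8))
y_labeled = (x_labeled[:, 0] > 0.5).astype(float)
w_grid_pretrained = rng.normal(scale=0.1, size=(8, 4))  # placeholder weights
w_before = w_grid_pretrained.copy()

w_task, w_head, bias, losses = fine_tune(w_grid_pretrained, x_labeled, y_labeled)
```

Because `fine_tune` works on a copy, the pre-trained weights are left untouched and can serve as the starting point for another target service.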

[0101] Specifically, after obtaining the domain-enhanced power grid scene model, it can also process power transmission station defect detection. For example, the detection model can be used to classify images and detect transmission line defects based on images that are identified as defective. Alternatively, the detection model can be used to detect transmission line defects by detecting the presence of defective areas in the images. At this point, only a small amount of labeled power transmission station defect detection data is needed to produce the corresponding detection model. This allows for a single training of a pre-trained model in the target domain, which can be used for applications in various target businesses, improving the business performance of various target businesses in the target domain. Furthermore, it can significantly increase the model's output rate, improve model accuracy, and reduce model production costs.

[0102] It should be noted that, when producing a business model for each business based on the target domain pre-trained model, businesses that do not belong to the same vertical business scenario but belong to similar business scenarios, such as a power transmission line scenario and a power station scenario, may fine-tune the same target domain pre-trained model. That is, the power grid scenario processing model obtained in step 302 can be fine-tuned with the relevant business data in the power grid scenario to generate the business model of the relevant business, without retraining the target domain pre-trained model.

[0103] In this embodiment of the invention, based on the source domain pre-trained model, unlabeled target domain data is used and self-supervised training is employed to fine-tune the source domain pre-trained model, thereby obtaining a target domain pre-trained model that performs well in the target domain. Specifically, when there is a domain shift between the target domain scene and the trained domain scene (i.e., the source domain), unlabeled target domain data corresponding to the target domain scene can be used directly to perform self-supervised domain enhancement learning on the source domain pre-trained model. The domain-enhanced target domain pre-trained model thereby retains the capabilities of the source domain pre-trained model while gaining the ability to extract and analyze features of the environment and targets in vertical business scenarios. This avoids additional data collection and labeling costs and enables the domain-enhanced target domain pre-trained model to be quickly applied to the production of various models in business scenarios, thereby improving the business performance of various target businesses in the target domain.

[0104] It should be noted that, for the sake of simplicity, the method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments of the present invention are not limited to the described order of actions, because according to the embodiments of the present invention, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily essential to the embodiments of the present invention.

[0105] Referring to Figure 5, a structural block diagram of an embodiment of a domain enhancement device for a pre-trained model according to the present invention is shown. The device may specifically include the following modules:

[0106] The source domain pre-trained model acquisition module 501 is used to acquire the source domain pre-trained model corresponding to the trained domain scenario. The source domain pre-trained model is generated based on the dataset of the trained domain scenario. The dataset of the trained domain scenario and the target domain data do not belong to the same vertical business scenario.

[0107] The target domain data acquisition module 502 is used to acquire target domain data corresponding to the target domain scenario.

[0108] The domain enhancement module 503 is used to perform self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data, so as to obtain the target domain pre-trained model after domain enhancement based on the target domain scene.

[0109] In one embodiment of the present invention, the target domain data includes unlabeled target domain data for multiple target domain scenarios; self-supervised contrastive learning training is implemented based on data reconstruction using a training model; the domain enhancement module 503 may include the following sub-modules:

[0110] The training framework acquisition submodule is used to acquire the corresponding training framework based on the type of the source domain pre-trained model.

[0111] The data reconstruction training submodule is used to train, under the training framework, the source domain pre-trained model to reconstruct the target domain data, based on the unlabeled target domain data of multiple target domain scenarios, thereby obtaining a target domain pre-trained model after domain enhancement.

[0112] In one embodiment of the present invention, the data reconstruction training submodule may include the following units:

[0113] The occlusion data generation unit is used to partially occlude multiple pieces of target domain data to generate occluded data.

[0114] The data reconstruction training unit is used to train the source domain pre-trained model to reconstruct the occluded data and obtain the target domain pre-trained model.

[0115] In one embodiment of the present invention, the target domain data includes image data; the occlusion data generation unit may include the following sub-units:

[0116] The image block splitting subunit is used to split multiple image data into at least one image block;

[0117] The occlusion data generation subunit is used to randomly obtain a target image block from at least one image block and perform random occlusion operations on the target image block to obtain occlusion data.
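The two sub-units above (block splitting and random occlusion) can be sketched with NumPy as follows. The block size, the occlusion ratio, and the function names are illustrative assumptions; the specification fixes none of them, and zeroing is used here because the description names it as one possible occlusion operation.

```python
import numpy as np


def split_into_blocks(image, block):
    """Split an (H, W, C) image into non-overlapping block x block image blocks."""
    h, w, c = image.shape
    if h % block or w % block:
        raise ValueError("image height and width must be multiples of the block size")
    rows, cols = h // block, w // block
    blocks = image.reshape(rows, block, cols, block, c).swapaxes(1, 2)
    return blocks.reshape(rows * cols, block, block, c)


def occlude_random_blocks(blocks, ratio, rng):
    """Zero out a randomly chosen subset of blocks; return the result and the chosen indices."""
    n = blocks.shape[0]
    n_target = max(1, int(n * ratio))          # assumed: at least one target block
    target = np.sort(rng.choice(n, size=n_target, replace=False))
    occluded = blocks.copy()
    occluded[target] = 0                       # occlusion by setting to zero
    return occluded, target


# Tiny example: a 4 x 4 single-channel "image" split into four 2 x 2 blocks.
image = np.arange(16).reshape(4, 4, 1)
blocks = split_into_blocks(image, block=2)
occluded, target = occlude_random_blocks(blocks, ratio=0.5, rng=np.random.default_rng(0))
```

Returning the chosen indices alongside the occluded blocks keeps it possible to compare reconstructed blocks with their originals later.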

[0118] In one embodiment of the present invention, the data reconstruction training unit may include the following sub-units:

[0119] The feature encoding acquisition subunit is used to encode occluded data with missing information to obtain feature codes for the target domain scene;

[0120] The data reconstruction subunit is used to reconstruct the occluded data of the target domain scene based on the feature encoding of the target domain scene, so as to obtain the reconstructed data with complete information.

[0121] The target domain pre-trained model generation sub-unit is used to calculate the mean squared error loss of the reconstructed data and the target domain data. The source domain pre-trained model is updated based on the mean squared error loss to obtain the updated target domain pre-trained model.
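For reference, the mean squared error loss named above can be written out as follows. The notation is an assumption introduced here: $x_i$ is the $i$-th value of the original target domain data, $\hat{x}_i(\theta)$ is the corresponding reconstructed value, $N$ is the number of values, $\theta$ denotes the model parameters, and $\eta$ is a learning rate. The gradient-descent update is one common way to update a model "based on the loss"; the specification does not state which optimizer is used.

```latex
\mathcal{L}_{\mathrm{MSE}}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{x}_i(\theta) - x_i \right)^2,
\qquad
\theta \leftarrow \theta - \eta \, \nabla_{\theta} \mathcal{L}_{\mathrm{MSE}}(\theta)
```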

[0122] In one embodiment of the present invention, the device may further include the following modules:

[0123] The target domain pre-trained model acquisition module is used to acquire the target domain pre-trained model;

[0124] The annotation business data acquisition module is used to acquire annotation business data corresponding to the target business in the target domain scenario;

[0125] The business processing module is used to fine-tune the target domain pre-trained model based on labeled business data and generate a business model corresponding to the target business.
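The module structure described in this device embodiment can be pictured as a simple composition of callables. This is a structural sketch only: the class names, field names, and placeholder string values are hypothetical, and each callable would wrap real model loading, data acquisition, and training code in practice.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class DomainEnhancementDevice:
    """Mirrors modules 501 to 503: acquire model, acquire data, enhance."""
    acquire_source_model: Callable[[], Any]
    acquire_target_data: Callable[[], Sequence[Any]]
    enhance: Callable[[Any, Sequence[Any]], Any]

    def run(self) -> Any:
        source_model = self.acquire_source_model()
        target_data = self.acquire_target_data()
        return self.enhance(source_model, target_data)


@dataclass
class BusinessModelProducer:
    """Mirrors the optional modules: acquire labeled data, then fine-tune."""
    acquire_labeled_data: Callable[[], Sequence[Any]]
    fine_tune: Callable[[Any, Sequence[Any]], Any]

    def produce(self, target_model: Any) -> Any:
        return self.fine_tune(target_model, self.acquire_labeled_data())


# Usage with placeholder callables (strings stand in for models and data).
device = DomainEnhancementDevice(
    acquire_source_model=lambda: "natural-scene-model",
    acquire_target_data=lambda: ["grid-image-1", "grid-image-2"],
    enhance=lambda model, data: f"{model}+enhanced-on-{len(data)}-images",
)
producer = BusinessModelProducer(
    acquire_labeled_data=lambda: ["labeled-defect-1"],
    fine_tune=lambda model, data: f"{model}+fine-tuned-on-{len(data)}-labels",
)
grid_model = device.run()
detection_model = producer.produce(grid_model)
```

Keeping the producer separate from the enhancement device reflects the point made in the description that one target domain pre-trained model can feed several downstream business models.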

[0126] In one embodiment of the present invention, the dataset of the trained domain scene includes an image dataset of natural scenes, and the target domain data corresponding to the target domain scene includes image data of power grid scenes; the source domain pre-trained model includes a natural scene processing model corresponding to the image dataset of natural scenes, and the target domain pre-trained model after domain enhancement based on the target domain scene includes a power grid scene processing model after domain enhancement based on the power grid scene; the labeled business data includes transmission line defect labeled data, and the business model includes a detection model for detecting transmission line defects.

[0127] In this embodiment of the invention, when there is a domain shift between the target domain scene and the trained domain scene (i.e., the source domain), unlabeled target domain data corresponding to the target domain scene can be used directly to perform self-supervised domain enhancement learning on the source domain pre-trained model. The domain-enhanced target domain pre-trained model thereby retains the capabilities of the source domain pre-trained model and gains the ability to extract and analyze features of the environment and targets in the vertical business scene. Without additional data collection and labeling costs, the domain-enhanced target domain pre-trained model can be quickly applied to the production of various models in the business scene, thereby improving the business performance of various target businesses in the target domain.

[0128] As the device embodiment is basically similar to the method embodiment, the description is relatively simple, and relevant parts can be found in the description of the method embodiment.

[0129] This invention also provides an electronic device, comprising:

[0130] It includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor. When the computer program is executed by the processor, it implements the various processes of the above-described domain augmentation method embodiment of the pre-trained model and achieves the same technical effect. To avoid repetition, it will not be described again here.

[0131] This invention also provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, it implements the various processes of the above-described pre-trained model domain enhancement method embodiment and achieves the same technical effect. To avoid repetition, it will not be described again here.

[0132] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.

[0133] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, apparatus, or computer program products. Therefore, embodiments of the present invention can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, embodiments of the present invention can take the form of computer program products implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0134] This invention is described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing terminal device, produce a means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0135] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0136] These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device, causing a series of operational steps to be performed on the computer or other programmable terminal device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

[0137] Although preferred embodiments of the present invention have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present invention.

[0138] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.

[0139] The foregoing has provided a detailed description of a domain augmentation method for a pre-trained model, a domain augmentation device for a pre-trained model, a corresponding electronic device, and a corresponding computer storage medium provided by the present invention. Specific examples have been used to illustrate the principles and implementation methods of the present invention. The descriptions of the above embodiments are only for the purpose of helping to understand the method and core ideas of the present invention. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of the present invention. Therefore, the content of this specification should not be construed as a limitation of the present invention.

Claims

1. A domain enhancement method for a pre-trained model, applied to the power grid field, characterized in that the method includes: acquiring a source domain pre-trained model corresponding to a trained domain scenario, and target domain data corresponding to a target domain scenario, wherein the source domain pre-trained model is generated based on a dataset of the trained domain scenario, the dataset of the trained domain scenario and the target domain data are not scenario data belonging to vertical business scenarios, and the target domain data includes image data; and performing self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data to obtain a target domain pre-trained model after domain enhancement based on the target domain scenario; wherein the target domain data includes unlabeled target domain data of multiple target domain scenarios, and the self-supervised contrastive learning training is implemented based on data reconstruction using a training model; the step of performing self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data to obtain a target domain pre-trained model after domain enhancement based on the target domain scenario includes: obtaining the corresponding training framework based on the type of the source domain pre-trained model; and, under the training framework, training the source domain pre-trained model to reconstruct the target domain data based on the unlabeled target domain data of the multiple target domain scenarios, to obtain the target domain pre-trained model after domain enhancement; and the step of training the source domain pre-trained model to reconstruct the target domain data based on the unlabeled target domain data of the multiple target domain scenarios, to obtain the target domain pre-trained model after domain enhancement, includes: partially occluding multiple pieces of target domain data to generate occluded data; and training the source domain pre-trained model to reconstruct the occluded data to obtain the target domain pre-trained model.

2. The method according to claim 1, characterized in that the step of partially occluding multiple pieces of target domain data to generate occluded data includes: splitting multiple image data into at least one image block; and randomly selecting a target image block from the at least one image block and performing a random occlusion operation on the target image block to obtain the occluded data.

3. The method according to claim 1, characterized in that the step of training the source domain pre-trained model to reconstruct the occluded data to obtain the target domain pre-trained model includes: encoding the occluded data with missing information to obtain feature codes for the target domain scenario; reconstructing the occluded data of the target domain scenario based on the feature codes of the target domain scenario to obtain reconstructed data with complete information; and calculating the mean squared error loss between the reconstructed data and the target domain data, and updating the source domain pre-trained model based on the mean squared error loss to obtain the updated target domain pre-trained model.

4. The method according to claim 1, characterized in that the method also includes: obtaining the target domain pre-trained model, and labeled business data corresponding to a target business in the target domain scenario; and fine-tuning the target domain pre-trained model based on the labeled business data to produce a business model corresponding to the target business.

5. The method according to any one of claims 1 to 4, characterized in that the dataset of the trained domain scenario includes an image dataset of natural scenes, and the target domain data corresponding to the target domain scenario includes image data of power grid scenes; the source domain pre-trained model includes a natural scene processing model corresponding to the image dataset of natural scenes, and the target domain pre-trained model after domain enhancement based on the target domain scenario includes a power grid scene processing model after domain enhancement based on the power grid scene; the labeled business data includes transmission line defect labeled data, and the business model includes a detection model for transmission line defect detection.

6. A domain augmentation device for a pre-trained model, applied in the power grid field, characterized in that the device includes: a source domain pre-trained model acquisition module, used to acquire a source domain pre-trained model corresponding to a trained domain scenario, wherein the source domain pre-trained model is generated based on the dataset of the trained domain scenario, and the dataset of the trained domain scenario and the target domain data are not scenario data belonging to the vertical business scenario; a target domain data acquisition module, used to acquire target domain data corresponding to a target domain scenario, wherein the target domain data includes unlabeled target domain data of multiple target domain scenarios, and the target domain data includes image data; and a domain enhancement module, used to perform self-supervised contrastive learning training on the source domain pre-trained model based on the target domain data to obtain a target domain pre-trained model after domain enhancement based on the target domain scenario, wherein the self-supervised contrastive learning training is implemented based on data reconstruction using a training model; the domain enhancement module includes: a training framework acquisition submodule, used to acquire the corresponding training framework based on the type of the source domain pre-trained model; and a data reconstruction training submodule, used to train, under the training framework, the source domain pre-trained model to reconstruct the target domain data based on the unlabeled target domain data of the multiple target domain scenarios, thereby obtaining the target domain pre-trained model after domain enhancement; and the data reconstruction training submodule includes: an occlusion data generation unit, used to partially occlude multiple pieces of target domain data to generate occluded data; and a data reconstruction training unit, used to train the source domain pre-trained model to reconstruct the occluded data and obtain the target domain pre-trained model.

7. An electronic device, characterized in that it includes: a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the steps of the domain augmentation method for the pre-trained model as claimed in any one of claims 1 to 5.

8. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the domain augmentation method for the pre-trained model as claimed in any one of claims 1 to 5.