Training method for a damaged meter image self-labeling model based on domain generalization

A domain-generalization-based training method for a self-labeling model of damaged power meter images addresses the low recognition efficiency of traditional technologies and achieves automatic labeling and efficient recognition of damaged insulation devices in power substations.

CN117132847B · Active · Publication Date: 2026-06-26 · MAINTENANCE & TEST CENTRE CSG EHV POWER TRANSMISSION CO

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: MAINTENANCE & TEST CENTRE CSG EHV POWER TRANSMISSION CO
Filing Date: 2023-08-28
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

In traditional technologies, identifying damaged insulation components in power substations relies on labeled sample libraries, resulting in low identification efficiency.

Method used

A domain-generalization-based training method for a self-labeling model of damaged power meter images is adopted. Training images of damaged meters and an image labeling model to be trained are acquired; the feature extraction layer and the information detection layer of the model are each trained using the training images; and the trained model is built from the trained image feature extraction layer and the trained information detection layer to achieve automatic labeling and recognition.

Benefits of technology

It improves the efficiency of identifying meter damage, enables automatic detection of unlabeled damaged meter images, and offers strong transfer capability and recognition efficiency.

✦ Generated by Eureka AI based on patent content.

Figure CN117132847B_ABST

Abstract

The application relates to a training method for a damaged meter image self-labeling model based on domain generalization. The method comprises the following steps: obtaining a damaged meter image for training and a damaged meter image labeling model to be trained, the model to be trained comprising an image feature extraction layer to be trained and an image information detection layer to be trained; training the image feature extraction layer to be trained using the damaged meter image for training to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed based on domain-invariant features; training the image information detection layer to be trained using the damaged meter image for training to obtain a trained image information detection layer; and obtaining a trained damaged meter image labeling model according to the trained image feature extraction layer and the trained image information detection layer. The method can realize automatic labeling of damaged meter images and improve the recognition efficiency of meter damage.

Description

Technical Field

[0001] This application relates to the field of artificial intelligence technology, and in particular to a training method for a self-labeled model of damaged power meter images based on domain generalization.

Background Technology

[0002] With the development of artificial intelligence technology, intelligent recognition technology has emerged. This technology uses computers to process, analyze, and understand images in order to identify targets and objects of various patterns, and is a practical application of deep learning algorithms. Intelligent recognition technology has become a trend in identifying common insulation device damage (meter damage) in substations of the power industry.

[0003] In traditional technologies, the identification of common insulation device damage in substations of the power industry (identification of meter damage) still relies on labeled sample libraries. However, establishing a labeled sample library requires manual labeling of samples and research on labeling methods for meter images, resulting in low efficiency in identifying meter damage.

Summary of the Invention

[0004] Therefore, in view of the above technical problems, it is necessary to provide a domain-generalization-based training method for a damaged power meter image self-labeling model, together with an apparatus, computer equipment, a computer-readable storage medium, and a computer program product, that can improve the efficiency of identifying meter damage.

[0005] In a first aspect, this application provides a training method for a self-annotation model of damaged power meter images based on domain generalization. The method includes: acquiring a damaged meter image for training and a damaged meter image annotation model to be trained; the damaged meter image annotation model to be trained includes a feature extraction layer and an information detection layer; training the feature extraction layer using the damaged meter image for training to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features; training the information detection layer using the damaged meter image for training to obtain a trained image information detection layer; and obtaining the trained damaged meter image annotation model based on the trained image feature extraction layer and the trained image information detection layer.

[0006] In a second aspect, this application also provides a training device for a self-annotation model of damaged power meter images based on domain generalization. The device includes: a data information acquisition module for acquiring a damaged meter image for training and a damaged meter image annotation model to be trained, where the damaged meter image annotation model to be trained includes a feature extraction layer and an information detection layer; a feature extraction layer training module for training the feature extraction layer using the damaged meter image for training to obtain a trained image feature extraction layer, where the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features; a feature detection layer training module for training the information detection layer using the damaged meter image for training to obtain a trained image information detection layer; and an annotation model acquisition module for obtaining the trained damaged meter image annotation model based on the trained image feature extraction layer and the trained image information detection layer.

[0007] In a third aspect, this application also provides a computer device. The computer device includes a memory and a processor. The memory stores a computer program, and the processor executes the computer program to perform the following steps: acquiring a damaged meter image for training and a damaged meter image annotation model to be trained; the damaged meter image annotation model to be trained includes a feature extraction layer and an information detection layer; training the feature extraction layer using the damaged meter image for training to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features; training the information detection layer using the damaged meter image for training to obtain a trained image information detection layer; and obtaining the trained damaged meter image annotation model based on the trained image feature extraction layer and the trained image information detection layer.

[0008] In a fourth aspect, this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, performs the following steps: acquiring a damaged meter image for training and a damaged meter image annotation model to be trained; the damaged meter image annotation model to be trained includes a feature extraction layer and an information detection layer; training the feature extraction layer using the damaged meter image for training to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features; training the information detection layer using the damaged meter image for training to obtain a trained image information detection layer; and obtaining the trained damaged meter image annotation model based on the trained image feature extraction layer and the trained image information detection layer.

[0009] In a fifth aspect, this application also provides a computer program product. The computer program product includes a computer program that, when executed by a processor, performs the following steps: acquiring a damaged meter image for training and a damaged meter image annotation model to be trained; the damaged meter image annotation model to be trained includes a feature extraction layer and an information detection layer; training the feature extraction layer using the damaged meter image for training to obtain a trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features; training the information detection layer using the damaged meter image for training to obtain a trained image information detection layer; and obtaining the trained damaged meter image annotation model based on the trained image feature extraction layer and the trained image information detection layer.

[0010] The aforementioned training method, apparatus, computer equipment, storage medium, and computer program product for a self-annotation model of damaged power meter images based on domain generalization involves acquiring damaged meter images for training and a model to be trained for labeling damaged meter images. The model to be trained includes a feature extraction layer and an information detection layer. The feature extraction layer is trained using the damaged meter images to obtain a trained feature extraction layer. The trained feature extraction layer is constructed based on domain-invariant features. The information detection layer is trained using the damaged meter images to obtain a trained information detection layer. Based on the trained feature extraction layer and the trained information detection layer, the trained model for labeling damaged meter images is obtained.

[0011] By training the image feature extraction layer and the image information detection layer of the damaged meter image annotation model using damaged meter images for training, and embedding the trained image feature extraction layer into the trained image information detection layer, a trained damaged meter image annotation model is obtained. A domain generalization algorithm can thus be used to train a domain-invariant damaged meter image annotation model from small datasets drawn from different domains, giving the model strong transfer capability. This enables the automatic detection of unlabeled damaged meter images, achieving automatic annotation of damaged meter images and improving the efficiency of meter damage recognition.

Attached Figure Description

[0012] Figure 1 is an application environment diagram of a domain-generalization-based self-labeling model training method for damaged power meter images in one embodiment.

[0013] Figure 2 is a flowchart of a method for training a self-labeling model of damaged power meter images based on domain generalization in one embodiment.

[0014] Figure 3 is a flowchart of a method for obtaining a trained image feature extraction layer in one embodiment.

[0015] Figure 4 is a flowchart of a method for obtaining image domain-invariant feature values in one embodiment.

[0016] Figure 5 is a flowchart of a method for obtaining image domain-invariant feature values in another embodiment.

[0017] Figure 6 is a flowchart of a method for obtaining image position feature vectors in one embodiment.

[0018] Figure 7 is a flowchart of a method for obtaining a trained damaged meter image annotation model in one embodiment.

[0019] Figure 8 is a flowchart of a method for obtaining a tested damaged meter image annotation model in one embodiment.

[0020] Figure 9 is a schematic diagram of the principle of cross-domain reconstruction in one embodiment.

[0021] Figure 10 is a schematic diagram of the connection structure of a damaged meter image annotation model in one embodiment.

[0022] Figure 11 is a structural block diagram of a training device for a self-annotation model of damaged power meter images based on domain generalization in one embodiment.

[0023] Figure 12 is an internal structural diagram of a computer device in one embodiment.

Detailed Implementation

[0024] To make the objectives, technical solutions, and advantages of this application clearer, the following detailed description is provided in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative and not intended to limit the scope of this application.

[0025] The domain-generalization-based self-labeling model training method for damaged power meter images provided by this application can be applied in the application environment shown in Figure 1, in which terminal 102 communicates with server 104 via a network. A data storage system can store the data that server 104 needs to process. The data storage system can be integrated into server 104 or placed on a cloud or other network server. Server 104 obtains the damaged meter image for training and the damaged meter image annotation model to be trained from terminal 102. The model to be trained includes an image feature extraction layer to be trained and an image information detection layer to be trained. The damaged meter image for training is used to train the image feature extraction layer to be trained, resulting in a trained image feature extraction layer, which is a feature extraction layer constructed based on domain-invariant features. The damaged meter image for training is also used to train the image information detection layer to be trained, resulting in a trained image information detection layer. Based on the trained image feature extraction layer and the trained image information detection layer, the trained damaged meter image annotation model is obtained. Terminal 102 can be, but is not limited to, various personal computers, laptops, smartphones, tablets, IoT devices, and portable wearable devices. IoT devices can include smart speakers, smart TVs, smart air conditioners, and smart in-vehicle systems. Portable wearable devices can include smartwatches, smart bracelets, and head-mounted devices. Server 104 can be implemented as a standalone server or as a server cluster consisting of multiple servers.

[0026] In one embodiment, as shown in Figure 2, a method for training a self-labeling model of damaged power meter images based on domain generalization is provided. Taking the application of the method to the server in Figure 1 as an example, the method includes the following steps:

[0027] Step 202: Obtain the damaged meter image for training and the damaged meter image annotation model to be trained.

[0028] The damaged meter image for training can be an image used to train an artificial intelligence model, in which the meter shown is damaged.

[0029] The damaged meter image annotation model to be trained can be an image annotation model that still needs to be trained. It is constructed using a neural network and includes an image feature extraction layer to be trained and an image information detection layer to be trained.

[0030] The image feature extraction layer to be trained can be the layer of the damaged meter image annotation model that extracts domain-invariant features and that has not yet been trained.

[0031] The image information detection layer to be trained can be the neural network with a detection function in the damaged meter image annotation model that has not yet been trained. For example, the image information detection layer can be built as a Faster R-CNN, which uses a region proposal network to achieve real-time target detection.

[0032] Specifically, server 104 responds to instructions from terminal 102 regarding the construction of a cross-domain reconstruction task and a damaged meter image annotation model. It obtains the damaged meter image for training and the damaged meter image annotation model to be trained from terminal 102. The model to be trained includes an image feature extraction layer to be trained and an image information detection layer to be trained. The server stores the obtained image and model in a storage unit. When the server needs to process any data from the damaged meter image for training or the model to be trained, it retrieves that data from the storage unit into volatile storage for the central processing unit (CPU) to compute on. The data can be input to the CPU one item at a time or as multiple items simultaneously.

[0033] For example, server 104 responds to the instruction of terminal 102, obtains the damaged meter image for training and the damaged meter image annotation model to be trained from terminal 102, and stores them in the storage unit of server 104. Server 104 obtains a total of 10 data items corresponding to the damaged meter image for training and the model to be trained, and can input multiple data items to the central processing unit at the same time.

[0034] Step 204: Train the image feature extraction layer to be trained using the damaged meter image for training, and obtain the trained image feature extraction layer.

[0035] The trained image feature extraction layer can be the already-trained layer of the damaged meter image annotation model that extracts domain-invariant features.

[0036] Specifically, first: the damaged meter image for training includes multiple damaged meter sub-images, each of which is different. The image feature extraction layer to be trained therefore uses the stylemix mathematical model to perform a style mixing operation on each damaged meter sub-image, blending in the meter damage styles of other substations to obtain the mixed damaged meter images. The style mixing operation on each damaged meter sub-image proceeds as follows. For mixing in Fourier space, given a damaged meter image for training $x$ from the j-th substation and an auxiliary damaged meter image for training $x_{aux}$ randomly selected from the i-th substation (i≠j), the view $v_i$ of the damaged meter image for training $x$ can be represented as:

[0037]

[0038] where $K^{-1}$ represents the inverse Fourier transform, and $K_A$ and $K_P$ return the amplitude and phase of the Fourier transform, respectively.
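
As a non-limiting illustration of the Fourier-space style mixing described in paragraphs [0036] to [0038], the following Python sketch blends the amplitude spectrum of an image with that of an auxiliary image while keeping the phase of the original. The interpolation weight `lam`, the function name `fourier_style_mix`, and the random arrays standing in for substation images are assumptions made for illustration. They are not taken from this application, and the sketch should not be read as the exact formula of paragraph [0037].

```python
import numpy as np

def fourier_style_mix(x, x_aux, lam=0.5):
    """Blend the Fourier amplitude of x with that of x_aux, keeping the phase of x.

    x, x_aux: float arrays of identical shape (H, W) or (H, W, C).
    lam: assumed mixing weight in [0, 1]; lam=0 returns x unchanged.
    """
    # Forward 2-D FFT over the two spatial axes.
    fx = np.fft.fft2(x, axes=(0, 1))
    fa = np.fft.fft2(x_aux, axes=(0, 1))
    # K_A and K_P in the text: amplitude and phase of the spectrum.
    amp_x, phase_x = np.abs(fx), np.angle(fx)
    amp_aux = np.abs(fa)
    # Interpolate the amplitudes (style) and keep the phase (content) of x.
    amp_mix = (1.0 - lam) * amp_x + lam * amp_aux
    mixed = amp_mix * np.exp(1j * phase_x)
    # K^{-1}: inverse transform; the imaginary part is numerical noise.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))

rng = np.random.default_rng(0)
x = rng.random((8, 8))        # stand-in for an image from the j-th substation
x_aux = rng.random((8, 8))    # stand-in for an auxiliary image from the i-th substation
v = fourier_style_mix(x, x_aux, lam=0.3)
```

With `lam=0` the function returns the input unchanged, and with `lam=1` the output carries the full amplitude spectrum of the auxiliary image, which is one way to check that only the style component is being exchanged.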

[0039] Second: each mixed damaged meter image is split into a series of damaged meter image patches. The split can be implemented using the image segmentation layer of the Vision Transformer (ViT) model. Each patch passes through a learnable linear mapping (patch embedding) that transforms it into a single fixed-length damaged meter image vector. These vectors are then used as input to the Transformer sub-model within the ViT model.

[0040] To preserve positional information in the damaged meter image for training, the ViT model uses positional encoding to add positional information to each input damaged meter image vector, resulting in the image position feature vectors. This helps the model learn features at different locations within the image.
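
The patch splitting, patch embedding, and positional encoding of paragraphs [0039] and [0040] can be sketched as follows. This is a minimal example under stated assumptions: the image size, the patch size, the embedding width `embed_dim`, and the randomly initialized matrices `W_embed` and `pos_embed` (which would be learned during training) are hypothetical and chosen only to keep the example small.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    # (gh, patch, gw, patch, c) -> (gh, gw, patch, patch, c) -> (gh*gw, patch*patch*c)
    blocks = image.reshape(gh, patch, gw, patch, c).transpose(0, 2, 1, 3, 4)
    return blocks.reshape(gh * gw, patch * patch * c)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))       # stand-in for one mixed damaged meter image
patches = patchify(image, patch=8)    # 16 patches, each of length 8*8*3 = 192

embed_dim = 64                                                   # hypothetical width
W_embed = rng.normal(0.0, 0.02, (patches.shape[1], embed_dim))   # learnable in practice
pos_embed = rng.normal(0.0, 0.02, (patches.shape[0], embed_dim)) # learnable in practice

# Linear patch embedding plus positional encoding: one vector per patch.
tokens = patches @ W_embed + pos_embed
```

Each row of `tokens` corresponds to one image position feature vector in the terminology of the text.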

[0041] Third: the ViT model uses the encoder structure of the Transformer sub-model to process the input image position feature vectors. The Transformer encoder consists of multiple identical layers, each of which performs a multi-head self-attention computation followed by a feed-forward neural network computation. The Transformer encoder extracts the feature relationships among the image position feature vectors to obtain the image domain-invariant feature values.
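
For readers unfamiliar with the encoder layer referred to in paragraph [0041], the following Python sketch shows one layer with multi-head self-attention followed by a feed-forward network. It is illustrative only: the pre-norm layout, the ReLU activation, the layer sizes, and the random weights in `params` are assumptions, not details disclosed in this application.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def split_heads(t, heads):
    n, d = t.shape
    return t.reshape(n, heads, d // heads).transpose(1, 0, 2)   # (heads, n, d/heads)

def encoder_layer(x, p, heads):
    """One encoder layer: multi-head self-attention, then a feed-forward net."""
    n, d = x.shape
    h = layer_norm(x)
    q = split_heads(h @ p["Wq"], heads)
    k = split_heads(h @ p["Wk"], heads)
    v = split_heads(h @ p["Wv"], heads)
    # Scaled dot-product attention per head: (heads, n, n).
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d // heads))
    ctx = (attn @ v).transpose(1, 0, 2).reshape(n, d)          # merge the heads
    x = x + ctx @ p["Wo"]                                       # first residual
    ff = np.maximum(layer_norm(x) @ p["W1"], 0.0) @ p["W2"]     # ReLU feed-forward
    return x + ff, attn                                         # second residual

rng = np.random.default_rng(0)
n, d, heads = 16, 64, 4
shapes = {"Wq": (d, d), "Wk": (d, d), "Wv": (d, d), "Wo": (d, d),
          "W1": (d, 4 * d), "W2": (4 * d, d)}
params = {name: rng.normal(0.0, 0.02, shape) for name, shape in shapes.items()}
tokens = rng.normal(size=(n, d))       # stand-in for image position feature vectors
out, attn = encoder_layer(tokens, params, heads)
```

Stacking several such layers gives the "multiple identical layers" of the encoder; the output keeps one vector per input position.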

[0042] Fourth: the feature decoder model of the image feature extraction layer to be trained decodes the image domain-invariant feature values of each substation to obtain the decoded image domain-invariant feature values. The images are then reconstructed from these decoded values to obtain the reconstructed image feature values. Figure 9 illustrates the principle of cross-domain reconstruction.

[0043] Fifth: each reconstructed image feature value is input into the reconstruction loss function. The image feature extraction layer to be trained is trained on the value computed by the reconstruction loss function until the reconstruction loss value it outputs is less than a preset reconstruction loss value, at which point the trained image feature extraction layer is obtained. The expression for the reconstruction loss function is:
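
The stopping rule of paragraph [0043], training until the reconstruction loss falls below a preset value, can be sketched as below. The loss expression is not reproduced in this text, so a mean squared error is assumed here purely for illustration. The toy linear decoder `W`, the learning rate, the threshold, and the random data are likewise hypothetical stand-ins for the feature decoder and the substation images.

```python
import numpy as np

def reconstruction_loss(recon, target):
    """Assumed loss: mean squared error between reconstruction and reference."""
    return float(np.mean((recon - target) ** 2))

rng = np.random.default_rng(0)
n, k, d = 64, 4, 8
z = rng.normal(size=(n, k))             # stand-in for domain-invariant feature values
target = z @ rng.normal(size=(k, d))    # stand-in for the images to be reconstructed
W = np.zeros((k, d))                    # weights of a toy linear decoder

threshold, lr, max_steps = 1e-6, 2.0, 1000    # illustrative values only
loss, steps = reconstruction_loss(z @ W, target), 0
while loss >= threshold and steps < max_steps:
    residual = z @ W - target
    # Gradient of the mean squared error with respect to W.
    W -= lr * 2.0 * z.T @ residual / residual.size
    loss = reconstruction_loss(z @ W, target)
    steps += 1
```

The `max_steps` guard is a practical addition so that the loop terminates even if the preset loss value is never reached.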

[0044] Step 206: Train the image information detection layer to be trained using the damaged meter image for training to obtain the trained image information detection layer.

[0045] The trained image information detection layer can be the already-trained neural network with detection capabilities within the damaged meter image annotation model.

[0046] Specifically, first: the damaged meter image for training is input into the convolutional neural network to be trained (such as VGG16 or ResNet) to extract feature maps. These feature maps contain feature information at different levels of the image, from low to high.

[0047] Second: on the feature map, the Region Proposal Network (RPN) generates candidate regions by sliding a small window. For each window position, the RPN generates multiple anchor boxes (candidate boxes) of different scales and aspect ratios. The RPN then uses convolutional and fully connected layers to classify each anchor box (foreground / background) and regress its position adjustment, determining which anchor boxes are likely to contain objects. Finally, the RPN uses non-maximum suppression (NMS) to exclude highly overlapping candidate boxes, retaining the most likely candidate boxes as region proposals.
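
The anchor generation and non-maximum suppression steps of paragraph [0047] can be illustrated with a short, dependency-free Python sketch. The scales, aspect ratios, box coordinates, scores, and the IoU threshold of 0.5 are example values assumed for illustration, and the greedy NMS shown is the textbook variant rather than a statement of how this application implements it.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def make_anchors(cx, cy, scales, ratios):
    """Anchor boxes centred on (cx, cy); ratio is height / width."""
    boxes = []
    for s in scales:
        for r in ratios:
            w, h = s / r ** 0.5, s * r ** 0.5   # keeps the area equal to s * s
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes

def nms(boxes, scores, iou_threshold):
    """Greedy non-maximum suppression; returns kept indices, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

anchors = make_anchors(cx=50, cy=50, scales=(32, 64), ratios=(0.5, 1.0, 2.0))
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (60, 60, 100, 100)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)   # the second box overlaps the first
```

In this example the second box is suppressed because its IoU with the higher-scoring first box is about 0.82, while the third box does not overlap either of them and is retained.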

[0048] Third: for each candidate box generated by the region proposal network, the RoI pooling operation maps the region inside the box to a fixed-size feature map. These feature maps are used as input for subsequent processing. RoI pooling obtains the fixed-size feature representation by dividing the region in each candidate box into a fixed grid of sub-regions and performing a pooling operation within each sub-region.
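
A minimal sketch of the RoI pooling operation of paragraph [0048] follows. It assumes a single-channel feature map, integer box coordinates already expressed in feature-map cells, and max pooling within each sub-region. The function name `roi_max_pool` and the example values are hypothetical.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size):
    """Max-pool the region roi=(x1, y1, x2, y2) of a 2-D map to out_size x out_size.

    Coordinates are integer feature-map indices, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2]
    # Bin edges that split the region into out_size roughly equal parts.
    ys = np.linspace(0, region.shape[0], out_size + 1).astype(int)
    xs = np.linspace(0, region.shape[1], out_size + 1).astype(int)
    pooled = np.empty((out_size, out_size), dtype=feature_map.dtype)
    for i in range(out_size):
        for j in range(out_size):
            # Ensure every bin covers at least one cell, even for small regions.
            cell = region[ys[i]:max(ys[i + 1], ys[i] + 1),
                          xs[j]:max(xs[j + 1], xs[j] + 1)]
            pooled[i, j] = cell.max()
    return pooled

fmap = np.arange(64, dtype=float).reshape(8, 8)    # toy feature map, value = 8*row + col
pooled = roi_max_pool(fmap, roi=(0, 0, 4, 4), out_size=2)
```

Whatever the size of the candidate box, the output has the same `out_size` by `out_size` shape, which is what allows the later classification and regression heads to use fixed-size inputs.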

[0049] Fourth: the RoI-pooled features of each region are fed into a classification head and a regression head. The classification head is used to predict whether the target exists in the region, and the regression head is used to predict the target's position adjustment. Typically, these heads are fully connected layers or convolutional layers.

[0050] Fifth: After processing by the classification and regression heads, Faster R-CNN will output the target category and location information (bounding box coordinates) for each region. Finally, by selecting regions with higher classification scores and adjusting the bounding boxes according to the regression information, the detection results of the image information detection layer can be obtained.
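
The final selection described in paragraph [0050] can be sketched as follows: regions whose classification score reaches a threshold are kept, and their boxes are adjusted by the regression output. The (dx, dy, dw, dh) parameterization is the one commonly used with Faster R-CNN and is assumed here; the threshold, the class name `meter_damage`, and the example numbers are hypothetical.

```python
import math

def decode_box(proposal, deltas):
    """Apply (dx, dy, dw, dh) deltas to a proposal given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = proposal
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + w / 2, y1 + h / 2
    dx, dy, dw, dh = deltas
    cx, cy = cx + dx * w, cy + dy * h            # shift the centre
    w, h = w * math.exp(dw), h * math.exp(dh)    # rescale width and height
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def select_detections(proposals, deltas, scores, labels, score_threshold):
    """Keep regions whose score reaches the threshold and refine their boxes."""
    results = []
    for box, d, s, label in zip(proposals, deltas, scores, labels):
        if s >= score_threshold:
            results.append({"label": label, "score": s, "box": decode_box(box, d)})
    return results

proposals = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 40.0, 30.0)]
deltas = [(0.1, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
scores = [0.95, 0.30]
labels = ["meter_damage", "meter_damage"]      # hypothetical class name
detections = select_detections(proposals, deltas, scores, labels, score_threshold=0.5)
```

Here only the first region passes the score threshold, and its box is shifted one unit to the right by the regression delta.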

[0051] Sixth: based on the detection results output by the image information detection layer to be trained, that layer is trained until its detection results meet the preset detection results, and the trained image information detection layer is obtained.

[0052] Step 208: Based on the trained image feature extraction layer and the trained image information detection layer, obtain the trained damaged meter image annotation model.

[0053] The trained damaged meter image annotation model can be a damaged meter image annotation model that has already been trained.

[0054] Specifically, since the trained image feature extraction layer needs to be embedded into the trained image information detection layer, the embedding information of the damaged meter image annotation model that corresponds to this embedding process is first calculated.

[0055] The trained image feature extraction layer is used as the active embedding end, while the trained image information detection layer is used as the passive embedding end. Under the guidance of the embedding information of the damaged meter image annotation model, the trained image feature extraction layer is embedded into the trained image information detection layer to obtain the embedded damaged meter image annotation model.

[0056] The damaged meter image for training is input into the embedded damaged meter image annotation model. The model is trained through its own computation until its output loss value is less than a preset loss value, at which point the trained damaged meter image annotation model is obtained. Figure 10 is a schematic diagram of the connection structure of the damaged meter image annotation model.

[0057] In the aforementioned training method for a self-annotation model of damaged power meter images based on domain generalization, a damaged meter image for training and a damaged meter image annotation model to be trained are first acquired. The model to be trained includes a feature extraction layer and an information detection layer. The damaged meter image for training is used to train the feature extraction layer, resulting in a trained image feature extraction layer, which is constructed based on domain-invariant features. The damaged meter image for training is then used to train the information detection layer, resulting in a trained image information detection layer. Finally, the trained damaged meter image annotation model is obtained based on the trained image feature extraction layer and the trained image information detection layer.

[0058] By training the image feature extraction layer and the image information detection layer of the damaged meter image annotation model using damaged meter images for training, and embedding the trained image feature extraction layer into the trained image information detection layer, a trained damaged meter image annotation model is obtained. A domain generalization algorithm can thus be used to train a domain-invariant damaged meter image annotation model from small datasets drawn from different domains, giving the model strong transfer capability. This enables the automatic detection of unlabeled damaged meter images, achieving automatic annotation of damaged meter images and improving the efficiency of meter damage recognition.

[0059] In one embodiment, as shown in Figure 3, training the image feature extraction layer to be trained using the damaged meter image for training to obtain the trained image feature extraction layer includes:

[0060] Step 302: Perform visual domain-invariant feature encoding on each damaged meter sub-image to obtain the image domain-invariant feature values corresponding to the damaged meter image for training.

[0061] The visual domain-invariant feature encoding can be implemented by the encoder of the Transformer sub-model in the Vision Transformer (ViT) model.

[0062] The image domain-invariant feature values can be feature information values describing dense attributes of the spatial difference features carried by the image.

[0063] Specifically, first: the damaged meter image for training includes multiple damaged meter sub-images, each of which is different. The image feature extraction layer to be trained therefore uses the stylemix mathematical model to perform a style mixing operation on each damaged meter sub-image, blending in the meter styles of other substations to obtain the mixed damaged meter images.

[0064] Second: each mixed damaged meter image is split into a series of damaged meter image patches. The split can be implemented using the image segmentation layer of the ViT model. Each patch passes through a learnable linear mapping (patch embedding) that transforms it into a single fixed-length damaged meter image vector. These vectors are then used as input to the Transformer sub-model within the ViT model.

[0065] To preserve positional information in the damaged meter image for training, the ViT model uses positional encoding to add positional information to each input damaged meter image vector, resulting in the image position feature vectors. This helps the model learn features at different locations within the image.

[0066] Third: the ViT model uses the encoder structure of the Transformer sub-model to process the input image position feature vectors. The Transformer encoder consists of multiple identical layers, each of which performs a multi-head self-attention computation followed by a feed-forward neural network computation. The Transformer encoder extracts the feature relationships among the image position feature vectors to obtain the image domain-invariant feature values.

[0067] Step 304: Using the feature decoder of the image feature extraction layer to be trained, perform image reconstruction on each image domain-invariant feature value to obtain the reconstructed image feature values.

[0068] The feature decoder is a key component of the Transformer model, used to generate sequential data in natural language processing tasks such as machine translation, text generation, and summarization. Its design goal is to combine the contextual information generated by the encoder with previously generated partial sequence information to progressively generate the next sequence element.

[0069] The reconstructed image feature values can be feature information values obtained by reconstructing the image from the image domain-invariant feature values.

[0070] Specifically, the feature decoder model of the image feature extraction layer to be trained decodes the image domain-invariant feature values of each substation to obtain the decoded image domain-invariant feature values, and the image is then reconstructed from each decoded image domain-invariant feature value to obtain each reconstructed image feature value.

[0071] Step 306: Use each reconstructed image feature value to train the image feature extraction layer to be trained, and obtain the trained image feature extraction layer.

[0072] Specifically, each reconstructed image feature value is input into the reconstruction loss function. The image feature extraction layer to be trained is trained on the value computed by the reconstruction loss function until the reconstruction loss value it outputs is less than the preset reconstruction loss value, and the trained image feature extraction layer is obtained.

[0073] In this embodiment, the reconstructed image feature values are obtained by encoding the damaged meter sub-images with the Vision Transformer model and then decoding them with the feature decoder. The reconstructed image feature values are then used to train the image feature extraction layer, which gives the image feature extraction layer cross-domain reconstruction capability and improves its generalization.

[0074] In one embodiment, as shown in Figure 4, performing visual domain-invariant feature encoding on each meter damage sub-image to obtain the image domain-invariant feature values corresponding to the training meter damage image includes:

[0075] Step 402: Perform style mixing on each meter damage sub-image to obtain each mixed meter damage image.

[0076] Here, a mixed meter damage image can be a meter damage image obtained by mixing feature information from different meter damage sub-images.

[0077] Specifically, the training meter damage image includes multiple meter damage sub-images that differ from one another. The image feature extraction layer to be trained therefore applies the stylemix mathematical model to perform a style mixing operation on each meter damage sub-image, mixing it with the meter styles of other substations to obtain each mixed meter damage image.
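The application names a "stylemix" model without giving its formula. One plausible reading, offered here only as an assumption, is a statistics-mixing scheme in which the mean and standard deviation of one sub-image are blended with those of a sub-image from another substation. The function `mix_style` and the parameters `lam` and `eps` below are hypothetical names, and the images are flattened to lists of pixel values for brevity.

```python
import statistics
from typing import List


def mix_style(content: List[float], style: List[float], lam: float, eps: float = 1e-6) -> List[float]:
    """Re-style `content` using a blend of its own and `style`'s mean/std statistics.

    lam = 1.0 keeps the original style; lam = 0.0 adopts the other image's style.
    This is an assumed formulation, not the formula of the application.
    """
    mu_c, mu_s = statistics.fmean(content), statistics.fmean(style)
    sd_c = statistics.pstdev(content) + eps  # eps avoids division by zero on flat images
    sd_s = statistics.pstdev(style) + eps
    mu_mix = lam * mu_c + (1 - lam) * mu_s
    sd_mix = lam * sd_c + (1 - lam) * sd_s
    # Normalize the content, then re-scale and shift with the mixed statistics.
    return [(x - mu_c) / sd_c * sd_mix + mu_mix for x in content]
```

In practice `lam` would typically be sampled at random for each training batch so that the encoder sees many intermediate styles.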

[0078] Step 404: Perform visual domain-invariant feature encoding on each mixed meter damage image to obtain each image domain-invariant feature value.

[0079] Specifically, each mixed meter damage image is segmented into a series of meter damage image patches. The segmentation can be implemented with the image segmentation layer of the vision transformer (ViT) model. Each patch passes through a learnable linear mapping (patch embedding) that transforms it into a single fixed-length meter damage image vector. These vectors are then fed as input to the Transformer sub-model of the vision transformer model.

[0080] To preserve positional information in the training meter damage image, the vision transformer model uses position encoding to add positional information to each input meter damage image vector, yielding each image position feature vector. This helps the model learn features at different locations within the image.

[0081] The vision transformer model uses the encoder structure of the Transformer sub-model to process each input image position feature vector. The Transformer encoder consists of multiple identical layers, each of which typically performs multi-head self-attention and feedforward neural network calculations. The encoder extracts the feature relationships among the image position feature vectors to obtain the image domain-invariant feature values.

[0082] In this embodiment, the image domain-invariant feature values are calculated from the results of style mixing on each meter damage sub-image, so the data input to the feature decoder includes the feature information of all meter damage sub-images, which improves the data applicability of the image feature extraction layer.

[0083] In one embodiment, as shown in Figure 5, performing visual domain-invariant feature encoding on each mixed meter damage image to obtain each image domain-invariant feature value includes:

[0084] Step 502: Segment each mixed meter damage image to obtain each meter damage image block.

[0085] Here, a meter damage image block can be an image block obtained by segmenting a mixed meter damage image with an image segmentation model.

[0086] Specifically, each mixed meter damage image is segmented into a series of meter damage image patches. The segmentation can be implemented with the image segmentation layer of the vision transformer (ViT) model.
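The segmentation in Step 502 can be sketched as cutting the image into non-overlapping square patches, which is how a vision transformer usually prepares its input. The sketch assumes a grayscale image stored as nested lists and a hypothetical helper `split_into_patches`; neither the patch size nor the image layout is specified in the application.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values (assumed layout)


def split_into_patches(image: Image, patch: int) -> List[List[float]]:
    """Cut an H x W image into non-overlapping patch x patch blocks.

    Blocks are returned in row-major order, each flattened row by row.
    """
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [image[top + r][left + c] for r in range(patch) for c in range(patch)]
            patches.append(block)
    return patches
```

For example, a 4 x 4 image with a patch size of 2 yields four blocks of four pixels each.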

[0087] Step 504: Perform position encoding on each meter damage image block to obtain each image position feature vector.

[0088] Here, position encoding can be used to add positional information to the meter damage image blocks.

[0089] Here, an image position feature vector can be a feature vector representing a meter damage image block with its corresponding positional information added.

[0090] Specifically, each meter damage image block is transformed by a learnable linear mapping (patch embedding) into a single fixed-length meter damage image vector, which is then used as input to the Transformer sub-model of the vision transformer model.

[0091] To preserve positional information in the training meter damage image, the vision transformer model uses position encoding to add positional information to each input meter damage image vector, yielding each image position feature vector. This helps the model learn features at different locations within the image.

[0092] Step 506: Extract the feature relationships among the image position feature vectors to obtain each image domain-invariant feature value.

[0093] Specifically, the vision transformer model uses the encoder structure of the Transformer sub-model to process each input image position feature vector. The Transformer encoder consists of multiple identical layers, each of which typically performs multi-head self-attention and feedforward neural network calculations. The encoder extracts the feature relationships among the image position feature vectors to obtain the image domain-invariant feature values.
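The way self-attention relates the image position feature vectors to one another can be illustrated with a simplified sketch. It shows a single attention head using scaled dot-product attention and, as a stated simplification, omits the learned query, key, and value projections, the feedforward sub-layer, and the layer stacking of a full Transformer encoder. The function names are illustrative.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product self-attention without learned projections.

    Each output vector is a weighted average of all input vectors, with weights
    given by the similarity between the current token and every other token.
    """
    dim = len(tokens[0])
    outputs = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        weights = softmax(scores)
        outputs.append([sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)])
    return outputs
```

Because every output mixes information from all patches, the encoder can capture relationships between distant regions of the meter image.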

[0094] In this embodiment, position encoding is performed on the image blocks of the mixed meter damage image to obtain the image position feature vectors, from which the image domain-invariant feature values are calculated. This helps the meter damage image annotation model learn features at different positions in the image during training.

[0095] In one embodiment, as shown in Figure 6, performing position encoding on each meter damage image block to obtain each image position feature vector includes:

[0096] Step 602: Perform a linear mapping transformation on each meter damage image block to obtain each meter damage image vector.

[0097] Here, the linear mapping transformation can be an operation that maps the meter damage image blocks into a vector space.

[0098] Here, a meter damage image vector can be the representation obtained by mapping a meter damage image block into the vector space.

[0099] Specifically, each mixed meter damage image is segmented into a series of meter damage image patches. The segmentation can be implemented with the image segmentation layer of the vision transformer (ViT) model. Each patch passes through a learnable linear mapping (patch embedding) that transforms it into a single fixed-length meter damage image vector. These vectors are then fed as input to the Transformer sub-model of the vision transformer model.
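The linear mapping of Step 602 amounts to multiplying each flattened patch by a weight matrix and adding a bias. The following sketch uses hypothetical helpers `make_projection` and `embed_patch`; the weights are initialized at random here, whereas in training they would be learned, and the embedding dimension is an assumption.

```python
import random
from typing import List


def make_projection(patch_len: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Create a dim x patch_len weight matrix with small random values (stand-in for learned weights)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(patch_len)] for _ in range(dim)]


def embed_patch(patch: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """Map a flattened patch to a fixed-length vector: output[i] = weights[i] . patch + bias[i]."""
    return [sum(w * x for w, x in zip(row, patch)) + b for row, b in zip(weights, bias)]
```

Every patch, whatever its content, is mapped to a vector of the same length `dim`, which is what lets the Transformer sub-model treat the patches as a uniform sequence.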

[0100] Step 604: Perform position encoding on each meter damage image vector to obtain each image position feature vector.

[0101] Specifically, to preserve positional information in the training meter damage image, the vision transformer model uses position encoding to add positional information to each input meter damage image vector, yielding each image position feature vector. This helps the model learn features at different locations in the image.
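Step 604 can be sketched by adding a position-dependent vector to each meter damage image vector. The application does not state which encoding is used; the sketch below assumes the fixed sine and cosine encoding from the original Transformer design, whereas vision transformers commonly learn a position table instead. The names `sinusoidal_position` and `add_positions` are illustrative.

```python
import math
from typing import List


def sinusoidal_position(index: int, dim: int) -> List[float]:
    """Fixed sine/cosine encoding of a patch index (assumed; a learned table is also common)."""
    encoding = []
    for i in range(dim):
        angle = index / (10000 ** (2 * (i // 2) / dim))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


def add_positions(vectors: List[List[float]]) -> List[List[float]]:
    """Add the encoding of each vector's sequence index to that vector, element by element."""
    dim = len(vectors[0])
    return [
        [v + p for v, p in zip(vec, sinusoidal_position(idx, dim))]
        for idx, vec in enumerate(vectors)
    ]
```

Two identical patches at different positions therefore produce different image position feature vectors, which is what allows the encoder to distinguish locations.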

[0102] In this embodiment, applying a linear mapping transformation to map the meter damage image blocks into a vector space and then performing position encoding achieves feature transformation and dimensionality reduction of the image blocks, reducing the training difficulty of the meter damage image annotation model.

[0103] In one embodiment, as shown in Figure 7, obtaining the trained meter damage image annotation model based on the trained image feature extraction layer and the trained image information detection layer includes:

[0104] Step 702: Determine the image annotation model embedding information based on the trained image feature extraction layer and the trained image information detection layer.

[0105] Here, the image annotation model embedding information can be the parameters for embedding the trained image feature extraction layer into the trained image information detection layer.

[0106] Specifically, because the trained image feature extraction layer must be embedded into the trained image information detection layer, the image annotation model embedding information corresponding to that embedding process is calculated.

[0107] Step 704: Based on the image annotation model embedding information, embed the trained image feature extraction layer into the trained image information detection layer to obtain the embedded meter damage image annotation model.

[0108] Here, the embedded meter damage image annotation model can be the meter damage image annotation model obtained by embedding the trained image feature extraction layer into the trained image information detection layer.

[0109] Specifically, the trained image feature extraction layer serves as the active embedding end and the trained image information detection layer serves as the passive embedding end. Guided by the image annotation model embedding information, the trained image feature extraction layer is embedded into the trained image information detection layer to obtain the embedded meter damage image annotation model.
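One way to picture the embedding of Step 704 is as a composition in which the detection layer consumes the output of the feature extraction layer. The sketch below is an assumption about the structure, not the implementation of the application: `EmbeddedAnnotationModel`, `toy_extractor`, and `toy_detector` are hypothetical, and `feature_dim` stands in for the "embedding information" as the feature size the detection layer expects.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EmbeddedAnnotationModel:
    extract_features: Callable[[List[float]], List[float]]  # trained feature extraction layer
    detect: Callable[[List[float]], Dict[str, object]]      # trained information detection layer
    feature_dim: int  # hypothetical embedding information: feature size the detector expects

    def annotate(self, image: List[float]) -> Dict[str, object]:
        """Run feature extraction, check compatibility, then run detection on the features."""
        features = self.extract_features(image)
        if len(features) != self.feature_dim:
            raise ValueError("feature size does not match the detection layer input")
        return self.detect(features)


def toy_extractor(image: List[float]) -> List[float]:
    """Toy stand-in: mean brightness and brightness range of the image."""
    return [sum(image) / len(image), max(image) - min(image)]


def toy_detector(features: List[float]) -> Dict[str, object]:
    """Toy stand-in: flag damage when the brightness range exceeds 0.5."""
    return {"damaged": features[1] > 0.5, "score": features[1]}


model = EmbeddedAnnotationModel(toy_extractor, toy_detector, feature_dim=2)
```

Calling `model.annotate(image)` then yields annotation information in a single pass, and the combined model can be trained further as a whole, as Step 706 describes.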

[0110] Step 706: Train the embedded meter damage image annotation model using the training meter damage image to obtain the trained meter damage image annotation model.

[0111] Specifically, the training meter damage image is input into the embedded meter damage image annotation model, and the model is trained on its own computed outputs until the model loss value it outputs is less than the preset model loss value, yielding the trained meter damage image annotation model.

[0112] In this embodiment, embedding the trained image feature extraction layer into the trained image information detection layer gives the meter damage image annotation model both domain generalization ability and detection ability, which improves its working efficiency and provides a better user experience.

[0113] In one embodiment, as shown in Figure 8, after obtaining the trained meter damage image annotation model based on the trained image feature extraction layer and the trained image information detection layer, the method further includes:

[0114] Step 802: Acquire a test meter damage image.

[0115] Here, a test meter damage image can be an image, containing meter damage, that is used to test the trained meter damage image annotation model.

[0116] Specifically, server 104 responds to terminal 102's instruction to acquire a damaged test meter image, acquires the damaged test meter image from terminal 102, and stores the acquired damaged test meter image in a storage unit. When the server needs to process any data in the damaged test meter image, it retrieves volatile storage resources from the storage unit for the central processing unit to perform calculations. The arbitrary data can be a single data input to the central processing unit, or multiple data inputs can be input to the central processing unit simultaneously.

[0117] Step 804: Input the test meter damage image into the trained meter damage image annotation model to obtain meter damage image annotation information.

[0118] Here, the meter damage image annotation information can be the result produced when the trained meter damage image annotation model annotates the test meter damage image.

[0119] Specifically, the test meter damage image is input into the trained meter damage image annotation model. The trained image feature extraction layer performs feature extraction to obtain the feature value of the test meter damage image, and the trained image information detection layer performs information detection to obtain the detection value of the test meter damage image. The feature value and the detection value are then used to annotate the meter damage, yielding the meter damage image annotation information.

[0120] Step 806: Calculate the difference between the meter damage image annotation information and the preset image annotation information to obtain the annotation information difference value.

[0121] Here, the annotation information difference value can be the degree of difference between the meter damage image annotation information and the preset image annotation information.

[0122] Specifically, the meter damage image annotation information and the preset image annotation information are input into the model test difference function, which calculates the difference between them and outputs it as the annotation information difference value.

[0123] Step 808: If the annotation information difference value meets the annotation information difference threshold, use the trained meter damage image annotation model as the tested meter damage image annotation model.

[0124] Here, the annotation information difference threshold can be the evaluation criterion for whether the trained meter damage image annotation model meets the usage requirements.

[0125] Here, the tested meter damage image annotation model can be a meter damage image annotation model that has passed the test.

[0126] Specifically, if the annotation information difference value is less than the annotation information difference threshold, the trained meter damage image annotation model meets the usage requirements and is used as the tested meter damage image annotation model. If the annotation information difference value is greater than the threshold, the trained model does not meet the usage requirements and is returned to the model training step. Training and testing repeat until the annotation information difference value is less than the threshold, at which point the trained model is used as the tested meter damage image annotation model.
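The acceptance test of Steps 806 and 808 can be sketched as follows. The application does not define the model test difference function, so the sketch assumes that annotations are bounding boxes and that the difference is one minus their intersection over union; `Box`, `iou`, `annotation_difference`, and `passes_test` are illustrative names.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), an assumed annotation format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = overlap_w * overlap_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def annotation_difference(predicted: Box, preset: Box) -> float:
    """Assumed difference function: 0.0 for identical boxes, 1.0 for disjoint boxes."""
    return 1.0 - iou(predicted, preset)


def passes_test(predicted: Box, preset: Box, threshold: float) -> bool:
    """The model passes when the difference value is below the difference threshold."""
    return annotation_difference(predicted, preset) < threshold
```

A model whose predicted annotation fails this check would be sent back to the training step and tested again afterwards.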

[0127] In this embodiment, testing the meter damage image annotation model with a test meter damage image allows defects in the model to be identified before it is put into use, improving the stability of the meter damage image annotation model.

[0128] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages of other steps.

[0129] Based on the same inventive concept, this application also provides a domain-generalized power meter damage image self-annotation model training device for implementing the aforementioned domain-generalized power meter damage image self-annotation model training method. The solution provided by this device is similar to the implementation described in the above method. Therefore, the specific limitations of one or more domain-generalized power meter damage image self-annotation model training device embodiments provided below can be found in the above-described limitations of the domain-generalized power meter damage image self-annotation model training method, and will not be repeated here.

[0130] In one embodiment, as shown in Figure 11, a training device for a self-labeling model of power meter damage images based on domain generalization is provided, including: a data information acquisition module 1102, a feature extraction layer training module 1104, a feature detection layer training module 1106, and an annotation model acquisition module 1108, wherein:

[0131] The data information acquisition module 1102 is used to acquire the training meter damage image and the meter damage image annotation model to be trained; the meter damage image annotation model to be trained includes an image feature extraction layer to be trained and an image information detection layer to be trained.

[0132] The feature extraction layer training module 1104 is used to train the image feature extraction layer to be trained using the training meter damage image to obtain the trained image feature extraction layer; the trained image feature extraction layer is a feature extraction layer constructed based on domain-invariant features.

[0133] The feature detection layer training module 1106 is used to train the image information detection layer to be trained using the training meter damage image to obtain the trained image information detection layer.

[0134] The annotation model acquisition module 1108 is used to obtain the trained meter damage image annotation model based on the trained image feature extraction layer and the trained image information detection layer.

[0135] In one embodiment, the feature extraction layer training module 1104 is further configured to perform visual domain-invariant feature encoding on each meter damage sub-image to obtain the image domain-invariant feature values corresponding to the training meter damage image; to perform image reconstruction on each image domain-invariant feature value using the feature decoder of the image feature extraction layer to be trained to obtain each reconstructed image feature value; and to train the image feature extraction layer to be trained using each reconstructed image feature value to obtain the trained image feature extraction layer.

[0136] In one embodiment, the feature extraction layer training module 1104 is further configured to perform style mixing on each meter damage sub-image to obtain each mixed meter damage image; and to perform visual domain-invariant feature encoding on each mixed meter damage image to obtain each image domain-invariant feature value.

[0137] In one embodiment, the feature extraction layer training module 1104 is further configured to segment each mixed meter damage image to obtain each meter damage image block; to perform position encoding on each meter damage image block to obtain each image position feature vector; and to extract the feature relationships among the image position feature vectors to obtain each image domain-invariant feature value.

[0138] In one embodiment, the feature extraction layer training module 1104 is further configured to perform a linear mapping transformation on each meter damage image block to obtain each meter damage image vector; and to perform position encoding on each meter damage image vector to obtain each image position feature vector.

[0139] In one embodiment, the annotation model acquisition module 1108 is further configured to determine image annotation model embedding information based on the trained image feature extraction layer and the trained image information detection layer; to embed the trained image feature extraction layer into the trained image information detection layer based on the image annotation model embedding information to obtain an embedded meter damage image annotation model; and to train the embedded meter damage image annotation model using the training meter damage image to obtain the trained meter damage image annotation model.

[0140] In one embodiment, the annotation model acquisition module 1108 is further configured to acquire a test meter damage image; to input the test meter damage image into the trained meter damage image annotation model to obtain meter damage image annotation information; to calculate the difference between the meter damage image annotation information and the preset image annotation information to obtain the annotation information difference value; and, if the annotation information difference value meets the annotation information difference threshold, to use the trained meter damage image annotation model as the tested meter damage image annotation model.

[0141] The modules in the aforementioned training device for a self-annotation model of damaged power meter images based on domain generalization can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the processor in a computer device, or stored in the memory of a computer device as software, so that the processor can call and execute the corresponding operations of each module.

[0142] In one embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 12. The computer device includes a processor, memory, and a network interface connected via a system bus. The processor provides computing and control capabilities. The memory includes non-volatile storage media and internal memory. The non-volatile storage media stores the operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs stored in the non-volatile storage media. The database stores server data. The network interface communicates with external terminals via a network connection. When the computer program is executed by the processor, it implements a domain generalization-based self-annotation model training method for power meter damage images.

[0143] Those skilled in the art will understand that the structure shown in Figure 12 is merely a block diagram of a portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. Specific computer devices may include more or fewer components than those shown in the figure, or combine certain components, or have different component arrangements.

[0144] In one embodiment, a computer device is also provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps in the above method embodiments.

[0145] In one embodiment, a computer-readable storage medium is provided storing a computer program that, when executed by a processor, implements the steps in the above method embodiments.

[0146] In one embodiment, a computer program product or computer program is provided, the computer program product or computer program including computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, causing the computer device to perform the steps in the above method embodiments.

[0147] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties.

[0148] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. Any references to memory, databases, or other media used in the embodiments provided in this application can include at least one of non-volatile and volatile memory. Non-volatile memory can include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetic random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, etc. Volatile memory can include random access memory (RAM) or external cache memory, etc. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases involved in the embodiments provided in this application may include at least one type of relational database and non-relational database. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors involved in the embodiments provided in this application may be general-purpose processors, central processing units, graphics processing units, digital signal processors, programmable logic devices, quantum computing-based data processing logic devices, etc., and are not limited to these.

[0149] The technical features of the above embodiments can be combined in any way. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, they should be considered to be within the scope of this specification.

[0150] The embodiments described above are merely illustrative of several implementation methods of this application, and while the descriptions are specific and detailed, they should not be construed as limiting the scope of this patent application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this application should be determined by the appended claims.

Claims

1. A training method for a self-labeling model of power meter damage images based on domain generalization, characterized in that the method includes: acquiring a training meter damage image and a meter damage image annotation model to be trained, the meter damage image annotation model to be trained including an image feature extraction layer to be trained and an image information detection layer to be trained; training the image feature extraction layer to be trained using the training meter damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed based on domain-invariant features; training the image information detection layer to be trained using the training meter damage image to obtain a trained image information detection layer; and obtaining a trained meter damage image annotation model based on the trained image feature extraction layer and the trained image information detection layer, which includes: determining image annotation model embedding information based on the trained image feature extraction layer and the trained image information detection layer; embedding the trained image feature extraction layer into the trained image information detection layer based on the image annotation model embedding information to obtain an embedded meter damage image annotation model; and training the embedded meter damage image annotation model using the training meter damage image to obtain the trained meter damage image annotation model.

2. The method according to claim 1, characterized in that the training meter damage image includes multiple different meter damage sub-images, and the step of training the image feature extraction layer to be trained using the training meter damage image to obtain a trained image feature extraction layer includes: performing visual domain-invariant feature encoding on each meter damage sub-image to obtain the image domain-invariant feature values corresponding to the training meter damage image; performing image reconstruction on each image domain-invariant feature value using the feature decoder of the image feature extraction layer to be trained to obtain each reconstructed image feature value; and training the image feature extraction layer to be trained using each reconstructed image feature value to obtain the trained image feature extraction layer.

3. The method according to claim 2, characterized in that the step of performing visual domain-invariant feature encoding on each meter damage sub-image to obtain the image domain-invariant feature values corresponding to the training meter damage image includes: performing style mixing on each meter damage sub-image to obtain each mixed meter damage image; and performing visual domain-invariant feature encoding on each mixed meter damage image to obtain each image domain-invariant feature value.

4. The method according to claim 3, characterized in that the step of performing visual domain-invariant feature encoding on each mixed meter damage image to obtain each image domain-invariant feature value includes: segmenting each mixed meter damage image to obtain each meter damage image block; performing position encoding on each meter damage image block to obtain each image position feature vector; and extracting the feature relationships among the image position feature vectors to obtain each image domain-invariant feature value.

5. The method according to claim 4, characterized in that the step of performing position encoding on each meter damage image block to obtain each image position feature vector includes: performing a linear mapping transformation on each meter damage image block to obtain each meter damage image vector; and performing position encoding on each meter damage image vector to obtain each image position feature vector.

6. The method according to claim 1, characterized in that, after the step of obtaining the trained meter damage image annotation model based on the trained image feature extraction layer and the trained image information detection layer, the method further includes: acquiring a test meter damage image; inputting the test meter damage image into the trained meter damage image annotation model to obtain meter damage image annotation information; calculating the difference between the meter damage image annotation information and preset image annotation information to obtain an annotation information difference value; and, if the annotation information difference value meets the annotation information difference threshold, using the trained meter damage image annotation model as the tested meter damage image annotation model.

7. A training device for a self-labeling model of power meter damage images based on domain generalization, characterized in that the device includes: a data information acquisition module, used to acquire a training meter damage image and a meter damage image annotation model to be trained, the meter damage image annotation model to be trained including an image feature extraction layer to be trained and an image information detection layer to be trained; a feature extraction layer training module, used to train the image feature extraction layer to be trained using the training meter damage image to obtain a trained image feature extraction layer, the trained image feature extraction layer being a feature extraction layer constructed based on domain-invariant features; a feature detection layer training module, used to train the image information detection layer to be trained using the training meter damage image to obtain a trained image information detection layer; and an annotation model acquisition module, used to obtain a trained meter damage image annotation model based on the trained image feature extraction layer and the trained image information detection layer, and specifically used to: determine image annotation model embedding information based on the trained image feature extraction layer and the trained image information detection layer; embed the trained image feature extraction layer into the trained image information detection layer based on the image annotation model embedding information to obtain an embedded meter damage image annotation model; and train the embedded meter damage image annotation model using the training meter damage image to obtain the trained meter damage image annotation model.

8. The device according to claim 7, characterized in that the training meter damage image includes multiple different meter damage sub-images, and the feature extraction layer training module is used to: perform visual domain-invariant feature encoding on each meter damage sub-image to obtain the image domain-invariant feature values corresponding to the training meter damage image; perform image reconstruction on each image domain-invariant feature value using the feature decoder of the image feature extraction layer to be trained to obtain each reconstructed image feature value; and train the image feature extraction layer to be trained using each reconstructed image feature value to obtain the trained image feature extraction layer.

9. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, it implements the steps of the method according to any one of claims 1 to 6.

10. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the method according to any one of claims 1 to 6.