An image segmentation-based target positioning method, system, device, and medium

By combining image segmentation and 3D reconstruction techniques with camera parameters, the problem of difficult fault location in photovoltaic power plants has been solved, achieving higher accuracy in fault point location.

CN116309826B · Active · Publication Date: 2026-06-16 · SUNGROW SMART MAINTENANCE TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: SUNGROW SMART MAINTENANCE TECH CO LTD
Filing Date: 2023-02-01
Publication Date: 2026-06-16

IPC: G06T7/73; G06T7/00; G06T7/11

CPC: G06T7/73; G06T7/97; G06T7/11; G06T2207/10028; G06T2207/10048; G06T2207/20081; G06T2207/20084; Y04S10/52

AI Tagging

Application Domain

Image analysis

AI Technical Summary

Technical Problem

Fault location in existing photovoltaic power plants is difficult and inaccurate; in particular, positioning errors arise when a fault seen in the drone's view is mapped to its actual location in the plant.

Method used

Sub-image regions of the target object are obtained by image segmentation, 3D reconstruction and fault detection are performed on the target region images, a mapping relationship between the 3D point cloud data and the target pixel coordinates is established, and the location of the fault point in the target coordinate system is determined by combining the camera parameters.

Benefits of technology

The method improves the accuracy and robustness of fault location in photovoltaic power plants and reduces the impact of shading or missing parts on the location results.

Abstract

The application provides a target positioning method, system, device and medium based on image segmentation. The method comprises the following steps: acquiring multiple target region images; performing image segmentation on the target region images to obtain sub-image regions of the target object; performing three-dimensional reconstruction according to the target region images to obtain three-dimensional point cloud data corresponding to the target region images and the corresponding camera parameters; performing fault detection on the target region images to obtain first coordinates of fault points in the target region images; determining, according to the first coordinates, a target pixel coordinate set of the sub-image region where a fault point is located, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set; and determining a positioning coordinate value of the fault point in a target coordinate system according to the camera parameters and the mapping relationship. The application can effectively improve fault positioning accuracy.

Description

Technical Field

[0001] This application relates to the field of image processing, and more particularly to a target localization method, system, device, and medium based on image segmentation.

Background Technology

[0002] Fault detection and location technology plays a crucial role in the intelligent operation and maintenance of photovoltaic (PV) power plants. By inspecting images of PV power plants, faults such as shading, hot spots, missing components, cracks, and lack of power generation in PV modules can be detected. Currently, PV power plant images are mostly collected through drone patrols. However, after a fault is detected, locating it is relatively difficult. Mapping a fault point seen from the drone's perspective to its accurate location within the PV power plant, so that the fault can be handled, remains a major challenge.

Summary of the Invention

[0003] In view of the problems existing in the prior art, this application proposes a target localization method, system, device and medium based on image segmentation, which mainly solves the problem of difficult and inaccurate fault localization in existing photovoltaic power plants.

[0004] To achieve the above and other objectives, the technical solution adopted in this application is as follows.

[0005] This application provides a target localization method based on image segmentation, including:

[0006] Acquire multiple images of a target region, wherein the target region contains one or more target objects;

[0007] The target region image is segmented to obtain sub-image regions of the target object;

[0008] Three-dimensional reconstruction is performed on the target region image to obtain the three-dimensional point cloud data corresponding to the target region image and the corresponding camera parameters;

[0009] Fault detection is performed on the target area image to obtain the first coordinates of the fault point in the target area image;

[0010] The target pixel coordinate set of the sub-image region where the fault point is located is determined based on the first coordinate, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set;

[0011] The location coordinates of the fault point in the target coordinate system are determined based on the camera parameters and the mapping relationship.

[0012] In one embodiment of this application, before performing image segmentation on the target region image, the method further includes:

[0013] Multiple sample images containing the target object are acquired, and the image regions corresponding to the target object in the sample images are labeled to construct a training sample set, wherein the occluded areas or bad blocks of the target object are labeled as the background.

[0014] The model is trained based on the sample images to obtain a segmentation model, which is then used for image segmentation.

[0015] In one embodiment of this application, image segmentation of the target region image includes:

[0016] The target region image is segmented according to the segmentation model to obtain a mask image of the target object;

[0017] The sub-image regions of each target object in the target region image are determined based on the mask image.

[0018] In one embodiment of this application, segmenting the target region image according to the segmentation model to obtain a mask image of the target object includes:

[0019] The target region image is divided into slices to obtain multiple slice images;

[0020] Each of the slice images is input into the segmentation model for pixel segmentation to obtain a slice mask composed of background pixel values and target object pixel values;

[0021] The slice masks are stitched together to obtain a mask image of the target object.

[0022] In one embodiment of this application, fault detection is performed on the target region image to obtain the first coordinates of the fault point in the target region image, including:

[0023] The target region image is input into a preset fault detection model to obtain one or more fault detection boxes;

[0024] The center coordinates of the fault detection box are used as the first coordinates.

[0025] In one embodiment of this application, determining the target pixel coordinate set of the sub-image region where the fault point is located based on the first coordinate, in order to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set, includes:

[0026] The first coordinate is compared with the set of pixel coordinates of each of the sub-image regions. The sub-image region corresponding to the matching coordinate is taken as the sub-image region where the fault point is located, and the set of pixel coordinates corresponding to the matching coordinate is taken as the target pixel coordinate set.

[0027] Extract the 3D point cloud data of the sub-image region where the fault point is located, and project it onto the coordinate system of the sub-image region to obtain the mapping relationship between the 3D point cloud data and the target pixel coordinate set.

[0028] In one embodiment of this application, determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship includes:

[0029] Based on the mapping relationship, determine the coplanar 3D point cloud data of the sub-image region containing the fault point, and establish the coplanarity equation of the corresponding sub-image region based on the coplanar 3D point cloud data;

[0030] Obtain the collinearity equation, which is used to represent the relationship between the camera parameters and the coordinates of each pixel in the target region image;

[0031] The location coordinates of the fault point in the target coordinate system are determined based on the collinearity equation and the coplanarity equation.

[0032] This application also provides a target localization system based on image segmentation, including:

[0033] The image acquisition module is used to acquire multiple images of a target region, wherein the target region contains one or more target objects;

[0034] An image segmentation module is used to segment the target region image to obtain a sub-image region of the target object;

[0035] The 3D reconstruction module is used to perform 3D reconstruction based on the target area image to obtain the 3D point cloud data corresponding to the target area image and the corresponding camera parameters.

[0036] The fault detection module is used to perform fault detection on the target area image and obtain the first coordinates of the fault point in the target area image;

[0037] The coordinate mapping module is used to determine the target pixel coordinate set of the sub-image area where the fault point is located based on the first coordinate, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set;

[0038] The fault location module is used to determine the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship.

[0039] This application also provides a computer device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the image segmentation-based target localization method.

[0040] This application also provides a computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the target localization method based on image segmentation.

[0041] As described above, the target localization method, system, device, and medium based on image segmentation of this application have the following beneficial effects.

[0042] This application obtains a sub-image region of the target object through image segmentation, reconstructs 3D point cloud data based on 2D images, and locates fault points in the sub-image region in the target coordinate system by combining camera parameters. This allows maintenance personnel to perform fault repair based on the location coordinates. Mapping 3D point cloud data only to the sub-image region of the target object reduces the impact of occlusion or missing data on the location results, improving positioning accuracy and robustness.

Description of the Drawings

[0043] Figure 1 is a flowchart of a target localization method based on image segmentation in one embodiment of this application.

[0044] Figure 2 is a schematic diagram of the overall fault location process in one embodiment of this application.

[0045] Figure 3 is a schematic diagram of the process for filtering 3D point cloud data in one embodiment of this application.

[0046] Figure 4 is a block diagram of a target localization system based on image segmentation in one embodiment of this application.

[0047] Figure 5 is a schematic diagram of the device in one embodiment of this application.

Detailed Description

[0048] The following specific examples illustrate the implementation of this application. Those skilled in the art can easily understand other advantages and effects of this application from the content disclosed in this specification. This application can also be implemented or applied through other different specific embodiments, and various details in this specification can also be modified or changed based on different viewpoints and applications without departing from the spirit of this application. It should be noted that, unless otherwise specified, the following embodiments and features in the embodiments can be combined with each other.

[0049] It should be noted that the illustrations provided in the following embodiments are only schematic representations of the basic concept of this application. Therefore, the drawings only show the components related to this application and are not drawn according to the actual number, shape and size of the components in the actual implementation. In the actual implementation, the form, quantity and proportion of each component can be arbitrarily changed, and the layout of the components may also be more complex.

[0050] The inventors discovered that existing technologies preprocess dual-light images captured by dual-light cameras (distortion correction and mapping transformation), then perform format conversion, channel separation, morphological operations, region segmentation, and line detection on the preprocessed visible light images to detect the position of each smallest photovoltaic unit. Subsequently, feature matching processing of the visible light images establishes a correspondence between adjacent frames, enabling position tracking of each smallest photovoltaic unit. Finally, known defective units in a single image are mapped to pre-constructed global logical numbering units, locating all the smallest units contained in each array. The GPS coordinates corresponding to the center point of each array are calculated using the camera attitude angle. The GPS coordinates of the center point of each array are then marked on the global map. This method uses string and component segmentation algorithms to map each component to the logical position of the modeled panoramic map. This is helpful for photovoltaic panels mounted on fixed-axis supports, but less suitable for supports with horizontal or inclined single axes. Furthermore, locating the approximate position of strings in the panoramic map based on the approximate GPS coordinates of the photovoltaic panels introduces a certain degree of positioning error.

[0051] Furthermore, existing technologies utilize Structure From Motion (SFM) to reconstruct the camera's GPS (Global Positioning System) coordinates and camera attitude for all photos taken by the drone. A correspondence between 3D coordinates and 2D pixels is established based on the sparse point cloud data generated by SFM, and finally, the 3D coordinates corresponding to the pixel coordinates of the fault point are interpolated. Then, collinearity equations are solved to obtain the 3D coordinates of the fault point. However, in real-world scenarios, due to the sparse nature of the point cloud generated by the SFM algorithm, the neighboring point cloud found through interpolation deviates from the actual fault point, affecting the accuracy of the final location.

[0052] In view of the problems existing in the prior art, this application proposes a target localization method based on image segmentation. The method of this application will be described in detail below with reference to specific embodiments.

[0053] Referring to Figure 1, which is a flowchart of a target localization method based on image segmentation in one embodiment of this application, the method includes steps S100-S150.

[0054] In step S100, multiple images of the target region are acquired, wherein the target region contains one or more target objects.

[0055] In one embodiment, a photovoltaic power station can be used as the target area, and photovoltaic modules as the target objects. A drone is used to cruise and collect images of the photovoltaic power station, acquiring images of the target area from a high-altitude perspective. The acquired images may include infrared thermal images and / or visible light images. Taking infrared images as an example, an infrared camera mounted on a drone can be used to collect infrared images of the target area, resulting in an infrared image sequence. The infrared thermal image sequence may include infrared images from multiple angles.

[0056] In step S110, the target region image is segmented to obtain the sub-image region of the target object.

[0057] In one embodiment, before performing image segmentation on the target region image, steps S111-S112 are further included.

[0058] In step S111, multiple sample images containing the target object are acquired, and the image regions corresponding to the target object in the sample images are labeled to construct a training sample set, wherein the occluded areas or bad blocks of the target object are labeled as background.

[0059] In one embodiment, image sequences of a photovoltaic power station can be collected by a drone as sample images. Each pixel in the sample images is then classified and labeled to construct a training sample set. Specifically, taking infrared images as sample images as an example, the infrared images are classified and labeled pixel by pixel. That is, each pixel in the infrared image is classified as either the background or the surface of the photovoltaic string. The pixel value of the background is configured as 0 (black), and the pixel value of the photovoltaic string surface is configured as 255 (white). Therefore, vegetation obstruction and missing parts on the photovoltaic string surface will be classified as the background, avoiding the impact of vegetation obstruction or missing photovoltaic modules on the fault location accuracy.
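The labeling rule in this paragraph can be illustrated with a minimal sketch. The rectangular annotation format, the `Box` layout, and the function names `fill` and `build_label` are assumptions made for illustration; the patent does not specify how annotations are stored.

```python
from typing import List, Tuple

# Hypothetical annotation format: (top, left, bottom, right), bottom/right exclusive.
Box = Tuple[int, int, int, int]


def fill(label: List[List[int]], box: Box, value: int) -> None:
    """Set every pixel inside the box to the given value."""
    top, left, bottom, right = box
    for r in range(top, bottom):
        for c in range(left, right):
            label[r][c] = value


def build_label(height: int, width: int,
                string_boxes: List[Box],
                occluded_boxes: List[Box]) -> List[List[int]]:
    """Build a 0/255 label image; occluded or missing areas end up as background."""
    label = [[0] * width for _ in range(height)]  # background = 0 (black)
    for box in string_boxes:
        fill(label, box, 255)                     # string surface = 255 (white)
    for box in occluded_boxes:
        fill(label, box, 0)                       # occlusion overrides the string label
    return label


label = build_label(3, 4, string_boxes=[(0, 0, 3, 4)], occluded_boxes=[(1, 1, 2, 3)])
```

Painting the occluded boxes last ensures that vegetation or missing parts inside a string are labeled as background, as the paragraph describes.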

[0060] In one embodiment, data augmentation can also be performed on the sample images to expand the number of samples in the training sample set, enrich feature representation, prevent overfitting during model training, and improve the model's generalization ability. Data augmentation may include operations such as flipping, cropping, jittering, scaling, and translating the originally acquired images. The specific data augmentation operations can be adjusted according to the actual application requirements and are not limited here.
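A minimal sketch of the flipping operations, assuming images and label masks are stored as nested lists. The function names are illustrative and only two of the listed operations are shown. The point of the sketch is that each geometric operation must be applied to the image and its label mask together so that the pixel labels stay aligned.

```python
from typing import List, Tuple

Image = List[List[int]]  # row-major grayscale image or 0/255 label mask


def flip_horizontal(img: Image) -> Image:
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Reverse the row order (top to bottom)."""
    return [row[:] for row in img[::-1]]


def augment(image: Image, mask: Image) -> List[Tuple[Image, Image]]:
    """Return the original pair plus flipped copies, applying each flip to both."""
    operations = [lambda x: x, flip_horizontal, flip_vertical]
    return [(op(image), op(mask)) for op in operations]


pairs = augment([[1, 2], [3, 4]], [[0, 255], [0, 255]])
```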

[0061] In step S112, a model is trained based on the sample image to obtain a segmentation model, and image segmentation is performed according to the segmentation model.

[0062] In one embodiment, after annotation and data augmentation are completed, the sample images in the training sample set can be sliced to obtain image patches. These image patches are then used as input to a neural network for model training. The neural network can be a U-Net, but other network structures can also be used to train the segmentation model according to actual needs. Training with a U-Net is given only as an example and should not be considered a limitation of the embodiments of this application. The U-Net is an improvement on the fully convolutional neural network and includes three parts: downsampling, upsampling, and softmax. Feature extraction and pooling are performed through downsampling. The feature map from each downsampling stage is passed to the corresponding upsampling stage. After multiple upsampling stages, the classification result is output by the softmax activation function. The specific network architecture and training process are existing technologies and will not be described in detail here. The optimal model obtained after training with the sample images is used as the segmentation model.
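The final softmax step can be sketched in isolation. This is not a U-Net; it only shows, under the assumption of two classes (background and string surface) and a hypothetical per-pixel logit layout, how softmax outputs could be turned into the 0/255 mask used elsewhere in this application.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one pixel's class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_mask(logit_map: List[List[Sequence[float]]]) -> List[List[int]]:
    """Assumed layout: logit_map[r][c] = (background_logit, string_logit)."""
    mask = []
    for row in logit_map:
        mask_row = []
        for logits in row:
            p_background, p_string = softmax(logits)
            mask_row.append(255 if p_string > p_background else 0)
        mask.append(mask_row)
    return mask


logit_map = [[(2.0, 0.5), (0.1, 3.0)],
             [(-1.0, 1.0), (4.0, 4.5)]]
mask = logits_to_mask(logit_map)
```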

[0063] In one embodiment, image segmentation is performed on the target region image, including steps S113 and S114.

[0064] In step S113, the target region image is segmented according to the segmentation model to obtain the mask image of the target object.

[0065] In one embodiment, segmenting the target region image according to the segmentation model to obtain a mask image of the target object includes:

[0066] The target region image is divided into slices to obtain multiple slice images;

[0067] Each of the slice images is input into the segmentation model for pixel segmentation to obtain a slice mask composed of background pixel values and target object pixel values;

[0068] The slice masks are stitched together to obtain a mask image of the target object.

[0069] Specifically, the target region image can be sliced, and the resulting slice images are used as input to the previously trained segmentation model. The segmentation model performs inference to obtain a slice mask for each slice image. Based on the position of each slice image in the original target region image, the slice masks are stitched together to form a mask image of the same size as the target region image. The mask image contains only pixel blocks of the target object and pixel blocks of the background. The pixel values of the target object's pixel blocks are all configured to 255 (white), and the pixel values of the background's pixel blocks are all configured to 0 (black).
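The slice, infer, and stitch sequence can be sketched as follows. The tile size, the non-overlapping tiling, and the `threshold_model` stand-in for the trained segmentation model are assumptions for illustration only; the patent does not state the slice size or whether slices overlap.

```python
from typing import Callable, Iterator, List, Tuple

Image = List[List[int]]  # row-major grayscale image, values 0..255


def slice_image(image: Image, tile: int) -> Iterator[Tuple[int, int, Image]]:
    """Yield (top, left, patch) for non-overlapping tiles; edge tiles may be smaller."""
    height, width = len(image), len(image[0])
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            patch = [row[left:left + tile] for row in image[top:top + tile]]
            yield top, left, patch


def segment_by_slices(image: Image, tile: int,
                      predict_mask: Callable[[Image], Image]) -> Image:
    """Run predict_mask on each slice and stitch the slice masks back by position."""
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]  # start as all background
    for top, left, patch in slice_image(image, tile):
        patch_mask = predict_mask(patch)
        for dy, mask_row in enumerate(patch_mask):
            for dx, value in enumerate(mask_row):
                mask[top + dy][left + dx] = 255 if value else 0
    return mask


def threshold_model(patch: Image) -> Image:
    """Hypothetical stand-in for the trained segmentation model."""
    return [[255 if pixel > 128 else 0 for pixel in row] for row in patch]


image = [[200, 200, 10, 10, 10],
         [200, 200, 10, 10, 10],
         [10, 10, 10, 220, 220]]
mask = segment_by_slices(image, tile=2, predict_mask=threshold_model)
```

Because each slice mask is written back at the slice's original offset, the stitched mask has the same size as the input image.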

[0070] In step S114, sub-image regions of each target object in the target region image are determined based on the mask image.

[0071] Specifically, the mask image can be convolved with the original target area image to extract the contour of the target object as a sub-image region. By segmenting the infrared image pixel by pixel, the photovoltaic string region (i.e., the sub-image region) is segmented. Areas below missing components or covered by vegetation will not be segmented, reducing the impact of missing or occluded components on positioning accuracy.
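One way to obtain a separate pixel set for each string from the mask is connected-component labeling. This technique is swapped in here as an assumption; the paragraph above describes contour extraction, and the patent does not say how individual regions are separated. The sketch groups 4-connected pixels with value 255.

```python
from collections import deque
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def sub_image_regions(mask: List[List[int]]) -> List[Set[Pixel]]:
    """Group 4-connected pixels with value 255 into separate regions."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions: List[Set[Pixel]] = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 255 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited string pixel.
            region: Set[Pixel] = set()
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 255 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


mask = [[255, 255, 0, 0, 0],
        [255, 255, 0, 0, 0],
        [0, 0, 0, 255, 255]]
regions = sub_image_regions(mask)
```

Pixels labeled as background (vegetation or missing parts) never enter any region, so they cannot contribute 3D points in the later steps.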

[0072] In step S120, three-dimensional reconstruction is performed based on the target area image to obtain the three-dimensional point cloud data corresponding to the target area image and the corresponding camera parameters.

[0073] Referring to Figure 2, which is a schematic diagram of the overall fault location process in one embodiment of this application: in one embodiment, the SFM algorithm can be used to reconstruct three-dimensional point cloud data from the two-dimensional target area images. The SFM algorithm determines the spatial and geometric relationships of the target from images taken as the camera moves. It can recover a sparse three-dimensional point cloud together with the camera intrinsic and extrinsic parameters corresponding to each infrared image. The specific process of reconstructing the three-dimensional point cloud data and obtaining the intrinsic and extrinsic parameters of the camera carried by the UAV is well known to those skilled in the art and will not be described in detail here. The camera parameters may include camera GPS coordinates, attitude, intrinsic parameters, etc. There is a correlation between the camera parameters and the pixel coordinates in the acquired image, and a collinearity equation can be pre-established to represent this correlation.
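The patent does not write out the collinearity equation. As an assumption, one common form is the pinhole camera model, in which a world point X, the camera centre, and the pixel (u, v) lie on one line. The symbols K (intrinsics), R and t (rotation and translation from world to camera coordinates), and the scale factor s are notation introduced here for illustration:

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = K \left( R\,\mathbf{X} + \mathbf{t} \right),
\qquad
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
```

Under this form, a single pixel fixes only a viewing ray, which is why a second constraint such as a plane equation is needed to recover a unique 3D point.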

[0074] In step S130, fault detection is performed on the target area image to obtain the first coordinates of the fault point in the target area image.

[0075] In one embodiment, fault detection is performed on the target region image to obtain the first coordinates of the fault point in the target region image, including:

[0076] The target region image is input into a preset fault detection model to obtain one or more fault detection boxes;

[0077] The center coordinates of the fault detection box are used as the first coordinates.

[0078] In one embodiment, images containing fault points can be acquired as sample images. Fault points in the sample images are labeled to train a fault detection model. A convolutional neural network (CNN) can be used as the training architecture for model training. The specific CNN architecture can be selected according to actual application requirements and is not limited here. The trained fault detection model can mark fault points in the target region image and generate corresponding fault detection boxes. Fault detection boxes are generated in the image to identify fault points, and the coordinates of the center point of the fault detection box are used as the first coordinates.
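Taking the box center as the first coordinate is simple arithmetic. The sketch below assumes boxes are given as corner coordinates in pixels, which is a hypothetical format; detectors differ in how they report boxes.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def first_coordinates(boxes: List[Box]) -> List[Tuple[float, float]]:
    """Return the center (x, y) of each fault detection box."""
    return [((x0 + x1) / 2.0, (y0 + y1) / 2.0) for x0, y0, x1, y1 in boxes]


boxes = [(10.0, 20.0, 30.0, 40.0), (0.0, 0.0, 5.0, 3.0)]
centers = first_coordinates(boxes)
```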

[0079] In step S140, the target pixel coordinate set of the sub-image region where the fault point is located is determined based on the first coordinate, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set.

[0080] In one embodiment, the photovoltaic module containing the fault point detected by the fault detection algorithm is planar, so the sparse 3D points that SFM recovers on the module surface can be used to establish a coplanarity equation, and the location of the fault point can then be solved by combining the coplanarity equation with the aforementioned collinearity equation. This approach has extremely high positioning accuracy for an ideal photovoltaic module surface, where the 3D points on the surface are strictly coplanar and none has an inconsistent elevation. In practice, however, the module surface may have missing parts and vegetation shading, which produce abnormal 3D points on the surface captured by the drone. Where parts are missing, the 3D points reconstructed by SFM lie below the coplanar points; where trees or other vegetation shade the surface and cause hot spots, the reconstructed points lie above them. These abnormal points break coplanarity, make the plane equation fitted to the module surface inaccurate, and result in positioning deviation. To address these issues, embodiments of this application establish the mapping relationship between the 3D point cloud and the 2D image by segmenting sub-image regions pixel by pixel and selecting the coplanar points within the sub-image regions.

[0081] In one embodiment, the target pixel coordinate set of the sub-image region where the fault point is located is determined based on the first coordinate to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set, including steps S141 and S142.

[0082] In step S141, the first coordinate is compared with the set of pixel coordinates of each of the sub-image regions, the sub-image region corresponding to the matching coordinate is taken as the sub-image region where the fault point is located, and the set of pixel coordinates corresponding to the matching coordinate is taken as the target pixel coordinate set.

[0083] Referring to Figure 3, which is a schematic diagram of the process for filtering 3D point cloud data in one embodiment of this application: after a segmentation mask is obtained by pixel-by-pixel segmentation of the infrared image according to the aforementioned steps, the contour of the target object in the target region image can be extracted as the corresponding sub-image region. After the first coordinate of the fault point is obtained by performing fault detection on the target region image, the first coordinate is compared with the pixel coordinates of each sub-image region in the image coordinate system. Based on the coordinate difference, the sub-image region where the fault point is located, i.e., the string mask region, is determined, and the set of pixel coordinates with a mask value of 255 is obtained as the target pixel coordinate set.
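The comparison step can be sketched as a nearest-pixel search. The tolerance `max_distance` and the function name `locate_region` are hypothetical; the patent only says that the region is chosen from the coordinate difference. The sketch assumes every region is non-empty.

```python
import math
from typing import List, Optional, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def locate_region(first_coord: Tuple[float, float],
                  regions: List[Set[Pixel]],
                  max_distance: float = 1.0) -> Optional[int]:
    """Return the index of the region closest to the fault center (x, y).

    A distance of 0 means the center lies on a region pixel. None is returned
    when no region pixel is within max_distance (an assumed tolerance).
    """
    col, row = first_coord
    best_index, best_distance = None, float("inf")
    for index, region in enumerate(regions):
        distance = min(math.hypot(r - row, c - col) for r, c in region)
        if distance < best_distance:
            best_index, best_distance = index, distance
    return best_index if best_distance <= max_distance else None


regions = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(2, 3), (2, 4)}]
index = locate_region((4.0, 2.0), regions)
target_pixels = regions[index] if index is not None else set()
```

The set `regions[index]` then plays the role of the target pixel coordinate set.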

[0084] In step S142, the 3D point cloud data of the sub-image region where the fault point is located is extracted and projected onto the coordinate system of the sub-image region to obtain the mapping relationship between the 3D point cloud data and the target pixel coordinate set.

[0085] In one embodiment, after reconstructing the three-dimensional point cloud data corresponding to the target area image using the aforementioned SFM algorithm, the three-dimensional point cloud data can be projected onto each sub-image area to establish a mapping relationship between the three-dimensional point cloud data and the two-dimensional points in the sub-image area.

[0086] In one embodiment, only the 3D point cloud data corresponding to the sub-image region containing the fault point can be extracted, and this part of the 3D point cloud data can be projected onto the corresponding sub-image region. The coplanar 3D point cloud data in the sub-image region can be used as the optimal 3D point cloud.
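Projecting the point cloud and keeping only the points that land inside the target pixel coordinate set can be sketched as follows. The pinhole model, the world-to-camera convention x_cam = R·X + t, and all parameter names are assumptions; lens distortion is ignored.

```python
import math
from typing import Dict, Iterable, Optional, Sequence, Set, Tuple

Point3D = Tuple[float, float, float]
Pixel = Tuple[int, int]  # (row, col)


def project(point: Point3D, rotation: Sequence[Sequence[float]],
            translation: Sequence[float],
            fx: float, fy: float, cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection to (u, v); returns None for points behind the camera."""
    xc, yc, zc = (sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
                  for i in range(3))
    if zc <= 0:
        return None
    return fx * xc / zc + cx, fy * yc / zc + cy


def map_points_to_region(points: Iterable[Point3D], target_pixels: Set[Pixel],
                         rotation, translation, fx, fy, cx, cy) -> Dict[Pixel, Point3D]:
    """Keep only 3D points whose projection falls on a pixel of the target region."""
    mapping: Dict[Pixel, Point3D] = {}
    for point in points:
        uv = project(point, rotation, translation, fx, fy, cx, cy)
        if uv is None:
            continue
        pixel = (math.floor(uv[1] + 0.5), math.floor(uv[0] + 0.5))  # nearest (row, col)
        if pixel in target_pixels:
            mapping[pixel] = point  # a later point on the same pixel overwrites
    return mapping


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
points = [(0.0, 0.0, 100.0), (1.0, 0.0, 100.0), (-2.0, -2.0, 100.0), (0.0, 0.0, -5.0)]
mapping = map_points_to_region(points, {(2, 2), (2, 3)}, identity, [0.0, 0.0, 0.0],
                               fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

Points that project onto background pixels, such as those reconstructed on vegetation, are dropped before any plane is fitted.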

[0087] In step S150, the positioning coordinates of the fault point in the target coordinate system are determined according to the camera parameters and the mapping relationship.

[0088] In one embodiment, determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship includes:

[0089] Based on the mapping relationship, determine the coplanar 3D point cloud data of the sub-image region containing the fault point, and establish the coplanarity equation of the corresponding sub-image region based on the coplanar 3D point cloud data;

[0090] Obtain the collinearity equation, which is used to represent the relationship between the camera parameters and the coordinates of each pixel in the target region image;

[0091] The location coordinates of the fault point in the target coordinate system are determined based on the collinearity equation and the coplanarity equation.

[0092] Specifically, based on the aforementioned mapping relationship, coplanar 3D point cloud data within the sub-image region can be selected for planar equation fitting to obtain coplanar equations. Then, linear fitting is performed using camera parameters obtained from the SFM algorithm to obtain the collinear equations of the coordinate points in the camera coordinate system and the image coordinate system. Solving the collinear and coplanar equations simultaneously allows for the calculation of the precise location coordinates of the fault point in the target coordinate system. The target coordinate system can be set to the global coordinate system where the photovoltaic power station is located; alternative coordinate systems can be used depending on the specific application requirements, and no restrictions are imposed here. Fault localization primarily relies on the coplanarity of the photovoltaic string plane where the fault point is located and the collinearity of the 3D coordinates and pixel coordinates of the fault point. By solving the simultaneous equations, the 3D coordinates of the fault point can be determined.
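Solving the two constraints together amounts to intersecting the viewing ray of the fault pixel with the fitted plane. The sketch below is one possible formulation, not the patent's stated procedure: it assumes a pinhole camera with x_cam = R·X + t, and it fits the plane as z = a·x + b·y + c, which excludes vertical planes. All function and parameter names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points: List[Vec3]) -> Vec3:
    """Least-squares fit of z = a*x + b*y + c via the normal equations."""
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    normal_matrix = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, len(points)]]
    rhs = [sxz, syz, sz]
    d = det3(normal_matrix)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (too few or collinear)")
    solution = []
    for k in range(3):  # Cramer's rule: replace column k with the right-hand side
        m = [row[:] for row in normal_matrix]
        for i in range(3):
            m[i][k] = rhs[i]
        solution.append(det3(m) / d)
    return solution[0], solution[1], solution[2]


def locate_fault(pixel_uv: Tuple[float, float], plane: Vec3,
                 rotation: Sequence[Sequence[float]], translation: Sequence[float],
                 fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Intersect the viewing ray of pixel (u, v) with the plane z = a*x + b*y + c."""
    a, b, c = plane
    u, v = pixel_uv
    ray_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # World-frame ray direction R^T * ray_cam and camera centre -R^T * t.
    direction = [sum(rotation[j][i] * ray_cam[j] for j in range(3)) for i in range(3)]
    centre = [-sum(rotation[j][i] * translation[j] for j in range(3)) for i in range(3)]
    normal = (a, b, -1.0)  # plane written as normal . X + c = 0
    denom = sum(n * d for n, d in zip(normal, direction))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane")
    s = -(sum(n * p for n, p in zip(normal, centre)) + c) / denom
    return (centre[0] + s * direction[0],
            centre[1] + s * direction[1],
            centre[2] + s * direction[2])


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coplanar = [(0.0, 0.0, 100.0), (1.0, 0.0, 100.0), (0.0, 1.0, 100.0), (1.0, 1.0, 100.0)]
plane = fit_plane(coplanar)
fault_xyz = locate_fault((3.0, 2.0), plane, identity, [0.0, 0.0, 0.0],
                         fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

In this toy setup the camera sits at the origin looking along +z at a plane 100 units away, so the pixel (3, 2) maps back to a point one unit off the optical axis on that plane. If the target coordinate system differs from the reconstruction frame, a further transformation would be needed, which the sketch omits.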

[0093] Based on the above technical solutions, this application performs pixel-by-pixel segmentation of the photovoltaic string surface in the two-dimensional image, and then, through the correspondence between three-dimensional coordinates and two-dimensional pixel points, selects only those three-dimensional points that are coplanar. This improves the accuracy of fault location and is of great significance for the refined intelligent inspection of power plants.

[0094] In one embodiment, as shown in Figure 4, a target localization system based on image segmentation is provided. The system includes: an image acquisition module 10, used to acquire multiple images of a target region, wherein the target region contains one or more target objects; an image segmentation module 11, used to perform image segmentation on the target region images to obtain sub-image regions of the target objects; a 3D reconstruction module 12, used to perform 3D reconstruction based on the target region images to obtain 3D point cloud data corresponding to the target region images and corresponding camera parameters; a fault detection module 13, used to perform fault detection on the target region images to obtain the first coordinates of the fault point in the target region images; a coordinate mapping module 14, used to determine the target pixel coordinate set of the sub-image region where the fault point is located based on the first coordinates, so as to establish a mapping relationship between the 3D point cloud data and the target pixel coordinate set; and a fault localization module 15, used to determine the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship.

[0095] In one embodiment, the system further includes a model training module, which is configured to perform the following before image segmentation is performed on the target region image: acquiring multiple sample images containing the target object; labeling the image regions corresponding to the target object in the sample images to construct a training sample set, wherein the occluded regions or bad blocks of the target object are labeled as background; and training a model based on the sample images to obtain a segmentation model, so as to perform image segmentation according to the segmentation model.
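
The labeling rule above (occluded regions and bad blocks count as background) can be illustrated with a small Python sketch. It is only an assumption about how such a label mask might be rasterized: the annotations are taken to be axis-aligned boxes (x0, y0, x1, y1) with exclusive upper bounds, and the name build_label_mask is hypothetical. Real annotations would more likely be polygons.

```python
def build_label_mask(height, width, module_boxes, excluded_boxes):
    """Rasterize a training label mask: 1 = target object, 0 = background.

    module_boxes   -- boxes covering the target objects (assumed format)
    excluded_boxes -- occluded regions or bad blocks, forced to background
    """
    mask = [[0] * width for _ in range(height)]

    def fill(box, value):
        x0, y0, x1, y1 = box
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                mask[y][x] = value

    for box in module_boxes:
        fill(box, 1)
    # Excluded areas are written last so they override the object label.
    for box in excluded_boxes:
        fill(box, 0)
    return mask


label_mask = build_label_mask(4, 6, [(0, 0, 4, 3)], [(1, 1, 3, 2)])
```

Writing the excluded boxes last keeps occluded pixels out of the object class, so that 3D points later selected through the mask are less likely to come from an occluder rather than from the module surface.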

[0096] In one embodiment, the image segmentation module 11 is further configured to perform image segmentation on the target region image, including: segmenting the target region image according to the segmentation model to obtain a mask image of the target object; and determining sub-image regions of each target object in the target region image according to the mask image.
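
How sub-image regions are derived from the mask image is not detailed here. One plausible approach, shown below as a Python sketch under that assumption, is to treat each 4-connected group of target-object pixels as one sub-image region and to record its pixel coordinate set. The function name sub_image_regions is hypothetical.

```python
from collections import deque


def sub_image_regions(mask):
    """Split a binary mask into 4-connected regions.

    mask -- list of rows, 0 = background, non-zero = target object
    Returns a list of sets of (x, y) pixel coordinates, one set per region.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited object pixel.
            region = set()
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                px, py = queue.popleft()
                region.add((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


example_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 1],
]
regions = sub_image_regions(example_mask)
```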

[0097] In one embodiment, the image segmentation module 11 is further configured to segment the target region image according to the segmentation model to obtain a mask image of the target object, including: dividing the target region image into slices to obtain multiple slice images; inputting each slice image into the segmentation model for pixel segmentation to obtain a slice mask composed of background pixel values and target object pixel values; and stitching the slice masks to obtain a mask image of the target object.
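
The slice-and-stitch procedure can be sketched in Python as follows. The sketch assumes non-overlapping rectangular slices and represents the segmentation model by an arbitrary callable; the threshold function used here is only a stand-in for a trained model, and the names segment_by_slices and threshold_model are hypothetical.

```python
def segment_by_slices(image, slice_h, slice_w, model):
    """Segment a large image slice by slice and stitch the slice masks.

    image -- list of rows of pixel values
    model -- callable mapping a slice (list of rows) to a mask of equal size
    """
    height, width = len(image), len(image[0])
    mask = [[0] * width for _ in range(height)]
    for top in range(0, height, slice_h):
        for left in range(0, width, slice_w):
            # Cut one slice; slices on the border may be smaller.
            tile = [row[left:left + slice_w]
                    for row in image[top:top + slice_h]]
            tile_mask = model(tile)
            # Paste the slice mask back at its original position.
            for dy, mask_row in enumerate(tile_mask):
                mask[top + dy][left:left + len(mask_row)] = mask_row
    return mask


def threshold_model(tile):
    """Stand-in for a segmentation model: bright pixels are 'object'."""
    return [[1 if value > 128 else 0 for value in row] for row in tile]


image = [
    [0, 200, 200, 0, 0],
    [0, 200, 200, 0, 255],
    [10, 10, 10, 10, 10],
]
stitched = segment_by_slices(image, 2, 2, threshold_model)
```

In practice, overlapping slices with blending at the seams may reduce boundary artifacts; that refinement is outside what is described here.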

[0098] In one embodiment, the fault detection module 13 is further configured to perform fault detection on the target area image to obtain the first coordinates of the fault point in the target area image, including: inputting the target area image into a preset fault detection model to obtain one or more fault detection boxes; and using the center coordinates of the fault detection boxes as the first coordinates.
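
Taking the center of each detection box as the first coordinates is simple arithmetic; the Python sketch below assumes boxes in the form (x_min, y_min, x_max, y_max) in pixel units, which is a common but not universal detector output format. The function names are hypothetical.

```python
def box_center(box):
    """Center of a detection box given as (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def first_coordinates(boxes):
    """One first coordinate per fault detection box."""
    return [box_center(box) for box in boxes]


centers = first_coordinates([(10, 20, 30, 60), (0, 0, 5, 5)])
```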

[0099] In one embodiment, the coordinate mapping module 14 is further configured to determine the target pixel coordinate set of the sub-image region where the fault point is located based on the first coordinate, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set, including: comparing the first coordinate with the pixel coordinate set of each of the sub-image regions, taking the sub-image region corresponding to the matching coordinate as the sub-image region where the fault point is located, and taking the corresponding pixel coordinate set as the target pixel coordinate set; extracting the three-dimensional point cloud data of the sub-image region where the fault point is located, and projecting it onto the coordinate system of the sub-image region to obtain the mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set.
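
The matching and projection steps can be sketched in Python as below. This is one interpretation rather than the implementation of this application: each sub-image region is assumed to be a set of integer (x, y) pixel coordinates, the projection is assumed to follow a pinhole model with hypothetical parameters fx, fy, cx, cy, R, t (x_cam = R·X + t), and the mapping is stored as a dictionary from pixel to the 3D points that project onto it.

```python
def find_fault_region(first_coord, regions):
    """Index of the region whose pixel set contains the first coordinate."""
    pixel = (int(round(first_coord[0])), int(round(first_coord[1])))
    for index, pixels in enumerate(regions):
        if pixel in pixels:
            return index
    return None  # the fault point lies on background


def project(point, fx, fy, cx, cy, R, t):
    """Project a 3D world point to integer pixel coordinates (pinhole model)."""
    xc = [sum(R[i][k] * point[k] for k in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        return None  # behind the camera
    return (int(round(fx * xc[0] / xc[2] + cx)),
            int(round(fy * xc[1] / xc[2] + cy)))


def map_points_to_pixels(points, target_pixels, camera):
    """Keep the 3D points that project into the target pixel coordinate set."""
    mapping = {}
    for point in points:
        pixel = project(point, **camera)
        if pixel is not None and pixel in target_pixels:
            mapping.setdefault(pixel, []).append(point)
    return mapping


camera = {"fx": 100.0, "fy": 100.0, "cx": 50.0, "cy": 50.0,
          "R": [[1, 0, 0], [0, 1, 0], [0, 0, 1]], "t": (0, 0, 0)}
regions = [{(10, 10)}, {(50, 50), (51, 50)}]
cloud = [(0, 0, 2), (1, 0, 2), (0, 0, -1)]

region_index = find_fault_region((50.2, 49.9), regions)
mapping = map_points_to_pixels(cloud, regions[region_index], camera)
```

Only the 3D points whose projections fall inside the target pixel coordinate set survive, which is what restricts the later plane fit to points on the module surface.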

[0100] In one embodiment, the fault location module 15 is further configured to determine the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship, including: determining the coplanar three-dimensional point cloud data of the sub-image region containing the fault point according to the mapping relationship, and establishing a coplanar equation for the corresponding sub-image region based on the coplanar three-dimensional point cloud data; obtaining a collinear equation, wherein the collinear equation is used to represent the correlation between the camera parameters and the pixel coordinates in the target region image; and determining the location coordinates of the fault point in the target coordinate system based on the collinear equation and the coplanar equation.

[0101] The aforementioned target localization system based on image segmentation can be implemented as a computer program that runs on a computer device such as the one shown in Figure 5. The computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor.

[0102] The modules in the aforementioned image segmentation-based target localization system can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in or independent of the terminal's memory in hardware form, or stored in the terminal's memory in software form, so that the processor can call and execute the corresponding operations of each module. The processor can be a central processing unit (CPU), a microprocessor, a microcontroller, etc.

[0103] Figure 5 is a schematic diagram of the internal structure of a computer device in one embodiment. A computer device is provided, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it performs the following steps: acquiring multiple images of a target region, the target region containing one or more target objects; performing image segmentation on the target region images to obtain sub-image regions of the target objects; performing 3D reconstruction based on the target region images to obtain 3D point cloud data corresponding to the target region images and corresponding camera parameters; performing fault detection on the target region images to obtain a first coordinate of a fault point in the target region images; determining a set of target pixel coordinates of the sub-image region where the fault point is located based on the first coordinate, to establish a mapping relationship between the 3D point cloud data and the target pixel coordinate set; and determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship.

[0104] In one embodiment, before performing image segmentation on the target region image, the processor further performs the following steps: acquiring multiple sample images containing the target object; labeling the image regions corresponding to the target object in the sample images to construct a training sample set, wherein the occluded regions or bad blocks of the target object are labeled as background; and training a model based on the sample images to obtain a segmentation model, so as to perform image segmentation according to the segmentation model.

[0105] In one embodiment, when the processor executes the above-described image segmentation of the target region image, the segmentation includes: segmenting the target region image according to the segmentation model to obtain a mask image of the target object; and determining sub-image regions of each target object in the target region image based on the mask image.

[0106] In one embodiment, when the processor executes the computer program, segmenting the target region image according to the segmentation model to obtain a mask image of the target object includes: dividing the target region image into slices to obtain multiple slice images; inputting each slice image into the segmentation model for pixel segmentation to obtain a slice mask composed of background pixel values and target object pixel values; and stitching the slice masks to obtain a mask image of the target object.

[0107] In one embodiment, when the processor executes the above-mentioned process, the process of detecting faults in the target region image and obtaining the first coordinates of the fault point in the target region image includes: inputting the target region image into a preset fault detection model to obtain one or more fault detection boxes; and using the center coordinates of the fault detection boxes as the first coordinates.

[0108] In one embodiment, when the processor executes the above-mentioned method, the process of determining the target pixel coordinate set of the sub-image region where the fault point is located based on the first coordinate to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set includes: comparing the first coordinate with the pixel coordinate set of each of the sub-image regions, taking the sub-image region corresponding to the matching coordinate as the sub-image region where the fault point is located, and taking the corresponding pixel coordinate set as the target pixel coordinate set; extracting the three-dimensional point cloud data of the sub-image region where the fault point is located, and projecting it onto the coordinate system of the sub-image region to obtain the mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set.

[0109] In one embodiment, when the processor executes the above-mentioned method, the process of determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship includes: determining the coplanar 3D point cloud data of the sub-image region containing the fault point based on the mapping relationship, and establishing a coplanar equation for the corresponding sub-image region based on the coplanar 3D point cloud data; obtaining a collinear equation, which is used to represent the correlation between the camera parameters and the pixel coordinates in the target region image; and determining the location coordinates of the fault point in the target coordinate system based on the collinear equation and the coplanar equation.

[0110] In one embodiment, the aforementioned computer device can be used as a server, including but not limited to a standalone physical server or a server cluster consisting of multiple physical servers. The computer device can also be used as a terminal, including but not limited to mobile phones, tablets, personal digital assistants, or smart devices. As shown in Figure 5, the computer device includes a processor, a non-volatile storage medium, internal memory, a display screen, and a network interface connected via a system bus.

[0111] The processor of this computer device provides computing and control capabilities to support the operation of the entire device. The non-volatile storage medium of the computer device stores the operating system and computer programs. These programs can be executed by the processor to implement the image segmentation-based target localization method provided in the various embodiments above. The internal memory of the computer device provides a cached runtime environment for the operating system and computer programs stored in the non-volatile storage medium. A display interface can show data via a screen. The screen can be a touchscreen, such as a capacitive or electronic screen, and can generate corresponding instructions by receiving clicks on controls displayed on the touchscreen.

[0112] Those skilled in the art will understand that the structure of the computer device shown in Figure 5 is merely a block diagram of the portion of the structure related to the present application and does not constitute a limitation on the computer device to which the present application is applied. A specific computer device may include more or fewer components than those shown in the figure, combine certain components, or have a different arrangement of components.

[0113] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon. When executed by a processor, the computer program performs the following steps: acquiring multiple images of a target region, the target region containing one or more target objects; performing image segmentation on the target region images to obtain sub-image regions of the target objects; performing 3D reconstruction based on the target region images to obtain 3D point cloud data corresponding to the target region images and corresponding camera parameters; performing fault detection on the target region images to obtain a first coordinate of a fault point in the target region images; determining a set of target pixel coordinates of the sub-image region where the fault point is located based on the first coordinate, to establish a mapping relationship between the 3D point cloud data and the set of target pixel coordinates; and determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship.

[0114] In one embodiment, when the computer program is executed by the processor, before performing image segmentation on the target region image, the program further includes: acquiring multiple sample images containing the target object; labeling the image regions corresponding to the target object in the sample images to construct a training sample set, wherein the occluded regions or bad blocks of the target object are labeled as background; training a model based on the sample images to obtain a segmentation model, and performing image segmentation according to the segmentation model.

[0115] In one embodiment, when the computer program is executed by a processor, the image segmentation of the target region image includes: segmenting the target region image according to the segmentation model to obtain a mask image of the target object; and determining sub-image regions of each target object in the target region image according to the mask image.

[0116] In one embodiment, when the computer program is executed by a processor, the process of segmenting the target region image according to the segmentation model to obtain a mask image of the target object includes: dividing the target region image into multiple slice images; inputting each slice image into the segmentation model for pixel segmentation to obtain a slice mask composed of background pixel values and target object pixel values; and stitching the slice masks together to obtain a mask image of the target object.

[0117] In one embodiment, when the computer program is executed by the processor, the method of detecting faults in the target region image and obtaining the first coordinates of the fault point in the target region image includes: inputting the target region image into a preset fault detection model to obtain one or more fault detection boxes; and using the center coordinates of the fault detection boxes as the first coordinates.

[0118] In one embodiment, when the computer program is executed by the processor, the implementation of determining the target pixel coordinate set of the sub-image region where the fault point is located based on the first coordinate, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set, includes: comparing the first coordinate with the pixel coordinate set of each of the sub-image regions, taking the sub-image region corresponding to the matching coordinate as the sub-image region where the fault point is located, and taking the corresponding pixel coordinate set as the target pixel coordinate set; extracting the three-dimensional point cloud data of the sub-image region where the fault point is located, and projecting it onto the coordinate system of the sub-image region to obtain the mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set.

[0119] In one embodiment, when the computer program is executed by the processor, determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship includes: determining the coplanar 3D point cloud data of the sub-image region containing the fault point according to the mapping relationship, and establishing the coplanar equation of the corresponding sub-image region based on the coplanar 3D point cloud data; obtaining the collinear equation, which is used to represent the correlation between the camera parameters and the pixel coordinates in the target region image; and determining the location coordinates of the fault point in the target coordinate system according to the collinear equation and the coplanar equation.

[0120] Those skilled in the art will understand that all or part of the processes in the above embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a non-volatile computer-readable storage medium, and when executed, it can include the processes of the embodiments of the above methods. The storage medium can be a magnetic disk, optical disk, read-only memory (ROM), etc.

[0121] The above embodiments are merely illustrative of the principles and effects of this application and are not intended to limit this application. Any person skilled in the art can modify or alter the above embodiments without departing from the spirit and scope of this application. Therefore, all equivalent modifications or alterations made by those skilled in the art without departing from the spirit and technical concept disclosed in this application should still be covered by the claims of this application.

Claims

1. A target localization method based on image segmentation, characterized in that it comprises: acquiring multiple images of a target region, wherein the target region contains one or more target objects, the photovoltaic power station being defined as the target region and the photovoltaic modules being defined as the target objects; performing image segmentation on the target region images to obtain sub-image regions of the target objects, wherein, before image segmentation is performed on the target region images, the method further comprises: acquiring multiple sample images containing the target object; labeling the image regions corresponding to the target object in the sample images to construct a training sample set, wherein the occluded regions or bad blocks of the target object are labeled as background; and training a model based on the sample images to obtain a segmentation model, so as to perform image segmentation according to the segmentation model; performing three-dimensional reconstruction based on the target region images to obtain the three-dimensional point cloud data corresponding to the target region images and the corresponding camera parameters; performing fault detection on the target region images to obtain the first coordinates of the fault point in the target region images; determining, based on the first coordinates, the target pixel coordinate set of the sub-image region where the fault point is located, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set, wherein coplanar points in the sub-image region are selected to establish the mapping relationship; and determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship.

2. The target localization method based on image segmentation according to claim 1, characterized in that performing image segmentation on the target region image includes: segmenting the target region image according to the segmentation model to obtain a mask image of the target object; and determining the sub-image regions of each target object in the target region image based on the mask image.

3. The target localization method based on image segmentation according to claim 2, characterized in that segmenting the target region image according to the segmentation model to obtain a mask image of the target object includes: dividing the target region image into slices to obtain multiple slice images; inputting each slice image into the segmentation model for pixel segmentation to obtain a slice mask composed of background pixel values and target object pixel values; and stitching the slice masks together to obtain a mask image of the target object.

4. The target localization method based on image segmentation according to claim 3, characterized in that performing fault detection on the target region image to obtain the first coordinates of the fault point in the target region image includes: inputting the target region image into a preset fault detection model to obtain one or more fault detection boxes; and using the center coordinates of the fault detection boxes as the first coordinates.

5. The target localization method based on image segmentation according to any one of claims 1-4, characterized in that determining, based on the first coordinates, the target pixel coordinate set of the sub-image region where the fault point is located, so as to establish a mapping relationship between the 3D point cloud data and the target pixel coordinate set, includes: comparing the first coordinates with the pixel coordinate set of each of the sub-image regions, taking the sub-image region corresponding to the matching coordinate as the sub-image region where the fault point is located, and taking the corresponding pixel coordinate set as the target pixel coordinate set; and extracting the 3D point cloud data of the sub-image region where the fault point is located and projecting it onto the coordinate system of the sub-image region to obtain the mapping relationship between the 3D point cloud data and the target pixel coordinate set.

6. The target localization method based on image segmentation according to any one of claims 1-4, characterized in that determining the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship includes: determining, based on the mapping relationship, the coplanar 3D point cloud data of the sub-image region containing the fault point, and establishing the coplanarity equation of the corresponding sub-image region based on the coplanar 3D point cloud data; obtaining the collinearity equation, which is used to represent the relationship between the camera parameters and the coordinates of each pixel in the target region image; and determining the location coordinates of the fault point in the target coordinate system based on the collinearity equation and the coplanarity equation.

7. A target localization system based on image segmentation, characterized in that it comprises: an image acquisition module, used to acquire multiple images of a target region, wherein the target region contains one or more target objects, the photovoltaic power station being defined as the target region and the photovoltaic modules being defined as the target objects; an image segmentation module, used to segment the target region images to obtain sub-image regions of the target objects, wherein, before the target region images are segmented, the following is further performed: acquiring multiple sample images containing the target object; labeling the image regions corresponding to the target object in the sample images to construct a training sample set, wherein the occluded regions or bad blocks of the target object are labeled as background; and training a model based on the sample images to obtain a segmentation model for image segmentation; a 3D reconstruction module, used to perform 3D reconstruction based on the target region images to obtain the 3D point cloud data corresponding to the target region images and the corresponding camera parameters; a fault detection module, used to perform fault detection on the target region images to obtain the first coordinates of the fault point in the target region images; a coordinate mapping module, used to determine the target pixel coordinate set of the sub-image region where the fault point is located based on the first coordinates, so as to establish a mapping relationship between the three-dimensional point cloud data and the target pixel coordinate set, wherein coplanar points in the sub-image region are selected to establish the mapping relationship; and a fault location module, used to determine the location coordinates of the fault point in the target coordinate system based on the camera parameters and the mapping relationship.

8. A computer device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the steps of the target localization method based on image segmentation as described in any one of claims 1 to 6.

9. A computer-readable storage medium having a computer program stored thereon, characterized in that, when the computer program is executed by a processor, it implements the steps of the target localization method based on image segmentation as described in any one of claims 1 to 6.