A method and system for wire grasping based on a robotic arm
Image processing and coordinate system transformation on a robotic arm are combined with target detection and semantic segmentation networks, solving the problem that robotic arms struggle to grasp complex wires and achieving safe and efficient wire grasping.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA ELECTRIC POWER RESEARCH INSTITUTE CO LTD
- Filing Date
- 2023-09-15
- Publication Date
- 2026-06-30
AI Technical Summary
Existing robotic arms struggle to effectively grasp wires of varying lengths, shapes, and thicknesses, especially in complex, variable, and hazardous work environments, where manual operation is risky and costly.
The robotic arm's camera acquires image data of the wire, the coordinate transformation matrix is determined with hand-eye calibration technology, and object detection and semantic segmentation networks detect and segment the wire. The wire coordinates are then extracted and used to grasp the wire with precise positioning.
This technology enables robotic arms to quickly and accurately grasp electrical wires, avoiding the dangers of manual operation, saving labor costs, and improving the stability and accuracy of the grasping process.
Figure CN117226834B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of target detection technology, and more specifically, to a method and system for grasping wires based on a robotic arm.
Background Technology
[0002] Electrical wires are conductive materials used to transmit electricity or signals. In buildings and residences, they are used for power supply and lighting; in electronic devices and communication systems, they are used to connect electronic components, circuit boards, and devices for signal transmission; in industry and manufacturing, they are used for power transmission and device connection; in transportation, they are used for battery charging, lighting, dashboard displays, and sensor connections; in energy and power systems, they are used for power transmission and distribution; in medical equipment, they are used to connect medical instruments, monitoring equipment, and medical devices; in stage lighting, sound, and special effects systems, they are used for power supply, signal transmission, and device connection; and in security and monitoring systems, they are used for power supply and signal transmission. As people's living standards continue to improve, the application scenarios for electrical wires are becoming increasingly complex and varied. For certain complex and variable work scenarios filled with unknown dangers, such as high-altitude operations, live-line operations, and underwater operations, the risks of manual labor are too great. Therefore, new technological solutions are urgently needed to address the operational problems in these specific scenarios.
[0003] As research into robotic arms continues to deepen, many robotic arms have been successfully applied in practical production and daily life. Therefore, this invention uses a robotic arm to grasp electrical wires to solve the operational problems in the specific scenarios mentioned above. It is worth noting that most current robotic arm applications grasp objects with relatively regular shapes. For complex and irregular objects like electrical wires, whose length, shape, and thickness are uncertain, existing robotic arm grasping technology is inadequate when applied directly.
Summary of the Invention
[0004] To address the above problems, this invention proposes a wire grasping method based on a robotic arm, comprising:
[0005] The robot arm acquires image data of the wire to be inspected using its camera, and determines the transformation matrix between the robot arm's base coordinate system, the robot arm's camera coordinate system, and the robot arm's worktable coordinate system when the robot arm acquires the image data of the wire to be inspected.
[0006] The image data is input into a preset network and function for detection and segmentation to obtain segmented image data.
[0007] The coordinates of the wire to be detected are extracted from the segmented image data. Based on the transformation matrix, the initial coordinates of the wire to be detected are transformed into the coordinates of the wire to be detected in the base coordinate system of the robotic arm. The wire to be detected is then grasped according to the coordinates of the wire to be detected.
[0008] Optionally, based on the TCP calibration method of hand-eye calibration technology, the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm is determined when the robotic arm acquires the image data of the wire to be detected.
[0009] Optionally, the preset network and function include: an object detection network, a loss function, and a semantic segmentation network;
[0010] The target detection network is used to detect the wire image of the wire to be detected in the obtained image data;
[0011] The loss function is used to determine the positional and dimensional losses of the wire image;
[0012] The semantic segmentation network is used to segment the wire image based on the position loss, size loss, and wire detection loss to obtain segmented image data.
[0013] Optionally, the target detection network includes: a region fusion network, a key point fusion network, and a result fusion network;
[0014] The region fusion network is used to extract local feature information from image data multiple times, and fuse the local feature information obtained from the multiple extractions to obtain the key feature information of the image data;
[0015] The key point fusion network is used to calibrate the key point information in the key feature information multiple times based on the calibration matrix, and fuse the key point information calibrated multiple times to obtain a feature map with key point feature information.
[0016] The result fusion network is used to fuse key point feature information in the feature map to narrow the search range of key point feature information and obtain the wire image.
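[0017] Optionally, the formula for the loss function is as follows:
[0018] $$L_{loc} = \sum_i \mathrm{smooth}_{L1}(x_i - \hat{x}_i)$$
[0019] $$L_{size} = \sum_i \mathrm{smooth}_{L1}(s_i - \hat{s}_i)$$
[0020] $$L_{wire} = \lambda_{loc} L_{loc} + \lambda_{size} L_{size}$$
[0021] where $L_{loc}$ is the location loss function, $L_{size}$ is the size loss function, $L_{wire}$ is the wire detection loss function, $x_i$ is the location information of the true bounding box, $\hat{x}_i$ is the predicted location, $s_i$ and $\hat{s}_i$ are the true and predicted bounding box sizes, and $\lambda_{loc}$ and $\lambda_{size}$ are the weighting coefficients used to balance the positional and dimensional losses.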
[0022] Optionally, the semantic segmentation network includes an encoder and a decoder;
[0023] The semantic segmentation network encodes wire images with determined size loss, position loss, and wire detection loss using an encoder, and then decodes the encoder's encoding results using a decoder to obtain segmented image data.
[0024] Furthermore, this invention proposes a wire grasping system based on a robotic arm, comprising:
[0025] The image acquisition unit is used to acquire image data of the wire to be inspected based on the camera of the robotic arm, and to determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm when the robotic arm acquires the image data of the wire to be inspected.
[0026] An image processing unit is used to input the image data as input data into a preset network and function for detection and segmentation, so as to obtain segmented image data of the image data;
[0027] The information capturing unit is used to extract the coordinates of the wire to be detected in the segmented image data, transform the initial coordinates of the wire to be detected into the coordinates of the wire to be detected in the base coordinate system of the robotic arm based on the transformation matrix, and capture the wire to be detected according to the coordinates of the wire to be detected.
[0028] Optionally, the image acquisition unit uses the TCP calibration method based on hand-eye calibration technology to determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm when the robotic arm acquires the image data of the wire to be detected.
[0029] Optionally, the preset network and function include: an object detection network, a loss function, and a semantic segmentation network;
[0030] The target detection network is used to detect the wire image of the wire to be detected in the obtained image data;
[0031] The loss function is used to determine the positional and dimensional losses of the wire image;
[0032] The semantic segmentation network is used to segment the wire image based on the position loss, size loss, and wire detection loss to obtain segmented image data.
[0033] Optionally, the target detection network includes: a region fusion network, a key point fusion network, and a result fusion network;
[0034] The region fusion network is used to extract local feature information from image data multiple times, and fuse the local feature information obtained from the multiple extractions to obtain the key feature information of the image data;
[0035] The key point fusion network is used to calibrate the key point information in the key feature information multiple times based on the calibration matrix, and fuse the key point information calibrated multiple times to obtain a feature map with key point feature information.
[0036] The result fusion network is used to fuse key point feature information in the feature map to narrow the search range of key point feature information and obtain the wire image.
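[0037] Optionally, the formula for the loss function is as follows:
[0038] $$L_{loc} = \sum_i \mathrm{smooth}_{L1}(x_i - \hat{x}_i)$$
[0039] $$L_{size} = \sum_i \mathrm{smooth}_{L1}(s_i - \hat{s}_i)$$
[0040] $$L_{wire} = \lambda_{loc} L_{loc} + \lambda_{size} L_{size}$$
[0041] where $L_{loc}$ is the location loss function, $L_{size}$ is the size loss function, $L_{wire}$ is the wire detection loss function, $x_i$ is the location information of the true bounding box, $\hat{x}_i$ is the predicted location, $s_i$ and $\hat{s}_i$ are the true and predicted bounding box sizes, and $\lambda_{loc}$ and $\lambda_{size}$ are the weighting coefficients used to balance the positional and dimensional losses.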
[0042] Optionally, the semantic segmentation network includes an encoder and a decoder;
[0043] The semantic segmentation network encodes wire images with determined size loss, position loss, and wire detection loss using an encoder, and then decodes the encoder's encoding results using a decoder to obtain segmented image data.
[0044] In another aspect, the present invention also provides a computing device, comprising: one or more processors;
[0045] A processor is used to execute one or more programs;
[0046] When the one or more programs are executed by the one or more processors, the method described above is implemented.
[0047] In another aspect, the present invention also provides a computer-readable storage medium having a computer program stored thereon, which, when executed, implements the method described above.
[0048] Compared with the prior art, the beneficial effects of the present invention are as follows:
[0049] This invention provides a method for grasping electrical wires based on a robotic arm, comprising: acquiring image data of a wire to be detected using a camera on the robotic arm; determining the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm when acquiring the image data of the wire to be detected; inputting the image data as input data into a preset network and function for detection and segmentation to obtain segmented image data; extracting the coordinates of the wire to be detected from the segmented image data; transforming the initial coordinates of the wire to be detected into the coordinates of the wire to be detected in the base coordinate system of the robotic arm based on the transformation matrix; and grasping the wire to be detected based on those coordinates. This invention uses a robotic arm for operation, avoiding the dangers to operators caused by manual operation and saving labor costs.
Attached Figure Description
[0050] Figure 1 is a flowchart of the method of the present invention;
[0051] Figure 2 is a schematic diagram of the method of the present invention;
[0052] Figure 3 is a schematic diagram of the target detection network in the method of the present invention;
[0053] Figure 4 is a schematic diagram of the semantic segmentation network structure of the method of the present invention;
[0054] Figure 5 is a structural diagram of the system of the present invention.
Detailed Implementation
[0055] Exemplary embodiments of the invention will now be described with reference to the accompanying drawings. However, the invention may be embodied in many different forms and is not limited to the embodiments described herein. These embodiments are provided to fully and completely disclose the invention and to fully convey its scope to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the drawings is not intended to limit the invention. In the drawings, the same units / elements are referred to by the same reference numerals.
[0056] Unless otherwise stated, the terms used herein (including technical terms) have their common meaning as understood by one of ordinary skill in the art. Furthermore, it is understood that terms defined in commonly used dictionaries should be understood to have a meaning consistent with the context of their relevant field, and not to be interpreted as having an idealized or overly formal meaning.
[0057] Example 1:
[0058] This invention proposes a method for grasping electrical wires based on a robotic arm which, as shown in Figure 1, includes:
[0059] Step 1: Acquire image data of the wire to be inspected based on the camera of the robotic arm, and determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm when the robotic arm acquires the image data of the wire to be inspected.
[0060] Step 2: Input the image data as input data into a preset network and function for detection and segmentation to obtain segmented image data of the image data;
[0061] Step 3: Extract the coordinates of the wire to be detected from the segmented image data. Based on the transformation matrix, transform the initial coordinates of the wire to be detected into the coordinates of the wire to be detected in the base coordinate system of the robotic arm, and grab the wire to be detected according to the coordinates of the wire to be detected.
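The text does not say how the wire coordinates are extracted from the segmented image data. The sketch below illustrates one possible approach, assuming the segmentation result is a binary mask and that the grasp point is the wire pixel nearest to the centroid of all wire pixels. The names `wire_pixels`, `grasp_pixel`, and `example_mask` are hypothetical and not part of the invention.

```python
from typing import List, Tuple

Mask = List[List[int]]  # assumed format: 1 = wire pixel, 0 = background


def wire_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Return (u, v) = (column, row) coordinates of every wire pixel."""
    return [(u, v) for v, row in enumerate(mask) for u, value in enumerate(row) if value]


def grasp_pixel(mask: Mask) -> Tuple[int, int]:
    """Pick the wire pixel closest to the centroid of all wire pixels.

    Choosing the nearest wire pixel, rather than the raw centroid, keeps
    the grasp point on the wire even when the wire is curved.
    """
    pixels = wire_pixels(mask)
    if not pixels:
        raise ValueError("mask contains no wire pixels")
    cu = sum(u for u, _ in pixels) / len(pixels)
    cv = sum(v for _, v in pixels) / len(pixels)
    return min(pixels, key=lambda p: (p[0] - cu) ** 2 + (p[1] - cv) ** 2)


# Hypothetical 3x5 mask containing a straight horizontal wire.
example_mask = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
```

The returned value is a pixel coordinate; it still has to be converted into the base coordinate system of the robotic arm with the transformation matrix before grasping.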
[0062] Here, the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm at the time the robotic arm acquires the image data of the wire to be detected is determined with the TCP calibration method of hand-eye calibration technology.
[0063] The preset networks and functions include: object detection network, loss function and semantic segmentation network;
[0064] The target detection network is used to detect the wire image of the wire to be detected in the obtained image data;
[0065] The loss function is used to determine the positional and dimensional losses of the wire image;
[0066] The semantic segmentation network is used to segment the wire image based on the position loss, size loss, and wire detection loss to obtain segmented image data.
[0067] The target detection network includes: a region fusion network, a key point fusion network, and a result fusion network.
[0068] The region fusion network is used to extract local feature information from image data multiple times, and fuse the local feature information obtained from the multiple extractions to obtain the key feature information of the image data;
[0069] The key point fusion network is used to calibrate the key point information in the key feature information multiple times based on the calibration matrix, and fuse the key point information calibrated multiple times to obtain a feature map with key point feature information.
[0070] The result fusion network is used to fuse key point feature information in the feature map to narrow the search range of key point feature information and obtain the wire image.
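[0071] The formula for the loss function is as follows:
[0072] $$L_{loc} = \sum_i \mathrm{smooth}_{L1}(x_i - \hat{x}_i)$$
[0073] $$L_{size} = \sum_i \mathrm{smooth}_{L1}(s_i - \hat{s}_i)$$
[0074] $$L_{wire} = \lambda_{loc} L_{loc} + \lambda_{size} L_{size}$$
[0075] where $L_{loc}$ is the location loss function, $L_{size}$ is the size loss function, $L_{wire}$ is the wire detection loss function, $x_i$ is the location information of the true bounding box, $\hat{x}_i$ is the predicted location, $s_i$ and $\hat{s}_i$ are the true and predicted bounding box sizes, and $\lambda_{loc}$ and $\lambda_{size}$ are the weighting coefficients used to balance the positional and dimensional losses.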
[0076] The semantic segmentation network includes an encoder and a decoder.
[0077] The semantic segmentation network encodes wire images with determined size loss, position loss, and wire detection loss using an encoder, and then decodes the encoder's encoding results using a decoder to obtain segmented image data.
[0078] The method of the present invention will be further explained below with reference to specific examples:
[0079] The specific implementation principle, as shown in Figure 2, includes:
[0080] Step 1: Use hand-eye calibration technology to obtain the transformation matrix between the robot arm base coordinate system, camera coordinate system, and worktable coordinate system;
[0081] Step 2: Design a target detection network for detecting wires;
[0082] Step 3: Design the loss function required for detecting the wires;
[0083] Step 4: Design a semantic segmentation network for grasping wires;
[0084] Step 5: Convert the wire coordinates obtained from the depth camera into wire coordinates in the robot arm's base coordinate system to enable the robot arm to grasp the wire.
[0085] In step 1, the transformation matrix between the robot arm's base coordinate system, camera coordinate system, and worktable coordinate system is obtained with the TCP calibration method of hand-eye calibration technology, which comprises three parts: position calibration, rotation calibration, and load calibration. The TCP calibration method applies to both eye-in-hand and eye-to-hand configurations. The specific steps are as follows. First, position calibration is performed: in free drag mode, the TCP point is aligned with a fixed target point and four sets of pose points are collected. The relevant TCP calibration settings are then configured to automatically calculate the positions of the worktable and the camera relative to the robot arm's base coordinate system. Next, rotation calibration is performed by dragging the robot arm so that the gripper's Z-axis is parallel to the Z-axis of the robot arm's base coordinate system. Finally, load calibration is performed by randomly selecting four points to calculate the gripper's weight and center of gravity.
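Once calibration has produced the poses of the worktable and the camera, they can be represented as 4x4 homogeneous transformation matrices and chained. The sketch below is a minimal illustration in plain Python. The numeric poses are invented for the example, and the helper names (`make_transform`, `matmul`, `invert_rigid`) are hypothetical rather than part of any robot controller's API.

```python
import math
from typing import List

Matrix = List[List[float]]


def make_transform(yaw_rad: float, tx: float, ty: float, tz: float) -> Matrix:
    """4x4 homogeneous transform: rotation about Z by yaw plus a translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def invert_rigid(t: Matrix) -> Matrix:
    """Invert a rigid transform: R becomes R^T and p becomes -R^T p."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


# Hypothetical calibration results (metres, radians), not values from the patent.
T_base_table = make_transform(0.0, 0.40, 0.00, 0.00)            # worktable frame in base frame
T_table_camera = make_transform(math.pi / 2, 0.10, 0.20, 0.50)  # camera frame in worktable frame

# Chaining the two gives the camera frame expressed directly in the base frame.
T_base_camera = matmul(T_base_table, T_table_camera)
```

`invert_rigid` is included because a calibration routine may report a pose in the opposite direction (for example base in camera rather than camera in base), in which case the matrix has to be inverted before chaining.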
[0086] The target detection network designed in step 2 for detecting wires consists of three parts, as shown in Figure 3: the region fusion network, the keypoint fusion network, and the result fusion network. The image of the wire to be detected is fed sequentially through the region fusion network, the keypoint fusion network, and the result fusion network to obtain the detection result of whether a wire is present. Specifically, the region fusion network extracts features from the image by region to obtain local feature information. The input image is fed sequentially into a convolutional layer, an activation layer, and a ROI pooling layer; this operation is repeated four times, and the results of the four passes are fused. The ROI pooling layer crops specific regions and extracts the feature representation of each cropped region. The keypoint fusion network obtains the key feature information of the image from the output of the region fusion network. It first performs one convolution and one activation, then processes the image with a calibration matrix followed by a downsampling layer; this operation is performed twice, and the two results are added. To obtain more crucial and complete image feature information, the calibration matrix is used to find the correspondence between the point cloud and each point in the image, and fusion is then performed at point granularity. Following the calibration-matrix step with a downsampling layer reduces computational complexity, yields more semantically informative feature representations, and expands the receptive field, enabling the model to better understand the semantics and structure of the image. The result fusion network further processes the feature map: it applies a projection matrix to narrow the search range of key feature information, and then passes the result through a key point fusion network to make the detection results more accurate.
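To make the activation and ROI pooling steps of the region fusion network concrete, the toy sketch below applies a ReLU and then max-pools a cropped region to a fixed output grid. It is plain Python on nested lists, not the patented network: the convolution and the fusion of the four passes are omitted because their parameters and the fusion operator are not specified, and the names `relu`, `roi_max_pool`, and `feature_map` are hypothetical.

```python
from typing import List, Tuple

FeatureMap = List[List[float]]


def relu(fmap: FeatureMap) -> FeatureMap:
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in fmap]


def roi_max_pool(fmap: FeatureMap, roi: Tuple[int, int, int, int], out_size: int) -> FeatureMap:
    """Crop roi = (row0, col0, row1, col1), end-exclusive, and max-pool it
    to an out_size x out_size grid."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    if h < out_size or w < out_size:
        raise ValueError("ROI is smaller than the output grid")
    pooled = []
    for i in range(out_size):
        # Integer bin edges along the rows of the cropped region.
        rs, re = r0 + i * h // out_size, r0 + (i + 1) * h // out_size
        row = []
        for j in range(out_size):
            cs, ce = c0 + j * w // out_size, c0 + (j + 1) * w // out_size
            row.append(max(fmap[r][c] for r in range(rs, re) for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map, standing in for a convolution output.
feature_map = [
    [1.0, -2.0, 3.0, 0.0],
    [4.0, 5.0, -6.0, 1.0],
    [0.0, 1.0, 2.0, 9.0],
    [-1.0, 7.0, 8.0, 3.0],
]
pooled = roi_max_pool(relu(feature_map), (0, 0, 4, 4), 2)
```

Whatever the size of the cropped region, the output grid has a fixed size, which is what allows the results of several passes to be fused afterwards.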
[0087] The formula for the loss function designed in step 3 for detecting wires is as follows:
[0088] $$L_{loc} = \sum_i \mathrm{smooth}_{L1}(x_i - \hat{x}_i)$$
[0089] $$L_{size} = \sum_i \mathrm{smooth}_{L1}(s_i - \hat{s}_i)$$
[0090] $$L_{wire} = \lambda_{loc} L_{loc} + \lambda_{size} L_{size}$$
[0091] where $L_{loc}$ is the location loss function, used to measure the difference between the predicted bounding box and the ground truth bounding box; $x_i$ is the position information of the actual bounding box; $\hat{x}_i$ is the location predicted by the model; $L_{size}$ is the size loss function, which measures the difference between the predicted bounding box size and the actual bounding box size; $s_i$ is the actual size information of the bounding box; $\hat{s}_i$ is the size predicted by the model; $L_{wire}$ is the wire detection loss function, which is a weighted sum of the position loss and the size loss; and $\lambda_{loc}$ and $\lambda_{size}$ are the weighting coefficients used to balance the position loss and the size loss.
[0092] The L1 loss function, also known as the absolute value loss function, measures the absolute value of the difference between predicted and true values. For a pair consisting of a true value $x_i$ and a predicted value $\hat{x}_i$, the L1 loss function is calculated as follows:
[0093] $$L1(x_i, \hat{x}_i) = |x_i - \hat{x}_i|$$
[0094] It measures the absolute error between the two.
[0095] $\mathrm{smooth}_{L1}$ means that some smoothing or softening has been applied to the L1 loss; this softening is typically used to make the transition of the loss function between $x_i$ and $\hat{x}_i$ smoother, rather than increasing abruptly like the standard L1 loss.
[0096] Therefore, $\mathrm{smooth}_{L1}(x_i - \hat{x}_i)$ indicates that the difference between $x_i$ and $\hat{x}_i$ is processed with a softened L1 loss function. The softened L1 loss can reduce the impact of outliers on the training process and make the model more stable.
[0097] $i$ represents the index of each sample in the dataset that the loss function iterates through during training. The loss function calculates the loss for each sample, and then these losses are used to obtain the final loss value, which is used to optimize the model's parameters.
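The loss described above can be sketched in a few lines of Python. The text does not give the exact form of the smoothing, so the sketch assumes the commonly used smooth L1 definition (quadratic below a threshold `beta`, linear above it) and assumes the per-sample losses are summed rather than averaged. The function names and default weights are hypothetical.

```python
from typing import Sequence


def smooth_l1(diff: float, beta: float = 1.0) -> float:
    """Assumed standard form: 0.5 * d^2 / beta if |d| < beta, else |d| - 0.5 * beta."""
    a = abs(diff)
    return 0.5 * a * a / beta if a < beta else a - 0.5 * beta


def summed_smooth_l1(true: Sequence[float], pred: Sequence[float]) -> float:
    """Sum of smooth L1 losses over all samples i."""
    if len(true) != len(pred):
        raise ValueError("true and predicted sequences must have the same length")
    return sum(smooth_l1(t - p) for t, p in zip(true, pred))


def wire_loss(
    x_true: Sequence[float],
    x_pred: Sequence[float],
    s_true: Sequence[float],
    s_pred: Sequence[float],
    lambda_loc: float = 1.0,
    lambda_size: float = 1.0,
) -> float:
    """Weighted sum of the location loss and the size loss."""
    l_loc = summed_smooth_l1(x_true, x_pred)
    l_size = summed_smooth_l1(s_true, s_pred)
    return lambda_loc * l_loc + lambda_size * l_size
```

With this definition a small error of 0.5 contributes 0.125, while a large error of 3.0 contributes 2.5 rather than the 4.5 a squared loss would give, which is the reduced sensitivity to outliers mentioned in the text.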
[0098] The semantic segmentation network designed in step 4 for grasping wires consists of two parts, an encoder and a decoder, as shown in Figure 4. The input image is processed sequentially by the encoder and the decoder to obtain the final image segmentation result. The encoding structure includes one 1x1 convolution, one 1x1 pooling, two identical 3x3 convolutions, 3x3 pooling, ReLU activation function processing, 3x3 convolution, 3x3 pooling, one 1x1 convolution, and one 1x1 pooling. The decoding structure includes one 1x1 convolution, one 4x upsampling, one 3x3 convolution, one ReLU activation function processing, one 4x upsampling, one 3x3 convolution, and one ReLU activation function processing. Through a series of convolution and pooling operations, the encoder gradually extracts high-level semantic features from the input image while reducing the size of the feature map. This allows it to capture both local details and global contextual information in the image and to extract more representative features. Through upsampling and convolution operations, the decoder gradually restores the feature map output by the encoder to the original resolution of the input image. This reconstructs richer feature representations and yields predictions at the same resolution as the input image, which improves the accuracy of the segmentation results and the preservation of detail, thus achieving better pixel-level semantic segmentation.
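The effect of the decoder's two 4x upsampling stages on resolution can be illustrated with a short sketch. It assumes nearest-neighbour interpolation, which the text does not specify, and it leaves out the convolutions and activations between the two stages. `upsample_nearest`, `coarse`, and `restored` are hypothetical names.

```python
from typing import List

Grid = List[List[int]]


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Repeat every value `factor` times along both axes."""
    out: Grid = []
    for row in grid:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


# Hypothetical 2x2 encoder output (1 = wire, 0 = background).
coarse = [[0, 1], [1, 0]]

# Two successive 4x upsamplings enlarge each side by a factor of 16.
restored = upsample_nearest(upsample_nearest(coarse, 4), 4)
```

Because the two stages multiply to a factor of 16, the decoder output matches the input resolution only if the encoder reduces each side by the same overall factor.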
[0099] In step 5, the conversion of wire coordinates obtained from the depth camera into wire coordinates in the robot arm's base coordinate system has a specific order. The wire coordinates obtained from the depth camera are pixel coordinates relative to the camera coordinate system. First, the pixel coordinates must be converted into table coordinates relative to the table coordinate system. Then, the table coordinates must be converted into robot arm coordinates relative to the robot arm's base coordinate system. Only after this orderly conversion can the wire coordinates be used as actual coordinates for the robot arm's grasping.
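The ordered conversion can be sketched as follows. The example assumes a pinhole camera model with known intrinsics (`fx`, `fy`, `cx`, `cy`) to turn a pixel and its depth into a 3-D point, and assumes two 4x4 homogeneous matrices from calibration. None of these values or function names come from the source; they are placeholders for illustration.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Point = Tuple[float, ...]


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Point:
    """Assumed pinhole model: pixel (u, v) with depth in metres -> camera-frame point."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def transform_point(t: Matrix, p: Point) -> Point:
    """Apply a 4x4 homogeneous transform to a 3-D point."""
    x, y, z = p
    return tuple(t[i][0] * x + t[i][1] * y + t[i][2] * z + t[i][3] for i in range(3))


def pixel_to_base(u: float, v: float, depth: float, intrinsics: Sequence[float],
                  t_table_camera: Matrix, t_base_table: Matrix) -> Point:
    p_camera = backproject(u, v, depth, *intrinsics)
    p_table = transform_point(t_table_camera, p_camera)  # first: camera -> worktable
    return transform_point(t_base_table, p_table)        # then: worktable -> base


# Hypothetical setup: camera 0.5 m above the table, looking straight down.
INTRINSICS = (600.0, 600.0, 320.0, 240.0)
T_TABLE_CAMERA = [
    [1.0, 0.0, 0.0, 0.10],
    [0.0, -1.0, 0.0, 0.20],
    [0.0, 0.0, -1.0, 0.50],
    [0.0, 0.0, 0.0, 1.0],
]
T_BASE_TABLE = [
    [1.0, 0.0, 0.0, 0.40],
    [0.0, 1.0, 0.0, 0.00],
    [0.0, 0.0, 1.0, 0.00],
    [0.0, 0.0, 0.0, 1.0],
]
grasp_point = pixel_to_base(380.0, 240.0, 0.5, INTRINSICS, T_TABLE_CAMERA, T_BASE_TABLE)
```

In this made-up setup the pixel maps to a point on the table surface at roughly (0.55, 0.20, 0.00) metres in the base frame. Swapping the order of the two transforms would give a different and incorrect result, which is why the text stresses the ordering.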
[0100] The present invention can achieve the following beneficial effects:
[0101] (1) A well-designed target detection network enables faster and more accurate detection and identification of wires.
[0102] (2) By designing a loss function required for detecting wires, the convergence speed and performance of network training can be improved, the stability of wire detection can be improved, and the accuracy of wire localization can be enhanced.
[0103] (3) By using a well-designed semantic segmentation network to segment the edge information of the wire at the pixel level, the wire boundary can be located more accurately, thereby obtaining more accurate wire coordinates.
[0104] Example 2:
[0105] This invention proposes a wire grasping system 200 based on a robotic arm which, as shown in Figure 5, includes:
[0106] The image acquisition unit 201 is used to acquire image data of the wire to be inspected based on the camera of the robotic arm, and to determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm when the robotic arm acquires the image data of the wire to be inspected.
[0107] Image processing unit 202 is used to input the image data as input data into a preset network and function for detection and segmentation, so as to obtain segmented image data of the image data;
[0108] The information capturing unit 203 is used to extract the coordinates of the wire to be detected in the segmented image data, transform the initial coordinates of the wire to be detected into the coordinates of the wire to be detected in the base coordinate system of the robotic arm based on the transformation matrix, and capture the wire to be detected according to the coordinates of the wire to be detected.
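The division of work between units 201, 202, and 203 can be sketched as a small class that wires three responsibilities together. This is only a structural illustration: each unit is passed in as a plain callable, and the class name, field names, and type aliases are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Image = list   # placeholder type for image data
Mask = list    # placeholder type for segmented image data
Point = Tuple[float, float, float]


@dataclass
class WireGraspingSystem:
    """Connects units 201-203; every unit is supplied as a callable."""
    # Unit 201: returns the image and a mapping into the arm's base frame.
    acquire: Callable[[], Tuple[Image, Callable[[Point], Point]]]
    # Unit 202: detection and segmentation.
    process: Callable[[Image], Mask]
    # Unit 203: extracts the initial wire coordinates from the mask.
    locate: Callable[[Mask], Point]
    # Unit 203: commands the gripper to the target.
    grasp: Callable[[Point], None]

    def run(self) -> Point:
        image, to_base = self.acquire()
        mask = self.process(image)
        target = to_base(self.locate(mask))
        self.grasp(target)
        return target
```

Passing the units in as callables keeps the sketch independent of any particular camera driver, network implementation, or robot controller.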
[0109] Here, the image acquisition unit 201 uses the TCP calibration method of hand-eye calibration technology to determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm when the robotic arm acquires the image data of the wire to be detected.
[0110] The preset networks and functions include: object detection network, loss function and semantic segmentation network;
[0111] The target detection network is used to detect the wire image of the wire to be detected in the obtained image data;
[0112] The loss function is used to determine the positional and dimensional losses of the wire image;
[0113] The semantic segmentation network is used to segment the wire image based on the position loss, size loss, and wire detection loss to obtain segmented image data.
[0114] The target detection network includes: a region fusion network, a key point fusion network, and a result fusion network.
[0115] The region fusion network is used to extract local feature information from image data multiple times, and fuse the local feature information obtained from the multiple extractions to obtain the key feature information of the image data;
[0116] The key point fusion network is used to calibrate the key point information in the key feature information multiple times based on the calibration matrix, and fuse the key point information calibrated multiple times to obtain a feature map with key point feature information.
[0117] The result fusion network is used to fuse key point feature information in the feature map to narrow the search range of key point feature information and obtain the wire image.
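[0118] The formula for the loss function is as follows:
[0119] $$L_{loc} = \sum_i \mathrm{smooth}_{L1}(x_i - \hat{x}_i)$$
[0120] $$L_{size} = \sum_i \mathrm{smooth}_{L1}(s_i - \hat{s}_i)$$
[0121] $$L_{wire} = \lambda_{loc} L_{loc} + \lambda_{size} L_{size}$$
[0122] where $L_{loc}$ is the location loss function, $L_{size}$ is the size loss function, $L_{wire}$ is the wire detection loss function, $x_i$ is the location information of the true bounding box, $\hat{x}_i$ is the predicted location, $s_i$ and $\hat{s}_i$ are the true and predicted bounding box sizes, and $\lambda_{loc}$ and $\lambda_{size}$ are the weighting coefficients used to balance the positional and dimensional losses.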
[0123] The semantic segmentation network includes an encoder and a decoder.
[0124] The semantic segmentation network encodes wire images with determined size loss, position loss, and wire detection loss using an encoder, and then decodes the encoder's encoding results using a decoder to obtain segmented image data.
[0125] This invention uses a robotic arm for operation, avoiding the dangers to operators caused by manual operation and saving labor costs.
[0126] Example 3:
[0127] Based on the same inventive concept, this invention also provides a computer device, which includes a processor and a memory. The memory stores a computer program, which includes program instructions. The processor executes the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc. It is the computing and control core of the terminal, suitable for implementing one or more instructions, specifically suitable for loading and executing one or more instructions in the computer storage medium to implement corresponding method flows or corresponding functions, thereby implementing the steps of the methods in the above embodiments.
[0128] Example 4:
[0129] Based on the same inventive concept, this invention also provides a storage medium, specifically a computer-readable storage medium (Memory), which is a memory device in a computer device used to store programs and data. It is understood that the computer-readable storage medium here can include both the built-in storage medium in the computer device and extended storage media supported by the computer device. The computer-readable storage medium provides storage space that stores the terminal's operating system. Furthermore, this storage space also stores one or more instructions suitable for loading and execution by a processor. These instructions can be one or more computer programs (including program code). It should be noted that the computer-readable storage medium here can be high-speed RAM or non-volatile memory, such as at least one disk storage device. The processor can load and execute one or more instructions stored in the computer-readable storage medium to implement the steps of the method in the above embodiments.
[0130] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code. The solutions in the embodiments of the present invention can be implemented using various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
[0131] This invention is described with reference to flowcharts and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0132] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0133] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
[0134] Although preferred embodiments of the invention have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including both the preferred embodiments and all changes and modifications falling within the scope of the invention.
[0135] Obviously, those skilled in the art can make various modifications and variations to this invention without departing from its spirit and scope. Therefore, if these modifications and variations fall within the scope of the claims of this invention and their equivalents, this invention also intends to include these modifications and variations.
Claims
1. A method for grasping electrical wires based on a robotic arm, characterized in that the wire grasping method includes: acquiring image data of the wire to be detected using the camera of the robotic arm, and determining the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm at the time the robotic arm acquires the image data of the wire to be detected; inputting the image data into a preset network and function for detection and segmentation to obtain segmented image data; extracting the initial coordinates of the wire to be detected from the segmented image data, transforming these initial coordinates, based on the transformation matrix, into the coordinates of the wire to be detected in the base coordinate system of the robotic arm, and grasping the wire to be detected according to the transformed coordinates; wherein the preset network and function include: a target detection network, a loss function, and a semantic segmentation network; the target detection network is used to detect the wire image of the wire to be detected in the acquired image data; the loss function is used to determine the position loss and size loss of the wire image; the semantic segmentation network is used to segment the wire image based on the position loss, size loss, and wire detection loss to obtain the segmented image data; the formula for the loss function is as follows: [formula not reproduced in the source text]; where the symbols, likewise not reproduced, denote in order: the position loss function, the size loss function, the wire detection loss function, the position information of the true bounding box, the predicted position, and the weighting coefficients used to balance the position loss and the size loss.
2. The wire grasping method according to claim 1, characterized in that the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm at the time the robotic arm acquires the image data of the wire to be detected is determined using the TCP calibration method of hand-eye calibration technology.
3. The wire grasping method according to claim 1, characterized in that the target detection network includes: a region fusion network, a key point fusion network, and a result fusion network; the region fusion network is used to extract local feature information of the image data multiple times and to fuse the local feature information obtained from the multiple extractions to obtain the key feature information of the image data; the key point fusion network is used to calibrate the key point information in the key feature information multiple times based on the calibration matrix and to fuse the repeatedly calibrated key point information to obtain a feature map with key point feature information; the result fusion network is used to fuse the key point feature information in the feature map to narrow the search range of the key point feature information and obtain the wire image.
4. The wire grasping method according to claim 1, characterized in that the semantic segmentation network includes an encoder and a decoder; the semantic segmentation network encodes, through the encoder, the wire image for which the size loss, position loss, and wire detection loss have been determined, and then decodes the encoding result through the decoder to obtain the segmented image data.
5. A wire grasping system based on a robotic arm, characterized in that the wire grasping system includes: an image acquisition unit, used to acquire image data of the wire to be detected using the camera of the robotic arm, and to determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm at the time the robotic arm acquires the image data of the wire to be detected; an image processing unit, used to input the image data into a preset network and function for detection and segmentation to obtain segmented image data; an information capturing unit, used to extract the initial coordinates of the wire to be detected from the segmented image data, transform these initial coordinates, based on the transformation matrix, into the coordinates of the wire to be detected in the base coordinate system of the robotic arm, and grasp the wire to be detected according to the transformed coordinates; wherein the preset network and function include: a target detection network, a loss function, and a semantic segmentation network; the target detection network is used to detect the wire image of the wire to be detected in the acquired image data; the loss function is used to determine the position loss and size loss of the wire image; the semantic segmentation network is used to segment the wire image based on the position loss, size loss, and wire detection loss to obtain the segmented image data; the formula for the loss function is as follows: [formula not reproduced in the source text]; where the symbols, likewise not reproduced, denote in order: the position loss function, the size loss function, the wire detection loss function, the position information of the true bounding box, the predicted position, and the weighting coefficients used to balance the position loss and the size loss.
6. The wire grasping system according to claim 5, characterized in that the image acquisition unit uses the TCP calibration method of hand-eye calibration technology to determine the transformation matrix between the base coordinate system of the robotic arm, the camera coordinate system of the robotic arm, and the worktable coordinate system of the robotic arm at the time the robotic arm acquires the image data of the wire to be detected.
7. The wire grasping system according to claim 5, characterized in that the target detection network includes: a region fusion network, a key point fusion network, and a result fusion network; the region fusion network is used to extract local feature information of the image data multiple times and to fuse the local feature information obtained from the multiple extractions to obtain the key feature information of the image data; the key point fusion network is used to calibrate the key point information in the key feature information multiple times based on the calibration matrix and to fuse the repeatedly calibrated key point information to obtain a feature map with key point feature information; the result fusion network is used to fuse the key point feature information in the feature map to narrow the search range of the key point feature information and obtain the wire image.
8. The wire grasping system according to claim 5, characterized in that the semantic segmentation network includes an encoder and a decoder; the semantic segmentation network encodes, through the encoder, the wire image for which the size loss, position loss, and wire detection loss have been determined, and then decodes the encoding result through the decoder to obtain the segmented image data.
9. A computer device, characterized in that it includes: one or more processors, the processors being used to execute one or more programs; when the one or more programs are executed by the one or more processors, the method according to any one of claims 1-4 is implemented.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed, implements the method according to any one of claims 1-4.
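The coordinate transformation step recited in claims 1 and 5 (mapping the wire's initial image coordinates into the base coordinate system of the robotic arm) can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes a pinhole camera model, a known depth for the selected wire pixel, and a 4x4 homogeneous camera-to-base matrix of the kind a hand-eye calibration would yield. The function names, intrinsics, and matrix values (`pixel_to_camera`, `transform_point`, `wire_point_in_base`, `INTRINSICS`, `T_BASE_CAMERA`) are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def pixel_to_camera(u: float, v: float, depth: float,
                    fx: float, fy: float, cx: float, cy: float) -> Point:
    """Back-project pixel (u, v) at a known depth into the camera frame.

    Assumption: ideal pinhole model without lens distortion.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def transform_point(T: Sequence[Sequence[float]], p: Point) -> Point:
    """Apply a 4x4 homogeneous transform T to a 3D point p."""
    ph = (p[0], p[1], p[2], 1.0)
    x, y, z = (sum(T[r][c] * ph[c] for c in range(4)) for r in range(3))
    return (x, y, z)


def wire_point_in_base(u: float, v: float, depth: float,
                       intrinsics: Tuple[float, float, float, float],
                       T_base_camera: Sequence[Sequence[float]]) -> Point:
    """Map a segmented wire pixel to a point in the arm's base frame."""
    fx, fy, cx, cy = intrinsics
    p_camera = pixel_to_camera(u, v, depth, fx, fy, cx, cy)
    return transform_point(T_base_camera, p_camera)


# Hypothetical calibration result: camera looks straight down (rotated
# 180 degrees about its x axis), mounted 0.30 m forward of and 0.50 m
# above the base origin.
T_BASE_CAMERA = [
    [1.0, 0.0, 0.0, 0.30],
    [0.0, -1.0, 0.0, 0.00],
    [0.0, 0.0, -1.0, 0.50],
    [0.0, 0.0, 0.0, 1.0],
]

# Hypothetical intrinsics (fx, fy, cx, cy) in pixels.
INTRINSICS = (600.0, 600.0, 320.0, 240.0)

# A wire pixel taken from the segmentation mask, with an assumed depth in metres.
grasp = wire_point_in_base(420.0, 240.0, 0.40, INTRINSICS, T_BASE_CAMERA)
print(grasp)
```

In this sketch the transformation is a single matrix product, so a worktable coordinate system could be handled the same way by chaining a second 4x4 matrix through `transform_point`.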