A method and system for monitoring the safety of lifting equipment

By using machine vision technology to identify and analyze key areas of lifting equipment, combined with safety analysis models and personnel identification, the problem of insufficient identification accuracy due to human factors and complex backgrounds in the safety monitoring of large lifting equipment has been solved, achieving high accuracy and real-time safety monitoring.

CN122244761A · Pending · Publication Date: 2026-06-19 · VITAL INT ELEVATORING EQUIP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
VITAL INT ELEVATORING EQUIP
Filing Date
2026-03-23
Publication Date
2026-06-19

Abstract

This invention provides a method and system for safety monitoring of lifting equipment. The method includes the following steps: S1, acquiring real-time video monitoring data of the lifting equipment's operating area; S2, extracting key areas from the acquired video monitoring data and marking these key areas, which include the hook area, the load area, and the sling area; S3, extracting the image to be analyzed from the video monitoring data based on the marked key areas; and S4, using a trained safety analysis model to perform safety analysis processing on the image to be analyzed, obtaining the safety analysis results. This invention helps improve the overall performance of machine vision-based safety monitoring of lifting equipment.

Description

Technical Field

[0001] This invention relates to the field of safety monitoring technology for lifting equipment, and in particular to a method and system for safety monitoring of lifting equipment.

Background Technology

[0002] Large lifting equipment (such as tower cranes) is a core piece of equipment in scenarios such as building construction, steel structure installation, precast component hoisting, port loading and unloading, and large equipment placement. Because large lifting equipment is prone to safety hazards in various aspects during operation, a comprehensive safety monitoring strategy is necessary.

[0003] Currently, safety monitoring of large lifting equipment typically involves collaboration between operators at the tower top and ground safety personnel to identify and prevent potential safety hazards during tower crane operations. However, manual monitoring is inherently susceptible to human error (for example, safety personnel need to monitor the hook, the area under the boom, and the surrounding area simultaneously), which can compromise the reliability of the monitoring.

[0004] Existing technologies also include intelligent monitoring solutions, based on machine vision, for the safety of large lifting equipment during operation. However, current image-recognition-based safety monitoring and assessment models for lifting equipment still lack sufficient accuracy in safety identification under complex background scenarios. Alternatively, solutions equipped with more advanced and complex computing models cannot meet the real-time requirements of safety monitoring. As a result, safety monitoring of large lifting equipment in operation fails to meet the needs of practical applications.

Summary of the Invention

[0005] To address the aforementioned problems, this invention aims to provide a method and system for safety monitoring of lifting equipment.

[0006] The objective of this invention is achieved through the following technical solution. In a first aspect, a method for safety monitoring of lifting equipment is proposed, including the following steps: S1, acquiring real-time video monitoring data of the operating area of the lifting equipment; S2, extracting key areas from the acquired video monitoring data and marking these key areas, which include the hook area, the load area, and the sling area; S3, extracting the image to be analyzed from the video monitoring data based on the marked key areas; and S4, using a trained safety analysis model to perform safety analysis on the image to be analyzed and obtain the safety analysis results.

[0007] Preferably, step S2 includes: S21, extracting continuous frame data from the acquired video monitoring data, where each frame corresponds to a time t; S22, extracting the hook area and the load area from the current frame; S23, based on the obtained hook area and load area, further extracting the sling area; and S24, composing the key region of the current frame from the obtained hook area, load area, and sling area.

[0008] Preferably, step S21 further includes: preprocessing the obtained image frames to eliminate noise interference in the image frames.

[0009] Preferably, step S22 includes: based on the current frame, using a template matching method to identify the hook and the load in the video monitoring image, and marking them as the hook area and the load area, respectively.

[0010] Preferably, step S23 includes: based on the hook area and the load area in the current frame, extracting the center point coordinates of the hook area and of the load area, respectively; using the center point of the hook area as the anchor point, applying N pre-designed directional templates to obtain the corresponding compensation areas; and filtering the pixels within each compensation area with a directional filter matching the direction of that compensation area, to obtain the directional feature value of each pixel within the compensation area. The directional feature value calculation function used is: [missing formula]

[0011] In this function, the terms represent: the directional feature value of a pixel, where the pixel belongs to the n-th compensation region at the current moment; the directional filter corresponding to the n-th compensation region; and the output value obtained when that directional filter is applied to the pixel. Based on how the pixel values within the compensation area change over a period of time, the change feature value of each pixel within the compensation area is obtained. The change feature value calculation function used is: [missing formula]

[0012] In this function, the terms represent: the change feature value of a pixel; the high update parameter and the low update parameter of the pixel, which are computed with a high update factor and a low update factor, respectively; the time variable, one value of which corresponds to the current moment; the total number of sampling moments over the given period; the grayscale value of the pixel; and the set normalization threshold. Based on the obtained directional feature values and change feature values, the sling feature value of each pixel within the compensation area is calculated. The sling feature value calculation function used is: [missing formula]

[0013] In this function, the terms represent: the sling feature value of a pixel; the normalization function; the set compensation factor; the directional feature value of the pixel; the change feature value of the pixel; the grayscale value of the pixel; the low update parameter of the pixel; and the set adjustment factor. The obtained sling feature value is compared with a set threshold, and when the comparison condition is met, the corresponding pixel is marked as a sling edge pixel. Connectivity processing is then performed on the marked sling edge pixels to obtain the sling region.

[0014] Preferably, step S3 includes: S31, based on the obtained key region, extracting the surrounding area and the projection area corresponding to the load area; S32, extracting the surrounding areas corresponding to the hook area and the sling area; and S33, obtaining the region of interest from the key region, the surrounding area and projection area of the load area, the surrounding area of the hook area, and the surrounding area of the sling area.

[0015] Preferably, step S4 includes: Based on the obtained image to be analyzed, a personnel recognition model built on YOLOv4 is used to detect and identify personnel in the image. When a person is identified in the image, it is determined that there is a personnel safety hazard, and the corresponding personnel safety anomaly analysis result is output.

[0016] Preferably, the method further includes: S5, generating corresponding warning messages or braking control commands based on the safety analysis results.

[0017] In a second aspect, a safety monitoring system for lifting equipment is proposed, including a processor; the processor is used to execute the lifting equipment safety monitoring method described in any embodiment of the first aspect above.

[0018] The beneficial effects of this invention are as follows. It proposes a safety monitoring method for lifting equipment in which machine vision performs safety analysis on acquired video monitoring data of the lifting equipment's operating area, thereby accurately monitoring the safety status of the lifting equipment and preventing safety accidents. Specifically, an image analysis method is proposed that accurately acquires the sling area from the video monitoring data. Accurately detecting the sling area in the image and extracting the image to be analyzed for targeted safety analysis effectively improves the accuracy and anti-interference ability of monitoring the safety status near the sling area of the lifting equipment, thus improving the overall performance of machine vision in safety monitoring of lifting equipment.

[0019] This invention also proposes a technical solution for targeted extraction of the sling portion based on monitoring images, thereby accurately extracting the sling area, reducing background interference, and effectively avoiding interference from background information when performing relevant safety analysis based on the sling area, thus improving the accuracy and robustness of safety monitoring.

[0020] Meanwhile, the safety monitoring method proposed in this invention can also ensure accuracy while taking into account real-time performance, thereby meeting the needs of on-site safety monitoring of lifting equipment.

Attached Figure Description

[0021] The present invention will be further described with reference to the accompanying drawings, but the embodiments in the drawings do not constitute any limitation on the present invention. For those skilled in the art, other drawings can be obtained based on the following drawings without creative effort.

[0022] Figure 1 is a flowchart of a method for safety monitoring of lifting equipment according to an embodiment of the present invention; Figure 2 is a structural diagram of a safety monitoring system for lifting equipment according to an embodiment of the present invention.

Detailed Implementation

[0023] The present invention will be further described in conjunction with the following application scenarios.

[0024] Referring to Figure 1, the embodiment illustrates a method for safety monitoring of lifting equipment, comprising the following steps: S1, acquiring real-time video monitoring data of the operating area of the lifting equipment; S2, extracting key areas from the acquired video monitoring data and marking these key areas, which include the hook area, the load area, and the sling area; S3, extracting the image to be analyzed from the video monitoring data based on the marked key areas; and S4, using a trained safety analysis model to perform safety analysis on the image to be analyzed and obtain the safety analysis results.

[0025] The present invention proposes a method for safety monitoring of lifting equipment in which machine vision performs safety analysis on acquired video monitoring data of the lifting equipment's operating area, thereby accurately monitoring the safety status of the lifting equipment and preventing accidents. Specifically, an image analysis method is proposed that accurately identifies the sling area from the video monitoring data. Accurately detecting the sling area in the image and extracting the image to be analyzed for targeted safety analysis effectively improves the accuracy and anti-interference ability of safety monitoring near the sling area of the lifting equipment, thus enhancing the effectiveness and robustness of machine vision in lifting equipment safety monitoring.

[0026] Traditional machine vision real-time safety monitoring solutions often overlook the sling section when analyzing lifting equipment. Even when the sling section is included, its extraction is frequently affected by background information, leading to interference from nearby background features and potential misjudgments, thus impacting the effectiveness of machine vision safety monitoring. Maintaining accuracy despite background interference requires complex, high-performance image analysis models, increasing data processing load and compromising real-time performance. Therefore, this invention proposes a targeted extraction technique for the sling section based on monitoring images. This accurately extracts the sling region, reduces background interference, and effectively avoids background information interference during safety analysis, improving both accuracy and robustness.

[0027] Meanwhile, the safety monitoring method proposed in this invention can ensure both accuracy and real-time performance, thus meeting the needs of on-site safety monitoring of lifting equipment. It achieves higher accuracy when using an image processing model with equivalent performance, and compared to solutions achieving the same accuracy, it has a faster processing speed, improving the overall performance of machine vision-based safety monitoring of lifting equipment.

[0028] Preferably, step S1 includes: acquiring real-time video monitoring data of the crane's operating area, collected by an industrial high-definition camera mounted on the crane tower and pointing downwards at the hook.

[0029] In another scenario, cameras can be positioned at fixed locations chosen according to the operating area of the lifting equipment, collecting video monitoring data from a distance as the lifting equipment operates over or within the entire operating area. For example, cameras can be mounted on temporary rooftops or construction scaffolding platforms, capturing images of the lifting equipment's operating area at a 30-60 degree downward angle.

[0030] Optionally, the industrial HD camera has a frame rate of 30 frames per second and a resolution of 1080p.

[0031] The specific design of the camera adopts a similar approach to existing technologies, without requiring stringent setup conditions.

[0032] Preferably, step S2 includes: S21, extracting continuous frame data from the acquired video monitoring data, where each frame corresponds to a time t; S22, extracting the hook area and the load area from the current frame; S23, based on the obtained hook area and load area, further extracting the sling area; and S24, composing the key region of the current frame from the obtained hook area, load area, and sling area.
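As an illustration of step S24, the following minimal sketch represents each marked area as a set of (row, column) pixels and composes the key region as their union. The `KeyRegion` class, the `box_pixels` helper, and the sample coordinates are hypothetical names and values introduced only for this example; the source does not prescribe a data structure.

```python
from dataclasses import dataclass


def box_pixels(top, left, height, width):
    """Pixels (row, col) of an axis-aligned box, a stand-in for a detected area."""
    return frozenset(
        (r, c)
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


@dataclass(frozen=True)
class KeyRegion:
    """Hypothetical container for the three areas marked in step S2."""
    hook: frozenset
    load: frozenset
    sling: frozenset

    def pixels(self):
        """S24: the key region is the union of the hook, load, and sling areas."""
        return self.hook | self.load | self.sling


# Illustrative values only: a small hook box, a load box below it,
# and a one-pixel-wide sling joining them.
region = KeyRegion(
    hook=box_pixels(0, 4, 2, 2),
    load=box_pixels(6, 2, 3, 6),
    sling=frozenset((r, 4) for r in range(2, 6)),
)
```

Keeping the three areas separate inside the container lets later steps derive a surrounding area for each of them individually.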

[0033] The above-described embodiments of the present invention, based on the acquired video monitoring data, further extract information from the image to accurately identify the key areas of the lifting equipment in the image, laying the foundation for subsequent determination of the monitoring area and extraction of the image to be analyzed. By extracting the key areas, the position of the lifting equipment in the image can be accurately detected, thereby establishing the area requiring safety monitoring based on this correspondence. This effectively avoids interference caused by unnecessary background information and improves the effectiveness of safety monitoring.

[0034] Among the key aspects requiring safety monitoring of lifting equipment are the hook, load, and sling. During operation, the swinging or lifting of the equipment can pose safety hazards, such as sweeping collisions or crushing injuries to personnel. Therefore, extracting the areas that the hook, load, and sling may sweep during movement is crucial. For sling sweep areas, identifying the sling's location is a prerequisite. This invention specifically identifies and extracts sling sweep areas from images, improving the accuracy of sling sweep area detection.

[0035] Preferably, step S21 further includes: preprocessing the obtained image frames to eliminate noise interference in the image frames.

[0036] Among these, the obtained video monitoring data can be preprocessed, such as denoising, to improve image quality and clarity.

[0037] Preferably, step S22 includes: based on the current frame, using a template matching method to identify the hook and the load in the video monitoring image, and marking them as the hook area and the load area, respectively.

[0038] In the initialization phase, by pre-recording image templates of the hook and the load, the target detection of the current frame can be performed based on the image templates, thereby accurately obtaining the location of the hook and the load in the frame image and marking the corresponding areas.
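The template-based detection described above can be sketched as an exhaustive search that slides a pre-recorded template over the frame. The source does not state which similarity measure is used; the sum of absolute differences (SAD) below is an assumption chosen for brevity, and `match_template` is a hypothetical helper operating on grayscale images stored as nested lists.

```python
def match_template(frame, template):
    """Return ((row, col), score) of the best match; a lower SAD score is better.

    Assumption: similarity is the sum of absolute grayscale differences.
    """
    frame_h, frame_w = len(frame), len(frame[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best_pos, best_score = None, float("inf")
    for r in range(frame_h - tmpl_h + 1):
        for c in range(frame_w - tmpl_w + 1):
            score = sum(
                abs(frame[r + i][c + j] - template[i][j])
                for i in range(tmpl_h)
                for j in range(tmpl_w)
            )
            if score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Illustrative data: a blank 6x6 frame with the template pasted at row 2, column 3.
hook_template = [[9, 1], [1, 9]]
frame = [[0] * 6 for _ in range(6)]
for i in range(2):
    for j in range(2):
        frame[2 + i][3 + j] = hook_template[i][j]

position, score = match_template(frame, hook_template)
```

The returned top-left position together with the template size gives the box that would be marked as the hook area or load area.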

[0039] In another scenario, based on the current frame, the hook and load can also be identified and tracked by recognizing moving foreground targets, thereby obtaining the hook area and load area in the current frame.

[0040] Since hooks and loads have obvious features in images, existing methods such as template matching can be used to identify them, thereby accurately identifying their positions and laying the foundation for subsequent safety analysis. This also serves as the basis for sling identification.

[0041] Preferably, step S23 includes: based on the hook area and the load area in the current frame, extracting the center point coordinates of the hook area and of the load area, respectively; using the center point of the hook area as the anchor point, applying N pre-designed directional templates to obtain the corresponding compensation areas; and filtering the pixels within each compensation area with a directional filter matching the direction of that compensation area, to obtain the directional feature value of each pixel within the compensation area. The directional feature value calculation function used is: [missing formula]

[0042] In this function, the terms represent: the directional feature value of a pixel, where the pixel belongs to the n-th compensation region at the current moment; the directional filter corresponding to the n-th compensation region; and the output value obtained when that directional filter is applied to the pixel. Based on how the pixel values within the compensation area change over a period of time, the change feature value of each pixel within the compensation area is obtained. The change feature value calculation function used is: [missing formula]

[0043] In this function, the terms represent: the change feature value of a pixel; the high update parameter and the low update parameter of the pixel, which are computed with a high update factor and a low update factor, respectively; the time variable, one value of which corresponds to the current moment; the total number of sampling moments over the given period; the grayscale value of the pixel; and the set normalization threshold. Based on the obtained directional feature values and change feature values, the sling feature value of each pixel within the compensation area is calculated. The sling feature value calculation function used is: [missing formula]

[0044] In this function, the terms represent: the sling feature value of a pixel; the normalization function; the set compensation factor; the directional feature value of the pixel; the change feature value of the pixel; the grayscale value of the pixel; the low update parameter of the pixel; and the set adjustment factor. The obtained sling feature value is compared with a set threshold, and when the comparison condition is met, the corresponding pixel is marked as a sling edge pixel. Connectivity processing is then performed on the marked sling edge pixels to obtain the sling region.

[0045] The direction templates are designed with N directions, where N = 8, 12, or 16, depending on the required precision and the available computing power. For example, if N = 8, the direction templates point to {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}; if N = 12, they point to {0°, 30°, 60°, 90°, 120°, 150°, 180°, 210°, 240°, 270°, 300°, 330°}. Each direction template is a strip-shaped region pointing in the specified direction. The width of the strip is D = 5, 7, 9, ..., 15 pixels, and the length L of the strip is set empirically, for example L = 100, 150, 200, or 300, depending on the precision requirements and computing power. The starting point of the direction template is aligned with the anchor point, and the area covered by the template extending in the corresponding direction is used as the compensation area. In one scenario, to further improve the accuracy of the compensation area, a maximum length can be set for the compensation area. Alternatively, based on the center point coordinates of the obtained load area, a threshold line (usually a horizontal line parallel to the x-axis, or a horizontal line set according to another coordinate system) can be set to limit the lower boundary of the compensation area.
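The construction of the compensation areas can be sketched as follows: N evenly spaced directions are generated, and for each direction the pixels of a strip of width D and length L starting at the anchor point are collected. The function names, the (row, column) pixel convention, the angle convention (measured from the rightward column axis toward increasing rows), and the optional `row_limit` parameter standing in for the threshold line are all assumptions made for this example.

```python
import math


def direction_angles(n):
    """N evenly spaced template directions in degrees, starting at 0."""
    return [k * 360.0 / n for k in range(n)]


def compensation_area(anchor, angle_deg, length, width, shape, row_limit=None):
    """Pixels (row, col) of a strip that starts at anchor and extends along angle_deg.

    Assumptions: angle 0 points right, 90 points down (image rows grow downwards);
    row_limit, if given, is a horizontal threshold line bounding the area from below.
    """
    anchor_row, anchor_col = anchor
    height, image_width = shape
    ux = math.cos(math.radians(angle_deg))  # unit vector, column component
    uy = math.sin(math.radians(angle_deg))  # unit vector, row component
    half, reach, eps = width / 2.0, length + width, 1e-9
    pixels = set()
    for row in range(max(0, anchor_row - reach), min(height, anchor_row + reach + 1)):
        if row_limit is not None and row > row_limit:
            continue
        for col in range(max(0, anchor_col - reach), min(image_width, anchor_col + reach + 1)):
            d_row, d_col = row - anchor_row, col - anchor_col
            along = d_col * ux + d_row * uy    # distance along the template direction
            across = -d_col * uy + d_row * ux  # signed distance from the center line
            if -eps <= along <= length + eps and abs(across) <= half + eps:
                pixels.add((row, col))
    return pixels


# Illustrative values: a strip of width 3 and length 5 pointing straight down.
down_strip = compensation_area((10, 10), 90.0, length=5, width=3, shape=(30, 30))
```

The small tolerance `eps` only guards against floating-point rounding at the strip boundaries; it is not part of the described method.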

[0046] The directional filter is designed as follows: for a compensation region with a given direction angle, the directional filter corresponding to that angle is adopted; either of two filter types, [missing value] or [missing value], may be used. The directional filter accurately detects texture features extending in the same direction as the compensation region, which improves the adaptability of sling feature and sling region detection.

[0047] In one scenario, when acquiring pixel value change features over a past period, the period can be the past 0.5 seconds or 1 second, and the number of sampling moments can be 15 or 30. The specific period depends on the actual operation of the equipment and on the time the sling needs to pass through the same pixel location.

[0048] In one scenario, the pixel value of a pixel can be selected from its grayscale value.

[0049] In one scenario, the normalization threshold is set empirically.

[0050] In one scenario, the high update parameter and the low update parameter of each pixel can be obtained by continuously updating and maintaining them in the background (at fixed intervals or continuously) and invoking them when needed, which reduces the processing load required for analysis.

[0051] In one scenario, a sigmoid function can be used as the normalization function.
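The background maintenance of the two parameters can be sketched under a loudly stated assumption: the source formula is not available here, so the example uses two exponential running averages of the grayscale value, one with a large update factor and one with a small update factor, and takes their normalized absolute difference as a stand-in for the change feature value. The class name, the factor values 0.5 and 0.05, and the threshold 50 are hypothetical.

```python
class PixelUpdateState:
    """Fast and slow running averages of one pixel's grayscale value.

    Assumption: this exponential-average form is illustrative only and is not
    taken from the source, whose update formula is not reproduced.
    """

    def __init__(self, initial_gray, high_factor=0.5, low_factor=0.05):
        self.high = float(initial_gray)  # follows changes quickly
        self.low = float(initial_gray)   # follows changes slowly
        self.high_factor = high_factor
        self.low_factor = low_factor

    def update(self, gray):
        """Fold one new grayscale sample into both running averages."""
        self.high += self.high_factor * (gray - self.high)
        self.low += self.low_factor * (gray - self.low)

    def change_value(self, norm_threshold=50.0):
        """Normalized gap between the fast and slow averages, clipped to [0, 1]."""
        return min(abs(self.high - self.low) / norm_threshold, 1.0)


def run(samples, initial_gray=100):
    """Feed a sequence of grayscale samples and return the final change value."""
    state = PixelUpdateState(initial_gray)
    for gray in samples:
        state.update(gray)
    return state.change_value()


static_background = run([100] * 10)    # nothing changes
sling_arrives = run([40] * 5)          # pixel darkens and stays dark
brief_flash = run([255] + [100] * 5)   # single bright frame, then back to normal
```

With these illustrative settings, a pixel that changes and stays changed keeps a large gap between the two averages for several frames, while a one-frame flash decays quickly, which matches the qualitative behavior the description attributes to the change feature value.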

[0052] In one scenario, connectivity processing of the marked sling edge pixels can be achieved by first dilating and then eroding the sling edge pixels to obtain the sling region. Alternatively, connectivity repair can be performed on the sling edge pixels, and the area enclosed by the repaired sling edge pixels is marked as the sling region.
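Dilating and then eroding is the morphological closing operation. A minimal sketch on sets of (row, column) pixels is given below; the square structuring element, the `radius` parameter, and the function names are assumptions chosen for this example.

```python
def dilate(pixels, radius=1):
    """Grow a pixel set by a square structuring element of the given radius."""
    offsets = range(-radius, radius + 1)
    return {(r + dr, c + dc) for (r, c) in pixels for dr in offsets for dc in offsets}


def erode(pixels, radius=1):
    """Keep only pixels whose whole square neighbourhood lies inside the set."""
    offsets = range(-radius, radius + 1)
    return {
        (r, c)
        for (r, c) in pixels
        if all((r + dr, c + dc) in pixels for dr in offsets for dc in offsets)
    }


def close_gaps(edge_pixels, radius=1):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(edge_pixels, radius), radius)


# Illustrative data: a vertical sling edge in column 5 with one missing pixel at row 4.
broken_edge = {(r, 5) for r in range(10) if r != 4}
repaired_edge = close_gaps(broken_edge)
```

In this example the closing fills the one-pixel break without thickening the line, since the erosion removes the extra width that the dilation added.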

[0053] Because there is background between the hook and the load, traditional image processing techniques typically treat the whole area between the hook and the load as the key area of the lifting equipment. However, this key area contains a significant amount of background interference, which reduces accuracy in subsequent safety analysis. Furthermore, traditional sling recognition techniques suffer from low accuracy because the sling's features in the image are indistinct (the sling's pixel width is typically small, similar to background interference) and because the background interference near the sling is uncertain. As a result, sling identification is insufficiently accurate, especially when there is more than one sling between the hook and the load: background exists between the slings, which prevents direct identification based on the hook and the load.

[0054] The present invention proposes a targeted sling identification scheme. First, the identified hook is used as an anchor point in the image. With this anchor point as a reference, directional templates emanating from the anchor point are used to initially delineate compensation regions where slings may exist. Then, using the angular characteristics of the sling (which extends from the hook in a certain direction), edge features consistent with the direction of the compensation region are extracted to obtain directional feature values for the pixels. The higher the consistency between the edge direction in the compensation region and the direction of the compensation region, the larger the directional feature value. These directional feature values allow the edge features of the sling to be effectively separated from interference features in the background. Furthermore, because the directional feature value alone cannot accurately isolate the sling, a specially designed change feature value is incorporated. It takes into account the characteristics of background interference and the slow movement or possible dwell time of the sling in actual operation, and further improves the accuracy with which the sling is isolated.

[0055] For relatively fixed or slightly changing backgrounds, the difference between the high and low update parameters is small, resulting in small change feature values. When the sling passes slowly, the change feature value increases and reaches its maximum after a short lag (allowing accurate analysis of the rear edge). For rapidly changing background interference (e.g., lights, welding sparks), the change feature value briefly increases but cannot sustain a large increase. For backgrounds that also change slowly, the change feature value increases, but the directional feature value is small, so the sling can still be selected accurately. Based on the directional and change feature values of each pixel, a sling feature value calculation function is proposed that combines directional and change features for edge selection. An adjustment term is also included to further suppress strong light points and noise points, so that sling edge features are accurately extracted and selected.

[0056] Further connectivity processing is performed based on the obtained sling features to accurately obtain the sling region.

[0057] The sling area extraction scheme proposed in the above embodiments can accurately extract the sling area. This provides a foundation for further acquiring the surrounding areas that may be swept, based on the extracted sling area, and for conducting safety analysis of those surrounding areas.

[0058] The sling area extracted using the above method can accurately identify the sling position in the image, especially in lifting scenarios with multiple slings, avoiding interference and misjudgment caused by background parts between slings in subsequent safety monitoring (traditional image processing models are prone to misjudging abnormal interference in the background area between slings as a safety anomaly, when in fact it is a background anomaly), thus improving the targeting of subsequent surrounding area extraction and safety monitoring of the surrounding area.

[0059] In one scenario, the key region obtained in step S23 above consists of the hook part, the load part, and the sling part, and excludes any background part.

[0060] Preferably, step S3 includes: S31, based on the obtained key region, extracting the surrounding area and the projection area corresponding to the load area; S32, extracting the surrounding areas corresponding to the hook area and the sling area; and S33, obtaining the region of interest from the key region, the surrounding area and projection area of the load area, the surrounding area of the hook area, and the surrounding area of the sling area.

[0061] Based on the obtained key areas, further extraction is performed on the possible sweep areas of the hook, load, and sling, as well as the projected area below the load, thereby extracting the image to be analyzed.

[0062] In one scenario, for the load area, hook area, and sling area in the obtained key region, morphological dilation can be performed on the outer contour of the current key region to obtain the neighboring regions corresponding to the outer contour: the region adjacent to the outer contour of the load area is taken as the surrounding area of the load area; the region adjacent to the outer contour of the hook area is taken as the surrounding area of the hook area; and the region adjacent to the outer contour of the sling area is taken as the surrounding area of the sling area. Furthermore, based on the surrounding area of the load area, the corresponding projection area is obtained according to a preset vertical baseline.
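The surrounding area and projection area can be sketched on sets of (row, column) pixels. The surrounding area is taken as the dilated region minus the region itself. The projection step is an assumption: the source only mentions a preset vertical baseline, so the example simply extends the occupied columns straight down to a hypothetical `baseline_row`. Function names and sample values are illustrative.

```python
def dilate(pixels, margin=1):
    """Grow a pixel set by a square structuring element of the given margin."""
    offsets = range(-margin, margin + 1)
    return {(r + dr, c + dc) for (r, c) in pixels for dr in offsets for dc in offsets}


def surrounding_area(region, margin):
    """Ring of pixels adjacent to the outer contour: dilation minus the region."""
    return dilate(region, margin) - set(region)


def projection_area(region, baseline_row):
    """Assumed projection: columns occupied by region, extended down to baseline_row."""
    if not region:
        return set()
    bottom = max(r for r, _ in region)
    columns = {c for _, c in region}
    return {(r, c) for c in columns for r in range(bottom + 1, baseline_row + 1)}


# Illustrative values: a 2x3 load area, a one-pixel ring, and a baseline at row 7.
load = {(r, c) for r in range(2, 4) for c in range(3, 6)}
load_ring = surrounding_area(load, margin=1)
load_projection = projection_area(load | load_ring, baseline_row=7)
```

The margin plays the role of the degree of dilation, which the description says should be chosen according to image distances and response time.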

[0063] Based on the obtained regions, a corresponding region of interest mask is generated; the current video detection frame is then cropped based on the region of interest mask to obtain the image to be analyzed.
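Cropping the frame with the region of interest mask can be sketched as follows. The mask is assumed to be a set of (row, column) pixels and the frame a nested list of grayscale values; pixels outside the mask but inside its bounding box are replaced by a `fill` value so that background content does not reach the analysis model. The function name and the fill value are assumptions.

```python
def crop_to_mask(frame, mask_pixels, fill=0):
    """Crop frame to the bounding box of the mask and blank out unmasked pixels."""
    height, width = len(frame), len(frame[0])
    inside = {(r, c) for (r, c) in mask_pixels if 0 <= r < height and 0 <= c < width}
    if not inside:
        return []
    top, bottom = min(r for r, _ in inside), max(r for r, _ in inside)
    left, right = min(c for _, c in inside), max(c for _, c in inside)
    return [
        [frame[r][c] if (r, c) in inside else fill for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


# Illustrative data: a 4x5 frame whose pixel value encodes its position.
frame = [[r * 10 + c for c in range(5)] for r in range(4)]
roi_mask = {(1, 1), (1, 2), (2, 2)}
image_to_analyze = crop_to_mask(frame, roi_mask)
```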

[0064] In one scenario, when performing morphological dilation on the outer contour of the current key region, the degree of dilation needs to be set reasonably according to parameters such as the distances in the specific image and the actual response time, so as to obtain a surrounding area of appropriate size. Detection within the surrounding area then accurately reflects the crane's operating area. In one scenario, to reduce the data processing load, the region of interest mask can be reused for a certain period (e.g., 1 second or 0.5 seconds) after it is obtained. During that period, the center point of the hook region in the mask is simply aligned with the center point of the hook region obtained in the current frame. Because the displacement rate of the equipment's hook is limited, a reasonable data processing interval adapted to the performance of the current data processing equipment can meet the requirements of reliability and real-time performance.

[0065] After a period of time, the mask of the region of interest is updated through steps S2 and S3.
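The mask reuse and periodic refresh can be sketched as two small helpers. Between refreshes the stored mask is translated so that its hook center coincides with the hook center found in the current frame; after the reuse period, steps S2 and S3 are run again. The function names and the frame-counting rule are assumptions; the 30 frames per second and 0.5 second values come from the scenarios described above.

```python
def shift_mask(mask_pixels, old_hook_center, new_hook_center):
    """Translate the stored mask so its hook center matches the current one."""
    d_row = new_hook_center[0] - old_hook_center[0]
    d_col = new_hook_center[1] - old_hook_center[1]
    return {(r + d_row, c + d_col) for (r, c) in mask_pixels}


def needs_refresh(frame_index, fps=30, reuse_seconds=0.5):
    """True on frames where the mask should be rebuilt through steps S2 and S3."""
    reuse_frames = max(1, round(fps * reuse_seconds))
    return frame_index % reuse_frames == 0


# Illustrative values: the hook moved two rows down and three columns right.
stored_mask = {(1, 1), (2, 1)}
current_mask = shift_mask(stored_mask, old_hook_center=(1, 1), new_hook_center=(3, 4))
```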

[0066] In one scenario, where multiple slings are used in conjunction with the hook and the load, the area between the slings in the extracted image to be analyzed is not included in the region of interest mask. Therefore, the background content of this area is not included in the cropped image to be analyzed.

[0067] By extracting the image to be analyzed from the video monitoring footage using the methods described above, the subsequent safety analysis can focus on the key areas to the greatest extent possible, thereby improving the effectiveness of the safety analysis.

[0068] Preferably, step S4 includes: based on the obtained image to be analyzed, a trained safety analysis model is used to perform safety analysis and obtain the safety analysis results.

[0069] Safety analysis and processing includes personnel safety analysis and equipment safety analysis.

[0070] In one scenario, safety analysis is performed based on the image to be analyzed, with a focus on identifying pedestrians and objects in the ground area and objects within the sweeping zone, thereby determining the safety status of the current lifting equipment.

[0071] Preferably, step S4 includes: Based on the obtained image to be analyzed, a personnel recognition model built on YOLOv4 is used to detect and identify personnel in the image. When a person is identified in the image, it is determined that there is a personnel safety hazard, and the corresponding personnel safety anomaly analysis result is output.

[0072] For pedestrians on the ground, YOLOv4 can quickly identify pedestrians in the image to be analyzed, thereby completing safety monitoring of personnel near the projection area of the suspended load.
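As a non-limiting illustration, the personnel decision rule can be sketched independently of the detector itself. The sketch assumes the personnel recognition model has already produced a list of detections as dictionaries with `label`, `confidence`, and `box` keys; this structure, the function name `personnel_safety_result`, and the 0.5 confidence threshold are hypothetical and are not specified in the text.

```python
def personnel_safety_result(detections, min_confidence=0.5):
    """Report a personnel safety anomaly if any person is detected in the image."""
    persons = [
        d for d in detections
        if d["label"] == "person" and d["confidence"] >= min_confidence
    ]
    if persons:
        return {"status": "personnel_anomaly", "persons": persons}
    return {"status": "no_personnel_detected", "persons": []}


# Hypothetical detector output for one cropped image to be analyzed.
detections = [
    {"label": "person", "confidence": 0.91, "box": (40, 120, 80, 200)},
    {"label": "truck", "confidence": 0.77, "box": (200, 90, 360, 210)},
]
result = personnel_safety_result(detections)
```

Because the image to be analyzed has already been cropped to the region of interest, any detected person is treated as a hazard without a further position check.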

[0073] Preferably, step S4 includes: based on the obtained image to be analyzed, processing the image with a trained neural-network-based safety analysis model to identify anomalies in the image and obtain the equipment safety analysis results. In one scenario, the safety analysis model is built on a convolutional neural network (CNN), which includes an input layer, a first convolutional module, a second convolutional module, a third convolutional module, a fourth convolutional module, a global average pooling layer, a Dropout layer, an output classification layer, and a Softmax layer connected in sequence. The input layer receives the extracted image to be analyzed, which is padded to a rectangular image of a specified size. Each of the four convolutional modules includes a convolutional layer, a normalization layer, an activation layer, and a max pooling layer connected in sequence. In each module, the convolutional layer uses a kernel size of [missing value] and a stride of 1; the normalization layer normalizes the feature maps output by the convolutional layer; the activation layer uses ReLU as the activation function; and the max pooling layer uses a pooling kernel size of [missing value] and a stride of 2. The number of convolutional kernels is 32 in the first module, 64 in the second module, 128 in the third module, and 256 in the fourth module. The global average pooling layer performs spatial averaging on the output of the fourth convolutional module; the fully connected layer uses ReLU as the activation function; the Dropout layer sets the Dropout probability to 0.5; the output classification layer maps the features to a 3-dimensional output through a fully connected operation; and the Softmax layer uses the softmax function to calculate the probability of each safety analysis result, the results being safe, warning, and abnormal.
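As a non-limiting illustration, the flow of feature map sizes through the four convolutional modules and the final softmax classification can be traced in plain Python. The channel counts 32, 64, 128, and 256 follow claim 8. The kernel sizes are not given in the text, so they are left as function arguments; the values used in the example call (3x3 convolution with padding 1, 2x2 pooling, 224x224 input) are hypothetical, as are the function names and the example logits.

```python
import math

CHANNELS = [32, 64, 128, 256]           # kernels per convolutional module (claim 8)
CLASSES = ["safe", "warning", "abnormal"]


def out_size(size, kernel, stride, padding):
    """Standard output-size formula for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(height, width, conv_kernel, pool_kernel, conv_padding):
    """Return (channels, height, width) after each of the four modules."""
    shapes = []
    for channels in CHANNELS:
        # Convolution with stride 1, as stated in the text.
        height = out_size(height, conv_kernel, 1, conv_padding)
        width = out_size(width, conv_kernel, 1, conv_padding)
        # Max pooling with stride 2, as stated in the text (no padding assumed).
        height = out_size(height, pool_kernel, 2, 0)
        width = out_size(width, pool_kernel, 2, 0)
        shapes.append((channels, height, width))
    return shapes


def softmax(logits):
    """Numerically stable softmax over the 3-dimensional classifier output."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Map classifier logits to the most probable safety analysis result."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


# Hypothetical kernel sizes and input size, chosen only to make the trace concrete.
shapes = trace_shapes(224, 224, conv_kernel=3, pool_kernel=2, conv_padding=1)
label, probs = classify([0.1, 2.0, 0.3])
```

Global average pooling then reduces the last feature map to one value per channel, so the classifier input has 256 features regardless of the input image size.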

[0074] During the model training phase, the model is trained on pre-prepared images of safe conditions, warning conditions (such as birds or other animals near the equipment), and abnormal conditions. Abnormal conditions include the equipment colliding with other objects (buildings, sheds); obstacles such as buildings or greenery appearing in the swaying path, especially the swaying path of the sling and the load; and foreign objects such as people, vehicles, or piles of building materials appearing in the projection area.

[0075] In practical applications, the training set can be expanded with continuously collected images, so that the safety analysis model is continually retrained and updated and its effectiveness improves.

[0076] Alternatively, a trained neural network can be used to perform safety anomaly analysis on the images. By training on anomalies that occur in the sweeping zone, the network can accurately analyze such anomalies and thereby accurately detect safety anomalies in the operating area of the lifting equipment.

[0077] Preferably, the method further includes: S5, generating corresponding warning messages or braking control commands based on the safety analysis results.

[0078] In one scenario, based on the obtained safety analysis results, when the safety analysis result is a warning, a corresponding warning message is generated to indicate a possible safety anomaly; when the safety analysis result is an anomaly (including personnel safety anomalies and equipment safety anomalies), a braking control command is generated to stop the operating lifting equipment, and the lifting equipment is restarted after the safety anomaly is resolved.

[0079] When the safety analysis detects an abnormal situation, the system can automatically brake the lifting equipment or issue a safety alarm, thereby helping to eliminate or prevent safety anomalies.
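As a non-limiting illustration, the mapping from safety analysis results to responses can be sketched as a small decision function. The result labels (`"safe"`, `"warning"`, `"abnormal"`), the function name `control_response`, and the message strings are hypothetical; how the braking command reaches the equipment controller is not specified in the text and is not modeled here.

```python
def control_response(equipment_result, personnel_anomaly=False):
    """Choose the S5 response from the equipment and personnel analysis results."""
    if personnel_anomaly or equipment_result == "abnormal":
        # Any anomaly (personnel or equipment) stops the operating equipment.
        return {"action": "brake", "message": "Safety anomaly detected: stopping lifting equipment"}
    if equipment_result == "warning":
        # A warning only notifies; the equipment keeps running.
        return {"action": "warn", "message": "Possible safety anomaly: check the operating area"}
    return {"action": "none", "message": ""}


response = control_response("warning")
```

Checking the anomaly condition first ensures that a detected person always triggers braking, even when the equipment analysis alone would only produce a warning.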

[0080] Referring to Figure 2, a safety monitoring system for lifting equipment is illustrated, comprising a camera and a processor, wherein the camera and the processor are connected wirelessly. The processor is used to perform the method for safety monitoring of lifting equipment illustrated in the embodiment of Figure 1, including the specific implementations of each step.

[0081] The processor can be deployed on a local server or a cloud server.

[0082] It should be noted that the functional units / modules in the various embodiments of the present invention can be integrated into one processing unit / module, or each unit / module can exist physically separately, or two or more units / modules can be integrated into one unit / module. The integrated unit / module described above can be implemented in hardware or in the form of software functional units / modules.

[0083] From the above description of the embodiments, those skilled in the art will clearly understand that the embodiments described herein can be implemented in hardware, software, firmware, middleware, code, or any suitable combination thereof. For hardware implementation, the processor can be implemented in one or more of the following units: Application-Specific Integrated Circuit (ASIC), Digital Signal Processor (DSP), Digital Signal Processing Device (DSPD), Programmable Logic Device (PLD), Field-Programmable Gate Array (FPGA), processor, controller, microcontroller, microprocessor, other electronic units designed to implement the functions described herein, or combinations thereof. For software implementation, some or all of the processes of the embodiments can be implemented by a computer program instructing the associated hardware. During implementation, the program can be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media include computer storage media and communication media, wherein communication media include any medium that facilitates the transmission of a computer program from one place to another. Storage media can be any available medium accessible to a computer. Computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium capable of carrying or storing desired program code in the form of instructions or data structures and accessible to a computer.

[0084] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not intended to limit its scope of protection. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solutions of the present invention without departing from their essence and scope.

Claims

1. A method for safety monitoring of lifting equipment, characterized in that it includes the following steps: S1, acquiring real-time video monitoring data of the operating area of the lifting equipment; S2, extracting key areas from the acquired video monitoring data and marking these key areas, which include the hook area, the load area, and the sling area; S3, extracting the image to be analyzed from the video monitoring data based on the marked key areas; S4, using a trained safety analysis model to perform safety analysis processing on the image to be analyzed, obtaining the safety analysis results.

2. The method for safety monitoring of lifting equipment according to claim 1, characterized in that step S2 includes: S21, extracting consecutive frame data from the acquired video monitoring data, each frame corresponding to a time t; S22, extracting the hook area and the hoisting area from the current frame; S23, further extracting the sling area based on the obtained hook area and hoisting area; S24, composing the key region of the current frame from the obtained hook area, hoisting area, and sling area.

3. The method for safety monitoring of lifting equipment according to claim 2, characterized in that step S21 further includes: preprocessing the obtained image frames to eliminate noise interference in the image frames.

4. The method for safety monitoring of lifting equipment according to claim 2, characterized in that step S22 includes: based on the current frame, using a template matching method to identify the hook and the load in the video monitoring image and marking them as the hook area and the hoisting area respectively.

5. The method for safety monitoring of lifting equipment according to claim 1, characterized in that step S23 includes: based on the hook area and the load area in the current frame, extracting the center point coordinates of the hook area and of the load area respectively; using the center point of the hook area as the anchor point, applying N pre-designed directional templates to obtain the corresponding compensation areas; filtering the pixels within each compensation area with the directional filter corresponding to the direction of that compensation area to obtain the directional feature value of each pixel within the compensation area, wherein the directional feature value calculation function is: [formula missing], in which the quantities defined are the directional feature value corresponding to a pixel, the nth compensation area to which the pixel belongs, the current moment, the directional filter corresponding to the nth compensation area, and the output value obtained when that directional filter is applied to the pixel; based on the pixel value change characteristics of the pixels within the compensation area over a period of time, obtaining the change feature value of each pixel within the compensation area, wherein the change feature value calculation function is: [formula missing], in which the quantities defined are the change feature value corresponding to a pixel, the high update parameter and the low update parameter of the pixel [definitions missing], the high update factor and the low update factor [value ranges missing], the time variable, the current moment, the total number of sampling moments over the given period of time, the grayscale value corresponding to the pixel, and the set normalization threshold; based on the obtained directional feature values and change feature values, calculating the sling feature value of each pixel within the compensation area, wherein the sling feature value calculation function is: [formula missing], in which the quantities defined are the sling feature value corresponding to a pixel, the normalization function, the set compensation factor [value range missing], the directional feature value, the change feature value, the grayscale value, and the low update parameter corresponding to the pixel, and the set adjustment factor [value range missing]; comparing the obtained sling feature value with a set threshold and, when the comparison condition [condition missing] is met, marking the corresponding pixel as a sling edge pixel; and performing connectivity processing on the marked sling edge pixels to obtain the sling area.

6. The method for safety monitoring of lifting equipment according to claim 2, characterized in that step S3 includes: S31, based on the obtained key areas, extracting the surrounding area and the projection area corresponding to the hoisting area; S32, extracting the surrounding areas corresponding to the hook area and the sling area respectively; S33, obtaining the region of interest based on the key areas, the surrounding area and the projection area of the hoisting area, the surrounding area of the hook area, and the surrounding area of the sling area.

7. The method for safety monitoring of lifting equipment according to claim 1, characterized in that step S4 includes: based on the obtained image to be analyzed, a personnel recognition model built on YOLOv4 is used to detect and identify personnel in the image; when a person is identified in the image, it is determined that there is a personnel safety hazard, and the corresponding personnel safety anomaly analysis result is output.

8. The method for safety monitoring of lifting equipment according to claim 1, characterized in that: based on the obtained image to be analyzed, a trained neural-network-based safety analysis model is used to process the image to identify anomalies in the image and obtain the equipment safety analysis results; the safety analysis model is built on a CNN neural network, which includes an input layer, a first convolutional module, a second convolutional module, a third convolutional module, a fourth convolutional module, a global average pooling layer, a Dropout layer, an output classification layer, and a Softmax layer connected in sequence; the input layer is used to input the extracted image to be analyzed, where the image is padded to a rectangular image of a specified size; the first convolutional module includes a convolutional layer, a normalization layer, an activation layer, and a max pooling layer connected in sequence, wherein the convolutional layer uses a kernel size of [missing value], 32 convolutional kernels, and a stride of 1, the normalization layer normalizes the feature maps output by the convolutional layer, the activation layer uses ReLU as the activation function, and the max pooling layer uses a pooling kernel size of [missing value] and a stride of 2; the second convolutional module includes a convolutional layer, a normalization layer, an activation layer, and a max pooling layer connected in sequence, wherein the convolutional layer uses a kernel size of [missing value], 64 convolutional kernels, and a stride of 1, the normalization layer normalizes the feature maps output by the convolutional layer, the activation layer uses ReLU as the activation function, and the max pooling layer uses a pooling kernel size of [missing value] and a stride of 2; the third convolutional module includes a convolutional layer, a normalization layer, an activation layer, and a max pooling layer connected in sequence, wherein the convolutional layer uses a kernel size of [missing value], 128 convolutional kernels, and a stride of 1, the normalization layer normalizes the feature maps output by the convolutional layer, the activation layer uses ReLU as the activation function, and the max pooling layer uses a pooling kernel size of [missing value] and a stride of 2; the fourth convolutional module includes a convolutional layer, a normalization layer, an activation layer, and a max pooling layer connected in sequence, wherein the convolutional layer uses a kernel size of [missing value], 256 convolutional kernels, and a stride of 1, the normalization layer normalizes the feature maps output by the convolutional layer, the activation layer uses ReLU as the activation function, and the max pooling layer uses a pooling kernel size of [missing value] and a stride of 2; the global average pooling layer performs spatial averaging on the output of the fourth convolutional module; the fully connected layer uses ReLU as the activation function; the Dropout layer sets the Dropout probability to 0.5; the output classification layer maps the features to a 3-dimensional output through a fully connected operation; the Softmax layer uses the softmax function to calculate the probability of each safety analysis result, the results being safe, warning, and abnormal.

9. The method for safety monitoring of lifting equipment according to claim 1, characterized in that the method further includes: S5, generating corresponding warning messages or braking control commands based on the safety analysis results.

10. A safety monitoring system for lifting equipment, characterized in that it includes a processor, the processor being used to execute the method for safety monitoring of lifting equipment according to any one of claims 1-9.