A method for detecting server fault lights in a machine room from low-resolution images

Using a deep learning network and virtual data generation, the method handles the ghosting and blurring of low-resolution images captured during server inspection in the data center, enabling efficient fault light localization, shorter inspection times, and lower inspection costs.

CN116342459B · Active · Publication Date: 2026-06-26 · SHENYANG SIASUN ROBOT & AUTOMATION

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SHENYANG SIASUN ROBOT & AUTOMATION
Filing Date
2021-12-17
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

In existing technologies for server fault detection in data centers, the ghosting and blurring caused by low-resolution images make it difficult to accurately locate faulty server racks using traditional image processing algorithms, and the use of high-frame-rate industrial cameras adds extra overhead.

Method used

A deep learning-based method for detecting server fault lights is adopted. By acquiring low-resolution RGB images captured by an inspection vehicle, a fault light detection model is constructed, and feature extraction and non-maximum suppression processing are performed. The model is then trained by combining virtual data generation and adaptive moment estimation to achieve accurate localization of server lights.

Benefits of technology

Accurate location of fault lights was achieved under low-resolution image conditions, avoiding additional costs, shortening inspection time, and improving detection efficiency.

✦ Generated by Eureka AI based on patent content.

[Figure CN116342459B_ABST]

Abstract

The application discloses a deep learning-based method for detecting server fault lights in a machine room. RGB images shot by the pan-tilt camera on an inspection vehicle are processed by combining image processing algorithms with a deep learning network to produce the detection result. The application uses RGB images from a monitoring camera directly instead of a dedicated industrial camera, and uses the low-resolution images of the monitoring camera's substream, which avoids additional overhead while maintaining the network transmission rate during communication. In addition, the application processes images shot while the vehicle is moving, ensuring the continuity of the inspection process.

Description

Technical Field

[0001] This invention belongs to the field of industrial vision, specifically a method for detecting server fault lights in a machine room from low-resolution images.

Background Technology

[0002] Generally, when a server in a data center malfunctions, the host computer can only obtain vague information and cannot accurately locate the faulty rack or capture error messages. In such cases, manual inspection of the data center is required. With the expansion of data center scale and the increase in labor costs in recent years, more and more data centers hope to use data center inspection vehicles for automated rack fault diagnosis.

[0003] In server light inspection in data centers, the obstruction of mesh doors and the glare caused by server lights shining on the mesh doors significantly interfere with traditional detection algorithms. Existing solutions include: 1. Stopping and taking photos at each rack's teaching point to ensure image clarity; 2. Using high-frame-rate industrial cameras instead of surveillance cameras to avoid the motion blur caused by capturing images while moving. The former results in excessively long inspection times when there are a large number of racks, while the latter incurs additional costs and limits the application scenarios of inspection vehicles.

Summary of the Invention

[0004] To address the problems of existing technologies, this invention provides a method for detecting server fault lights in a data center based on low-resolution, low-quality images. Since the images to be transmitted are captured directly by a monitoring camera during movement, motion blur and ghosting are inevitable. Traditional image processing algorithms struggle to solve this problem; therefore, this invention employs a deep learning-based method for detecting server fault lights.

[0005] The technical solution adopted by the present invention to achieve the above objectives is as follows:

[0006] A method for detecting server fault lights in a data center from low-resolution images includes the following steps:

[0007] Acquire RGB images captured by the inspection vehicle as it moves within a set coordinate area;

[0008] A fault light detection model is built based on a deep learning network, and the fault light detection model is trained.

[0009] The trained fault light detection model is used to extract features from the RGB image to obtain image candidate boxes. Non-maximum suppression is then applied to the image candidate boxes to obtain the detection results, namely the coordinate information, category information, instance pixel coordinate information and confidence probability of the server lights in the RGB image, thus completing the detection of fault lights of the server in the data center.

[0010] During the movement of the inspection vehicle, it also acquires the rack number information and computer room number information within the set coordinate area.

[0011] The training of the fault light detection model involves supplementing the dataset with virtual data and then training the model using the supplemented dataset. Specifically, this includes the following steps:

[0012] The deep learning network is trained using only the network port light and normal status light data from the dataset.

[0013] The trained network is run on all computer room images to select the pixel coordinates of the network port lights and normal status lights in the images.

[0014] With a set probability, the RGB channels of the pixels in the selected area are converted to HSV space, and an offset value is applied according to the color wheel to generate fault light image data and corresponding label information. This data is then added to the dataset, and the deep learning network is trained again using the supplemented dataset.
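The color conversion in this step can be illustrated with a minimal Python sketch. It assumes the offset is applied to the hue channel, which is one reading of "according to the color wheel"; the function names, the image layout (rows of 8-bit RGB tuples), the label format, and the offset value are hypothetical and not taken from the patent.

```python
import colorsys
import random


def shift_hue(rgb, hue_offset):
    """Rotate one 8-bit RGB pixel around the color wheel.

    hue_offset is a fraction of a full turn (assumption: the patent's
    "offset value" acts on hue).
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + hue_offset) % 1.0
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return (round(r2 * 255), round(g2 * 255), round(b2 * 255))


def synthesize_fault_light(image, pixels, hue_offset, probability, rng,
                           fault_label="fault_light"):
    """Recolor one detected normal-light instance with the given probability.

    image:  list of rows, each a list of (r, g, b) tuples
    pixels: (row, col) coordinates of the instance selected by the first model
    Returns (new_image, label), or (image, None) if the instance was skipped.
    """
    if rng.random() >= probability:
        return image, None
    out = [list(row) for row in image]  # copy so the source image is untouched
    for y, x in pixels:
        out[y][x] = shift_hue(out[y][x], hue_offset)
    label = {"category": fault_label, "pixels": list(pixels)}
    return out, label


# Usage: turn a green light pixel into a red one (hue shifted by -1/3 turn).
image = [[(0, 255, 0), (10, 10, 10)]]
new_image, label = synthesize_fault_light(
    image, [(0, 0)], hue_offset=-1 / 3, probability=1.0, rng=random.Random(0)
)
```

Because the label is derived from the pixel coordinates already produced by the first model, no manual annotation is needed for the generated samples.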

[0015] The category information indicates the types of server lights, including network port lights, indicator lights, and fault lights of different colors.

[0016] The instance pixel coordinate information refers to all the corresponding pixels of the server light on the RGB image detected by the deep learning network.

[0017] After obtaining the detection results, remove false detection instances and assess the status of the server rack based on the detection results after removing false detection instances.

[0018] The removal of false detection instances specifically involves: based on the instance pixel coordinate information in the detection results of the deep learning network, identifying the instance with the most pixels in the RGB image, and then removing all instances whose pixel count is lower than the maximum pixel count multiplied by a, where a is a fixed constant with a value between 0 and 1.

[0019] The status of the server rack where the server lights are located is evaluated based on the detection results after removing false detection instances. Specifically, the abnormal status indicator light categories and corresponding confidence probabilities in a set of RGB images of the same rack are identified. The average confidence probability of each RGB image is calculated and compared with threshold A; if it is higher than threshold A, the RGB image is determined to be an abnormal image. Then, the number of abnormal images of the same rack is counted. If it exceeds threshold B, the rack is reported as abnormal; otherwise, the rack is reported as normal.
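The two-threshold assessment can be sketched in Python as follows. The function name, the input layout (one list of abnormal-light confidences per image), and the example threshold values are assumptions for illustration; the patent does not give concrete values for A or B.

```python
def assess_rack(images, threshold_a, threshold_b):
    """Classify a rack as "abnormal" or "normal".

    images: one entry per RGB image of the rack; each entry is the list of
            confidence probabilities of the abnormal status lights detected
            in that image (empty if none were detected).
    threshold_a: an image is abnormal if its mean confidence exceeds this.
    threshold_b: the rack is abnormal if more images than this are abnormal.
    """
    abnormal_images = 0
    for confidences in images:
        if not confidences:
            continue  # no abnormal light detected in this image
        mean_confidence = sum(confidences) / len(confidences)
        if mean_confidence > threshold_a:
            abnormal_images += 1
    return "abnormal" if abnormal_images > threshold_b else "normal"


# Usage with hypothetical values: three images of one rack.
rack_images = [[0.9, 0.8], [0.3], []]
status = assess_rack(rack_images, threshold_a=0.5, threshold_b=0)
```

Requiring more than one abnormal image (a larger threshold B) trades sensitivity for robustness against a single blurred frame producing a spurious detection.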

[0020] The present invention has the following beneficial effects and advantages:

[0021] This invention replaces traditional image processing algorithms with deep learning networks to directly process images with motion blur and low resolution, avoiding additional overhead while ensuring network communication speed. Continuous image capture and processing during the inspection vehicle's movement ensures the continuity of the inspection process and significantly reduces the time required for a single inspection.

Attached Figure Description

[0022] Figure 1 is a flowchart of the overall process for detecting server faults in a data center;

[0023] Figure 2 is a schematic diagram of the deep learning network structure;

[0024] Figure 3 is a flowchart of the method for generating virtual data for fault lights.

Detailed Implementation

[0025] The present invention will now be described in further detail with reference to the accompanying drawings and embodiments.

[0026] The specific method is shown in Figure 1.

[0027] After the inspection operation begins, the server fault light detection program in the host computer continuously listens for messages from the inspection vehicle. When the inspection vehicle enters the pre-set coordinate area, the monitoring camera takes one or more images, which, along with the rack number and data center number information corresponding to that coordinate area, are transmitted to the host computer's server fault light detection program.

[0028] The host computer then feeds the received RGB image of the server rack into a deep learning network for analysis. It extracts two-dimensional features from the image and calculates the category information, confidence probability, and candidate instance information for the server lights in the image. The category information represents the pre-specified types of server lights, including different colored network port lights, indicator lights, fault lights, etc. The candidate instance information represents all the corresponding pixels of the server lights detected by the deep learning network in the RGB image. After removing duplicate detection boxes using non-maximum suppression, the final instance coordinate information and its corresponding category and confidence probability are obtained.
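The duplicate-removal step can be illustrated with a greedy, per-category non-maximum suppression in plain Python. This is a generic sketch of the technique, not the patent's implementation: the box format `(x1, y1, x2, y2)`, the detection tuple layout, and the IoU threshold of 0.5 are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-confidence box among overlapping same-category boxes.

    detections: list of (box, category, confidence) tuples.
    Returns the kept detections, highest confidence first.
    """
    ordered = sorted(detections, key=lambda d: d[2], reverse=True)
    kept = []
    for det in ordered:
        # Suppress det only if it overlaps a kept box of the same category.
        if all(det[1] != k[1] or iou(det[0], k[0]) <= iou_threshold
               for k in kept):
            kept.append(det)
    return kept


# Usage: two overlapping fault-light boxes collapse into one.
candidates = [
    ((0, 0, 10, 10), "fault_light", 0.9),
    ((1, 1, 11, 11), "fault_light", 0.8),
    ((50, 50, 60, 60), "fault_light", 0.7),
    ((0, 0, 10, 10), "network_port_light", 0.6),
]
final = non_max_suppression(candidates)
```

Suppressing only within a category keeps a network port light and a fault light that happen to sit close together from eliminating each other.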

[0029] Next, the server fault light detection results are evaluated. The evaluation process is as follows:

[0030] 1. Remove false detection instances. Because of the mesh doors, a server light scatters into several light spots in an RGB image, and it cannot be ruled out that these light spots are falsely detected as server lights in low-resolution images.

[0031] 2. By combining the fault category information and corresponding confidence probabilities of all RGB images of the same rack, assess the current rack status. If the assessment result is "normal", send the rack number, data center number, and rack status to the backend; if the assessment result is "abnormal", send the rack number, data center number, and the abnormal image with the detection results boxed to the inspection system monitoring interface and trigger an alarm.

[0032] The deep learning network used in this invention is shown in Figure 2. A convolutional neural network is first used to extract image features. Then, feature maps at three different scales are extracted, which are used to detect large, medium, and small targets, respectively. Finally, the deep learning network results are subjected to non-maximum suppression to calculate the final class information, confidence probability, and instance coordinate information.

[0033] This invention uses a deep learning network to implement the detection function, thus requiring a large amount of data on normal and faulty server indicator lights as training samples. Furthermore, this data must include data on normal and abnormal indicator lights in states with ghosting or blurring. However, in reality, most server racks in a data center are in normal operation, and the number of faulty indicator light samples is extremely small. Therefore, to ensure the balance of various indicator light data in the training samples, this invention proposes a method to automatically generate labeled virtual data to expand the training set, thereby achieving a balance of various indicator light data and saving on the labor costs of labeling. The virtual data generation method is shown in Figure 3.

[0034] First, ignoring the abnormal and faulty status lights in the training set, we label a subset of the network port lights and normal status lights, for which many samples are available. We then train a model that detects only network port lights and normal status lights, and run this model on the training set data to select instance pixel coordinates. Next, with a certain probability, we convert the RGB channels of a selected area to the color-sensitive HSV space and apply an appropriate offset value according to the color wheel. This allows us to automatically generate fault light data and the corresponding category information from normal status lights. Generating virtual data is typically tens of times faster than manual annotation, and it avoids the labeling errors caused by fatigue, making it more efficient and accurate. We then train the deep learning network using the labeled data. Because this invention uses a small self-made training set, we use adaptive moment estimation as the optimization method. Finally, we obtain the weight model of the deep learning network. Loading this weight model into the deep learning network enables it to detect server fault lights in low-resolution, ghosted images.
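Adaptive moment estimation is the optimizer commonly known as Adam. For readers unfamiliar with it, a minimal sketch of a single update step on a list of scalar parameters is given here. The hyperparameter values (`lr`, `beta1`, `beta2`, `eps`) are the commonly used defaults and are assumptions; the patent does not state which values were used.

```python
import math


def adam_step(params, grads, m, v, t,
              lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update and return (new_params, new_m, new_v).

    params, grads: lists of floats (parameters and their gradients)
    m, v: running first and second moment estimates, same length as params
    t: 1-based step counter, used for bias correction
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first moment (mean)
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second moment
        m_hat = m_i / (1 - beta1 ** t)             # bias-corrected mean
        v_hat = v_i / (1 - beta2 ** t)             # bias-corrected variance
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Usage: one step on f(x) = (x - 3)^2 starting from x = 0, gradient 2(x - 3).
x, m, v = [0.0], [0.0], [0.0]
grad = [2 * (x[0] - 3)]
x, m, v = adam_step(x, grad, m, v, t=1)
```

The per-parameter step size is normalized by the running gradient magnitude, so the first step has a size close to `lr` regardless of the gradient scale.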

[0035] In this invention, after the deep learning network completes detection, the detection results still need to be further evaluated to ensure their robustness.

[0036] First, based on the instance pixel coordinates in the deep learning network detection results, identify the instance with the most pixels in the image. Then, remove all instances whose pixel count is lower than the maximum pixel count multiplied by 'a', where 'a' is a fixed constant ranging from 0 to 1. This removes false detections caused by the light spots on the mesh door. Next, determine whether the rack's current status is abnormal. Specifically, identify the abnormal status indicator light categories and their corresponding confidence probabilities in a set of images of the same rack. Calculate the average confidence probability for each image and compare it to a fixed threshold (threshold A). If the average is higher than the threshold, the image is considered abnormal. Then, count the number of abnormal images of the same rack. If the number exceeds a second fixed threshold (threshold B), report an abnormal status; otherwise, report a normal status.
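The pixel-count filter at the start of this paragraph can be sketched in Python as follows. The instance layout (a dict with a `pixels` list) and the example value of `a` are assumptions for illustration only.

```python
def remove_small_instances(instances, a):
    """Drop instances whose pixel count is below a * (largest pixel count).

    instances: list of dicts, each with a "pixels" list of (row, col) tuples
    a: fixed constant between 0 and 1
    """
    if not 0 <= a <= 1:
        raise ValueError("a must be between 0 and 1")
    if not instances:
        return []
    max_count = max(len(inst["pixels"]) for inst in instances)
    return [inst for inst in instances
            if len(inst["pixels"]) >= a * max_count]


def make_instance(name, n):
    """Build a hypothetical instance covering n pixels in one row."""
    return {"category": name, "pixels": [(0, i) for i in range(n)]}


# Usage: a real light (100 px), a second light (60 px), and a small
# glare spot from the mesh door (10 px), with a = 0.5.
detections = [
    make_instance("fault_light", 100),
    make_instance("indicator_light", 60),
    make_instance("fault_light", 10),
]
filtered = remove_small_instances(detections, a=0.5)
```

The filter is relative to the largest instance in the same image, so it adapts to the apparent size of the lights at the current camera distance instead of relying on an absolute pixel threshold.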

[0037] This invention simultaneously satisfies two conditions: mobile shooting by inspection vehicles and the use of only monitoring cameras instead of high frame rate industrial cameras. It directly utilizes deep learning networks to detect server fault lights on blurred images.

[0038] This invention uses a large amount of virtual generated data to expand the dataset, solving the problem of uneven sample size between normal and faulty server status lights.

Claims

1. A method for detecting server fault lights in a data center from low-resolution images, characterized in that it includes the following steps: acquiring RGB images captured by the inspection vehicle as it moves within a set coordinate area; building a fault light detection model based on a deep learning network and training the fault light detection model; using the trained fault light detection model to extract features from the RGB image to obtain image candidate boxes, and applying non-maximum suppression to the image candidate boxes to obtain the detection results, namely the coordinate information, category information, instance pixel coordinate information and confidence probability of the server lights in the RGB image, thus completing the detection of fault lights of the server in the data center. The training of the fault light detection model involves supplementing the dataset with virtual data and then training the model using the supplemented dataset, and specifically includes the following steps: the deep learning network is trained using only the network port light and normal status light data from the dataset; the trained network is run on all computer room images to select the pixel coordinates of the network port lights and normal status lights in the images; with a set probability, the RGB channels of the pixels in the selected area are converted to HSV space and an offset value is applied according to the color wheel to generate fault light image data and corresponding label information; this data is then added to the dataset, and the deep learning network is trained again using the supplemented dataset.
After obtaining the detection results, false detection instances are removed and the status of the server rack is assessed based on the detection results after removing false detection instances. The removal of false detection instances specifically involves: based on the instance pixel coordinate information in the deep learning network detection results, identifying the instance with the largest number of pixels in the RGB image, and then removing all instances whose pixel count is lower than the maximum pixel count multiplied by a, where a is a fixed constant with a value between 0 and 1. The status of the server rack where the server lights are located is evaluated based on the detection results after removing false detection instances. Specifically, the abnormal status indicator light categories and corresponding confidence probabilities in a set of RGB images of the same rack are identified. The average confidence probability of each RGB image is calculated and compared with threshold A; if it is higher than threshold A, the RGB image is determined to be an abnormal image. Then, the number of abnormal images of the same rack is counted. If it exceeds threshold B, the rack is reported as abnormal; otherwise, the rack is reported as normal.

2. The method for detecting server fault lights in a data center from low-resolution images according to claim 1, characterized in that, during the movement of the inspection vehicle, it also acquires the rack number information and computer room number information within the set coordinate area.

3. The method for detecting server fault lights in a data center from low-resolution images according to claim 1, characterized in that the category information indicates the types of server lights, including network port lights, indicator lights, and fault lights of different colors.

4. The method for detecting server fault lights in a data center from low-resolution images according to claim 1, characterized in that the instance pixel coordinate information refers to all the corresponding pixels of the server light on the RGB image detected by the deep learning network.