A continuous casting slab running state monitoring method and system based on lightweight YOLOv8 and continuous learning

By combining a lightweight YOLOv8 model with continuous learning, the problems of high computational cost and low detection accuracy in continuous casting slab inspection are solved. This achieves efficient and accurate real-time detection and positional deviation anomaly diagnosis, with adaptive capabilities, meeting the needs of complex working conditions in industrial sites.

CN122244802A · Pending · Publication Date: 2026-06-19 · YANSHAN UNIV

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications (China)
Current Assignee / Owner
YANSHAN UNIV
Filing Date
2026-04-16
Publication Date
2026-06-19


Abstract

This invention provides a method and system for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning, particularly relating to the fields of industrial machine vision and intelligent manufacturing. The method includes: acquiring image data of continuously cast slabs, constructing a training dataset through small target enhancement and occlusion enhancement; constructing a lightweight YOLOv8 detection model, replacing the original backbone network with StarNet, designing a feature fusion module based on multi-scale adaptive convolutional kernels on the basis of BiFPN, and integrating occlusion-aware attention into the DyHead detection head; performing real-time detection and multi-target tracking of the continuously cast slab based on the trained model and combined with a tracking algorithm; constructing normalized coordinates through multiple composite calibrations and affine transformations, calculating position and angle offsets based on motion trajectories, and achieving anomaly diagnosis based on continuous multi-frame trend analysis; and initiating a continuous learning mechanism to achieve online model updates. This invention achieves efficient, robust, and adaptive real-time monitoring of the operating status of continuously cast slabs.

Description

Technical Field

[0001] This invention relates to image detection technology in computer vision, and more particularly to the fields of industrial machine vision and intelligent manufacturing. Specifically, it discloses a method and system for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning.

Background Technology

[0002] As a pillar industry of modern society, the steel industry is a cornerstone of economic growth and industrialization and a core support for manufacturing. Continuously cast slabs, as a key intermediate product in steel production, link the steelmaking and hot rolling processes. With the widespread application of hot charging and hot conveying processes, high-temperature slabs must be cut, scheduled, and transported immediately after being withdrawn from the mold. Their running status on the roller conveyor affects not only production safety but also production rhythm and resource allocation efficiency. In this high-speed, continuous production process, if the slab status cannot be perceived and tracked accurately and promptly, it is difficult to handle safety incidents such as slab position deviation, falling, and collisions with equipment. More seriously, once a steel mixing accident occurs (i.e., slabs of different steel grades are mixed), it leads to a decline in the quality of the entire batch of products, waste of materials and energy, and even production interruption. Therefore, real-time detection and tracking of slabs throughout the entire process from the continuous casting zone to the hot rolling zone, together with real-time monitoring and analysis of the slab's operating status, is of great significance for ensuring production safety in high-temperature operating environments, preventing "mixed steel" accidents, and realizing closed-loop feedback and control of logistics information in the manufacturing execution system.

[0003] Currently, steel companies commonly use infrared radio frequency systems based on programmable logic controllers (PLCs) for slab detection and tracking. However, this technology has significant limitations in practical applications: First, infrared sensors are installed at intervals on the roller conveyor, thus only acquiring discrete data of the slab passing through specific positions, making continuous position tracking difficult. Second, the sensors are exposed to harsh environments such as high-temperature radiation, strong electromagnetic interference, high concentrations of water vapor, and conductive dust in the continuous casting process, leading to easy aging and performance degradation of their internal electronic / optical components, and high maintenance costs. These problems often result in unstable detection signals and high false alarm rates, causing a serious disconnect between the actual slab position and the tracking data in the information system, or even tracking loss. Furthermore, traditional manual inspection methods are not only labor-intensive and subjective, but also difficult to achieve high-precision, all-weather detection in high-temperature and high-risk environments, leading to frequent missed detections and directly interfering with the accuracy and continuity of production scheduling, failing to meet the actual data closed-loop requirements of intelligent production lines.

[0004] To overcome the inherent limitations of traditional contact sensors, non-contact machine vision technology has gradually become a research hotspot in the industry. Early explorations mainly relied on traditional digital image processing techniques, extracting targets through edge detection or optical flow methods. However, these methods are extremely sensitive to changes in lighting and background interference, making them difficult to adapt to unstructured and complex industrial environments. In recent years, with the rapid development of convolutional neural networks, deep learning-based target detection technology has brought breakthrough progress to industrial machine vision. The technological evolution has mainly gone through two stages: First, two-stage algorithms. Represented by R-CNN and Faster R-CNN, although the introduction of region proposal networks has achieved end-to-end detection, their complex feature extraction process leads to large computational redundancy and slow inference speed, making it difficult to meet the real-time requirements of high-speed billet pulling in continuous casting production lines; Second, single-stage algorithms. Represented by the YOLO series and SSD, these algorithms complete target localization and classification in one step through regression thinking, significantly improving inference speed and demonstrating stronger real-time application potential. In particular, the YOLO series detection algorithms, due to their ability to consistently and stably balance high detection accuracy and fast inference speed, have been widely deployed in various industrial scenarios with high real-time performance requirements. However, directly applying a general object detection model to the specific scenario of continuous casting slabs still faces a series of challenges, mainly in the following aspects:

[0005] First, there is a conflict between the computational cost of the model and the real-time requirements of industry. Industrial field computing resources are limited, while general-purpose target detection models have a large number of parameters and high computational costs, making it difficult to achieve real-time inference on edge devices.

[0006] Secondly, the detection accuracy and robustness are insufficient in complex industrial scenarios. The continuous casting site environment is complex, with two major challenges: one is the problem of small target detection. Due to the limitations of camera installation angle and distance, slabs that are far away or just entering the field of view occupy a small proportion of the image and have weak features, which can easily lead to missed detection by the model; the other is severe occlusion interference. Frequent operation of large equipment such as overhead cranes can cause partial or complete occlusion of the slab, resulting in model detection failure.

[0007] Thirdly, there is a lack of systematic diagnostic capabilities for positional offset anomalies. Existing methods mostly focus on static detection of single-frame images and lack continuous tracking and spatiotemporal correlation analysis of the slab's motion state, so they cannot effectively diagnose temporal anomalies such as "continuous deviation." Furthermore, there is a lack of calibration and normalization methods to accurately correlate the image coordinate system with the physical world coordinate system, resulting in inaccurate position and angle measurements and a high false alarm rate.

[0008] Finally, the model lacks the ability to self-evolve in response to dynamic environmental changes. Lighting conditions, equipment layout, and process parameters in industrial settings are constantly changing, causing the model's performance to gradually degrade over time after deployment.

[0009] In summary, there is an urgent need in this field to develop a comprehensive monitoring solution that features high detection accuracy, low computational cost, fast detection speed, and the ability to effectively handle complex working conditions such as small targets and occlusions. This solution should also possess high-precision positional offset anomaly diagnosis capabilities and online self-evolution capabilities to overcome the multiple shortcomings of existing technologies in terms of performance, functionality, and sustainability.

Summary of the Invention

[0010] To address the technical problems of directly applying YOLO series detection algorithms to the specific scenario of continuous casting slabs, such as high computational cost, poor detection results, and weak model self-evolution ability, this invention provides a method and system for monitoring the operating status of continuous casting slabs based on lightweight YOLOv8 and continuous learning. On the one hand, this invention extracts working condition image data from the continuous casting site to construct a rapid perception and control closed loop that enables real-time detection, multi-target tracking, and high-precision position offset anomaly diagnosis, thereby improving the real-time perception and rapid response capabilities of the continuous casting site. On the other hand, it establishes a slow model optimization closed loop that automatically collects difficult examples, actively learns and filters them, continuously learns, and updates the model online, achieving the system's adaptive capability and continuous optimization effect throughout its entire lifecycle. These two closed loops work together to support the stable, accurate, and efficient operation of the intelligent monitoring system in complex industrial environments.

[0011] The technical means employed in this invention are as follows. This invention provides a method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning, comprising the following steps:

Step S1: Obtain and label the original image data of the continuous casting slab conveyor roller table, perform data augmentation on the original image data, and construct a training dataset; the data augmentation includes at least small target augmentation and occlusion augmentation.

Step S2: Construct a lightweight YOLOv8 continuous casting slab detection model. The lightweight YOLOv8 continuous casting slab detection model is obtained by improving the YOLOv8 model. The improvements include: on the one hand, replacing the original backbone network with a lightweight backbone network StarNet, and designing a feature fusion module based on multi-scale adaptive convolutional kernels on the basis of BiFPN; on the other hand, incorporating occlusion-aware attention into the DyHead detection head. The constructed lightweight YOLOv8 continuous casting slab detection model is trained based on the training dataset.

Step S3: The trained lightweight YOLOv8 continuous casting slab detection model is used to detect the continuous casting slab in real time, and the detected continuous casting slab is tracked by a multi-target tracking algorithm. Each slab is assigned a unique ID, and the continuous motion trajectory of each slab is drawn.

Step S4: Through multiple composite calibrations, establish the original image coordinate system of the slab conveyor rollers, and map the detected slab targets to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis based on affine transformation. The center line of the slab conveyor rollers after affine transformation is parallel to the X-axis of the normalized coordinate system. Combined with the motion trajectory of the slab targets, calculate the position offset and angle offset of each slab in real time, and perform trend analysis based on continuous multi-frame data to realize multi-dimensional slab state analysis and position offset anomaly diagnosis based on spatiotemporal context.
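As an illustration of the ID assignment and trajectory recording in Step S3, the following is a minimal sketch only. The text does not name the multi-target tracking algorithm, so a greedy IoU matcher is swapped in here as a stand-in; the class name `GreedyIouTracker`, the `(x1, y1, x2, y2)` box format, and the `iou_threshold` value are all assumptions made for the example.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    """Stand-in tracker (assumption): match each detection to the best
    overlapping track from the previous frame, else open a new track."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold  # hypothetical value
        self._next_id = count(1)
        self.last_box = {}       # track id -> box seen in the previous frame
        self.trajectories = {}   # track id -> list of (cx, cy) centers

    def update(self, detections):
        assigned = {}
        unmatched = dict(self.last_box)
        for det in detections:
            best_id, best_iou = None, self.iou_threshold
            for tid, box in unmatched.items():
                score = iou(det, box)
                if score >= best_iou:
                    best_id, best_iou = tid, score
            if best_id is None:
                best_id = next(self._next_id)      # new slab enters the view
                self.trajectories[best_id] = []
            else:
                del unmatched[best_id]             # each track is used once
            assigned[best_id] = det
            center = ((det[0] + det[2]) / 2, (det[1] + det[3]) / 2)
            self.trajectories[best_id].append(center)
        # Simplification: tracks without a match in this frame are dropped.
        self.last_box = assigned
        return assigned
```

A production tracker would normally keep unmatched tracks alive for several frames and use motion prediction; the sketch only shows how unique IDs and per-slab trajectories can be maintained frame by frame.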

[0012] Furthermore, the method also includes: Step S5: Deploy the trained slab detection model and multi-target tracking algorithm to an edge server. The edge server reports the real-time location, identity ID, and anomaly diagnosis results of the slab to the slab scheduling system through an industrial communication protocol, and receives control instructions from the scheduling system.

[0013] Furthermore, the method also includes: Step S6: After the model algorithm is deployed, continuously collect the image data corresponding to low confidence detection boxes, tracking missing detection boxes, and the abnormal diagnosis results, and re-label the slab position; when optimizing the model, start the continuous learning mechanism to update the detection model online; wherein the continuous learning mechanism integrates the slab running process rules to constrain the learning direction of the model, and adopts an anti-forgetting strategy to retain historical knowledge.
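The collection rule in Step S6 can be sketched as a simple filter over per-frame records. This is an illustrative sketch under assumptions: the record fields, the function name `select_hard_examples`, and the confidence threshold of 0.5 are hypothetical and are not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class FrameRecord:
    frame_id: int
    min_confidence: float   # lowest detection confidence in the frame
    track_lost: bool        # a tracked slab lost its detection box
    anomaly_flagged: bool   # the diagnosis step reported an anomaly


def select_hard_examples(records, conf_threshold=0.5):
    """Return (frame_id, reasons) for frames worth re-labeling.

    conf_threshold is an assumed value; tune it for the deployed model.
    """
    selected = []
    for rec in records:
        reasons = []
        if rec.min_confidence < conf_threshold:
            reasons.append("low_confidence")
        if rec.track_lost:
            reasons.append("track_lost")
        if rec.anomaly_flagged:
            reasons.append("anomaly")
        if reasons:
            selected.append((rec.frame_id, reasons))
    return selected
```

Keeping the reason alongside each frame makes it easier to prioritize samples later, for example when anomaly-related frames should be learned first.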

[0014] Further, performing small target enhancement on the original image data includes: scaling down the original image data proportionally according to at least one of the preset scaling ratios 30%, 50%, and 70%; scaling the bounding box coordinates of the slab target by the same ratio; padding the perimeter of the scaled image with grayscale pixels to restore it to the original image input size; and translating the bounding box coordinates according to the relative offset of the image within the padded area. Performing occlusion enhancement on the original image data includes: using grid discard enhancement or random cropping enhancement to generate a rectangular occlusion region on the image while keeping the bounding box annotation of the original target unchanged.

[0015] Furthermore, establishing the original image coordinate system of the slab conveyor rollers through multiple composite calibrations, and mapping the detected slab targets to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis based on affine transformation, includes the following. The two edge lines of the slab conveyor roller are manually calibrated. The center line of the roller is fitted from the midpoints of corresponding positions on the two edges, which yields the equation of the standard roller center line and its angle with the horizontal direction of the image. Using the image center as the rotation center, a rotation matrix is constructed from this angle, and the entire image and the detected slab targets are rotated so that the slab targets are mapped to the normalized coordinate system. In the normalized coordinate system, the vertical distance from the center point of the slab to the center line of the roller conveyor is calculated as the position offset; if it exceeds the preset lateral offset threshold, the slab position offset is determined to be abnormal. The angle difference between the running direction of the slab and the direction of the roller conveyor is calculated as the angle offset; if it exceeds the preset angle offset threshold, the slab angle tilt is determined to be abnormal.
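The centerline fitting, rotation, and offset calculations described above can be sketched in plain Python. This is a minimal sketch under assumptions: the function names, the least-squares line fit, the use of first and last trajectory points for the running direction, and the non-vertical centerline are choices made for the example, not details confirmed by the text.

```python
import math


def fit_centerline(edge_a, edge_b):
    """Fit the roller centerline from paired points on the two edge lines.

    Returns (angle_rad, point_on_line). Assumes the line is not vertical.
    """
    mids = [((xa + xb) / 2, (ya + yb) / 2)
            for (xa, ya), (xb, yb) in zip(edge_a, edge_b)]
    n = len(mids)
    mx = sum(x for x, _ in mids) / n
    my = sum(y for _, y in mids) / n
    sxx = sum((x - mx) ** 2 for x, _ in mids)
    sxy = sum((x - mx) * (y - my) for x, y in mids)
    return math.atan(sxy / sxx), (mx, my)   # least-squares slope -> angle


def rotate_point(p, center, angle):
    """Rotate p about center by -angle, so a line at `angle` becomes horizontal."""
    c, s = math.cos(-angle), math.sin(-angle)
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (center[0] + c * dx - s * dy, center[1] + s * dx + c * dy)


def lateral_offset(slab_center, line_point, image_center, angle):
    """Signed distance from the slab center to the centerline (normalized frame)."""
    sc = rotate_point(slab_center, image_center, angle)
    lp = rotate_point(line_point, image_center, angle)
    return sc[1] - lp[1]


def heading_offset_deg(trajectory, image_center, angle):
    """Angle between the slab running direction and the X-axis, in degrees."""
    p0 = rotate_point(trajectory[0], image_center, angle)
    p1 = rotate_point(trajectory[-1], image_center, angle)
    return math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0]))


def is_abnormal(offset, threshold):
    """Compare the magnitude of an offset with a preset threshold."""
    return abs(offset) > threshold
```

Because the centerline is horizontal after rotation, the position offset reduces to a difference of Y coordinates, which is the main benefit of normalizing the coordinate system before measuring.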

[0016] Furthermore, by combining the motion trajectory of the slab target, the positional and angular offsets of each slab are calculated in real time, and trend analysis is performed based on continuous multi-frame data to achieve multi-dimensional slab state analysis and positional offset anomaly diagnosis based on spatiotemporal context, including: By tracking continuous frame data of the same slab, the positional and angular offsets of the slab are analyzed for trends, and a continuous frame window is set. The final alarm is triggered only if the abnormal state continues to exceed the window.
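The consecutive-frame window can be sketched as a per-slab streak counter. This is an illustrative sketch: the class name `PersistenceAlarm` is hypothetical, and raising the alarm once the streak reaches the window length is an assumed reading of the rule.

```python
class PersistenceAlarm:
    """Raise an alarm only after `window` consecutive abnormal frames."""

    def __init__(self, window):
        self.window = window
        self.streak = {}   # slab id -> current run of abnormal frames

    def update(self, slab_id, abnormal):
        # A single normal frame resets the streak, filtering transient spikes.
        n = self.streak.get(slab_id, 0) + 1 if abnormal else 0
        self.streak[slab_id] = n
        return n >= self.window


alarm = PersistenceAlarm(window=3)
flags = [alarm.update(7, a) for a in [True, True, False, True, True, True]]
```

In this usage example, only the last frame triggers the alarm, because the normal third frame resets the count for slab 7.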

[0017] Furthermore, during model optimization, a continuous learning mechanism is initiated to update the detection model online, including: Based on the low-confidence detection box, the target tracking loss detection box, and the anomaly diagnosis results, image data is automatically collected to construct a difficult case database for fine-tuning the lightweight YOLOv8 continuous casting slab detection model. In the process of fine-tuning the lightweight YOLOv8 continuous casting slab detection model, sample weights or loss function terms are designed in combination with the process rules of slab operation, so that the model can learn abnormal samples related to position offset and angle tilt first. When fine-tuning the lightweight YOLOv8 continuous casting slab detection model using newly labeled data, an elastic weight consolidation algorithm or an experience replay mechanism is used to penalize the updates of key weight parameters of old tasks in order to retain historical knowledge.
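For the elastic weight consolidation option, the usual formulation adds a quadratic penalty that is weighted by an importance estimate (commonly the diagonal of the Fisher information) for each parameter. The sketch below shows that penalty on flat parameter lists; the function names and the `lam` strength parameter are assumptions, and how the importance values are estimated is outside the scope of the example.

```python
def ewc_penalty(params, old_params, importance, lam):
    """EWC term: (lam / 2) * sum_i F_i * (theta_i - theta_old_i)^2."""
    return 0.5 * lam * sum(
        f * (p - q) ** 2 for p, q, f in zip(params, old_params, importance)
    )


def fine_tune_loss(task_loss, params, old_params, importance, lam=1.0):
    """Loss on the new hard examples plus the anti-forgetting penalty."""
    return task_loss + ewc_penalty(params, old_params, importance, lam)
```

Parameters with high importance for the old task are pulled strongly toward their previous values, while unimportant ones remain free to adapt to the new hard examples.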

[0018] This invention also provides a continuous casting slab operation status monitoring system based on lightweight YOLOv8 and continuous learning, used to implement the aforementioned method for monitoring the operation status of continuous casting slabs based on lightweight YOLOv8 and continuous learning, comprising:

The data acquisition and processing module is used to acquire and label the original image data of the continuous casting slab conveyor roller table, and perform data augmentation operations on the original image data to construct a training dataset; the data augmentation includes at least small target augmentation and occlusion augmentation.

The model training module is used to construct a lightweight YOLOv8 continuous casting slab detection model. This lightweight YOLOv8 continuous casting slab detection model is obtained based on an improved YOLOv8 model. The improvements include: firstly, replacing the original backbone network with a lightweight StarNet backbone network and designing a feature fusion module based on multi-scale adaptive convolutional kernels on top of BiFPN; secondly, integrating occlusion-aware attention into the DyHead detection head. The constructed lightweight YOLOv8 continuous casting slab detection model is trained based on the training dataset.

The real-time detection and tracking module is used to perform real-time detection of continuous casting slabs using the trained lightweight YOLOv8 continuous casting slab detection model, and to track the detected continuous casting slabs using a multi-target tracking algorithm. Each slab is assigned a unique ID, and the continuous motion trajectory of each slab is drawn.

The position offset anomaly diagnosis module is used to establish the original image coordinate system of the slab conveyor roller through multiple composite calibrations, and to map the detected slab targets to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis based on affine transformation. The center line of the slab conveyor roller after affine transformation is parallel to the X-axis of the normalized coordinate system. Combined with the motion trajectory of the slab targets, the module calculates the position offset and angular offset of each slab in real time, and performs trend analysis based on continuous multi-frame data to realize multi-dimensional slab state analysis and position offset anomaly diagnosis based on spatiotemporal context.

[0019] Furthermore, the system also includes: The integration and deployment module is used to deploy the trained slab detection model and multi-target tracking algorithm to the edge server. The edge server reports the real-time location, identity ID and anomaly diagnosis results of the slab to the slab scheduling system through the industrial communication protocol, and receives control instructions from the scheduling system.

[0020] Furthermore, the system also includes: The continuous learning platform is used to continuously collect low-confidence detection boxes, track missing detection boxes, and image data corresponding to the abnormal diagnosis results after the model algorithm is deployed, and to re-label the slab position; during model optimization, the continuous learning mechanism is activated to update the detection model online; wherein the continuous learning mechanism integrates slab operation process rules to constrain the learning direction of the model, and adopts an anti-forgetting strategy to retain historical knowledge.

[0021] Compared with the prior art, the present invention has the following advantages: 1. This invention significantly reduces the number of model parameters and computational complexity by replacing the original backbone network with the lightweight backbone network StarNet, replacing the Concat connection with the Add operation, and pruning the model. It meets the stringent real-time requirements of industrial sites while ensuring detection accuracy and is easy to deploy on edge devices.

[0022] 2. This invention effectively solves the problems of weak small target features and missed detection caused by occlusion in complex industrial scenarios by using data preprocessing strategies of small target enhancement and occlusion enhancement, combined with the DyHead detection head that incorporates occlusion perception attention. This significantly improves the detection accuracy and robustness of the model under harsh working conditions.

[0023] 3. This invention establishes a mapping relationship between image space and physical space through multiple composite calibration and coordinate system normalization processing. Combined with continuous trajectory data obtained by multi-target tracking technology, it realizes accurate quantitative calculation of slab position offset and angle tilt. Through trend analysis based on continuous frames, it effectively filters instantaneous interference, reduces false alarm rate, and provides a reliable basis for preventing production accidents.

[0024] 4. This invention constructs a continuous learning mechanism to automatically collect difficult example samples and combines them with anti-forgetting strategies to update the model online. This enables the system to adapt to dynamic environmental changes such as changes in lighting and equipment adjustments, solving the problem of performance degradation after long-term model operation and reducing manual maintenance costs.

Description of the Drawings

[0025] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.

[0026] Figure 1 is a schematic diagram of the overall process of the continuous casting slab operation status monitoring method provided in an embodiment of the present invention.

[0027] Figure 2 is a schematic diagram of the overall architecture and data flow of the continuous casting slab operation status monitoring system provided in an embodiment of the present invention.

[0028] Figure 3 is a schematic diagram of the installation position of the industrial camera in the roller conveyor area of the continuous casting production line provided in an embodiment of the present invention.

[0029] Figure 4 is a schematic diagram of the lightweight YOLOv8 detection model structure provided in an embodiment of the present invention.

[0030] Figure 5 is a schematic diagram of coordinate system normalization during the diagnosis of abnormal slab position offset provided in an embodiment of the present invention.

Detailed Implementation

[0031] To enable those skilled in the art to better understand the present invention, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings of the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort should fall within the scope of protection of the present invention.

[0032] Example 1. This embodiment provides a method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning. As shown in Figure 1, this method achieves efficient real-time monitoring of the operating status of continuously cast slabs by constructing a lightweight detection model and combining it with spatiotemporal context analysis. Specifically, it includes the following steps: Step S1: Obtain and label the original image data of the continuous casting slab conveyor roller table, and perform data augmentation operations on the original image data to construct a training dataset; the data augmentation includes at least small target augmentation and occlusion augmentation.

[0033] Specifically, in continuous casting production lines, because industrial cameras have a fixed installation position and a limited viewing angle, slabs at the far end or just entering the field of view occupy a very small proportion of the image and have weak features, making them prone to missed detection by the model. At the same time, frequent operation of large equipment such as overhead cranes can partially or completely occlude the slabs. To address these problems, this embodiment performs targeted data augmentation on the original image data. For small target augmentation, the original image is proportionally reduced according to a preset ratio, and grayscale pixels are added around it to restore it to the model input size, thereby simulating the slab imaging characteristics at different distances and forcing the model to learn the weak features of small targets. For occlusion augmentation, rectangular occlusion areas are generated on the image using grid discard or random cropping to simulate complex occlusion conditions on-site, improving the model's robustness when part of the information is lost. It should be understood that the above scaling ratios and occlusion methods are merely examples, and those skilled in the art can adjust them according to the lighting and occlusion level of the actual scene.

[0034] As a preferred embodiment of the present invention, small target enhancement is performed on the original image data, including the following steps: Step S101: Scale the original image data proportionally using at least one preset scaling factor: 30%, 50%, or 70%. It should be understood that this scaling factor is not randomly selected, but rather preset based on the imaging characteristics of industrial cameras at different viewing distances. For example, a 70% scaling factor simulates the image size of the slab in the middle section of the roller conveyor, where it is relatively close to the camera. A 30% scaling factor simulates the small target state of the slab just entering the edge of the field of view, where it is relatively far from the camera. This multi-scale scaling forces the model to learn the subtle features of the slab at different distances during the training phase, thereby significantly improving the model's detection rate of distant slabs.

[0035] Step S102: Scale the bounding box coordinates of the slab target according to the preset scaling ratio. Specifically, if the original image is reduced to k times its original size, then the top-left corner coordinates and the width and height of the bounding box must also be multiplied by the factor k. This ensures that the annotation boxes correspond to the content of the scaled-down image.

[0036] Step S103: Pad the area around the scaled-down image with grayscale pixels to restore it to the original image input size. This step is crucial. Using the scaled-down image directly without padding would cause an input resolution mismatch, and the slab would occupy too small a proportion of the image relative to the background, introducing unnecessary noise. Padding with grayscale pixels, typically with a value of 128, keeps the model's input size consistent and avoids misleading texture features in the padded area, allowing the model to focus on learning the features of the scaled-down slab.

[0037] Step S104: Update the annotation coordinates by translation based on the relative offset of the image within the filled area. Since the image is usually located at the center of the filled area or a fixed position after being scaled down, the original coordinate origin has shifted. Therefore, it is necessary to perform corresponding translation compensation on the coordinates of the annotation box to ensure the accuracy of the coordinate values.
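Steps S101 to S104 can be sketched end to end on a small grayscale image stored as nested lists. This is a minimal sketch under assumptions: the function names are hypothetical, the scaled image is centered in the padded canvas (the text also allows a fixed position), nearest-neighbor sampling is used for the resize, and boxes are `(x, y, w, h)` with a top-left origin.

```python
GRAY = 128  # padding value mentioned in Step S103


def shrink_and_pad_image(img, k):
    """Scale img by factor k (0 < k <= 1) and pad back to the original size."""
    h, w = len(img), len(img[0])
    new_h, new_w = round(h * k), round(w * k)
    off_y, off_x = (h - new_h) // 2, (w - new_w) // 2   # centered placement
    canvas = [[GRAY] * w for _ in range(h)]
    for r in range(new_h):
        for c in range(new_w):
            src_r = min(h - 1, int(r / k))   # nearest-neighbor sampling
            src_c = min(w - 1, int(c / k))
            canvas[off_y + r][off_x + c] = img[src_r][src_c]
    return canvas, (off_x, off_y)


def shrink_and_shift_boxes(boxes, k, offset):
    """S102 and S104: multiply each box by k, then translate by the pad offset."""
    off_x, off_y = offset
    return [(x * k + off_x, y * k + off_y, w * k, h * k) for x, y, w, h in boxes]


image = [[0] * 4 for _ in range(4)]
padded, offset = shrink_and_pad_image(image, 0.5)
boxes = shrink_and_shift_boxes([(0, 0, 4, 4)], 0.5, offset)
```

In the usage example, a 4×4 image scaled by 0.5 lands in the central 2×2 region, and a box covering the whole original image is mapped onto exactly that region.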

[0038] On the other hand, occlusion enhancement of the original image data includes the following steps: Step S105: Rectangular occlusion regions are generated on the image using either grid discard enhancement or random cropping enhancement. Grid discard enhancement divides the image into a regular grid and randomly discards some grid cells; random cropping enhancement generates rectangular occlusion blocks of varying sizes at random locations in the image. These two methods respectively simulate the occlusion of slabs by structured equipment such as roller conveyor supports and by unstructured objects such as overhead crane hooks and splashing slag in an industrial setting.

[0039] Step S106: While generating the occluded region, the bounding box annotations of the original target remain unchanged. In traditional target detection tasks, if a target is occluded, the annotation box is usually adjusted or it is marked as a hard sample. However, in the continuous casting slab monitoring scenario of this invention, even if the slab is partially occluded, its physical position does not change, and the system still needs to accurately detect and track the slab. Therefore, this embodiment forces the model to accurately regress the complete bounding box position based on remaining visible features such as slab edge texture and local contours, even when some pixel information is lost, by keeping the bounding box annotations unchanged. This training strategy significantly improves the robustness of the model under complex occlusion conditions and effectively reduces the false negative rate caused by overhead crane operations.
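The two occlusion methods in Steps S105 and S106 can be sketched on a nested-list image as follows. This is an illustrative sketch: the function names, the fill value of 0, the maximum rectangle fraction, the cell size, and the drop probability are all assumptions. Neither function touches the bounding box annotations, which matches the requirement that labels stay unchanged.

```python
import random


def random_rect_occlusion(img, rng, max_frac=0.3, fill=0):
    """Blank out one random rectangle; returns (new_img, (left, top, w, h))."""
    h, w = len(img), len(img[0])
    rh = rng.randint(1, max(1, int(h * max_frac)))
    rw = rng.randint(1, max(1, int(w * max_frac)))
    top = rng.randint(0, h - rh)
    left = rng.randint(0, w - rw)
    out = [row[:] for row in img]          # copy, leave the input untouched
    for r in range(top, top + rh):
        for c in range(left, left + rw):
            out[r][c] = fill
    return out, (left, top, rw, rh)


def grid_discard(img, rng, cell=2, drop_prob=0.25, fill=0):
    """Split the image into cell x cell blocks and drop each with drop_prob."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for top in range(0, h, cell):
        for left in range(0, w, cell):
            if rng.random() < drop_prob:
                for r in range(top, min(h, top + cell)):
                    for c in range(left, min(w, left + cell)):
                        out[r][c] = fill
    return out
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible, which helps when a specific occluded sample needs to be inspected later.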

[0040] By combining small target enhancement and occlusion enhancement, the diversity and challenge of the training dataset are significantly improved. The model pre-considers various extreme working conditions during training, enabling it to handle the complex and ever-changing environment of the continuous casting production line during actual deployment, thus laying a solid data foundation for subsequent high-precision detection and tracking.

[0041] Step S2: Construct a lightweight YOLOv8 continuous casting slab detection model. The lightweight YOLOv8 continuous casting slab detection model is obtained by improving the YOLOv8 model. The improvement includes: on the one hand, replacing the original backbone network with a lightweight backbone network StarNet, and designing a feature fusion module based on multi-scale adaptive convolution kernels on the basis of BiFPN; on the other hand, incorporating occlusion awareness attention into the DyHead detection head; and training the constructed lightweight YOLOv8 continuous casting slab detection model based on the training dataset.

[0042] Specifically, although the general-purpose YOLOv8 model achieves high accuracy, its parameter count and computational complexity make it difficult to meet the real-time inference requirements of industrial edge devices. This embodiment therefore applies a deep lightweight redesign to the model architecture, as shown in Figure 4. First, the original backbone network is replaced with the lightweight backbone network StarNet. StarNet introduces the StarBlock module, which uses depthwise convolution (DWConv) to extract features and reduces redundant connections between channels through its star-shaped topology. Compared with standard convolution, depthwise convolution convolves each input channel separately, which greatly reduces the number of parameters and the computational complexity, so the backbone becomes lightweight without significantly sacrificing accuracy. Second, a BiFPN structure replaces the traditional FPN+PAN structure of the neck network, and a feature fusion module, IRB_MSCB, based on multi-scale adaptive convolution kernels is designed on top of BiFPN. Input features first undergo channel expansion through a 1×1 pointwise convolution; spatial and contextual features are then extracted in parallel through depthwise convolution (DWConv), which keeps the number of parameters and the computational complexity low. A channel shuffle operation is introduced to break the information barrier between channels caused by depthwise convolution. Combined with a global heterogeneous kernel selection mechanism, the receptive field is allocated dynamically according to the depth of the feature pyramid level (small convolution kernels for shallow layers, large convolution kernels for deep layers), achieving adaptive feature capture.
Furthermore, provided that the spatial resolution and channel dimension of the input feature tensors are fully aligned, an Add operation performs element-wise addition. This fusion method avoids the feature parameter redundancy caused by conventional Concat concatenation and implicitly implements adaptive feature weighting, allowing the model to focus on extracting key features of the continuous casting slab. A 1×1 pointwise convolution then reduces the channel dimension for output. Third, an occlusion-aware attention mechanism is integrated into the DyHead detection head. The DyHead detection head already contains multiple attention mechanisms; this embodiment further introduces occlusion-aware attention, which generates robust attention maps through exponential normalization, strengthens the feature response of unoccluded areas, and suppresses noise interference in occluded areas, thereby improving the model's detection accuracy in occluded scenarios. Through these improvements, the lightweight YOLOv8 continuous casting slab detection model maintains high accuracy while significantly reducing computational cost, laying the foundation for subsequent edge deployment.
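The parameter saving attributed to depthwise convolution can be checked with simple arithmetic. The sketch below compares a standard convolution with a depthwise convolution followed by a 1×1 pointwise convolution; the channel counts (64 in, 128 out) and the 3×3 kernel are illustrative assumptions and are not taken from the patent.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = c_in * k * k      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 convolution mixes the channels
    return depthwise + pointwise

standard = conv_params(64, 128, 3)                  # 64 * 128 * 9 = 73728
separable = depthwise_separable_params(64, 128, 3)  # 576 + 8192 = 8768
ratio = separable / standard                        # roughly 0.12
```

Under these assumed sizes the depthwise variant needs roughly one eighth of the weights, which is the kind of reduction the lightweight backbone and the IRB_MSCB module rely on.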

[0043] Step S3: The trained lightweight YOLOv8 continuous casting slab detection model is used to detect the continuous casting slab in real time, and the detected continuous casting slab is tracked by a multi-target tracking algorithm. Each slab is assigned a unique ID, and the continuous motion trajectory of each slab is drawn.

[0044] Specifically, the trained model is deployed on the computing unit to perform frame-by-frame inference on the video stream captured by the industrial camera, outputting the bounding box positions of the slab. Subsequently, multi-object tracking algorithms (such as OCSORT or DeepSORT) are used to associate the detection boxes in consecutive frames. For a newly entered slab, the system automatically assigns a globally unique ID and continuously records its position changes in the image coordinate system, forming a continuous motion trajectory. This trajectory data is not only used for real-time monitoring but also provides crucial temporal information for subsequent anomaly diagnosis.
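To illustrate how detections in consecutive frames can be linked to persistent IDs, the sketch below uses a greedy IoU matcher. This is a deliberately simplified stand-in and is not OCSORT or DeepSORT: it has no motion model, no appearance features, and never retires lost tracks. The class name, the IoU threshold of 0.3, and the `(x1, y1, x2, y2)` box format are assumptions for the example.

```python
from itertools import count

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

class GreedyTracker:
    """Assigns IDs by matching each detection to the best-overlapping track."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.last_box = {}   # track id -> most recent box
        self.tracks = {}     # track id -> list of box centers (trajectory)
        self._ids = count(1)

    def update(self, detections):
        assigned = {}
        free = set(self.last_box)  # tracks not yet matched in this frame
        for det in detections:
            best = max(free, key=lambda t: iou(self.last_box[t], det), default=None)
            if best is None or iou(self.last_box[best], det) < self.iou_threshold:
                best = next(self._ids)       # newly entered slab: new unique ID
                self.tracks[best] = []
            else:
                free.discard(best)
            self.last_box[best] = det
            self.tracks[best].append(((det[0] + det[2]) / 2, (det[1] + det[3]) / 2))
            assigned[best] = det
        return assigned
```

The per-ID list of centers in `tracks` plays the role of the continuous motion trajectory that the later offset and trend analysis consume.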

[0045] Step S4: Through multiple composite calibrations, establish the original image coordinate system of the slab conveyor rollers, and map the detected slab targets to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis based on affine transformation. The center line of the slab conveyor rollers after affine transformation is parallel to the X-axis of the normalized coordinate system. Combined with the motion trajectory of the slab targets, calculate the position offset and angle offset of each slab in real time, and perform trend analysis based on continuous multi-frame data to realize multi-dimensional slab state analysis and position offset anomaly diagnosis based on spatiotemporal context.

[0046] Specifically, due to the deviation in the installation angle of industrial cameras, the roller conveyor in the original image often appears tilted, and directly calculating the offset in the image coordinate system will produce a large error. This embodiment uses a multi-layer composite calibration technique to first fit the centerline of the roller conveyor in the original image and calculate its angle with the horizontal direction of the image; then, a rotation matrix is constructed, and affine transformations are performed on the image and target coordinates to establish a normalized coordinate system. In this coordinate system, the centerline of the roller conveyor is parallel to the X-axis, and the physical movement direction of the slab is corrected to the horizontal direction, thereby eliminating the geometric distortion caused by the viewing angle. On this basis, the system calculates, in real time, the vertical distance from the center point of the slab to the centerline of the roller conveyor as the position offset, and the angle difference between the slab's running direction and the roller conveyor direction as the angle offset. More importantly, this embodiment introduces a trend analysis mechanism based on spatiotemporal context. Single-frame detection may produce false alarms due to noise or instantaneous interference, so the system sets a continuous frame window to perform time-series analysis on the position and angle offsets of the slab. The final alarm is triggered only when the abnormal state persists within this window. This mechanism effectively filters out transient interference, significantly reduces the false alarm rate, and enables accurate diagnosis of abnormal states such as continuous deviation or tilting of the slab.

[0047] The following section provides a detailed explanation of the specific geometric algorithms for coordinate system normalization and offset calculation.

[0048] In industrial settings, due to camera installation angle deviations, the roller conveyor in the image often appears tilted. Directly calculating the slab offset in the original image coordinate system introduces significant geometric errors. Therefore, this embodiment establishes a precise mapping relationship between image space and physical space through multiple composite calibrations and coordinate system normalization. Specifically, through multiple composite calibrations, the original image coordinate system of the slab conveyor roller conveyor is established, and based on affine transformation, the detected slab target is mapped to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis. The centerline of the slab conveyor roller conveyor after the affine transformation is parallel to the X-axis of the normalized coordinate system. This process includes the following steps: Step S401: Manually calibrate the two edge lines of the slab conveyor rollers. The roller centerline is fitted by calculating the midpoints at corresponding positions on both sides of the rollers to obtain the standard roller centerline equation and its angle with the horizontal direction of the image. Specifically, the operator marks several key points along both edges of the rollers in the image using a human-machine interface, and fits the equations of the two edge lines using the least squares method. To eliminate fitting deviations caused by wear or stains on one side of the edge, this embodiment adopts a midpoint fitting strategy. That is, equally spaced corresponding points are selected on the two edge lines, the midpoint of each pair of corresponding points is calculated, and then these midpoints are used to fit the roller centerline. It should be understood that the midpoint fitting here is not a simple averaging, but based on the principle of geometric symmetry, assuming the roller structure is symmetrical. 
This effectively filters out noise caused by local edge anomalies through midpoint calculation, improving the robustness of the centerline equation. The final standard roller centerline equation is usually expressed in the form Ax + By + C = 0, from which the angle θ between the centerline and the horizontal direction of the image can be calculated.
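The midpoint fitting of step S401 can be sketched as below. The sketch assumes that the roller edges are not vertical in the image, that "corresponding points" are points sharing the same x coordinate, and that an ordinary least-squares fit of y = m·x + b is acceptable; the function names and the number of samples are hypothetical.

```python
import math

def fit_line(points):
    """Least-squares fit of y = m*x + b through (x, y) points; returns (m, b)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, (sy - m * sx) / n

def fit_centerline(edge_a, edge_b, samples=10):
    """Fit both edges, take midpoints at equally spaced x, fit the centerline.

    Returns ((A, B, C), theta) with A*x + B*y + C = 0 and theta in radians.
    """
    ma, ba = fit_line(edge_a)
    mb, bb = fit_line(edge_b)
    xs = [x for x, _ in edge_a + edge_b]
    x0, x1 = min(xs), max(xs)
    midpoints = []
    for i in range(samples):
        x = x0 + (x1 - x0) * i / (samples - 1)
        midpoints.append((x, ((ma * x + ba) + (mb * x + bb)) / 2))
    m, b = fit_line(midpoints)
    # y = m*x + b  is equivalent to  m*x - y + b = 0
    return (m, -1.0, b), math.atan(m)

edge_top = [(0, 10), (100, 20), (200, 30)]      # operator-marked key points
edge_bottom = [(0, 50), (100, 60), (200, 70)]
(A, B, C), theta = fit_centerline(edge_top, edge_bottom)
```

For these toy points both edges have slope 0.1, so the fitted centerline is y = 0.1x + 30 and θ = arctan(0.1).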

[0049] Step S402: Using the image center as the rotation center, a rotation matrix M is constructed from the included angle θ, and the entire image and the detected slab targets are rotated so that the slab targets are mapped to the normalized coordinate system. Specifically, since the camera optical axis is not necessarily perpendicular to the roller conveyor plane, perspective distortion occurs in the image. However, because the length of the roller conveyor in the image is much greater than its width, the perspective distortion in the local area of the roller conveyor can be approximated as an overall tilt. Therefore, this embodiment constructs the rotation matrix M and performs an affine transformation on the image. The purpose of the rotation is to level the centerline of the inclined roller conveyor so that it is parallel to the X-axis of the normalized coordinate system. This step transforms the complex unstructured scene into a standard geometric problem, allowing subsequent offset calculations to apply the point-to-line distance formula directly, which greatly simplifies the calculation logic and improves accuracy.
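A minimal sketch of the rotation in step S402, applied to point coordinates only. It assumes θ is the centerline angle measured in image coordinates (as returned by an arctangent of the fitted slope) and builds a 2×3 affine matrix that rotates by −θ about the image center; the function names are hypothetical, and in practice an image library would apply the same matrix to the pixels.

```python
import math

def rotation_about_center(theta, cx, cy):
    """2x3 affine matrix M that rotates points by -theta around (cx, cy)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, cx - c * cx - s * cy],
            [-s, c, cy + s * cx - c * cy]]

def transform(m, x, y):
    """Apply the 2x3 affine matrix to a single point."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# Toy case: centerline y = 0.1*x + 30 passing through the image center (100, 40).
theta = math.atan(0.1)
M = rotation_about_center(theta, 100, 40)
left = transform(M, 0, 30)      # a point on the tilted centerline
right = transform(M, 200, 50)   # another point on the tilted centerline
```

After the transform both centerline points share the same Y value, so the centerline is parallel to the X-axis. If the centerline does not pass through the image center, it is still leveled, but its constant Y value must be read off from a transformed centerline point.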

[0050] Step S403: In the normalized coordinate system, calculate the vertical distance from the slab center point to the roller conveyor centerline as the position offset; if it exceeds a preset lateral offset threshold, the slab position offset is determined to be abnormal. Calculate the angle difference between the slab running direction and the roller conveyor direction as the angle offset; if it exceeds a preset angle offset threshold, the slab angle tilt is determined to be abnormal. Specifically, after the normalization in step S402, the roller conveyor centerline is parallel to the X-axis, so it is described by a constant Y coordinate. Given the center-point coordinates of the slab bounding boxes detected in the previous and current frames, the position offset is the absolute difference between the Y coordinate of the slab center point and the Y coordinate of the centerline. For the angular offset, the system uses the multi-target tracking algorithm to obtain the position of the same slab in consecutive frames and calculates the slab's motion vector; the angular offset is the angle between this motion vector and the X-axis of the normalized coordinate system. It should be understood that the angular offset reflects whether the slab has deviated or tilted, and is a key indicator for determining whether the slab will collide with equipment at the side of the roller conveyor. For example, if the positional offset is within the limit but the angular offset keeps increasing, the slab is drifting at an angle, and without timely intervention a collision can occur within seconds. Therefore, this embodiment achieves early warning of potential accidents by introducing angular dimension diagnosis.
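The two offsets of step S403 can be sketched as follows, assuming the centerline has already been leveled to a constant Y value. The function names, the unsigned-angle convention, and the threshold values are illustrative assumptions; the patent only states that preset thresholds exist.

```python
import math

def position_offset(center, centerline_y):
    """Vertical distance from the slab center to the leveled centerline."""
    return abs(center[1] - centerline_y)

def angle_offset(prev_center, curr_center):
    """Unsigned angle in degrees between the motion vector and the X-axis."""
    dx = curr_center[0] - prev_center[0]
    dy = curr_center[1] - prev_center[1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))

def diagnose(prev_center, curr_center, centerline_y,
             max_offset=20.0, max_angle=5.0):   # hypothetical thresholds
    """Return per-frame flags for position and angle anomalies."""
    return {
        "position_abnormal": position_offset(curr_center, centerline_y) > max_offset,
        "angle_abnormal": angle_offset(prev_center, curr_center) > max_angle,
    }

flags = diagnose(prev_center=(100, 42), curr_center=(140, 52), centerline_y=40)
```

In this toy frame pair the slab is only 12 pixels from the centerline, yet its motion vector is tilted by about 14 degrees, so only the angle flag is raised. This is the early-warning case described above.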

[0051] Furthermore, by combining the motion trajectory of the slab target, the positional and angular offsets of each slab are calculated in real time, and trend analysis is performed based on continuous multi-frame data to achieve multi-dimensional slab state analysis and positional offset anomaly diagnosis based on spatiotemporal context, including: Step S404: By tracking continuous frame data of the same slab, trend analysis is performed on the positional and angular offsets of the slab, and a continuous frame window is set. The final alarm is triggered only when the abnormal state continues to exceed this window. Specifically, the industrial environment is complex. Splashing steel slag, instantaneous light and shadow fluctuations, and the rapid passing of overhead crane hooks can all cause brief false anomalies in the single-frame detection results. If the alarm is triggered based solely on single-frame data, it will result in an extremely high false alarm rate, seriously affecting the reliability of production scheduling. This embodiment introduces a time-series trend analysis mechanism, setting a time window (e.g., 3-5 seconds). The system maintains a sliding window queue, storing the offset data of the most recent N frames corresponding to the current slab ID. Only when all frames within the window (or frames exceeding a preset proportion) are determined to be in an abnormal state does the system confirm that the slab is in a true deviation or tilt state and trigger an alarm. This logic based on spatiotemporal context essentially utilizes the principle of the continuity of physical motion—true deviation is a continuous process, while noise interference is often instantaneous. Through this mechanism, this embodiment effectively filters out transient interference, significantly reduces the false alarm rate, and ensures the reliability of diagnostic results.
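The sliding-window confirmation of step S404 can be sketched with one bounded queue per slab ID. The class name and the `min_ratio` parameter are assumptions; a window of N frames would correspond to the 3-5 second time window multiplied by the camera frame rate, which the patent does not specify.

```python
from collections import defaultdict, deque

class TrendAlarm:
    """Raise an alarm only when anomalies persist across a full frame window."""

    def __init__(self, window, min_ratio=1.0):
        self.window = window          # N most recent frames kept per slab ID
        self.min_ratio = min_ratio    # 1.0 = all frames must be abnormal
        self.history = defaultdict(lambda: deque(maxlen=window))

    def update(self, slab_id, abnormal):
        frames = self.history[slab_id]
        frames.append(bool(abnormal))
        if len(frames) < self.window:
            return False              # not enough evidence yet
        return sum(frames) / self.window >= self.min_ratio

alarm = TrendAlarm(window=3)
per_frame = [True, False, True, True, True]     # single-frame diagnoses
results = [alarm.update(slab_id=1, abnormal=f) for f in per_frame]
# results == [False, False, False, False, True]
```

The isolated abnormal frame at the start never triggers an alarm; only the final run of three consecutive abnormal frames does. Lowering `min_ratio` implements the "frames exceeding a preset proportion" variant.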

[0052] By combining the above multiple composite calibration, coordinate system normalization, and time-series trend analysis, this embodiment constructs a high-precision and highly robust algorithm for diagnosing slab position offset anomalies, providing a solid technical guarantee for the safe operation of continuous casting production lines.

[0053] In a preferred embodiment of the present invention, the above method further includes: Step S5: Deploy the trained slab detection model and multi-target tracking algorithm to an edge server. The edge server reports the real-time location, identity ID, and anomaly diagnosis results of the slab to the slab scheduling system through an industrial communication protocol, and receives control instructions from the scheduling system.

[0054] In industrial settings, a high-precision model alone is insufficient; it is also necessary to address how to run the model efficiently on edge devices and how to adapt to dynamic environmental changes. Specifically, the trained slab detection model and multi-target tracking algorithm are deployed to an edge server. The edge server reports the real-time location, identity ID, and anomaly diagnosis results of the slab to the slab scheduling system via industrial communication protocols and receives control commands from the scheduling system. At the hardware level, considering the space constraints and harsh environment (high temperature, heavy dust) of the continuous casting production line, the edge server is typically deployed in a nearby electrical control cabinet. At the software level, to meet the speed requirements of real-time detection, this embodiment uses TensorRT to accelerate and optimize the trained model. TensorRT significantly reduces model inference latency through techniques such as layer fusion, precision calibration, and kernel auto-tuning. It should be understood that although this embodiment preferably uses TensorRT, similar acceleration can be achieved on other computing platforms using inference engines such as OpenVINO or TensorFlow Lite. At the communication level, the edge server acts as a data bridge, interacting with the slab scheduling system via standard industrial protocols such as OPC UA or Modbus TCP. The OPC UA protocol, with its strong security and cross-platform capabilities, is often used to transmit complex structured data (such as slab IDs and trajectory coordinates), while the Modbus TCP protocol is more suitable for transmitting simple register data (such as alarm signals).
Through this deployment architecture, the system constructs a rapid perception and control closed loop from sensing and analysis to execution, ensuring that abnormal alarms can be fed back to the scheduling system in milliseconds, thereby enabling timely intervention in production.

[0055] In a preferred embodiment of the present invention, the above method further includes: Step S6: After the model algorithm is deployed, continuously collect the image data corresponding to low confidence detection boxes, tracking missing detection boxes, and the abnormal diagnosis results, and re-label the slab position; when optimizing the model, start the continuous learning mechanism to update the detection model online; wherein the continuous learning mechanism integrates the slab running process rules to constrain the learning direction of the model, and adopts an anti-forgetting strategy to retain historical knowledge.

[0056] The industrial environment is dynamic, with factors such as varying lighting conditions throughout the day and seasons, changes in background texture due to roller wear, and the introduction of new steel grades into production. These factors can all cause model performance to degrade over time. This embodiment addresses this issue through a continuous learning mechanism, specifically including the following steps: Step S601: Based on low-confidence detection boxes, target tracking loss detection boxes, and anomaly diagnosis results, image data is automatically collected to construct a difficult example database for fine-tuning the lightweight YOLOv8 continuous casting slab detection model. Specifically, a data collection process runs in the background, setting difficult example selection rules. For example, when the confidence of the bounding box output by the detection model is lower than a preset threshold (e.g., 0.6), it indicates that the model has uncertainty about the target, and the frame image is marked as a low-confidence difficult example; when the multi-target tracking algorithm loses the trajectory of a certain ID in consecutive frames, it indicates that the target may have been occluded or deformed, and the frame image is marked as a tracking loss difficult example; when the system triggers a position offset anomaly alarm, the frame image is marked as an abnormal scene difficult example. This image data is automatically stored in the difficult example database, along with metadata such as timestamps and camera IDs, providing high-value training material for subsequent model optimization.
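The difficult-example selection rules of step S601 can be sketched as a small classification function. The 0.6 confidence threshold is the example value given above; the function names, the tag strings, and the record layout are hypothetical.

```python
import time

def classify_hard_example(confidences, lost_track_ids, alarm_triggered,
                          conf_threshold=0.6):
    """Return the list of difficult-example tags that apply to one frame."""
    reasons = []
    if any(c < conf_threshold for c in confidences):
        reasons.append("low_confidence")      # model is uncertain about a box
    if lost_track_ids:
        reasons.append("tracking_loss")       # an ID vanished from the trajectory
    if alarm_triggered:
        reasons.append("abnormal_scene")      # offset anomaly alarm was raised
    return reasons

def build_record(frame_path, camera_id, reasons, timestamp=None):
    """Metadata stored next to the image in the difficult-example database."""
    return {
        "frame": frame_path,
        "camera_id": camera_id,
        "timestamp": time.time() if timestamp is None else timestamp,
        "reasons": reasons,
    }

reasons = classify_hard_example([0.91, 0.55], lost_track_ids=[], alarm_triggered=False)
record = build_record("frame_000123.jpg", camera_id=3, reasons=reasons, timestamp=0.0)
```

A frame is stored only when `reasons` is non-empty, so routine frames with confident detections and intact tracks are skipped.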

[0057] Step S602: During the fine-tuning of the lightweight YOLOv8 continuous casting slab detection model, sample weights or loss function terms are designed in conjunction with the process rules of slab operation, allowing the model to prioritize learning abnormal samples related to positional deviation and angular tilt. Traditional continuous learning often treats all samples equally, but in the continuous casting slab monitoring scenario, different types of samples have different importance. For example, although slab samples that have deviated or tilted are few in number, they are crucial to production safety. This embodiment integrates process rules into the model training process. Specifically, the process rules can be defined as follows: the weight coefficient of deviation samples is α (α > 1), the weight coefficient of tilt samples is β (β > 1), and the weight coefficient of normal samples is 1. When calculating the loss function, the model imposes a greater penalty on the prediction error of high-weight samples, thus forcing the model to prioritize these key samples during optimization. It should be understood that this sample weight design based on process rules essentially transforms the experiential knowledge of human experts into mathematical constraints that the model can use, effectively addressing the extreme imbalance between positive and negative samples and the scarcity of key samples in industrial scenarios.
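The process-rule weighting of step S602 can be sketched as a weighted average of per-sample losses. The values α = 3 and β = 2, the label strings, and the normalization by the sum of weights are assumptions made for the example; the patent only requires α > 1 and β > 1.

```python
def sample_weight(label, alpha=3.0, beta=2.0):
    """Weight alpha for deviation samples, beta for tilt samples, 1 otherwise."""
    return {"deviation": alpha, "tilt": beta}.get(label, 1.0)

def weighted_loss(losses, labels, alpha=3.0, beta=2.0):
    """Weighted mean of per-sample losses, normalized by the total weight."""
    weights = [sample_weight(lbl, alpha, beta) for lbl in labels]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

# One normal sample with loss 1.0 and one deviation sample with loss 3.0:
# (1 * 1.0 + 3 * 3.0) / (1 + 3) = 2.5, versus an unweighted mean of 2.0.
loss = weighted_loss([1.0, 3.0], ["normal", "deviation"])
```

The deviation sample contributes three times as much to the gradient as the normal sample, which is how the rare but safety-critical cases are prioritized.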

[0058] Step S603: When fine-tuning the lightweight YOLOv8 continuous casting slab detection model using newly labeled data, an elastic weight consolidation algorithm, optionally combined with an experience replay mechanism, is employed to penalize updates to weight parameters that are key to old tasks, thus preserving historical knowledge. The biggest challenge in continuous learning is that when the model learns new data, it may overwrite previously learned knowledge, leading to performance degradation in older scenarios. This embodiment uses an elastic weight consolidation algorithm to alleviate this problem. The core idea of this algorithm is to calculate the importance of model parameters to older tasks, and then apply a regularization constraint to the important parameters during fine-tuning, limiting their update magnitude and thereby fixing historical knowledge in the model. In this way, knowledge with a higher degree of fixation is less likely to be forgotten when learning new knowledge. Furthermore, an experience replay mechanism can be combined: while fine-tuning on new data, a portion of old data is randomly mixed in for joint training to consolidate memory. Through these anti-forgetting strategies, the model can adapt to new environments while maintaining its ability to recognize old working conditions, achieving online model evolution.
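The anti-forgetting strategies of step S603 can be sketched on flat parameter lists. The penalty follows the usual elastic weight consolidation form, (λ/2)·Σ Fᵢ(θᵢ − θᵢ*)², where Fᵢ is an importance estimate for parameter i on the old task. The function names, the value of λ, and the replay sampling scheme are assumptions; how importance is estimated is not specified in the patent.

```python
import random

def ewc_penalty(params, old_params, importance, lam):
    """Quadratic penalty that grows when important parameters drift."""
    return 0.5 * lam * sum(f * (p - q) ** 2
                           for p, q, f in zip(params, old_params, importance))

def total_loss(new_task_loss, params, old_params, importance, lam=100.0):
    """Fine-tuning objective: new-data loss plus the consolidation penalty."""
    return new_task_loss + ewc_penalty(params, old_params, importance, lam)

def replay_batch(new_samples, old_samples, replay_count, rng):
    """Mix a random subset of old samples into the new fine-tuning data."""
    picked = rng.sample(list(old_samples), min(replay_count, len(old_samples)))
    return list(new_samples) + picked

# The first parameter moved by 1.0 and is important (F = 4); the second is unchanged.
penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 9.0], lam=2.0)   # 4.0
batch = replay_batch(["new_a", "new_b"], ["old_1", "old_2", "old_3"], 2,
                     random.Random(0))
```

Parameters with large importance are held close to their old values, while unimportant ones remain free to adapt to the new data.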

[0059] By organically combining the aforementioned edge deployment with a continuous learning mechanism, this embodiment constructs an intelligent monitoring system with adaptive capabilities. The system can not only perceive the slab status in real time at the edge, but also continuously optimize itself through a background continuous learning platform, forming a complete closed loop of perception-evolution. This significantly reduces manual maintenance costs and extends the system's lifespan.

[0060] Embodiment 2: This embodiment provides a continuous casting slab operation status monitoring system based on lightweight YOLOv8 and continuous learning. This system is used to implement the monitoring method described in Embodiment 1. As shown in Figure 2, the system is physically divided into three layers: the field data acquisition layer, the edge computing layer, and the platform and scheduling system layer. These layers work together to construct two core mechanisms: the perception and control closed loop and the model optimization closed loop.

[0061] Specifically, the on-site data acquisition layer, as the system's sensing front end, mainly includes a data acquisition and processing module. This module consists of multiple industrial cameras deployed above the roller conveyor of the continuous casting production line. As shown in Figure 3, the industrial cameras are typically high-resolution line-scan or area-scan cameras, fixed directly above the roller conveyor via brackets so that their fields of view cover the entire roller conveyor cross-section without overlap or blind spots. The cameras transmit the acquired raw image data to the edge computing layer in real time via gigabit Ethernet or optical fiber. It should be understood that the number and installation location of the cameras need to be adjusted according to the length of the roller conveyor and the lighting conditions on site; for example, high-brightness flash light sources may be added in areas with insufficient lighting. After acquiring the raw images, the data acquisition and processing module is also responsible for performing the data augmentation operation described in step S1, constructing a training dataset to provide a data foundation for model training. The edge computing layer is the core processing unit of the system, deployed in an edge server on site, and mainly includes a real-time detection and multi-target tracking module, a position offset anomaly diagnosis module, a difficult case collection module, and a system integration and interface module. The real-time detection and multi-target tracking module loads a lightweight YOLOv8 continuous casting slab detection model accelerated and optimized by TensorRT. It performs real-time decoding and inference on the input video stream, outputting the bounding box position of the slab. Combined with multi-target tracking algorithms such as OCSORT, it assigns a unique ID and continuous motion trajectory to each slab.
The position offset anomaly diagnosis module, based on preset calibration parameters, executes the multi-composite calibration and coordinate system normalization algorithm described in step S4 to calculate the slab's position and angle offsets in real time and perform trend analysis. The difficult case collection module is responsible for automatically capturing image frames and storing them in the difficult case database according to preset rules such as low-confidence detection boxes, missing tracking detection boxes, and anomaly alarms. The system integration and interface module is responsible for reporting the detection results and diagnostic information to the slab scheduling system via industrial communication protocols such as OPC UA or Modbus TCP. Through this architecture, the entire link from image acquisition and inference detection to anomaly alarm reporting is completed at the edge, ensuring millisecond-level response to production anomalies. The platform and scheduling system layer mainly includes a continuous learning platform and an external slab scheduling system. The continuous learning platform is responsible for executing the continuous learning mechanism described in step S6. This module periodically extracts image data corresponding to low-confidence, tracking loss, and anomaly alarms from the edge server's difficult example database. After manual review and annotation, it initiates an incremental learning process to fine-tune the model and updates the optimized model back to the edge server. This process enables the system to adapt to environmental changes and continuously evolve. The slab scheduling system, as the execution center, receives slab position, identification ID, and anomaly diagnosis results reported by the edge server, displays them visually, and issues control commands to the roller conveyor or overhead crane when an anomaly occurs, or reminds manual verification, forming a control closed loop.

[0062] Through the aforementioned layered architecture and dual closed-loop design, the system in this embodiment not only realizes real-time monitoring and anomaly diagnosis of the continuous casting slab's operating status, but also solves the problem of performance degradation after long-term operation of the industrial model through a continuous learning mechanism, providing a reliable integrated hardware and software solution for the intelligent production of steel enterprises.

[0063] To verify the practical application effect of the continuous casting slab operation status monitoring method and system based on lightweight YOLOv8 and continuous learning provided in this invention, this embodiment constructs a test environment close to a real industrial site. The implementation process of the continuous casting slab operation status monitoring method and system based on lightweight YOLOv8 and continuous learning is described in detail. This solution effectively overcomes the problems of large model computation, poor detection effect on small targets and occluded targets, lack of systematic slab position displacement anomaly diagnosis function, and weak model self-evolution ability in existing technologies.

[0064] As shown in Figure 1, the overall process of the method provided in this embodiment includes six major steps: dataset construction, model training, fusion of the detection model and tracking algorithm, offset anomaly diagnosis, system deployment, and continuous learning and updating. Figure 2 further shows the system's overall architecture and data flow, which can be logically divided into three layers: the field data acquisition layer, the edge computing layer, and the platform and scheduling system layer. The layers work together to form a fast perception and control closed loop and a slow model optimization closed loop, ensuring that the system has both real-time response and long-term adaptive capabilities.

[0065] As shown in Figure 3, the on-site data acquisition layer is positioned above the slab conveyor rollers of the continuous casting production line and comprises six high-resolution industrial cameras installed in sections so that the full field of view is covered without overlap. The cameras transmit the acquired real-time video streams to the edge computing layer via industrial Ethernet, providing raw data for subsequent processing.

[0066] The edge computing layer is deployed in the edge server on site and is the core processing unit of the system. It includes a real-time detection and multi-target tracking module, a position offset anomaly diagnosis module, a difficult case collection module, and a system integration and interface module.

[0067] Specifically, the real-time detection and multi-target tracking module loads a lightweight YOLOv8 slab detection model accelerated by TensorRT, performs real-time decoding and inference on the input continuous casting production line monitoring video stream, detects the slab position, and integrates it with a multi-target tracking algorithm to generate a unique identity ID and continuous motion trajectory for each slab. The position offset anomaly diagnosis module normalizes the coordinates of detected slab targets based on a preset original image coordinate system, and calculates their position and angular offset by combining the slab ID and motion trajectory. An alarm is triggered when the number of abnormal frames reaches a set threshold. The difficult case collection module automatically captures image frames and stores them in a difficult case database according to preset rules such as low-confidence detection boxes, missing tracking detection boxes, and anomaly alarms, providing material for subsequent model optimization. The system integration and interface module reports the real-time slab position, identity ID, and anomaly diagnosis results to the slab scheduling system in real time via industrial protocols such as OPC UA or Modbus TCP.

[0068] The platform and scheduling system layer mainly includes a slab scheduling system and a continuous learning platform (model optimization module). The slab scheduling system receives real-time slab location, identification ID, and anomaly diagnosis results reported by the edge computing layer and displays them visually. Upon receiving an anomaly alarm, it issues control commands to the roller conveyor or overhead crane, or prompts manual verification, forming a control closed loop. The continuous learning platform periodically extracts data from the difficult example database of the edge server, filters high-value samples through an active learning strategy, and initiates an incremental learning process to fine-tune the model after manual review and annotation. The optimized new model is distributed back to the edge server for online updates, forming a model optimization closed loop.

[0069] The solution of this application example and its effects are explained in detail below, step by step.

[0070] First, data acquisition and augmentation dataset construction were carried out. Six high-resolution industrial cameras were installed in sections above the slab conveyor rollers of the continuous casting production line to ensure full coverage and no overlap in the field of view. Image data was collected over a period of one year, covering different time periods (day / night) and seasonal changes in lighting, resulting in 15,000 raw images.

[0071] To enhance the model's generalization ability in slab detection under different shooting angles, 500 images were extracted from each camera and merged to form a unified dataset of 3,000 images. LabelImg was used to annotate this dataset with three annotation categories: slab, crane, and crane-with-slab (a crane gripping a slab). The slab is the core detection target of this invention; the crane and the crane gripping a slab are auxiliary detection targets. Annotating them not only helps the model locate the slab more accurately but also provides crucial contextual information for the subsequent multi-target tracking task and for interaction with the slab scheduling system.

[0072] Specifically, to address the issue of slabs appearing as small targets in images due to the camera's shooting angle, the annotation process focused on increasing the number and accuracy of annotations for such small-scale slabs to enhance the model's ability to detect small targets. After annotation, the entire dataset was divided into a training set (1800 images), a validation set (600 images), and a test set (600 images) in a 6:2:2 ratio.

[0073] In addition, to improve the robustness of the model in complex scenarios, the following two data augmentations are performed on the training set: Small Target Enhancement: Based on preliminary experiments, three scaling ratios of 30%, 50%, and 70% are selected to proportionally reduce the original image as a whole, and the width, height, and center point coordinates of the original bounding boxes are also scaled proportionally. Then, grayscale pixels are filled into the outer region of the reduced image to restore its spatial resolution to the original image input size. Simultaneously, the spatial offset of the reduced image during the size restoration process is calculated, and translation compensation is performed on the center point coordinates of the scaled bounding boxes. Two enhanced images are generated from each original image using this method to enhance the model's ability to recognize small-scale slabs.
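The bounding-box bookkeeping of the small target enhancement can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `shrink_and_pad_boxes` is hypothetical, boxes are taken as pixel-space `(cx, cy, w, h)` tuples, and the reduced image is assumed to be placed at the center of the gray canvas (the text only says the outer region is filled). The pixel resize itself would be done with an image library and is omitted.

```python
def shrink_and_pad_boxes(boxes, img_w, img_h, scale):
    """Map pixel boxes (cx, cy, w, h) onto a shrunken, gray-padded canvas.

    The image is reduced by `scale` and placed on a canvas of the original
    size; centered placement is an assumption made for this sketch.
    Returns (new_w, new_h, (dx, dy), new_boxes).
    """
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)
    # Spatial offset of the reduced image inside the restored canvas.
    dx = (img_w - new_w) // 2
    dy = (img_h - new_h) // 2
    new_boxes = [
        # Scale center and size, then compensate the center for the offset.
        (cx * scale + dx, cy * scale + dy, w * scale, h * scale)
        for (cx, cy, w, h) in boxes
    ]
    return new_w, new_h, (dx, dy), new_boxes
```

With a 1000×800 image and a 50% ratio, the reduced image is 500×400 and sits at offset (250, 200); a box centered in the original image stays centered, with half the width and height.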

[0074] Occlusion Enhancement: Two bounding-box-safe occlusion enhancement methods are employed. They generate occlusion regions at the image level only and leave the original bounding boxes unchanged, thus simulating occlusion interference in real-world scenes. The specific operations are as follows: Grid dropout enhancement: Divide the image into a 5×5 uniform grid, and randomly generate a rectangular occlusion block of fixed size in each grid cell according to a preset ratio to simulate regular occlusion (such as occlusion by structured equipment). Random cropping enhancement: Randomly generate rectangular occlusion regions in the image with varying numbers, sizes, and positions to simulate irregular occlusion (such as random occlusion by overhead cranes or temporary obstacles).
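The two occlusion enhancements can be illustrated with the sketch below. It rests on several assumptions that the text does not fix: the "preset ratio" is read as the block side length relative to the grid cell, the fill value 114 is an arbitrary gray, the image is a plain list of rows, and all function names are hypothetical. Bounding-box annotations are never touched, which is what makes both methods bounding-box safe.

```python
import random


def grid_dropout_rects(img_w, img_h, grid=5, ratio=0.4, rng=None):
    """One fixed-size block per grid cell, at a random position in the cell."""
    rng = rng or random.Random()
    cell_w, cell_h = img_w // grid, img_h // grid
    block_w = min(cell_w, max(1, int(cell_w * ratio)))
    block_h = min(cell_h, max(1, int(cell_h * ratio)))
    rects = []
    for row in range(grid):
        for col in range(grid):
            x0 = col * cell_w + rng.randint(0, cell_w - block_w)
            y0 = row * cell_h + rng.randint(0, cell_h - block_h)
            rects.append((x0, y0, x0 + block_w, y0 + block_h))
    return rects


def random_occlusion_rects(img_w, img_h, max_blocks=4,
                           min_size=8, max_size=64, rng=None):
    """A random number of blocks with random sizes and positions."""
    rng = rng or random.Random()
    rects = []
    for _ in range(rng.randint(1, max_blocks)):
        w = rng.randint(min_size, min(max_size, img_w))
        h = rng.randint(min_size, min(max_size, img_h))
        x0 = rng.randint(0, img_w - w)
        y0 = rng.randint(0, img_h - h)
        rects.append((x0, y0, x0 + w, y0 + h))
    return rects


def apply_rects(image, rects, fill=114):
    """Overwrite the rectangles in place; labels are left unchanged."""
    for x0, y0, x1, y1 in rects:
        for row in image[y0:y1]:
            row[x0:x1] = [fill] * (x1 - x0)
    return image
```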

[0075] Through the above enhancements, the training set size was effectively expanded from 1,800 images to 4,500 images, significantly enhancing the model's ability to learn features from small targets and partially occluded targets, and providing a high-quality and diverse data foundation for subsequent model training.

[0076] Next, model training and optimization are performed. An improved YOLOv8n (nano version) detection model is constructed; its structure is shown in Figure 4. The main improvements include: Backbone Network Improvements: The original backbone network in YOLOv8n is replaced with the lightweight StarNet, which adopts a four-level hierarchical design. Each level uses convolutional layers for downsampling and introduces the Star Block module. The Star Block module uses 7×7 depthwise convolutions (DWConv) to extract input features. Unlike a standard convolution, DWConv convolves each input channel individually with its own kernel instead of summing over all input channels, which effectively reduces the number of parameters and the computational cost and accelerates training and inference.
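The parameter saving of a depthwise convolution can be checked with simple arithmetic. The sketch below counts weights only (no bias); the channel width of 64 used in the note is a hypothetical value chosen for illustration, not a figure from this application.

```python
def conv_params(kernel, c_in, c_out):
    """Weights of a standard convolution: every output channel sees every input channel."""
    return kernel * kernel * c_in * c_out


def depthwise_conv_params(kernel, channels):
    """Weights of a depthwise convolution: one kernel per channel, no cross-channel sum."""
    return kernel * kernel * channels
```

For a 7×7 kernel and 64 channels, the standard convolution has 200,704 weights and the depthwise convolution has 3,136, a reduction by a factor equal to the channel count.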

[0077] Improved Neck Network: A BiFPN structure replaces the traditional FPN+PAN structure in the neck network, and an IRB_MSCB feature fusion module based on multi-scale adaptive convolutional kernels is designed on top of BiFPN. The channels of the input features are first expanded through 1×1 pointwise convolutions, and spatial and contextual features are then extracted in parallel using depthwise convolution (DWConv), which also reduces model parameters and computational complexity. A channel shuffling operation is introduced to break down the information barrier between channels caused by depthwise convolutions. Combined with a global heterogeneous kernel selection mechanism, the receptive field is dynamically allocated according to the depth of the feature pyramid (i.e., small convolutional kernels for shallow layers and large convolutional kernels for deep layers), achieving adaptive feature capture. Furthermore, after ensuring that the input feature tensors are aligned in both spatial resolution and channel dimension, an Add operation is used for element-wise addition. This fusion method not only avoids the parameter redundancy caused by conventional Concat concatenation but also implicitly implements adaptive feature weighting, allowing the model to focus more on extracting key features of the continuously cast slab. Finally, the channel dimension is reduced through 1×1 pointwise convolutions for output. Overall, the improved neck network reduces the number of model parameters by 29% and the computational cost by 13% while maintaining or even improving detection accuracy (especially for multi-scale targets).
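The channel shuffling operation mentioned above can be illustrated independently of any deep learning framework. The sketch below computes the channel permutation that corresponds to the usual reshape, transpose, flatten formulation; the group count and the function names are assumptions for illustration, since the text does not specify how the IRB_MSCB module groups its channels.

```python
def channel_shuffle_order(num_channels, groups):
    """Index order equivalent to reshape (groups, n) -> transpose -> flatten."""
    if num_channels % groups:
        raise ValueError("num_channels must be divisible by groups")
    per_group = num_channels // groups
    return [g * per_group + k for k in range(per_group) for g in range(groups)]


def channel_shuffle(channels, groups):
    """Interleave channels so each output neighborhood mixes all groups."""
    order = channel_shuffle_order(len(channels), groups)
    return [channels[i] for i in order]
```

With six channels in two groups, `["a0", "a1", "a2", "b0", "b1", "b2"]` becomes `["a0", "b0", "a1", "b1", "a2", "b2"]`, so a following per-channel or grouped operation sees information from both groups.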

[0078] Head Network Improvements: Occlusion-aware attention is integrated into the DyHead detection head to effectively address the challenges of detecting occlusion and small targets. Specifically, occlusion attention extracts basic features using DWConv, then enhances the dependencies between channels with 1×1 convolutions and fully connected layers. Finally, exponential normalization generates a robust attention map to strengthen the response of unoccluded regions and counteract information loss caused by occlusion.

[0079] The improved YOLOv8n detection model was trained on the augmented training set from step S1, with SGD as the optimizer, an initial learning rate of 0.01, and 300 training epochs. All model training and performance validation experiments were conducted on a high-performance server running Linux Ubuntu 18.04.5 LTS, equipped with an Intel® Xeon® Gold 6248R processor, 128 GB of DDR4 memory, and an NVIDIA GeForce RTX 3090 (24 GB) graphics card.

[0080] Experimental results show that, compared to the original YOLOv8n, this improved model reduces the number of parameters by 62.5% and the computational load by 74.6% while only decreasing mAP@0.5 by 0.1%. To further compress the model size and adapt it for future deployment on resource-constrained industrial edge devices, channel pruning was performed. Experimental results show that, while maintaining the current accuracy, the number of model parameters is further reduced by 52% and the computational load by 50%, and the detection speed meets the real-time requirement of ≥15 FPS in industrial settings. These results fully demonstrate the advantages of the improved model in terms of accuracy and efficiency, laying a solid technical foundation for its subsequent porting to edge devices.
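The two compression stages can be combined with a short calculation. The sketch assumes that the pruning percentages (52% and 50%) are measured relative to the improved model before pruning, which the text suggests but does not state explicitly; `combined_reduction` is a hypothetical helper name.

```python
def combined_reduction(*stage_reductions):
    """Overall reduction when each stage removes a fraction of what remains."""
    remaining = 1.0
    for reduction in stage_reductions:
        remaining *= 1.0 - reduction
    return 1.0 - remaining
```

Under that assumption, parameters fall by 1 - (0.375 × 0.48) = 82% and the computational load by 1 - (0.254 × 0.50) = 87.3% relative to the original YOLOv8n.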

[0081] Next, real-time detection and tracking algorithms are fused. The lightweight YOLOv8 detection model from step S2 is fused with a multi-target tracking algorithm to construct a real-time detection and multi-target tracking module for continuous casting slabs under complex operating conditions in the continuous casting production line. This module will be deployed on the edge server in step S5, responsible for processing the continuous video stream acquired by the industrial camera frame by frame to achieve real-time detection and tracking of slab targets.

[0082] Specifically, for each frame of the input video stream, the bounding box positions of the slab targets are output through forward inference of the backbone network, neck network, and detection head of the lightweight YOLOv8 detection model from step S2. The bounding boxes are then input into a multi-target tracking algorithm. This invention preferably employs the OCSORT algorithm, which adds an observation-centric update strategy to the classic SORT algorithm. This algorithm maintains stable tracking performance under complex conditions such as occlusion, rapid target movement, and brief loss of detection boxes, and is particularly suitable for dynamic scenarios such as frequent crane operations and partial slab occlusion in continuous casting. The tracking algorithm predicts the target state between consecutive frames using Kalman filtering and uses the Hungarian algorithm to associate detection boxes with existing trajectories. A globally unique ID is assigned to each slab entering the camera's field of view, and its motion trajectory in the image coordinate system is continuously recorded. It should be noted that, to improve the robustness of slab target tracking, the system manages slab motion trajectories in a fine-grained manner: when a slab is not successfully associated for several consecutive frames (e.g., 10 frames), its trajectory is not terminated immediately but is retained for a period of time to cope with temporary occlusion; if the slab reappears in the field of view and matches the historical trajectory, its original ID is restored to ensure the continuity and consistency of the identity ID; trajectories that leave the field of view or remain unmatched for a long time are archived, and tracking of them stops.
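The trajectory management rules in this paragraph can be sketched as a small state machine. This is an illustrative sketch rather than OCSORT itself: the names `Track` and `TrackLifecycle` and the `archive_after` value of 50 frames are hypothetical, while the `lost_after` default of 10 frames follows the example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    track_id: int
    trajectory: list = field(default_factory=list)  # slab center points
    misses: int = 0                                  # consecutive unmatched frames
    state: str = "active"                            # active | lost | archived


class TrackLifecycle:
    def __init__(self, lost_after=10, archive_after=50):
        self.lost_after = lost_after        # e.g. 10 frames, as in the text
        self.archive_after = archive_after  # hypothetical "long time" limit

    def on_match(self, track, center):
        """A detection was associated: restore the original ID if it was lost."""
        if track.state == "archived":
            return False  # archived trajectories are no longer tracked
        track.trajectory.append(center)
        track.misses = 0
        track.state = "active"
        return True

    def on_miss(self, track):
        """No detection was associated in this frame."""
        if track.state == "archived":
            return
        track.misses += 1
        if track.misses >= self.archive_after:
            track.state = "archived"
        elif track.misses >= self.lost_after:
            track.state = "lost"  # retained to survive temporary occlusion
```

A track that is matched again while in the lost state keeps its `track_id`, which is the behavior that preserves identity across temporary occlusion.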

[0083] Subsequently, a positional offset anomaly diagnosis is performed, which includes the following steps: Calibration and normalization of the coordinate system: During system initialization, the operator calibrates the two edge lines five times at the beginning and end positions of each roller conveyor section via the human-machine interface. The center line of the roller conveyor is fitted by calculating the midpoints of the two edge lines, and the average of the multiple fitting results is taken to obtain the equation of the standard roller conveyor center line. The above calibration and fitting process is performed in the original image coordinate system, with the upper left corner of the image as the origin, the horizontal direction to the right as the positive X-axis, and the vertical direction downwards as the positive Y-axis. The angle $\theta$ between the standard roller conveyor center line and the horizontal direction is calculated. With the image center $(c_x, c_y)$ as the center of rotation, a rotation matrix $M$ is constructed, and the entire image and the detected slab targets are rotated clockwise by the angle $\theta$ to establish a normalized coordinate system. In this coordinate system, the center line of the standard roller conveyor is parallel to the X-axis, and its equation can be expressed as $y = y_c$, where $y_c$ is a constant.
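The calibration and normalization step can be sketched as follows. The sketch makes several assumptions: each calibration supplies the start and end points of both edge lines, the repeated calibrations are averaged point by point, and the sign of the rotation is chosen so that the fitted center line becomes horizontal in image coordinates (whether that appears clockwise depends on the axis convention). The 2×3 matrix has the same layout as the one produced by OpenCV's `getRotationMatrix2D`; all function names here are hypothetical.

```python
import math


def fit_center_line(calibrations):
    """Average the edge-line midpoints over all calibrations.

    calibrations: list of (edge_a, edge_b); each edge is
    ((x_start, y_start), (x_end, y_end)) in original image coordinates.
    """
    n = len(calibrations)
    sx = sy = ex = ey = 0.0
    for (a_start, a_end), (b_start, b_end) in calibrations:
        sx += (a_start[0] + b_start[0]) / 2
        sy += (a_start[1] + b_start[1]) / 2
        ex += (a_end[0] + b_end[0]) / 2
        ey += (a_end[1] + b_end[1]) / 2
    return (sx / n, sy / n), (ex / n, ey / n)


def rotation_matrix(theta, center):
    """2x3 affine matrix rotating by theta (radians) about `center`."""
    cx, cy = center
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, (1 - c) * cx - s * cy],
            [-s, c, s * cx + (1 - c) * cy]]


def apply_affine(M, point):
    x, y = point
    return (M[0][0] * x + M[0][1] * y + M[0][2],
            M[1][0] * x + M[1][1] * y + M[1][2])


def normalize_setup(calibrations, image_size):
    """Return (theta, M, y_c): tilt angle, rotation matrix, center-line ordinate."""
    start, end = fit_center_line(calibrations)
    theta = math.atan2(end[1] - start[1], end[0] - start[0])
    center = (image_size[0] / 2, image_size[1] / 2)
    M = rotation_matrix(theta, center)
    y_c = apply_affine(M, start)[1]  # every center-line point maps to this y
    return theta, M, y_c
```

After this setup, any detected slab center can be passed through `apply_affine(M, point)` and compared directly against the constant ordinate `y_c`.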

[0084] Offset calculation: Using the multi-target tracking algorithm in step S3, obtain the center point coordinates $(x_t, y_t)$ of the same slab ID in the current frame and the center point coordinates $(x_{t-k}, y_{t-k})$ in the previous frame (or in the frame $k$ frames earlier). Using the rotation matrix $M$, map the above coordinates from the original image coordinate system to the normalized coordinate system; the transformed coordinates are $(x'_t, y'_t)$ and $(x'_{t-k}, y'_{t-k})$.

[0085] ① Position offset detection: Calculate the perpendicular distance $d = |y'_t - y_c|$ between the normalized ordinate $y'_t$ of the slab center point in the current frame and the roller conveyor center line $y = y_c$. Set the lateral offset threshold $T_d$; when $d > T_d$, the slab position is determined to be abnormally offset.

[0086] ② Angle tilt detection: Based on the normalized coordinates of the center points of the current frame and the previous frame (or multiple consecutive previous frames), calculate the angle difference $\Delta\theta$ between the slab running direction and the roller conveyor direction. Set the angle offset threshold $T_\theta$; when $|\Delta\theta| > T_\theta$, the slab is determined to have an abnormal tilt angle.
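The two offset measures can be written as a pair of small functions operating on normalized coordinates. This is a sketch under assumptions: the center line is taken as a constant ordinate `y_c` in the normalized system, the running direction is estimated from two center points of the same slab, and the angle is measured against the X-axis regardless of travel direction. The function names are hypothetical.

```python
import math


def position_offset(y_norm, y_c):
    """Perpendicular distance (pixels) from the slab center to the center line."""
    return abs(y_norm - y_c)


def angle_offset_deg(prev_point, cur_point):
    """Angle between the running direction and the X-axis, in degrees.

    abs(dx) folds both travel directions onto the same axis, so a slab
    moving straight along the conveyor gives 0 whichever way it travels.
    """
    dx = cur_point[0] - prev_point[0]
    dy = cur_point[1] - prev_point[1]
    return math.degrees(math.atan2(dy, abs(dx)))
```

A frame would then be flagged when `position_offset(...)` exceeds the lateral threshold or when the absolute value of `angle_offset_deg(...)` exceeds the angle threshold; the threshold values themselves are configuration inputs.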

[0087] Anomaly detection mechanism: Set the lateral offset threshold $T_d$ (in pixels) and the angular offset threshold $T_\theta$. The system sets a time window of 3-5 seconds for trend diagnosis. An alarm is triggered only if the position or angle deviation of the same slab exceeds its threshold in all consecutive frames within the time window, which effectively filters out transient interference and reduces the false alarm rate.
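The multi-frame trend check can be sketched with a fixed-length buffer per slab ID. The class name `TrendMonitor` is hypothetical, and converting the time window to a frame count through an assumed frame rate (for example 15 FPS with a 3-second window, giving 45 frames) is one possible reading of the text.

```python
from collections import deque


class TrendMonitor:
    """Raise an alarm only if every frame in the window exceeded a threshold."""

    def __init__(self, fps, window_seconds):
        self.window_frames = max(1, int(round(fps * window_seconds)))
        self._history = {}  # slab ID -> recent per-frame flags

    def update(self, slab_id, exceeded):
        history = self._history.setdefault(
            slab_id, deque(maxlen=self.window_frames)
        )
        history.append(bool(exceeded))
        # The window must be full, and no frame in it may be normal.
        return len(history) == self.window_frames and all(history)
```

A single normal frame inside the window suppresses the alarm until the window has again filled with abnormal frames, which is how transient interference is filtered out.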

[0088] Next, the system was deployed and put into real-time operation. To verify the application performance of this technical solution in a near-real industrial environment, the optimal weight file (.pt format) of the trained lightweight YOLOv8 detection model was first exported to ONNX format, and accelerated inference was performed using the NVIDIA TensorRT framework. Subsequently, the accelerated detection model was fused with the OCSORT multi-object tracking algorithm and deployed on an edge server. The edge server used the Linux Ubuntu 18.04 operating system, equipped with an Intel® Xeon® Gold 6248R processor with a clock speed of 3.0 GHz, 128 GB of LPDDR4 memory, and an RTX A6000 graphics card with 48 GB of video memory. Six high-resolution industrial camera video streams were input to the edge server for real-time inference to detect the slab position in real time. The detection results were then input into the OCSORT multi-object tracking algorithm, which assigned a unique ID to each slab and recorded its continuous motion trajectory. Meanwhile, the edge server, via the Modbus TCP industrial protocol, promptly reports the acquired real-time slab location, identity ID, movement trajectory, and abnormal alarm information obtained in step S4 to the slab scheduling system. The slab scheduling system receives and visualizes this information, and upon receiving an abnormal alarm, issues control commands to the roller conveyor or overhead crane, or prompts manual verification. The successful implementation of this stage verifies the feasibility of the entire process from data input to tracking output, demonstrating the technical potential of this solution to achieve real-time processing on an edge server.

[0089] Finally, the model is updated online. A background process continuously collects data, storing captured images in the difficult case database according to set rules (detection box confidence below 0.6, loss of a target tracking ID, or a "position offset anomaly" alarm triggered in step S4). The system administrator initiates the model update process quarterly. Sample selection: Using an active learning strategy, the 500 most valuable samples are selected from the difficult case database and manually reviewed and annotated. Continuous learning and fine-tuning: During continuous learning, the elastic weight consolidation algorithm is used to calculate the Fisher information matrix of the weight parameters of the old model as a measure of their importance. When fine-tuning with new data, a regularization term is added to the loss function to penalize large modifications to important old weights and thereby mitigate catastrophic forgetting; at the same time, sample weights are designed in conjunction with the slab running process rules so that the model prioritizes learning abnormal samples related to positional offsets and angular tilts. Training strategy: The 500 newly labeled images are mixed with 1,000 images randomly selected from the old training set to form the training set, and the deployed model is fine-tuned with a low learning rate (1e-5) for 300 epochs. Validation and deployment: In an isolated test environment, the updated model is validated on a fixed historical dataset and on recent difficult cases to ensure that its mAP@0.5 and anomaly diagnosis accuracy are not lower than 98% of the old model's level. After successful validation, the online service is switched smoothly to the new model, while a copy of the old model is retained to support rapid rollback. This process ensures the safety of model updates and the continuity of system services.
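The regularization term and the validation gate described in this paragraph can be sketched as below. The penalty follows the standard elastic weight consolidation form, $\frac{\lambda}{2}\sum_i F_i(\theta_i - \theta_i^{*})^2$, with a diagonal Fisher approximation; the value of $\lambda$, the flat-list parameter representation, and the function names are assumptions made for illustration, and the 98% ratio is the one given in the text.

```python
def ewc_penalty(params, old_params, fisher, lam):
    """Elastic weight consolidation penalty with a diagonal Fisher matrix.

    params:     current weights during fine-tuning
    old_params: weights of the deployed (old) model
    fisher:     per-weight importance estimated on the old data
    lam:        regularization strength (hypothetical hyperparameter)
    """
    if not (len(params) == len(old_params) == len(fisher)):
        raise ValueError("all inputs must have the same length")
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, fisher)
    )


def total_loss(task_loss, params, old_params, fisher, lam):
    """Fine-tuning loss on new data plus the anti-forgetting penalty."""
    return task_loss + ewc_penalty(params, old_params, fisher, lam)


def passes_gate(new_metric, old_metric, ratio=0.98):
    """Accept the updated model only if it keeps at least `ratio` of the old level."""
    return new_metric >= ratio * old_metric
```

Weights with a large Fisher value are expensive to move, so fine-tuning changes mostly the weights that mattered little for the old data; the gate would be applied separately to mAP@0.5 and to anomaly diagnosis accuracy before switching models.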

[0090] Through comprehensive implementation and verification in both experimental and real-world environments, the technical solution provided by this invention has been proven to effectively solve the core challenges in monitoring continuously cast slabs. Experimental data shows that this solution significantly improves the accuracy and robustness of detection and anomaly diagnosis while maintaining high real-time performance. Furthermore, the designed continuous learning mechanism provides a feasible technical path for addressing the environmental adaptability issues of industrial models. In summary, this solution has clear industrial application prospects, and those skilled in the art can deploy it on actual industrial edge devices based on the guidance of this specification, thereby achieving intelligent monitoring of the operating status of continuously cast slabs.

[0091] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features therein. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning, characterized in that it includes the following steps: Step S1: Obtain and label the original image data of the continuous casting slab conveyor roller table, perform data augmentation on the original image data, and construct a training dataset; the data augmentation includes at least small target augmentation and occlusion augmentation; Step S2: Construct a lightweight YOLOv8 continuous casting slab detection model, which is obtained by improving the YOLOv8 model; the improvements include: replacing the original backbone network with the lightweight backbone network StarNet; designing a feature fusion module based on multi-scale adaptive convolutional kernels on the basis of BiFPN; and integrating occlusion-aware attention into the DyHead detection head; and train the constructed lightweight YOLOv8 continuous casting slab detection model based on the training dataset; Step S3: Use the trained lightweight YOLOv8 continuous casting slab detection model to detect the continuous casting slabs in real time, and track the detected continuous casting slabs with a multi-target tracking algorithm; each slab is assigned a unique ID, and the continuous motion trajectory of each slab is drawn; Step S4: Through multiple composite calibrations, establish the original image coordinate system of the slab conveyor rollers, and map the detected slab targets, based on affine transformation, to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis, wherein the center line of the slab conveyor rollers after affine transformation is parallel to the X-axis of the normalized coordinate system; combined with the motion trajectory of the slab targets, calculate the position offset and angle offset of each slab in real time, and perform trend analysis based on continuous multi-frame data to realize multi-dimensional slab state analysis and position offset anomaly diagnosis based on spatiotemporal context.

2. The method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning according to claim 1, characterized in that the method further includes: Step S5: Deploy the trained slab detection model and multi-target tracking algorithm to an edge server; the edge server reports the real-time location, identity ID, and anomaly diagnosis results of the slab to the slab scheduling system through an industrial communication protocol, and receives control instructions from the scheduling system.

3. The method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning according to claim 2, characterized in that the method further includes: Step S6: After the model and algorithm are deployed, continuously collect the image data corresponding to low-confidence detection boxes, lost tracking detection boxes, and the anomaly diagnosis results, and re-label the slab positions; when optimizing the model, start the continuous learning mechanism to update the detection model online; wherein the continuous learning mechanism integrates the slab running process rules to constrain the learning direction of the model, and adopts an anti-forgetting strategy to retain historical knowledge.

4. The method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning according to claim 1, characterized in that performing small target enhancement on the original image data includes: scaling down the original image data proportionally according to at least one preset scaling ratio among 30%, 50%, and 70%, and scaling the bounding box coordinates of the slab target according to the same preset scaling ratio; filling gray pixels around the scaled image to restore it to the original image input size, and translating and updating the coordinates according to the relative offset of the image within the filled area; and performing occlusion enhancement on the original image data includes: using grid dropout enhancement or random cropping enhancement to generate rectangular occlusion regions on the image while keeping the bounding box annotation of the original target unchanged.

5. The method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning according to claim 1, characterized in that establishing the original image coordinate system of the slab conveyor rollers through multiple composite calibrations, and mapping the detected slab targets, based on affine transformation, to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis, includes: manually calibrating the two edge lines of the slab conveyor roller; fitting the center line of the roller by calculating the midpoints of the corresponding positions on both sides of the roller, so as to obtain the equation of the standard roller center line and its angle with the horizontal direction of the image; using the image center as the rotation center point, constructing a rotation matrix based on the included angle, and rotating the entire image and the detected slab target to map the slab target to the normalized coordinate system; in the normalized coordinate system, calculating the vertical distance from the center point of the slab to the center line of the roller conveyor as the position offset, and determining that the slab position offset is abnormal if it exceeds the preset lateral offset threshold; and calculating the angle difference between the running direction of the slab and the direction of the roller conveyor as the angle offset, and determining that the slab angle tilt is abnormal if it exceeds the preset angle offset threshold.

6. The method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning according to claim 5, characterized in that combining the motion trajectory of the slab target, calculating the position offset and angle offset of each slab in real time, and performing trend analysis based on continuous multi-frame data to achieve multi-dimensional slab state analysis and position offset anomaly diagnosis based on spatiotemporal context includes: tracking continuous frame data of the same slab, analyzing the trend of the position offset and angle offset of the slab, and setting a continuous frame window; the final alarm is triggered only if the abnormal state persists throughout the window.

7. The method for monitoring the operating status of continuously cast slabs based on lightweight YOLOv8 and continuous learning according to claim 3, characterized in that starting the continuous learning mechanism to update the detection model online during model optimization includes: automatically collecting image data based on the low-confidence detection boxes, the lost tracking detection boxes, and the anomaly diagnosis results to construct a difficult case database for fine-tuning the lightweight YOLOv8 continuous casting slab detection model; in the process of fine-tuning the lightweight YOLOv8 continuous casting slab detection model, designing sample weights or loss function terms in combination with the process rules of slab operation, so that the model learns abnormal samples related to position offset and angle tilt first; and, when fine-tuning the lightweight YOLOv8 continuous casting slab detection model using newly labeled data, using an elastic weight consolidation algorithm or an experience replay mechanism to penalize the updates of key weight parameters of old tasks in order to retain historical knowledge.

8. A continuous casting slab operation status monitoring system based on lightweight YOLOv8 and continuous learning, used to implement the continuous casting slab operation status monitoring method based on lightweight YOLOv8 and continuous learning as described in any one of claims 1 to 7, characterized in that it includes: a data acquisition and processing module, used to acquire and label the original image data of the continuous casting slab conveyor roller table, and to perform data augmentation operations on the original image data to construct a training dataset, the data augmentation including at least small target augmentation and occlusion augmentation; a model training module, used to construct a lightweight YOLOv8 continuous casting slab detection model obtained by improving the YOLOv8 model, the improvements including: replacing the original backbone network with the lightweight StarNet backbone network; designing a feature fusion module based on multi-scale adaptive convolutional kernels on top of BiFPN; and integrating occlusion-aware attention into the DyHead detection head; the constructed lightweight YOLOv8 continuous casting slab detection model being trained based on the training dataset; a real-time detection and tracking module, used to perform real-time detection of continuous casting slabs using the trained lightweight YOLOv8 continuous casting slab detection model, and to track the detected continuous casting slabs using a multi-target tracking algorithm, each slab being assigned a unique ID and the continuous motion trajectory of each slab being drawn; and a position offset anomaly diagnosis module, used to establish the original image coordinate system of the slab conveyor roller through multiple composite calibrations, and to map the detected slab targets, based on affine transformation, to a normalized coordinate system with the horizontal direction as the X-axis and the vertical direction as the Y-axis, wherein the center line of the slab conveyor roller after affine transformation is parallel to the X-axis of the normalized coordinate system; combined with the motion trajectory of the slab targets, the module calculates the position offset and angle offset of each slab in real time, and performs trend analysis based on continuous multi-frame data to realize multi-dimensional slab state analysis and position offset anomaly diagnosis based on spatiotemporal context.

9. The continuous casting slab operation status monitoring system based on lightweight YOLOv8 and continuous learning as described in claim 8, characterized in that the system also includes: an integration and deployment module, used to deploy the trained slab detection model and multi-target tracking algorithm to the edge server; the edge server reports the real-time location, identity ID, and anomaly diagnosis results of the slab to the slab scheduling system through the industrial communication protocol, and receives control instructions from the scheduling system.

10. The continuous casting slab operation status monitoring system based on lightweight YOLOv8 and continuous learning as described in claim 8, characterized in that the system also includes: a continuous learning platform, used to continuously collect, after the model and algorithm are deployed, the image data corresponding to low-confidence detection boxes, lost tracking detection boxes, and the anomaly diagnosis results, and to re-label the slab positions; during model optimization, the continuous learning mechanism is activated to update the detection model online; wherein the continuous learning mechanism integrates slab operation process rules to constrain the learning direction of the model, and adopts an anti-forgetting strategy to retain historical knowledge.