A remote fault detection system and method for fitness equipment
By employing a hierarchical bispectral phase reconstruction and multi-resolution pyramid strategy, the problems of false alarms and missed detections of fitness equipment under high temperature and high vibration environments have been solved, achieving efficient and reliable detection of fitness equipment faults.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- DACHANG HUI AUTONOMOUS COUNTY XIADIAN JIAMEI SPORTING GOODS
- Filing Date
- 2026-03-19
- Publication Date
- 2026-06-19
AI Technical Summary
Existing fault detection technologies for fitness equipment are not reliable enough in complex physical environments. Especially under high temperature and high vibration conditions, traditional image sensors are prone to misjudgment due to optical disturbances caused by thermal convection, resulting in false alarms and missed detections.
By employing a hierarchical bispectral phase reconstruction and multi-resolution pyramid strategy, clear reconstructed images are generated through image acquisition, multi-resolution decomposition, hierarchical bispectral phase reconstruction, and fault identification models, thereby identifying the fault characteristics of fitness equipment.
The method effectively suppresses random optical noise caused by thermal turbulence, accurately preserves key high-frequency details, and achieves highly robust, non-contact, remote, and precise detection of the core components of fitness equipment, reducing operation and maintenance costs.
Figure CN122243940A_ABST
Description
Technical Field
[0001] This invention relates to the technical field of fault detection for fitness equipment, and more particularly to a remote fault detection system and method for fitness equipment.
Background Technology
[0002] With the deep integration of IoT and AI technologies, the intelligent operation and maintenance of fitness equipment has shifted from traditional reactive repair to predictive maintenance. Non-contact monitoring of core moving parts using built-in machine vision modules has become a crucial means of identifying early-stage equipment malfunctions. However, in real-world commercial fitness scenarios, the complex physical environment poses a significant challenge to image quality, resulting in substantial reliability deficiencies in existing visual inspection technologies.
[0003] Taking widely used commercial treadmills as an example, these machines typically operate under high loads around the clock, causing a large amount of heat to accumulate inside the motor compartment. The surface temperature of core power components (such as the motor and inverter) often rises to 60-80℃. In stark contrast, to ensure user comfort, the indoor temperature of commercial gyms is usually kept constant at 20-22℃. This large temperature difference between the inside and outside causes intense thermal convection at the heat dissipation holes and gaps of the motor cover. When a visual sensor monitors key transmission components through this thermal convection area, especially multi-ribbed belts with fine grooves, the turbulent hot air causes non-uniform, random, millisecond-level spatiotemporal fluctuations in the air density and refractive index of the optical path medium.
[0004] Ordinary image sensors cannot distinguish between physical damage and optical disturbances, and easily misinterpret texture fluctuations caused by hot air as severe belt aging, thus triggering numerous false alarms. This not only leads to frequent, ineffective on-site inspections by maintenance personnel and increases operating costs; more seriously, raising the detection threshold to reduce false alarms often results in the missed detection of genuine, minute cracks. Traditional image denoising methods, while smoothing thermal noise, also destroy the fine texture details of the belt surface and therefore fail to meet the requirements of high-precision flaw detection.
Summary of the Invention
[0005] To address the technical problems existing in the background art, this invention proposes a remote fault detection system and method for fitness equipment. The specific solution is as follows. A remote fault detection method for fitness equipment includes the following steps:
S1: Acquire multiple frames of images of the target component and construct an image sequence;
S2: Perform multi-resolution decomposition on the images in the image sequence to construct an image pyramid, the image pyramid containing at least two resolution levels;
S3: Based on the image pyramid, perform a hierarchical bispectral phase reconstruction operation to generate target phase information;
S4: Based on the target phase information and the amplitude spectrum information extracted from the image sequence, generate a reconstructed image;
S5: Based on the reconstructed image, identify the fault characteristics of the target component and generate detection results.
[0006] Furthermore, in S1, multiple frames of images of the target component are acquired, and an image sequence is constructed as follows: The image acquisition device continuously acquires local images of the target component at a predetermined frame rate within a preset time period to obtain a first image sequence containing N frames, where N is an integer greater than 1. For each frame in the first image sequence, generate an image portion corresponding to a preset high-risk area on the surface of the target component; An image sequence is constructed based on the sequence of image parts generated from each frame image.
[0007] Furthermore, in S2, the images in the image sequence are decomposed into multi-resolution images to construct an image pyramid, which contains at least two resolution levels, as follows: Based on each frame of the image sequence, Gaussian blur and downsampling operations are performed to generate an L-layer image set, where L is an integer greater than 1. The 0th layer is the original resolution image, and the (L-1)th layer is the lowest resolution image. The image pyramid is formed by the L-layer image set corresponding to each frame in the image sequence.
[0008] Furthermore, Gaussian blur and downsampling operations are performed on each frame of the image sequence to generate an L-layer image set, as follows: For each frame of the image sequence, perform the following iterative generation operation: Define the original resolution image of the current frame as the layer 0 image; For the layer n image, where n = 0, 1, ..., L-2 (n is a layer index, distinct from the frame count N), perform one Gaussian blur operation and then downsample the blurred image. The resulting image is defined as the layer n+1 image. After L-1 iterations, an L-layer image set consisting of the images from layer 0 to layer L-1 is obtained.
[0009] Furthermore, in S3, based on the image pyramid, a hierarchical bispectral phase reconstruction operation is performed to generate target phase information, as follows: Based on the image of the (L-1)th layer in the image pyramid, a base phase is generated; Starting from the base phase, phase reconstruction is performed layer by layer, from the (L-1)th layer down to the 0th layer, to obtain the target phase information. The phase reconstruction of the k-th layer, where k starts at L-2 and decreases successively to 0, includes: upsampling the reconstructed phase of the (k+1)th layer to generate the predicted phase of the k-th layer; generating an observed phase based on the image of the k-th layer in the image pyramid; calculating the phase difference between the observed phase and the predicted phase, and generating a phase residual based on the phase difference; adding the predicted phase to the phase residual to generate the reconstructed phase of the k-th layer. When k=0, the reconstructed phase obtained is the target phase information.
[0010] Furthermore, based on the image of the (L-1)th layer in the image pyramid, a base phase is generated as follows: Perform a two-dimensional Fourier transform on the image of the (L-1)th layer in the image pyramid; Based on the result of the two-dimensional Fourier transform, calculate the bispectrum, i.e., the Fourier transform of the third-order cumulant; Based on the bispectrum, generate the base phase using a phase reconstruction algorithm; The phase reconstruction algorithm is a recursive algorithm.
[0011] Furthermore, in S4, based on the target phase information and the image sequence, a reconstructed image is generated as follows: Calculate the amplitude spectrum of each frame in the image sequence and average the amplitude spectra over the frames to construct the amplitude spectrum information; Based on the target phase information and the amplitude spectrum information, an inverse Fourier transform is performed to generate a reconstructed image.
[0012] Furthermore, in S5, based on the reconstructed image, the fault characteristics of the target component are identified as follows: The reconstructed image is input into a pre-trained fault identification model to obtain the identification result output by the fault identification model; The identification result includes at least one of the following: the confidence level and location of the crack on the surface of the target component, and the axial eccentricity of the target component; When the confidence level exceeds the first preset threshold, or the axial eccentricity exceeds the second preset threshold, the target component is determined to be faulty, and a corresponding detection result is generated.
[0013] Furthermore, a fault identification model is constructed: Obtain a training image sample set, which includes multiple sample images of the target component in normal condition and multiple sample images of the target component labeled with fault features; For each sample image in the training image sample set, perform steps S1 to S4 to generate the corresponding reconstructed training image; Based on the reconstructed training images and their corresponding fault feature annotations, a fault identification model is constructed using a convolutional neural network algorithm.
[0014] A remote fault detection system for fitness equipment includes: The image acquisition module is used to acquire multiple frames of images of the target component and construct an image sequence; A pyramid building module is used to perform multi-resolution decomposition on the images in the image sequence and build an image pyramid, wherein the image pyramid contains at least two resolution levels; The bispectral reconstruction module is used to perform hierarchical bispectral phase reconstruction operations based on the image pyramid to generate target phase information; An image synthesis module is used to generate a reconstructed image based on the target phase information and the amplitude spectrum information extracted from the image sequence; The fault identification module is used to identify the fault characteristics of the target component based on the reconstructed image and generate detection results.
[0015] Compared with the prior art, the present invention can achieve at least the following beneficial effects: 1. Through hierarchical bispectral phase reconstruction and a multi-resolution pyramid strategy, this invention effectively suppresses the random optical phase noise caused by thermal turbulence when the equipment operates under high thermal load. At the same time, it accurately preserves key high-frequency details such as belt groove texture and early cracks, thereby eliminating at the source the false alarms caused by texture jitter from thermal disturbance and preventing the missed detections caused by raising the detection threshold. It achieves highly robust, non-contact, remote, and accurate detection of micro-damage to core components of fitness equipment in complex thermodynamic environments.
[0016] 2. The hierarchical bispectral phase reconstruction of this invention can effectively separate and suppress random phase noise that follows statistical laws, namely the noise caused by thermal turbulence, thereby eliminating the main source of false alarms at the level of the imaging principle, so that maintenance personnel do not have to deal with a large number of false alarms. By combining the multi-resolution pyramid with bispectral reconstruction and adopting a coarse-to-fine strategy, it accurately preserves key high-frequency details such as belt groove texture and early cracks while filtering out noise, generating clear reconstructed images that provide high-quality input for subsequent identification and significantly reduce the risk of missed detection. This enables visual predictive maintenance to be applied reliably in real environments with high temperature and high vibration, reducing maintenance costs.
Description of the Drawings
[0017] To illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. Wherein: Figure 1 is a flowchart of the method of the present invention.
[0018] Figure 2 is a system principle block diagram of the present invention.
Detailed Implementation
[0019] Embodiments of the present invention are described in detail below. Examples of these embodiments are illustrated in the accompanying drawings, wherein the same or similar symbols denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.
[0020] Referring to Figure 1, this invention provides a remote fault detection method for fitness equipment, comprising the following steps: S1: Acquire multiple frames of images of the target component and construct an image sequence.
[0021] It should be noted that the target component specifically refers to the core moving parts of fitness equipment that are prone to fatigue damage due to high-speed rotation, continuous stress, or complex surface textures. Taking a treadmill as an example, its main target component is the multi-ribbed belt or drive roller; for other equipment, the target component can also be the flywheel of a stationary bike, the damper housing of a rowing machine, or the key load-bearing bearing of a strength trainer.
[0022] In S1, multiple frames of images of the target component are acquired and an image sequence is constructed as follows: The image acquisition device continuously acquires local images of the target component at a predetermined frame rate within a preset time period to obtain a first image sequence containing N frames, where N is an integer greater than 1. It should be noted that the preset duration should generally ensure coverage of several complete motion cycles of the target component. For example, for a multi-wedge belt moving at a linear speed of approximately 10 meters per second, the preset duration can be set to the time required for it to complete 2 to 5 full cycles through the field of view. The preset frame rate needs to be significantly higher than the motion frequency of the target component to avoid aliasing and to capture its minute surface changes; it should generally be no less than 120 frames per second. The purpose of high-speed acquisition is to acquire multiple frames of images with minute relative displacements within an extremely short time window. These displacements include both random interference caused by environmental factors such as airflow and regular abnormal modulation information caused by potential defects in the component (such as microcracks), providing a data basis for subsequent separation of the actual fault signal.
[0023] The image acquisition device is preferably a high-performance security or surveillance camera. It should be a commercial model that supports high frame rate mode (such as 60fps or above) and high resolution, and has certain anti-vibration and wide dynamic range capabilities to adapt to changes in ambient light in the gym.
[0024] For each frame in the first image sequence, generate an image portion corresponding to a preset high-risk area on the surface of the target component; An image sequence is constructed based on the sequence of image parts generated from each frame image.
[0025] It should be noted that the pre-defined high-risk areas on the surface of the target component are local areas most prone to fatigue damage, based on component structural mechanics analysis, historical failure statistics, or prior knowledge. For example, for a flywheel, high-risk areas typically include the junction of the spokes and hub, stress concentration points on the rim, and areas where the brushed surface texture may have microscopic defects due to manufacturing processes. During system initialization or installation, the coordinate range of these areas can be determined in the image coordinate system through pre-calibration or image template matching.
[0026] It should be noted that the step of generating the image portion corresponding to the preset high-risk area on the surface of the target component is usually accomplished through image cropping in practice. That is, based on the previously defined coordinates of the high-risk area, a sub-image containing only that specific area is extracted from each frame of the original first image sequence. Arranging the sub-images of all frames in chronological order constitutes the image sequence specifically processed in subsequent steps. This greatly reduces the amount of data required for subsequent image pyramid construction and bispectral analysis, focusing computational resources on the critical parts most likely to fail, thereby improving the real-time performance and efficiency of the entire detection system.
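The cropping step described above can be sketched as follows. This is a minimal illustration, assuming the frames are already available as a NumPy array and that the high-risk area is an axis-aligned rectangle; the function name `crop_sequence` and the `(top, left, height, width)` convention are hypothetical and not taken from the patent.

```python
import numpy as np

def crop_sequence(frames, roi):
    """Crop every frame of a sequence to the same region of interest.

    frames: array of shape (N, H, W), one grayscale image per frame.
    roi:    (top, left, height, width) in pixel coordinates (assumed convention).
    """
    top, left, height, width = roi
    if top < 0 or left < 0 or top + height > frames.shape[1] or left + width > frames.shape[2]:
        raise ValueError("ROI falls outside the frame")
    # Slicing keeps the time axis intact, so the result is still in chronological order.
    return frames[:, top:top + height, left:left + width].copy()

# Synthetic stand-in for a 3-frame first image sequence of 8x10 pixels.
frames = np.arange(3 * 8 * 10, dtype=np.float64).reshape(3, 8, 10)
sequence = crop_sequence(frames, roi=(2, 3, 4, 5))
print(sequence.shape)
```

Cropping before the pyramid is built means every later stage works on the smaller sub-images only.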
[0027] S2: Perform multi-resolution decomposition on the images in the image sequence to construct an image pyramid, which contains at least two resolution levels.
[0028] It should be noted that, by providing representations of the same scene at different resolutions, the image pyramid allows a global and reliable phase estimate (i.e., the base phase) to be obtained first at the top layer (layer L-1), which has the lowest resolution and the most stable data. This estimate can then be used as a constraint to guide the phase unwrapping calculation of the higher-resolution layers step by step, thereby effectively solving the phase wrapping problem that occurs very easily when bispectral analysis is performed directly on high-resolution images with complex textures.
[0029] In S2, the images in the image sequence are decomposed into multi-resolution components to construct an image pyramid. The image pyramid contains at least two resolution levels, as follows: Based on each frame of the image sequence, Gaussian blur and downsampling operations are performed to generate an L-layer image set, where L is an integer greater than 1. The 0th layer is the original resolution image, and the (L-1)th layer is the lowest resolution image. The image pyramid is formed by the L-layer image set corresponding to each frame in the image sequence.
[0030] It is important to note that the number of pyramid layers, L, is an integer greater than 1 that needs to be pre-defined based on the original image size and actual processing requirements. The principle for determining L is to ensure that the image at the top of the pyramid (layer L-1) is still large enough to contain the basic outline and macroscopic features of the target component. Typically, the minimum side length of the top layer image is required to be no less than 16 to 32 pixels. For example, if the original image (layer 0) is 512x512 pixels and is repeatedly downsampled at a 1/2 ratio, then with L=4 the image size of layer 3 (i.e., layer L-1) is 64x64 pixels, which is usually a suitable scale. Therefore, the value of L is typically between 3 and 5.
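The layer-size arithmetic in this paragraph can be checked with a short sketch. It assumes 1/2 downsampling with integer division at each level; the helper names `pyramid_sizes` and `max_levels` and the `min_side` parameter are illustrative only.

```python
def pyramid_sizes(height, width, levels):
    """Return the (height, width) of each layer, layer 0 first, halving at every level."""
    sizes = [(height, width)]
    for _ in range(levels - 1):
        height, width = height // 2, width // 2
        sizes.append((height, width))
    return sizes

def max_levels(height, width, min_side=32):
    """Largest L such that the top layer's shorter side is still at least min_side."""
    levels = 1
    while min(height, width) // 2 >= min_side:
        height, width = height // 2, width // 2
        levels += 1
    return levels

print(pyramid_sizes(512, 512, 4))          # [(512, 512), (256, 256), (128, 128), (64, 64)]
print(max_levels(512, 512, min_side=32))   # 5
```

For a 512x512 input, a 32-pixel floor permits up to 5 layers, which agrees with the stated typical range of 3 to 5.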
[0031] Gaussian blur and downsampling operations are performed on each frame of the image sequence to generate an L-layer image set, as follows: For each frame of the image sequence, perform the following iterative generation operation: Define the original resolution image of the current frame as the layer 0 image; For the layer n image, where n = 0, 1, ..., L-2 (n is a layer index, distinct from the frame count N), perform one Gaussian blur operation and then downsample the blurred image. The resulting image is defined as the layer n+1 image. After L-1 iterations, an L-layer image set consisting of the images from layer 0 to layer L-1 is obtained.
[0032] It's important to note that Gaussian blurring is a crucial preprocessing step before downsampling. Its purpose is to eliminate or reduce high-frequency noise and overly subtle textures in the image to avoid spectral aliasing during downsampling. Specifically, a two-dimensional Gaussian kernel with a standard deviation of σ is used to convolve the image. The choice of standard deviation σ is critical: if σ is too small, the anti-aliasing effect is insufficient; if σ is too large, it will lead to excessive blurring and loss of too much useful information. As a preferred balance, σ is typically taken between 0.5 and 1.5 pixels. For example, in one implementation, a Gaussian kernel with σ = 1.0 can be used.
[0033] It should be noted that downsampling typically refers to reducing the size of an image by a specific ratio. In this method, a 1/2 ratio is preferably used for downsampling, meaning that both the image width and height are halved and the total number of pixels becomes 1/4 of the original. This ratio keeps the scale step between pyramid levels uniform, which simplifies computation. The specific downsampling algorithm can be bilinear interpolation, bicubic interpolation, or a similar method, so that the downsampled image is smooth and retains the main structural information. After one Gaussian blur followed by one downsampling operation, the newly generated coarser image (layer n+1) has, compared with the finer image it was derived from (layer n), not only a lower resolution but also correspondingly lower texture and detail frequencies, and its phase varies less sharply.
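The blur-then-downsample iteration can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions, not the patent's implementation: the kernel is truncated at three standard deviations, borders are handled by reflection, and the 1/2 downsampling is done by keeping every second pixel after the blur rather than by the bilinear or bicubic interpolation mentioned above. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at 3 sigma and normalised to sum to 1."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur with reflected borders; output has the input's shape."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="reflect")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def build_pyramid(image, levels, sigma=1.0):
    """Return [layer 0, ..., layer levels-1]; layer 0 is the original image."""
    pyramid = [np.asarray(image, dtype=np.float64)]
    for _ in range(levels - 1):
        blurred = gaussian_blur(pyramid[-1], sigma)
        pyramid.append(blurred[::2, ::2])   # assumed: decimation by 2 after blurring
    return pyramid

rng = np.random.default_rng(0)
frame = rng.random((64, 64))                # synthetic stand-in for one cropped frame
pyramid = build_pyramid(frame, levels=3, sigma=1.0)
print([layer.shape for layer in pyramid])
```

Applying `build_pyramid` to each of the N frames gives one L-layer image set per frame, which together form the pyramid used in S3.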
[0034] It should be noted that the final image pyramid is a four-dimensional data structure. For an image sequence consisting of N frames, each frame independently generates an image set containing L layers of images. Aligning the image sets of all N frames along the time dimension forms a four-dimensional pyramid indexed as $N \times L \times H \times W$, where H and W are the height and width of each image layer, respectively (both shrink from layer to layer). This structure serves as the unified data basis for subsequent steps to process each frame and each image layer in parallel or sequentially.
[0035] S3: Based on the image pyramid, perform a hierarchical bispectral phase reconstruction operation to generate target phase information.
[0036] It should be noted that the hierarchical bispectral phase reconstruction operation is the core innovative step of this method. Its purpose is to robustly recover the phase information representing the true geometry and texture of the target component surface from image sequences disturbed by complex environmental factors such as hot airflow. This operation combines the noise suppression capability of bispectral analysis with the phase unwrapping capability of the multi-resolution pyramid. Bispectral analysis can effectively suppress the random, approximately Gaussian phase perturbations caused by thermal convection, while the hierarchical strategy solves the problem of periodic phase jumps caused by periodic high-frequency textures such as the ribs of a multi-ribbed belt, preventing the algorithm itself from generating false features that resemble cracks.
[0037] In S3, based on the image pyramid, a hierarchical bispectral phase reconstruction operation is performed to generate target phase information, as follows: Based on the image of the (L-1)th layer in the image pyramid, a base phase is generated. It should be noted that generating the base phase from the layer L-1 image is the starting point of the hierarchical reconstruction. The top layer, L-1, is chosen because it has the lowest resolution: high-frequency details (including interfering textures and noise) are heavily smoothed, and the true phase change between adjacent pixels is very slow, much smaller than π. Under these conditions, bispectral analysis can yield an initial global, continuous, unwrapped phase estimate, i.e., the base phase. Although this base phase is coarse, it provides the correct phase topology on which all subsequent fine reconstructions depend.
[0038] Starting from the base phase, the phase reconstruction algorithm is performed layer by layer from the (L-1)th layer to the 0th layer to obtain the target phase information; For the phase reconstruction of the k-th layer, the initial value of k is L-2, which is successively decreased to 0, including: Upsample the reconstructed (k+1)th layer phase to generate the predicted phase of the kth layer. Based on the image of the k-th layer in the image pyramid, an observation phase is generated; Calculate the phase difference between the observed phase and the predicted phase, and generate a phase residual based on the phase difference; The predicted phase is added to the phase residual to generate the reconstructed phase of the k-th layer; When k=0, the reconstructed phase obtained is the target phase information.
[0039] It should be noted that the layer-by-layer phase reconstruction from layer L-1 to layer 0 is a guided unwrapping process that progresses from coarse to fine. For layer k, the predicted phase $\hat{\varphi}_k$ is obtained by upsampling (e.g., by bilinear interpolation) the clean phase reconstructed at the coarser layer k+1, and it provides the low-frequency trend of the phase at this resolution. The observed phase $\varphi_k^{\mathrm{obs}}$ is the initial result of performing bispectral analysis directly on the layer k image; it contains high-frequency details but is also heavily wrapped. The key step is generating the phase residual from the phase difference: first calculate the difference between the observed phase and the predicted phase, $\Delta\varphi_k = \varphi_k^{\mathrm{obs}} - \hat{\varphi}_k$, then apply a modulo-2π operation to $\Delta\varphi_k$, i.e., constrain its range to $(-\pi, \pi]$. This operation relies on the principle that the phase change contributed by high-frequency details is small and does not exceed π, so the phase can be unwrapped safely and a clean phase residual is obtained.
[0040] It is important to note that adding the predicted phase to the phase residual essentially fuses the reliable phase profile (predicted phase) from the low-resolution image with the high-detail phase correction (phase residual) at the current resolution, resulting in the unwrapped reconstructed phase for the k-th layer. This process starts at k=L-2 and proceeds layer by layer toward higher resolution (k decreasing), with the reconstruction result of each layer serving as the phase prediction baseline for the next higher-resolution layer. When the iteration reaches layer 0 (original resolution), the reconstructed phase obtained is the final target phase information. It preserves the texture details of the original image to the greatest extent possible while eliminating the random phase noise caused by airflow disturbances and the phase wrapping artifacts generated by the calculation.
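The predict, wrap, and add step for a single layer can be illustrated with a one-dimensional toy example. This is a sketch under simplifying assumptions: the phase is a 1D profile rather than a 2D map, the coarser layer has exactly half the samples, upsampling is linear interpolation, and the "observed" phase is simulated by wrapping a known ground truth instead of coming from bispectral analysis. The names `wrap` and `refine_level` are hypothetical.

```python
import numpy as np

def wrap(phase):
    """Wrap phase values into (-pi, pi]."""
    return np.angle(np.exp(1j * phase))

def refine_level(coarse_phase, observed_phase):
    """One layer of guided unwrapping: predicted phase plus wrapped residual."""
    fine_grid = np.arange(observed_phase.size)
    coarse_grid = np.arange(coarse_phase.size) * 2        # assumes a 1/2 ratio
    predicted = np.interp(fine_grid, coarse_grid, coarse_phase)
    residual = wrap(observed_phase - predicted)           # difference taken modulo 2*pi
    return predicted + residual

x = np.linspace(0.0, 1.0, 64)
true_phase = 12.0 * x ** 2          # smooth profile that exceeds pi, so it wraps
observed = wrap(true_phase)         # simulated wrapped observation at layer k
coarse = true_phase[::2]            # stand-in for the reconstructed layer k+1 phase
reconstructed = refine_level(coarse, observed)
print(np.allclose(reconstructed, true_phase))
```

The unwrapping succeeds here because the prediction is everywhere within π of the true phase, which is the condition the paragraph above relies on.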
[0041] Based on the image at layer L-1 of the image pyramid, the base phase is generated as follows: Perform a two-dimensional Fourier transform on the image of the (L-1)th layer in the image pyramid; Based on the result of the two-dimensional Fourier transform, calculate the bispectrum, i.e., the Fourier transform of the third-order cumulant; Based on the bispectrum, generate the base phase using a phase reconstruction algorithm; The phase reconstruction algorithm is a recursive algorithm.
[0042] It is important to note that calculating the bispectrum is crucial for recovering the base phase. The bispectrum is the Fourier transform of the third-order cumulant of the signal, and its mathematical properties make it blind to zero-mean Gaussian random noise, which the method uses to model the random phase jitter caused by turbulence. Specifically, a two-dimensional Fourier transform is first performed on the layer L-1 image sequence to obtain the complex spectrum. The bispectral density is then calculated according to $B(\mathbf{u}, \mathbf{v}) = \langle F(\mathbf{u})\, F(\mathbf{v})\, F^{*}(\mathbf{u} + \mathbf{v}) \rangle$, where $F$ is the spectrum, $\mathbf{u}$ and $\mathbf{v}$ are two-dimensional spatial-frequency vectors, $^{*}$ denotes the complex conjugate, and $\langle \cdot \rangle$ is taken here to mean the average over the frames of the sequence. Bispectral processing preserves the phase information of the signal while filtering out the phase effects of additive Gaussian noise.
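A frame-averaged bispectrum can be sketched in one dimension as below. The example does not demonstrate the Gaussian-noise property described above; it demonstrates a related, well-known property of the triple product, namely that it is unchanged when a frame is circularly shifted, which is one reason bispectral averaging tolerates frame-to-frame jitter. The 1D setting, the integer shifts, and the function name `bispectrum_1d` are simplifying assumptions.

```python
import numpy as np

def bispectrum_1d(frames):
    """Frame-averaged bispectrum B[u, v] = mean over frames of F(u) F(v) conj(F(u+v)).

    frames: array of shape (num_frames, n); frequency indices are taken modulo n.
    """
    spectra = np.fft.fft(frames, axis=-1)
    n = spectra.shape[-1]
    idx = np.arange(n)
    sum_idx = (idx[:, None] + idx[None, :]) % n
    triple = spectra[:, :, None] * spectra[:, None, :] * np.conj(spectra[:, sum_idx])
    return triple.mean(axis=0)

rng = np.random.default_rng(0)
signal = rng.random(16)                                  # synthetic 1D "image"
shifts = rng.integers(0, 16, size=8)                     # random per-frame jitter
jittered = np.stack([np.roll(signal, s) for s in shifts])

b_ref = bispectrum_1d(signal[None, :])                   # bispectrum of the clean signal
b_jit = bispectrum_1d(jittered)                          # bispectrum of the jittered frames
print(np.allclose(b_ref, b_jit))
```

A shift multiplies F(u) by a linear phase factor, and the factors for u, v, and u+v cancel in the triple product, so every jittered frame contributes the same bispectrum as the clean signal.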
[0043] It should be noted that generating the base phase through the phase reconstruction algorithm means recovering the Fourier phase of the original signal from the bispectral data. This embodiment preferably employs a recursive algorithm based on the phase relationship of the bispectrum: starting from a predetermined initial point (such as zero frequency), the phase values of all frequency points are calculated step by step through a recursive relationship. Because the phase of the layer L-1 image changes gradually, the recursive algorithm can converge stably and accurately to the true base phase under this condition.
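One common form of such a recursion can be sketched in one dimension. The patent does not give its recursion formula, so the relation used here is an assumption: the bispectrum phase satisfies β(u, v) = φ(u) + φ(v) − φ(u+v), which gives φ(k) = φ(1) + φ(k−1) − β(1, k−1). The starting values φ(0) = 0 and φ(1) = 0 are also assumptions; fixing φ(1) only determines the signal's position, so the result is compared with the true phase after removing its linear term. Function names are hypothetical.

```python
import numpy as np

def bispectrum_phase_row(frames):
    """Phase beta(1, v) of the frame-averaged bispectrum for v = 0 .. n-2."""
    spectra = np.fft.fft(frames, axis=-1)
    n = spectra.shape[-1]
    v = np.arange(n - 1)
    triple = spectra[:, 1:2] * spectra[:, v] * np.conj(spectra[:, v + 1])
    return np.angle(triple.mean(axis=0))

def recover_phase(beta_row, n):
    """Recursive phase recovery: phi(k) = phi(1) + phi(k-1) - beta(1, k-1)."""
    phase = np.zeros(n)                  # phase[0] = phase[1] = 0 are assumed start values
    for k in range(2, n):
        phase[k] = phase[1] + phase[k - 1] - beta_row[k - 1]
    return np.angle(np.exp(1j * phase))  # wrap into (-pi, pi]

rng = np.random.default_rng(1)
signal = rng.random(16)                                  # positive signal, so phi(0) = 0
shifts = rng.integers(0, 16, size=8)
frames = np.stack([np.roll(signal, s) for s in shifts])  # jittered observations

recovered = recover_phase(bispectrum_phase_row(frames), signal.size)

true_phase = np.angle(np.fft.fft(signal))
k = np.arange(signal.size)
expected = true_phase - k * true_phase[1]                # true phase without its linear term
print(np.allclose(np.exp(1j * recovered), np.exp(1j * expected), atol=1e-8))
```

In this noise-free toy case the recursion reproduces the phase exactly up to the linear term; with real data, errors accumulate along the recursion, which is consistent with starting at the smooth top layer.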
[0044] S4: Generate a reconstructed image based on the target phase information and the amplitude spectrum information extracted from the image sequence.
[0045] It should be noted that a clear, interference-free reconstructed image is synthesized by utilizing the repaired target phase information and the robustly estimated amplitude spectrum information from the original data. This transforms the processed signal in the frequency domain back into the spatial domain, generating an optimized image that can be directly interpreted by subsequent fault identification models.
[0046] In S4, a reconstructed image is generated based on the target phase information and the image sequence, as follows: Calculate the amplitude spectrum of each frame in the image sequence and average the amplitude spectra over the frames to construct the amplitude spectrum information. It should be noted that this averaging is crucial for obtaining stable amplitude information. Although airflow turbulence causes random and drastic phase changes in each frame, the amplitude spectrum of the image content (i.e., the energy intensity of each frequency component) remains relatively stable over a short period. By performing a two-dimensional Fourier transform on each of the N frames (e.g., N=30) and then averaging their amplitude spectra (i.e., the modulus of the complex spectrum) point by point, transient noise that may exist in a single frame can be effectively smoothed out, resulting in a lower-noise and more reliable average amplitude spectrum. This average amplitude spectrum represents the inherent frequency energy distribution of the target component's surface texture during the acquisition period, unaffected by random phase disturbances.
[0047] Based on the target phase information and the amplitude spectrum information, an inverse Fourier transform is performed to generate a reconstructed image.
[0048] It should be noted that performing the inverse Fourier transform based on the target phase information and the amplitude spectrum information is the physical and mathematical foundation of image reconstruction. According to Fourier theory, an image is entirely determined by the amplitude spectrum and phase spectrum of its Fourier transform. The phase spectrum carries the spatial location information of edges, contours, and detailed structures in the image, and is crucial for visual perception. In this method, the target phase information is the phase spectrum restored in step S3, with airflow disturbance noise removed; the amplitude spectrum information is the stable average amplitude spectrum calculated above. Combining the two constructs a complex spectrum: $\hat{F}(\mathbf{u}) = \bar{A}(\mathbf{u})\, e^{j \varphi(\mathbf{u})}$, where $\bar{A}$ is the average amplitude spectrum, $\varphi$ is the target phase information, $\mathbf{u}$ is the two-dimensional spatial frequency, and $j$ is the imaginary unit. Performing a two-dimensional inverse Fourier transform on this complex spectrum transforms it back from the frequency domain to the spatial domain, thereby generating the final reconstructed image $\hat{I} = \mathcal{F}^{-1}\{\hat{F}\}$.
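The synthesis step can be sketched as follows: average the per-frame amplitude spectra, attach a phase, and invert. In this illustration the target phase is simply taken from a known clean image, standing in for the output of S3, and the frames are circularly shifted copies of that image; both choices, along with the function name `reconstruct`, are assumptions made for the demonstration.

```python
import numpy as np

def reconstruct(frames, target_phase):
    """Combine the frame-averaged amplitude spectrum with a target phase and invert.

    frames:       array of shape (N, H, W).
    target_phase: array of shape (H, W), Fourier phase in radians.
    """
    amplitude = np.abs(np.fft.fft2(frames, axes=(-2, -1))).mean(axis=0)
    spectrum = amplitude * np.exp(1j * target_phase)
    # The imaginary part is only numerical round-off when the phase belongs to a real image.
    return np.real(np.fft.ifft2(spectrum))

rng = np.random.default_rng(2)
image = rng.random((16, 16))                              # synthetic clean image
offsets = rng.integers(0, 16, size=(6, 2))                # per-frame jitter (dy, dx)
frames = np.stack([np.roll(image, (dy, dx), axis=(0, 1)) for dy, dx in offsets])

target_phase = np.angle(np.fft.fft2(image))               # stand-in for the S3 result
restored = reconstruct(frames, target_phase)
print(np.allclose(restored, image))
```

A pure shift changes only the Fourier phase, so the averaged amplitude equals that of the clean image and the reconstruction matches it; real turbulence is more complex than a shift, so this shows the mechanics of the step rather than its performance.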
[0049] It should be noted that the final reconstructed image has the following core features: it preserves the true texture details of the target component's surface in the original image to the greatest extent possible (such as the toothed structure and surface texture of the multi-ribbed belt), because its phase information has been accurately recovered; at the same time, it significantly suppresses or even eliminates the imaging distortions caused by refractive index fluctuations arising from environmental factors such as hot airflow, because the random phase noise that causes these distortions has been effectively filtered out in the bispectral analysis and layered reconstruction. Therefore, the reconstructed image is closer to one taken through an ideal and stable optical medium, which makes it easier and more reliable to identify real surface cracks, tooth root fractures, and other fault features, whether by manual observation or AI model detection. This solves the technical problem, described in the background technology, of a high false alarm rate caused by environmental interference such as thermal turbulence.
[0050] S5: Based on the reconstructed image, identify the fault characteristics of the target component and generate detection results.
[0051] It should be noted that the purpose of S5 is to perform automated intelligent analysis on the clear reconstructed image after interference removal, to achieve accurate identification and judgment of fault features, and to form structured detection results.
[0052] In S5, based on the reconstructed image, the fault characteristics of the target component are identified as follows: The reconstructed image is input into a pre-trained fault identification model to obtain the identification result output by the fault identification model; The identification result includes at least one of the following: the confidence level and location of the crack on the surface of the target component, and the axial eccentricity of the target component; When the confidence level exceeds the first preset threshold, or the axial eccentricity exceeds the second preset threshold, the target component is determined to be faulty, and a corresponding detection result is generated.
[0053] It should be noted that the pre-trained fault identification model is a machine learning model specifically optimized for identifying surface defects and mechanical deviations in specific target components (such as multi-ribbed belts or drive rollers in a treadmill). This model takes the reconstructed image generated in the preceding steps as input, and its output is structured data. Specifically: Confidence and location of surface cracks (or tooth root fractures): If the model detects a suspected crack, it will output a confidence level between 0 and 1, indicating the reliability of the judgment, and output a bounding box or its pixel-level segmentation mask marking the defect area to indicate the location.
[0054] Anomalies in critical dimensions: For example, for drive rollers, radial runout can be calculated by analyzing the edge contours in the reconstructed image; for multi-ribbed belts, tooth wear depth or pitch anomalies can be assessed. These image measurements, combined with camera calibration parameters, can be converted into actual physical deviations.
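As a non-limiting illustration, converting an image measurement into a physical deviation can be sketched as follows. The sketch assumes that the edge radius of the drive roller has already been measured in pixels at several rotation positions and that calibration yields a single scale factor `mm_per_pixel`; both the function name and this simplified calibration model are assumptions made for the example, not part of the specification.

```python
import numpy as np

def radial_runout_mm(edge_radii_px, mm_per_pixel):
    """Peak-to-peak variation of the measured edge radius, in millimetres.

    edge_radii_px: edge radius measured in pixels at several positions
                   (hypothetical input for this sketch).
    mm_per_pixel:  scale factor obtained from camera calibration.
    """
    radii = np.asarray(edge_radii_px, dtype=np.float64)
    # Runout is taken here as the spread between largest and smallest radius.
    return float((radii.max() - radii.min()) * mm_per_pixel)
```

A single scale factor is only adequate when the measured surface is approximately parallel to the image plane; a full calibration model would be needed otherwise.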
[0055] It should be noted that the first and second preset thresholds are the decision boundaries for determining faults and need to be set comprehensively based on actual operation and maintenance standards, false alarm tolerance, and component safety margins. For example, the first preset threshold (crack confidence level) can be set to 85% to achieve a balance between detection rate and false alarm rate; the second preset threshold (such as wear depth or runout) can be set according to the equipment manufacturer's technical specifications, for example, wear depth exceeding 1 mm. When any condition is triggered, the system determines that the target component has a fault and automatically generates a standardized detection result report containing fault type, level, location, and evidence images (i.e., reconstructed images) to trigger a maintenance work order or warning.
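As a non-limiting illustration, the decision rule based on the two preset thresholds can be sketched as follows. The class `Identification`, the function `is_faulty`, and the field names are assumptions made for this example. The default values 0.85 and 1.0 mm are the example thresholds mentioned above, and "exceeds" is interpreted as a strict comparison.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Identification:
    """Hypothetical container for the output of the fault identification model."""
    crack_confidence: Optional[float] = None  # value in [0, 1], if a crack was detected
    deviation_mm: Optional[float] = None      # measured physical deviation, if available

def is_faulty(result, conf_threshold=0.85, deviation_threshold_mm=1.0):
    """Return True when either quantity strictly exceeds its preset threshold."""
    crack = (result.crack_confidence is not None
             and result.crack_confidence > conf_threshold)
    deviation = (result.deviation_mm is not None
                 and result.deviation_mm > deviation_threshold_mm)
    # Any single triggered condition is sufficient to declare a fault.
    return crack or deviation
```

In practice both thresholds would be configured according to the operation and maintenance standards and the manufacturer's technical specifications rather than hard-coded.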
[0056] Constructing a fault identification model: Obtain a training image sample set, which includes multiple sample images of the target component in normal condition and multiple sample images of the target component labeled with fault features; For each sample image in the training image sample set, perform steps S1 to S4 to generate the corresponding reconstructed training image; Based on the reconstructed training images and their corresponding fault feature annotations, a fault identification model is constructed using a convolutional neural network algorithm.
[0057] It is important to note that building a dedicated fault identification model is a prerequisite for ensuring the effectiveness of step S5. Its training process emphasizes consistency with the front-end denoising workflow: the original sample images in the training image sample set (whether normal or faulty) need to completely simulate the online process, i.e., perform the same steps S1 to S4 to generate reconstructed training images. This is crucial because it ensures that the model learns and recognizes the image features after bispectral pyramid reconstruction, rather than the features in the original image affected by thermal disturbances, thus guaranteeing the highest recognition accuracy and generalization ability for reconstructed images produced by the same process when deployed online.
[0058] It should be noted that during training, the network is trained end-to-end using reconstructed training images and their annotations (such as defect bounding boxes and wear depth ground truth). By optimizing the loss function (such as classification loss and localization loss), the model learns to accurately identify and quantify fault features from the images.
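As a non-limiting illustration, the requirement that training samples pass through the same steps S1 to S4 as online data can be sketched as follows. The function `build_training_set` and the callable `reconstruct`, which stands in for the complete S1 to S4 workflow, are assumptions made for this example; the convolutional neural network and its loss function are deliberately not shown.

```python
from typing import Any, Callable, Iterable, List, Tuple
import numpy as np

def build_training_set(
    samples: Iterable[Tuple[np.ndarray, Any]],
    reconstruct: Callable[[np.ndarray], np.ndarray],
) -> List[Tuple[np.ndarray, Any]]:
    """Pair each reconstructed training image with its annotation.

    samples:     (image sequence, annotation) pairs for normal and faulty parts.
    reconstruct: the same S1-S4 workflow used online (hypothetical callable).
    """
    # Using one shared callable keeps offline and online preprocessing identical.
    return [(reconstruct(sequence), label) for sequence, label in samples]
```

Passing the same `reconstruct` callable to both the training-set builder and the online detector is one simple way to keep the two workflows from drifting apart.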
[0059] Referring to Figure 2, a remote fault detection system for fitness equipment comprises: an image acquisition module, used to acquire multiple frames of images of the target component and construct an image sequence; a pyramid building module, used to perform multi-resolution decomposition on the images in the image sequence and build an image pyramid, wherein the image pyramid contains at least two resolution levels; a bispectral reconstruction module, used to perform hierarchical bispectral phase reconstruction operations based on the image pyramid to generate target phase information; an image synthesis module, used to generate a reconstructed image based on the target phase information and the amplitude spectrum information extracted from the image sequence; and a fault identification module, used to identify the fault characteristics of the target component based on the reconstructed image and generate detection results.
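As a non-limiting illustration, the data flow between the five modules can be sketched as follows. The class `FaultDetectionSystem` and its field names are assumptions made for this example; each module is represented by a placeholder callable, and the internal processing of each module is not shown.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class FaultDetectionSystem:
    """Hypothetical wiring of the five modules as interchangeable callables."""
    acquire: Callable[[], Any]               # image acquisition module
    build_pyramid: Callable[[Any], Any]      # pyramid building module
    reconstruct_phase: Callable[[Any], Any]  # bispectral reconstruction module
    synthesize: Callable[[Any, Any], Any]    # image synthesis module
    identify: Callable[[Any], Any]           # fault identification module

    def run(self) -> Any:
        sequence = self.acquire()
        pyramid = self.build_pyramid(sequence)
        phase = self.reconstruct_phase(pyramid)
        # The synthesis module needs both the phase and the original sequence,
        # because the amplitude spectrum is extracted from the sequence.
        image = self.synthesize(phase, sequence)
        return self.identify(image)
```

Representing the modules as callables reflects the statement that the division of modules is a logical functional division: each one can be replaced or relocated without changing the order of the data flow.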
[0060] In summary, this patent application innovatively combines the anti-random interference capability of bispectral analysis with the structural preservation capability of hierarchical processing, successfully solving the reliability bottleneck of visual inspection caused by the thermodynamic effects of equipment, which traditional methods cannot overcome, and realizing accurate and stable detection of fault characteristics in complex physical environments.
[0061] Furthermore, any content not described in detail in this specification is existing technology known to those skilled in the art.
[0062] In the embodiments provided by this invention, it should be understood that the disclosed system or method can be implemented in other ways. The embodiments of the invention described above are merely illustrative; for instance, the division of modules is only a logical functional division, and there may be other division methods in actual implementation.
[0063] The modules described as separate components may or may not be physically separate. The components shown as modules may or may not be physical modules; they may be located in one place or distributed across multiple network modules. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs.
[0064] Furthermore, the functional modules in the various embodiments of the present invention can be integrated into one processing module, or each module can exist physically separately, or two or more modules can be integrated into one module. The integrated module can be implemented in hardware or in the form of hardware plus software functional modules.
[0065] For those skilled in the art, it is obvious that the present invention is not limited to the details of the above exemplary embodiments, and that the present invention can be implemented in other specific forms without departing from the basic characteristics of the present invention.
[0066] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.
Claims
1. A remote fault detection method for fitness equipment, characterized in that it includes the following steps: S1: Acquire multiple frames of images of the target component and construct an image sequence; S2: Perform multi-resolution decomposition on the images in the image sequence to construct an image pyramid, the image pyramid containing at least two resolution levels; S3: Based on the image pyramid, perform a hierarchical bispectral phase reconstruction operation to generate target phase information; S4: Based on the target phase information and the amplitude spectrum information extracted from the image sequence, generate a reconstructed image; S5: Based on the reconstructed image, identify the fault characteristics of the target component and generate detection results.
2. The remote fault detection method for fitness equipment as described in claim 1, characterized in that: In S1, multiple frames of images of the target component are acquired and an image sequence is constructed as follows: The image acquisition device continuously acquires local images of the target component at a predetermined frame rate within a preset time period to obtain a first image sequence containing N frames, where N is an integer greater than 1. For each frame in the first image sequence, generate an image portion corresponding to a preset high-risk area on the surface of the target component; An image sequence is constructed based on the sequence of image parts generated from each frame image.
3. The remote fault detection method for fitness equipment as described in claim 1, characterized in that: In S2, the images in the image sequence are decomposed into multi-resolution components to construct an image pyramid. The image pyramid contains at least two resolution levels, as follows: Based on each frame of the image sequence, Gaussian blur and downsampling operations are performed to generate an L-layer image set, where L is an integer greater than 1. The 0th layer is the original resolution image, and the (L-1)th layer is the lowest resolution image. The image pyramid is formed by the L-layer image set corresponding to each frame in the image sequence.
4. The remote fault detection method for fitness equipment as described in claim 3, characterized in that: Based on each frame of the image sequence, Gaussian blur and downsampling operations are performed to generate an L-layer image set, as follows: For each frame of the image sequence, perform the following iterative generation operation: Define the original resolution image of the current frame as the layer 0 image; For the Nth layer image, where N=0,1,...,L-2, perform a Gaussian blur operation once, and then perform a downsampling operation on the image after the Gaussian blur operation. The resulting image is defined as the N+1th layer image. After L-1 iterations of generation, an L-layer image set consisting of images from layer 0 to layer L-1 is obtained.
5. The remote fault detection method for fitness equipment as described in claim 4, characterized in that: In S3, based on the image pyramid, a hierarchical bispectral phase reconstruction operation is performed to generate target phase information, as follows: Based on the image of the (L-1)th layer in the image pyramid, a basic phase is generated; Starting from the base phase, the phase reconstruction algorithm is performed layer by layer from the (L-1)th layer to the 0th layer to obtain the target phase information; For the phase reconstruction of the k-th layer, the initial value of k is L-2, which is successively decreased to 0, including: Upsample the reconstructed (k+1)th layer phase to generate the predicted phase of the kth layer. Based on the image of the k-th layer in the image pyramid, an observation phase is generated; Calculate the phase difference between the observed phase and the predicted phase, and generate a phase residual based on the phase difference; The predicted phase is added to the phase residual to generate the reconstructed phase of the k-th layer; When k=0, the reconstructed phase obtained is the target phase information.
6. The remote fault detection method for fitness equipment as described in claim 5, characterized in that: Based on the image of the (L-1)th layer in the image pyramid, the basic phase is generated as follows: Perform a two-dimensional Fourier transform on the image of the (L-1)th layer in the image pyramid; Based on the results of the two-dimensional Fourier transform, the Fourier transform of its third-order cumulant is calculated to generate a bispectrum; Based on the bispectrum, a basic phase is generated using a phase reconstruction algorithm; The phase reconstruction algorithm is a recursive algorithm.
7. The remote fault detection method for fitness equipment as described in claim 1, characterized in that: In S4, a reconstructed image is generated based on the target phase information and the image sequence, as follows: Calculate the average amplitude spectrum of each frame in the image sequence to construct amplitude spectrum information; Based on the target phase information and the amplitude spectrum information, an inverse Fourier transform is performed to generate a reconstructed image.
8. The remote fault detection method for fitness equipment as described in claim 1, characterized in that: In S5, based on the reconstructed image, the fault characteristics of the target component are identified as follows: The reconstructed image is input into a pre-trained fault identification model to obtain the identification result output by the fault identification model; The identification result includes at least one of the following: the confidence level and location of the crack on the surface of the target component, and the axial eccentricity of the target component; When the confidence level exceeds the first preset threshold, or the axial eccentricity exceeds the second preset threshold, the target component is determined to be faulty, and a corresponding detection result is generated.
9. The remote fault detection method for fitness equipment as described in claim 8, characterized in that: Constructing a fault identification model: Obtain a training image sample set, which includes multiple sample images of the target component in normal condition and multiple sample images of the target component labeled with fault features; For each sample image in the training image sample set, perform steps S1 to S4 to generate the corresponding reconstructed training image; Based on the reconstructed training images and their corresponding fault feature annotations, a fault identification model is constructed using a convolutional neural network algorithm.
10. A remote fault detection system for fitness equipment, employing the remote fault detection method for fitness equipment as described in any one of claims 1-9, characterized in that it includes: an image acquisition module, used to acquire multiple frames of images of the target component and construct an image sequence; a pyramid building module, used to perform multi-resolution decomposition on the images in the image sequence and build an image pyramid, wherein the image pyramid contains at least two resolution levels; a bispectral reconstruction module, used to perform hierarchical bispectral phase reconstruction operations based on the image pyramid to generate target phase information; an image synthesis module, used to generate a reconstructed image based on the target phase information and the amplitude spectrum information extracted from the image sequence; and a fault identification module, used to identify the fault characteristics of the target component based on the reconstructed image and generate detection results.