Motor detection method, device and terminal equipment
By converting motor acoustic fingerprint data into graphical difference field images and constructing a normal feature library, and combining redundancy elimination regularization loss and joint similarity calculation, the problem of the dependence of contrastive learning models on abnormal data in motor fault detection is solved, and efficient motor fault detection is achieved.
Patent Information
- Authority / Receiving Office: CN · China
- Patent Type: Patents (China)
- Current Assignee / Owner: SEVNCE ROBOTICS CO LTD
- Filing Date: 2023-03-06
- Publication Date: 2026-06-12
AI Technical Summary
Existing contrastive-learning approaches to motor fault detection need a large amount of abnormal voiceprint data to train the model. Without that data, the model is prone to collapse and cannot effectively distinguish normal from abnormal voiceprint data.
By converting normal motor acoustic fingerprint data into training graphic difference field images, a contrastive learning model is used to train the encoder model, a normal feature library is constructed, and redundancy elimination regularization loss is used for training. The joint similarity is calculated by combining negative cosine similarity and Euclidean distance to determine whether the acoustic fingerprint data of the motor under test is abnormal.
Without relying on a large amount of abnormal acoustic signature data, model collapse is effectively avoided, thus improving the accuracy and reliability of motor fault detection.
Figure CN116184197B_ABST (abstract figure; image not reproduced)
Description
Technical Field
[0001] This invention relates to the field of fault detection technology, and in particular to a motor detection method, device, and terminal equipment.
Background Technology
[0002] When a motor malfunctions, it is usually accompanied by periodic abnormal noises. Collecting this audio data can reveal some obvious abnormal characteristics. Therefore, the commonly used sound detection method is to perform feature engineering on the audio signal and then model it to obtain discriminative features. These features are then classified to achieve the purpose of detecting abnormal sounds.
[0003] Audio feature engineering covers frequency-domain features, such as the Short-Time Fourier Transform, the Mel spectrogram, and the log-Mel spectrogram, as well as cepstral features such as MFCCs. These features can be modeled directly with machine learning algorithms such as Support Vector Machines, Gaussian Mixture Models, and Hidden Markov Models, but the results are often unsatisfactory: frequency-domain and cepstral features are highly specific and redundant, and traditional machine learning methods, whose fitting ability is already limited, struggle to learn the true feature differences from them. Contrastive learning models are therefore used to preprocess the features. Contrastive learning is a typical discriminative self-supervised approach: it automatically constructs similar and dissimilar instances and learns a representation in which similar instances lie close together in the projection space and dissimilar instances lie far apart. If the training data contains no negative samples, the model will, at prediction time, place normal and abnormal instances very close together in the projection space, so it cannot tell whether voiceprint data is normal. This is model collapse. Most contrastive learning methods avoid collapse by adding negative samples to the dataset, but in real-world scenarios it is difficult to obtain a large amount of abnormal voiceprint data to train the model.
Summary of the Invention
[0004] The main objective of this invention is to propose a motor detection method, device, and terminal equipment to solve the problem that a large amount of abnormal voiceprint data is needed to train the model when using a contrastive learning model in audio feature engineering.
[0005] To achieve the above objectives, a first aspect of the present invention provides a motor detection method, comprising:
[0006] Before performing motor testing, acquire acoustic fingerprint data of a normal motor;
[0007] After converting the normal motor acoustic data into a training graphic difference field image, it is input into the encoder model trained by the contrastive learning model to obtain high-dimensional features and construct a normal feature library.
[0008] The encoder model trained by the contrastive learning model obtains orthogonal high-dimensional features based on the training graph difference field image and constructs them into a normal feature library.
[0009] The encoder model trained by the contrastive learning model includes two branches, each branch including an encoder and a predictor.
[0010] The encoder is used to extract general low-level features and encode general detail information that is not relevant to the task, while the predictor is used to encode high-level feature information that is relevant to the task.
[0011] When performing motor testing, acquire the acoustic signature data of the motor to be predicted;
[0012] After converting the motor acoustic data to be predicted into a target graphic differential field image, the image is input into the encoder model trained by the contrastive learning model to obtain the target features encoded and output by the encoder model trained by the contrastive learning model.
[0013] Calculate the joint similarity between the target feature and existing features in the normal feature library;
[0014] If the similarity is outside the preset range, the motor voiceprint data to be predicted is abnormal, and the motor is in an abnormal working state.
[0015] In conjunction with the first aspect of the present invention, in the first embodiment of the present invention, before converting the normal motor acoustic signature data into a training graphic difference field image, the following steps are included:
[0016] Convert the normal motor acoustic fingerprint data into time-series data $X = \{x_1, x_2, \ldots, x_N\}$;
[0017] where N is a positive integer.
[0018] In conjunction with the first embodiment of the first aspect of the present invention, in the second embodiment of the present invention, converting the normal motor acoustic signature data into a training graphic difference field image includes:
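[0019] Set the number n of graphics for the training graphic difference field [constraint on n not reproduced in source];
[0020] Set the step size d [range not reproduced in source], bounded by a defined maximum step size, where any two step sizes are different;
[0021] Set the time window [symbol and range not reproduced in source], where any two time windows have different lengths;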
[0022] Based on the number of images, step size, and time window, extract any segment of the time-series data multiple times;
[0023] The results of multiple extractions are combined and transformed to obtain the training image difference field image.
[0024] In conjunction with the second embodiment of the first aspect of the present invention, in the third embodiment of the present invention, the results of multiple extractions are combined and transformed to obtain a training pattern difference field image, including:
[0025] The results of the multiple extractions are combined and transformed to obtain the graphic difference set. The calculation formula is:
[0026] [formula not reproduced in source];
[0027] where the symbols in the formula denote, in order, a graphic and its total size, the result of the multiple extractions, and the time-series data at a given time point;
[0028] A new sequence is defined from the graphic difference set and zero-padded so that its length matches that of the graphic difference set. The calculation formula is:
[0029] [formula not reproduced in source];
[0030] Define the graphic difference field $\mathrm{MDF}_n$. The calculation formula is:
[0031] [formula not reproduced in source]
[0032] where the formula uses (n-1) sequence sets corresponding to the step size d, so that the graphic difference field generates (n-1) channel images, one per sequence set;
[0033] For the i-th channel image, define an image array as: [formula not reproduced in source];
[0034] where [definition not reproduced in source]
[0035] Fill the zero elements of the image array to define each channel of the graphic difference field image:
[0036] [formula not reproduced in source]
[0037] where one array in the formula is obtained by rotating another, and the product is the Hadamard product. The completed graphic difference field image is called the training graphic difference field image.
[0038] In conjunction with the first aspect of the present invention, in the fourth embodiment of the present invention, converting the normal motor acoustic data into a training graphic difference field image and then inputting it into an encoder model trained by a contrastive learning model includes:
[0039] Input any two of the training image difference field images into the encoder model trained by the contrast learning model;
[0040] In this process, any two training pattern difference field images are encoded by the encoder and then mapped again by the predictor.
[0041] In conjunction with the first aspect of the present invention, in the fifth embodiment of the present invention, the encoder model trained by the contrastive learning model is trained using a redundancy elimination regularization loss, wherein the redundancy elimination regularization loss is:
[0042] [formula not reproduced in source];
[0043] where cr represents the correlation matrix, I represents the identity matrix, and $z_1$ and $z_2$ represent the features encoded by the two branches.
[0044] In conjunction with the first aspect of the present invention, in the sixth embodiment of the present invention, the joint similarity includes negative cosine similarity and Euclidean distance, and the calculation formula is as follows:
[0045] Q = [formula not reproduced in source];
[0046] where $z_a$ and $z_b$ represent two features from the normal feature library.
[0047] A second aspect of the present invention provides a motor detection device, comprising:
[0048] The training data acquisition module is used to acquire normal motor acoustic fingerprint data before motor testing;
[0049] The normal feature library construction module is used to convert the normal motor acoustic data into a training graphic difference field image and then input it into the encoder model trained by the contrast learning model to obtain high-dimensional features and construct a normal feature library.
[0050] The encoder model trained by the contrastive learning model obtains orthogonal high-dimensional features based on the training graph difference field image and constructs them into a normal feature library.
[0051] The encoder model trained by the contrastive learning model includes two branches, each branch including an encoder and a predictor.
[0052] The encoder is used to extract general low-level features and encode general detail information that is not relevant to the task, while the predictor is used to encode high-level feature information that is relevant to the task.
[0053] The module for acquiring data to be predicted is used to acquire the acoustic fingerprint data of the motor to be predicted during motor testing.
[0054] The target feature conversion module is used to convert the motor acoustic data to be predicted into a target graphic differential field image and then input it into the encoder model trained by the contrastive learning model to obtain the target features encoded and output by the encoder model trained by the contrastive learning model.
[0055] The motor detection module is used to calculate the joint similarity between the target feature and existing features in the normal feature library;
[0056] If the similarity is outside the preset range, the motor voiceprint data to be predicted is abnormal, and the motor is in an abnormal working state.
[0057] A third aspect of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to implement the steps of the method provided in the first aspect above.
[0058] A fourth aspect of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the steps of the method provided in the first aspect above.
[0059] This invention proposes a motor detection method. An encoder model trained by a contrastive learning model outputs high-dimensional features corresponding to normal motor acoustic fingerprint data, and the remaining features are used to construct a normal feature library, which includes both relevant and irrelevant features. The joint similarity between the features in this library and the target features corresponding to the acoustic fingerprint data of the motor to be predicted is then calculated to determine whether that data is abnormal. The invention therefore avoids model collapse without requiring a large amount of abnormal acoustic fingerprint data to train the model.
Description of the Drawings
[0060] Figure 1 is a schematic diagram illustrating the implementation process of the motor detection method provided in an embodiment of the present invention;
[0061] Figure 2 is a schematic diagram of the encoder model trained by the contrastive learning model provided in an embodiment of the present invention;
[0062] Figure 3 is a schematic diagram of the composition of the motor detection device provided in an embodiment of the present invention.
[0063] The realization of the objectives, functional features, and advantages of the present invention will be further explained in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed Implementation
[0064] It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
[0065] It should be noted that, in this document, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Unless otherwise specified, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes that element.
[0066] In this document, suffixes such as "module," "part," or "unit" used to denote elements serve only to aid description and have no specific meaning in themselves. Therefore, "module" and "part" can be used interchangeably.
[0067] As shown in Figure 1, this embodiment of the invention provides a motor detection method, including but not limited to the following steps:
[0068] S101. Before performing motor testing, obtain normal motor acoustic signature data;
[0069] S102. The normal motor acoustic data is converted into a training graphic difference field image and then input into the encoder model trained by the contrastive learning model to obtain high-dimensional features and construct a normal feature library; wherein, the encoder model trained by the contrastive learning model obtains orthogonal high-dimensional features based on the training graphic difference field image and constructs them as a normal feature library.
[0070] Before converting the normal motor acoustic fingerprint data into a training graphic difference field image in step S102 above, the normal motor acoustic fingerprint data is further processed, including:
[0071] Convert the normal motor acoustic fingerprint data into time-series data $X = \{x_1, x_2, \ldots, x_N\}$, where N is a positive integer.
[0072] This invention embodiment uses a graphical difference algorithm to convert normal motor acoustic fingerprint data into a training graphical difference field image. Based on the processed normal motor acoustic fingerprint data, the step S102 of converting the normal motor acoustic fingerprint data into a training graphical difference field image includes:
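[0073] Set the number n of graphics for the training graphic difference field [constraint on n not reproduced in source];
[0074] Set the step size d [range not reproduced in source], bounded by a defined maximum step size, where any two step sizes are different;
[0075] Set the time window [symbol and range not reproduced in source], where any two time windows have different lengths;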
[0076] Based on the number of images, step size, and time window, extract any segment of the time-series data multiple times;
[0077] The results of multiple extractions are combined and transformed to obtain the training image difference field image.
[0078] The steps for implementing the combination transformation are as follows:
[0079] The results of multiple extractions are combined and transformed to obtain the training image difference field image, including:
[0080] The results of the multiple extractions are combined and transformed to obtain the graphic difference set. The calculation formula is:
[0081] [formula not reproduced in source];
[0082] where the symbols in the formula denote, in order, a graphic and its total size, the result of the multiple extractions, the graphic difference field transformation, and the time-series data at a given time point;
[0083] A new sequence is defined from the graphic difference set and zero-padded so that its length matches that of the graphic difference set. The calculation formula is:
[0084] [formula not reproduced in source];
[0085] Define the graphic difference field $\mathrm{MDF}_n$. The calculation formula is:
[0086] [formula not reproduced in source]
[0087] where the formula uses (n-1) sequence sets corresponding to the step size d, so that the graphic difference field generates (n-1) channel images, one per sequence set;
[0088] For the i-th channel image, define an image array as: [formula not reproduced in source];
[0089] where [definition not reproduced in source]
[0090] Fill the zero elements of the image array to define each channel of the graphic difference field image:
[0091] [formula not reproduced in source]
[0092] where one array in the formula is obtained by rotating another, and the product is the Hadamard product. The completed graphic difference field image is called the training graphic difference field image.
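The construction in paragraphs [0073] to [0092] can be illustrated with a minimal NumPy sketch. The formulas themselves are not available here, so every concrete choice below is an assumption rather than the patented definition: a segment is taken to be n samples spaced d apart starting at index s, the difference set is taken to be the first differences of that segment, the maximum step is taken as floor((N-1)/(n-1)), and the rotation is taken to be 180 degrees. The names `motif_difference` and `difference_field` are hypothetical.

```python
import numpy as np

def motif_difference(x, n, d, s):
    """First differences of n samples taken every d steps from index s.

    Assumed definition of the difference set; requires n >= 2.
    """
    idx = s + d * np.arange(n)       # positions of the n extracted samples
    return np.diff(x[idx])           # n-1 successive differences

def difference_field(x, n):
    """Build an (n-1)-channel image of shape (n-1, d_max, S).

    Rows index the step size d, columns index the start position s.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    d_max = (N - 1) // (n - 1)       # assumed maximum step size
    S = N - (n - 1)                  # number of valid starts when d == 1
    G = np.zeros((n - 1, d_max, S))  # zero-padded image array
    mask = np.zeros((d_max, S))      # 1 where a real value was written
    for d in range(1, d_max + 1):
        n_starts = N - (n - 1) * d   # fewer valid starts as d grows
        for s in range(n_starts):
            G[:, d - 1, s] = motif_difference(x, n, d, s)
        mask[d - 1, :n_starts] = 1.0
    # Fill the zero-padded cells with a 180-degree rotated copy of each
    # channel; the two element-wise (Hadamard) products select the regions.
    K = np.rot90(G, 2, axes=(1, 2))
    return G * mask + K * (1.0 - mask)

image = difference_field(np.arange(8.0), n=3)
print(image.shape)
```

With a length-8 series and n = 3, the sketch yields two channels of three rows (steps 1 to 3) and six columns (start positions); cells that zero-padding would have left empty are taken from the rotated copy.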
[0093] In this embodiment of the invention, the encoder model trained by the contrastive learning model includes two branches, each branch including an encoder and a predictor;
[0094] The encoder is used to extract general low-level features and encode general details that are not relevant to the task, while the predictor is used to encode high-level feature information that is relevant to the task.
[0095] Therefore, in step S102 above, after the normal motor acoustic data is converted into a training graphic difference field image and input into the encoder model trained by the contrastive learning model, the high-dimensional features are output to construct a normal feature library. The normal feature library includes relevant features and irrelevant features.
[0096] It should be noted that there is more than one set of normal motor acoustic fingerprint data, so there is also more than one training graphic difference field image. Therefore, in the stage before motor detection, when using the encoder model trained by the contrastive learning model, multiple training graphic difference field images are obtained and input into the encoder model trained by the contrastive learning model.
[0097] Based on this, in step S102 above, converting the normal motor acoustic data into a training graphic difference field image and then inputting it into the encoder model trained by the contrastive learning model includes:
[0098] Input any two of the training image difference field images into the encoder model trained by the contrast learning model;
[0099] In this process, any two training pattern difference field images are encoded by the encoder and then mapped again by the predictor.
[0100] As shown in Figure 2, this embodiment of the invention also provides a structural diagram of the encoder model trained by the contrastive learning model. The predictor is a high-level network close to the task and encodes more information related to the contrastive learning task; for example, the predictor is a multilayer perceptron. The encoder is a low-level network that tends to extract general low-level features, which are often task-independent and highly general, so it encodes more general details unrelated to the task; for example, the encoder is a feature pyramid structure, used to improve the model's encoding ability and encode features that carry richer information. In specific applications, training task-related features by contrastive learning may have a negative impact on the low-level network. That is, if the mapping network contains only the encoder, the feature representation will contain many features related to the pre-training task, which affects task performance. The encoder model in this embodiment therefore adds a predictor, which is equivalent to increasing the network depth. The features related to the pre-training task gather in the predictor, so the encoder no longer contains them and retains only the more general details. Consequently, any two input training graphic difference field images go through two mapping processes: encoding by the encoder, followed by mapping by the predictor.
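The two-stage mapping described above (encoder first, predictor second, applied to a pair of images) can be sketched as follows. This is a structural illustration only, under stated assumptions: a single dense layer with ReLU stands in for the feature pyramid encoder, a two-layer perceptron stands in for the predictor, the weights are random and untrained, and both images pass through the same weights. The class name `Branch` and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

class Branch:
    """One branch: an encoder followed by a predictor (stand-in layers)."""

    def __init__(self, in_dim, feat_dim, out_dim):
        # Encoder stand-in: one dense layer (the text names a feature pyramid).
        self.w_enc = rng.normal(scale=0.1, size=(in_dim, feat_dim))
        # Predictor stand-in: a two-layer perceptron.
        self.w_p1 = rng.normal(scale=0.1, size=(feat_dim, feat_dim))
        self.w_p2 = rng.normal(scale=0.1, size=(feat_dim, out_dim))

    def encode(self, images):
        flat = images.reshape(images.shape[0], -1)   # flatten each image
        return np.maximum(flat @ self.w_enc, 0.0)    # dense layer + ReLU

    def predict(self, feats):
        hidden = np.maximum(feats @ self.w_p1, 0.0)
        return hidden @ self.w_p2

    def __call__(self, images):
        # First mapping: encoder. Second mapping: predictor.
        return self.predict(self.encode(images))

# Two batches of 4 images, each with 2 channels of size 3 x 6.
img1 = rng.normal(size=(4, 2, 3, 6))
img2 = rng.normal(size=(4, 2, 3, 6))
branch = Branch(in_dim=2 * 3 * 6, feat_dim=16, out_dim=8)
z1, z2 = branch(img1), branch(img2)
print(z1.shape, z2.shape)
```

The outputs `z1` and `z2` play the role of the features encoded by the two branches; in a trained model they would feed the loss used during training.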
[0101] Furthermore, in this embodiment of the invention, the encoder model trained by the contrastive learning model is trained using a redundancy elimination regularization loss, wherein the redundancy elimination regularization loss is:
[0102] [formula not reproduced in source];
[0103] where cr represents the correlation matrix, I represents the identity matrix, and $z_1$ and $z_2$ represent the features encoded by the two branches.
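One loss consistent with this description, a correlation matrix of the two branch outputs pushed toward the identity matrix, is the squared Frobenius distance between the two. The exact formula is not available here, so the sketch below is an assumption and not the patented loss; `cross_correlation` and `redundancy_loss` are hypothetical names, and the batch-wise standardization is likewise assumed.

```python
import numpy as np

def cross_correlation(z1, z2, eps=1e-8):
    """Correlation matrix between the feature dimensions of z1 and z2.

    z1, z2: arrays of shape (batch, dim). Each dimension is standardized
    over the batch before the correlation is taken.
    """
    z1n = (z1 - z1.mean(axis=0)) / (z1.std(axis=0) + eps)
    z2n = (z2 - z2.mean(axis=0)) / (z2.std(axis=0) + eps)
    return z1n.T @ z2n / z1.shape[0]

def redundancy_loss(z1, z2):
    """Squared Frobenius distance between the correlation matrix and I."""
    c = cross_correlation(z1, z2)
    return float(np.sum((c - np.eye(c.shape[0])) ** 2))

# Decorrelated dimensions: the correlation matrix is close to I.
z_orth = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
# Fully redundant dimensions: both columns carry the same signal.
z_dup = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]])
print(redundancy_loss(z_orth, z_orth), redundancy_loss(z_dup, z_dup))
```

Under this form, the off-diagonal terms penalize feature dimensions that duplicate one another, which matches the stated goal of encoding features that are as orthogonal as possible.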
[0104] In practical applications, contrastive learning models can avoid model collapse by training with a large amount of abnormal voiceprint data, but they still require PCA dimensionality reduction to remove redundant similar features. PCA dimensionality reduction depends on setting parameters, which takes a significant amount of time. In this embodiment of the invention, redundancy elimination regularization loss is used, integrating the dimensionality reduction operation into the model. This saves considerable time in setting dimensionality reduction parameters and allows the model to encode features containing less redundant information, encoding as many orthogonal features as possible. This ensures that the normal feature library contains representative and non-repeating features, thereby improving the accuracy of feature comparison.
[0105] S103. When performing motor testing, acquire the acoustic fingerprint data of the motor to be predicted.
[0106] S104. After converting the motor acoustic data to be predicted into a target graphic differential field image, input it into the encoder model trained by the contrastive learning model to obtain the target features encoded and output by the encoder model trained by the contrastive learning model.
[0107] S105. Calculate the joint similarity between the target feature and the existing features in the normal feature library.
[0108] If the similarity is outside the preset range, the motor voiceprint data to be predicted is abnormal, and the motor is in an abnormal working state.
[0109] It is conceivable that if the similarity is within a preset range, then the motor voiceprint data to be predicted is normal, and the motor is in normal working condition.
[0110] In practical applications, Euclidean distance is typically used to determine similarity. However, Euclidean distance is only a numerical determination and cannot determine similarity in feature directions. Therefore, we introduce negative cosine similarity and Euclidean distance together to form a joint similarity, that is, to integrate feature values and feature directions to comprehensively measure the similarity between features. Therefore, in this embodiment of the invention, the joint similarity in step S105 includes negative cosine similarity and Euclidean distance. For example, its calculation formula is as follows:
[0111] Q = [formula not reproduced in source]
[0112] where $z_a$ and $z_b$ represent two features from the normal feature library.
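A minimal sketch of a joint score that combines the two terms named above, and of the range check in step S105, is given below. The way the terms are combined (a plain sum of negative cosine similarity and Euclidean distance), the use of the closest library feature, and the upper bound `q_max` are all assumptions, since the formula for Q and the preset range are not given here. `joint_similarity` and `is_abnormal` are hypothetical names.

```python
import numpy as np

def joint_similarity(za, zb, eps=1e-12):
    """Assumed joint score: negative cosine similarity plus Euclidean distance.

    Lower values mean the two features agree in both direction and magnitude.
    """
    cos = float(za @ zb / (np.linalg.norm(za) * np.linalg.norm(zb) + eps))
    return -cos + float(np.linalg.norm(za - zb))

def is_abnormal(target, library, q_max):
    """Flag the target if even its closest library feature scores above q_max."""
    best = min(joint_similarity(target, feature) for feature in library)
    return best > q_max

library = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(is_abnormal(np.array([1.0, 0.0]), library, q_max=0.0))    # matches a library feature
print(is_abnormal(np.array([-3.0, -3.0]), library, q_max=0.0))  # far from every library feature
```

The cosine term captures agreement in feature direction and the distance term captures agreement in feature values, which is the stated reason for combining them.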
[0113] As shown in Figure 3, this embodiment of the invention also provides a motor detection device 30, comprising:
[0114] The training data acquisition module 31 is used to acquire normal motor acoustic fingerprint data before motor testing;
[0115] The normal feature library construction module 32 is used to convert the normal motor acoustic data into a training graphic difference field image and then input it into the encoder model trained by the contrast learning model to obtain high-dimensional features and a normal feature library.
[0116] The encoder model trained by the contrastive learning model outputs high-dimensional features based on the training graph difference field image, and the remaining features are used to construct a normal feature library.
[0117] The encoder model trained by the contrastive learning model includes two branches, each branch including an encoder and a predictor.
[0118] The encoder is used to extract general low-level features and encode general detail information that is not relevant to the task, while the predictor is used to encode high-level feature information that is relevant to the task.
[0119] The prediction data acquisition module 33 is used to acquire the motor acoustic fingerprint data to be predicted when performing motor detection.
[0120] The target feature conversion module 34 is used to convert the motor acoustic data to be predicted into a target graphic differential field image and then input it into the encoder model trained by the contrastive learning model to obtain the target features encoded and output by the encoder model trained by the contrastive learning model.
[0121] Motor detection module 35 is used to calculate the joint similarity between the target feature and existing features in the normal feature library;
[0122] If the similarity is outside the preset range, the motor voiceprint data to be predicted is abnormal, and the motor is in an abnormal working state.
[0123] This invention also provides a terminal device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements each step of the motor detection method described in the above embodiments.
[0124] This invention also provides a storage medium, which is a computer-readable storage medium storing a computer program thereon. When the computer program is executed by a processor, it implements the various steps of the motor detection method described in the above embodiments.
[0125] The above-described embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the foregoing embodiments have described the present invention in detail, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.
Claims
1. A motor detection method, characterized in that it comprises: before performing motor detection, acquiring normal motor acoustic fingerprint data; converting the normal motor acoustic fingerprint data into a training graphic difference field image and inputting it into the encoder model trained by the contrastive learning model to obtain high-dimensional features and construct a normal feature library; wherein the encoder model trained by the contrastive learning model obtains orthogonal high-dimensional features based on the training graphic difference field image and constructs them into the normal feature library; the encoder model trained by the contrastive learning model includes two branches, each branch including an encoder and a predictor; the encoder is used to extract general low-level features and encode general detail information that is not relevant to the task, while the predictor is used to encode high-level feature information that is relevant to the task; when performing motor detection, acquiring the acoustic fingerprint data of the motor to be predicted; converting the acoustic fingerprint data of the motor to be predicted into a target graphic difference field image and inputting it into the encoder model trained by the contrastive learning model to obtain the target features encoded and output by that model; calculating the joint similarity between the target features and the existing features in the normal feature library; and, if the similarity is outside the preset range, determining that the voiceprint data of the motor to be predicted is abnormal and that the motor is in an abnormal working state.
2. The motor detection method as described in claim 1, characterized in that, before converting the normal motor acoustic fingerprint data into a training graphic difference field image, the method includes: converting the normal motor acoustic fingerprint data into time-series data $X = \{x_1, x_2, \ldots, x_N\}$, where N is a positive integer.
3. The motor detection method as described in claim 2, characterized in that converting the normal motor acoustic fingerprint data into a training graphic difference field image includes: setting the number n of graphics for the training graphic difference field [constraint on n not reproduced in source]; setting the step size d [range not reproduced in source], bounded by a defined maximum step size, where any two step sizes are different; setting the time window [symbol and range not reproduced in source], where any two time windows have different lengths; based on the number of graphics, the step size, and the time window, extracting any segment of the time-series data multiple times; and combining and transforming the results of the multiple extractions to obtain the training graphic difference field image.
4. The motor detection method as described in claim 3, characterized in that combining and transforming the results of the multiple extractions to obtain the training graphic difference field image includes: combining and transforming the results of the multiple extractions to obtain the graphic difference set, with the calculation formula [formula not reproduced in source], where the symbols in the formula denote, in order, a graphic and its total size, the result of the multiple extractions, and the time-series data at a given time point; defining a new sequence from the graphic difference set and zero-padding it so that its length matches that of the graphic difference set, with the calculation formula [formula not reproduced in source]; defining the graphic difference field $\mathrm{MDF}_n$, with the calculation formula [formula not reproduced in source], where the formula uses (n-1) sequence sets corresponding to the step size d, so that the graphic difference field generates (n-1) channel images, one per sequence set; for the i-th channel image, defining an image array as [formula not reproduced in source]; filling the zero elements of the image array to define each channel of the graphic difference field image as [formula not reproduced in source], where one array in the formula is obtained by rotating another and the product is the Hadamard product; the completed graphic difference field image is called the training graphic difference field image.
5. The motor testing method as described in claim 1, characterized in that, The normal motor acoustic signature data is converted into a training graphic difference field image and then input into the encoder model trained by the contrastive learning model, including: Input any two of the training image difference field images into the encoder model trained by the contrast learning model; In this process, any two training pattern difference field images are encoded by the encoder and then mapped again by the predictor.
6. The motor detection method as described in claim 1, characterized in that the encoder model trained by the contrastive learning model is trained using a redundancy elimination regularization loss, which is: [formula not reproduced in source]; where cr represents the correlation matrix, I represents the identity matrix, and $z_1$ and $z_2$ represent the features encoded by the two branches.
7. The motor detection method as described in claim 1, characterized in that the joint similarity includes negative cosine similarity and Euclidean distance, and the calculation formula is as follows: Q = [formula not reproduced in source]; where $z_a$ and $z_b$ represent two features from the normal feature library.
8. A motor detection device, characterized in that it comprises: a training data acquisition module, used to acquire normal motor acoustic fingerprint data before motor detection; a normal feature library construction module, used to convert the normal motor acoustic fingerprint data into a training graphic difference field image and input it into the encoder model trained by the contrastive learning model to obtain high-dimensional features and construct a normal feature library; wherein the encoder model trained by the contrastive learning model obtains orthogonal high-dimensional features based on the training graphic difference field image and constructs them into the normal feature library; the encoder model trained by the contrastive learning model includes two branches, each branch including an encoder and a predictor; the encoder is used to extract general low-level features and encode general detail information that is not relevant to the task, while the predictor is used to encode high-level feature information that is relevant to the task; a module for acquiring data to be predicted, used to acquire the acoustic fingerprint data of the motor to be predicted during motor detection; a target feature conversion module, used to convert the acoustic fingerprint data of the motor to be predicted into a target graphic difference field image and input it into the encoder model trained by the contrastive learning model to obtain the target features encoded and output by that model; and a motor detection module, used to calculate the joint similarity between the target features and the existing features in the normal feature library, wherein, if the similarity is outside the preset range, the voiceprint data of the motor to be predicted is abnormal and the motor is in an abnormal working state.
9. A terminal device, characterized in that, It includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, it implements each step of the motor detection method as described in any one of claims 1 to 7.
10. A storage medium, said storage medium being a computer-readable storage medium having a computer program stored thereon, characterized in that, When the computer program is executed by the processor, it implements each step of the motor detection method as described in any one of claims 1 to 7.