A method for training a knock signal defect recognition model
By using a one-dimensional deep convolutional neural network for tapping signal defect identification, the problems of strong subjectivity and insufficient robustness in existing technologies are solved. It achieves automatic feature learning and stability improvement, reduces the false negative rate, and is suitable for non-destructive testing of composite materials and concrete components.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- XIAN THERMAL POWER RES INST CO LTD
- Filing Date
- 2026-03-09
- Publication Date
- 2026-06-23
Figure CN121809574B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of intelligent non-destructive testing and signal pattern recognition, and specifically to a training method for a knock signal defect recognition model.
Background Technology
[0002] Impact testing is a type of non-destructive testing method based on impact excitation. It applies transient impact excitation to the surface of a structure, inducing local free vibration and generating acoustic and vibrational responses. Due to changes in local stiffness, damping, and boundary constraints, defect-containing regions exhibit differences in response signals in terms of time-domain decay rate, envelope shape, peak amplitude, frequency-domain resonance peak position and bandwidth, and energy distribution across different frequency bands. These differences can be used for defect identification and location. Defect-containing regions include various types such as voids, debonding, inclusions, delamination, and honeycomb structure collapse.
[0003] In existing technologies, defect identification often relies on manual listening or empirical threshold interpretation, or on manually extracting statistics, frequency-domain features, or time-frequency-domain features from the impact signal and then using traditional classifiers for discrimination. Statistics include root mean square, kurtosis, and zero-crossing rate. Frequency-domain features include dominant peak frequency, spectral energy, and spectral centroid. Time-frequency-domain features include wavelet packet energy and short-time Fourier spectral texture. Traditional classifiers include Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Random Forests.
[0004] The above solution has the following shortcomings:
[0005] First, human interpretation is highly subjective and greatly affected by the person's experience, auditory sensitivity, environmental noise, and the consistency of the tapping.
[0006] Second, manual features are insufficient to cover complex defect morphologies and signal differences caused by different materials, thicknesses, and bonding methods, and lack robustness under conditions such as noise, changes in striking force, and changes in contact time.
[0007] Third, feature engineering and classifiers require repeated parameter tuning, and performance fluctuations are significant when dealing with scenarios involving multiple categories and subtle differences between classes.
[0008] Fourth, when the class samples are imbalanced, traditional methods are weak in identifying the minority class, which can easily lead to missed detections. Class imbalance refers to a situation where there are fewer defective samples and more normal samples.
[0009] Deep learning has end-to-end feature learning capabilities, but when used for tapping signal recognition, it still faces problems such as sample slicing and alignment, noise and tapping intensity fluctuations, inter-class boundary samples, training stability and overfitting control, interpretability of probability output and confidence calibration.
[0010] Therefore, there is an urgent need for a systematic model training method oriented towards impact signals to form a practical engineering training closed loop.
Summary of the Invention
[0011] The present invention aims to at least solve one of the technical problems existing in the prior art, and to provide a training method for a knocking signal defect recognition model.
[0012] To achieve the above objectives, the present invention provides a training method for a knocking signal defect recognition model, comprising:
[0013] Knock signal samples are acquired and labeled by category to construct a sample set $\mathcal{D}=\{(x_i,y_i)\}_{i=1}^{N}$, where $x_i$ is a fixed-length knock signal sequence, $y_i$ is the corresponding category index, $N$ is the total number of samples, and $K$ is the number of categories; event detection, peak alignment, and fixed-length segmentation are performed on the knock signal samples to obtain fixed-length segmented samples; mean removal and amplitude normalization are performed on the segmented samples to obtain preprocessed samples; and data augmentation is performed on the preprocessed samples to obtain augmented samples;
[0014] The sample set is divided into a training set, a validation set, and a test set to construct a mini-batch training data stream;
[0015] The augmented samples are input into a one-dimensional deep convolutional neural network, and multi-layer features are extracted sequentially through convolutional layers, batch normalization layers, activation layers, and pooling layers. The feature vector is obtained by global average pooling.
[0016] The feature vector is input into the classification head to obtain the predicted output value corresponding to each category. The predicted output value is then converted into the predicted probability of each category by the Softmax classifier.
[0017] Using cross-entropy loss as the objective function, the parameters of the one-dimensional deep convolutional neural network are updated using backpropagation and optimization algorithms, and the one-dimensional deep convolutional neural network is trained until the convergence condition or early stopping condition is met.
[0018] The trained one-dimensional deep convolutional neural network is evaluated on the validation set to determine the rejection threshold and output a deployable defect recognition model.
[0019] Furthermore, the event detection includes:
[0020] The short-time energy is calculated from the original knock signal sequence as follows:
[0021] $E_m=\sum_{n=0}^{N_w-1} s[mH+n]^2$;
[0022] where $E_m$ is the energy value of the $m$-th frame, $s[n]$ is the original knock signal sequence, $N_w$ is the frame length, and $H$ is the frame shift;
[0023] An adaptive threshold $E_{\mathrm{th}}(m)=\mu_m+\kappa\,\sigma_m$ is set, where $\mu_m$ and $\sigma_m$ are the mean and standard deviation of the short-time energy within the sliding window, respectively, and $\kappa$ is a coefficient;
[0024] When the short-term energy exceeds the adaptive threshold, a knocking event is detected.
[0025] Furthermore, the peak alignment includes:
[0026] The peak position $n_p$ is located within the neighborhood of the detected knocking event;
[0027] Using the peak position as the alignment reference, the starting point of the fixed-length cut is determined as $n_0=n_p-N_{\mathrm{pre}}$, where $N_{\mathrm{pre}}$ is the number of reserved points;
[0028] A signal segment of fixed length $L$ is cut starting from $n_0$ to obtain the segmented sample.
[0029] Furthermore, the calculation formula for the mean removal process is as follows:
[0030] $\tilde{x}[n]=x[n]-\frac{1}{L}\sum_{k=0}^{L-1}x[k]$;
[0031] The calculation formula for the amplitude normalization process is as follows:
[0032] ;
[0033] where $x$ is the segmented sample, $\tilde{x}$ is the mean-removed sample, $\hat{x}$ is the normalized sample, $L$ is the sample length, and $\epsilon$ is a very small constant used to prevent division by zero.
[0034] Furthermore, the data augmentation includes one or more of amplitude scaling, noise addition, time shifting, slight scaling, random occlusion, and mixup;
[0035] The noise addition is set according to the target signal-to-noise ratio, specifically as follows:
[0036] $\mathrm{SNR}_{\mathrm{dB}}=10\log_{10}\left(P_s/P_n\right)$;
[0037] where $P_s$ is the signal power, $P_n$ is the noise power, and the noise follows a Gaussian distribution with a mean of zero and a variance of $P_n$.
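The noise-addition rule above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `add_noise_at_snr` and the test signal are hypothetical and are not taken from the patent.

```python
import math
import random

def add_noise_at_snr(x, snr_db, rng):
    """Add zero-mean Gaussian noise so that 10*log10(P_s / P_n) equals snr_db."""
    p_signal = sum(v * v for v in x) / len(x)      # signal power P_s
    p_noise = p_signal / (10 ** (snr_db / 10))     # noise power P_n from the target SNR
    sigma = math.sqrt(p_noise)                     # noise standard deviation
    noisy = [v + rng.gauss(0.0, sigma) for v in x]
    return noisy, p_noise

# Hypothetical test signal: a plain sinusoid, with a target SNR of 20 dB.
clean = [math.sin(0.05 * n) for n in range(20000)]
noisy, p_noise = add_noise_at_snr(clean, 20.0, random.Random(0))
```

In practice the target SNR would itself be drawn at random for each sample, as the embodiment describes.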
[0038] Furthermore, the calculation formula for the global average pooling is as follows:
[0039] $h_c=\frac{1}{L_f}\sum_{t=1}^{L_f}F_c(t)$;
[0040] where $F_c(t)$ is the feature value of the $c$-th channel at position $t$ in the feature map of the last layer of the one-dimensional deep convolutional neural network, $L_f$ is the length of the feature map, and $h_c$ is the $c$-th component of the feature vector.
[0041] Furthermore, the cross-entropy loss is a weighted cross-entropy loss, calculated using the following formula:
[0042] $\mathcal{L}_{\mathrm{WCE}}=-\frac{1}{B}\sum_{i=1}^{B}\sum_{k=1}^{K}w_k\,y_{i,k}\log p_{i,k}$;
[0043] where $B$ is the number of training samples in a mini-batch, $K$ is the number of categories, $y_{i,k}$ is the component of the one-hot label of sample $i$ for category $k$, $p_{i,k}$ is the predicted probability of sample $i$ for category $k$, and $w_k$ is the weighting coefficient of category $k$.
[0044] Furthermore, the optimization algorithm is the Adam optimization algorithm, and the update rule is:
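[0045] $m_t=\beta_1 m_{t-1}+(1-\beta_1)\,g_t,\quad v_t=\beta_2 v_{t-1}+(1-\beta_2)\,g_t\odot g_t$;
[0046] $\hat{m}_t=\frac{m_t}{1-\beta_1^{t}},\quad \hat{v}_t=\frac{v_t}{1-\beta_2^{t}},\quad \theta_{t+1}=\theta_t-\eta\,\frac{\hat{m}_t}{\sqrt{\hat{v}_t}+\epsilon}$;
[0047] where $g_t$ is the gradient at step $t$, $\beta_1$ and $\beta_2$ are the attenuation coefficients of the first and second moments, respectively, $m_t$ and $v_t$ are the estimates of the first and second moments, respectively, $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected estimates of the first and second moments, respectively, $\theta_t$ denotes the model parameters at step $t$, $\eta$ is the learning rate, $\epsilon$ is a constant that prevents division by zero, and $\odot$ denotes element-wise multiplication.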
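A single Adam update can be sketched as follows. This is an illustration on plain Python lists, not the patented implementation; the default hyperparameters (`lr`, `beta1`, `beta2`, `eps`) are commonly used values assumed here, since the text does not give the values used in the invention.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Perform one Adam update; t is the 1-based step index."""
    # Exponential moving averages of the gradient and the squared gradient.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, grad)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, grad)]
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    theta = [th - lr * mh / (math.sqrt(vh) + eps)
             for th, mh, vh in zip(theta, m_hat, v_hat)]
    return theta, m, v

# Toy objective f(theta) = theta^2 with gradient 2*theta, starting from theta = 1.
theta, m, v = [1.0], [0.0], [0.0]
theta, m, v = adam_step(theta, [2.0 * theta[0]], m, v, t=1)
```

After bias correction the first step has magnitude close to the learning rate regardless of the gradient scale, which is why `theta` moves from 1.0 to roughly 0.999 here.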
[0048] Furthermore, the early stopping conditions include:
[0049] Calculate performance metrics on the validation set, including macro-average F1 score or defect class recall.
[0050] If the performance metric does not improve within a preset number of consecutive rounds, training is stopped and the model parameters are rolled back to those that achieved the best performance metric.
[0051] Furthermore, the method for determining the rejection threshold includes:
[0052] Calculate the maximum class prediction probability for each sample in the validation set, $p_{\max}=\max_k p_k$;
[0053] Set a rejection threshold $\tau$. When the maximum class prediction probability is greater than or equal to the rejection threshold, the defect identification model outputs the corresponding category prediction result; when the maximum class prediction probability is less than the rejection threshold, the defect identification model outputs a to-be-verified flag.
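The rejection rule can be sketched in a few lines. The threshold value 0.7 and the string flag `"REVIEW"` are hypothetical placeholders; the text does not state the threshold value or the form of the flag.

```python
def predict_with_rejection(probs, tau):
    """Return the predicted class index, or a review flag if confidence is below tau."""
    p_max = max(probs)
    if p_max >= tau:
        return probs.index(p_max)   # confident: output the category prediction
    return "REVIEW"                 # low confidence: flag the sample for verification

confident = predict_with_rejection([0.1, 0.8, 0.1], tau=0.7)
uncertain = predict_with_rejection([0.40, 0.35, 0.25], tau=0.7)
```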
[0054] The beneficial effects of this invention are as follows:
[0055] This invention uses a one-dimensional deep convolutional neural network with a tapping waveform sequence as input to achieve end-to-end automatic feature learning. It eliminates the need for manual feature design and automatically learns the hierarchical feature representations required to distinguish defect categories.
[0056] This invention achieves unified modeling of multiple types of defects, such as normal, void, and debonding, by using a deep convolutional structure and Softmax probability output, and outputs the probability distribution of each category, which facilitates subsequent decision-making and confidence assessment.
[0057] This invention improves model convergence speed and reduces overfitting risk by combining batch normalization, weight decay, data augmentation, learning rate scheduling, and early stopping strategies, thus ensuring the stability of the training process.
[0058] This invention constructs a lightweight classification head through global average pooling, which significantly reduces the number of parameters and improves numerical stability, facilitating the engineering deployment of the model.
[0059] This invention effectively addresses the class imbalance problem by using a weighted cross-entropy loss function, thereby improving the ability to identify defective samples in minority classes and reducing the false negative rate.
[0060] This invention uses a rejection threshold mechanism to label low-confidence samples, thereby achieving quality control and forming an auditable and reproducible training loop.
Description of the Drawings
[0061] Figure 1 is a flowchart of the training method for the knock signal defect recognition model of the present invention;
[0062] Figure 2 is a schematic diagram of the one-dimensional deep convolutional neural network structure of the present invention;
[0063] Figure 3 is a schematic diagram of the loss function calculation, backpropagation, and parameter update of the present invention.
Detailed Implementation
[0064] To make the objectives, technical solutions, and beneficial effects of this application clearer, the following detailed description, in conjunction with the accompanying drawings and specific embodiments, further illustrates this application. It should be understood that the specific embodiments described in this specification are merely for explaining this application and are not intended to limit it.
[0065] The impact signal defect recognition model training method of this invention is applicable to non-destructive testing scenarios for various inspected objects, such as composite material structures, metal plates, and concrete components. Impact signals can be generated by a dedicated impact hammer, an automatic impact device, or manual impact, and are collected via a microphone, accelerometer, or laser vibration meter.
[0066] The key symbols are defined as follows: $x$ denotes the input sample; $s[n]$ denotes the original knock signal sequence; $L$ denotes the sample length; $K$ denotes the number of categories; $\theta$ denotes the set of model parameters; $h$ denotes the feature vector; $z$ denotes the predicted output value; $p$ denotes the predicted probability; $y$ denotes the category label; $B$ denotes the batch size; $\lambda$ denotes the regularization coefficient; $\eta$ denotes the learning rate; $\epsilon$ denotes a minimal constant; $t$ and $T$ denote the current iteration step and the total number of iteration steps, respectively.
[0067] Example 1
[0068] This embodiment uses defect detection of aerospace composite skin as an application scenario. Aerospace composite skin may develop defects such as delamination, separation, and voids during manufacturing and service, requiring quality assessment through impact testing.
[0069] Referring to Figure 1, the training method for the knock signal defect recognition model in this embodiment includes the following steps.
[0070] Step S1: Acquisition and labeling of training data.
[0071] Knock signal samples are acquired and labeled by category to construct a sample set $\mathcal{D}=\{(x_i,y_i)\}_{i=1}^{N}$.
[0072] The impact signal samples are collected using an automated impact detection system. This system includes an impact actuator, a signal acquisition module, and a data storage module. The impact actuator uses a standardized electromagnetically driven hammer to ensure consistent impact force and contact time for each strike. The signal acquisition module employs a high-sensitivity condenser microphone with a sampling rate of 44,100 Hz and a quantization precision of 16 bits. The data storage module stores the collected raw waveform data in a lossless format.
[0073] The category labels are determined by experienced inspectors based on the ultrasonic C-scan results. This embodiment defines five categories: normal area, debonding defect, delamination defect, void defect, and honeycomb collapse defect. Each category is assigned its own category index, and the number of categories is $K=5$.
[0074] To facilitate supervised training, the category index $y_i$ is encoded as a one-hot label vector $(y_{i,1},\dots,y_{i,K})$:
[0075] $y_{i,k}=\begin{cases}1, & k=y_i\\ 0, & k\neq y_i\end{cases}$;
[0076] This embodiment collected a total of 10,000 impact signal samples, including 6,000 samples from normal areas, 1,500 samples from debonding defects, 1,200 samples from delamination defects, 800 samples from void defects, and 500 samples from honeycomb collapse defects. The total number of samples is $N=10{,}000$.
[0077] Label consistency verification. Consistency checks are performed on multiple knock samples collected at the same location. Suppose there are $M$ samples at the same location, labeled $y^{(1)},\dots,y^{(M)}$. The majority vote result is:
[0078] $\hat{y}=\arg\max_{k}\sum_{j=1}^{M}\mathbb{1}\left(y^{(j)}=k\right)$;
[0079] The consistency score $\rho$ is calculated as:
[0080] $\rho=\frac{1}{M}\sum_{j=1}^{M}\mathbb{1}\left(y^{(j)}=\hat{y}\right)$;
[0081] where $\mathbb{1}(\cdot)$ is the indicator function, which takes the value 1 when the condition inside the parentheses is true and 0 otherwise.
[0082] This embodiment sets a consistency threshold. If the consistency score $\rho$ is below the threshold, the samples at that location are labeled as suspected boundary samples. Suspected boundary samples are assigned lower loss weights or are removed outright during training, so that annotation noise does not interfere with model training.
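The majority vote and consistency score can be sketched as below. The function name and the example labels are hypothetical; ties are resolved here in favor of the label seen first, which is an assumption, since the text does not say how ties are handled.

```python
from collections import Counter

def majority_vote_consistency(labels):
    """Return (majority label, fraction of labels that agree with it)."""
    counts = Counter(labels)
    voted_label, n_votes = counts.most_common(1)[0]
    return voted_label, n_votes / len(labels)

# Four hypothetical taps at one location, three of them labeled as class 1.
voted, score = majority_vote_consistency([1, 1, 2, 1])
```

A location whose score falls below the chosen consistency threshold would then be treated as a suspected boundary sample.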
[0083] Class imbalance statistics. Let the number of samples in the $k$-th class be $N_k$ and the total number of samples be $N$; the proportion of each category is $r_k=N_k/N$.
[0084] In this embodiment, the sample proportions for each category are as follows: normal area 60%, debonding defect 15%, delamination defect 12%, void defect 8%, and honeycomb collapse defect 5%. Because the class imbalance is pronounced, a weighted cross-entropy loss and a balanced sampling strategy are subsequently used to handle it.
[0085] Step S2: Sample event detection, segmentation and preprocessing.
[0086] The knock signal samples are subjected to event detection, peak alignment and fixed-length segmentation to obtain fixed-length segmented samples. The segmented samples are then subjected to mean removal and amplitude normalization to obtain preprocessed samples. Finally, the preprocessed samples are subjected to data augmentation to obtain augmented samples.
[0087] Short-time energy event detection. The short-time energy is calculated from the original knock signal sequence $s[n]$. Let the frame length be $N_w$ and the frame shift be $H$; the energy of the $m$-th frame is calculated as:
[0088] $E_m=\sum_{n=0}^{N_w-1} s[mH+n]^2$;
[0089] Set the adaptive threshold $E_{\mathrm{th}}(m)=\mu_m+\kappa\,\sigma_m$, where $\mu_m$ and $\sigma_m$ are the mean and standard deviation of the energy sequence $E_m$ within a sliding window of the most recent $W$ frames. When $m<W$, the statistics are calculated over the frames already acquired. This embodiment uses preset values for the window length $W$ and the coefficient $\kappa$.
[0090] When the short-time energy $E_m$ exceeds the adaptive threshold $E_{\mathrm{th}}(m)$, a knocking event is detected and the starting frame position of the knocking event is recorded.
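The event detection step can be sketched as follows, using only the standard library. All numeric settings (`frame_len=100`, `hop=50`, `window=5`, `kappa=3.0`, `min_history=3`) are hypothetical, since the embodiment's values are not preserved in the text. The sketch also assumes that the window statistics are taken over the frames preceding the current one, which is one possible reading of "the most recent frames".

```python
import math

def short_time_energy(s, frame_len, hop):
    """Frame energies E_m = sum over n of s[m*hop + n]^2."""
    if len(s) < frame_len:
        return []
    n_frames = 1 + (len(s) - frame_len) // hop
    return [sum(v * v for v in s[m * hop:m * hop + frame_len])
            for m in range(n_frames)]

def detect_event(energy, window, kappa, min_history=3):
    """Return the first frame whose energy exceeds mean + kappa * std
    of up to `window` preceding frames, or None if no frame does."""
    for m in range(min_history, len(energy)):
        hist = energy[max(0, m - window):m]
        mu = sum(hist) / len(hist)
        sigma = math.sqrt(sum((e - mu) ** 2 for e in hist) / len(hist))
        if energy[m] > mu + kappa * sigma:
            return m
    return None

# Synthetic example: silence followed by a decaying oscillation.
signal = [0.0] * 400 + [math.exp(-n / 50) * math.sin(0.3 * n) for n in range(200)]
energy = short_time_energy(signal, frame_len=100, hop=50)
event_frame = detect_event(energy, window=5, kappa=3.0)
```

The `min_history` guard is an added safeguard so that the threshold is not evaluated on a single frame of history, where the standard deviation would be zero.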
[0091] Peak alignment. After a knocking event is detected, the peak position is located within its neighborhood:
[0092] $n_p=\arg\max_{n\in\mathcal{N}}\left|s[n]\right|$;
[0093] where $\mathcal{N}$ is the neighborhood of the detected event. With the peak position $n_p$ as the alignment reference, the starting point of the fixed-length cut is:
[0094] $n_0=n_p-N_{\mathrm{pre}}$;
[0095] This embodiment sets the number of reserved points $N_{\mathrm{pre}}$ so that the captured signal segment covers both the impact contact phase and the subsequent free-decay response.
[0096] Fixed-length segmentation. Let the sample length be $L$. A fixed-length signal segment is extracted starting from $n_0$:
[0097] $x[n]=s[n_0+n]\,w[n],\quad n=0,1,\dots,L-1$;
[0098] where $w[n]$ is a window function. In this embodiment, a Hanning window is used to reduce spectral leakage.
[0099] If overlapping slices are used to obtain more training samples, the step size $S$ is set to 1024, which satisfies $S<L$.
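Peak alignment and windowed fixed-length segmentation can be sketched as follows. The search length, the number of reserved points, and the segment length used in the example are hypothetical. Clamping the start index at zero and zero-padding a short tail are added safeguards, not steps stated in the text.

```python
import math

def align_and_segment(s, event_start, search_len, n_pre, length):
    """Locate the absolute peak after event_start, then cut a Hann-windowed
    segment of `length` samples that starts n_pre samples before the peak."""
    stop = min(len(s), event_start + search_len)
    n_peak = max(range(event_start, stop), key=lambda n: abs(s[n]))
    n0 = max(0, n_peak - n_pre)                        # clamp at the signal start
    seg = list(s[n0:n0 + length])
    seg += [0.0] * (length - len(seg))                 # zero-pad if the tail is short
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
              for n in range(length)]                  # symmetric Hann window
    return n_peak, [v * w for v, w in zip(seg, window)]

# Toy signal with its largest excursion at index 40.
toy = [0.0] * 100
toy[40], toy[42] = -2.0, 1.0
n_peak, segment = align_and_segment(toy, event_start=30, search_len=20,
                                    n_pre=8, length=32)
```

Because every segment starts the same number of samples before its peak, the peak lands at the same index in every segment, which is the point of the alignment.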
[0100] Mean removal. The mean of the segmented sample is removed as follows:
[0101] $\tilde{x}[n]=x[n]-\frac{1}{L}\sum_{k=0}^{L-1}x[k]$;
[0102] Mean removal eliminates the DC component of the signal so that the signal has zero mean.
[0103] Optional linear detrending. The least squares method is used to fit the linear trend $c_1 n+c_0$ of the signal $\tilde{x}[n]$, giving the trend coefficients $c_1$ and $c_0$; the linear trend is then removed:
[0104] $x_d[n]=\tilde{x}[n]-(c_1 n+c_0)$;
[0105] Bandpass filtering. To suppress low-frequency drift and high-frequency noise, the detrended signal $x_d[n]$ is bandpass filtered to obtain $x_f[n]$. The filter is implemented as a fourth-order Butterworth IIR filter, whose difference equation is:
[0106] $x_f[n]=\sum_{k=0}^{M}b_k\,x_d[n-k]-\sum_{k=1}^{M}a_k\,x_f[n-k]$;
[0107] where $b_k$ and $a_k$ are the filter coefficients and $M$ is the order of the difference equation. In this embodiment, the passband frequency range is set to 100 Hz to 15000 Hz, and the filter coefficients are designed based on a sampling rate of 44100 Hz.
[0108] Amplitude normalization. The filtered signal $x_f[n]$ is then standardized:
[0109] $\hat{x}[n]=\frac{x_f[n]-\mu}{\sigma+\epsilon}$;
[0110] where the mean $\mu$ and standard deviation $\sigma$ are calculated as:
[0111] $\mu=\frac{1}{L}\sum_{n=0}^{L-1}x_f[n],\quad \sigma=\sqrt{\frac{1}{L}\sum_{n=0}^{L-1}\left(x_f[n]-\mu\right)^2}$;
[0112] $\epsilon$ is a minimal constant used to prevent division by zero.
[0113] Maximum amplitude normalization. Besides mean-variance standardization, maximum amplitude normalization can also be used:
[0114] ;
[0115] This embodiment uses a cascaded approach, applying mean-variance standardization first and maximum amplitude normalization second, which makes the model more adaptable to differences in impact intensity.
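[0116] Energy gating. Calculate the root mean square energy of the sample $x[n]$:
[0117] $E_{\mathrm{rms}}=\sqrt{\frac{1}{L}\sum_{n=0}^{L-1}x[n]^2}$;
[0118] An energy threshold is set. If the root mean square energy is below the threshold, the sample is considered not to contain valid knocking information and is discarded or re-extracted.
[0119] Band energy quality control. Perform a Discrete Fourier Transform on the standardized signal $\hat{x}[n]$:
[0120] $X[k]=\sum_{n=0}^{L-1}\hat{x}[n]\,e^{-j2\pi kn/L}$;
[0121] Calculate the band energy:
[0122] $E_{\mathrm{band}}=\sum_{k=k_1}^{k_2}\left|X[k]\right|^2$, where $k_1$ and $k_2$ are the frequency bin indices of the lower and upper band edges;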
[0123] This embodiment calculates the energy proportions of the low-frequency band (100 Hz to 1000 Hz), the mid-frequency band (1000 Hz to 5000 Hz), and the high-frequency band (5000 Hz to 15000 Hz). If the energy distribution is significantly abnormal, the sample is marked as an abnormal sample and removed.
[0124] Data augmentation. Data augmentation is performed on preprocessed samples to expand the training set and improve the model's generalization ability. There are six main types of data augmentation:
[0125] The first method is amplitude scaling: $x'=a\,x$, where the scaling factor $a$ is randomly sampled within the range of 0.8 to 1.2.
[0126] The second method is noise addition: $x'=x+\nu$, where the noise $\nu$ follows a zero-mean Gaussian distribution. The noise level is set according to the target signal-to-noise ratio:
[0127] $\mathrm{SNR}_{\mathrm{dB}}=10\log_{10}\left(P_s/P_n\right)$;
[0128] The target signal-to-noise ratio is set to a random value within the range of 20 dB to 40 dB, and the noise power is calculated as $P_n=P_s/10^{\mathrm{SNR}_{\mathrm{dB}}/10}$. The noise standard deviation is then $\sigma_\nu=\sqrt{P_n}$, and the noise is $\nu\sim\mathcal{N}(0,\sigma_\nu^2)$.
[0129] The third method is time shifting: the signal is shifted along the time axis by $\Delta$ sampling points, with $\Delta$ randomly sampled within the range of -50 to 50. The portion beyond the boundary after shifting is filled with zeros.
[0130] The fourth method is slight time scaling: the signal is slightly stretched or compressed along the time axis, with the scaling factor randomly sampled within the range of 0.95 to 1.05. After scaling, the signal is restored to its original length $L$ using linear interpolation.
[0131] The fifth method is random occlusion: a continuous segment of the signal is randomly selected and set to zero, with the occlusion length randomly sampled within the range of 50 to 200 sampling points.
[0132] The sixth method is mixup augmentation: two different samples $x_a$ and $x_b$ and their labels $y_a$ and $y_b$ are randomly selected and linearly combined according to the mixing coefficient $\lambda_{\mathrm{mix}}$:
[0133] $\tilde{x}=\lambda_{\mathrm{mix}}\,x_a+(1-\lambda_{\mathrm{mix}})\,x_b$,
[0134] $\tilde{y}=\lambda_{\mathrm{mix}}\,y_a+(1-\lambda_{\mathrm{mix}})\,y_b$;
[0135] where $\lambda_{\mathrm{mix}}$ is sampled from a Beta distribution.
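The mixup combination can be sketched as below, with labels given as one-hot vectors. The symmetric Beta parameter `alpha=0.2` is an assumed value for illustration only; the text does not preserve the parameters of the Beta distribution.

```python
import random

def sample_mixing_coefficient(alpha, rng):
    """Draw the mixing coefficient from a symmetric Beta(alpha, alpha) distribution."""
    return rng.betavariate(alpha, alpha)

def mixup(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their one-hot labels with the same coefficient."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y

lam = sample_mixing_coefficient(0.2, random.Random(0))
# Fixed coefficient for a reproducible illustration.
x_mix, y_mix = mixup([1.0, 1.0], [1, 0, 0, 0, 0],
                     [0.0, 2.0], [0, 0, 1, 0, 0], lam=0.25)
```

The mixed label remains a valid probability vector because both inputs sum to one and the two coefficients sum to one.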
[0136] Augmentation weight control. If the augmentation is so strong that it may change the physical meaning of a sample, the augmented sample is assigned a lower weight in the loss calculation to avoid erroneous supervision. This embodiment sets a reduced loss weight for strongly augmented samples.
[0137] Step S3: Dataset partitioning, balanced sampling, and batch training data flow.
[0138] The sample set is divided into a training set, a validation set, and a test set to construct a mini-batch training data stream.
[0139] Dataset partitioning. The sample set was divided into training, validation, and test sets in a 7:1.5:1.5 ratio. To avoid information leakage, the partitioning was performed by detection region or by original signal sequence, ensuring that multiple taps at the same detection location never appear in more than one of the training, validation, and test sets.
[0140] In this embodiment, the training set contains 7000 samples, the validation set contains 1500 samples, and the test set contains 1500 samples.
[0141] Mini-batch training data stream. Let the batch size be $B$. Each training step draws a mini-batch of samples $\{(x_i,y_i)\}_{i=1}^{B}$ from the training set, where $y_i$ is the category label of the corresponding sample.
[0142] Weighted random sampling. To alleviate the class imbalance problem, a sampling probability $P_i$ is set for each training sample $x_i$, calculated as follows:
[0143] $P_i=\frac{1/N_{y_i}}{\sum_{j}1/N_{y_j}}$;
[0144] where $N_{y_i}$ is the total number of samples in the category to which sample $x_i$ belongs. Weighted random sampling increases the probability that minority-class samples are drawn, so that the number of samples per category is more balanced within each mini-batch.
[0145] Stratified sampling. In addition to weighted random sampling, a stratified sampling strategy can also be used to ensure that each mini-batch contains samples from each category. In this embodiment, each mini-batch contains at least two samples from each category.
[0146] Class weight calculation. To give greater weight to the minority classes in the loss function, the class weights $w_k$ are calculated:
[0147] $w_k=\frac{N}{K\,N_k}$;
[0148] where $N$ is the total number of samples, $K$ is the number of categories, and $N_k$ is the number of samples in the $k$-th class.
[0149] Another way to calculate category weights is:
[0150] ;
[0151] in, As a smoothing constant, this embodiment sets .
[0152] In this embodiment, the weights for each category are as follows: normal area 0.333, debonding defect 1.333, delamination defect 1.667, void defect 2.5, and honeycomb collapse defect 4.0.
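As a quick check, the weights listed above are what an inverse-frequency rule gives when the total sample count is divided by the number of categories times the per-class count. The sketch below applies that rule to the sample counts stated in this embodiment; the dictionary keys are illustrative names only.

```python
def class_weights(counts):
    """Inverse-frequency weights: total / (number of classes * class count)."""
    n_total = sum(counts.values())
    n_classes = len(counts)
    return {name: n_total / (n_classes * n) for name, n in counts.items()}

counts = {
    "normal": 6000,
    "debonding": 1500,
    "delamination": 1200,
    "void": 800,
    "honeycomb_collapse": 500,
}
weights = class_weights(counts)
rounded = {name: round(w, 3) for name, w in weights.items()}
```

With this rule the average weight over all samples is 1, so the overall loss scale stays comparable to the unweighted case.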
[0153] Step S4: One-dimensional deep convolutional neural network and forward computation.
[0154] Referring to Figure 2, the augmented samples are input into a one-dimensional deep convolutional neural network, and multi-layer features are extracted sequentially through convolutional layers, batch normalization layers, activation layers, and pooling layers. The feature vector is obtained through global average pooling.
[0155] The one-dimensional deep convolutional neural network $f_\theta$ consists of stacked convolutional and pooling modules, followed by a global average pooling layer and a classification head. The network structure can be a plain convolutional structure, a residual connection structure, or a dense connection structure.
[0156] Input tensor. The input sample is $x\in\mathbb{R}^{C_{\mathrm{in}}\times L}$, where $L$ is the sample length and $C_{\mathrm{in}}$ is the number of input channels. For single-channel input, $C_{\mathrm{in}}=1$; for multi-channel input, $C_{\mathrm{in}}>1$.
[0157] Convolutional layer. For $C_{\mathrm{in}}$ input channels and $C_{\mathrm{out}}$ output channels, let the zero-padded input sequence be $\tilde{u}$. The output of the one-dimensional convolution is calculated as:
[0158] $v_o[n]=b_o+\sum_{c=1}^{C_{\mathrm{in}}}\sum_{k=0}^{k_c-1}W_{o,c,k}\,\tilde{u}_c[n\,s+k\,d]$;
[0159] where $W$ is the convolution kernel weight, $b$ is the bias term, $k_c$ is the kernel size, $s$ is the stride, and $d$ is the dilation rate.
[0160] The output length of the convolution is calculated as:
[0161] $L_{\mathrm{out}}=\left\lfloor\frac{L_{\mathrm{in}}+2P-d\,(k_c-1)-1}{s}\right\rfloor+1$;
[0162] where $L_{\mathrm{in}}$ is the input length, $P$ is the padding size, and $\lfloor\cdot\rfloor$ denotes rounding down to the nearest integer.
[0163] The effective kernel length of a dilated convolution is $k_{\mathrm{eff}}=d\,(k_c-1)+1$.
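The standard output-length and effective-kernel relations for a one-dimensional convolution can be checked numerically with a small sketch. The input length 4096 is a hypothetical value chosen for illustration; kernel size 3 and padding 1 match the configuration stated for this embodiment.

```python
def conv1d_output_length(l_in, kernel, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution using floor division."""
    return (l_in + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def effective_kernel(kernel, dilation=1):
    """Span of input samples covered by one dilated kernel."""
    return dilation * (kernel - 1) + 1

same_length = conv1d_output_length(4096, kernel=3, stride=1, padding=1)
halved = conv1d_output_length(4096, kernel=3, stride=2, padding=1)
dilated_span = effective_kernel(3, dilation=2)
```

With kernel size 3 and padding 1, a stride of 1 preserves the length and a stride of 2 halves it, which is how the later stages of the network downsample.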
[0164] Batch normalization layer. For the features $u$ of a mini-batch of training samples, the batch mean $\mu_B$ and batch variance $\sigma_B^2$ are calculated, the features are normalized to obtain $\hat{u}$, and an affine transformation is applied through the learnable scaling parameter $\gamma$ and offset parameter $\beta$:
[0165] $v=\gamma\,\hat{u}+\beta$;
[0166] where $\hat{u}=\frac{u-\mu_B}{\sqrt{\sigma_B^2+\epsilon}}$.
[0167] During the inference phase, the batch normalization layer uses the moving mean and moving variance calculated during training, rather than the statistics for the current batch.
[0168] Activation layer. The ReLU activation function is used to introduce nonlinearity:
[0169] $\mathrm{ReLU}(u)=\max(0,u)$;
[0170] Alternatively, the LeakyReLU activation function can be used: $\mathrm{LeakyReLU}(u)=\max(\alpha u,u)$, where $\alpha$ is a small positive slope coefficient.
[0171] Alternatively, the GELU activation function can be used: $\mathrm{GELU}(u)=u\,\Phi(u)$, where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.
[0172] Pooling layer. Downsampling is performed using either max pooling or average pooling, with pooling window size $k_p$ and stride $s_p$.
[0173] Max pooling is calculated as:
[0174] $v[n]=\max_{0\le k<k_p}u[n\,s_p+k]$;
[0175] Average pooling is calculated as:
[0176] $v[n]=\frac{1}{k_p}\sum_{k=0}^{k_p-1}u[n\,s_p+k]$;
[0177] This embodiment uses max pooling with a preset pooling window size $k_p$ and stride $s_p$.
[0178] Network receptive field. For the $l$-th convolutional layer with kernel size $k_l$ and stride $s_l$, let the cumulative stride be $J_l$. The receptive field $R_l$ is calculated recursively:
[0179] $R_l=R_{l-1}+(k_l-1)\,J_{l-1},\quad J_l=J_{l-1}\,s_l$, with $R_0=1$ and $J_0=1$;
[0180] The network design in this embodiment ensures that the receptive field of the last convolutional layer covers the main temporal range of the input signal.
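The usual receptive-field recursion can be sketched as follows: each layer adds its kernel size minus one, multiplied by the cumulative stride of the layers before it. The layer list is a hypothetical stack of 16 kernel-3 convolutions with a stride of 2 at the start of the second, third, and fourth groups of four. Pooling layers are left out, so the result is not the receptive field of the actual network.

```python
def receptive_field(layers):
    """layers: iterable of (kernel, stride). Returns (receptive field, cumulative stride)."""
    r, j = 1, 1
    for kernel, stride in layers:
        r += (kernel - 1) * j   # growth uses the cumulative stride before this layer
        j *= stride             # then this layer's stride is accumulated
    return r, j

strides = [1, 1, 1, 1,  2, 1, 1, 1,  2, 1, 1, 1,  2, 1, 1, 1]
layers = [(3, s) for s in strides]
rf, jump = receptive_field(layers)
```

For this assumed stack the receptive field is 107 input samples with a cumulative stride of 8. Any pooling layers would enter the same recursion as additional (kernel, stride) pairs and enlarge the result.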
[0181] Residual connection. Residual connections are used to alleviate the vanishing-gradient problem in deep networks:
[0182] $y=\mathcal{F}(x)+\mathcal{S}(x)$;
[0183] where $\mathcal{F}(\cdot)$ is a combination of several convolutional layers, batch normalization layers, and activation layers, and $\mathcal{S}(\cdot)$ is the shortcut branch, which can be an identity mapping, a $1\times 1$ convolution used to match the number of channels, or zero padding or downsampling used to match the size.
[0184] The network structure configuration in this embodiment is as follows. The network contains 16 convolutional layers, divided into 4 stages, each containing 4 residual blocks. The number of output channels in each stage is 64, 128, 256, and 512, respectively. The first convolutional stride in the first stage is 1, and the first convolutional stride in the subsequent three stages is 2 for downsampling. All convolutional kernels have a size of 3, and the padding size is 1.
[0185] Global average pooling. Global average pooling is applied to the feature map $F_c(t)$ of the last convolutional layer, where $c$ is the channel index:
[0186] $h_c=\frac{1}{L_f}\sum_{t=1}^{L_f}F_c(t)$;
[0187] where $L_f$ is the feature map length and $h_c$ is the $c$-th component of the feature vector after global average pooling.
[0188] Global average pooling aggregates the feature maps of each channel into a single value, significantly reducing the number of parameters and improving numerical stability. In this embodiment, the feature vector output by global average pooling has a dimension of 512.
[0189] Classification head. Let the feature vector $h\in\mathbb{R}^{D}$ be the output of global average pooling. The classification head calculates the predicted output value for each category through a linear transformation:
[0190] $z=Wh+b$;
[0191] where $W\in\mathbb{R}^{K\times D}$ is the weight matrix, $b\in\mathbb{R}^{K}$ is the bias vector, $K$ is the number of categories, and $D$ is the dimension of the feature vector. In this embodiment, $K=5$ and $D=512$.
[0192] Dropout regularization. A Dropout layer is added between the global average pooling layer and the classification head for regularization. During training, a random mask $m$ is applied to the feature vector $h$:
[0193] $\tilde{h}=\frac{m\odot h}{1-p_d},\quad m_j\sim\mathrm{Bernoulli}(1-p_d)$;
[0194] where $p_d$ is the discard rate and $\odot$ denotes element-wise multiplication. This embodiment uses a preset discard rate. Dropout is not used during inference.
[0195] Step S5: Softmax probability output and numerical stabilization processing.
[0196] The feature vector is input into the classification head to obtain the predicted output value corresponding to each category. The predicted output value is then converted into the predicted probability of each category by the Softmax classifier.
[0197] The Softmax classifier converts the predicted output value $z$ into the probability distribution $p$:
[0198] $p_k=\frac{e^{z_k}}{\sum_{j=1}^{K}e^{z_j}}$;
[0199] where $p_k$ is the predicted probability of the $k$-th class, which satisfies $0<p_k<1$ and $\sum_{k=1}^{K}p_k=1$.
[0200] Numerical stabilization. To avoid overflow in the exponential operation, a numerically stable Softmax calculation is used. Let $z_{\max}=\max_j z_j$; then:
[0201] $p_k=\frac{e^{z_k-z_{\max}}}{\sum_{j=1}^{K}e^{z_j-z_{\max}}}$;
[0202] This calculation is mathematically equivalent to the original Softmax formula, but avoids the overflow caused by exponentiating large values.
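The numerically stable Softmax can be sketched with the standard library alone. The logit values in the example are arbitrary; they are chosen so that a naive `math.exp(1000.0)` would raise an overflow error, while the shifted form does not.

```python
import math

def stable_softmax(z):
    """Softmax computed after subtracting the maximum logit."""
    z_max = max(z)
    exps = [math.exp(v - z_max) for v in z]   # every exponent is <= 0, so no overflow
    total = sum(exps)
    return [e / total for e in exps]

large = stable_softmax([1000.0, 1001.0, 1002.0])
small = stable_softmax([0.0, 1.0, 2.0])
```

Both calls return the same distribution, because subtracting a constant from every logit cancels in the ratio.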
[0203] Temperature scaling. To calibrate model confidence, a temperature parameter $T_c$ can be introduced during the inference phase:
[0204] $p_k=\frac{e^{z_k/T_c}}{\sum_{j=1}^{K}e^{z_j/T_c}}$;
[0205] When the temperature parameter $T_c>1$, the probability distribution becomes smoother; when $T_c<1$, it becomes sharper. The temperature parameter can be optimized on the validation set by minimizing the negative log-likelihood. In this embodiment, the optimal temperature parameter is determined on the validation set through grid search.
[0206] Step S6: Definition and calculation of the loss function.
[0207] Network parameters are optimized using cross-entropy loss as the objective function.
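[0208] The cross-entropy loss is calculated as:
[0209] $\mathcal{L}_{CE} = -\dfrac{1}{B} \sum_{i=1}^{B} \sum_{k=1}^{K} y_{i,k} \log p_{i,k}$;
[0210] where $B$ is the mini-batch size, $K$ is the number of categories, $y_{i,k}$ is the component of the one-hot encoded label of sample $i$ on category $k$, and $p_{i,k}$ is the predicted probability of category $k$ for sample $i$.
[0211] Weighted cross-entropy loss is used to address the class imbalance problem:
[0212] $\mathcal{L}_{WCE} = -\dfrac{1}{B} \sum_{i=1}^{B} \sum_{k=1}^{K} w_k\, y_{i,k} \log p_{i,k}$;
[0213] where $w_k$ is the weighting coefficient of category $k$.
[0214] Focal loss can be used to focus further on hard-to-classify samples:
[0215] $\mathcal{L}_{FL} = -\dfrac{1}{B} \sum_{i=1}^{B} \sum_{k=1}^{K} w_k\, (1 - p_{i,k})^{\gamma}\, y_{i,k} \log p_{i,k}$;
[0216] where $\gamma$ is the focusing parameter, which controls how much attention is given to hard-to-classify samples. When $\gamma = 0$, the focal loss degenerates into the weighted cross-entropy loss. This embodiment sets the focusing parameter to ... .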
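The relationship between the three classification losses can be illustrated with a short sketch. The function `focal_loss`, the floor `eps`, the toy probabilities, and the class weights `[0.5, 2.0]` are assumptions made for illustration and are not values of the embodiment. With the focusing parameter set to 0 and no class weights, the function returns the plain cross-entropy.

```python
import math

def focal_loss(probs, labels, class_weights=None, gamma=0.0, eps=1e-12):
    """Mean class-weighted focal loss over a mini-batch.

    probs:  list of per-sample probability lists (each row sums to 1)
    labels: list of true class indices
    gamma:  focusing parameter; gamma = 0 gives (weighted) cross-entropy
    eps:    assumed floor that keeps log() finite
    """
    total = 0.0
    for p, k in zip(probs, labels):
        w = 1.0 if class_weights is None else class_weights[k]
        pk = max(p[k], eps)                  # probability assigned to the true class
        total += -w * (1.0 - pk) ** gamma * math.log(pk)
    return total / len(labels)

batch_probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
batch_labels = [0, 0, 1]

ce = focal_loss(batch_probs, batch_labels)                                  # cross-entropy
wce = focal_loss(batch_probs, batch_labels, class_weights=[0.5, 2.0])       # weighted
fl = focal_loss(batch_probs, batch_labels, class_weights=[0.5, 2.0], gamma=2.0)
```

Because the factor that multiplies each term is at most 1, the focal loss never exceeds the weighted cross-entropy on the same batch; well-classified samples are down-weighted the most.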
[0217] Regularization loss. To prevent overfitting, a weight decay regularization term is added to the total loss:
[0218] $\mathcal{L}_{reg} = \lambda \lVert \theta \rVert_q^q$;
[0219] where $\lambda$ is the regularization coefficient, $\theta$ denotes the model's learnable parameters, and $\lVert \cdot \rVert_q$ is the $L_q$ norm.
[0220] With $L_2$ regularization ($q = 2$), the gradient of the regularization term is $2\lambda\theta$.
[0221] With $L_1$ regularization ($q = 1$), the gradient of the regularization term is $\lambda\,\mathrm{sign}(\theta)$.
[0222] This embodiment adopts ... regularization, with regularization coefficient ... .
[0223] Total loss function. The total loss is the sum of the classification loss and the regularization loss:
[0224] $\mathcal{L} = \mathcal{L}_{cls} + \mathcal{L}_{reg}$;
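[0225] Simplified gradient calculation. When Softmax is combined with the cross-entropy loss, the gradient with respect to the predicted output value $z_{i,k}$ of sample $i$ for category $k$ has a concise form:
[0226] $\dfrac{\partial \mathcal{L}_{CE}}{\partial z_{i,k}} = \dfrac{1}{B}\,(p_{i,k} - y_{i,k})$;
[0227] In the weighted case:
[0228] $\dfrac{\partial \mathcal{L}_{WCE}}{\partial z_{i,k}} = \dfrac{1}{B}\, w_{c_i}\,(p_{i,k} - y_{i,k})$, where $c_i$ is the true category of sample $i$;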
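The concise gradient form can be checked numerically. The sketch below compares the closed-form per-sample gradient, weight times (probability minus one-hot label), against central finite differences; averaging over a mini-batch only adds the factor one over the batch size. The function names, the example output values, and the weight 2.5 are illustrative assumptions.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(z, true_class, weight=1.0):
    """Weighted cross-entropy of one sample as a function of its output values."""
    return -weight * math.log(softmax(z)[true_class])

def analytic_grad(z, true_class, weight=1.0):
    """Closed form: weight * (p_k - y_k) for each class k."""
    p = softmax(z)
    return [weight * (p[k] - (1.0 if k == true_class else 0.0)) for k in range(len(z))]

def numeric_grad(z, true_class, weight=1.0, h=1e-6):
    """Central finite differences, used only to check the closed form."""
    grad = []
    for k in range(len(z)):
        up = list(z)
        up[k] += h
        down = list(z)
        down[k] -= h
        grad.append((cross_entropy(up, true_class, weight)
                     - cross_entropy(down, true_class, weight)) / (2 * h))
    return grad

logits = [1.5, -0.3, 0.2, 0.9]
g_exact = analytic_grad(logits, true_class=2, weight=2.5)
g_check = numeric_grad(logits, true_class=2, weight=2.5)
max_gap = max(abs(a - b) for a, b in zip(g_exact, g_check))
```

The components of the closed-form gradient sum to zero, since both the probabilities and the one-hot label sum to one.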
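[0229] Gradient of the classification head parameters. Let $\delta_i = \partial \mathcal{L} / \partial z_i$ be the gradient vector of the $i$-th sample at the predicted output values. For a linear classification head $z_i = W h_i + b$ acting on the feature vector $h_i$, the gradients of the classification head parameters are:
[0230] $\dfrac{\partial \mathcal{L}}{\partial W} = \sum_{i=1}^{B} \delta_i h_i^{\top}$, $\dfrac{\partial \mathcal{L}}{\partial b} = \sum_{i=1}^{B} \delta_i$;
[0231] Backpropagation through the global average pooling layer. Let $h_j = \dfrac{1}{L'} \sum_{t=1}^{L'} F_{j,t}$, where $F_{j,t}$ is the value of the feature map of channel $j$ at position $t$ and $L'$ is the feature map length; then:
[0232] $\dfrac{\partial \mathcal{L}}{\partial F_{j,t}} = \dfrac{1}{L'}\, \dfrac{\partial \mathcal{L}}{\partial h_j}$;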
[0233] Step S7: Optimization and training strategy.
[0234] The parameters of a one-dimensional deep convolutional neural network are updated using backpropagation and optimization algorithms, and the network is trained until the convergence condition or early stopping condition is met.
[0235] Gradient clipping. To prevent gradient explosion, gradients are clipped before each parameter update. Let the gradient vector be $g$ and the clipping threshold be $\tau$. If $\lVert g \rVert_2 > \tau$, then:
[0236] $g \leftarrow \tau \cdot \dfrac{g}{\lVert g \rVert_2}$;
[0237] This embodiment sets the gradient clipping threshold to ... .
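Clipping by norm, as described above, rescales the whole gradient vector when its norm exceeds the threshold and leaves it untouched otherwise. A minimal sketch follows; the helper name `clip_by_global_norm` and the threshold 1.0 are illustrative assumptions, not the value set in this embodiment.

```python
import math

def clip_by_global_norm(grad, threshold):
    """Rescale grad so that its L2 norm does not exceed threshold."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > threshold:
        scale = threshold / norm
        return [g * scale for g in grad]
    return list(grad)

clipped = clip_by_global_norm([3.0, 4.0], threshold=1.0)     # norm 5 is rescaled to norm 1
untouched = clip_by_global_norm([0.3, 0.4], threshold=1.0)   # norm 0.5 is left as is
```

The direction of the gradient is preserved; only its length changes.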
[0238] Momentum stochastic gradient descent. The parameter update rule of the Momentum-SGD optimization algorithm is:
[0239] $u_t = \mu\, u_{t-1} + g_t$, $\theta_t = \theta_{t-1} - \eta\, u_t$;
[0240] where $\mu$ is the momentum coefficient, $\eta$ is the learning rate, and $g_t$ is the gradient at step $t$.
[0241] Adam optimization algorithm. This embodiment uses the Adam optimization algorithm for parameter updates. The update rule is as follows:
[0242] $m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$,
[0243] $v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t \odot g_t$,
[0244] $\hat{m}_t = \dfrac{m_t}{1 - \beta_1^{t}}$,
[0245] $\hat{v}_t = \dfrac{v_t}{1 - \beta_2^{t}}$,
[0246] $\theta_t = \theta_{t-1} - \eta\, \dfrac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$;
[0247] where $g_t$ is the gradient at step $t$; $\beta_1$ and $\beta_2$ are the decay coefficients of the first and second moments, respectively; $m_t$ and $v_t$ are the estimates of the first and second moments, respectively; $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected estimates of the first and second moments, respectively; $\eta$ is the learning rate; $\epsilon$ is a small constant that prevents division by zero; and $\odot$ denotes element-wise multiplication.
[0248] The hyperparameters of the Adam optimization algorithm in this embodiment are set as follows: ... , ... , initial learning rate ... .
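A single Adam update can be written out in a few lines. The sketch below operates on plain lists of floats and minimizes a toy quadratic; the defaults `beta1=0.9`, `beta2=0.999`, `eps=1e-8` are the commonly used ones and, like the learning rate 0.05 and the toy objective, are assumptions rather than the settings of this embodiment.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on lists of floats; t is the 1-based step index."""
    new_theta, new_m, new_v = [], [], []
    for th, g, mi, vi in zip(theta, grad, m, v):
        mi = beta1 * mi + (1.0 - beta1) * g          # first-moment estimate
        vi = beta2 * vi + (1.0 - beta2) * g * g      # second-moment estimate
        m_hat = mi / (1.0 - beta1 ** t)              # bias correction
        v_hat = vi / (1.0 - beta2 ** t)
        new_theta.append(th - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_theta, new_m, new_v

# Toy problem: minimise f(theta) = theta^2 (gradient 2 * theta) starting from 1.0.
theta, m, v = [1.0], [0.0], [0.0]
first_step = None
for t in range(1, 501):
    grad = [2.0 * theta[0]]
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
    if t == 1:
        first_step = theta[0]
```

On the first step the bias-corrected ratio equals the sign of the gradient, so the parameter moves by almost exactly one learning rate regardless of the gradient's magnitude.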
[0249] Learning rate warm-up. To avoid gradient oscillations in the early stage of training, a number of warm-up steps $T_w$ is set. During the warm-up phase, when $t \leq T_w$, the learning rate increases linearly:
[0250] $\eta_t = \eta_0 \cdot \dfrac{t}{T_w}$;
[0251] This embodiment sets the number of warm-up steps to ... .
[0252] Cosine annealing learning rate scheduling. After the warm-up phase, the learning rate is adjusted using a cosine annealing strategy:
[0253] $\eta_t = \eta_{min} + \dfrac{1}{2}\,(\eta_0 - \eta_{min}) \left(1 + \cos\dfrac{\pi t}{T_{total}}\right)$;
[0254] where $\eta_{min}$ is the minimum learning rate, $\eta_0$ is the initial learning rate, $T_{total}$ is the total number of training steps, and $t$ is the current step number.
[0255] This embodiment sets the minimum learning rate to ... and the total number of training steps to ... .
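The two-phase schedule, linear warm-up followed by cosine annealing, can be sketched as a single function of the step number. All numeric values below (base rate, minimum rate, 100 warm-up steps, 1000 total steps) are illustrative assumptions, and the cosine term is evaluated at the global step, which is one plausible reading of the description.

```python
import math

def learning_rate(step, base_lr, min_lr, warmup_steps, total_steps):
    """Linear warm-up followed by cosine annealing towards min_lr."""
    if warmup_steps > 0 and step <= warmup_steps:
        return base_lr * step / warmup_steps              # linear ramp from 0 to base_lr
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return min_lr + (base_lr - min_lr) * cosine

schedule = [learning_rate(s, base_lr=1e-3, min_lr=1e-5, warmup_steps=100, total_steps=1000)
            for s in range(0, 1001)]
```

With the cosine measured from step 0, the rate drops slightly right after warm-up ends; measuring the cosine phase from the end of warm-up instead would make the transition continuous.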
[0256] Training configuration. Batch size ... , total training steps ... . During training, the hyperparameter configuration and model version are recorded so that experiments are reproducible.
[0257] Early stopping strategy. If the performance metric on the validation set does not improve for $P$ consecutive evaluation periods, training is stopped and the model is rolled back to the best model parameters $\theta^{*}$, where $P$ is the patience parameter.
[0258] This embodiment sets the patience parameter to ... . Each evaluation period consists of 500 training steps. The performance metric is the macro-average F1 score on the validation set.
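A small bookkeeping class is enough to express the early stopping rule. The class name `EarlyStopping`, the patience of 3, and the score history are hypothetical; in practice the score would be the macro-average F1 on the validation set and `params` a copy of the model weights.

```python
class EarlyStopping:
    """Stops training after `patience` evaluations without improvement."""

    def __init__(self, patience):
        self.patience = patience
        self.best_score = float("-inf")
        self.best_params = None
        self.bad_evals = 0

    def update(self, score, params):
        """Record one evaluation; return True when training should stop."""
        if score > self.best_score:
            self.best_score = score
            self.best_params = dict(params)   # snapshot to roll back to
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience

stopper = EarlyStopping(patience=3)
macro_f1_history = [0.70, 0.78, 0.81, 0.80, 0.81, 0.79, 0.85]
stopped_at = None
for evaluation, score in enumerate(macro_f1_history):
    if stopper.update(score, {"evaluation": evaluation}):
        stopped_at = evaluation
        break
```

In this toy history training stops before the late score of 0.85 is ever seen, which shows the trade-off that the patience parameter controls.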
[0259] Exponential moving average. To improve the stability of model inference, an exponential moving average of the parameters is maintained during training:
[0260] $\theta_{EMA} \leftarrow \alpha\, \theta_{EMA} + (1 - \alpha)\, \theta$;
[0261] where $\alpha$ is the decay coefficient. This embodiment sets the decay coefficient to ... . During inference, $\theta_{EMA}$ is used preferentially.
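The moving-average update is a one-line blend of the previous average and the current parameters. In the sketch below the decay 0.9 and the zero initialization are assumptions chosen to keep the arithmetic visible; a decay much closer to 1 and initialization from the model weights are more typical.

```python
def ema_update(ema_params, params, decay):
    """Return decay * ema + (1 - decay) * params, element by element."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]

ema = [0.0, 0.0]
for step_params in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    ema = ema_update(ema, step_params, decay=0.9)
```

After three identical updates from zero, the average has covered a fraction 1 - 0.9^3 = 0.271 of the distance to the parameter values.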
[0262] Hard sample mining. An optional online hard sample mining strategy can be introduced. The loss $\ell_i$ of each sample in a mini-batch is calculated, and the subset $\mathcal{H}$ containing the fraction $\rho$ of samples with the largest loss is selected to participate in backpropagation:
[0263] $\mathcal{L}_{OHEM} = \dfrac{1}{|\mathcal{H}|} \sum_{i \in \mathcal{H}} \ell_i$;
[0264] where $\rho$ is selected within the range of 0.5 to 1.0. This embodiment sets $\rho$ to ... . This strategy lets the model focus on boundary samples and easily confused categories, thereby improving its discriminative ability.
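Selecting the hardest fraction of a mini-batch amounts to sorting the per-sample losses and keeping the top entries. The helper below is a sketch; its name, the rounding up of the kept count, and the toy losses are assumptions.

```python
import math

def hard_sample_loss(sample_losses, ratio):
    """Indices of, and mean loss over, the `ratio` fraction with the largest loss."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    keep = max(1, math.ceil(ratio * len(sample_losses)))
    order = sorted(range(len(sample_losses)), key=lambda i: sample_losses[i], reverse=True)
    selected = sorted(order[:keep])
    mean_loss = sum(sample_losses[i] for i in selected) / keep
    return selected, mean_loss

selected, loss = hard_sample_loss([0.1, 2.0, 0.5, 1.0], ratio=0.5)
```

With a ratio of 1.0 every sample is kept and the result is the ordinary mini-batch mean.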
[0265] Step S8: Evaluation metrics, threshold selection, and model output.
[0266] The performance metrics of the trained one-dimensional deep convolutional neural network are evaluated on the validation set to determine the rejection threshold and output a deployable defect recognition model.
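[0267] Confusion matrix. Define $n_{ij}$ as the number of samples whose true category is $i$ and whose predicted category is $j$. The overall accuracy is calculated as:
[0268] $\mathrm{Acc} = \dfrac{\sum_{k=1}^{K} n_{kk}}{\sum_{i=1}^{K} \sum_{j=1}^{K} n_{ij}}$;
[0269] Category performance metrics. The precision, recall, and F1 score of each class $k$ are calculated as:
[0270] $\mathrm{Prec}_k = \dfrac{n_{kk}}{\sum_{i=1}^{K} n_{ik}}$, $\mathrm{Rec}_k = \dfrac{n_{kk}}{\sum_{j=1}^{K} n_{kj}}$, $\mathrm{F1}_k = \dfrac{2\, \mathrm{Prec}_k\, \mathrm{Rec}_k}{\mathrm{Prec}_k + \mathrm{Rec}_k}$;
[0271] The macro-average F1 score is calculated as:
[0272] $\mathrm{F1}_{macro} = \dfrac{1}{K} \sum_{k=1}^{K} \mathrm{F1}_k$;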
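All of these metrics follow directly from the confusion matrix, as the sketch below shows for a toy two-class case. The function name, the convention of returning 0 when a denominator is empty, and the matrix values are assumptions for illustration.

```python
def classification_metrics(confusion):
    """confusion[i][j] = number of samples with true class i predicted as class j."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[c][c] for c in range(k)) / total
    per_class = []
    for c in range(k):
        tp = confusion[c][c]
        predicted = sum(confusion[i][c] for i in range(k))   # column sum
        actual = sum(confusion[c])                           # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append({"precision": precision, "recall": recall, "f1": f1})
    macro_f1 = sum(m["f1"] for m in per_class) / k
    return accuracy, per_class, macro_f1

toy_confusion = [[8, 2],
                 [1, 9]]
accuracy, per_class, macro_f1 = classification_metrics(toy_confusion)
```

Macro averaging gives each class the same weight regardless of its sample count, which is why it is a natural choice for imbalanced defect data.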
[0273] The performance metrics of this embodiment on the test set are as follows: overall accuracy 95.2%, macro average F1 score 92.8%, and recall rates for each category are 97.5% for normal area, 93.2% for debonding defects, 91.5% for delamination defects, 89.8% for void defects, and 86.5% for honeycomb collapse defects.
[0274] Determining the rejection threshold. The maximum class prediction probability $p_{max} = \max_k p_k$ is calculated for each sample in the validation set. A rejection threshold $\tau_{rej}$ is set; when $p_{max} \geq \tau_{rej}$, the corresponding category is output, otherwise the "requires review" flag is output.
[0275] Threshold search under a recall constraint. With the rejection threshold $\tau_{rej}$ as the independent variable, $\tau_{rej}$ is chosen to maximize precision or F1 score subject to the constraint that the recall of the defective classes is at least $\mathrm{Rec}_{min}$. This embodiment sets the recall constraint to ... . The optimal rejection threshold is determined by grid search on the validation set.
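The rejection rule itself is simple to state in code. The sketch below covers only the decision and the resulting coverage, not the recall-constrained search; the names `decide` and `coverage`, the threshold 0.8, and the toy probabilities are illustrative assumptions.

```python
REVIEW = "requires review"

def decide(probabilities, threshold):
    """Return the predicted class index, or REVIEW when confidence is below threshold."""
    best = max(range(len(probabilities)), key=lambda k: probabilities[k])
    return best if probabilities[best] >= threshold else REVIEW

def coverage(batch_probabilities, threshold):
    """Fraction of samples that receive a class decision rather than a review flag."""
    accepted = sum(1 for p in batch_probabilities if decide(p, threshold) != REVIEW)
    return accepted / len(batch_probabilities)

batch = [[0.95, 0.03, 0.02], [0.50, 0.45, 0.05], [0.10, 0.85, 0.05]]
decisions = [decide(p, threshold=0.8) for p in batch]
accepted_fraction = coverage(batch, threshold=0.8)
```

Raising the threshold sends more samples to review and lowers coverage, so a grid search over the threshold trades automation against the risk of confident errors.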
[0276] Expected calibration error. Predictions are divided into $M$ intervals by confidence level. For each interval $I_m$, the average confidence $\mathrm{conf}(I_m)$ and the accuracy $\mathrm{acc}(I_m)$ within the interval are calculated. The expected calibration error (ECE) is calculated as:
[0277] $\mathrm{ECE} = \sum_{m=1}^{M} \dfrac{|I_m|}{N}\, \left| \mathrm{acc}(I_m) - \mathrm{conf}(I_m) \right|$;
[0278] where $N$ is the number of samples in the test set. This embodiment sets the number of intervals to ... . The ECE value on the test set was 3.2%, indicating that the model confidence was well calibrated.
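The expected calibration error can be computed with a single pass over the predictions. The sketch assumes equal-width intervals on [0, 1] that are closed on the left, which the text does not specify; the function name and the four toy predictions are also assumptions.

```python
def expected_calibration_error(confidences, correct, num_bins):
    """ECE with equal-width confidence bins on [0, 1].

    confidences: maximum predicted probability of each sample
    correct:     1 if the prediction was right, else 0
    """
    n = len(confidences)
    sums = [[0.0, 0.0, 0] for _ in range(num_bins)]      # [sum conf, sum correct, count]
    for conf, hit in zip(confidences, correct):
        b = min(int(conf * num_bins), num_bins - 1)      # conf == 1.0 goes in the last bin
        sums[b][0] += conf
        sums[b][1] += hit
        sums[b][2] += 1
    ece = 0.0
    for conf_sum, hit_sum, count in sums:
        if count:
            ece += (count / n) * abs(hit_sum / count - conf_sum / count)
    return ece

ece = expected_calibration_error([0.95, 0.95, 0.55, 0.55], [1, 1, 1, 0], num_bins=10)
```

Here two bins are occupied, each holding half of the samples with a gap of 0.05 between accuracy and confidence, so the weighted sum is 0.05.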
[0279] Model Output. The trained one-dimensional deep convolutional neural network parameters are saved as a deployable defect recognition model file. The model file includes the network structure definition, post-trained parameter weights, preprocessing parameter configuration, and rejection threshold setting.
[0280] Example 2
[0281] This embodiment uses defect detection in concrete bridge structures as an application scenario. Concrete bridges may develop internal defects such as voids, honeycombing, and cracks during service, requiring quality assessment through impact testing.
[0282] Referring to Figure 1, the training method for the knock signal defect recognition model in this embodiment includes the following steps.
[0283] Step S1: Acquisition and labeling of training data.
[0284] Acquire knock signal samples and label them by category to construct a sample set. .
[0285] Impact signal samples were collected using a portable impact testing device. The portable impact testing device includes an impact execution module, a signal acquisition module, and a data storage module. The impact execution module uses a standardized impact hammer with a nylon head, a 25mm diameter head, and a 300mm handle, ensuring consistent impact excitation on the concrete surface without damaging the structure. The signal acquisition module uses a high-sensitivity electret microphone with a sampling rate of 22050 Hz and a quantization precision of 16 bits. The data storage module stores the acquired raw waveform data in lossless WAV format in the built-in solid-state memory.
[0286] The category labels were determined by experienced bridge inspection engineers based on core sampling verification results and ground-penetrating radar scan results. This embodiment defines four categories: normal areas, void defects, honeycomb defects, and crack defects, each with a corresponding category index, so the number of categories is 4.
[0287] This embodiment collected a total of 8000 impact signal samples, including 5000 samples from normal areas, 1200 samples from void defects, 1000 samples from honeycomb defects, and 800 samples from crack defects, so the total number of samples is 8000.
[0288] The method for verifying label consistency and constructing one-hot encoded label vectors is the same as in Example 1. This embodiment sets a consistency threshold. If the consistency score is below the threshold, the sample at that location is marked as a suspected boundary sample and is either given a lower loss weight or removed from training. The sample percentages for each category are as follows: normal area 62.5%, void defect 15%, honeycomb defect 12.5%, and crack defect 10%. Because of this pronounced class imbalance, a weighted cross-entropy loss and a balanced sampling strategy are used in the subsequent steps.
[0289] Step S2: Sample event detection, segmentation and preprocessing.
[0290] The knock signal samples are subjected to event detection, peak alignment and fixed-length segmentation to obtain fixed-length segmented samples. The segmented samples are then subjected to mean removal and amplitude normalization to obtain preprocessed samples. Finally, the preprocessed samples are subjected to data augmentation to obtain augmented samples.
[0291] The short-time energy event detection method is the same as in Example 1. Because the frequency components of the concrete impact signal are low and attenuate slowly, the frame length is set to ... and the frame shift to ... . This embodiment sets the sliding window to ... and the coefficient to ... . When the short-time energy exceeds the adaptive threshold, a knock event is detected and its starting frame position is recorded.
[0292] The peak alignment method is the same as in Example 1. This embodiment sets the number of reserved points to ... , which allows the captured signal segment to cover both the impact contact phase and the subsequent free decay response. Because of the damping characteristics of concrete structures, the decay response lasts relatively long, so a larger number of reserved points is required.
[0293] Fixed-length segmentation. This embodiment applies a Hanning window to reduce spectral leakage. Since the concrete impact signal attenuates slowly, the sample length is set to 4096 to fully cover the main response of the signal. If overlapping slicing is used to obtain more training samples, the step size is set to 2048.
[0294] The mean removal, linear detrending, and amplitude normalization methods are the same as in Example 1. This embodiment sets a minimal constant to prevent division by zero. It cascades mean-variance standardization and maximum-amplitude normalization, in that order, so that the model adapts better to differences in impact intensity.
[0295] Bandpass filtering is applied to the detrended signal to suppress low-frequency drift and high-frequency noise. A fourth-order Butterworth IIR filter is used. In this embodiment, the passband is set to 50 Hz to 8000 Hz, and the filter coefficients are designed for a sampling rate of 22050 Hz. Since the impact response frequency of concrete structures is relatively low, the lower cutoff frequency is set to 50 Hz to preserve low-frequency information.
[0296] Energy gating. An energy threshold is set; if the root mean square energy of a sample is below the threshold, the sample is considered not to contain valid impact information and is discarded or re-extracted. Since the energy of concrete impact signals is relatively dispersed, the energy threshold is lowered accordingly.
[0297] Frequency band energy quality control performs a discrete Fourier transform on the standardized signal to calculate the energy proportions in the low-frequency band (50 Hz to 500 Hz), the mid-frequency band (500 Hz to 2000 Hz), and the high-frequency band (2000 Hz to 8000 Hz). If the energy distribution is significantly abnormal, the sample is marked as an anomaly and removed.
[0298] The data augmentation methods are the same as in Example 1, including amplitude scaling, noise addition, time shifting, slight scaling, random occlusion, and Mixup augmentation. The amplitude scaling coefficient is sampled randomly within the range of 0.85 to 1.15. Because ambient noise is high at concrete testing sites, the target signal-to-noise ratio is sampled randomly within the range of 15 to 35 dB to simulate actual working conditions. The time shift is sampled randomly within the range of -100 to 100, and the occlusion length is sampled randomly within the range of 100 to 400 sampling points. This embodiment sets the loss weight for strongly augmented samples to ... .
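Noise addition at a target signal-to-noise ratio can be sketched as follows. The noise power is derived from the signal power and the requested ratio in decibels; the helper name, the Gaussian noise model, the exact rescaling of the drawn noise, and the synthetic decaying oscillation are assumptions, while the 15 to 35 dB range is the one stated for this embodiment.

```python
import math
import random

def add_noise_at_snr(signal, snr_db, rng):
    """Add Gaussian noise scaled so that the result has exactly the requested SNR."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    noise_power = sum(n * n for n in noise) / len(noise)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * n for s, n in zip(signal, noise)]

rng = random.Random(0)
# Toy decaying oscillation standing in for a preprocessed knock segment.
clean = [math.exp(-n / 200.0) * math.sin(2 * math.pi * n / 32.0) for n in range(1024)]
snr_db = rng.uniform(15.0, 35.0)
noisy = add_noise_at_snr(clean, snr_db, rng)
```

Drawing a fresh ratio for every sample exposes the model to the whole range of noise conditions rather than a single level.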
[0299] Step S3: Dataset partitioning, balanced sampling, and batch training data flow.
[0300] The sample set is divided into a training set, a validation set, and a test set to construct a mini-batch training data stream.
[0301] The sample set was divided into training, validation, and test sets in a 7:1.5:1.5 ratio. To avoid information leakage, samples were grouped by detection area or original signal sequence before splitting, so that multiple impact samples from the same bridge component never appear in more than one of the training, validation, and test sets.
[0302] In this embodiment, the training set contains 5600 samples, the validation set contains 1200 samples, and the test set contains 1200 samples. Batch size .
[0303] The weighted random sampling and class weight calculation methods are the same as in Example 1. This example uses a stratified sampling strategy, with each mini-batch containing at least two samples from each class. The smoothing constant in the class weight calculation... The weights for each category are as follows: normal area 0.4, void defect 1.667, honeycomb defect 2.0, and crack defect 2.5.
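The listed class weights can be reproduced from the sample counts of this embodiment. The sketch below uses an inverse-frequency form, total count divided by the number of classes times the class count; this form and a negligible smoothing constant are assumptions, since the formula of Example 1 is not repeated here, but they match the four stated weights to three decimals.

```python
def class_weights(counts, smoothing=0.0):
    """Inverse-frequency weights: total / (num_classes * (count + smoothing))."""
    total = sum(counts.values())
    k = len(counts)
    return {name: total / (k * (n + smoothing)) for name, n in counts.items()}

concrete_counts = {"normal": 5000, "void": 1200, "honeycomb": 1000, "crack": 800}
weights = {name: round(w, 3) for name, w in class_weights(concrete_counts).items()}
```

Under this form a class that holds exactly one quarter of the samples would receive weight 1, so the majority class is down-weighted and each minority class is up-weighted in proportion to its rarity.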
[0304] Step S4: One-dimensional deep convolutional neural network and forward computation.
[0305] Referring to Figure 2, the augmented samples are input into a one-dimensional deep convolutional neural network, where they pass sequentially through convolutional layers, batch normalization layers, activation layers, and pooling layers to extract multi-layer features. Global average pooling is then used to obtain the feature vector. The network employs residual connections to mitigate the vanishing gradient problem in deep networks.
[0306] Input sample $x \in \mathbb{R}^{1 \times L}$, where $L$ is the sample length and the input is single-channel.
[0307] The calculation methods for convolutional layers, batch normalization layers, activation layers, pooling layers, and residual connections are the same as in Example 1.
[0308] The network structure configuration in this embodiment is as follows: Since the input sample length is 4096, the network contains 20 convolutional layers, divided into 5 stages, each containing 4 residual blocks. The number of output channels in each stage is 32, 64, 128, 256, and 512, respectively. The first convolutional stride in the first stage is 1, and the first convolutional stride in the subsequent four stages is 2 for downsampling. All convolutional kernels have a size of 3, and the padding size is 1.
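The effect of the stage strides on the sequence length can be worked out with the standard output-length formula for a one-dimensional convolution. The sketch assumes that only the first convolution of each stage has the stated stride and that no separate pooling layer changes the length before global average pooling; both are assumptions, and the function names are hypothetical.

```python
def conv_output_length(length, kernel=3, stride=1, padding=1):
    """Output length of a 1-D convolution: floor((L + 2p - k) / s) + 1."""
    return (length + 2 * padding - kernel) // stride + 1

def stage_lengths(input_length, first_strides):
    """Sequence length after each stage, given the stride of its first convolution."""
    lengths = []
    length = input_length
    for stride in first_strides:
        length = conv_output_length(length, stride=stride)
        lengths.append(length)
    return lengths

lengths = stage_lengths(4096, first_strides=[1, 2, 2, 2, 2])
```

Under these assumptions the 4096-point input is reduced to 256 positions across 512 channels, and global average pooling then averages those 256 positions per channel to give the 512-dimensional feature vector.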
[0309] Global average pooling aggregates the feature map of each channel into a single value, yielding an output feature vector of dimension 512. Classification head: ... , with Dropout regularization drop rate ... . Dropout is not used during inference.
[0310] Step S5: Softmax probability output and numerical stabilization processing.
[0311] The feature vector is input into the classification head to obtain the predicted output value corresponding to each category. The predicted output value is then converted into the predicted probability of each category using a Softmax classifier. The Softmax classifier and numerical stabilization method are the same as in Example 1. In this example, the optimal temperature parameter is determined through grid search on the validation set. .
[0312] Step S6: Definition and calculation of the loss function.
[0313] Network parameters are optimized using cross-entropy loss as the objective function. The calculation methods for cross-entropy loss, weighted cross-entropy loss, and focal loss are the same as in Example 1. This embodiment sets the focusing parameter to ... and the regularization coefficient to ... . The total loss is the sum of the classification loss and the regularization loss.
[0314] Step S7: Optimization and training strategy.
[0315] The parameters of the one-dimensional deep convolutional neural network are updated using backpropagation and optimization algorithms, and the network is trained until the convergence condition or early stopping condition is met. The optimization algorithm and training strategy are the same as in Example 1.
[0316] The hyperparameters of the Adam optimization algorithm in this embodiment are set as follows: ... , ... , initial learning rate ... , gradient clipping threshold ... , and warm-up steps ... . During the warm-up phase, the learning rate increases linearly.
[0317] Cosine annealing learning rate scheduling uses minimum learning rate ... and total training steps ... . The training configuration uses batch size ... and total training steps ... . The early stopping strategy uses patience parameter ... ; each evaluation period consists of 500 training steps, and the performance metric is the macro-average F1 score on the validation set. An exponential moving average of the parameters is also used, with decay coefficient ... .
[0318] Step S8: Evaluation metrics, threshold selection, and model output.
[0319] The performance metrics of the trained one-dimensional deep convolutional neural network are evaluated on a validation set to determine the rejection threshold and output a deployable defect recognition model. The evaluation metric calculation method is the same as in Example 1.
[0320] The performance metrics of this embodiment on the test set are as follows: overall accuracy 93.2%, macro average F1 score 89.5%, and recall rates for each category are 96.8% for normal areas, 90.2% for void defects, 87.5% for honeycomb defects, and 83.6% for crack defects.
[0321] The method for determining the rejection threshold is the same as in Example 1. This embodiment sets the recall constraint to ... , and the optimal rejection threshold is determined by grid search on the validation set. The number of confidence intervals is set to ... . The expected calibration error (ECE) on the test set is 3.5%, indicating that the model confidence is well calibrated.
[0322] The model output and model compression methods are the same as in Example 1.
[0323] Example 3
[0324] This embodiment uses the detection of welding defects in metal plates as an application scenario. During the welding process, defects such as porosity, slag inclusions, lack of fusion, and cracks may occur in the weld seams of metal plates, requiring quality screening through tapping inspection.
[0325] Referring to Figure 1, the training method for the knock signal defect recognition model in this embodiment includes the following steps:
[0326] Step S1: Training Data Acquisition and Labeling
[0327] Acquire knock signal samples and label them by category to construct a sample set. .
[0328] The impact signal samples are collected using an automated impact detection system. This system comprises an impact execution module, a signal acquisition module, and a data storage module. The impact execution module employs a pneumatic impact device with a hard alloy impact head, 8 mm in diameter, capable of generating high-frequency pulse excitation. The signal acquisition module uses a piezoelectric accelerometer with a sampling rate of 50,000 Hz and a quantization precision of 24 bits to ensure accurate capture of the high-frequency response characteristics of metallic materials. The data storage module stores the collected raw waveform data in an industrial-grade solid-state memory.
[0329] The category labels are determined by welding quality inspection engineers based on the results of X-ray flaw detection and ultrasonic testing. This embodiment defines five categories: acceptable weld, porosity defect, slag inclusion defect, lack of fusion defect, and crack defect, each with a corresponding category index, so the number of categories is 5.
[0330] This embodiment collected a total of 12,000 tapping signal samples, including 7,000 qualified weld samples, 1,800 porosity defect samples, 1,500 slag inclusion defect samples, 1,000 lack of fusion defect samples, and 700 crack defect samples, so the total number of samples is 12,000.
[0331] The method for verifying label consistency and constructing one-hot encoded label vectors is the same as in Example 1. This example sets a consistency threshold. The sample percentages for each category were as follows: qualified welds 58.3%, porosity defects 15%, slag inclusion defects 12.5%, lack of fusion defects 8.3%, and crack defects 5.8%. Since crack defects had the lowest sample percentage, a weighted cross-entropy loss and balanced sampling strategy were subsequently used for processing.
[0332] Step S2: Sample event detection, segmentation and preprocessing.
[0333] The knock signal samples are subjected to event detection, peak alignment and fixed-length segmentation to obtain fixed-length segmented samples. The segmented samples are then subjected to mean removal and amplitude normalization to obtain preprocessed samples. Finally, the preprocessed samples are subjected to data augmentation to obtain augmented samples.
[0334] The short-time energy event detection method is the same as in Example 1. Because the frequency components of the metal impact signal are high and attenuate rapidly, the frame length is set to ... and the frame shift to ... . This embodiment sets the sliding window to ... and the coefficient to ... .
[0335] The peak alignment method is the same as in Example 1. This embodiment sets the number of reserved points to ... . Because metallic materials have lower damping and faster signal attenuation, the number of reserved points is reduced accordingly.
[0336] Fixed-length segmentation. This embodiment applies a Hanning window. Since the metal impact signal attenuates rapidly, a sample length of 2048 is sufficient to cover the main response. If overlapping slicing is used to obtain more training samples, the step size is set to 1024.
[0337] The mean removal, linear detrending, and amplitude normalization methods are the same as in Example 1. This example sets a minimal constant to prevent division by zero. .
[0338] Bandpass filtering is applied to the detrended signal to suppress low-frequency drift and high-frequency noise. A fourth-order Butterworth IIR filter is used. In this embodiment, the passband is set to 200 Hz to 20000 Hz, and the filter coefficients are designed for a sampling rate of 50000 Hz. Since metallic materials have a high impact response frequency, the upper cutoff frequency is set to 20000 Hz to preserve high-frequency information.
[0339] Energy gating. An energy threshold is set. Because the energy of metal impact signals is highly concentrated, the energy threshold is raised accordingly.
[0340] Frequency band energy quality control performs a discrete Fourier transform on the standardized signal to calculate the energy proportions in the low-frequency band (200 Hz to 2000 Hz), the mid-frequency band (2000 Hz to 8000 Hz), and the high-frequency band (8000 Hz to 20000 Hz). If the energy distribution is significantly abnormal, the sample is marked as an abnormal sample and removed.
[0341] The data augmentation method is the same as in Example 1. The amplitude scaling factor is sampled randomly within the range of 0.9 to 1.1. Because ambient noise is low in automated detection environments, the target signal-to-noise ratio is sampled randomly within the range of 25 to 45 dB. The time shift is sampled randomly within the range of -50 to 50, and the occlusion length is sampled randomly within the range of 50 to 200 sampling points. This embodiment sets the loss weight for strongly augmented samples to ... .
[0342] Step S3: Dataset partitioning, balanced sampling, and batch training data flow.
[0343] The sample set is divided into a training set, a validation set, and a test set to construct a mini-batch training data stream.
[0344] The sample set was divided into training, validation, and test sets in a 7:1.5:1.5 ratio. To avoid information leakage, samples were grouped by weld batch before splitting, so that multiple tapping samples from the same weld never appear in more than one of the training, validation, and test sets.
[0345] In this embodiment, the training set contains 8400 samples, the validation set contains 1800 samples, and the test set contains 1800 samples. Batch size .
[0346] The weighted random sampling and class weight calculation methods are the same as in Example 1. This example uses a stratified sampling strategy, with each mini-batch containing at least 3 samples from each class. The smoothing constant in the class weight calculation... The weights for each category are as follows: qualified weld 0.343, porosity defect 1.333, slag inclusion defect 1.6, lack of fusion defect 2.4, and crack defect 3.429.
[0347] Step S4: One-dimensional deep convolutional neural network and forward computation.
[0348] Referring to Figure 2, the augmented samples are input into a one-dimensional deep convolutional neural network, where they pass sequentially through convolutional layers, batch normalization layers, activation layers, and pooling layers to extract multi-layer features. Global average pooling is then used to obtain the feature vector. The network employs residual connections to mitigate the vanishing gradient problem in deep networks.
[0349] Input sample $x \in \mathbb{R}^{1 \times L}$, where $L$ is the sample length and the input is single-channel.
[0350] The calculation methods for convolutional layers, batch normalization layers, activation layers, pooling layers, and residual connections are the same as in Example 1. In this embodiment, the LeakyReLU activation function can also be used for the activation layer.
[0351] The network structure in this embodiment is configured as follows: The network contains 16 convolutional layers, divided into 4 stages, each containing 4 residual blocks. The number of output channels in each stage is 64, 128, 256, and 512, respectively. The first convolutional stride in the first stage is 1, and the first convolutional stride in the subsequent three stages is 2 for downsampling. All convolutional kernels have a size of 3, and the padding size is 1.
[0352] Global average pooling aggregates the feature map of each channel into a single value, which significantly reduces the number of parameters and improves numerical stability, yielding an output feature vector of dimension 512. Classification head: ... , with Dropout regularization drop rate ... . Dropout is not used during inference.
[0353] Step S5: Softmax probability output and numerical stabilization processing.
[0354] The feature vector is input into the classification head to obtain the predicted output value corresponding to each category. The predicted output value is then converted into the predicted probability of each category using a Softmax classifier. The Softmax classifier and numerical stabilization method are the same as in Example 1. In this example, the optimal temperature parameter is determined through grid search on the validation set. .
[0355] Step S6: Definition and calculation of the loss function.
[0356] Network parameters are optimized using cross-entropy loss as the objective function. The calculation methods for cross-entropy loss, weighted cross-entropy loss, and focal loss are the same as in Example 1. This embodiment sets the focusing parameter to ... . Because the training sample size is large, the regularization coefficient is reduced accordingly and set to ... . The total loss is the sum of the classification loss and the regularization loss.
[0357] Step S7: Optimization and training strategy.
[0358] The parameters of the one-dimensional deep convolutional neural network are updated using backpropagation and optimization algorithms, and the network is trained until the convergence condition or early stopping condition is met. The optimization algorithm and training strategy are the same as in Example 1.
[0359] The hyperparameters of the Adam optimization algorithm in this embodiment are set as follows: ... , initial learning rate ... , gradient clipping threshold ... , and warm-up steps ... . During the warm-up phase, the learning rate increases linearly.
[0360] Cosine annealing learning rate scheduling uses minimum learning rate ... and total training steps ... . The training configuration uses batch size ... and total training steps ... . The early stopping strategy uses patience parameter ... ; each evaluation period consists of 600 training steps, and the performance metric is the macro-average F1 score on the validation set. An exponential moving average of the parameters is also used, with decay coefficient ... .
[0361] Optionally, an online hard sample mining strategy can be introduced, in which a fixed fraction of the samples with the largest loss in each mini-batch participates in backpropagation; this embodiment sets this fraction to ... . This strategy lets the model focus on boundary samples and easily confused categories, thereby improving its discriminative ability.
[0362] Step S8: Evaluation metrics, threshold selection, and model output.
[0363] The performance metrics of the trained one-dimensional deep convolutional neural network are evaluated on a validation set to determine the rejection threshold and output a deployable defect recognition model. The evaluation metric calculation method is the same as in Example 1.
[0364] The performance metrics of this embodiment on the test set are as follows: overall accuracy 94.5%, macro-average F1 score 91.8%, and recall rates for each category of 97.2% for qualified welds, 92.8% for porosity defects, 91.2% for slag inclusion defects, 88.5% for lack of fusion defects, and 85.8% for crack defects.
[0365] The method for determining the rejection threshold is the same as in Example 1. This embodiment sets the recall constraint to ... , and the optimal rejection threshold is determined by grid search on the validation set. The number of confidence intervals is set to ... . The expected calibration error (ECE) on the test set is 2.9%, indicating that the model confidence is well calibrated.
[0366] The model output and model compression methods are the same as in Example 1. Monte Carlo Dropout can be used to estimate prediction uncertainty. During inference, Dropout is kept on, and forward calculations are repeated multiple times to obtain multiple sets of prediction probabilities. The mean and variance of the probabilities are calculated. If the variance exceeds a preset threshold, a verification or resampling prompt is given.
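Monte Carlo Dropout, as outlined above, can be illustrated with a toy linear classification head. The head, its weights, the feature vector, the 200 passes, and the variance threshold 0.01 are all hypothetical stand-ins; in the actual method the stochastic passes would run through the trained network.

```python
import math
import random

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

def stochastic_forward(features, weights, drop_rate, rng):
    """One forward pass of a toy linear head with Dropout left switched on."""
    keep = 1.0 - drop_rate
    # Inverted Dropout: zero each feature with probability drop_rate, rescale the rest.
    dropped = [f / keep if rng.random() < keep else 0.0 for f in features]
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)

def mc_dropout(features, weights, drop_rate, passes, seed=0):
    """Mean and variance of class probabilities over repeated stochastic passes."""
    rng = random.Random(seed)
    runs = [stochastic_forward(features, weights, drop_rate, rng) for _ in range(passes)]
    k = len(weights)
    mean = [sum(r[c] for r in runs) / passes for c in range(k)]
    var = [sum((r[c] - mean[c]) ** 2 for r in runs) / passes for c in range(k)]
    return mean, var

toy_features = [0.8, -0.2, 1.5, 0.3]
toy_weights = [[1.0, 0.0, 0.5, -0.5], [-0.5, 1.0, 0.2, 0.4], [0.1, -0.3, -0.8, 1.0]]
mean, var = mc_dropout(toy_features, toy_weights, drop_rate=0.5, passes=200)
needs_review = max(var) > 0.01      # 0.01 is an arbitrary illustrative threshold
```

With the drop rate set to zero every pass is identical and the variance vanishes, which is a quick sanity check for an implementation of this kind.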
[0367] In summary, the embodiments disclosed herein have at least the following technical effects:
[0368] This invention uses a one-dimensional deep convolutional neural network with a tapping waveform sequence as input to achieve end-to-end automatic feature learning. It eliminates the need for manual feature design and automatically learns the hierarchical feature representations required to distinguish defect categories.
[0369] This invention achieves unified modeling of multiple categories, such as normal, void, and debonding, by using a deep convolutional structure and Softmax probability output, and outputs the probability distribution over all categories, which facilitates subsequent decision-making and confidence assessment.
[0370] This invention improves model convergence speed and reduces overfitting risk by combining batch normalization, weight decay, data augmentation, learning rate scheduling, and early stopping strategies, thus ensuring the stability of the training process.
[0371] This invention constructs a lightweight classification head through global average pooling, which significantly reduces the number of parameters and improves numerical stability, facilitating the engineering deployment of the model.
[0372] This invention effectively addresses the class imbalance problem by using a weighted cross-entropy loss function, thereby improving the ability to identify defective samples in minority classes and reducing the false negative rate.
[0373] This invention uses a rejection threshold mechanism to label low-confidence samples, thereby achieving quality control and forming an auditable and reproducible training loop.
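The last two effects, the weighted cross-entropy loss and the rejection threshold, can be illustrated with a short sketch. It assumes the loss is averaged over the mini-batch size and that one-hot labels reduce the inner sum over categories to a single term; the category weights, the threshold value, and the `"VERIFY"` marker are hypothetical.

```python
import math


def weighted_cross_entropy(probs_batch, labels, class_weights):
    """Mean over the batch of -w[y] * log p[y] for the true category y."""
    batch_size = len(labels)
    total = sum(class_weights[y] * math.log(p[y])
                for p, y in zip(probs_batch, labels))
    return -total / batch_size


def predict_with_rejection(probs, threshold):
    """Return the category index, or a verification mark for low confidence."""
    p_max = max(probs)
    if p_max >= threshold:
        return probs.index(p_max)
    return "VERIFY"


# Toy Softmax outputs for 2 samples and 2 categories (made-up values).
toy_probs = [[0.9, 0.1], [0.6, 0.4]]
toy_labels = [0, 1]            # the second sample is a misclassified minority sample
toy_weights = [1.0, 3.0]       # hypothetical: minority category weighted 3x

weighted = weighted_cross_entropy(toy_probs, toy_labels, toy_weights)
unweighted = weighted_cross_entropy(toy_probs, toy_labels, [1.0, 1.0])
decisions = [predict_with_rejection(p, threshold=0.7) for p in toy_probs]
print(round(weighted, 4), round(unweighted, 4), decisions)
```

The larger weight makes the error on the minority-category sample dominate the loss, while the threshold turns the uncertain second prediction into a verification mark instead of a category label.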
[0374] It is understood that the above embodiments are merely exemplary implementations used to illustrate the principles of the present invention, and the present invention is not limited thereto. For those skilled in the art, various modifications and improvements can be made without departing from the spirit and essence of the present invention, and these modifications and improvements are also considered to be within the scope of protection of the present invention.
Claims
1. A training method for a knock signal defect recognition model, characterized in that it comprises:
acquiring knock signal samples and labeling their categories to construct a sample set $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is a fixed-length knock signal sequence, $y_i$ is the corresponding category index, $N$ is the total number of samples, and $C$ is the number of categories;
performing event detection, peak alignment, and fixed-length segmentation on the knock signal samples to obtain fixed-length segmented samples, performing mean removal and amplitude normalization on the segmented samples to obtain preprocessed samples, and performing data augmentation on the preprocessed samples to obtain augmented samples;
wherein the event detection includes: calculating the short-time energy of the original knock signal sequence as $E(m) = \sum_{n=0}^{L_w-1} s^2(mH + n)$, where $E(m)$ is the energy value of the $m$-th frame, $s(\cdot)$ is the original knock signal sequence, $n$ is the position index, $L_w$ is the frame length, and $H$ is the frame shift; setting an adaptive threshold $T = \mu_E + k\sigma_E$, where $\mu_E$ and $\sigma_E$ are the mean and standard deviation of the short-time energy within the sliding window, respectively, and $k$ is a coefficient; and determining that a knock event has been detected when the short-time energy exceeds the adaptive threshold;
the peak alignment includes: finding the peak position $n_p$ within the neighborhood of the detected knock event; using the peak position as the alignment reference, determining the starting point of the fixed-length segment as $n_s = n_p - N_{pre}$, where $N_{pre}$ is the number of reserved points; and cutting a fixed-length signal segment of length $L$ from the starting point to obtain the segmented sample;
dividing the sample set into a training set, a validation set, and a test set to construct a mini-batch training data stream;
inputting the augmented samples into a one-dimensional deep convolutional neural network, extracting multi-layer features sequentially through convolutional layers, batch normalization layers, activation layers, and pooling layers, and obtaining a feature vector by global average pooling;
inputting the feature vector into a classification head to obtain the predicted output value corresponding to each category, and converting the predicted output values into the predicted probability of each category by a Softmax classifier;
using cross-entropy loss as the objective function, updating the parameters of the one-dimensional deep convolutional neural network by backpropagation and the Adam optimization algorithm, and training the one-dimensional deep convolutional neural network until a convergence condition or an early stopping condition is met;
evaluating the trained one-dimensional deep convolutional neural network on the validation set to determine the rejection threshold, and outputting a deployable defect recognition model;
wherein the cross-entropy loss is a weighted cross-entropy loss, calculated as $\mathcal{L} = -\frac{1}{B}\sum_{i=1}^{B}\sum_{c=1}^{C} w_c\, y_{i,c} \log p_{i,c}$, where $B$ is the number of training samples in a mini-batch, $C$ is the number of categories, $y_{i,c}$ is the component of the one-hot label of sample $i$ on category $c$, $p_{i,c}$ is the predicted probability of sample $i$ for category $c$, and $w_c$ is the weighting coefficient of category $c$.
2. The training method for the knock signal defect recognition model according to claim 1, characterized in that the mean removal is calculated as $\tilde{x}(n) = x(n) - \frac{1}{L}\sum_{j=1}^{L} x(j)$, and the amplitude normalization divides the mean-removed sample $\tilde{x}$ by its amplitude plus $\varepsilon$ to give the normalized sample $\hat{x}$; where $x$ is the segmented sample, $\tilde{x}$ is the mean-removed sample, $\hat{x}$ is the normalized sample, $L$ is the sample length, and $\varepsilon$ is a very small constant used to prevent division by zero.
3. The training method for the knock signal defect recognition model according to claim 2, characterized in that the data augmentation includes one or more of amplitude scaling, noise addition, time shifting, slight scaling, random occlusion, and Mixup; the noise addition is set according to the target signal-to-noise ratio, specifically $\mathrm{SNR} = 10\log_{10}(P_s / P_n)$, where $P_s$ is the signal power, $P_n$ is the noise power, and the noise follows a Gaussian distribution with a mean of zero and a variance of $\sigma^2 = P_n$.
4. The training method for the knock signal defect recognition model according to claim 3, characterized in that the global average pooling is calculated as $z_k = \frac{1}{T}\sum_{t=1}^{T} F_k(t)$, where $F_k(t)$ is the feature value of the $k$-th channel at position $t$ in the feature map of the last layer of the one-dimensional deep convolutional neural network, $T$ is the length of the feature map, and $z_k$ is the $k$-th component of the feature vector.
5. The training method for the knock signal defect recognition model according to any one of claims 1 to 4, characterized in that the early stopping condition includes: calculating performance metrics on the validation set, the performance metrics including the macro-averaged F1 score or the defect-category recall rate; and, when the performance metrics do not improve within a preset number of consecutive epochs, stopping training and rolling back to the model parameters corresponding to the best performance metrics.
6. The training method for the knock signal defect recognition model according to any one of claims 1 to 4, characterized in that determining the rejection threshold includes: calculating the maximum category prediction probability $p_{\max} = \max_{c} p_c$ for each sample in the validation set, where $p_c$ is the predicted probability of category $c$; and setting the rejection threshold $\tau$; when the maximum category prediction probability is greater than or equal to the rejection threshold, the defect recognition model outputs the corresponding category prediction result; when the maximum category prediction probability is less than the rejection threshold, the defect recognition model outputs a verification mark.
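As an informal illustration of the preprocessing chain recited in claims 1 and 2 (short-time energy, adaptive threshold, peak alignment, fixed-length segmentation, mean removal, amplitude normalization), a plain-Python sketch is given below. It is not part of the claims. Several choices are assumptions: the sliding window is taken as the frames preceding the current frame, the peak is the largest absolute sample, normalization divides by the peak absolute amplitude, and all names, parameter values, and the synthetic signal are hypothetical.

```python
import math


def short_time_energy(signal, frame_len, hop):
    """Energy of frame m = sum of squared samples in [m*hop, m*hop + frame_len)."""
    if len(signal) < frame_len:
        return []
    n_frames = 1 + (len(signal) - frame_len) // hop
    return [sum(v * v for v in signal[m * hop:m * hop + frame_len])
            for m in range(n_frames)]


def detect_event(energy, window, k):
    """First frame whose energy exceeds mean + k*std of the preceding window."""
    for m in range(window, len(energy)):
        past = energy[m - window:m]
        mu = sum(past) / window
        sigma = math.sqrt(sum((e - mu) ** 2 for e in past) / window)
        if energy[m] > mu + k * sigma:
            return m
    return None


def align_and_segment(signal, event_frame, frame_len, hop,
                      neighborhood, pre_points, seg_len):
    """Locate the peak near the event, then cut seg_len samples around it."""
    lo = event_frame * hop
    hi = min(len(signal), lo + frame_len + neighborhood)
    peak = max(range(lo, hi), key=lambda n: abs(signal[n]))
    start = max(0, peak - pre_points)          # keep pre_points before the peak
    segment = signal[start:start + seg_len]
    segment = segment + [0.0] * (seg_len - len(segment))  # zero-pad a short tail
    return peak, segment


def normalize(segment, eps=1e-8):
    """Mean removal followed by division by the peak absolute amplitude."""
    mean = sum(segment) / len(segment)
    centered = [v - mean for v in segment]
    scale = max(abs(v) for v in centered) + eps
    return [v / scale for v in centered]


# Synthetic record: low-level background, then a decaying knock at sample 200.
background = [(0.01 if (n // 16) % 2 == 0 else 0.02) * (-1) ** n
              for n in range(200)]
knock = [math.exp(-0.02 * n) * math.sin(0.5 * n + 1.0) for n in range(300)]
record = background + knock

energy = short_time_energy(record, frame_len=16, hop=8)
event = detect_event(energy, window=8, k=3.0)
peak, segment = align_and_segment(record, event, frame_len=16, hop=8,
                                  neighborhood=32, pre_points=20, seg_len=128)
clean = normalize(segment)
print(event, peak, len(clean))
```

Keeping a few reserved points before the peak preserves the onset of the knock, so every segment presents the network with the same alignment.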