A method and system for detecting the purity of a chemical intermediate based on chromatographic characteristics
By employing a chromatographic feature-based method for detecting the purity of chemical intermediates, and utilizing continuous wavelet transform and watershed segmentation algorithms, the problem of distinguishing impurity signals in the purity detection of chemical intermediates has been solved, achieving more accurate purity detection and improving the safety and quality control of chemical products.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHAANXI DAMEI CHEM TECH CO LTD
- Filing Date
- 2026-03-09
- Publication Date
- 2026-06-23
AI Technical Summary
Existing methods for detecting the purity of chemical intermediates based on one-dimensional time-domain signals are difficult to accurately distinguish between real weak impurity signals and random noise under low signal-to-noise ratio conditions, leading to deviations in the purity calculation results of chemical intermediates, which may cause safety issues or production accidents.
A method for detecting the purity of chemical intermediates based on chromatographic features is adopted. A one-dimensional discrete time series is mapped into a two-dimensional time-frequency coefficient matrix through continuous wavelet transform, which is then converted into a grayscale chromatographic image. The region of interest is segmented using morphological gradient maps and watershed segmentation algorithms. The confidence index is determined by combining scale energy decay rate and horizontal skewness, and spurious signals are eliminated to calculate the purity of the chemical intermediate sample.
It enables more accurate detection of the purity of chemical intermediates, improves the sensitivity and accuracy of impurity detection, avoids misjudgment and missed detection, and ensures the safety and quality control of chemical products.
Figure CN121805484B_ABST
Description
Technical Field
[0001] This application relates to the field of image processing technology, and in particular to a method and system for detecting the purity of chemical intermediates based on chromatographic features.
Background Technology
[0002] Pyrazolone compounds are a class of intermediates that play an important role in organic synthesis. Their molecular structure contains active functional groups such as a pyrazole ring and a ketone group, which enable them to participate in a variety of reactions such as addition, substitution, and condensation. They have applications in the fields of medicine, dyes, pesticides and functional materials.
[0003] In the production and R&D of chemical intermediates, chromatographic analysis technology is the core means to confirm the purity and composition of products. For example, it can be used to detect the purity of pyrazolone, a chemical intermediate. In related technologies, chromatographic data processing workstations usually adopt one-dimensional signal peak detection algorithms, which identify chromatographic peaks by setting thresholds, calculate peak areas using integral algorithms, and then use area normalization methods to determine the content of main components and impurities.
[0004] However, in the complex chemical production environment, the synthesis reaction liquid of chemical intermediates is often complex in composition, accompanied by a large number of by-products, isomers and unreacted raw materials, which often leads to baseline drift, noise interference and severe chromatographic peak overlap in the chromatogram. Existing detection methods based on one-dimensional time domain signals are difficult to distinguish between real weak impurity signals and random noise under low signal-to-noise ratio conditions.
[0005] If these weak or overlapping impurity signals cannot be accurately identified and removed, it will directly lead to a deviation between the calculated purity of chemical intermediates and the actual purity, which may cause safety problems or performance defects in the final drug or chemical product, or even cause serious production safety accidents and economic losses. Therefore, it is necessary to achieve more accurate detection of the purity of chemical intermediates based on chromatographic characteristics.
Summary of the Invention
[0006] To achieve more accurate detection of the purity of chemical intermediates, this application provides a method and system for detecting the purity of chemical intermediates based on chromatographic characteristics.
[0007] According to a first aspect of the embodiments of this application, a method for detecting the purity of chemical intermediates based on chromatographic features is provided, comprising: acquiring a chromatographic detection signal of a chemical intermediate sample; performing baseline correction on the chromatographic detection signal to obtain a one-dimensional discrete time series; mapping the one-dimensional discrete time series into a two-dimensional time-frequency coefficient matrix using continuous wavelet transform; converting the two-dimensional time-frequency coefficient matrix into a grayscale chromatographic image; calculating the morphological gradient map of the grayscale chromatographic image; and segmenting the morphological gradient map using a watershed segmentation algorithm to obtain multiple regions of interest; determining the scale energy decay rate and horizontal skewness of each region of interest; the scale energy decay rate is determined by the ratio of the sum of pixel energy in the region of interest within a range less than a preset scale threshold to the sum of pixel energy within a range greater than or equal to the preset scale threshold; determining a confidence index of the region of interest based on the scale energy decay rate and horizontal skewness; removing regions of interest with a confidence index lower than a preset threshold; and determining the purity of the chemical intermediate sample based on the integral area of the one-dimensional discrete time series corresponding to the retained regions of interest.
[0008] This allows for more accurate detection of the purity of chemical intermediates.
[0009] Optionally, mapping the one-dimensional discrete time series into a two-dimensional time-frequency coefficient matrix using continuous wavelet transform includes: selecting the Mexican hat wavelet as the mother wavelet function and performing a convolution operation on the one-dimensional discrete time series within a preset scale range; and determining the coefficient value at each coordinate point of the two-dimensional time-frequency coefficient matrix by integrating, over the entire time domain, the product of the one-dimensional discrete time series and the complex conjugate of the mother wavelet function after time translation and scale dilation.
[0010] Optionally, converting the two-dimensional time-frequency coefficient matrix into a grayscale chromatographic image includes: obtaining the minimum and maximum absolute values in the two-dimensional time-frequency coefficient matrix; for each coefficient in the two-dimensional time-frequency coefficient matrix, determining a first difference between the absolute value and the minimum value, and determining a second difference between the maximum value and the minimum value, using the ratio of the first difference to the second difference as a normalization coefficient; multiplying the normalization coefficient by a preset upper limit value for grayscale levels and rounding down to obtain the grayscale value of the corresponding pixel in the grayscale chromatographic image.
[0011] In this way, by mapping the wavelet coefficients to the standard grayscale space through extreme value normalization, not only are the dimensions of the data unified and the influence of the order of magnitude difference in the response values of different detectors eliminated, but the contrast of the image is also enhanced, so that even weak impurity peaks can present visible texture features in the grayscale image, which is convenient for subsequent image segmentation processing.
[0012] Optionally, calculating the morphological gradient map of the grayscale chromatographic image includes: performing a dilation operation on the grayscale chromatographic image using a preset structuring element to obtain a dilated image; performing an erosion operation on the grayscale chromatographic image using the structuring element to obtain an eroded image; and subtracting the grayscale value of the corresponding pixel in the eroded image from the grayscale value of each pixel in the dilated image to obtain the gradient value of the corresponding point in the morphological gradient map.
[0013] In this way, it can accurately capture the edge regions in grayscale images where grayscale values change drastically. In time-frequency images, the edges often correspond to the start and end points of chromatographic peaks or the bottom of valleys between overlapping peaks, thus providing precise dam boundaries for the watershed algorithm and helping to prevent over-segmentation or under-segmentation.
[0014] Optionally, the scale energy decay rate of the region of interest is determined as follows:
$$R_k = \frac{\sum_{(b,a)\in\Omega_k,\ a<a_T} G(b,a)^2}{\sum_{(b,a)\in\Omega_k,\ a\ge a_T} G(b,a)^2 + \varepsilon}$$
where $R_k$ is the scale energy decay rate of the $k$-th region of interest; $\Omega_k$ is the set of pixels of the $k$-th region of interest; $(b,a)$ are the coordinates of a pixel in the grayscale image, with $b$ the time displacement parameter and $a$ the scale parameter; $G(b,a)$ is the grayscale value of the pixel; $a_T$ is the preset scale threshold; and $\varepsilon$ is a positive number that prevents the denominator from being zero.
[0015] In this way, by quantifying the energy ratio of the region of interest at small-scale high frequencies to large-scale low frequencies, noise and real signals can be effectively distinguished based on the frequency domain attenuation characteristics of the signal.
[0016] Optionally, the horizontal skewness of the region of interest is determined as follows:
$$S_k = \frac{\mu_{30}/\mu_{00}}{\left(\mu_{20}/\mu_{00}\right)^{3/2}}, \qquad \mu_{pq} = \sum_{(b,a)\in\Omega_k} (b-\bar{b})^{p}\,(a-\bar{a})^{q}\,G(b,a)$$
where $S_k$ is the horizontal skewness of the region of interest; $\mu_{pq}$ is the $(p+q)$-order central moment of the region of interest $\Omega_k$; $\mu_{00}$ is the zeroth moment of the region of interest; $\mu_{20}$ is the second-order central moment of the region of interest; and $\mu_{30}$ is the third-order central moment of the region of interest. Each central moment $\mu_{pq}$ is determined as the grayscale-weighted sum, over all pixels within the region of interest, of the products of the powers of the pixel coordinate offsets from the centroid coordinates $(\bar{b},\bar{a})$.
[0017] In this way, by using image moment features to calculate the skewness of the time-frequency region, the symmetry characteristics of chromatographic peaks on the time axis can be captured. True chromatographic peaks usually exhibit a specific tailing or fronting morphology, while false artifacts or noise patches often have random morphological characteristics. By constraining the skewness features, the specificity of target recognition is further improved.
[0018] Optionally, the confidence index is determined as follows:
$$C_k = \frac{R_{\mathrm{ref}}}{R_k}\,\exp\!\left(-\frac{(S_k - S_0)^2}{2\sigma^2}\right)$$
where $C_k$ is the confidence index of the region of interest; $R_k$ is the scale energy decay rate of the region of interest; $R_{\mathrm{ref}}$ is the reference scale energy decay rate, taken as the scale energy decay rate of the image region corresponding to the main peak; $S_k$ is the horizontal skewness of the region of interest; $S_0$ is the preset skewness; $\sigma^2$ is the preset tolerance variance; and $\exp$ is the natural exponential function.
[0019] Optionally, after generating the grayscale chromatographic image, the method further includes: acquiring a standard fingerprint image and determining the reference centroid coordinates of the main peak in the standard fingerprint image; calculating the image centroid coordinates of the main peak in the current grayscale chromatographic image; calculating the time axis translation based on the reference centroid coordinates and the image centroid coordinates; and performing a rigid transformation alignment on the current grayscale chromatographic image according to the time axis translation.
[0020] This ensures that chromatographic data collected from different batches or different devices have a consistent coordinate benchmark in the time and frequency domain, which facilitates subsequent feature extraction based on a fixed window or horizontal comparison with historical data.
[0021] Optionally, the purity of a chemical intermediate sample is determined based on the integral area of the one-dimensional discrete time series corresponding to the retained region of interest, including: obtaining the integral area of the main peak; determining the integral areas of other peaks in the retained region of interest (excluding the main peak) on the one-dimensional discrete time series; taking the sum of the integral areas of all other peaks as the total impurity area, determining the sum of the integral area of the main peak and the total impurity area, and taking the ratio of the integral area of the main peak to the sum as the purity.
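The area normalization described in this paragraph can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the function names, the trapezoidal integration rule, and the unit sampling interval are assumptions made for the example.

```python
def trapezoid_area(series, start, end, dt=1.0):
    """Integrate series[start..end] (inclusive) with the trapezoidal rule.

    dt is the sampling interval; dt=1.0 is an assumption for illustration.
    """
    segment = series[start:end + 1]
    return sum((segment[i] + segment[i + 1]) * 0.5 * dt
               for i in range(len(segment) - 1))


def purity(main_peak_area, impurity_areas):
    """Ratio of the main peak area to the main peak area plus total impurity area."""
    total = main_peak_area + sum(impurity_areas)
    if total <= 0:
        raise ValueError("total integrated area must be positive")
    return main_peak_area / total
```

For instance, a main peak area of 99.0 with retained impurity areas of 0.6 and 0.4 gives a purity of about 0.99 under this sketch.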
[0022] According to a second aspect of the embodiments of this application, a chemical intermediate purity detection system based on chromatographic features is provided, comprising: a processor and a memory, wherein the memory stores computer program instructions, and the computer program instructions, when executed by the processor, implement the steps of the chemical intermediate purity detection method based on chromatographic features provided in the first aspect of this application.
[0023] The technical solutions provided by the embodiments of this application may include the following beneficial effects: the one-dimensional chromatographic signal of the chemical intermediate sample is enhanced to the two-dimensional time-frequency domain; the multi-scale analysis capability of continuous wavelet transform can effectively separate chromatographic peaks with specific scale characteristics from broadband noise; through image morphological gradient and watershed algorithms, overlapping peaks can be accurately segmented based on the time-frequency topology; and the confidence index constructed by combining scale energy decay rate and horizontal skewness can physically distinguish between real chromatographic peaks and false signals, thereby improving the sensitivity and accuracy of impurity detection in complex samples and achieving a more accurate determination of the purity of chemical intermediates.
[0024] It should be understood that the above general description and the following detailed description are exemplary and explanatory only, and do not limit this application.
Attached Figure Description
[0025] Figure 1 is a flowchart illustrating a method for detecting the purity of chemical intermediates based on chromatographic features, according to an exemplary embodiment;
[0026] Figure 2 is a schematic diagram of a grayscale chromatographic image in an embodiment of this application;
[0027] Figure 3 is a schematic diagram of a one-dimensional discrete time series corresponding to the region of interest;
[0028] Figure 4 is a schematic diagram illustrating the structure of a chemical intermediate purity detection system based on chromatographic features, according to an exemplary embodiment.
Detailed Implementation
[0029] To achieve more accurate detection of the purity of chemical intermediates, embodiments of this application provide a method and system for detecting the purity of chemical intermediates based on chromatographic characteristics. Figure 1 is a flowchart illustrating a method for detecting the purity of chemical intermediates based on chromatographic features, according to an exemplary embodiment. As shown in Figure 1, the method includes the following steps.
[0030] In step S101, the chromatographic detection signal of the chemical intermediate sample is acquired and baseline-corrected to obtain a one-dimensional discrete time series, and the one-dimensional discrete time series is then mapped into a two-dimensional time-frequency coefficient matrix using continuous wavelet transform.
[0031] In one embodiment, mapping the one-dimensional discrete time series into a two-dimensional time-frequency coefficient matrix using continuous wavelet transform includes: selecting the Mexican hat wavelet as the mother wavelet function and performing a convolution operation on the one-dimensional discrete time series within a preset scale range; and determining the coefficient value at each coordinate point of the two-dimensional time-frequency coefficient matrix by integrating, over the entire time domain, the product of the one-dimensional discrete time series and the complex conjugate of the mother wavelet function after time translation and scale dilation.
[0032] In actual chemical intermediate detection processes, raw voltage signals can be acquired from chromatographic detectors such as ultraviolet detectors or diode array detectors using high-precision analog-to-digital converters. The sampling frequency can be set from 10 Hz to 100 Hz to ensure that transient chromatographic peak details are captured.
[0033] The original signal is often superimposed with periodic noise caused by mobile phase pulsation and low-frequency baseline drift caused by column bleed or gradient elution. Before transformation, the original signal can be preprocessed first, and the baseline trend can be estimated by using an asymmetric least squares smoothing algorithm or a polynomial fitting algorithm. The trend can then be subtracted from the original signal to obtain a pure baseline-corrected one-dimensional discrete time series.
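As an illustration of the polynomial-fitting option mentioned above, the following sketch fits a first-degree polynomial by ordinary least squares and subtracts it from the signal. The function names are hypothetical, and fitting a straight line to all samples is a simplifying assumption; a practical workflow would typically use a higher degree or asymmetric least squares smoothing so that peak regions do not bias the baseline estimate.

```python
def fit_linear_baseline(signal):
    """Least-squares straight line through all samples (degree-1 polynomial)."""
    n = len(signal)
    mean_x = (n - 1) / 2.0
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(signal))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return [intercept + slope * x for x in range(n)]


def baseline_correct(signal):
    """Subtract the estimated baseline trend from the raw signal."""
    baseline = fit_linear_baseline(signal)
    return [y - b for y, b in zip(signal, baseline)]
```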
[0034] When performing the continuous wavelet transform, this embodiment can select the Mexican hat wavelet. The Mexican hat wavelet function is mathematically the negative of the second derivative of the Gaussian function. Its waveform has a large positive peak in the center and a smaller negative valley on each side. This geometric feature is similar to the Gaussian chromatographic peak under ideal conditions.
[0035] In the convolution operation, a scale range can be set, for example with the scale parameter $a$ varying from 1 to 64 in steps of 0.5. For each specific time displacement point $b$ and each scale $a$, the mother wavelet function is scaled and translated, and an inner product is then taken with the one-dimensional discrete time series. The Mexican hat wavelet has zero response to constant and linear signals, and can therefore eliminate residual baseline drift.
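The scale-by-scale inner product described above can be sketched in plain Python as follows. This is only an illustrative sketch: the function names, the unnormalized wavelet amplitude, the $1/\sqrt{a}$ scaling, the truncation of the wavelet at five scale widths, and the treatment of samples outside the series as zero are all assumptions rather than details given in this application.

```python
import math


def mexican_hat(t):
    """Mexican hat mother wavelet (negative second derivative of a Gaussian)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def cwt_mexican_hat(signal, scales):
    """Return a matrix W[i][b]: coefficient at scale scales[i], time index b."""
    n = len(signal)
    matrix = []
    for a in scales:
        half = int(math.ceil(5 * a))  # wavelet is negligible beyond about 5 scale widths
        kernel = [mexican_hat(k / a) / math.sqrt(a) for k in range(-half, half + 1)]
        row = []
        for b in range(n):
            acc = 0.0
            for j, w in enumerate(kernel):
                idx = b + j - half
                if 0 <= idx < n:  # samples outside the series are treated as zero
                    acc += signal[idx] * w
            row.append(acc)
        matrix.append(row)
    return matrix
```

Calling `cwt_mexican_hat(series, [1 + 0.5 * i for i in range(127)])` would cover the example scale range of 1 to 64 in steps of 0.5, at the cost of a slow pure-Python loop; a vectorized library routine would normally be preferred for real data.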
[0036] Multiscale analysis can expand a one-dimensional time series to a two-dimensional time-frequency plane, such as the scale-time plane. In the two-dimensional time-frequency plane, the true chromatographic peaks appear as continuous ridges spanning multiple scales, while high-frequency random noise appears as isolated spots scattered in small-scale regions.
[0037] This mapping not only visually demonstrates the energy distribution of chromatographic peaks at different scales, but also separates noise from signal in the scale domain, improving the signal-to-noise ratio. This enables subsequent steps to use image processing techniques to precisely identify subtle impurities.
[0038] In step S102, the two-dimensional time-frequency coefficient matrix is converted into a grayscale chromatographic image, the morphological gradient map of the grayscale chromatographic image is calculated, and the watershed segmentation algorithm is used to segment the morphological gradient map to obtain multiple regions of interest.
[0039] In one embodiment, converting a two-dimensional time-frequency coefficient matrix into a grayscale chromatographic image includes: obtaining the minimum and maximum absolute values of the two-dimensional time-frequency coefficient matrix; for each coefficient in the two-dimensional time-frequency coefficient matrix, determining a first difference between the absolute value and the minimum value of the coefficient, and determining a second difference between the maximum value and the minimum value, using the ratio of the first difference to the second difference as a normalization coefficient; multiplying the normalization coefficient by a preset upper limit value of grayscale level and rounding down to obtain the grayscale value of the pixel corresponding to the coefficient in the grayscale chromatographic image.
[0040] Since the coefficient matrix after continuous wavelet transform contains positive and negative values and the numerical range may span multiple orders of magnitude, it is difficult to perform direct image processing. Furthermore, the coefficient values generated by samples of different concentrations are not comparable. Therefore, dynamic range normalization can be performed.
[0041] The entire two-dimensional time-frequency coefficient matrix is traversed to find, among the absolute values of all coefficients, the global minimum $W_{\min}$ and the global maximum $W_{\max}$. For the coefficient $W(b,a)$ at any coordinate point $(b,a)$ in the matrix, its normalization coefficient is calculated as $N(b,a) = \dfrac{|W(b,a)| - W_{\min}}{W_{\max} - W_{\min}}$.
[0042] The normalization coefficients are mapped to a standard 8-bit grayscale space, with pixel values ranging from 0 to 255. In the generated grayscale chromatographic image, the higher the brightness, for example, the region closer to 255, the larger the wavelet coefficient modulus at that location, meaning the higher the matching degree between the signal and the wavelet function, corresponding to the energy center of the chromatographic peak; the lower the brightness, for example, the region closer to 0, corresponds to the baseline or background noise.
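The extreme-value normalization and rounding-down step can be sketched as follows. The function name is hypothetical, and mapping a completely flat coefficient matrix to all zeros is an assumption added to avoid division by zero; the application does not describe that case.

```python
import math


def to_grayscale(coeffs, max_level=255):
    """Map wavelet coefficients to gray levels 0..max_level by min-max normalization."""
    abs_vals = [[abs(c) for c in row] for row in coeffs]
    lo = min(min(row) for row in abs_vals)
    hi = max(max(row) for row in abs_vals)
    span = hi - lo
    if span == 0:
        # Assumption: a flat matrix carries no contrast, so map it to black.
        return [[0 for _ in row] for row in abs_vals]
    return [[int(math.floor((v - lo) / span * max_level)) for v in row]
            for row in abs_vals]
```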
[0043] Figure 2 is a schematic diagram of a grayscale chromatographic image in an embodiment of this application. As shown in Figure 2, the one-dimensional chromatographic detection signal of the chemical intermediate sample is converted into a two-dimensional image, which makes it easier to identify noise, ghost peaks and other interference signals intuitively and accurately in the two-dimensional image.
[0044] By employing a linear stretching transform based on extrema, the dynamic range of gray levels can be maximized and the contrast of the image can be enhanced. For impurity peaks that are small relative to the main peak, such as impurities with a content of 0.05%, if this dynamic range normalization is not performed, they may appear as dark areas that are not visible to the naked eye in the image. This transformation converts a complex signal processing problem into a computer vision problem, and image segmentation techniques such as the watershed algorithm can be used to handle chromatographic overlap problems.
[0045] In one embodiment, after generating a grayscale chromatographic image, a standard fingerprint image can be acquired, and the reference centroid coordinates of the main peak in the standard fingerprint image can be determined. The image centroid coordinates of the main peak in the current grayscale chromatographic image can be calculated. The time axis shift is calculated based on the reference centroid coordinates and the image centroid coordinates, and the current grayscale chromatographic image is rigidly transformed and aligned according to the time axis shift.
[0046] A standard fingerprint image of the chemical intermediate, generated under ideal conditions (for example, using a reference standard), can be stored in advance. When processing the current sample to be tested, the image is first binarized to identify the region with the highest brightness and the largest connected area as the main peak region, and its centroid coordinates are calculated using the image moment algorithm.
[0047] The reference centroid of the main peak in the standard fingerprint image is read, and the deviation of the current centroid coordinates from the reference centroid along the time axis is calculated. Keeping the scale axis unchanged, all pixels of the current image are shifted along the time axis by this deviation. Pixels that overflow after translation are truncated, and the empty areas are filled with zeros.
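The time-axis alignment can be sketched as follows, with image rows indexed by scale and columns by time. As a simplification, the main peak region is approximated here by all pixels at or above a brightness threshold, rather than by the largest connected bright region described above; the function names and the default threshold of 200 are hypothetical.

```python
def centroid_time(image, threshold):
    """Gray-weighted centroid along the time axis of pixels >= threshold."""
    m00 = 0.0
    m10 = 0.0
    for row in image:
        for x, g in enumerate(row):
            if g >= threshold:
                m00 += g
                m10 += x * g
    if m00 == 0:
        raise ValueError("no pixel reaches the threshold")
    return m10 / m00


def shift_time_axis(image, shift):
    """Shift every row by `shift` columns; overflow is dropped, gaps are zero-filled."""
    width = len(image[0])
    shifted = []
    for row in image:
        new_row = [0] * width
        for x, g in enumerate(row):
            nx = x + shift
            if 0 <= nx < width:
                new_row[nx] = g
        shifted.append(new_row)
    return shifted


def align_to_reference(image, ref_centroid_x, threshold=200):
    """Rigidly translate the image so its main-peak centroid matches the reference."""
    shift = int(round(ref_centroid_x - centroid_time(image, threshold)))
    return shift_time_axis(image, shift), shift
```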
[0048] Because the retention time of a high-performance liquid chromatograph can be affected by various physical factors, such as aging of the chromatographic column, changes in ambient temperature, or minor fluctuations in the mobile phase ratio, the elution time of the chromatographic peak of the same substance may drift. This drift can interfere with subsequent position-based feature comparisons.
[0049] By using rigid transformation alignment correction, it can be ensured that the features extracted subsequently are performed under the same time domain reference, which improves the comparability of data between different batches of samples and facilitates the rapid location of specific known impurity regions, such as locking specific key impurities based on retention time windows.
[0050] In one embodiment, calculating the morphological gradient map of a grayscale chromatographic image includes: performing a dilation operation on the grayscale chromatographic image using a preset structuring element to obtain a dilated image; performing an erosion operation on the grayscale chromatographic image using the structuring element to obtain an eroded image; and subtracting the grayscale value of the corresponding pixel in the eroded image from the grayscale value of each pixel in the dilated image to obtain the gradient value of the corresponding point in the morphological gradient map.
[0051] After the aligned grayscale image is obtained, a structuring element, for example a square or circular one of a preset size, can be selected. A morphological dilation operation replaces the grayscale value of each pixel with the maximum grayscale value within the structuring element's coverage area, causing the highlighted area to expand outward and fill tiny holes; a morphological erosion operation replaces the grayscale value of each pixel with the minimum grayscale value within the structuring element's coverage area, causing the highlighted area to shrink inward and eliminating isolated noise.
[0052] By subtracting the eroded image from the dilated image, a morphological gradient map is obtained. In the gradient map, regions with higher pixel values correspond to the boundaries of the original image where grayscale changes drastically, i.e., the outlines of the highlighted areas. Regions with gentle grayscale changes, such as the inside of peaks or flat backgrounds, are represented by low values in the gradient map.
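The dilation-minus-erosion computation can be sketched as follows for a square structuring element. The function names and the default radius are hypothetical, and clipping the window at the image border is an assumption; the watershed segmentation that consumes this gradient map is not shown.

```python
def _window_values(image, y, x, radius):
    """Gray values covered by a (2*radius+1) square window, clipped at the border."""
    height, width = len(image), len(image[0])
    return [image[j][i]
            for j in range(max(0, y - radius), min(height, y + radius + 1))
            for i in range(max(0, x - radius), min(width, x + radius + 1))]


def morphological_gradient(image, radius=1):
    """Dilated image (window maximum) minus eroded image (window minimum)."""
    gradient = []
    for y in range(len(image)):
        row = []
        for x in range(len(image[0])):
            values = _window_values(image, y, x, radius)
            row.append(max(values) - min(values))
        gradient.append(row)
    return gradient
```

On an image with a sharp vertical edge, the gradient is nonzero only in the columns adjacent to the edge, which is the behavior the paragraph above relies on to mark peak boundaries.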
[0053] In the case of overlapping chromatographic peaks, there is usually a saddle with a lower gray value between the two peaks. The morphological gradient can keenly capture the gradient change of this saddle, thereby constructing a steeper watershed boundary. This step improves the accuracy of segmentation and can successfully separate the fused shoulder peaks or mother-daughter peaks in the image domain to obtain independent regions of interest. Each region of interest represents a potential chromatographic peak, such as a main peak, impurity, or noise patch.
[0054] In step S103, the scale energy decay rate and horizontal skewness of each region of interest are determined.
[0055] The scale energy decay rate is determined by the ratio of the sum of pixel energy in the region of interest within a range less than a preset scale threshold to the sum of pixel energy within a range greater than or equal to the preset scale threshold.
[0056] In one embodiment, the scale energy decay rate of the region of interest is determined by the following expression:
$$R_k = \frac{\sum_{(b,a)\in\Omega_k,\ a<a_T} G(b,a)^2}{\sum_{(b,a)\in\Omega_k,\ a\ge a_T} G(b,a)^2 + \varepsilon}$$
where $R_k$ is the scale energy decay rate of the $k$-th region of interest; $\Omega_k$ is the set of pixels of the $k$-th region of interest; $(b,a)$ are the coordinates of a pixel in the grayscale image, with $b$ the time displacement parameter and $a$ the scale parameter; $G(b,a)$ is the grayscale value of the pixel; $a_T$ is the preset scale threshold; and $\varepsilon$ is a positive number that prevents the denominator from being zero.
[0057] For each region of interest $\Omega_k$ segmented by the watershed algorithm, attention is paid not only to its location but also to its internal energy distribution structure. A preset scale threshold $a_T$ can be set; it can be determined based on experience or noise band testing, and is used to distinguish between noise-dominated small-scale regions and signal-dominated large-scale regions.
[0058] The numerator of the expression for the scale energy decay rate accumulates the squared grayscale values of all pixels in the region whose scale is smaller than the preset scale threshold $a_T$; the denominator accumulates the squared grayscale values of the pixels whose scale is greater than or equal to $a_T$. The term $\varepsilon$ is a small positive number used to ensure numerical stability.
[0059] After continuous wavelet transform, the amplitude of white noise coefficients decays rapidly with increasing scale, indicating that the energy is mainly concentrated at the smallest scale, i.e., the numerator is larger and the denominator is smaller. In contrast, a real chromatographic peak is a low-frequency signal with a certain width, whose energy is mainly concentrated at the medium and large scales, and has strong coherence on the scale axis, i.e., the denominator is much larger than the numerator.
[0060] The scale energy decay rate $R_k$ of noise patches is usually larger, while the scale energy decay rate $R_k$ of true chromatographic peaks is usually smaller. The scale energy decay rate of the region of interest, as an indicator based on frequency domain physical properties, can effectively identify and eliminate false regions formed by high-frequency random noise or ghost peaks, thereby reducing the false positive detection rate.
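The ratio described in paragraphs [0055] to [0060] can be sketched as follows. It is assumed for illustration that the image row index stands in for the scale parameter, that a region is given as a list of (time index, scale index) pixel coordinates, and that the stabilizing constant defaults to 1e-6; the function name is hypothetical.

```python
def scale_energy_decay_rate(image, region, scale_threshold, eps=1e-6):
    """Small-scale energy divided by large-scale energy within one region.

    image[y][x] is the gray value at scale index y and time index x;
    region is a collection of (x, y) pixel coordinates.
    """
    pixels = list(region)
    small = sum(image[y][x] ** 2 for x, y in pixels if y < scale_threshold)
    large = sum(image[y][x] ** 2 for x, y in pixels if y >= scale_threshold)
    return small / (large + eps)
```

A region whose energy sits mostly below the threshold scale yields a value well above 1, while a region dominated by larger scales yields a value well below 1.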
[0061] In one embodiment, the horizontal skewness of the region of interest is determined by the following expression:
$$S_k = \frac{\mu_{30}/\mu_{00}}{\left(\mu_{20}/\mu_{00}\right)^{3/2}}, \qquad \mu_{pq} = \sum_{(b,a)\in\Omega_k} (b-\bar{b})^{p}\,(a-\bar{a})^{q}\,G(b,a)$$
where $S_k$ is the horizontal skewness of the region of interest; $\mu_{pq}$ is the $(p+q)$-order central moment of the region of interest $\Omega_k$; $\mu_{00}$ is the zeroth moment of the region of interest; $\mu_{20}$ is the second-order central moment of the region of interest; and $\mu_{30}$ is the third-order central moment of the region of interest. Each central moment $\mu_{pq}$ is determined as the grayscale-weighted sum, over all pixels within the region of interest, of the products of the powers of the pixel coordinate offsets from the centroid coordinates $(\bar{b},\bar{a})$.
[0062] Here $\mu_{00}$ represents the total grayscale mass, or total energy, of the region; $\mu_{20}$ reflects the extent, or variance, of the image in the horizontal direction, i.e., along the time axis; and $\mu_{30}$ reflects the asymmetry of the image in the horizontal direction. The first moments can be calculated first to determine the centroid coordinates, and the central moments of each order are then calculated relative to the centroid.
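The moment-based skewness along the time axis can be sketched as follows. The normalization used here, the third central moment divided by the zeroth moment over the 3/2 power of the second central moment divided by the zeroth moment, is the conventional skewness form and is an assumption of this sketch; the function name and the (time index, scale index) region layout are likewise hypothetical.

```python
def horizontal_skewness(image, region):
    """Gray-weighted skewness of a region along the time (horizontal) axis.

    image[y][x] is the gray value; region is a collection of (x, y) coordinates.
    """
    pixels = [(x, image[y][x]) for x, y in region]
    m00 = sum(g for _, g in pixels)               # zeroth moment (total gray mass)
    if m00 == 0:
        return 0.0
    x_bar = sum(x * g for x, g in pixels) / m00   # centroid on the time axis
    mu20 = sum((x - x_bar) ** 2 * g for x, g in pixels)
    mu30 = sum((x - x_bar) ** 3 * g for x, g in pixels)
    if mu20 == 0:
        return 0.0                                # no horizontal extent
    return (mu30 / m00) / (mu20 / m00) ** 1.5
```

Under this sketch a symmetric profile gives a skewness of zero, and a profile with a long tail toward later times gives a positive value.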
[0063] Ideal chromatographic peaks usually have a symmetrical Gaussian shape. However, in actual chemical analysis, due to column adsorption effects, injection volume overload, or solvent effects, real impurity peaks often exhibit tailing or fronting shapes with a specific range of skewness. Meanwhile, false regions caused by baseline fluctuations, bubble interference, or electronic artifacts are often irregular in shape or appear as perfectly symmetrical circular spots.
[0064] By introducing morphological features as an auxiliary criterion and quantifying the asymmetry of the region of interest on the time axis, it is possible to distinguish between real substance peaks with specific chromatographic behavior and accidentally formed interference patches.
[0065] In step S104, the confidence index of the region of interest is determined based on the scale energy decay rate and the horizontal skewness; regions of interest with confidence indices lower than a preset threshold are removed, and the purity of the chemical intermediate sample is determined based on the integral area of the one-dimensional discrete time series corresponding to the retained regions of interest.
[0066] In one embodiment, the confidence index is determined by an expression that combines a ratio term with an exponential term, where $C_k$ is the confidence index of the region of interest; $R_k$ is the scale energy decay rate of the region of interest; $R_{\text{ref}}$ is the reference scale energy decay rate, taken as the scale energy decay rate of the image region corresponding to the main peak; $S_k$ is the horizontal skewness of the region of interest; $S_0$ is the preset skewness; $\sigma^2$ is the preset tolerance variance; and $\exp$ is the natural exponential function.
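[0067] The confidence index considers energy characteristics and morphological characteristics together. Its first term is a relative ratio between the scale energy decay rate of the region of interest and the reference scale energy decay rate. Since the main peak is an actual substance peak, the reference value represents the standard energy decay characteristic of a real signal under the analysis conditions.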
[0068] If the region to be judged is a genuine impurity peak, its energy decay pattern should be similar to that of the main peak, so the ratio should be close to 1. If the region is noise, its scale energy decay rate is usually much larger than the reference value, which leads to a significant deviation in the ratio. The exponential term in the latter part of the confidence index expression is a weighting factor in the form of a Gaussian function, used to measure how close the horizontal skewness of the region to be judged is to the preset (expected) skewness. The preset skewness can, for example, be set to 0, or to the typical skewness of impurities determined from historical data.
[0069] The greater the difference between the horizontal skewness of the region and the preset skewness, the closer the value of the exponential function is to 0. The final confidence index is a comprehensive index that integrates frequency-domain features and spatial-domain features. The higher the value of the confidence index, the higher the credibility that the region is a true chromatographic peak.
[0070] A single feature may carry the risk of misjudgment. For example, some broadband noise may pass the energy decay test, but it does not conform to the chromatographic peak characteristics in terms of morphology. Through this non-linear weighted fusion, a multi-dimensional feature space is constructed, which can evaluate the region of interest. By setting a threshold, low-confidence false signals can be automatically eliminated in batches, while retaining the true impurity peaks.
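The fusion and threshold screening described above can be sketched as follows. This is only an illustration: the exact form of the ratio term is not reproduced in the text, so the sketch assumes a symmetric closeness ratio `min(R, R_ref) / max(R, R_ref)` that equals 1 when the two decay rates match, and it assumes the Gaussian factor `exp(-(S - S0)^2 / (2 * sigma^2))`. The names `confidence_index` and `filter_regions`, the dictionary keys, and the numeric values are hypothetical.

```python
import math


def confidence_index(decay_rate, ref_decay_rate, skewness,
                     preset_skewness=0.0, tolerance_var=1.0):
    """Fuse an energy feature and a shape feature into one score in [0, 1].

    ASSUMPTION: the ratio term is min/max of the two decay rates, and the
    Gaussian term uses 2 * tolerance_var in its denominator. Neither form
    is confirmed by the source text.
    """
    low, high = sorted((decay_rate, ref_decay_rate))
    ratio_term = low / high if high > 0 else 0.0
    gauss_term = math.exp(-(skewness - preset_skewness) ** 2
                          / (2.0 * tolerance_var))
    return ratio_term * gauss_term


def filter_regions(regions, ref_decay_rate, threshold, **kwargs):
    """Keep regions whose confidence index is not below the threshold."""
    return [r for r in regions
            if confidence_index(r["decay_rate"], ref_decay_rate,
                                r["skewness"], **kwargs) >= threshold]


# Hypothetical regions: one peak-like, one noise-like.
regions = [
    {"decay_rate": 1.1, "skewness": 0.2},
    {"decay_rate": 10.0, "skewness": 0.0},
]
print(filter_regions(regions, ref_decay_rate=1.0, threshold=0.5))
```

Because the two factors are multiplied, a region must pass both the energy decay comparison and the skewness comparison to be retained, which matches the stated aim of not relying on a single feature.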
[0071] In one embodiment, determining the purity of a chemical intermediate sample based on the integral area of the one-dimensional discrete time series corresponding to the retained region of interest includes: obtaining the integral area of the main peak; determining the integral area of each of the other peaks in the retained region of interest on the one-dimensional discrete time series; taking the sum of the integral areas of all other peaks as the total impurity area, determining the sum of the integral area of the main peak and the total impurity area, and taking the ratio of the integral area of the main peak to the sum as the purity.
[0072] After the above steps of screening, the regions of interest that remain correspond to the actual chromatographic peaks. Based on the left and right boundaries of these regions on the image time axis, we return to the original or baseline-corrected one-dimensional discrete time series and perform definite integral calculations within the corresponding start and end times to obtain the peak area of each peak.
[0073] For example, if the area of the main peak is $A_{\text{main}}$ and the area of the $i$-th impurity peak is $A_i$, the purity $P$ can be calculated using the area normalization method, i.e., $P = \dfrac{A_{\text{main}}}{A_{\text{main}} + \sum_i A_i}$.
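The integration and area normalization steps can be sketched as follows. This is a minimal illustration: trapezoidal integration is an assumed choice of numerical integral, and the function names `peak_area` and `purity` as well as the sample values are hypothetical.

```python
def peak_area(times, signal, t_start, t_end):
    """Trapezoidal integral of the signal over samples in [t_start, t_end]."""
    idx = [i for i, t in enumerate(times) if t_start <= t <= t_end]
    area = 0.0
    for i, j in zip(idx, idx[1:]):
        area += 0.5 * (signal[i] + signal[j]) * (times[j] - times[i])
    return area


def purity(main_area, impurity_areas):
    """Area normalization: main peak area over the total retained peak area."""
    total = main_area + sum(impurity_areas)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return main_area / total


# Hypothetical baseline-corrected series with a single triangular peak.
times = [0.0, 1.0, 2.0]
signal = [0.0, 2.0, 0.0]
print(peak_area(times, signal, 0.0, 2.0))  # area of the triangle

# Hypothetical areas: one main peak and two retained impurity peaks.
print(purity(90.0, [6.0, 4.0]))
```

The start and end times passed to `peak_area` would come from the left and right boundaries of each retained region of interest on the image time axis.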
[0074] Because the watershed algorithm provides accurate peak-valley segmentation points, it avoids the area allocation error that the traditional vertical drop method incurs when dealing with overlapping peaks. The confidence index eliminates a large number of noise signals that would otherwise be mistakenly integrated as impurities, while the method also recovers tiny impurity signals that were originally masked by baseline noise or swallowed by the main peak. As a result, the final calculated purity value is closer to the true value of the sample, providing accurate data support for quality control in chemical production.
[0075] For example, the data segment contains 8.2, 3.5 and 9.1, which represent the integral areas of different impurities. If noise is mistakenly included, the purity will be inaccurate, which may cause qualified products to be rejected as unqualified products. Conversely, if overlapping impurities are missed, the actual unqualified products may be shipped as qualified products. This application effectively avoids these two types of errors and improves the robustness of the test results.
[0076] Figure 3 is a schematic diagram of the one-dimensional discrete time series corresponding to the regions of interest. As shown in Figure 3, the technical solution provided by the embodiments of this application can identify possible interference signals, such as ghost peak signals or noise signals, and can effectively identify actual impurity signals and the main peak signal, so that the purity of chemical intermediates is detected without interference from these signals and more accurate purity detection results are obtained.
[0077] Figure 4 is a schematic diagram illustrating the structure of a chemical intermediate purity detection system 1000 based on chromatographic features, according to an exemplary embodiment. Referring to Figure 4, the chemical intermediate purity detection system 1000 based on chromatographic features includes a processor 1100 and a memory 1200. The memory 1200 stores computer program instructions which, when executed by the processor 1100, implement all or part of the steps of the chemical intermediate purity detection method based on chromatographic features in this application.
[0078] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein. The specification and embodiments are to be considered exemplary only.
[0079] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope.
Claims
1. A method for detecting the purity of chemical intermediates based on chromatographic characteristics, characterized in that it comprises: obtaining a chromatographic detection signal of a chemical intermediate sample, and performing baseline correction on the chromatographic detection signal to obtain a one-dimensional discrete time series; mapping the one-dimensional discrete time series into a two-dimensional time-frequency coefficient matrix using continuous wavelet transform; converting the two-dimensional time-frequency coefficient matrix into a grayscale chromatographic image, calculating the morphological gradient map of the grayscale chromatographic image, and segmenting the morphological gradient map using the watershed segmentation algorithm to obtain multiple regions of interest; determining the scale energy decay rate $R_k$ and the horizontal skewness $S_k$ of each region of interest, wherein, for the scale energy decay rate, $\Omega_k$ is the set of pixels of the $k$-th region of interest; $(b, a)$ are the coordinates of a pixel in the grayscale chromatographic image, $b$ being the time displacement parameter and $a$ being the scale parameter; $G(b, a)$ is the grayscale value of the pixel; $a_T$ is the preset scale threshold; and $\varepsilon$ is a positive number that prevents the denominator from being zero; and wherein, for the horizontal skewness, $\mu_{pq}$ is the $(p+q)$-th order central moment of the region of interest; $\mu_{00}$ is the zeroth moment of the region of interest; $\mu_{20}$ is the second-order central moment of the region of interest; $\mu_{30}$ is the third-order central moment of the region of interest; and the $(p+q)$-th order central moment is determined as the grayscale-weighted sum, over all pixels within the region of interest, of the products of each pixel's coordinate offsets from the centroid, raised to the powers $p$ and $q$ respectively; the scale energy decay rate being determined by the ratio of the sum of pixel energy in the region of interest within the range less than the preset scale threshold to the sum of pixel energy within the range greater than or equal to the preset scale threshold;
determining the confidence index $C_k$ of each region of interest based on the scale energy decay rate and the horizontal skewness, wherein $R_{\text{ref}}$ is the reference scale energy decay rate, taken as the scale energy decay rate of the image region corresponding to the main peak; $S_0$ is the preset skewness; $\sigma^2$ is the preset tolerance variance; and $\exp$ is the natural exponential function; and removing regions of interest whose confidence index is lower than a preset threshold, and determining the purity of the chemical intermediate sample based on the integral area of the one-dimensional discrete time series corresponding to the retained regions of interest.
2. The method for detecting the purity of chemical intermediates based on chromatographic characteristics according to claim 1, characterized in that mapping the one-dimensional discrete time series into a two-dimensional time-frequency coefficient matrix using continuous wavelet transform comprises: selecting the Mexican hat wavelet as the mother wavelet function, and performing a convolution operation on the one-dimensional discrete time series within a preset scale range; and determining the coefficient value at each coordinate point of the two-dimensional time-frequency coefficient matrix by calculating, over the entire time domain, the integral of the product of the one-dimensional discrete time series and the complex conjugate of the mother wavelet function after time shifting and scaling.
3. The method for detecting the purity of chemical intermediates based on chromatographic characteristics according to claim 1, characterized in that converting the two-dimensional time-frequency coefficient matrix into a grayscale chromatographic image comprises: obtaining the minimum and the maximum of the absolute values of the coefficients in the two-dimensional time-frequency coefficient matrix; for each coefficient in the two-dimensional time-frequency coefficient matrix, determining a first difference between its absolute value and the minimum, determining a second difference between the maximum and the minimum, and taking the ratio of the first difference to the second difference as the normalization coefficient; and multiplying the normalization coefficient by the preset gray-level upper limit and rounding down to obtain the grayscale value of the corresponding pixel in the grayscale chromatographic image.
4. The method for detecting the purity of chemical intermediates based on chromatographic characteristics according to claim 1, characterized in that calculating the morphological gradient map of the grayscale chromatographic image comprises: performing a dilation operation on the grayscale chromatographic image using a preset structuring element to obtain a dilated image; performing an erosion operation on the grayscale chromatographic image using the structuring element to obtain an eroded image; and subtracting the grayscale value of the corresponding pixel in the eroded image from the grayscale value of each pixel in the dilated image to obtain the gradient value of the corresponding point in the morphological gradient map.
5. The method for detecting the purity of chemical intermediates based on chromatographic characteristics according to claim 1, characterized in that, after the grayscale chromatographic image is generated, the method further comprises: acquiring a standard fingerprint image and determining the reference centroid coordinates of the main peak in the standard fingerprint image; calculating the image centroid coordinates of the main peak in the current grayscale chromatographic image; and calculating the time-axis translation based on the reference centroid coordinates and the image centroid coordinates, and aligning the current grayscale chromatographic image by a rigid transformation according to the time-axis translation.
6. The method for detecting the purity of chemical intermediates based on chromatographic characteristics according to claim 1, characterized in that determining the purity of the chemical intermediate sample based on the integral area of the one-dimensional discrete time series corresponding to the retained regions of interest comprises: obtaining the integral area of the main peak; for each peak in the retained regions of interest other than the main peak, determining the integral area of that peak on the one-dimensional discrete time series; taking the sum of the integral areas of all other peaks as the total impurity area; and determining the sum of the integral area of the main peak and the total impurity area, and taking the ratio of the integral area of the main peak to that sum as the purity.
7. A purity detection system for chemical intermediates based on chromatographic characteristics, characterized in that it comprises: a processor and a memory, the memory storing computer program instructions that, when executed by the processor, implement the method for detecting the purity of chemical intermediates based on chromatographic characteristics according to any one of claims 1-6.
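As a non-limiting illustration of the image construction steps recited in claims 3 and 4, the grayscale conversion and the morphological gradient can be sketched as follows. The sketch assumes a square structuring element of side `2 * radius + 1`, a gray-level upper limit of 255, and border handling by clipping the window to the image; these choices, the function names `to_grayscale` and `morphological_gradient`, and the sample matrix are hypothetical and are not specified by the claims.

```python
import math


def to_grayscale(coeffs, gray_max=255):
    """Min-max normalize |coefficient| and round down to a gray level."""
    abs_vals = [[abs(c) for c in row] for row in coeffs]
    low = min(min(row) for row in abs_vals)
    high = max(max(row) for row in abs_vals)
    span = high - low
    if span == 0:
        # Constant matrix: no contrast to encode.
        return [[0 for _ in row] for row in abs_vals]
    return [[math.floor((v - low) / span * gray_max) for v in row]
            for row in abs_vals]


def morphological_gradient(img, radius=1):
    """Dilation minus erosion with a square structuring element.

    ASSUMPTION: the window is clipped at the image border.
    """
    rows, cols = len(img), len(img[0])
    grad = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [img[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            # Dilation is the window maximum, erosion the window minimum.
            grad[r][c] = max(window) - min(window)
    return grad


# Hypothetical 2x2 coefficient matrix.
gray = to_grayscale([[0.0, -1.0], [0.5, 2.0]])
print(gray)
print(morphological_gradient(gray))
```

The gradient is large where the gray level changes sharply, which is why it is a suitable input for watershed segmentation of peak boundaries.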