Machine learning-based Raman spectroscopy method for urinary iodine concentration detection
By combining multi-scale wavelet decomposition and the characteristic vibrational bands of urinary iodine molecules, a fused feature vector is constructed, which solves the problem of insufficient spectral feature extraction in Raman spectroscopy for urinary iodine concentration detection, and achieves high accuracy and stability in urinary iodine concentration detection.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- SHENMIN (SHANGHAI) BIOTECHNOLOGY CO LTD
- Filing Date
- 2026-04-02
- Publication Date
- 2026-06-12
Figure CN122201728A_ABST
Description
Technical Field
[0001] This invention relates to the field of Raman spectroscopy for detecting urinary iodine concentration, and specifically to a machine learning-based Raman spectroscopy method for detecting urinary iodine concentration.
Background Technology
[0002] Urinary iodine concentration is an important indicator for assessing iodine nutritional status and thyroid function in the human body. Commonly used clinical detection methods include arsenic-cerium catalytic spectrophotometry and inductively coupled plasma mass spectrometry. These methods generally suffer from problems such as complex detection procedures, cumbersome sample pretreatment steps, long detection times, and the need for specialized experimental equipment. In contrast, Raman spectroscopy has become an alternative for urinary iodine detection due to its advantages of being rapid, non-destructive, and requiring no complex pretreatment.
[0003] Among related technologies, machine learning has shown great potential in the field of spectral data analysis. Typical methods include feature dimensionality reduction based on principal component analysis (PCA), linear modeling based on partial least squares regression (PLS), and traditional algorithms such as support vector machines (SVM). These methods typically model raw spectra or data after simple preprocessing (such as baseline correction), and their core assumption is that there is a stable and resolvable mathematical relationship between spectral features and target concentrations.
[0004] However, traditional machine learning methods still have significant limitations when dealing with complex biological samples such as urine: 1) Relying solely on pre-defined single spectral features (such as peak intensity) or simple statistics (such as the mean and variance of the entire spectrum) makes it difficult to extract hierarchical feature information from the spectrum. 2) Treating the spectrum purely as a mathematical signal leaves no robust mechanism for handling the compositional differences between urine samples.
Summary of the Invention
[0005] (I) Technical problems to be solved: To address the shortcomings of existing technologies, this invention provides a Raman spectroscopy method for detecting urinary iodine concentration based on machine learning. This method solves the technical problem that related technologies rely solely on single spectral features or simple statistics and fail to distinguish the interference characteristics of different components in urine samples.
[0006] (II) Technical solution: To achieve the above objectives, the present invention provides the following technical solution. A machine learning-based Raman spectroscopy method for detecting urinary iodine concentration includes: acquiring and preprocessing the raw Raman spectra of urine samples to obtain standardized spectral data; performing multi-scale wavelet decomposition on the standardized spectral data to extract energy features, mean features, and standard deviation features at different decomposition levels, thereby forming multi-scale statistical features; extracting, based on the characteristic vibrational bands of urinary iodine molecules, physical prior features of the corresponding wavenumber intervals, including the average intensity and maximum intensity of the peak region; concatenating the multi-scale statistical features and the physical prior features to construct a fused feature vector; and using the fused feature vector as input to a pre-trained machine learning model to obtain a predicted value of urinary iodine concentration.
[0007] Preferably, the preprocessing is one or any combination of the following operations: baseline correction, smoothing and denoising, wavenumber axis alignment, and normalization.
[0008] Preferably, the step of performing multi-scale wavelet decomposition on the standardized spectral data to extract energy features, mean features, and standard deviation features at different decomposition levels, thereby constructing multi-scale statistical features, includes: performing multi-level decomposition using Daubechies wavelet basis functions to obtain the high-frequency detail coefficients and low-frequency approximation coefficients of each level; extracting the corresponding energy feature, first mean feature, and first standard deviation feature from the high-frequency detail coefficients of each level; extracting the second mean feature and the second standard deviation feature from the low-frequency approximation coefficients of the final level; and combining the energy features, first mean features, and first standard deviation features of the high-frequency detail coefficients of each level with the second mean feature and second standard deviation feature of the final level to form multi-dimensional, multi-scale statistical features.
[0009] Preferably, the number of decomposition levels is five.
[0010] Preferably, the characteristic vibrational bands of the urinary iodine molecules are one or any combination of the following characteristic bands: [not specified].
[0011] Preferably, the machine learning model is a machine learning regression model, which is a support vector regression model, a random forest regression model, or a neural network regression model.
[0012] A machine learning-based Raman spectroscopy system for detecting urinary iodine concentration includes: a data acquisition and preprocessing module, used to acquire and preprocess the raw Raman spectra of urine samples to obtain standardized spectral data; a multi-scale statistical feature extraction module, used to perform multi-scale wavelet decomposition on the standardized spectral data and extract energy features, mean features, and standard deviation features at different decomposition levels to form multi-scale statistical features; a physical feature extraction module, used to extract the physical prior features of the corresponding wavenumber intervals based on the characteristic vibrational bands of urinary iodine molecules, including the average intensity and maximum intensity of the peak region; a feature fusion module, used to concatenate the multi-scale statistical features and the physical prior features to construct a fused feature vector; and a concentration prediction module, used to take the fused feature vector as input to a pre-trained machine learning model to obtain a predicted value of urinary iodine concentration.
[0013] A computer program product comprising computer program code, which, when run on a computer, causes the computer to implement the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as described above.
[0014] A storage medium storing a computer program, wherein the computer program causes a computer to perform the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as described above.
[0015] An electronic device, comprising a processor and a memory, wherein the memory stores program instructions and the processor is configured to run the program instructions to execute the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as described above.
[0016] (III) Beneficial effects: This invention provides a machine learning-based Raman spectroscopy method for detecting urinary iodine concentration. Compared with existing technologies, it has the following advantages. Standardized spectral data is first acquired. Multi-scale wavelet decomposition is then performed on the standardized spectral data to extract energy features, mean features, and standard deviation features at different decomposition levels, forming multi-scale statistical features. Based on the characteristic vibrational bands of urinary iodine molecules, physical prior features of the corresponding wavenumber intervals are extracted, including the average intensity and maximum intensity of the peak region. Next, the multi-scale statistical features and physical prior features are fused to construct a fused feature vector. Finally, the fused feature vector is used as input to a pre-trained machine learning model to obtain the predicted value of urinary iodine concentration. Because the fused feature vector combines multi-scale statistical information from the wavelet decomposition with the physical prior features of the characteristic vibrational bands of urinary iodine, it has stronger representational capability, which improves the accuracy and stability of urinary iodine concentration detection using machine learning techniques.
Attached Figure Description
[0017] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0018] Figure 1 is a flowchart of a machine learning-based Raman spectroscopy method for detecting urinary iodine concentration provided by an embodiment of the present invention; Figure 2 is a complete flowchart of the training and testing of the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration provided by an embodiment of the present invention.
[0019] Figure 3 is a schematic diagram of the structure of an electronic device provided by an embodiment of the present invention.
[0020] Component labeling explanation: 100 - electronic device, 101 - memory, 102 - processor, 103 - display.
Detailed Implementation
[0021] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0022] While Raman spectroscopy is renowned for its non-destructive testing, rapid response, and suitability for small sample volumes, it still faces three major technical bottlenecks in practical applications: Raman spectral signals are susceptible to instrument noise, sample matrix variations, and environmental factors, leading to baseline drift, noise interference, and peak shift. Furthermore, current machine learning-based Raman spectroscopy methods often employ single spectral features or simple statistical features for modeling, failing to fully extract structural information at different scales within the spectrum. They also do not fully utilize the physical characteristics of urinary iodine-related molecular vibrations (such as the characteristic vibrational frequency range of urinary iodine molecules and their frequency shift differences from interfering components), thus affecting the stability and accuracy of urinary iodine concentration prediction models.
[0023] To address the aforementioned issues, this invention provides a machine learning-based Raman spectroscopy method for detecting urinary iodine concentration. This method extracts multi-scale statistical features from the Raman spectrum through multi-scale wavelet decomposition and combines this with the extraction of physical prior features from the characteristic vibrational bands of urinary iodine. A feature representation integrating multi-scale spectral structural information and characteristic peak information is then constructed. Finally, a machine learning regression model is used to establish a urinary iodine concentration prediction model, enabling rapid and accurate detection of urinary iodine concentration.
[0024] To better understand the above technical solutions, they are explained in detail below in conjunction with the accompanying drawings and specific implementations.
Example 1
[0025] As shown in Figure 1, this embodiment of the invention provides a machine learning-based Raman spectroscopy method for detecting urinary iodine concentration, comprising: S1. Obtain and preprocess the raw Raman spectra of the urine sample to obtain standardized spectral data; S2. Perform multi-scale wavelet decomposition on the standardized spectral data to extract energy features, mean features, and standard deviation features at different decomposition levels, thus forming multi-scale statistical features; S3. Based on the characteristic vibrational bands of urinary iodine molecules, extract the physical prior features of the corresponding wavenumber intervals, including the average intensity and maximum intensity of the peak region; S4. Concatenate the multi-scale statistical features with the physical prior features to construct a fused feature vector; S5. Use the fused feature vector as input to the pre-trained machine learning model to obtain the predicted value of urinary iodine concentration.
[0026] Compared with related technologies, the embodiments of the present invention can effectively improve the ability to express spectral features, enhance the model's ability to identify urinary iodine-related spectral information, thereby improving the accuracy and stability of urinary iodine concentration prediction, and providing a new technical means for rapid detection of urinary iodine levels.
[0027] It should be noted that the above technical solution describes the inference stage of the Raman spectroscopy method for detecting urinary iodine concentration, i.e., the actual application stage after the pre-trained model is deployed. The machine learning model has already undergone parameter optimization during the training stage and does not require updating during inference; spectral preprocessing and feature extraction both employ fixed computational processes to ensure real-time performance.
[0028] To fully illustrate the technical solutions of the embodiments of the present invention, the following begins with the model training and testing phases. As shown in Figure 2, four steps are described in detail: data acquisition and preprocessing, feature extraction, model training, and model testing. Together they explain how the accuracy of urinary iodine concentration prediction in the inference stage is improved through a dual optimization strategy that combines a data-driven component (multi-scale wavelet decomposition) with a physical prior (features based on the vibrational bands of urinary iodine).
[0029] In the data acquisition and preprocessing steps, the raw Raman spectra of the urine samples are collected and preprocessed to obtain standardized spectral data.
[0030] In an optional implementation, the preprocessing includes baseline correction, smoothing and denoising, wavenumber axis alignment, and normalization.
[0031] For example, raw Raman spectra of 200 urine samples were collected. The true urinary iodine concentration of each sample was determined by chemical analysis and ranged from 50 to 600 μg/L. The wavenumber range of the Raman spectra was [not specified] and the resolution was [not specified] (each spectral sample contains 701 wavenumber points).
[0032] Furthermore, the acquired raw Raman spectra within the target wavenumber range were preprocessed using the following parameters: 1) Baseline correction: baseline correction is performed using the asymmetric least squares method, where the smoothing parameter λ is set to [not specified] and the penalty factor p is set to 0.01.
[0033] 2) Smoothing and denoising: The Savitzky-Golay filtering method is used for smoothing, with the window width set to 11 sampling points and the polynomial order set to 3.
[0034] 3) Spectral alignment: A common wavenumber axis is created, spanning [not specified], and the Raman spectral data are aligned to this common axis.
[0035] 4) Normalization: Apply maximum value normalization to each spectrum to map the spectral intensity to the [0,1] interval.
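The four preprocessing operations above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the value `lam=1e5`, the placeholder axis, the synthetic spectrum, the edge padding in the smoothing filter, and the clipping of negative residues before normalization are all assumptions, since the source text does not give the smoothing parameter λ or the wavenumber range.

```python
import numpy as np

def als_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Asymmetric least squares baseline estimate (dense-matrix version)."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)        # second-difference operator, shape (n-2, n)
    penalty = lam * (D.T @ D)                  # smoothness penalty; lam is an ASSUMED value
    w = np.ones(n)
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        w = np.where(y > z, p, 1.0 - p)        # points above the baseline get the small weight p
    return z

def savgol_smooth(y, window=11, order=3):
    """Savitzky-Golay smoothing: local polynomial fit evaluated at the window centre."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    A = np.vander(offsets, order + 1, increasing=True)
    coeffs = np.linalg.pinv(A)[0]              # weights giving the fitted value at offset 0
    padded = np.pad(y, half, mode="edge")      # edge handling is an assumption
    return np.convolve(padded, coeffs[::-1], mode="valid")

def preprocess(wavenumbers, intensity, common_axis, lam=1e5):
    corrected = intensity - als_baseline(intensity, lam=lam, p=0.01)
    smoothed = savgol_smooth(corrected, window=11, order=3)
    aligned = np.interp(common_axis, wavenumbers, smoothed)   # alignment to the common axis
    aligned = np.clip(aligned, 0.0, None)      # assumption: drop negative residues
    peak = aligned.max()
    return aligned / peak if peak > 0 else aligned            # maximum-value normalization

# Synthetic 701-point spectrum on a placeholder axis (NOT the patent's wavenumber range).
axis = np.arange(701.0)
rng = np.random.default_rng(0)
raw = (np.exp(-0.5 * ((axis - 350.0) / 8.0) ** 2)   # one synthetic peak
       + 0.002 * axis + 0.5                         # sloped background
       + 0.01 * rng.standard_normal(axis.size))     # noise
processed = preprocess(axis, raw, common_axis=axis)
```

The window width of 11, polynomial order of 3, and p = 0.01 are taken from the text; everything else should be tuned on real spectra.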
[0036] In the feature extraction step, multi-scale wavelet decomposition is applied to the preprocessed standardized spectral data to extract multi-scale statistical features, and the physical prior peak intensity features are extracted from the characteristic vibrational bands of urinary iodine. These features are fused into a feature vector that characterizes urinary iodine concentration information.
[0037] The preprocessed standardized spectral data is divided into training and testing sets according to a preset ratio. First, multi-scale statistical features are extracted based on the spectral samples of the training set. In an optional implementation, the standardized spectral data undergoes multi-scale wavelet decomposition to extract energy features, mean features, and standard deviation features at different decomposition levels, constituting multi-scale statistical features, including: S10. Multi-level decomposition is performed using Daubechies wavelet basis functions to obtain high-frequency detail coefficients and low-frequency approximation coefficients for each level.
[0038] Specifically, let a spectral sample be denoted as $x = [x_1, x_2, \dots, x_N]$. The high-frequency detail coefficients and low-frequency approximation coefficients of each level can be expressed as: $\{d_1, d_2, \dots, d_J, a_J\} = \mathrm{DWT}(x)$; where $x_n$ represents the intensity value of spectral sample $x$ at the $n$-th wavenumber sampling point; $N$ indicates the number of wavenumber sampling points; the number of wavelet decomposition levels is denoted as $J$; $d_j$ is the high-frequency detail coefficients of the $j$-th level ($j = 1, \dots, J$); and $a_J$ is the low-frequency approximation coefficients of the $J$-th level.
[0039] S20. Extract the corresponding energy feature, first mean feature, and first standard deviation feature from the high-frequency detail coefficients of each level, which can be expressed as follows: (1) Energy feature: $E_j = \sum_{k=1}^{N_j} d_j[k]^2$; where $E_j$ indicates the energy feature corresponding to the high-frequency detail coefficients of the $j$-th level, and $k$ indicates the sequence index of the high-frequency detail coefficients $d_j$ after wavelet decomposition. For example, if the level-3 detail coefficients $d_3$ contain 100 discrete values, then $k$ runs from 1 to 100, and the sum of the squared values is the energy $E_3$.
[0040] (2) First mean feature: $\mu_j = \frac{1}{N_j} \sum_{k=1}^{N_j} d_j[k]$; where $\mu_j$ indicates the first mean feature corresponding to the high-frequency detail coefficients of the $j$-th level, and $N_j$ indicates the total number of discrete data points of the high-frequency detail coefficients of the $j$-th level, which decreases as the decomposition level increases. For example, if a spectral sample has 512 points, $N_1$ may be 256 after the first level of decomposition, $N_2$ may be 128 after the second level, and so on.
[0041] (3) First standard deviation feature: $\sigma_j = \sqrt{\frac{1}{N_j} \sum_{k=1}^{N_j} \left(d_j[k] - \mu_j\right)^2}$; where $\sigma_j$ indicates the first standard deviation feature corresponding to the high-frequency detail coefficients of the $j$-th level.
[0042] S30. Extract the second mean feature and the second standard deviation feature from the low-frequency approximation coefficients of the final level: $\mu_a = \frac{1}{N_J} \sum_{k=1}^{N_J} a_J[k]$, $\sigma_a = \sqrt{\frac{1}{N_J} \sum_{k=1}^{N_J} \left(a_J[k] - \mu_a\right)^2}$; where $\mu_a$ and $\sigma_a$ represent the second mean feature and the second standard deviation feature of the low-frequency approximation coefficients of the $J$-th level, respectively.
[0043] S40. The energy features, first mean features, and first standard deviation features corresponding to the high-frequency detail coefficients of each level, together with the second mean feature and second standard deviation feature of the final level, are combined to form multi-dimensional, multi-scale statistical features, expressed as: $F_{ms} = [E_1, \mu_1, \sigma_1, \dots, E_J, \mu_J, \sigma_J, \mu_a, \sigma_a]$; where $F_{ms}$ represents the multi-scale statistical features. Its dimension depends on the number of wavelet decomposition levels and equals $3J + 2$.
[0044] Understandably, in this embodiment of the invention, after performing multi-scale wavelet decomposition on standardized spectral data, statistical features of high-frequency detail coefficients and low-frequency approximation coefficients are extracted hierarchically. Specifically, this includes: extracting energy features, a first mean, and a first standard deviation from the high-frequency detail coefficients at each decomposition level to characterize local signal abrupt changes, noise intensity, and fluctuation characteristics; and extracting a second mean and a second standard deviation from the low-frequency approximation coefficients at the final decomposition level to quantify the overall spectral trend and baseline stability. Through differentiated processing of high and low frequency features, the invention balances the needs of spectral detail resolution and global analysis, providing a more robust multi-scale statistical feature expression for urinary iodine detection, etc.
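Steps S10 to S40 can be sketched as below. The text specifies Daubechies wavelets but not their order, so this sketch assumes the Haar wavelet (db1, the simplest member of the Daubechies family) so that it needs only NumPy; the function names and the rule of repeating the last sample for odd lengths are also assumptions. A higher-order Daubechies wavelet would give different coefficient values but the same feature layout.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar (db1) transform: returns (approximation, detail)."""
    if x.size % 2:                         # odd length: repeat the last sample (assumption)
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def multiscale_features(x, levels=5):
    """Energy, mean, std of each detail level, then mean and std of the final approximation."""
    feats = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        # energy, first mean, first standard deviation (population form, ddof=0)
        feats += [np.sum(detail ** 2), detail.mean(), detail.std()]
    # second mean and second standard deviation from the last approximation
    feats += [approx.mean(), approx.std()]
    return np.array(feats)

features = multiscale_features(np.ones(701), levels=5)
print(features.shape)
```

With five levels the vector has 3 × 5 + 2 = 17 entries, matching the dimension stated in the text.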
[0045] Continuing with the previous example, the preprocessed standardized spectral data is divided into a training set and a test set in an 8:2 ratio, giving 160 spectra in the training set and 40 spectra in the test set. For each spectral sample in the training set, multi-scale statistical features are extracted as follows. The discrete wavelet transform decomposes the signal through level-by-level dyadic downsampling: with each additional level $j$, the frequency resolution decreases accordingly, while the signal scale expands to $2^j$ times that of the original spectrum. For urinary iodine concentration detection, a 5-level wavelet decomposition therefore establishes a clear scale classification (here $d_j$ denotes the level-$j$ detail coefficients and $a_5$ the level-5 approximation coefficients): $d_1$ corresponds to the smallest scale (2 times) and mainly reflects high-frequency noise and extremely narrow spectral peaks; $d_2$ captures higher-frequency information (4 times), including narrow peak structures; $d_3$ characterizes a medium scale (8 times) and best matches typical Raman characteristic peaks; $d_4$ and $d_5$ correspond to a wider (16 times) and a still wider (32 times) scale, respectively, reflecting broad peaks and gradually changing structures; and $a_5$ preserves the lowest-frequency background trend. Through scale separation, this configuration distinguishes three types of key information: noise, characteristic peaks, and background.
[0046] Furthermore, the choice of 5-level wavelet decomposition is based on a balance between spectral characteristics and decomposition effect: the Raman peaks of urinary iodine-related molecules are mostly distributed in the mid-to-low frequency range and must be clearly separated from high-frequency noise. Insufficient decomposition (e.g., 3 levels) leaves noise and characteristic peaks overlapping, while excessive decomposition (e.g., 7 levels) may fragment the effective signal. The 17-dimensional multi-scale features extracted from the 5-level decomposition through the S10-S40 process preserve the independence of information at different scales and avoid the feature distortion caused by excessive decomposition, balancing signal-to-noise separation and feature stability.
[0047] Based on the above analysis, a 5-level wavelet decomposition is preferred here, followed by the processing flow of S10~S40 described above, ultimately forming 17-dimensional multi-scale statistical features: $F_{ms} = [E_1, \mu_1, \sigma_1, \dots, E_5, \mu_5, \sigma_5, \mu_a, \sigma_a]$, where $E_j$, $\mu_j$, and $\sigma_j$ are the energy, first mean, and first standard deviation of the level-$j$ detail coefficients, and $\mu_a$ and $\sigma_a$ are the second mean and second standard deviation of the level-5 approximation coefficients. After the multi-scale statistical features are constructed, this embodiment of the invention uses the characteristic vibrational bands of urinary iodine to extract the physical prior peak-region intensity features.
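The 8:2 division of the 200 spectra can be sketched as follows. The text gives only the ratio, so the random shuffling, the fixed seed, and the helper name `split_indices` are assumptions made for illustration.

```python
import numpy as np

def split_indices(n_samples, train_ratio=0.8, seed=0):
    """Return shuffled (train, test) index arrays for a given train ratio."""
    rng = np.random.default_rng(seed)          # fixed seed so the split is reproducible
    order = rng.permutation(n_samples)
    n_train = int(round(n_samples * train_ratio))
    return order[:n_train], order[n_train:]

train_idx, test_idx = split_indices(200)
print(len(train_idx), len(test_idx))
```

Indexing the spectra and their concentration labels with the same index arrays keeps each spectrum paired with its label.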
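To make the trade-off between 3, 5, and 7 levels concrete, the sketch below tabulates, for a 701-point spectrum, the scale factor and the number of coefficients left at each level, plus the resulting feature dimension 3J + 2. It assumes Haar-style halving with the last sample repeated for odd lengths; longer Daubechies filters leave slightly more coefficients per level. The helper name `level_summary` is hypothetical.

```python
def level_summary(n_points, levels):
    """Return [(level, scale factor, coefficient count), ...] and the feature dimension."""
    rows = []
    n = n_points
    for j in range(1, levels + 1):
        n = (n + 1) // 2               # halve the length, rounding up
        rows.append((j, 2 ** j, n))
    return rows, 3 * levels + 2

for levels in (3, 5, 7):
    rows, dim = level_summary(701, levels)
    print(levels, "levels:", dim, "features; coarsest level keeps", rows[-1][2], "coefficients")
```

At seven levels only a handful of coefficients remain at the coarsest scale, so their mean and standard deviation are estimated from very few values, which is one way to read the warning about excessive decomposition.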
[0048] In an optional implementation, multiple feature band sets are constructed, and physical prior features of the corresponding wavenumber intervals are extracted, including the average intensity of the peak region and the maximum intensity of the peak region.
[0049] Specifically, define a set of $K$ characteristic wavenumber intervals: $B = \{B_1, B_2, \dots, B_K\}$, $B_i = [v_i^{L}, v_i^{U}]$; where $B_i$ indicates the $i$-th characteristic band, and $v_i^{L}$ and $v_i^{U}$ represent the lower limit and upper limit of the wavenumber sampling points of this characteristic band, respectively.
[0050] For each of the above characteristic bands, spectral intensity features are extracted, specifically including: (1) Average intensity of the peak region: $\bar{I}_i = \frac{1}{M_i} \sum_{v \in B_i} x(v)$; where $\bar{I}_i$ indicates the average intensity of the peak region of the $i$-th characteristic band; $M_i$ indicates the number of wavenumber sampling points in the $i$-th characteristic band; $v$ indicates any wavenumber sampling point in the $i$-th characteristic band $B_i$; and $x(v)$ is the intensity value at $v$.
[0051] (2) Maximum intensity of the peak region: $I_i^{\max} = \max_{v \in B_i} x(v)$; where $I_i^{\max}$ indicates the maximum intensity of the peak region of the $i$-th characteristic band, and $\max$ is the maximization function.
[0052] Obtain the physical prior features: $F_{phy} = [\bar{I}_1, I_1^{\max}, \dots, \bar{I}_K, I_K^{\max}]$; where the dimension of the physical prior features $F_{phy}$ depends on the number of characteristic bands and equals $2K$.
[0053] Understandably, the embodiments of this invention use the average intensity and maximum intensity of the peak region as physical prior features, balancing noise robustness, feature discrimination, and computational efficiency: the average intensity reduces the variance of random noise through statistical averaging, retains stable spectral characteristics, and is suitable for quantifying concentration correlations; the maximum intensity is sensitive to local signal abrupt changes and needs to be combined with baseline correction to eliminate interference, and its amplitude can reflect the relative intensity of chemical bond vibrations (under consistent testing conditions). The combination of these two features enhances the ability to distinguish different samples, maintains computational efficiency through simple statistics, and provides clear physical interpretation, making it both practical and theoretically interpretable in spectral analysis, especially suitable for applications such as urinary iodine detection that require rapid and accurate identification of characteristic peaks.
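The extraction of the average and maximum intensity per characteristic band can be sketched as below. The band limits in `PLACEHOLDER_BANDS` and the 0 to 700 axis are purely hypothetical values chosen for the demonstration; the source text does not give the actual wavenumber intervals, and these numbers should not be read as the patent's bands.

```python
import numpy as np

# HYPOTHETICAL band limits on a placeholder axis; not the bands used in the patent.
PLACEHOLDER_BANDS = [(100.0, 150.0), (300.0, 350.0), (500.0, 550.0)]

def band_features(wavenumbers, intensity, bands):
    """Return [mean_1, max_1, ..., mean_K, max_K] over the given wavenumber intervals."""
    feats = []
    for low, high in bands:
        mask = (wavenumbers >= low) & (wavenumbers <= high)
        if not mask.any():
            raise ValueError(f"no sampling points inside band [{low}, {high}]")
        region = intensity[mask]
        feats += [region.mean(), region.max()]   # average and maximum of the peak region
    return np.array(feats)

axis = np.arange(701.0)                 # placeholder wavenumber axis
spectrum = axis / 700.0                 # synthetic ramp, only to make the numbers checkable
f_phy = band_features(axis, spectrum, PLACEHOLDER_BANDS)
print(f_phy.shape)
```

Three bands yield 2 × 3 = 6 values. Using an interval rather than a single wavenumber point is what gives the features some tolerance to the peak broadening and shifts mentioned in the text.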
[0054] Continuing with the example above, Raman spectral characteristic peaks essentially reflect the vibrational modes of molecular chemical bonds, and different substances exhibit stable characteristic responses within specific wavenumber ranges. Urinary iodine exists in samples in various chemical forms (such as iodides and iodates), and its molecular vibrational information is mainly concentrated in three key wavebands: [not specified]. Because complex components in the urine matrix (such as urea and salts) can cause peak broadening or shift, traditional feature extraction at a single wavenumber point is easily disturbed. By analyzing the spectral data of the training set and verifying it against vibrational mode theory, it was confirmed that these three bands respond significantly to changes in urinary iodine concentration and cover the vibrational characteristics of the main chemical forms.
[0055] Based on the above analysis, these three characteristic bands are preferably selected here, that is, $K = 3$ and $B = \{B_1, B_2, B_3\}$ [band limits not specified]. After the above processing steps are executed, the final physical prior features have 6 dimensions: $F_{phy} = [\bar{I}_1, I_1^{\max}, \bar{I}_2, I_2^{\max}, \bar{I}_3, I_3^{\max}]$. Finally, the multi-scale statistical features and the physical prior features are concatenated to construct a fused feature vector $F = [F_{ms}, F_{phy}]$, whose dimension is $(3J + 2) + 2K$.
[0056] Continuing with the example above, the dimension of the final fused feature vector $F$ is 17 + 6 = 23.
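The concatenation step and the dimension arithmetic can be checked with a short sketch. The zero-filled arrays are placeholders that only carry the shapes, and the function name `fuse` is an assumption.

```python
import numpy as np

def fuse(f_ms, f_phy):
    """Concatenate multi-scale statistical features and physical prior features."""
    return np.concatenate([np.ravel(f_ms), np.ravel(f_phy)])

f_ms = np.zeros(17)        # placeholder for the 17 multi-scale statistical features
f_phy = np.ones(6)         # placeholder for the 6 physical prior features
fused = fuse(f_ms, f_phy)

# Stacking one fused vector per training spectrum gives the model's design matrix.
design_matrix = np.vstack([fused for _ in range(160)])
print(fused.shape, design_matrix.shape)
```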
[0057] It should be noted that the feature extraction step in this embodiment satisfies the following characteristics in its technical implementation: 1) Order independence: The multi-scale statistical features $F_{ms}$ and the physical prior features $F_{phy}$ can be extracted in any order, with no fixed requirement. $F_{ms}$ can be extracted first and $F_{phy}$ second (as in the preceding description); $F_{phy}$ can be extracted first and $F_{ms}$ second; or the two can be extracted simultaneously in parallel.
[0058] In particular, the extraction of the physical prior features $F_{phy}$ is not limited by the data acquisition and preprocessing steps, and can be applied to directly acquired raw spectral data, preprocessed standardized spectral data, or other derived data.
[0059] 2) Technical equivalence: Regardless of the processing order used, the final fused feature vector $F$ is consistent in its mathematical representation (the extraction order of the multi-scale statistical features and the physical prior features does not affect the technical effect of the final fusion result); for example, its feature dimension is always 23 in the example above.
[0060] 3) Execution method description: In practical applications, those skilled in the art can select the optimal implementation based on hardware resources, such as sequential execution, which suits serial processors (such as MCUs) and saves memory overhead, or parallel execution, which suits GPU / FPGA acceleration scenarios and improves real-time performance.
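The order-independence property can be illustrated with a small sketch that runs two extractors in either order, or concurrently, and fuses the results in a fixed layout. The two extractor functions are simple stand-ins (hypothetical, not the patent's feature definitions), and a thread pool is used only as one convenient way to express parallel execution on a general-purpose CPU.

```python
from concurrent.futures import ThreadPoolExecutor

def ms_stub(spectrum):
    """Stand-in for the multi-scale extractor: mean and range of the spectrum."""
    return [sum(spectrum) / len(spectrum), max(spectrum) - min(spectrum)]

def phy_stub(spectrum):
    """Stand-in for the band extractor: mean and max of a fixed slice."""
    region = spectrum[2:5]
    return [sum(region) / len(region), max(region)]

def extract_ms_first(spectrum):
    ms = ms_stub(spectrum)
    phy = phy_stub(spectrum)
    return ms + phy                      # fused layout is always [ms, phy]

def extract_phy_first(spectrum):
    phy = phy_stub(spectrum)
    ms = ms_stub(spectrum)
    return ms + phy                      # same layout, different extraction order

def extract_parallel(spectrum):
    with ThreadPoolExecutor(max_workers=2) as pool:
        ms_future = pool.submit(ms_stub, spectrum)
        phy_future = pool.submit(phy_stub, spectrum)
        return ms_future.result() + phy_future.result()

sample = [0.1, 0.4, 0.9, 0.3, 0.2, 0.05]
results = [extract_ms_first(sample), extract_phy_first(sample), extract_parallel(sample)]
```

Because both extractors only read the spectrum and the concatenation order is fixed, all three paths return the same vector.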
[0061] In the model training step, the fused feature vector and concentration label are used as input to build a machine learning model to obtain the predicted value of urinary iodine concentration.
[0062] In an optional implementation, the machine learning model is a machine learning regression model. During training, the fused feature vector is used as input, combined with the true urinary iodine concentration label, and training continues until the model converges. The predicted urinary iodine concentration output during the training phase is represented as follows: $\hat{y} = f(F)$; where $F$ represents the fused feature vector corresponding to any spectral sample, and $\hat{y}$ is the urinary iodine concentration predicted by the model $f$ with $F$ as input.
[0063] It should be noted that the aforementioned machine learning regression model can be a support vector regression model (SVR), a random forest regression model (RFR), a neural network regression model (NNR), or other regression models. This embodiment of the invention does not impose a single limitation, and those skilled in the art can choose according to actual needs. Furthermore, before inputting the fused feature vector into the machine learning model, feature selection (such as recursive feature elimination, RFE) can be performed to perform secondary optimization on the fused feature vector; that is, the model input is replaced with the optimized fused feature vector. This eliminates potentially redundant features, further ensuring the model's generalization ability and feature interpretability.
[0064] Continuing with the example above, and specifically using a support vector regression model, the hyperparameters are set as follows: the RBF kernel function is used, the penalty coefficient C is 100, the kernel coefficient is [not specified], the tolerance is 0.01, and [not specified] is set to 0.1. The support vector regression model is trained on the 160 spectral samples until convergence, and the model parameters are saved.
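A training run of this kind might look like the following scikit-learn sketch. Only the RBF kernel and C = 100 are taken from the text; the extracted text does not make clear which symbols the remaining values belong to, so `gamma="scale"` and `epsilon=0.1` below are library defaults used as placeholders. The feature matrix and labels are synthetic stand-ins for the 160 training and 40 test samples.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-ins: 23-dimensional fused features and concentrations in 50-600 ug/L.
X_train = rng.random((160, 23))
y_train = 50.0 + 550.0 * X_train[:, 0]          # toy relationship, for illustration only
X_test = rng.random((40, 23))

model = SVR(
    kernel="rbf",
    C=100.0,            # penalty coefficient from the text
    gamma="scale",      # ASSUMPTION: kernel coefficient not recoverable from the text
    epsilon=0.1,        # ASSUMPTION: scikit-learn default
)
model.fit(X_train, y_train)
pred = model.predict(X_test)
print(pred.shape)
```

In practice the fitted model would be persisted after training so that the inference stage can load fixed parameters, as the text describes.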
[0065] In the model testing step, the fused feature vectors corresponding to the preprocessed test set data are input into the trained machine learning model to obtain the test results.
[0066] In an optional implementation, the predicted urinary iodine concentration output by the model during the testing phase is expressed as $\hat{y}_i = f^{*}(\mathbf{x}_i)$, where $\mathbf{x}_i$ denotes the fused feature vector corresponding to the $i$-th test sample, $f^{*}$ denotes the model after convergence, and $\hat{y}_i$ denotes the urinary iodine concentration predicted with $\mathbf{x}_i$ as input.
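[0067] Furthermore, the root mean square error (RMSE) and the coefficient of determination $R^2$ are preferably used as evaluation indicators: $$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}$$ where the physical meaning of RMSE is to characterize the average deviation between the model's predicted values and the actual values; the smaller the better (RMSE = 0 indicates a perfect prediction). $y_i$ denotes the true urinary iodine concentration of the $i$-th test sample and $\hat{y}_i$ the corresponding predicted value; $N$ denotes the total number of samples in the test set; $\bar{y}$ denotes the mean of the true urinary iodine concentration values in the test set.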
[0068] The physical meaning of $R^2$ is to characterize the model's ability to explain the variance (variation) of the data. $R^2 = 1$ indicates that the model fits the data perfectly. $R^2 = 0$ indicates that the model is equivalent to directly predicting the mean. $R^2 < 0$ indicates that the model performs worse than the baseline mean prediction.
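Both indicators can be computed in a few lines. The sketch below uses the standard definitions, with the mean in the denominator of the coefficient of determination taken over the true concentration values; the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    # Square root of the mean squared difference between truth and prediction.
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    # 1 - (residual sum of squares) / (total sum of squares around the true mean).
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Predicting the mean of the true values for every sample gives a coefficient of determination of exactly 0, which is why it serves as the baseline in the interpretation above.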
[0069] It should be noted that the joint validation logic of the above evaluation indicators in the embodiments of this invention is as follows: 1) RMSE takes priority: it ensures that the prediction error meets the requirements of clinical testing; 2) $R^2$ is auxiliary: it verifies whether the model makes full use of the spectral features.
[0070] Continuing with the example above, the test set containing 40 test samples yields a root mean square error (RMSE) of 18.7 μg / L and a coefficient of determination $R^2$ of 0.92, indicating that the constructed model can effectively predict urinary iodine concentration.
[0071] This concludes the complete workflow of the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration. This invention achieves highly sensitive and robust urinary iodine concentration prediction through a feature construction strategy that combines multi-scale statistical feature extraction with physical priors. During the training phase, the feature engineering combines the physical properties of Raman spectroscopy with multi-level information mining, which significantly improves the model's ability to represent iodine concentration information accurately. A support vector regression model is selected as the prediction model to establish the mapping between Raman spectral features and urinary iodine concentration. During the inference phase, a fixed feature calculation process and pre-trained model parameters significantly shorten the detection time for a single sample while ensuring the accuracy of the urinary iodine concentration detection results.
[0072] Example 2: This invention provides a machine learning-based Raman spectroscopy system for detecting urinary iodine concentration, comprising: a data acquisition and preprocessing module, used to acquire and preprocess the raw Raman spectra of urine samples to obtain standardized spectral data; a multi-scale statistical feature extraction module, used to perform multi-scale wavelet decomposition on the standardized spectral data and extract energy features, mean features and standard deviation features at different decomposition levels to form multi-scale statistical features; a physical feature extraction module, used to extract the physical prior features of the corresponding wavenumber intervals based on the characteristic vibrational bands of urinary iodine molecules, including the average intensity and maximum intensity of the peak region; a feature fusion module, used to concatenate the multi-scale statistical features and the physical prior features to construct a fused feature vector; and a concentration prediction module, used to take the fused feature vector as input to a pre-trained machine learning model to obtain a predicted value of urinary iodine concentration.
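The module structure of such a system can be sketched as a small Python class in which each module is a pluggable callable. The class name, field names and type alias are illustrative assumptions, not part of the described system.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class UrinaryIodineSystem:
    # Each field stands in for one module of the system (hypothetical names).
    preprocess: Callable[[Sequence[float]], Vector]
    multiscale_features: Callable[[Vector], Vector]
    physical_features: Callable[[Vector], Vector]
    model: Callable[[Vector], float]

    def fuse(self, multiscale: Vector, physical: Vector) -> Vector:
        # Feature fusion module: plain concatenation of the two feature groups.
        return list(multiscale) + list(physical)

    def predict(self, raw_spectrum: Sequence[float]) -> float:
        # Preprocess, extract both feature groups, fuse them, then predict.
        standardized = self.preprocess(raw_spectrum)
        fused = self.fuse(self.multiscale_features(standardized),
                          self.physical_features(standardized))
        return self.model(fused)
```

Keeping the modules as injected callables lets the regression model or the feature extractors be swapped without touching the pipeline order.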
[0073] Compared with related technologies, the embodiments of the present invention can effectively improve the ability to express spectral features, enhance the model's ability to identify urinary iodine-related spectral information, thereby improving the accuracy and stability of urinary iodine concentration prediction, and providing a new technical means for rapid detection of urinary iodine levels.
[0074] In an optional implementation, the preprocessing is one or any combination of the following operations: baseline correction, smoothing and denoising, wavenumber axis alignment, and normalization.
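A minimal standard-library sketch of the four operations follows. The specific techniques (a straight-line baseline through the end points, a centered moving average, linear interpolation onto a common grid, and min-max scaling) are assumptions chosen for brevity; the text does not name particular algorithms.

```python
def subtract_linear_baseline(intensities):
    # Remove the straight line joining the first and last points (crude baseline model).
    n = len(intensities)
    first, last = intensities[0], intensities[-1]
    return [y - (first + (last - first) * i / (n - 1))
            for i, y in enumerate(intensities)]


def moving_average(intensities, window=3):
    # Centered moving average; the window shrinks at the two edges.
    half = window // 2
    out = []
    for i in range(len(intensities)):
        lo, hi = max(0, i - half), min(len(intensities), i + half + 1)
        out.append(sum(intensities[lo:hi]) / (hi - lo))
    return out


def interpolate_to_grid(wavenumbers, intensities, grid):
    # Linear interpolation onto a common axis; both axes must be ascending and
    # the grid must lie within the measured wavenumber range.
    out, j = [], 0
    for g in grid:
        while j < len(wavenumbers) - 2 and wavenumbers[j + 1] < g:
            j += 1
        x0, x1 = wavenumbers[j], wavenumbers[j + 1]
        y0, y1 = intensities[j], intensities[j + 1]
        out.append(y0 + (y1 - y0) * (g - x0) / (x1 - x0))
    return out


def min_max_normalize(intensities):
    # Scale to [0, 1]; a flat spectrum maps to all zeros.
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        return [0.0 for _ in intensities]
    return [(y - lo) / (hi - lo) for y in intensities]


def preprocess(wavenumbers, intensities, grid, window=3):
    # Apply the four operations in the order in which the text lists them.
    corrected = subtract_linear_baseline(intensities)
    smoothed = moving_average(corrected, window)
    aligned = interpolate_to_grid(wavenumbers, smoothed, grid)
    return min_max_normalize(aligned)
```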
[0075] In an optional implementation, the step of performing multi-scale wavelet decomposition on the standardized spectral data to extract energy features, mean features, and standard deviation features at different decomposition levels, thereby constructing multi-scale statistical features, includes: performing multi-level decomposition with a wavelet basis function to obtain the high-frequency detail coefficients and low-frequency approximation coefficients of each level; extracting the corresponding energy feature, first mean feature, and first standard deviation feature from the high-frequency detail coefficients of each level; extracting the second mean feature and the second standard deviation feature from the low-frequency approximation coefficients of the final level; and combining the energy features, first mean features, and first standard deviation features of all levels with the second mean feature and second standard deviation feature of the final level to form multidimensional, multi-scale statistical features.
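The feature layout described above can be sketched with the standard library alone. The sketch assumes the Haar wavelet (db1, the simplest member of the Daubechies family), a signal length divisible by 2 to the power of the number of levels, and a per-level ordering of energy, mean, standard deviation; none of these choices is fixed by the text. With five levels the result has 3 × 5 + 2 = 17 entries.

```python
import math


def haar_step(signal):
    # One level of the orthonormal Haar (db1) transform: (approximation, detail).
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def mean_std(values):
    # Mean and population standard deviation of a coefficient list.
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)


def multiscale_features(signal, levels=5):
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energy = sum(d * d for d in detail)      # energy of this level's detail coefficients
        m, s = mean_std(detail)                  # first mean / first standard deviation
        features.extend([energy, m, s])
    m, s = mean_std(approx)                      # second mean / second standard deviation
    features.extend([m, s])
    return features
```

A wavelet library such as PyWavelets (for example its `wavedec` function) would be the usual choice for higher-order Daubechies wavelets and arbitrary signal lengths; the statistics computed per level stay the same.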
[0076] In an optional implementation, the wavelet basis function is the Daubechies wavelet.
[0077] In one optional implementation, the number of decomposition levels is five.
[0078] In an optional embodiment, the characteristic vibrational band of the urinary iodine molecule is one or any combination of the following characteristic bands: 1260-1280 cm⁻¹, 1295-1315 cm⁻¹, and 1650-1670 cm⁻¹.
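Extracting the average and maximum intensity of each band is a simple windowing operation. The sketch below assumes the spectrum is given as paired lists of wavenumbers (in cm⁻¹) and intensities and that band limits are inclusive; the names `BANDS_CM1` and `band_features` are illustrative.

```python
# Characteristic bands listed in the text, in cm^-1 (limits treated as inclusive).
BANDS_CM1 = [(1260.0, 1280.0), (1295.0, 1315.0), (1650.0, 1670.0)]


def band_features(wavenumbers, intensities, bands=BANDS_CM1):
    # For each band, return the mean and the maximum intensity of the points inside it.
    features = []
    for lo, hi in bands:
        window = [y for x, y in zip(wavenumbers, intensities) if lo <= x <= hi]
        if not window:
            raise ValueError(f"no data points in band {lo}-{hi} cm^-1")
        features.extend([sum(window) / len(window), max(window)])
    return features
```

With all three bands selected this yields six physical prior features, which are then concatenated with the multi-scale statistical features.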
[0079] In an optional implementation, the machine learning model is a machine learning regression model, which may be a support vector regression model, a random forest regression model, or a neural network regression model.
[0080] Example 3: This invention provides a computer program product, which includes computer program code. When the computer program code is run on a computer, the computer implements the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as provided in any embodiment of this invention.
[0081] Those skilled in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing related hardware. In the embodiments of the present invention, the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, the computer program can include the processes of the embodiments of the methods described above.
[0082] Example 4: This invention provides a storage medium storing a computer program that causes a computer to execute a machine learning-based Raman spectroscopy method for detecting urinary iodine concentration, as provided in any embodiment of this invention.
[0083] In embodiments of the present invention, any combination of one or more storage media may be used. The storage medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media (a non-exhaustive list) include: an electrical connection having one or more wires, a portable computer disk, a hard disk, RAM, ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination thereof. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that may be used by or in connection with an instruction execution system, apparatus, or device.
[0084] Example 5: This invention provides an electronic device. Figure 3 is a structural schematic of the electronic device 100 provided in an embodiment of the present invention. In some embodiments, the electronic device may be a mobile phone, tablet computer, wearable device, in-vehicle device, augmented reality (AR) / virtual reality (VR) device, laptop computer, ultra-mobile personal computer (UMPC), netbook, personal digital assistant (PDA), or other terminal device. Furthermore, the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration provided in this embodiment of the present invention can also be applied to databases, servers, and service response systems based on terminal artificial intelligence. This embodiment of the present invention does not limit the specific application scenarios of the method.
[0085] As shown in Figure 3, the electronic device 100 provided in this embodiment of the invention includes a memory 101 and a processor 102.
[0086] The memory 101 is used to store computer programs; preferably, the memory 101 includes various media that can store program code, such as ROM, RAM, magnetic disk, USB flash drive, memory card or optical disk.
[0087] Specifically, memory 101 may include computer system readable media in the form of volatile memory, such as random access memory (RAM) and / or cache memory. Electronic device 100 may further include other removable / non-removable, volatile / non-volatile computer system storage media. Memory 101 may include at least one program product having a set (e.g., at least one) of program modules configured to perform the functions of the embodiments of the present invention.
[0088] The processor 102 is connected to the memory 101 and is used to execute the computer program stored in the memory 101 so that the electronic device 100 executes the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration provided in any embodiment of the present invention.
[0089] In an optional implementation, the processor 102 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
[0090] In an optional embodiment of the present invention, the electronic device 100 may further include a display 103. The display 103 is communicatively connected to the memory 101 and the processor 102, and is used to display the relevant GUI interface of the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration.
[0091] It is understood that the Raman spectroscopy urinary iodine concentration detection system, storage medium and electronic device based on machine learning provided in the embodiments of the present invention correspond to the Raman spectroscopy urinary iodine concentration detection method based on machine learning provided in the embodiments of the present invention. The explanation, examples and beneficial effects of the relevant contents can be referred to the corresponding parts of the method, and will not be repeated here.
[0092] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
[0093] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A machine learning-based Raman spectroscopy method for detecting urinary iodine concentration, characterized in that it comprises: acquiring and preprocessing the raw Raman spectra of urine samples to obtain standardized spectral data; performing multi-scale wavelet decomposition on the standardized spectral data to extract energy features, mean features and standard deviation features at different decomposition levels, thereby forming multi-scale statistical features; extracting, based on the characteristic vibrational bands of urinary iodine molecules, physical prior features of the corresponding wavenumber intervals, including the average intensity and maximum intensity of the peak region; concatenating the multi-scale statistical features and the physical prior features to construct a fused feature vector; and using the fused feature vector as input to a pre-trained machine learning model to obtain a predicted value of urinary iodine concentration.
2. The Raman spectroscopy method for detecting urinary iodine concentration as described in claim 1, characterized in that the preprocessing is one or any combination of the following operations: baseline correction, smoothing and denoising, wavenumber axis alignment, and normalization.
3. The Raman spectroscopy method for detecting urinary iodine concentration as described in claim 1, characterized in that performing multi-scale wavelet decomposition on the standardized spectral data to extract energy features, mean features, and standard deviation features at different decomposition levels, forming multi-scale statistical features, includes: performing multi-level decomposition using Daubechies wavelet basis functions to obtain the high-frequency detail coefficients and low-frequency approximation coefficients of each level; extracting the corresponding energy feature, first mean feature, and first standard deviation feature from the high-frequency detail coefficients of each level; extracting the second mean feature and the second standard deviation feature from the low-frequency approximation coefficients of the final level; and combining the energy features, first mean features, and first standard deviation features of all levels with the second mean feature and second standard deviation feature of the final level to form multidimensional, multi-scale statistical features.
4. The Raman spectroscopy method for detecting urinary iodine concentration as described in claim 1, characterized in that the number of decomposition levels is five.
5. The Raman spectroscopy method for detecting urinary iodine concentration as described in claim 1, characterized in that the characteristic vibrational bands of the urinary iodine molecules are one or any combination of the following characteristic bands: 1260-1280 cm⁻¹, 1295-1315 cm⁻¹, and 1650-1670 cm⁻¹.
6. The Raman spectroscopy method for detecting urinary iodine concentration as described in claim 1, characterized in that the machine learning model is a machine learning regression model, which can be a support vector regression model, a random forest regression model, or a neural network regression model.
7. A machine learning-based Raman spectroscopy system for detecting urinary iodine concentration, characterized in that it comprises: a data acquisition and preprocessing module, used to acquire and preprocess the raw Raman spectra of urine samples to obtain standardized spectral data; a multi-scale statistical feature extraction module, used to perform multi-scale wavelet decomposition on the standardized spectral data and extract energy features, mean features and standard deviation features at different decomposition levels to form multi-scale statistical features; a physical feature extraction module, used to extract the physical prior features of the corresponding wavenumber intervals based on the characteristic vibrational bands of urinary iodine molecules, including the average intensity and maximum intensity of the peak region; a feature fusion module, used to concatenate the multi-scale statistical features and the physical prior features to construct a fused feature vector; and a concentration prediction module, used to take the fused feature vector as input to a pre-trained machine learning model to obtain a predicted value of urinary iodine concentration.
8. A computer program product, characterized in that the computer program product includes computer program code which, when run on a computer, enables the computer to implement the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as described in any one of claims 1 to 6.
9. A storage medium, characterized in that it stores a computer program, wherein the computer program causes a computer to perform the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as described in any one of claims 1 to 6.
10. An electronic device, characterized in that the electronic device includes a processor and a memory; the memory stores program instructions; and the processor is configured to run the program instructions to execute the machine learning-based Raman spectroscopy method for detecting urinary iodine concentration as described in any one of claims 1 to 6.