Myopia risk assessment method and system based on multi-modal sensing

By fusing and dynamically correcting features from multimodal sensor data, behavioral trend curves and risk scores are generated, solving the problem that existing technologies cannot analyze eye-use behavior and environmental interaction in real time, and enabling accurate assessment and prediction of myopia risk.

CN122245749A · Pending · Publication Date: 2026-06-19 · WEIFANG NURSING VOCATIONAL COLLEGE

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
WEIFANG NURSING VOCATIONAL COLLEGE
Filing Date
2026-03-06
Publication Date
2026-06-19

Abstract

This invention relates to the field of myopia risk assessment technology, and discloses a myopia risk assessment method and system based on multimodal sensing. The method includes: acquiring terminal behavior data and environmental data, and extracting behavioral features, illumination features, and screen display features; generating a behavioral trend curve based on the behavioral features and calculating a risk score to determine the risk evolution type; when the risk score exceeds a preset threshold, fusing the environmental features to generate an evolution coefficient, and correcting the evolution coefficient based on superimposed features and deviation parameters; verifying historical behavior sequences to identify abnormal nodes, screening effective attention points, and fusing them by weighting to generate a fused point set from which the warning interval is determined; and, within the warning interval, performing trend analysis on the superimposed features and outputting the myopia risk prediction result. The method enables dynamic assessment and early warning of a user's eye load.

Description

Technical Field

[0001] This invention relates to the field of intelligent health monitoring and vision protection technology, and in particular to a method and system for myopia risk assessment based on multimodal sensing.

Background Technology

[0002] Currently, with the widespread use of electronic terminal devices and smart wearable devices, integrating multiple sensing units on the terminal side and cooperating with smart chips for data processing allows for the perception and analysis of users' eye-use behavior and environmental conditions, providing a new technological foundation for myopia risk assessment.

[0003] In existing technologies, myopia risk assessment or early warning is typically achieved by collecting single behavioral and environmental data such as user screen time, viewing posture, or ambient lighting, and transmitting this data to a terminal device or cloud processing unit with an integrated smart chip for analysis. However, in practical applications, existing technologies often rely on static data or offline training results to determine myopia risk, failing to adequately consider the dynamic changes in users' daily behaviors and environmental conditions over time. Especially in scenarios with frequent changes in lighting conditions and significant differences in screen usage habits, it is difficult to analyze the interaction between multimodal behavioral and environmental data, and it is also difficult to reflect the evolution of risk status in a timely manner.

[0004] In summary, existing technologies cannot perform real-time fusion analysis of the dynamic interaction between user eye behavior and environmental conditions based on multimodal sensor data, making it difficult to accurately depict the evolution trend of myopia risk.

Summary of the Invention

[0005] This invention provides a myopia risk assessment method and system based on multimodal sensing, which can achieve deep feature fusion and dynamic correction of anomalies in user eye behavior and environmental interaction data, thereby improving the accuracy and stability of myopia risk evolution trend prediction in complex and ever-changing scenarios.

[0006] In a first aspect, to address the aforementioned technical problems, this invention provides a myopia risk assessment method based on multimodal sensing, comprising:
acquiring behavioral data and environmental data collected by the terminal, parsing the behavioral data and extracting features to obtain a behavioral feature set, and parsing the environmental data and extracting features to obtain illumination features and screen display features;
performing trend analysis on the behavioral feature set to generate a behavioral trend curve reflecting changes in eye-use behavior, obtaining user individual parameters corresponding to the behavioral trend curve, calculating a risk score based on the behavioral trend curve and the user individual parameters, and determining the risk evolution type based on the risk score;
when the risk score corresponding to the risk evolution type exceeds a preset risk threshold, fusing the illumination features and the screen display features to obtain an evolution coefficient;
generating superimposed features based on the behavioral feature set and the evolution coefficient, generating deviation parameters according to the matching relationship between the superimposed features and the evolution coefficient, and updating the evolution coefficient based on the deviation parameters to obtain a corrected evolution coefficient;
performing, based on the deviation parameters, consistency verification on the historical behavior sequence in the behavioral trend curve to identify abnormal nodes;
screening and removing the abnormal nodes based on the corrected evolution coefficient and the screen display features to obtain effective attention points, fusing multiple effective attention points by weighting to obtain a fused point set, and determining the warning interval according to the number and distribution of points in the fused point set;
within the warning interval, performing trend analysis on the superimposed features to generate risk prediction results.

[0007] In a second aspect, the present invention provides a myopia risk assessment system based on multimodal sensing, comprising:
a data acquisition module, used to acquire behavioral data and environmental data collected by the terminal, parse the behavioral data and extract features to obtain a behavioral feature set, and parse the environmental data and extract features to obtain illumination features and screen display features;
a behavioral trend curve analysis module, used to perform trend analysis on the behavioral feature set, generate a behavioral trend curve reflecting changes in eye-use behavior, obtain user individual parameters corresponding to the behavioral trend curve, calculate a risk score based on the behavioral trend curve and the user individual parameters, and determine the risk evolution type based on the risk score;
an interactive feature processing module, used to fuse the illumination features and the screen display features to obtain an evolution coefficient when the risk score corresponding to the risk evolution type exceeds a preset risk threshold;
a superimposed feature generation module, used to generate superimposed features based on the behavioral feature set and the evolution coefficient, generate deviation parameters according to the matching relationship between the superimposed features and the evolution coefficient, and update the evolution coefficient based on the deviation parameters to obtain a corrected evolution coefficient;
an abnormal node module, used to perform consistency verification on the historical behavior sequence in the behavioral trend curve based on the deviation parameters, and to determine abnormal nodes;
an early warning interval module, used to screen and remove the abnormal nodes based on the corrected evolution coefficient and the screen display features to obtain effective attention points, fuse multiple effective attention points by weighting to obtain a fused point set, and determine the warning interval according to the number and distribution of points in the fused point set;
a risk prediction module, used to perform trend analysis on the superimposed features within the warning interval to generate risk prediction results.

[0008] Compared with the prior art, the present invention has the following beneficial effects: (1) This invention acquires behavioral and environmental data collected by the terminal, and further extracts behavioral features, illumination features and screen display features. By generating behavioral trend curves and combining them with individual user parameters to calculate risk scores, it realizes the transformation from single-point-of-time data perception to time-series dynamic evolution analysis. It can accurately distinguish the risk evolution type based on the physiological differences of different individuals, effectively solves the problem that traditional detection methods do not adequately consider the dynamic changes in users' daily eye use behavior, and improves the pertinence and accuracy of preliminary risk assessment.

[0009] (2) This invention obtains the evolution coefficient by fusing the characteristics of illumination and screen display when the risk score exceeds the limit, and introduces superimposed features and deviation parameters to dynamically update the evolution coefficient. This achieves in-depth analysis of the interaction between behavioral features and environmental elements, effectively capturing the disturbance of frequent changes in illumination conditions and screen usage habits on eye risk. The closed-loop correction mechanism reduces the assessment error under the interference of multiple environmental factors, ensuring the ability of the evolution coefficient to represent the actual risk state.

[0010] (3) This invention uses deviation parameters to verify the consistency of historical behavior sequences to identify and remove abnormal nodes, and combines screen display features to filter effective points of interest for weighted fusion. This achieves automatic cleaning of data noise and accurate extraction of core risk information. It can adaptively determine the warning interval based on the distribution of fusion points and perform trend analysis within the interval, avoiding false alarms and missed alarms caused by historical data fluctuations. This improves the data robustness and real-time warning of myopia risk prediction results in complex interactive scenarios.

Attached Figure Description

[0011] Figure 1 is a schematic diagram of the myopia risk assessment method based on multimodal sensing provided in the first embodiment of the present invention; Figure 2 is a schematic diagram of the myopia risk assessment system based on multimodal sensing provided in the second embodiment of the present invention.

Detailed Implementation

[0012] The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.

[0013] Referring to Figure 1, the first embodiment of the present invention provides a myopia risk assessment method based on multimodal sensing, comprising the following steps:
S11, acquire behavioral data and environmental data collected by the terminal, parse the behavioral data and extract features to obtain a behavioral feature set, and parse the environmental data and extract features to obtain illumination features and screen display features;
S12, perform trend analysis on the behavioral feature set, generate a behavioral trend curve reflecting changes in eye-use behavior, obtain user individual parameters corresponding to the behavioral trend curve, calculate a risk score based on the behavioral trend curve and the user individual parameters, and determine the risk evolution type based on the risk score;
S13, when the risk score corresponding to the risk evolution type exceeds the preset risk threshold, fuse the illumination features and the screen display features to obtain the evolution coefficient;
S14, generate superimposed features based on the behavioral feature set and the evolution coefficient, generate deviation parameters according to the matching relationship between the superimposed features and the evolution coefficient, and update the evolution coefficient based on the deviation parameters to obtain the corrected evolution coefficient;
S15, based on the deviation parameters, perform consistency verification on the historical behavior sequence in the behavioral trend curve to determine abnormal nodes;
S16, based on the corrected evolution coefficient and the screen display features, screen and remove the abnormal nodes to obtain effective attention points, fuse multiple effective attention points by weighting to obtain a fused point set, and determine the warning interval according to the number and distribution of points in the fused point set;
S17, within the warning interval, perform trend analysis on the superimposed features to generate risk prediction results.

[0014] Step S11, in which behavioral data and environmental data collected by the terminal are acquired, the behavioral data is parsed and features are extracted to obtain a behavioral feature set, and the environmental data is parsed and features are extracted to obtain illumination features and screen display features, includes:
acquiring behavioral data and environmental data collected by the terminal, wherein the behavioral data includes a continuous video frame sequence of user facial information, and the environmental data includes ambient light sensing data and screen display status data;
parsing the continuous video frame sequence in the behavioral data to extract behavioral information reflecting changes in eye usage time, and generating a behavioral feature set based on the behavioral information;
extracting, from the ambient light sensing data, illumination information reflecting changes in ambient brightness, and generating illumination features based on the illumination information;
parsing, from the screen display status data, screen brightness change information and display duration information, and generating screen display features based on the screen brightness change information and the display duration information.

[0015] It should be noted that the continuous video frame sequence is collected by the front-facing camera of the terminal at a preset sampling frequency, and each frame is marked with the collection time, forming a video frame sequence in chronological order to reflect the user's eye behavior.

[0016] The preset sampling frequency is determined based on the physiological characteristics of the human eye, such as blinking, and the constraints of terminal computing power and power consumption. Statistical analysis of eye-use video data from multiple users in natural terminal usage scenarios revealed that the duration of most blink closures is concentrated in the 0.1–0.3 second range. Therefore, the preset sampling frequency needs to be no less than 15 frames per second to stably capture changes in eye state. Experimental tests show that when the sampling frame rate is higher than 30 frames per second, the improvement in recognition accuracy is limited, but power consumption increases significantly. Therefore, this embodiment sets the preset sampling frequency to the range of 15–30 frames per second. In scenarios with strong computing power or high accuracy requirements, a frequency closer to 30 frames per second can be selected, while in power-sensitive scenarios, a frequency closer to 15 frames per second can be selected.

[0017] When analyzing the behavioral data, the user's eye region is first located in the continuous video frame sequence, and the eye state in each frame is identified to determine whether it is open or closed. Following the chronological order of the video frames, a temporal analysis of the eye state results is performed, and the duration of the open state is accumulated and statistically analyzed within a preset time window to obtain behavioral information on changes in eye usage time. Based on this information, eye usage time features for each time window are generated, and the change features are obtained by comparing the eye usage time of adjacent windows. The eye usage time features and change features within each time window are combined to form a behavioral feature vector, and multiple vectors are arranged chronologically to form the behavioral feature set. The length of the preset time window is determined based on experimental statistics. Through comparative analysis of eye usage judgment results under different time window lengths, the time window is ultimately set to a range of 20 to 40 seconds.
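
The windowing in [0017] can be sketched as follows. This is a minimal illustration, assuming per-frame open/closed labels are already available from some eye-state classifier (not shown). The function name `window_features`, the 30-second default window, and the rule that a frame's state holds until the next frame are assumptions made for the example, not details taken from the disclosure.

```python
from typing import List, Tuple


def window_features(frames: List[Tuple[float, bool]],
                    window_s: float = 30.0) -> List[Tuple[float, float]]:
    """Return (open_seconds, change_vs_previous_window) for each time window.

    frames: (timestamp_seconds, eye_is_open) pairs in chronological order.
    Assumption: a frame's eye state holds until the next frame's timestamp.
    """
    if not frames:
        return []
    start = frames[0][0]
    n_windows = int((frames[-1][0] - start) // window_s) + 1
    open_time = [0.0] * n_windows
    for (t0, is_open), (t1, _) in zip(frames, frames[1:]):
        if is_open:
            # Credit the whole inter-frame interval to the window where it starts.
            open_time[int((t0 - start) // window_s)] += t1 - t0
    features = []
    previous = None
    for current in open_time:
        # The first window has no predecessor, so its change feature is 0.
        change = 0.0 if previous is None else current - previous
        features.append((current, change))
        previous = current
    return features
```

Each returned pair corresponds to one behavioral feature vector (eye usage time plus its change against the adjacent window); the chronological list plays the role of the behavioral feature set.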

[0018] In this embodiment, the environmental data includes ambient light sensing data collected by the terminal's built-in ambient light sensor. The ambient light sensor collects ambient light intensity according to a preset sampling period and adds a corresponding time marker. The preset sampling period is determined based on the rate of change of ambient light and the terminal's resource consumption, and is 1 to 2 seconds. Within each time window, the ambient light sensing data is parsed. First, the average illumination value within the time window is calculated. Then, the difference between the maximum and minimum illumination values is calculated to obtain the illumination fluctuation value. Next, based on the human eye's visual adaptation characteristics and historical measurement data, the average illumination value is mapped to one of three illuminance ranges: low (0–300 lux), medium (301–1000 lux), and high (greater than 1000 lux), constructing an illumination feature vector [low, medium, high], and the illumination fluctuation value is added as an additional dimension. Finally, an illumination feature vector is generated for each time window, containing both illumination level information and illumination fluctuation information, which can be directly used for subsequent behavior analysis and risk assessment.
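
A minimal sketch of the per-window illumination feature in [0018]. The one-hot reading of the [low, medium, high] vector and the function name `illumination_feature` are assumptions; the range boundaries are the ones stated in the paragraph.

```python
from typing import List


def illumination_feature(lux_samples: List[float]) -> List[float]:
    """Build [low, medium, high, fluctuation] for one time window."""
    if not lux_samples:
        raise ValueError("window contains no ambient light samples")
    mean_lux = sum(lux_samples) / len(lux_samples)
    fluctuation = max(lux_samples) - min(lux_samples)
    # Ranges from [0018]: low 0-300 lux, medium 301-1000 lux, high above 1000 lux.
    if mean_lux <= 300:
        level = [1.0, 0.0, 0.0]
    elif mean_lux <= 1000:
        level = [0.0, 1.0, 0.0]
    else:
        level = [0.0, 0.0, 1.0]
    # The fluctuation value is appended as the additional dimension.
    return level + [fluctuation]
```

For samples of 200, 400, and 600 lux the mean is 400 lux (medium) and the fluctuation is 400 lux.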

[0019] It is worth noting that the screen display status data includes screen brightness change data and screen display duration. When parsing the screen display status data, the continuous display process corresponding to the screen's transition from an on state to an off state is first identified, and the start and end times of this continuous display process are recorded to determine the screen display duration. Simultaneously, screen brightness is sampled during this continuous display process to obtain screen brightness change data. Subsequently, a brightness feature reflecting the screen display intensity level is generated based on the screen brightness change data, and this brightness feature is combined with the corresponding screen display duration to generate the screen display feature, which comprehensively reflects the impact of screen display intensity and usage time on the user's eye-use behavior.

[0020] Step S12, in which trend analysis is performed on the behavioral feature set to generate a behavioral trend curve reflecting changes in eye-use behavior, user individual parameters corresponding to the behavioral trend curve are obtained, a risk score is calculated based on the behavioral trend curve and the user individual parameters, and the risk evolution type is determined based on the risk score, includes:
inputting the behavioral feature set into a preset random forest model, which outputs a behavioral state score;
smoothing the behavioral state score by a sliding window averaging method to generate a behavioral trend curve that reflects changes in eye-use behavior;
obtaining the user individual parameters corresponding to the behavioral trend curve from a preset user database, and concatenating the behavioral trend curve and the user individual parameters to form a multidimensional trend feature vector, which is input to a preset support vector machine model;
calculating a risk score based on the multidimensional trend feature vector, and comparing the risk score with a preset risk evolution threshold to obtain the user's risk evolution type.

[0021] It should be noted that, based on the behavioral feature set, the behavioral feature vectors for each time window are sequentially input into a preset random forest model. The preset random forest model evaluates the input vectors and outputs the corresponding numerical behavioral state score. The behavioral feature vectors are 3-dimensional vectors, including eye usage duration, blink frequency, and continuous fixation time.

[0022] The preset random forest model is trained using historical users' long-term eye-use behavior data. The training supervision signal is the eye-load label corresponding to each time window. This label is calculated as a weighted sum of the actual eye-use duration, blink frequency, and continuous fixation time within each time window according to preset weights, and lies in the range 0 to 1. The normalization parameters are the minimum and maximum values of the historical samples. The preset weights are determined from historical data statistics: the average contribution ratio of each indicator to the overall eye load is calculated over all historical samples, and this ratio is used as the weight of the corresponding indicator. For example, actual eye-use duration accounts for 50%, reflecting the main contribution of prolonged fixation to eye load; blink frequency accounts for 30%, reflecting eye rest and fatigue relief; and continuous fixation time accounts for 20%, reflecting the impact of momentary high-intensity eye use on load. These weights can be fine-tuned according to user group characteristics or experimental results to ensure that the label accurately reflects the comprehensive eye-load intensity within each time window.
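
The eye-load label in [0022] can be sketched as a weighted sum of min-max normalized indicators. The 0.5/0.3/0.2 weights are the example values from the paragraph; the function names and the bounds passed in (standing in for the minima and maxima of historical samples) are hypothetical, and each indicator is simply normalized and summed as the paragraph states.

```python
from typing import Tuple

Bounds = Tuple[float, float]


def min_max(value: float, bounds: Bounds) -> float:
    """Min-max normalize to [0, 1], clamping values outside the historical range."""
    lo, hi = bounds
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def eye_load_label(use_s: float, blink_hz: float, fixation_s: float,
                   use_bounds: Bounds, blink_bounds: Bounds, fixation_bounds: Bounds,
                   weights=(0.5, 0.3, 0.2)) -> float:
    """Weighted sum of the three normalized indicators; lies in [0, 1] when weights sum to 1."""
    w_use, w_blink, w_fix = weights
    return (w_use * min_max(use_s, use_bounds)
            + w_blink * min_max(blink_hz, blink_bounds)
            + w_fix * min_max(fixation_s, fixation_bounds))
```

With every indicator at the midpoint of its historical range, the label is 0.5.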

[0023] Specifically, the behavioral state score is smoothed by averaging the scores within a preset sliding window to obtain a trend score for that time position. A behavioral trend curve is generated by connecting the trend scores at each time position sequentially, with time as the horizontal axis and the trend score as the vertical axis. This behavioral trend curve characterizes the overall trend of user eye behavior over time; its rising, falling, or flattening states intuitively reflect the increase, decrease, or stabilization of user eye load. The preset sliding window is adaptively set according to the user's behavior cycle and application scenario. Preferably, the window length of the preset sliding window is set to 20 to 40 seconds, and the sliding step is 5 seconds, to achieve a balance between suppressing short-term random noise and maintaining response sensitivity.
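
A minimal sketch of the sliding-window averaging in [0023], assuming timestamped behavioral state scores. The function name `trend_curve`, the half-open window `[start, start + window_s)`, and the rule that only full windows produce a point are illustrative choices.

```python
from typing import List, Sequence, Tuple


def trend_curve(times: Sequence[float], scores: Sequence[float],
                window_s: float = 30.0, step_s: float = 5.0) -> List[Tuple[float, float]]:
    """Return (window_start, mean_score) points forming the behavioral trend curve."""
    if step_s <= 0 or window_s <= 0:
        raise ValueError("window_s and step_s must be positive")
    points: List[Tuple[float, float]] = []
    if not times:
        return points
    k = 0
    # Slide the window in steps of step_s while it still fits inside the data.
    while times[0] + k * step_s + window_s <= times[-1]:
        start = times[0] + k * step_s
        in_window = [s for t, s in zip(times, scores) if start <= t < start + window_s]
        if in_window:
            points.append((start, sum(in_window) / len(in_window)))
        k += 1
    return points
```

Plotting the returned points with time on the horizontal axis gives the rising, falling, or flat trend the paragraph describes.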

[0024] After generating the behavior trend curve, user-specific parameters corresponding to the behavior trend curve are retrieved from a preset user database. The preset user database is established by collecting basic information provided by users, such as age, occupation type, and past vision health status. Missing value imputation, numerical normalization, and categorical encoding are performed on this basic information. Missing value imputation uses linear interpolation, numerical normalization uses Min-Max normalization, and categorical encoding uses One-hot encoding or Label encoding to form a set of user-specific parameters that can be used for calculation. These user-specific parameters include age, occupation type, and past vision health status after numerical or categorical encoding processing.
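
The encoding step in [0024] might look like the sketch below, which applies Min-Max normalization to age, One-hot encoding to occupation type, and Label encoding to past vision health status. The category lists, the age range, and the function name `encode_user` are hypothetical placeholders; missing-value imputation is omitted.

```python
from typing import List, Sequence, Tuple


def encode_user(age: float, occupation: str, vision_status: str,
                age_range: Tuple[float, float] = (6.0, 60.0),
                occupations: Sequence[str] = ("student", "office", "outdoor"),
                statuses: Sequence[str] = ("normal", "pre-myopia", "myopia")) -> List[float]:
    """Return [age_norm, one-hot occupation..., status_label]."""
    lo, hi = age_range
    # Min-Max normalization, clamped so out-of-range ages stay in [0, 1].
    age_norm = min(1.0, max(0.0, (age - lo) / (hi - lo)))
    # One-hot encoding; an unknown occupation yields an all-zero block.
    one_hot = [1.0 if occupation == name else 0.0 for name in occupations]
    # Label encoding; an unknown status raises ValueError.
    status_label = float(list(statuses).index(vision_status))
    return [age_norm] + one_hot + [status_label]
```

The resulting list is what [0025] appends to the end of the trend scores when building the multidimensional trend feature vector.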

[0025] Specifically, the trend scores in the behavioral trend curve are sequentially expanded and arranged over time, and then the individual user parameters are appended to the end of the vector for concatenation to obtain the multidimensional trend feature vector. The vector dimension is the sum of the number of time windows and the dimension of the individual user parameters. It should be noted that the preset support vector machine model is trained using historical user samples. The input to the training samples is a combination of the behavioral trend curve and the individual user parameters, and the training label is the corresponding eye risk level.

[0026] It is worth noting that the multidimensional trend feature vector is substituted into the RBF kernel function of a preset support vector machine model for calculation to obtain the corresponding decision function output value. In one embodiment, the risk score is obtained by applying a Sigmoid mapping to the decision function output value, wherein the input to the Sigmoid function is the decision function output value corresponding to the multidimensional trend feature vector, and the output is the risk score. The risk score is compared with a preset risk evolution threshold to determine the user's risk evolution type. The preset risk evolution threshold is determined based on the statistical results of the risk score distribution of historical user samples and is used to distinguish different risk evolution states.

[0027] When the risk score reaches or exceeds the preset risk evolution threshold, the risk evolution type is determined to be high-risk; when the risk score is below the preset risk evolution threshold, the risk evolution type is determined to be low-risk. The determined risk evolution type is recorded as the analysis result and used for subsequent risk alerts, intervention strategy generation, and long-term eye use behavior monitoring and analysis.
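
The Sigmoid mapping of [0026] and the threshold rule of [0027] can be sketched as below. The decision value is taken as given (the support vector machine itself is not shown), and the default threshold of 0.5 is a placeholder, since the source derives the threshold from historical score statistics.

```python
import math


def risk_score(decision_value: float) -> float:
    """Map an SVM decision function output to (0, 1) with a numerically stable Sigmoid."""
    if decision_value >= 0:
        return 1.0 / (1.0 + math.exp(-decision_value))
    e = math.exp(decision_value)  # avoids overflow for large negative inputs
    return e / (1.0 + e)


def risk_evolution_type(score: float, threshold: float = 0.5) -> str:
    """High-risk when the score reaches or exceeds the threshold, otherwise low-risk."""
    return "high-risk" if score >= threshold else "low-risk"
```

A decision value of 0 maps to a score of exactly 0.5, which under the placeholder threshold is classified as high-risk because the rule is "reaches or exceeds".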

[0028] In step S13, fusing the illumination features and the screen display features to obtain the evolution coefficient includes:
obtaining the ambient light intensity sequence corresponding to the illumination features and the screen brightness change sequence corresponding to the screen display features;
mapping the ambient light intensity sequence and the screen brightness change sequence to a preset visual comfort interaction matrix to obtain the photoelectric interaction difference value and the visual interference intensity index;
fusing the photoelectric interaction difference value with the visual interference intensity index to obtain the environmental pressure load value;
extracting pupil response features corresponding to the risk evolution type from the behavioral data, and determining dynamic weights based on the pupil response features;
weighting the environmental pressure load value and the risk score based on the dynamic weights to obtain the evolution coefficient.

[0029] It should be noted that the ambient light intensity sequence is continuously collected by the terminal's built-in ambient light sensor during user operation. Each sample value carries a clear time marker to reflect changes in ambient brightness over time. Simultaneously, a screen brightness change sequence corresponding to the screen display characteristics is acquired. This screen brightness change sequence is provided by the terminal display control module or operating system interface, recording changes in screen output brightness during continuous use, and is also formed in chronological order.

[0030] In this embodiment, linear interpolation is used to synchronize sampling points, matching screen brightness and ambient light at the same timestamp. The timestamp synchronization accuracy is ±50 milliseconds. This accuracy is determined by measuring the terminal sensor sampling frequency and system processing delay. The ambient light sensor sampling frequency is 20 Hz, meaning it samples once every 50 milliseconds. The screen brightness acquisition interface response delay is approximately 5–10 milliseconds. After interpolation and timestamp correction, it can be ensured that the time error between ambient light and screen brightness data does not exceed ±50 milliseconds, thereby guaranteeing the effectiveness of subsequent interactive analysis.
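
A minimal sketch of the linear-interpolation alignment in [0030]: ambient light samples are interpolated at each screen brightness timestamp. The function names, the millisecond timestamps, and the choice to clamp outside the sampled range are assumptions for illustration.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def interpolate_at(times: Sequence[float], values: Sequence[float], t: float) -> float:
    """Linearly interpolate values (sampled at strictly increasing times) at time t."""
    if not times:
        raise ValueError("no samples to interpolate")
    if t <= times[0]:
        return values[0]   # clamp before the first sample
    if t >= times[-1]:
        return values[-1]  # clamp after the last sample
    i = bisect_left(times, t)  # times[i-1] < t <= times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(screen_times: Sequence[float], screen_nits: Sequence[float],
          light_times: Sequence[float], light_lux: Sequence[float]
          ) -> List[Tuple[float, float, float]]:
    """Return (timestamp, screen_nits, interpolated_lux) triples on the screen timeline."""
    return [(t, n, interpolate_at(light_times, light_lux, t))
            for t, n in zip(screen_times, screen_nits)]
```

With ambient light sampled every 50 ms, a screen sample at 25 ms is paired with the midpoint of the two neighboring light readings.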

[0031] The preset visual comfort interaction matrix is ​​a two-dimensional mapping table, constructed based on balanced sample data from no fewer than 1,000 users. Specifically, visual behavior data of users under different ambient light intensities and screen brightness conditions are collected. The visual behavior data includes pupil diameter changes, blink frequency, and single fixation duration, and the visual behavior data is associated with the corresponding ambient light range and screen brightness range.

[0032] The ambient light range and the screen brightness range are divided using an equal-division method. For example, the ambient light range is divided into 10 equal intervals from 0 to 1000 lux, and the screen brightness range is divided into 10 equal intervals from 0 to 500 nits. For each combination of ambient light range and screen brightness range, the mean and variance of the visual behavior data under that combination are calculated, and the mean and variance are summed to obtain the statistical value of human eye load under the corresponding combination conditions.

[0033] Subsequently, the maximum and minimum load values are determined from all the aforementioned human eye load statistics. Based on these maximum and minimum load values, the human eye load statistic under each combination condition is min-max normalized. The normalized results are used as the matrix element values of the corresponding combinations of ambient light range and screen brightness range, thereby forming the preset visual comfort interaction matrix.
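
The matrix construction in [0031] to [0033] can be sketched as follows, assuming each observation has already been reduced to a single scalar visual-load indicator (the source lists pupil diameter change, blink frequency, and fixation duration but does not say how they are combined). The names `bin_index` and `build_matrix`, and the use of `None` for cells with no data, are illustrative.

```python
from statistics import mean, pvariance
from typing import List, Optional, Sequence, Tuple


def bin_index(value: float, upper: float, n_bins: int = 10) -> int:
    """Equal-width bin over [0, upper]; values at or above upper fall in the last bin."""
    if value <= 0:
        return 0
    return min(n_bins - 1, int(value / (upper / n_bins)))


def build_matrix(observations: Sequence[Tuple[float, float, float]],
                 lux_max: float = 1000.0, nits_max: float = 500.0,
                 n_bins: int = 10) -> List[List[Optional[float]]]:
    """observations: (ambient_lux, screen_nits, load_indicator) triples."""
    if not observations:
        raise ValueError("no observations")
    cells = {}
    for lux, nits, load in observations:
        key = (bin_index(lux, lux_max, n_bins), bin_index(nits, nits_max, n_bins))
        cells.setdefault(key, []).append(load)
    # Per-cell load statistic: mean plus (population) variance, as in [0032].
    stats = {key: mean(vals) + pvariance(vals) for key, vals in cells.items()}
    lo, hi = min(stats.values()), max(stats.values())
    matrix: List[List[Optional[float]]] = [[None] * n_bins for _ in range(n_bins)]
    for (i, j), stat in stats.items():
        # Min-max normalization across all populated cells, as in [0033].
        matrix[i][j] = 0.0 if hi == lo else (stat - lo) / (hi - lo)
    return matrix
```

Rows index ambient light intervals and columns index screen brightness intervals; a lookup at a given time point uses the same `bin_index` mapping.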

[0034] It should be noted that the data at each time point in the ambient light intensity sequence and the screen brightness change sequence are mapped to the corresponding area of the visual comfort interaction matrix to obtain the human eye load value at each time point. When calculating the photoelectric interaction difference value, the screen brightness change value at each time point is first compared with the ideal matching brightness under the ambient light conditions at that time point, and the difference between the two is calculated to obtain the brightness deviation.

[0035] The ideal matching brightness is obtained based on statistical analysis of historical user experimental data. Under different ambient lighting conditions, physiological indicators such as changes in pupil diameter, blink frequency, and fixation duration are collected simultaneously. Statistical mean analysis is used to determine the screen brightness range with the minimum visual load, and the median of this range is taken as the ideal matching brightness. For example, when the ambient light is 300 lux, statistical analysis determines that the screen brightness range corresponding to the minimum visual load is 210–230 nits. Therefore, the median value of 220 nits is taken as the ideal matching brightness under this lighting condition.

[0036] Subsequently, the brightness deviation at each time point is multiplied by the corresponding human eye load value to obtain the photoelectric interaction difference value at that time point, thereby quantifying the degree of inconsistency between ambient light and screen brightness at that time point.
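
A minimal sketch of [0034] to [0036]: the brightness deviation at each time point is multiplied by the human eye load value looked up for that point. Taking the absolute value of the deviation is an assumption (the source says only "the difference between the two"), and `ideal_nits` is a hypothetical lookup from ambient lux to ideal matching brightness.

```python
from typing import Callable, List, Sequence


def interaction_differences(lux_seq: Sequence[float], nits_seq: Sequence[float],
                            load_seq: Sequence[float],
                            ideal_nits: Callable[[float], float]) -> List[float]:
    """Photoelectric interaction difference value at each aligned time point."""
    if not (len(lux_seq) == len(nits_seq) == len(load_seq)):
        raise ValueError("sequences must be aligned to the same time points")
    return [abs(nits - ideal_nits(lux)) * load
            for lux, nits, load in zip(lux_seq, nits_seq, load_seq)]
```

Using the example from [0035], where the ideal matching brightness at 300 lux is 220 nits, a screen at 300 nits with a load value of 0.5 gives a difference value of 40, while a screen at exactly 220 nits gives 0.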

[0037] Furthermore, the photoelectric interaction difference values across the entire time series are integrated to obtain a visual interference intensity index, which reflects the overall strength of the interference that the mismatch between ambient light and screen brightness exerts on the visual system. During the integration, the photoelectric interaction difference value at each time point can be weighted. The weights are set based on the user's gaze time and the application scenario type (such as text reading, video playback, and gaming), which is classified from user operation logs. For example, time windows with gaze times exceeding the average are given a weight of 1.2, while other time windows are given a weight of 1.0.
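
The weighted integration in [0037] can be approximated by a weighted rectangle-rule sum. The 1.2 and 1.0 weights are the example values from the paragraph; the fixed sample interval `dt_s` and the function name are assumptions, and scenario-type weighting is left out.

```python
from typing import Sequence


def interference_index(diffs: Sequence[float], gaze_times: Sequence[float],
                       dt_s: float = 1.0) -> float:
    """Weighted sum of interaction difference values over the time series."""
    if len(diffs) != len(gaze_times):
        raise ValueError("diffs and gaze_times must have the same length")
    if not diffs:
        return 0.0
    average_gaze = sum(gaze_times) / len(gaze_times)
    total = 0.0
    for diff, gaze in zip(diffs, gaze_times):
        # Points with above-average gaze time are weighted 1.2, others 1.0.
        weight = 1.2 if gaze > average_gaze else 1.0
        total += weight * diff * dt_s
    return total
```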

[0038] In one possible implementation, the photoelectric interaction difference value and the visual interference intensity index are first min-max normalized. The normalized values are then linearly weighted and summed according to preset weights, where the preset weights are determined by each index's contribution to the user's visual burden in the experimental data.

[0039] For example, statistical analysis shows that the immediate visual load caused by the mismatch between ambient light and screen brightness accounts for 60% of the impact on eye fatigue, while the overall interference intensity caused by changes in illumination accounts for 40%. The environmental stress load value is therefore the weighted sum of the two: the normalized photoelectric interaction difference value multiplied by 0.6, plus the normalized visual interference intensity index multiplied by 0.4.
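
The calculation in paragraphs [0034] to [0039] can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the eye load values (0.5 and 0.2), the normalization bounds, the use of an absolute difference for the brightness deviation, and the choice to normalize the most recent time point are all assumptions made for the example.

```python
def min_max(value, lo, hi):
    """Map value into [0, 1] using historical bounds lo and hi."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def photoelectric_difference(screen_nits, ideal_nits, eye_load):
    """Brightness deviation times the eye load value at one time point.

    Taking the absolute difference is an assumption; the text only
    says "the difference between the two".
    """
    return abs(screen_nits - ideal_nits) * eye_load


def interference_index(differences, weights):
    """Discrete weighted integration over the whole time series."""
    return sum(d * w for d, w in zip(differences, weights))


def environmental_stress_load(norm_difference, norm_index,
                              w_difference=0.6, w_index=0.4):
    """Weighted sum of the two normalized indicators."""
    return w_difference * norm_difference + w_index * norm_index


# Two time points at 300 lux, where the ideal matching brightness is 220 nits.
diffs = [photoelectric_difference(250, 220, 0.5),
         photoelectric_difference(225, 220, 0.2)]
# The first window is assumed to have an above-average gaze time (weight 1.2).
index = interference_index(diffs, [1.2, 1.0])
# Hypothetical historical bounds: 0..20 for the difference, 0..40 for the index.
load = environmental_stress_load(min_max(diffs[-1], 0.0, 20.0),
                                 min_max(index, 0.0, 40.0))
```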

[0040] Furthermore, the pupil response features are derived from user eye videos captured by the front-facing camera. Specifically, the eye videos are analyzed frame by frame. Within a preset time window, the eye region in each video frame is located, the pupil contour is extracted within the eye region, and the corresponding pupil diameter parameter is calculated from the pupil contour. Across the consecutive video frames covered by the preset time window, the change in the pupil diameter parameter between adjacent frames is calculated to obtain the pupil diameter change per unit time, which is used as the pupil's response speed to changes in illumination or screen brightness. At the same time, the difference between the maximum and minimum values of the pupil diameter parameter within the preset time window is calculated to obtain the pupil's response amplitude. Using the minimum and maximum values obtained from historical user samples, the response speed and response amplitude are each min-max normalized and then linearly combined according to preset weights to obtain the pupil response feature, which ranges from 0 to 1. The larger the value, the weaker the user's pupillary ability to adjust to changes in illumination and brightness. The preset weights are determined based on the average contribution rate of the two indicators to changes in visual load in historical samples.

[0041] Specifically, the dynamic weights are determined by calculating the deviation ratio between the current user's pupil response index and the historical average index of similar users. For example, when the response speed is 20% lower than the average, the environmental dynamic weight increases by 0.1, and the behavioral dynamic weight decreases by 0.1 accordingly. When the response amplitude is normal, the default weights are maintained, for example, the environmental dynamic weight is 0.6, and the behavioral dynamic weight is 0.4. The dynamic weights must ensure that the sum of the environmental weight and the behavioral weight is always 1.

[0042] After the dynamic weights are determined, the environmental pressure load value and the risk score are each min-max normalized: the maximum and minimum values of historical data are used to map both to the [0,1] interval, eliminating differences in dimension. The normalized environmental pressure load value and the normalized risk score are then weighted and summed to obtain the evolution coefficient.
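
As a rough sketch of paragraphs [0041] and [0042], the dynamic weights and the evolution coefficient might be computed as below. The function names, the single step at the 20% mark, and the normalization bounds are assumptions; the text gives only one example of how the weights shift.

```python
def min_max(value, lo, hi):
    """Map value into [0, 1] using historical bounds lo and hi."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def dynamic_weights(response_speed, average_speed,
                    base_env=0.6, shift=0.1, slow_ratio=0.2):
    """Shift weight toward the environment when the pupil responds slowly.

    Assumption: a single step once the speed is at least slow_ratio
    below the average of similar users; otherwise the default applies.
    """
    if response_speed <= average_speed * (1.0 - slow_ratio):
        env = base_env + shift
    else:
        env = base_env
    return env, 1.0 - env  # the two weights always sum to 1


def evolution_coefficient(env_load, risk_score, env_bounds, risk_bounds,
                          w_env, w_beh):
    """Weighted sum of the two min-max normalized inputs."""
    return (w_env * min_max(env_load, *env_bounds)
            + w_beh * min_max(risk_score, *risk_bounds))
```

For instance, `dynamic_weights(7.0, 10.0)` yields weights of about 0.7 and 0.3, which can then be passed to `evolution_coefficient` together with the hypothetical historical bounds.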

[0043] In step S14, superimposed features are generated based on the behavioral feature set and the evolution coefficients, deviation parameters are generated according to the matching relationship between the superimposed features and the evolution coefficients, and the evolution coefficients are updated based on the deviation parameters to obtain corrected evolution coefficients. This includes: based on the evolution coefficients, retrieving the corresponding behavioral feature sequences from the behavioral feature set, and parsing the behavioral feature sequences to obtain the eye posture vector sequence and the eye usage duration sequence; combining the eye posture vector sequence and the eye usage duration sequence to obtain the superimposed features; calculating the matching degree based on the matching relationship between the superimposed features and the evolution coefficients; when the matching degree is higher than the preset matching degree threshold, calculating the deviation parameter based on the degree to which the matching degree deviates from the preset matching degree threshold; and when the matching degree is not higher than the preset matching degree threshold, extracting a preset deviation stability value from the preset deviation library as the deviation parameter.
It should be noted that the evolution coefficient corresponds to the risk level over a continuous time period obtained from the previous stage of risk assessment, and its time coverage is consistent with the time axis of the behavioral data stored in the behavioral feature set. When the behavioral feature sequence is parsed, it is split into two sub-sequences: an eye posture vector sequence and an eye usage duration sequence. The eye posture vector sequence consists of multiple posture-related components, including head tilt angle, facial orientation changes, and eyelid opening and closing states. The eye usage duration sequence records the duration of each fixation segment within the corresponding time period. Each fixation segment is identified by its start and end times and may include information about the screen area corresponding to that segment.

[0044] Specifically, the eye posture vector sequence is first segmented according to a preset time window (20 to 40 seconds) consistent with the behavior feature set processing, resulting in multiple consecutive posture state segments, each corresponding to a time interval. Within the same time interval, the duration of all fixation segments falling within that time interval is extracted from the eye usage duration sequence, and these durations are accumulated to obtain the interval eye usage duration value.

[0045] Subsequently, a numerical combination is performed for each time interval to generate the interval posture duration combination value. First, all posture vectors within the posture state segment are scalarized: the head tilt angle, facial orientation change amplitude, and eyelid opening and closing status score are mapped to the [0,1] interval using min-max normalization. Then, based on the results of correlation analysis between each feature parameter and the eye load level in historical user data, a weighted sum of the normalized feature values is computed to obtain the comprehensive posture deviation score of a single posture vector. Next, the arithmetic mean of all comprehensive posture deviation scores within the time interval is calculated to obtain the interval posture intensity value, which reflects the overall degree of deviation of the user's head and eyelid posture during this period. Finally, the interval posture intensity value is multiplied by the corresponding interval eye usage duration value and by a preset synergistic gain coefficient to obtain the interval posture duration combination value. The synergistic gain coefficient is determined by regressing the risk evolution coefficient on the interaction term between posture intensity and eye usage duration in historical samples, and quantifies the synergistic enhancement effect of posture and duration factors on eye-use risk.

[0046] Furthermore, the complete time period corresponding to the evolution coefficient currently being evaluated is taken as the superimposed feature analysis period. The interval posture duration combination values of all consecutive time intervals within this analysis period are averaged to obtain the superimposed feature. The magnitude of the superimposed feature reflects how much the combined effect of sustained posture deviation and eye usage duration amplifies eye-use risk within the analysis period.
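
The interval combination and averaging described in paragraphs [0044] to [0046] can be illustrated with a short sketch. The function names, feature weights, and gain value are hypothetical; the inputs are assumed to be already min-max normalized posture features and fixation durations in seconds.

```python
def posture_score(normalized_features, weights):
    """Comprehensive posture deviation score of one posture vector."""
    return sum(f * w for f, w in zip(normalized_features, weights))


def interval_combination(scores, fixation_durations_s, gain):
    """Interval posture intensity x interval eye usage duration x gain."""
    intensity = sum(scores) / len(scores)    # arithmetic mean of scores
    duration = sum(fixation_durations_s)     # accumulated fixation time
    return intensity * duration * gain


def superimposed_feature(interval_values):
    """Average of the combination values over the analysis period."""
    return sum(interval_values) / len(interval_values)


# One posture vector: (head tilt, facial orientation change, eyelid score).
score = posture_score([0.5, 0.5, 0.0], [0.5, 0.25, 0.25])
# One interval with two posture vectors and two fixation segments.
combo = interval_combination([0.25, 0.75], [10.0, 6.0], gain=1.5)
feature = superimposed_feature([combo, 4.0])
```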

[0047] It should be noted that the high-risk risk evolution feature template is obtained from statistical analysis of the superimposed features of historically labeled high-risk users in continuous eye-use scenarios. Specifically, for continuous eye-use time periods of the same length, the median of the superimposed features is calculated as the reference feature value for that eye-use time period. Using a continuous eye-use time period consistent with the superimposed feature generation process as the analysis window, the average of the superimposed features within that time period is calculated as the current representative value of the superimposed features. The difference between this representative value and the median for the corresponding time period in the high-risk risk evolution feature template is calculated to obtain the feature deviation value. At the same time, first-order differences are computed over the representative values for the current eye-use time period and at least two adjacent eye-use time periods to obtain a difference sequence, and the difference sequence is summed to obtain the trend cumulative value. The feature deviation value and the trend cumulative value are summed to obtain the matching score. Using the maximum and minimum of all historical matching scores, the current matching score is min-max normalized to obtain the matching degree, which characterizes how strongly the current superimposed behavior state supports the existing evolution coefficient.
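
A minimal sketch of the matching degree in paragraph [0047] follows. It assumes the representative values are given in chronological order with the current period last, and that the historical score bounds are known; the function names and sample numbers are hypothetical.

```python
def min_max(value, lo, hi):
    """Map value into [0, 1] using historical bounds lo and hi."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def matching_degree(representatives, template_median, score_min, score_max):
    """Feature deviation plus trend cumulative value, min-max normalized.

    representatives: representative values of the superimposed feature for
    at least three eye-use periods, oldest first, current period last.
    """
    deviation = representatives[-1] - template_median
    diffs = [b - a for a, b in zip(representatives, representatives[1:])]
    trend = sum(diffs)  # accumulated first-order differences
    return min_max(deviation + trend, score_min, score_max)


degree = matching_degree([1.0, 1.5, 2.0], template_median=1.5,
                         score_min=0.0, score_max=3.0)
```

Because the first-order differences are summed, the trend cumulative value equals the last representative value minus the first, so it captures the net rise or fall across the adjacent periods.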

[0048] Furthermore, the preset matching degree threshold is determined through statistical analysis of the matching degree distribution in historical user behavior data. For example, the lower quartile of the matching degree of high-risk samples is taken as the preset matching degree threshold. When the matching degree is higher than the preset matching degree threshold, the numerical difference between the matching degree and the preset matching degree threshold is calculated as a deviation parameter; when the matching degree is not higher than the preset matching degree threshold, a preset deviation stability value is extracted from the preset deviation library as a deviation parameter. The preset deviation library is generated statistically from the changes in evolution coefficients of historical users during low-relevance behavior phases and is periodically updated based on new user data. The deviation stability value ranges from 0.01 to 0.05.

[0049] After determining the deviation parameter, the deviation parameter is directly added to the original evolution coefficient, and then constrained to the [0,1] interval by the Sigmoid function to obtain the corrected evolution coefficient. For example, if the original evolution coefficient is 0.78 and the deviation parameter is +0.14, they are first added to obtain 0.92, and then the final corrected evolution coefficient is obtained by mapping through the Sigmoid function.
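
The deviation parameter rule of paragraph [0048] and the Sigmoid correction of paragraph [0049] can be sketched as below. The function names and the stability value of 0.03 (chosen from the stated 0.01 to 0.05 range) are assumptions.

```python
import math


def deviation_parameter(matching, threshold, stability_value=0.03):
    """Excess over the threshold, or a preset stability value otherwise."""
    if matching > threshold:
        return matching - threshold
    return stability_value


def corrected_coefficient(coefficient, deviation):
    """Add the deviation, then squash the sum with the Sigmoid function."""
    return 1.0 / (1.0 + math.exp(-(coefficient + deviation)))


# Worked example from the text: 0.78 + 0.14 = 0.92, then Sigmoid.
corrected = corrected_coefficient(0.78, 0.14)
```

With these inputs the Sigmoid of 0.92 is approximately 0.715, so the corrected evolution coefficient in the worked example is about 0.715.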

[0050] In step S15, based on the deviation parameter, historical behavior sequences are extracted from the behavior trend curve, and the consistency between the historical behavior sequences and the preset benchmark comparison sequence is checked to determine abnormal nodes. This includes: extracting from the behavior trend curve the abnormal values that fall outside the coverage range of the deviation parameter, and combining the abnormal values into a historical behavior sequence; discretizing the historical behavior sequence to generate a behavior feature vector group; comparing the behavior feature vector group with the preset benchmark comparison sequence to identify abnormal behavior segments; and analyzing the risk level data in each abnormal behavior segment and, if the risk level data conflicts with the state of the early warning record for the same period, extracting the time coordinates of the conflict.

[0051] It should be noted that, firstly, the behavioral trend curve is divided into continuously sampled time periods. For each time period, the average trend score within that period is calculated as a reference value, and the absolute difference between the trend score and the reference value for that period is calculated. Further, the absolute difference is standardized based on the standard deviation of the trend scores for that period to obtain the standardized deviation. If the standardized deviation exceeds a preset deviation parameter threshold, that period is marked as an outlier. Then, all the outliers are combined sequentially in chronological order to form a historical behavioral sequence for subsequent analysis.

[0052] Specifically, the eye posture and fixation duration within consecutive time segments of the historical behavior sequence are statistically processed to calculate a representative value for each time segment. For example, the head tilt angle is taken as the average value for that time segment, and the single fixation duration is taken as the average fixation duration for that time segment. Then, the representative value is mapped to a corresponding discrete level. For example, based on the user's individual parameters in S12, the head tilt angle can be divided into low (0–10 degrees), medium (11–20 degrees), and high (21 degrees and above), and the single fixation duration can be divided into short (0–3 seconds), medium (4–6 seconds), and long (7 seconds and above). The discrete posture and fixation features of each time segment are combined to form a behavior feature vector. The behavior feature vectors of multiple time segments are arranged in chronological order to obtain the behavior feature vector group.
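
The discretization in paragraph [0052] can be sketched as a pair of level functions. The function names are hypothetical, and treating the boundaries as "less than or equal to" for non-integer values is an assumption, since the text lists only integer ranges.

```python
def tilt_level(angle_deg):
    """Head tilt: low (up to 10), medium (up to 20), high (above 20)."""
    if angle_deg <= 10:
        return "low"
    if angle_deg <= 20:
        return "medium"
    return "high"


def fixation_level(duration_s):
    """Single fixation: short (up to 3 s), medium (up to 6 s), long."""
    if duration_s <= 3:
        return "short"
    if duration_s <= 6:
        return "medium"
    return "long"


def behavior_vector(mean_tilt_deg, mean_fixation_s):
    """Discrete behavior feature vector for one time segment."""
    return (tilt_level(mean_tilt_deg), fixation_level(mean_fixation_s))


# Chronologically ordered segments form the behavior feature vector group.
vector_group = [behavior_vector(8.0, 5.0), behavior_vector(25.0, 7.5)]
```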

[0053] The preset benchmark comparison sequence is derived from typical behavioral patterns of a large number of users with healthy eyesight. For example, in the standard sequence, the head tilt angle is consistently maintained below 10 degrees, and the fixation duration is mostly within 3 seconds. Specifically, the behavioral feature vector group is compared with the preset benchmark comparison sequence time point by time point and dimension by dimension. For each time point, the difference between the behavioral feature vector and the benchmark vector at the corresponding time point is taken in each dimension to obtain the feature deviation. The feature deviations of all time points are arranged in chronological order to form the deviation matrix, in which the rows represent time points and the columns represent the different behavioral feature dimensions. Subsequently, when the number of consecutive vectors whose deviation exceeds the preset deviation threshold reaches the preset continuous length, the corresponding time period is determined to be an abnormal behavioral segment.

[0054] The preset deviation threshold is obtained by statistically analyzing the behavioral characteristics of a large number of healthy users. Specifically, the mean and standard deviation of each feature dimension in a healthy state are calculated, and the standard deviation is multiplied by 1.5 to obtain the deviation threshold for that dimension. For example, the deviation threshold for the head tilt angle is 6 degrees and the deviation threshold for the fixation duration is 1.5 seconds, which distinguishes normal fluctuations from abnormal deviations. The preset continuous length is obtained by statistically analyzing the number of times healthy users exceed the deviation threshold within a continuous time period. For example, if the statistics show that healthy users exceed the deviation threshold for the head tilt angle or fixation duration 2 times on average, the preset continuous length is set to 3. That is, a time period is judged to be an abnormal behavior segment only when at least 3 consecutive deviations occur, thereby avoiding misjudgment due to single or short-term fluctuations.
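
A possible sketch of the abnormal segment detection in paragraphs [0053] and [0054] is given below. It assumes that a time point counts as deviating when any dimension exceeds its threshold, and that a run qualifies once it reaches the preset continuous length; the function name and sample data are hypothetical.

```python
def abnormal_segments(deviation_matrix, thresholds, min_run=3):
    """Return (start_index, end_index) pairs of abnormal behavior segments.

    deviation_matrix: rows are time points, columns are feature dimensions.
    thresholds: one deviation threshold per dimension.
    """
    segments = []
    start = None
    for i, row in enumerate(deviation_matrix):
        exceeds = any(abs(d) > t for d, t in zip(row, thresholds))
        if exceeds:
            if start is None:
                start = i  # a new run of deviating time points begins
        else:
            if start is not None and i - start >= min_run:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the matrix.
    if start is not None and len(deviation_matrix) - start >= min_run:
        segments.append((start, len(deviation_matrix) - 1))
    return segments


# Columns: head tilt deviation (degrees), fixation deviation (seconds).
matrix = [(0, 0), (7, 0), (8, 2), (0, 2), (1, 0), (9, 0)]
found = abnormal_segments(matrix, thresholds=(6, 1.5), min_run=3)
```

Here time points 1 to 3 form one segment, while the isolated deviation at the last time point is ignored as a short-term fluctuation.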

[0055] Specifically, risk level data is stored in fields of a behavioral feature vector group and corresponds one-to-one with each time point. Risk level data for each abnormal behavior segment is read sequentially by time and compared with warning records within that time period, including cases where the risk level is higher, lower, or equal to the warning level. If the risk level data does not match the warning record status, the start and end times of that time period are recorded, forming the conflict time coordinates.

[0056] Further, the conflict time coordinates are first listed in chronological order, then the time sampling points in the behavior trend curve are scanned, and the timestamp of each sampling point is matched with the conflict time coordinates. If the time of the sampling point is between the start and end time of a conflict segment, then the sampling point is marked as the abnormal node. By performing the above mapping sequentially on all conflict time periods, a complete set of abnormal nodes is generated on the behavior trend curve. For example, the conflict time coordinates of an abnormal segment are from 14:32 to 14:47, and the behavior trend curve is sampled once per minute. Each sampling point from 14:32 to 14:47 is checked sequentially, and it is found that they are all within the conflict time range, so all these sampling points are marked as abnormal nodes.
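
The mapping from conflict time coordinates to abnormal nodes in paragraph [0056] can be sketched as follows, with timestamps expressed as minutes since midnight. The function name and the sampling range are assumptions chosen to mirror the 14:32 to 14:47 example.

```python
def mark_abnormal_nodes(sample_times, conflicts):
    """Keep every sampling point that lies inside a conflict segment.

    sample_times: timestamps of the behavior trend curve samples.
    conflicts: list of (start, end) conflict time coordinates, inclusive.
    """
    ordered = sorted(conflicts)  # list conflicts in chronological order
    return [t for t in sample_times
            if any(start <= t <= end for start, end in ordered)]


# Curve sampled once per minute from 14:30 to 14:49.
samples = list(range(14 * 60 + 30, 14 * 60 + 50))
# One conflict segment from 14:32 to 14:47.
nodes = mark_abnormal_nodes(samples, [(14 * 60 + 32, 14 * 60 + 47)])
```

With one sample per minute and inclusive boundaries, the 14:32 to 14:47 segment yields 16 abnormal nodes.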

[0057] In step S16, based on the corrected evolution coefficient and the screen display features, the abnormal nodes are filtered to obtain effective attention points; multiple effective attention points are then weighted and fused to obtain a set of fused points; finally, a warning interval is determined based on the number and distribution of points in the set of fused points. This includes: calculating the risk distribution weight of each abnormal node based on the corrected evolution coefficient, and filtering the abnormal nodes according to the risk distribution weight to obtain a set of points to be verified; extracting the visual dwell time from the screen display features, mapping the visual dwell time onto the time axis of the set of points to be verified, and removing points whose visual dwell time is lower than a preset dwell threshold to obtain effective attention points; and, based on the corrected evolution coefficient and the visual dwell time, weighting and fusing the time coordinates of multiple effective attention points to obtain the set of fused points.
In one possible implementation, the corrected evolution coefficient corresponding to each abnormal node is obtained and min-max normalized to the [0,1] interval to obtain the risk distribution weight. The risk distribution weight is compared with a preset risk distribution weight threshold. Abnormal nodes below the threshold are removed, while abnormal nodes at or above the threshold are retained in the set of points to be verified. The preset risk distribution weight threshold is determined by subtracting the standard deviation from the mean of the corrected evolution coefficients for time periods with confirmed significant eye strain in historical samples. For example, if the mean is 0.14 and the standard deviation is 0.03, the preset risk distribution weight threshold is set to 0.11.

[0058] After generating the set of points to be verified, the start and end times and duration of each user gaze behavior are obtained from the screen display features and arranged in chronological order to construct a screen gaze timeline indexed by time. Further, the time coordinates of the points to be verified are mapped to the screen gaze timeline. Specifically, using the time coordinate of each point to be verified as a reference, it is determined whether it falls within the start and end time interval of any gaze behavior. If it does, the duration of that gaze behavior is taken as the visual dwell time of that point; if it does not, the visual dwell time is recorded as zero. Subsequently, the visual dwell time is compared with a preset dwell threshold. Points less than the preset dwell threshold are determined to be short-term gazes and are removed; points greater than or equal to the preset dwell threshold are determined to be valid attention points and are retained, forming the set of valid attention points. The preset dwell threshold is determined through historical sample statistics. In the historical samples, time periods with significant fatigue accumulation confirmed by ophthalmologists or long-term follow-up data are selected. The duration distribution of single-time fixation behavior of users in these time periods is statistically analyzed. The statistics show that fixations of less than 3 seconds are mostly rapid saccades or operation intervals, which have a weak correlation with risk evolution. Fixations of ≥3 seconds are significantly correlated with risk accumulation. Therefore, the preset dwell threshold is set to 3 seconds.
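
The two filtering stages of paragraphs [0057] and [0058] can be sketched as below. The function names, normalization bounds, and sample values are hypothetical; the 3-second dwell threshold follows the text.

```python
def min_max(value, lo, hi):
    """Map value into [0, 1] using historical bounds lo and hi."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def points_to_verify(nodes, coeff_bounds, weight_threshold):
    """nodes: (time, corrected evolution coefficient) per abnormal node."""
    return [t for t, c in nodes
            if min_max(c, *coeff_bounds) >= weight_threshold]


def visual_dwell(t, gazes):
    """Duration of the gaze (start, end) containing t, or 0 if none does."""
    for start, end in gazes:
        if start <= t <= end:
            return end - start
    return 0.0


def effective_points(times, gazes, dwell_threshold=3.0):
    """Drop points whose visual dwell time is below the dwell threshold."""
    return [t for t in times if visual_dwell(t, gazes) >= dwell_threshold]


nodes = [(10.0, 0.9), (20.0, 0.2), (30.0, 0.8)]
pending = points_to_verify(nodes, (0.0, 1.0), weight_threshold=0.5)
kept = effective_points(pending, gazes=[(8.0, 13.0), (29.0, 31.0)])
```

In this example the node at 20.0 is removed by the risk distribution weight, and the node at 30.0 is removed because its gaze lasts only 2 seconds.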

[0059] Specifically, effective attention points are sorted by time, and a preset time window is introduced as a fusion condition. The preset time window is determined through historical sample statistics, and its length is the mean plus standard deviation of the time intervals between adjacent effective attention points. For example, if the mean is 2 minutes and the standard deviation is 1 minute, the window is set to 3 minutes. Points with time intervals shorter than the preset time window are grouped into the same fusion candidate group. A fusion weight is calculated for each point in the candidate group. The corrected evolution coefficient and the visual dwell time are respectively subjected to max-min normalization, then multiplied by the weight of the corrected evolution coefficient and the weight of the visual dwell time, and summed to obtain the fusion weight.

[0060] In this process, the weight of the corrected evolution coefficient and the weight of the visual dwell time sum to 1. A linear regression of fatigue accumulation on the risk evolution coefficient and fixation duration in historical samples is performed to obtain the correlation coefficients, and each weight is calculated as the proportion of its correlation coefficient in the sum of the correlation coefficients. After the fusion weights are obtained, the time coordinates of the points within each candidate group are weighted and summed, then divided by the sum of the fusion weights to obtain the time coordinate of the fused point for that group. This process is repeated for all candidate groups that meet the fusion conditions to obtain the set of fused points.
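
The grouping and weighted fusion of paragraphs [0059] and [0060] can be sketched as follows. The function names, the equal weights, and the normalization bounds are assumptions, as is the choice to treat a point with no close neighbor as its own single-point group.

```python
def min_max(value, lo, hi):
    """Map value into [0, 1] using historical bounds lo and hi."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def fuse_points(points, window, w_coeff, w_dwell, coeff_bounds, dwell_bounds):
    """points: (time, corrected coefficient, visual dwell time) tuples."""
    groups = []
    for p in sorted(points):
        # Join the current group when the gap to its last point is short.
        if groups and p[0] - groups[-1][-1][0] < window:
            groups[-1].append(p)
        else:
            groups.append([p])
    fused = []
    for group in groups:
        weights = [w_coeff * min_max(c, *coeff_bounds)
                   + w_dwell * min_max(d, *dwell_bounds)
                   for _, c, d in group]
        total = sum(weights)
        if total == 0:
            # Fall back to a plain mean when every weight is zero.
            fused.append(sum(t for t, _, _ in group) / len(group))
        else:
            fused.append(sum(w * t for w, (t, _, _) in zip(weights, group))
                         / total)
    return fused


pts = [(0.0, 0.5, 3.0), (2.0, 0.5, 3.0), (10.0, 1.0, 6.0)]
fused_times = fuse_points(pts, window=3.0, w_coeff=0.5, w_dwell=0.5,
                          coeff_bounds=(0.0, 1.0), dwell_bounds=(0.0, 6.0))
```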

[0061] Specifically, the number of fused points in the set of fused points is compared with a preset threshold. This preset threshold is determined through historical statistics: the average number of fused points within a large user sample is computed for the time periods in which the early warning strategy was confirmed to require adjustment, and the preset threshold is then set slightly higher than this average. For example, if statistics show that the number of fused points during the intervention-required time periods is concentrated between 7 and 9, the preset threshold can be set to 8.

[0062] It should be noted that when the number of fused points exceeds the preset threshold, the warning interval is constructed. When the number of fused points does not exceed the preset threshold, the fused point set is sorted by time coordinate, and the time difference between adjacent fused points is calculated sequentially to obtain a time interval sequence. Based on this sequence, the mean and variance of the time intervals are calculated. When the mean of the time intervals is greater than the preset interval threshold and the variance of the time intervals is lower than the preset discrete threshold, the boundary of the stable interval is determined based on the time coordinates of the earliest and latest fused points in the fused point set. The preset interval threshold is determined based on the statistical results of the average time intervals of fused points in historical stable user samples; the preset discrete threshold is determined based on the statistical distribution of the corresponding time interval variances.
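
The interval statistics in paragraph [0062] can be sketched with the standard `statistics` module. The function name is hypothetical, and using the population variance rather than the sample variance is an assumption, since the text does not specify which is meant.

```python
from statistics import mean, pvariance


def stable_interval(fused_times, interval_threshold, discrete_threshold):
    """Return (earliest, latest) fused point times, or None if not stable."""
    times = sorted(fused_times)
    gaps = [b - a for a, b in zip(times, times[1:])]
    if not gaps:
        return None  # fewer than two fused points, no intervals to assess
    if mean(gaps) > interval_threshold and pvariance(gaps) < discrete_threshold:
        return (times[0], times[-1])
    return None


# Evenly spaced fused points: mean gap 10, variance 0.
interval = stable_interval([0, 10, 20, 30],
                           interval_threshold=5, discrete_threshold=1)
```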

[0063] In step S17, within the warning interval, trend analysis is performed on the superimposed features to generate the risk prediction result. This includes: determining the time boundary based on the warning interval, and extracting the real-time interaction records within the time boundary from the superimposed features; extracting the interaction frequency, single duration, and interaction interval from the real-time interaction records; converting the interaction frequency, single duration, and interaction interval into vectors, and fusing the vectors to obtain the visual fatigue feature vector; based on the behavioral data, obtaining pupil diameter change data for the time period synchronized with the real-time interaction records, and aligning the visual fatigue feature vector with the pupil diameter change data by timestamp and concatenating them to construct a behavioral load sequence; inputting the behavioral load sequence into a preset LSTM model to calculate the trend evolution and output the risk evolution sequence; and quantifying and numerically mapping the risk evolution sequence to obtain the risk prediction result.

[0064] It should be noted that the user interaction events are limited to user operations when the screen is unlocked and in the foreground, including screen opening, page swiping, and click operations, excluding system events involving screen lock, background processes, or unattended actions. For each user interaction event, the duration of each event is calculated based on the start and end times, and only events with a duration not less than a preset threshold are retained to form the final real-time interaction record.

[0065] After the real-time interaction records are obtained, the number of interactions, the duration of each interaction, and the interval between adjacent interactions are computed per unit time within the warning interval to obtain the interaction frequency, single duration, and interaction interval features. Each feature is then min-max normalized based on historical samples, and the features are combined into a vector in a predetermined order. A weighted summation is then used to generate the visual fatigue feature vector. The fusion weight of each feature is determined by its statistical contribution ratio to the risk evolution trend in historical user samples; the predetermined order is likewise set based on the contribution of each feature to the risk evolution trend in historical samples.

[0066] The visual fatigue feature vector and the pupil diameter change data are time-aligned and concatenated at a fixed sampling interval to form a multi-dimensional vector sequence, constructing the behavioral load sequence. This sequence includes features such as interaction frequency, single duration, interaction interval, and pupil diameter change, for subsequent trend evolution analysis. The fixed sampling interval is set to 30 seconds, determined based on historical user data statistics and the characteristics of visual fatigue accumulation. Specifically, statistical analysis is performed on the interaction behavior and pupil diameter change data of 500 users during a continuous month of high-sensitivity periods. The data is divided into windows of 20, 25, 30, 35, and 40 seconds. For each window, the mean and standard deviation of interaction frequency and pupil diameter change are calculated, along with the coefficient of variation of the mean. The analysis results show that the mean is most stable and the coefficient of variation is smallest when the window is 30 seconds. At a 95% confidence level, this window reliably reflects the user's accumulated eye load while balancing sequence length and computational efficiency; therefore, 30 seconds was selected as the fixed sampling interval.

[0067] The preset Long Short-Term Memory (LSTM) network model includes an input layer that receives multi-dimensional sequence features, three hidden layers containing 64, 32, and 16 neurons respectively, and an output layer that generates refractive error drift trend values at the corresponding time points. The LSTM model uses historical user eye behavior records and the corresponding refractive error measurements as supervised data, employs mean squared error as the loss function, and uses the Adam optimizer; the learning rate is tuned through grid search or cross-validation, with a typical value of 0.001. The model is trained until convergence. The risk evolution sequence is output by traversing the entire behavioral load sequence.

[0068] For example, during a high-risk period, such as 2:00 PM to 2:30 PM, a user performs multiple fixation actions lasting more than 8 seconds consecutively, with an average interval of only 22 seconds, while the pupil diameter continuously dilates by more than 0.4 mm during this time. When the behavioral load sequence of this period is input into an LSTM model, the trend sequence shows a significant increase after approximately 15 minutes, corresponding to a substantial increase in risk value. Conversely, during low-load periods, such as 8:00 AM to 8:30 AM, user interactions are fewer and shorter in duration, pupil fluctuations are stable, and the model output trend sequence remains flat, corresponding to a lower risk value.

[0069] Specifically, the risk evolution sequence is first segmented according to a preset time analysis window. The length of the preset time analysis window is determined based on the minimum physiological response interval required for a detectable change in refractive power to be produced by continuous eye stimulation. This response interval is obtained by statistical analysis of short-term refractive power change data in historical follow-up samples, for example, set to 5 minutes or 10 minutes.

[0070] Within each preset time analysis window, the start and end times of the risk evolution sequence within that window are used as boundaries. The refractive error drift trend values within that segment are extracted, and a linear fit is performed using the least squares method with time as the independent variable and the refractive error drift trend value as the dependent variable. The slope of the fitted line is the trend slope for that time period. This calculation is performed sequentially on adjacent time analysis windows to obtain a trend slope sequence. Subsequently, based on a preset risk mapping rule, each trend slope is converted into a risk value and a risk level, generating the risk prediction result for the corresponding time period. The preset risk mapping rule is established based on long-term follow-up data of historical users. Sample users are divided into high, medium, and low risk groups according to their actual myopia progression. The refractive error drift rate distribution of each group is statistically analyzed, and the trend slope within the window is mapped linearly to the corresponding risk value interval, ensuring that the risk value changes with the trend and that different risk levels are clearly distinguishable numerically.

[0071] For example, for high-risk users, the distribution of the refractive error drift rate over their historical follow-up period is first statistically analyzed, and its lower limit, 0.18 diopters/hour, is used as the high-risk threshold. When the trend slope exceeds 0.18 diopters/hour, the time period is determined to be high-risk, and the trend slope is linearly mapped to a risk value of 0.85–1.0. When the trend slope is between 0.08 and 0.18 diopters/hour, the time period is determined to be medium-risk, and the risk value is mapped to 0.45–0.80. When the trend slope is below 0.08 diopters/hour, the time period is determined to be low-risk, and the risk value is mapped to 0–0.45. This mapping result serves as the risk prediction result, guiding subsequent interventions or monitoring.
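The example thresholds can be expressed as a piecewise linear mapping, sketched below. The thresholds (0.08 and 0.18 diopters/hour) and the risk value intervals are those quoted above. Everything else is an assumption made for illustration: the names `linear_map` and `map_slope_to_risk`, the upper slope bound `SLOPE_CAP` at which the risk value reaches 1.0 (the text gives no such bound), the clamping of negative slopes to a risk value of 0, and the assignment of a slope of exactly 0.18 to the medium level. The intervals are kept as quoted, so risk values between 0.80 and 0.85 are never produced.

```python
from typing import Tuple

LOW_MAX = 0.08    # diopters/hour, from the example above
HIGH_MIN = 0.18   # diopters/hour, from the example above
SLOPE_CAP = 0.30  # hypothetical slope at which the risk value saturates at 1.0


def linear_map(x: float, x0: float, x1: float, y0: float, y1: float) -> float:
    """Map x from [x0, x1] linearly onto [y0, y1], clamping x to the range."""
    x = min(max(x, x0), x1)
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)


def map_slope_to_risk(slope: float) -> Tuple[float, str]:
    """Convert a trend slope (diopters/hour) into (risk value, risk level)."""
    if slope > HIGH_MIN:
        return linear_map(slope, HIGH_MIN, SLOPE_CAP, 0.85, 1.0), "high"
    if slope >= LOW_MAX:
        return linear_map(slope, LOW_MAX, HIGH_MIN, 0.45, 0.80), "medium"
    # Slopes below zero are clamped and yield a risk value of 0.
    return linear_map(slope, 0.0, LOW_MAX, 0.0, 0.45), "low"


risk_mid, level_mid = map_slope_to_risk(0.13)   # halfway through the medium band
risk_high, level_high = map_slope_to_risk(0.50)  # above the assumed cap
risk_low, level_low = map_slope_to_risk(0.0)
```

A slope of 0.13 diopters/hour lies halfway through the medium band and therefore maps to a risk value of about 0.625.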

[0072] In summary, this invention discloses a myopia risk assessment method based on multimodal sensing. By jointly analyzing multidimensional eye-use behavior characteristics and environmental characteristics, and combining dynamic correction of evolution coefficients and abnormal node screening strategies, it achieves a detailed characterization of the user's actual eye load and a continuous assessment of the risk evolution trend, thereby effectively reducing the interference of environmental fluctuations and occasional abnormal behaviors on the assessment results.

[0073] Referring to Figure 2, the second embodiment of the present invention provides a myopia risk assessment system based on multimodal sensing, comprising:
a data acquisition module, used to acquire behavioral data and environmental data collected by the terminal, parse the behavioral data and extract features to obtain a behavioral feature set, and parse the environmental data and extract features to obtain illumination features and screen display features;
a behavioral trend curve analysis module, used to perform trend analysis on the behavioral feature set, generate a behavioral trend curve reflecting changes in eye-use behavior, obtain user individual parameters corresponding to the behavioral trend curve, calculate a risk score based on the behavioral trend curve and the user individual parameters, and determine the risk evolution type based on the risk score;
an interactive feature processing module, used to fuse the illumination features and the screen display features to obtain an evolution coefficient when the risk score corresponding to the risk evolution type exceeds a preset risk threshold;
a superimposed feature generation module, used to generate superimposed features based on the behavioral feature set and the evolution coefficient, generate a deviation parameter according to the matching relationship between the superimposed features and the evolution coefficient, and update the evolution coefficient based on the deviation parameter to obtain a corrected evolution coefficient;
an abnormal node module, used to perform consistency verification on the historical behavior sequence in the behavioral trend curve based on the deviation parameter and determine abnormal nodes;
an early warning interval module, used to screen and remove the abnormal nodes based on the corrected evolution coefficient and the screen display features to obtain effective attention points, perform weighted fusion on multiple effective attention points to obtain a fused point set, and determine the warning interval according to the number and distribution of the fused point set;
a risk prediction module, used to perform trend analysis on the superimposed features within the warning interval to generate a risk prediction result.

[0074] It should be noted that the myopia risk assessment system based on multimodal sensing provided in this embodiment is used to execute all the process steps of the myopia risk assessment method based on multimodal sensing in the above embodiment. The working principles and beneficial effects of the system and the method correspond one-to-one, so they are not described again here.

[0075] It should be noted that the system embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules can be selected to achieve the purpose of this embodiment according to actual needs. Furthermore, in the accompanying drawings of the system embodiments provided by this invention, the connection relationships between modules indicate that they have communication connections, which can be specifically implemented as one or more communication buses or signal lines. Those skilled in the art can understand and implement this without any creative effort.

[0076] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit its scope of protection. In particular, any modifications, equivalent substitutions, improvements, etc., made by those skilled in the art within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A myopia risk assessment method based on multimodal sensing, characterized in that it comprises:
acquiring behavioral data and environmental data collected by the terminal, parsing the behavioral data and extracting features to obtain a behavioral feature set, and parsing the environmental data and extracting features to obtain illumination features and screen display features;
performing trend analysis on the behavioral feature set to generate a behavioral trend curve reflecting changes in eye-use behavior, obtaining user individual parameters corresponding to the behavioral trend curve, calculating a risk score based on the behavioral trend curve and the user individual parameters, and determining the risk evolution type based on the risk score;
when the risk score corresponding to the risk evolution type exceeds a preset risk threshold, fusing the illumination features and the screen display features to obtain an evolution coefficient;
generating superimposed features based on the behavioral feature set and the evolution coefficient, generating a deviation parameter according to the matching relationship between the superimposed features and the evolution coefficient, and updating the evolution coefficient based on the deviation parameter to obtain a corrected evolution coefficient;
performing consistency verification on the historical behavior sequence in the behavioral trend curve based on the deviation parameter to determine abnormal nodes;
screening and removing the abnormal nodes based on the corrected evolution coefficient and the screen display features to obtain effective attention points, performing weighted fusion on multiple effective attention points to obtain a fused point set, and determining the warning interval according to the number and distribution of the fused point set;
performing trend analysis on the superimposed features within the warning interval to generate a risk prediction result.

2. The myopia risk assessment method based on multimodal sensing according to claim 1, characterized in that the step of acquiring behavioral data and environmental data collected by the terminal, parsing the behavioral data and extracting features to obtain a behavioral feature set, and parsing the environmental data and extracting features to obtain illumination features and screen display features comprises:
acquiring the behavioral data and environmental data collected by the terminal, wherein the behavioral data includes a continuous video frame sequence of the user's facial information, and the environmental data includes ambient light sensing data and screen display status data;
parsing the continuous video frame sequence in the behavioral data to extract behavioral information reflecting changes in eye-use duration, and generating the behavioral feature set based on the behavioral information;
extracting, from the ambient light sensing data, illumination information reflecting changes in ambient brightness, and generating the illumination features based on the illumination information;
parsing screen brightness change information and display duration information from the screen display status data, and generating the screen display features based on the screen brightness change information and the display duration information.

3. The myopia risk assessment method based on multimodal sensing according to claim 1, characterized in that the step of performing trend analysis on the behavioral feature set to generate a behavioral trend curve reflecting changes in eye-use behavior, obtaining user individual parameters corresponding to the behavioral trend curve, calculating a risk score based on the behavioral trend curve and the user individual parameters, and determining the risk evolution type based on the risk score comprises:
inputting the behavioral feature set into a preset random forest model and outputting a behavioral state score;
smoothing the behavioral state score by a sliding window averaging method to generate the behavioral trend curve reflecting changes in eye-use behavior;
obtaining the user individual parameters corresponding to the behavioral trend curve from a preset user database, and concatenating the behavioral trend curve and the user individual parameters using a preset support vector machine model to form a multidimensional trend feature vector;
calculating the risk score based on the multidimensional trend feature vector, and comparing the risk score with a preset risk evolution threshold to obtain the user's risk evolution type.

4. The myopia risk assessment method based on multimodal sensing according to claim 3, characterized in that the step of fusing the illumination features and the screen display features to obtain an evolution coefficient comprises:
obtaining the ambient light intensity sequence corresponding to the illumination features and the screen brightness change sequence corresponding to the screen display features;
mapping the ambient light intensity sequence and the screen brightness change sequence to a preset visual comfort interaction matrix to obtain a photoelectric interaction difference value and a visual interference intensity index;
fusing the photoelectric interaction difference value with the visual interference intensity index to obtain an environmental pressure load value;
extracting pupil response features corresponding to the risk evolution type from the behavioral data, and determining dynamic weights based on the pupil response features;
performing a weighted calculation on the environmental pressure load value and the risk score based on the dynamic weights to obtain the evolution coefficient.

5. The myopia risk assessment method based on multimodal sensing according to claim 1, characterized in that the step of generating superimposed features based on the behavioral feature set and the evolution coefficient, generating a deviation parameter according to the matching relationship between the superimposed features and the evolution coefficient, and updating the evolution coefficient based on the deviation parameter to obtain a corrected evolution coefficient comprises:
retrieving, based on the evolution coefficient, the corresponding behavioral feature sequence from the behavioral feature set, and parsing the behavioral feature sequence to obtain an eye posture vector sequence and an eye-use duration sequence;
combining the eye posture vector sequence and the eye-use duration sequence to obtain the superimposed features;
calculating a matching degree based on the matching relationship between the superimposed features and the evolution coefficient;
when the matching degree is higher than a preset matching degree threshold, calculating the deviation parameter based on the degree to which the matching degree deviates from the preset matching degree threshold;
when the matching degree is not higher than the preset matching degree threshold, extracting a preset deviation stability value from a preset deviation library as the deviation parameter;
numerically compensating the evolution coefficient based on the deviation parameter to obtain the corrected evolution coefficient.

6. The myopia risk assessment method based on multimodal sensing according to claim 2, characterized in that the step of extracting the historical behavior sequence from the behavioral trend curve based on the deviation parameter, performing a consistency check between the historical behavior sequence and a preset benchmark comparison sequence, and identifying abnormal nodes comprises:
extracting, from the behavioral trend curve, abnormal values that fall outside the coverage range of the deviation parameter, and combining the abnormal values into the historical behavior sequence;
discretizing the historical behavior sequence to generate a behavioral feature vector group;
comparing the behavioral feature vector group with the preset benchmark comparison sequence to identify abnormal behavior segments;
analyzing the risk level data in the abnormal behavior segments, and, if the risk level data conflicts with the status of the early warning record for the same period, extracting the time coordinate of the conflict;
mapping the conflict time coordinate onto the coordinate axis of the behavioral trend curve to identify the abnormal nodes.

7. The myopia risk assessment method based on multimodal sensing according to claim 1, characterized in that the step of screening and removing the abnormal nodes based on the corrected evolution coefficient and the screen display features to obtain effective attention points, performing weighted fusion on multiple effective attention points to obtain a fused point set, and determining the warning interval according to the number and distribution of the fused point set comprises:
calculating the risk distribution weight of the abnormal nodes based on the corrected evolution coefficient, and filtering the abnormal nodes according to the risk distribution weight to obtain a set of points to be verified;
extracting the visual dwell time from the screen display features, mapping the visual dwell time to the time axis of the set of points to be verified, and removing points whose visual dwell time is lower than a preset dwell threshold to obtain the effective attention points;
performing weighted fusion on the time coordinates of multiple effective attention points based on the corrected evolution coefficient and the visual dwell time to obtain the fused point set;
when the number of points in the fused point set exceeds a preset fusion threshold, constructing the warning interval based on the distribution span of the fused point set on the time axis.

8. The myopia risk assessment method based on multimodal sensing according to claim 1, characterized in that the step of performing trend analysis on the superimposed features within the warning interval to generate a risk prediction result comprises:
determining a time boundary based on the warning interval, and extracting the real-time interaction records within the time boundary from the superimposed features;
extracting the interaction frequency, single duration, and interaction interval from the real-time interaction records;
converting the interaction frequency, single duration, and interaction interval into vectors, and fusing the vectors to obtain a visual fatigue feature vector;
obtaining, based on the behavioral data, pupil diameter change data for the time period synchronized with the real-time interaction records, and splicing the visual fatigue feature vector with the pupil diameter change data by timestamp to construct a behavioral load sequence;
inputting the behavioral load sequence into a preset LSTM model to calculate the trend evolution and output a risk evolution sequence;
quantifying and numerically mapping the risk evolution sequence to obtain the risk prediction result.

9. A myopia risk assessment system based on multimodal sensing, characterized in that it comprises:
a data acquisition module, used to acquire behavioral data and environmental data collected by the terminal, parse the behavioral data and extract features to obtain a behavioral feature set, and parse the environmental data and extract features to obtain illumination features and screen display features;
a behavioral trend curve analysis module, used to perform trend analysis on the behavioral feature set, generate a behavioral trend curve reflecting changes in eye-use behavior, obtain user individual parameters corresponding to the behavioral trend curve, calculate a risk score based on the behavioral trend curve and the user individual parameters, and determine the risk evolution type based on the risk score;
an interactive feature processing module, used to fuse the illumination features and the screen display features to obtain an evolution coefficient when the risk score corresponding to the risk evolution type exceeds a preset risk threshold;
a superimposed feature generation module, used to generate superimposed features based on the behavioral feature set and the evolution coefficient, generate a deviation parameter according to the matching relationship between the superimposed features and the evolution coefficient, and update the evolution coefficient based on the deviation parameter to obtain a corrected evolution coefficient;
an abnormal node module, used to perform consistency verification on the historical behavior sequence in the behavioral trend curve based on the deviation parameter and determine abnormal nodes;
an early warning interval module, used to screen and remove the abnormal nodes based on the corrected evolution coefficient and the screen display features to obtain effective attention points, perform weighted fusion on multiple effective attention points to obtain a fused point set, and determine the warning interval according to the number and distribution of the fused point set;
a risk prediction module, used to perform trend analysis on the superimposed features within the warning interval to generate a risk prediction result.