A fast and accurate automated vision detection method and system
By combining static gesture recognition and a Naive Bayes classifier, the problems of hygiene, cost, and efficiency in vision testing are solved, enabling fast and accurate home vision testing, which is suitable for multiple application scenarios.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- FUJIAN COLLEGE OF BIOTECHNOLOGY
- Filing Date
- 2024-05-27
- Publication Date
- 2026-06-19
AI Technical Summary
Existing vision testing methods suffer from hygiene issues related to contact testing, high equipment costs, low testing efficiency, and unstable accuracy, making it difficult to meet the needs of large-scale and home applications.
By employing static gesture recognition technology combined with a Naive Bayes classifier, the system captures gesture images of the tested user through a camera, performs geometric modeling analysis using the MediaPipe Hands module, and combines visual target display and reaction time to quickly determine the visual threshold, avoiding contact detection and reducing the number of visual target tests.
It enables rapid and accurate vision testing, reduces equipment costs, eliminates the risk of cross-infection, and improves testing efficiency and accuracy, making it suitable for home and multi-scenario applications.
Figure CN118557134B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of automated vision testing technology, and in particular to a rapid and accurate automated vision testing method and system.
Background Technology
[0002] Currently, vision testing methods mainly include wearable technology based on virtual reality, optical sensor devices based on refraction, manual testing based on eye charts, and human-computer interaction testing. Wearable technology based on virtual reality: This method utilizes virtual reality technology, allowing users to immerse themselves in a virtual visual environment through wearable devices such as head-mounted displays or glasses for vision testing. These devices are typically equipped with built-in screens and sensors that can display various visual stimuli and record the user's reactions. Although this technology offers high immersion and interactivity, the equipment is expensive and poses health risks such as cross-infection due to contact testing. Optical sensor devices based on refraction: This method uses optical sensor devices, such as automated refractometers, to measure the eye's refraction and refractive power and to assess vision accordingly. These devices are typically used by specialized medical institutions and offer high accuracy and reliability. However, they are usually expensive and require professional operators.
[0003] Manual vision testing based on eye charts: This is a traditional vision testing method, following the national standard GB/T 11533-2011, and employing a line-by-line scanning testing process. It starts with the largest optotypes and gradually decreases in size until the test subject can no longer correctly identify most optotypes of a specific size. Although simple to operate, it requires human interaction, increasing labor costs. In large-scale vision testing, it is less efficient and prone to errors due to tester fatigue. Human-computer interaction testing based on eye charts: This utilizes human-computer interaction technologies such as voice recognition and gesture recognition, allowing test subjects to interact with the computer through voice and gesture commands to complete the vision test. This non-contact method enables self-service and automated testing, reducing labor costs and improving testing efficiency and convenience. However, it still uses a line-by-line scanning testing process and requires extensive testing across all vision levels, so there is still room for improvement in testing efficiency.
[0004] In summary, the existing technology has the following problems that need to be solved:
[0005] 1. Hygiene issues of contact testing: Some existing vision testing methods use contact testing, which poses hygiene problems and may lead to health risks such as cross-infection, which is not conducive to the realization of large-scale testing.
[0006] 2. High cost and equipment dependence: Some vision testing methods, such as wearable technology based on virtual reality and optical sensor devices based on refraction, are costly and depend on specialized equipment, which limits their practicality and universality and restricts their adoption in scenarios such as the home.
[0007] 3. The detection efficiency is still not high enough: Traditional human-interactive vision testing methods are time-consuming and require a lot of manpower, resulting in low efficiency. Although vision testing based on human-computer interaction technology is more efficient than traditional human-interactive methods, it still uses a line-by-line scanning testing process, so there is still room for improvement in detection efficiency.
[0008] 4. Insufficient stability of detection accuracy: Vision tests based on human-computer interaction technology are significantly affected by external interference in terms of accuracy. For example, speech recognition technology is easily affected by background noise, accents, and changes in speech rate; dynamic gesture recognition may be affected by environmental interference during the start and end points of gestures and the target tracking process; traditional static gesture recognition methods based on image processing may be affected by factors such as skin color and lighting. Therefore, the stability of detection accuracy needs to be further improved.
Summary of the Invention
[0009] The purpose of this invention is to provide a fast and accurate automated vision testing method and system, which aims to improve the detection efficiency and accuracy of automated vision testing, without the need for contact testing and without additional hardware cost investment.
[0010] The technical solution adopted by this invention to solve its technical problem is as follows:
[0011] On the one hand, the present invention provides a rapid and accurate automated vision detection method, comprising the following steps:
[0012] The process control module is started, and the process control module sends the first instruction to the auxiliary function module to activate the auxiliary function module. The auxiliary function module drives the corresponding prompting device to prompt the user under test, and after setting the visual acuity level to be tested, it enters the visual acuity test loop process.
[0013] The process control module sends a second instruction to the visual target display module. After receiving the second instruction, the visual target display module drives the display screen to display visual targets of the corresponding size according to the set visual acuity level to be tested, and provides visual target stimulation to the tested user.
[0014] The process control module sends a third instruction to the gesture recognition module. After receiving the third instruction, the gesture recognition module drives the camera to capture gesture images.
[0015] After receiving prompts and visual stimuli, the tested user makes a corresponding static gesture;
[0016] The gesture recognition module uses a single visual target gesture recognition algorithm to detect the user's reaction time to visual target stimuli and recognize the acquired gesture images, generating detection and recognition results and sending them to the process control module.
[0017] After receiving the detection and recognition results, the process control module calls the vision threshold determination mechanism to determine whether to continue the vision detection loop. If the loop continues, the vision detection loop is repeated. If the loop ends, the vision threshold is determined.
[0018] Under the visual acuity level corresponding to the determined visual acuity threshold, multiple optotype tests are performed to determine the optotype level. The determined optotype level is the actual visual acuity value of the tested user.
[0019] As a further optimization, before the process control module is started, the method further includes the following steps:
[0020] The user's eyes and the target displayed on the screen are at the same horizontal plane, and the horizontal distance between the user's eyes and the target is set as the detection distance;
[0021] The camera is placed directly facing the user's gesture, and the horizontal distance between the camera and the user's hand is set as the camera's working distance. When the camera captures gesture images, only one hand is allowed to appear in the camera's field of view.
[0022] As a further optimization, after the auxiliary function module is activated, a parameter input interface is entered, which waits for the tested user to input the necessary parameters, including whether the user is nearsighted, the most recent vision test value before the current test, and the test distance.
[0023] As a further optimization, after receiving the instruction, the optotype display module drives the display screen to display an optotype of the corresponding size according to the set visual acuity level to be tested. The direction of the optotype is randomly generated during the display process, and the size of the optotype is set according to the detection distance.
[0024] As a further optimization, the auxiliary function module drives the corresponding prompting device to provide prompts to the user under inspection, including:
[0025] The auxiliary function module drives the voice prompt device to provide voice prompts to the user under test, and / or the auxiliary function module drives the text prompt device to provide text prompts to the user under test.
[0026] As a further optimization, the gesture recognition module calls a gesture recognition algorithm for a single visual target to detect the user's reaction time to the visual target stimulus and recognizes the acquired gesture images, including:
[0027] Define static gesture rules;
[0028] Identify static gestures from a single gesture image;
[0029] Detecting the reaction time of the tested user to visual target stimuli during gesture recognition;
[0030] Recognize gestures for individual visual targets;
[0031] Evaluate the accuracy of static gesture recognition and vision testing.
[0032] As a further optimization, a vision threshold determination mechanism based on a Naive Bayes classifier is adopted to determine the vision threshold.
[0033] As a further optimization, the vision threshold determination mechanism based on the Naive Bayes classifier includes the following steps:
[0034] The gesture recognition module is executed, and the output results include a feature vector containing parameters of the test user's reaction time to visual target stimuli and the gesture recognition results.
[0035] Update the boundary difference of the visual threshold;
[0036] Determine whether the boundary difference of the updated vision threshold is equal to the spacing of the set vision threshold. If yes, output the vision threshold and end. Otherwise, determine whether the gesture recognition result is equal to 1. If it is equal, execute the Naive Bayes classifier, output the current estimated vision level, and calculate the next vision level to be tested using the summary formula. If it is not equal, directly calculate the next vision level to be tested using the summary formula.
[0037] After calculating the next visual acuity level to be tested using the summarized formula, the process returns to the gesture recognition module until the boundary difference of the updated visual acuity threshold equals the spacing of the set visual acuity threshold.
[0038] On the other hand, the present invention also provides a fast and accurate automated vision testing system, applied to the aforementioned fast and accurate automated vision testing method, comprising:
[0039] The auxiliary function module is used to drive the corresponding prompting device to prompt the user after the auxiliary function module is activated, and to enter the vision test cycle after setting the vision level to be tested.
[0040] The optotype display module is used to drive the display screen to display optotypes of corresponding size according to the set visual acuity level to be tested after receiving the second instruction, so as to stimulate the user with optotypes.
[0041] The gesture recognition module is used to drive the camera to capture gesture images after receiving a third instruction. It is also used to detect the user's reaction time to the visual stimulus after the user receives a prompt and visual stimulus and makes a corresponding static gesture. The module calls the gesture recognition algorithm of a single visual stimulus to recognize the captured gesture image, generate detection and recognition results and send them to the process control module.
[0042] The process control module is used to send a first instruction to the auxiliary function module after the process control module is started, thereby activating the auxiliary function module; and to send a second instruction to the visual target display module; and to send a third instruction to the gesture recognition module.
[0043] After receiving the detection and recognition results sent from the gesture recognition module, it calls the vision threshold determination mechanism to determine whether to continue the vision detection loop. If the loop continues, the vision detection loop is repeated. If the loop ends, the vision threshold is determined. Under the vision level corresponding to the determined vision threshold, multiple optotype detections are performed to determine the optotype level. The determined optotype level is the actual vision value of the tested user.
[0044] The beneficial effects of this invention include:
[0045] (1) Breaking through the bottleneck of detection efficiency: By utilizing the reaction time of the user's gesture recognition and other key data obtained during the interaction, a vision threshold determination mechanism based on Naive Bayes classifier is adopted, which avoids the time-consuming traditional line-by-line detection method, quickly determines the vision threshold, reduces the number of optotype tests, significantly shortens the test time, and improves the user experience.
[0046] (2) Improve detection accuracy: The recognition algorithm uses the MediaPipe Hands module to obtain precise hand landmarks and performs geometric modeling analysis of gestures in vision test scenarios, providing high and stable vision detection accuracy.
[0047] (3) Achieve low-cost vision testing: This system only requires a home computer equipped with a camera to achieve self-service home vision testing without human intervention, and is easy to promote on a large scale.
[0048] (4) Achieve non-contact detection: Static gesture interaction is non-contact, avoiding contact between the tested user and the equipment, eliminating the health and safety risks of cross-infection; at the same time, non-contact measurement can be used in automated testing environments, promoting the use of this system in multiple scenarios such as schools, communities, and medical institutions.
Attached Figure Description
[0049] Figure 1 is a flowchart of a rapid and accurate automated vision testing method according to Embodiment 1 of the present invention;
[0050] Figure 2 is a schematic diagram of the framework and process of vision testing in Embodiment 3 of the present invention;
[0051] Figure 3 is a schematic diagram of the static gestures used in Embodiment 3 of the present invention;
[0052] Figure 4 is a schematic diagram showing the index and distribution of the 21 joint points of the hand in Embodiment 3 of the present invention;
[0053] Figure 5 is a flowchart of the single visual target gesture recognition process in Embodiment 3 of the present invention;
[0054] Figure 6 is a Bland-Altman consistency evaluation chart for the two measurement methods in Embodiment 3 of the present invention;
[0055] Figure 7 is a flowchart illustrating the visual threshold determination mechanism in Embodiment 3 of the present invention.
Detailed Implementation
[0056] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of the present invention, and not all embodiments. The components of the embodiments of the present invention described and shown in the accompanying drawings can generally be arranged and designed in various different configurations.
[0057] Example 1
[0058] This embodiment provides a fast and accurate automated vision testing method, the flowchart of which is shown in Figure 1. The method includes the following steps:
[0059] S1. The process control module is started and sends a first instruction to the auxiliary function module, thereby activating it. The auxiliary function module drives the corresponding prompting device to provide prompts to the user being tested. After the visual acuity level to be tested is set, the visual acuity testing loop is entered. In this embodiment, after the auxiliary function module is activated, a parameter input interface is entered, which waits for the user to input the necessary parameters, such as whether they are nearsighted, the most recent visual acuity test value before the current test, and the test distance. In this embodiment, the visual acuity level to be tested can be set to 4.0.
[0060] S2. The process control module sends a second instruction to the optotype display module. Upon receiving the second instruction, the optotype display module drives the display screen to show an optotype of the corresponding size according to the set visual acuity level to be tested, thus stimulating the user. The optotype display module is responsible for controlling the display of optotype E to ensure accurate presentation. The direction of the optotype is randomly generated during the display process to prevent any pre-memorization by the user. In addition, the size of the optotype is set according to the detection distance and can be adjusted flexibly, thereby reducing the limitation that distance places on the detection environment.
[0061] S3. The process control module sends a third instruction to the gesture recognition module. After receiving the third instruction, the gesture recognition module drives the camera to capture gesture images.
[0062] S4. After receiving prompts and visual stimuli, the user being tested makes the corresponding static gesture.
[0063] S5. The gesture recognition module calls the gesture recognition algorithm of a single visual target to detect the reaction time of the tested user to the visual target stimulus and recognizes the collected gesture image, generates detection and recognition results and sends them to the process control module.
[0064] S6. After receiving the detection and recognition results, the process control module calls the vision threshold determination mechanism to determine whether to continue the vision detection loop. If the loop continues, the vision detection loop is repeated. If the loop ends, the vision threshold is determined.
[0065] S7. Under the visual acuity level corresponding to the determined visual acuity threshold, perform multiple optotype tests to determine the optotype level. The determined optotype level is the actual visual acuity value of the tested user.
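Step S2 sizes the optotype from the detection distance and randomizes its direction. A minimal Python sketch of that step follows. It assumes the 5-grade logarithmic notation used by GB/T 11533 (level L = 5 - lg α, where α is the stroke visual angle in arc minutes, and the E spans five stroke widths), and it assumes a display pixel density expressed in pixels per inch. The function names are hypothetical and are not taken from the patent.

```python
import math
import random

MM_PER_INCH = 25.4
DIRECTIONS = ("Up", "Down", "Left", "Right")


def optotype_height_mm(level: float, distance_m: float) -> float:
    """Side length of the E optotype for a 5-grade level at a given distance."""
    # Assumed notation: level = 5 - log10(alpha), alpha in arc minutes.
    alpha_arcmin = 10 ** (5.0 - level)
    # The full E is taken to span five stroke widths.
    full_angle_rad = math.radians(5.0 * alpha_arcmin / 60.0)
    return distance_m * 1000.0 * math.tan(full_angle_rad)


def optotype_height_px(level: float, distance_m: float,
                       pixels_per_inch: float = 96.0) -> int:
    """Convert the physical size to screen pixels (pixel density is assumed PPI)."""
    inches = optotype_height_mm(level, distance_m) / MM_PER_INCH
    return round(inches * pixels_per_inch)


def random_direction(rng: random.Random) -> str:
    """Pick the opening direction at random so it cannot be memorized."""
    return rng.choice(DIRECTIONS)
```

Under these assumptions a level 4.0 optotype viewed from 5 meters is about 72.7 mm tall, and halving the distance roughly halves the required size, which is what lets the test run in smaller rooms.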
[0066] In actual testing, the participation of the user being tested is required. Therefore, before starting the process control module, the following steps are also necessary:
[0067] The user's eyes and the target displayed on the screen are at the same horizontal plane, and the horizontal distance between the user's eyes and the target is set as the detection distance;
[0068] The camera is placed directly facing the user's gesture, and the horizontal distance between the camera and the user's hand is set as the camera's working distance. When the camera captures gesture images, only one hand is allowed to appear in the camera's field of view.
[0069] It should be noted that the auxiliary function module is designed to support the vision testing process. Its functions improve the convenience and user-friendliness of self-service and automated testing, thereby enhancing the overall user experience. Accordingly, the auxiliary function module driving the corresponding prompting device to provide prompts to the user under test may include:
[0070] The auxiliary function module drives the voice prompt device to provide voice prompts to the user under test, and / or the auxiliary function module drives the text prompt device to provide text prompts to the user under test.
[0071] It should be noted that, in this embodiment, the gesture recognition module calls a gesture recognition algorithm for a single visual target to detect the user's reaction time to the visual target stimulus and recognizes the acquired gesture image, which may include:
[0072] Define static gesture rules;
[0073] Identify static gestures from a single gesture image;
[0074] Detecting the reaction time of the tested user to visual target stimuli during gesture recognition;
[0075] Recognize gestures for individual visual targets;
[0076] Evaluate the accuracy of static gesture recognition and vision testing.
[0077] Preferably, this embodiment uses a vision threshold determination mechanism based on a Naive Bayes classifier to determine the vision threshold. This determination mechanism may include the following steps:
[0078] The gesture recognition module is executed, and the output results include a feature vector containing parameters of the test user's reaction time to visual target stimuli and the gesture recognition results.
[0079] Update the boundary difference of the visual threshold;
[0080] Determine whether the boundary difference of the updated vision threshold is equal to the spacing of the set vision threshold. If yes, output the vision threshold and end. Otherwise, determine whether the gesture recognition result is equal to 1. If it is equal, execute the Naive Bayes classifier, output the current estimated vision level, and calculate the next vision level to be tested using the summary formula. If it is not equal, directly calculate the next vision level to be tested using the summary formula.
[0081] After calculating the next visual acuity level to be tested using the summarized formula, the process returns to the gesture recognition module until the boundary difference of the updated visual acuity threshold equals the spacing of the set visual acuity threshold.
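The loop in the steps above can be sketched in Python as follows. The formula used to pick the next level is not given in this part of the text, so the sketch substitutes a plain bracketing search: it keeps the highest level answered correctly and the lowest level answered incorrectly, and stops when their difference equals the level spacing. Levels are stored in tenths (40 means 4.0) to avoid floating-point comparison issues. The callbacks `run_trial` and `estimate_level`, the upper bound of 53, and the clamping and midpoint rules are all assumptions made for illustration.

```python
from typing import Callable, Tuple

# (tested level in tenths, reaction time in seconds, myopia flag)
Features = Tuple[int, float, int]
SPACING = 1  # spacing between adjacent levels, in tenths


def determine_threshold(
    run_trial: Callable[[int], Tuple[Features, int]],
    estimate_level: Callable[[Features], int],
    start: int = 40,
    lowest: int = 40,
    highest: int = 53,  # assumed top of the chart
) -> Tuple[int, int]:
    """Return (threshold level, number of optotype trials)."""
    lower = lowest - SPACING   # sentinel: nothing passed yet
    upper = highest + SPACING  # sentinel: nothing failed yet
    level, trials = start, 0
    while upper - lower != SPACING:
        features, r_g = run_trial(level)
        trials += 1
        # Update the boundaries of the threshold interval.
        if r_g == 1:
            lower = level
        else:
            upper = level
        if upper - lower == SPACING:
            break
        if r_g == 1:
            # Correct answer: ask the classifier for an estimate and
            # keep it strictly inside the current interval.
            guess = estimate_level(features)
            level = min(max(guess, lower + SPACING), upper - SPACING)
        else:
            # Wrong answer: stand-in rule, test the midpoint next.
            level = (lower + upper) // 2
    return lower, trials
```

A returned level below `lowest` means that no level was passed. Because every trial tightens one boundary, the loop always terminates, and a good classifier estimate lets it skip several levels at once instead of scanning line by line.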
[0082] Example 2
[0083] Based on Example 1, this example provides a fast and accurate automated vision testing system, including:
[0084] The auxiliary function module is used to drive the corresponding prompting device to prompt the user after the auxiliary function module is activated, and to enter the vision test cycle after setting the vision level to be tested.
[0085] The optotype display module is used to drive the display screen to display optotypes of corresponding size according to the set visual acuity level to be tested after receiving the second instruction, so as to stimulate the user with optotypes.
[0086] The gesture recognition module is used to drive the camera to capture gesture images after receiving a third instruction. It is also used to detect the user's reaction time to the visual stimulus after the user receives a prompt and visual stimulus and makes a corresponding static gesture. The module calls the gesture recognition algorithm of a single visual stimulus to recognize the captured gesture image, generate detection and recognition results and send them to the process control module.
[0087] The process control module is used to send a first instruction to the auxiliary function module after the process control module is started, thereby activating the auxiliary function module; and to send a second instruction to the visual target display module; and to send a third instruction to the gesture recognition module.
[0088] After receiving the detection and recognition results sent from the gesture recognition module, it calls the vision threshold determination mechanism to determine whether to continue the vision detection loop. If the loop continues, the vision detection loop is repeated. If the loop ends, the vision threshold is determined. Under the vision level corresponding to the determined vision threshold, multiple optotype detections are performed to determine the optotype level. The determined optotype level is the actual vision value of the tested user.
[0089] The application environment and implementation principle of this embodiment are the same as those of Embodiment 1, so they will not be described again.
[0090] Example 3
[0091] Based on Examples 1 and 2, this example provides a fast and accurate automated vision testing system, as shown in Figure 2. The hardware components of this system include a computer host, monitor, camera, and speakers. The computer monitor has a resolution of 1920×1080, a pixel density of 96, and a brightness of 500 cd/m². The camera resolution is 2 to 4 megapixels. Before detection begins, the user's eye and the target on the display screen are on the same horizontal plane; the horizontal distance between them is the detection distance, which can be set from 2 to 5 meters. The camera is placed directly facing the user's gesture. The horizontal distance between the camera and the hand is the camera's working distance, which can be adjusted from 0.5 to 1.5 meters. To accurately recognize static gestures, only one hand can be visible in the camera's field of view at a time.
[0092] This embodiment uses the vision testing process in Embodiment 1 to perform vision testing. Specifically, when defining static gesture rules, the gesture recognition module in this embodiment uses static gestures to indicate the direction of the visual target, as shown in Figure 3. When the test begins but the visual target is not yet displayed on the screen, the user's hand is in a ready state, indicated by a clenched fist, as shown in Figure 3(a). Once the target appears, the user indicates the direction of the observed target by pointing with the index finger, as shown in Figures 3(b)-3(e). When the direction of the target cannot be clearly identified, the tested user should indicate this with an open palm, as shown in Figure 3(f), and avoid guessing the direction without certainty. After the gesture for a single visual target is recognized, the visual target is cleared, and the user's gesture returns to the ready state.
[0093] In this embodiment, recognizing a static gesture from a single gesture image may include the following steps:
[0094] a. Hand Detection: Input a single image into the MediaPipe Hands module to detect hands. If multiple hands or no hands are detected, the procedure terminates and returns the gesture state value None. If a single hand is detected, proceed to the next step.
[0095] b. Index Finger State Determination: With the joint points indexed as shown in Figure 4, extract the coordinates $(x_i, y_i)$ of the four points on the index finger with indices 5, 6, 7, and 8. Connecting points 6 and 7 forms the vector $\vec{v}_{67} = (x_7 - x_6,\ y_7 - y_6)$, directed from point 6 to point 7. Similarly, connecting points 7 and 8 forms the vector $\vec{v}_{78} = (x_8 - x_7,\ y_8 - y_7)$, directed from point 7 to point 8. If the index finger is straight, the included angle $\theta_l$ between the two vectors must be less than the preset threshold $\theta_0$ ($\theta_0$ is set to 15 degrees in this method), that is, $\theta_l = \arccos\dfrac{\vec{v}_{67} \cdot \vec{v}_{78}}{\lVert \vec{v}_{67} \rVert \, \lVert \vec{v}_{78} \rVert} < \theta_0$. At the same time, the formula $d_i = \sqrt{(x_i - x_0)^2 + (y_i - y_0)^2}$ is used to calculate the distances from the four points on the index finger to the wrist point $(x_0, y_0)$ with index 0, where $5 \le i \le 8$. If the index finger is open, the distance from the fingertip (point 8) to the wrist point (point 0) must be the largest, i.e. $d_8 = \max(d_5, d_6, d_7, d_8)$. If both of the above conditions are met, proceed to the next step. If the conditions are not met, terminate the procedure and return the gesture state value None.
[0096] c. Middle Finger State Determination: Using a method similar to that used for index finger state determination, if the middle finger is open and straight, the gesture is an open palm; the gesture state value Unclear is returned and the procedure terminates. Otherwise, proceed to the next step.
d. Gesture Direction Determination: To determine the gesture direction, first calculate the angle $\theta_x$ between the vector from point 6 to point 7 and the unit vector in the positive x-axis direction. Then consider the displacement difference along the y-axis, $\Delta d_y = y_7 - y_6$, and the displacement difference along the x-axis, $\Delta d_x = x_7 - x_6$, to determine the final direction. If $-45° < \theta_x < 45°$, the gesture is in the horizontal direction: if $\Delta d_x > 0$, the gesture state value is Right; if $\Delta d_x \le 0$, the gesture state value is Left. Conversely, if $\theta_x \le -45°$ or $\theta_x \ge 45°$, the gesture is in the vertical direction: if $\Delta d_y > 0$, the gesture state value is Up; if $\Delta d_y \le 0$, the gesture state value is Down. At this point, the gesture state for a single captured image can be determined.
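Steps b to d can be sketched in Python as below. The sketch takes 21 (x, y) landmark points that have already been obtained (for example from MediaPipe Hands, where index 0 is the wrist, 5 to 8 the index finger, and 9 to 12 the middle finger) and does not call MediaPipe itself. Two assumptions are made: the y coordinate increases upward (image coordinates would need to be flipped first), and the condition on the angle to the x-axis is read as the pointing line being within 45 degrees of horizontal, which is equivalent to |Δd_y| < |Δd_x|. The function names are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]
THETA0_DEG = 15.0  # straightness threshold from the text


def _angle_deg(u: Point, v: Point) -> float:
    """Unsigned angle between two 2-D vectors, in degrees."""
    dot = u[0] * v[0] + u[1] * v[1]
    cross = u[0] * v[1] - u[1] * v[0]
    return math.degrees(math.atan2(abs(cross), dot))


def finger_is_extended(pts: Sequence[Point], base: int) -> bool:
    """base is the finger's first landmark (5 = index, 9 = middle)."""
    wrist = pts[0]
    p = [pts[base + k] for k in range(4)]
    v_a = (p[2][0] - p[1][0], p[2][1] - p[1][1])  # second to third joint
    v_b = (p[3][0] - p[2][0], p[3][1] - p[2][1])  # third joint to tip
    straight = _angle_deg(v_a, v_b) < THETA0_DEG
    dists = [math.dist(q, wrist) for q in p]
    tip_is_farthest = dists[3] == max(dists)
    return straight and tip_is_farthest


def classify_gesture(pts: Sequence[Point]) -> Optional[str]:
    """Return None, 'Unclear', 'Left', 'Right', 'Up' or 'Down'."""
    if not finger_is_extended(pts, 5):
        return None        # index finger not pointing (e.g. fist)
    if finger_is_extended(pts, 9):
        return "Unclear"   # index and middle extended: open palm
    dx = pts[7][0] - pts[6][0]
    dy = pts[7][1] - pts[6][1]
    if abs(dy) < abs(dx):  # within 45 degrees of the x-axis
        return "Right" if dx > 0 else "Left"
    return "Up" if dy > 0 else "Down"
```

Working on landmark geometry rather than pixel colors is what makes the rules independent of skin tone and lighting, which the background section identifies as weaknesses of image-processing approaches.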
[0097] It should be noted that in this embodiment, the process of recognizing a single visual target gesture includes recognizing all gesture images acquired during the transition from the ready state to the pointing state, until a recognition termination condition is met. The recognition termination condition is met when three consecutive images are recognized as the same gesture or when the cumulative execution time exceeds a predefined limit. The flowchart of this algorithm is shown in Figure 5.
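The termination condition can be illustrated with a short Python sketch that consumes per-frame recognition results. It assumes that frames with state None (for example the ready fist) reset the streak rather than count as a gesture, and it uses a 5-second limit purely as a placeholder, since the predefined limit is not stated. The function name and the frame format are hypothetical.

```python
from typing import Iterable, Optional, Tuple

Frame = Tuple[float, Optional[str]]  # (seconds since target appeared, state)


def recognize_single_target(frames: Iterable[Frame],
                            time_limit_s: float = 5.0,
                            needed: int = 3) -> Optional[str]:
    """Return the gesture once `needed` consecutive frames agree, else None."""
    last: Optional[str] = None
    streak = 0
    for elapsed, state in frames:
        if elapsed > time_limit_s:
            break                    # cumulative time limit exceeded
        if state is None:
            last, streak = None, 0   # no gesture: reset the streak
            continue
        streak = streak + 1 if state == last else 1
        last = state
        if streak == needed:
            return state
    return None
```

Requiring agreement across consecutive frames filters out the brief, unstable poses that occur while the hand moves from the fist to the pointing gesture.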
[0098] The gesture recognition result for a single visual target includes two elements: the user's reaction time $T_r$ and the gesture recognition result $R_g$, where $R_g = 1$ indicates correct recognition and $R_g = 0$ indicates a recognition error. The reaction time $T_r$ is timed from the appearance of the visual target on the screen until the first detected gesture, at which point the timing stops. Its magnitude serves as an important parameter for measuring the visual acuity of the tested user and is therefore incorporated into the feature vector $F$. The feature vector $F$ is composed of the current visual acuity level being tested, the reaction time $T_r$, and the myopia status $S_m$ (1 for nearsighted, 0 otherwise). $F$ and $R_g$ together form the output of the gesture recognition module and are passed to the process control module, providing important information for subsequent vision prediction.
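One way to represent this output is sketched below in Python. The class and function names are hypothetical, and the sketch interprets "correct recognition" as the recognized gesture matching the direction of the displayed optotype, which is an assumption about what the recognition result compares.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FeatureVector:
    level: float            # visual acuity level currently being tested
    reaction_time_s: float  # time from target onset to first detected gesture
    myopic: int             # 1 if the user is nearsighted, else 0


def build_output(level: float, shown_direction: str,
                 recognized: Optional[str], reaction_time_s: float,
                 myopic: bool) -> Tuple[FeatureVector, int]:
    """Package the feature vector and the 0/1 recognition result."""
    r_g = 1 if recognized == shown_direction else 0
    features = FeatureVector(level, reaction_time_s, 1 if myopic else 0)
    return features, r_g
```

With this reading, an Unclear gesture or a timeout (no gesture) both produce a result of 0 for the tested level.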
[0099] Specifically, in this embodiment, the accuracy of static gesture recognition and vision detection can be evaluated through experiments.
[0100] For the static gesture recognition accuracy evaluation experiment, 100 users were evaluated. With each of their left and right hands, they performed the gestures for the four directions (up, down, left, right) and the gesture for "unclear". Each gesture was recognized 10 times, generating a total of 10,000 gesture recognition instances. Of these, 9,760 instances were correctly recognized, an overall accuracy of 97.6%. Specific results are shown in Table 1.
[0101] Table 1
[0102]
[0103] For the visual acuity testing accuracy evaluation experiment, the visual acuity testing method of this embodiment (based on a static gesture interaction system) was compared with the traditional visual acuity testing method (manual interaction) in a practical application scenario at the standard testing distance of 5 meters. 122 eyes were randomly selected from a sample of 200 eyes, and two rounds of visual acuity testing were performed on each eye. Bland-Altman analysis was used to evaluate the agreement between the two measurement methods. The mean difference between the results of the two methods was -0.0123, with a standard deviation of 0.0687. The mean difference does not deviate significantly from zero, indicating that the results of the two methods are consistent. As can be seen from Figure 6, most data points lie within the 95% limits of agreement with no obvious trend, and 96.72% of the eyes (118 of 122) fall within the specified ±0.1 deviation threshold. Therefore, it can be concluded that the visual acuity data collected in this embodiment is highly consistent with the data collected by traditional methods.
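As a rough illustration of the Bland-Altman quantities mentioned above, the sketch below computes the mean difference and the 95% limits of agreement. It assumes the conventional multiplier of 1.96 standard deviations, which the text does not state explicitly; the function names are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def limits_from_summary(mean_diff: float, sd_diff: float) -> Tuple[float, float]:
    """95% limits of agreement from a mean difference and its standard deviation."""
    half_width = 1.96 * sd_diff     # assumed conventional multiplier
    return mean_diff - half_width, mean_diff + half_width


def bland_altman(method_a: Sequence[float], method_b: Sequence[float]):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample standard deviation
    lower, upper = limits_from_summary(mean_diff, sd_diff)
    return mean_diff, lower, upper


# Summary statistics reported in the text: mean -0.0123, SD 0.0687.
lower, upper = limits_from_summary(-0.0123, 0.0687)

# Share of eyes within the ±0.1 deviation threshold reported in the text.
share_within = 118 / 122
```

With the reported summary statistics, the limits come out at roughly -0.147 and 0.122 under this assumption.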
[0104] It should be noted that in this embodiment, the process control module effectively manages the initiation, iteration, and termination of the vision test process by implementing a vision threshold determination mechanism. This mechanism effectively bypasses the time-consuming line-by-line scanning method, reduces the number of optotype tests required, and significantly improves efficiency.
[0105] The core of the vision threshold determination mechanism is estimating the actual visual acuity of the tested user; therefore, selecting an appropriate prediction model is crucial. In the field of machine learning, commonly used classic classification methods include Support Vector Machines (SVM), Artificial Neural Networks (ANN), and Naive Bayes Classifiers (NBC). Among these, NBC has advantages such as simplicity, efficiency, stable classification performance, and the ability to handle multi-classification problems. Furthermore, it is suitable for small sample data and incremental training scenarios, making it particularly suitable for visual acuity estimation. In addition, preliminary experiments have verified that NBC outperforms SVM and ANN in visual acuity level estimation. Therefore, this embodiment chooses to use NBC for vision detection.
[0106] In this embodiment, it is assumed that the sample training set contains $n$ samples, each described by $m$ features, where $n$ is the number of samples and $m$ is the dimension of the sample features. Each sample has a corresponding label drawn from the category set $\omega = \{\omega_1, \omega_2, \ldots, \omega_k\}$, which contains $k$ categories in total. Each category $\omega_i$ has a prior probability $P(\omega_i)$. For a new unknown sample $x = (x_1, x_2, \ldots, x_m)$, the conditional probability of $x$ given category $\omega_i$ is expressed as $P(x \mid \omega_i)$. Using Bayes' theorem, the posterior probability can be calculated:
[0107] $$P(\omega_i \mid x) = \frac{P(x \mid \omega_i)\,P(\omega_i)}{P(x)}$$
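[0108] The Naive Bayes algorithm assumes that the features $x_1, x_2, \ldots, x_m$ are mutually independent given the category. This assumption allows the conditional probability $P(x \mid \omega_i)$ to be transformed into:
[0109] $$P(x \mid \omega_i) = \prod_{j=1}^{m} P(x_j \mid \omega_i)$$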
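[0110] Therefore, the NBC model in this embodiment can be represented as:
[0111] $$H(x) = \max_{\omega_i \in \omega} P(\omega_i \mid x) = \max_{\omega_i \in \omega} \frac{P(\omega_i) \prod_{j=1}^{m} P(x_j \mid \omega_i)}{P(x)}$$
[0112] Where $H(x)$ represents the maximum posterior probability. Because $P(x)$ is the same for every category, it does not affect which category attains the maximum.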
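[0113] For a continuous feature, let the value of the $d$-th feature be $x_d$. Given category $\omega_i$, with mean $\mu_{i,d}$ and variance $\sigma_{i,d}^2$, the conditional probability density function of the Gaussian distribution is:
[0114] $$P(x_d \mid \omega_i) = \frac{1}{\sqrt{2\pi\sigma_{i,d}^2}} \exp\left(-\frac{(x_d - \mu_{i,d})^2}{2\sigma_{i,d}^2}\right)$$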
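The derivation above can be illustrated with a small Gaussian Naive Bayes sketch. It is a simplification under stated assumptions: every feature is treated as Gaussian (the text applies the Gaussian density to the continuous reaction time), a small variance floor is added for numerical safety, and the toy reaction times are made up for illustration only.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate, per class, the prior and the per-feature mean and variance."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    total = len(samples)
    model = {}
    for cls, rows in grouped.items():
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
        variances = [
            sum((r[d] - means[d]) ** 2 for r in rows) / len(rows) + 1e-9  # assumed floor
            for d in range(dims)
        ]
        model[cls] = (len(rows) / total, means, variances)
    return model


def log_gaussian(value, mean, var):
    """Logarithm of the Gaussian density used for a continuous feature."""
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)


def predict(model, x):
    """Return the class maximizing log prior + sum of log likelihoods.

    P(x) is omitted because it is identical for every class.
    """
    def score(cls):
        prior, means, variances = model[cls]
        return math.log(prior) + sum(
            log_gaussian(v, mu, var) for v, mu, var in zip(x, means, variances)
        )
    return max(model, key=score)


# Toy data: one feature (reaction time in seconds), two acuity levels.
toy_x = [[1.0], [1.2], [1.1], [2.0], [2.2], [2.1]]
toy_y = [4.8, 4.8, 4.8, 5.0, 5.0, 5.0]
toy_model = fit_gaussian_nb(toy_x, toy_y)
```

Working in log space avoids underflow when many small probabilities are multiplied, and it leaves the maximizing class unchanged.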
[0115] In this embodiment, the following steps are performed to establish the initial NBC model:
[0116] a. Determine the feature attributes and category set: select the feature vector $F$ as the feature attributes. In each collected sample $(x, y)$ with $x = (x_1, x_2, x_3)$, $x_i$ represents the sampled value of the $i$-th feature and $y$ is the corresponding label value. The category set $\omega$ includes 13 visual acuity levels, ranging from 4.0 to 5.2.
[0117] b. Obtain the training dataset: the data comes from 200 eyes of 100 test subjects. Each eye underwent 5 rounds of testing, each round including 13 different levels of visual targets, for a total of 200 × 5 × 13 = 13,000 data points. After excluding the samples with $R_g = 0$, a total of 7,405 valid samples remained. Of these, 80% were randomly selected as the training set, and the remaining 20% constituted the test set.
[0118] c. Model training: input the training data into the NBC model and calculate $P(\omega_i)$ and $P(x_j \mid \omega_i)$. The Gaussian density function is used to handle the continuous variable $T_r$.
[0119] d. Apply the classifier for estimation: the estimated visual acuity value is always greater than the visual acuity level currently being tested and smaller than the minimum of the set $L_{inc}$ of visual acuity levels that were tested as unclear, i.e., $\min\{L_{inc}\}$. Therefore, $H(x)$ only needs to be calculated over the categories $\omega_i$ that satisfy:
[0120] $$\text{current tested level} < \omega_i < \min\{L_{inc}\}$$
[0121] The trained classifier is used to classify the target data. For a given new sample $x$, if there exists a category $\omega_k$ such that $P(\omega_k \mid x) = H(x)$, then $x$ is classified into category $\omega_k$.
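Step d restricts the search for the maximum to levels strictly between the level currently being tested and the lowest level tested as unclear. A minimal sketch of that restriction is shown below. The function name and the score dictionary are hypothetical, and two behaviors are assumptions not covered by the text: no upper bound is applied while no level has been tested as unclear, and `None` is returned when no candidate remains.

```python
from typing import Dict, Iterable, Optional


def restricted_estimate(
    scores: Dict[float, float],
    current_level: float,
    unclear_levels: Iterable[float],
) -> Optional[float]:
    """Pick the highest-scoring acuity level inside the open search interval.

    `scores` maps each acuity level to its posterior (or any monotone score).
    """
    unclear = list(unclear_levels)
    upper = min(unclear) if unclear else float("inf")   # assumption: unbounded above
    candidates = {
        level: score
        for level, score in scores.items()
        if current_level < level < upper
    }
    if not candidates:
        return None                                     # assumption: nothing to estimate
    return max(candidates, key=candidates.get)


# Made-up posterior scores for five levels.
demo_scores = {4.6: 0.10, 4.7: 0.20, 4.8: 0.40, 4.9: 0.25, 5.0: 0.05}
estimate = restricted_estimate(demo_scores, current_level=4.6, unclear_levels=[5.0])
```

Restricting the candidate set this way keeps an outlying prediction from pulling the search outside the range already bracketed by earlier tests.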
[0122] It should be noted that in this embodiment, incremental learning can be used to update the NBC. Incremental learning lets the NBC adjust dynamically to new data patterns without complete retraining, preserving real-time data processing capability while improving the model's adaptability and efficiency. The model is updated using prior information from the initial model together with newly added samples. Before the test begins, the tested user can voluntarily input their actual visual acuity value. When this value is provided, the system generates new samples from the data of the testing process and adds them to a new training dataset $X_{new}$. The system periodically checks whether $X_{new}$ contains new samples; if so, it triggers an offline model update task.
[0123] In this embodiment, the current sample set $X_S$ contains $n$ samples in total, and the number of samples in each category $\omega_i$ is tracked. For each new sample $x_t$ in $X_{new}$ with label $y_t$, the classifier is updated using the following steps:
[0124] a. Update $X_{new}$ by removing $x_t$.
[0125] b. Update $X_S$ by adding $x_t$.
[0126] c. Update the classifier parameters according to the following formula.
[0127]
[0128] d. Check whether $X_{new}$ is empty. If it is empty, the update process is complete; otherwise, return to step a and continue processing the remaining samples in $X_{new}$.
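Steps a to d can be sketched as a loop that drains $X_{new}$ one sample at a time. The update rule below is an assumption: it uses the standard running-mean and running-variance (Welford) recurrences, which may differ from the formula referenced in step c, and it keeps per-class sufficient statistics instead of storing $X_S$ explicitly. All names are hypothetical.

```python
class IncrementalGaussianNB:
    """Per-class counts, means, and squared deviations, updated one sample at a time."""

    def __init__(self):
        self.total = 0
        self.stats = {}  # label -> [count, means, sums of squared deviations]

    def update(self, x, y):
        if y not in self.stats:
            self.stats[y] = [0, [0.0] * len(x), [0.0] * len(x)]
        entry = self.stats[y]
        entry[0] += 1
        count, means, m2 = entry        # means and m2 are mutated in place
        for d, value in enumerate(x):
            delta = value - means[d]
            means[d] += delta / count
            m2[d] += delta * (value - means[d])
        self.total += 1

    def prior(self, y):
        return self.stats[y][0] / self.total

    def mean(self, y):
        return list(self.stats[y][1])

    def variance(self, y):
        count, _, m2 = self.stats[y]
        return [v / count for v in m2]  # population variance


def drain(model, x_new):
    """Apply steps a-d: remove each sample from x_new and fold it into the model."""
    while x_new:                        # step d: stop when x_new is empty
        x_t, y_t = x_new.pop()          # step a: remove x_t from x_new
        model.update(x_t, y_t)          # steps b and c: add sample, update parameters


# Made-up new samples: (features, label).
pending = [([1.0], 4.8), ([3.0], 4.8), ([2.0], 5.0)]
nb = IncrementalGaussianNB()
drain(nb, pending)
```

Because only counts, means, and squared deviations are kept, each update costs time proportional to the number of features, independent of how many samples were seen before.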
[0129] In this embodiment, an NBC output error evaluation experiment can also be conducted to compare the performance of three machine learning models (SVM, ANN, and NBC) in predicting visual acuity values. Model performance is measured by the error between the estimated value and the actual visual acuity value. Table 2 reports the prediction accuracy within specific error bounds. For error bounds of 0.1, 0.2, and 0.3, the accuracy of NBC is 91.02%, 96.98%, and 99.47%, respectively, higher than that of the ANN and SVM models, indicating that the NBC model has better predictive ability.
[0130] Table 2
[0131]
[0132]
[0133] In this embodiment, estimating the actual visual acuity value from a single visual acuity test is very challenging. Therefore, the visual acuity threshold determination mechanism in this embodiment is built on the NBC (Naive Bayes classifier) and is designed to prevent excessive NBC prediction errors from causing the visual acuity search process to diverge. Before describing the testing process of this mechanism, the variables involved are explained.
[0134] $R_g$: gesture recognition result.
[0135] The current visual acuity level tested.
[0136] The visual acuity level from the previous test.
[0137] The next visual acuity level to be tested.
[0138] Current estimated visual acuity level.
[0139] $L_c$: the set of tested visual acuity levels for which $R_g = 1$.
[0140] $L_{inc}$: the set of tested visual acuity levels for which $R_g = 0$.
[0141] $\Delta l$: boundary difference of the visual threshold, $\Delta l = \min\{L_{inc}\} - \max\{L_c\}$.
[0142] $\epsilon$: spacing between visual thresholds.
[0143] This mechanism involves two key decision points. First, it determines when the vision threshold search process ends, which depends on whether the convergence condition $\Delta l = \epsilon$ is met, where $\epsilon$ is the spacing between visual thresholds. Second, it determines the next visual acuity level to be tested, whose calculation depends on the value of $R_g$. When $R_g = 1$, the feature vector $F$ is passed to the NBC model, which produces the current estimated visual acuity level. When $R_g = 0$, the reaction time obtained from gesture recognition cannot be used to predict the actual visual acuity value, so the NBC model is skipped and the calculation depends on the current and previously tested visual acuity levels. The calculation of the next level can be summarized by a single formula over these quantities. The specific workflow is shown in Figure 7.
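The stopping test of this mechanism can be sketched as follows. The sketch covers only the convergence check, not the choice of the next level. It assumes a threshold spacing of 0.1, inferred from the 13 levels between 4.0 and 5.2, and compares levels in integer tenths to avoid floating-point rounding problems; the function names are hypothetical.

```python
from typing import Iterable


def to_tenths(level: float) -> int:
    """Convert an acuity level such as 4.9 to an integer number of tenths (49)."""
    return round(level * 10)


def boundary_difference_tenths(clear_levels: Iterable[float],
                               unclear_levels: Iterable[float]) -> int:
    """Boundary difference min(L_inc) - max(L_c), expressed in tenths."""
    return to_tenths(min(unclear_levels)) - to_tenths(max(clear_levels))


def search_converged(clear_levels, unclear_levels, spacing: float = 0.1) -> bool:
    """True when the boundary difference equals the threshold spacing.

    The search cannot converge until both sets are non-empty (an assumption
    about how the mechanism behaves before both bounds exist).
    """
    clear, unclear = list(clear_levels), list(unclear_levels)
    if not clear or not unclear:
        return False
    return boundary_difference_tenths(clear, unclear) == to_tenths(spacing)


# Levels seen clearly (R_g = 1) and levels tested as unclear (R_g = 0), made up.
done = search_converged([4.6, 4.8], [4.9, 5.0])
```

When the check succeeds, the highest clearly seen level and the lowest unclear level are adjacent, so no untested level remains between them.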
[0144] In this embodiment, a visual acuity testing efficiency evaluation experiment can also be conducted. Sixty-two eyes with visual acuity ranging from 4.0 to 5.2 were randomly selected from 100 subjects. All experiments were conducted on the human-computer interaction platform based on static gesture recognition; the difference was that each eye was tested using both the NBC-based visual acuity threshold determination mechanism and the traditional line-by-line scanning method. The average number of optotype tests performed by each method at different visual acuity levels is detailed in Table 3. Table 3 shows that the NBC-based mechanism required fewer tests at every visual acuity level. On average, each of the 62 tested eyes required only 9.01 optotype tests with the NBC-based mechanism, while the traditional line-by-line scanning method required 27.85. Therefore, the NBC-based visual acuity threshold determination mechanism has a significant advantage in efficiency, greatly reducing the number of optotype tests. In terms of overall testing time, this corresponds to a reduction of approximately 68% compared with the traditional line-by-line scanning method ((27.85 - 9.01) / 27.85 ≈ 0.68).
[0145] Table 3
[0146]
[0147]
[0148] The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Various modifications and variations can be made to the present invention by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.
Claims
1. A rapid and accurate automated vision testing method, characterized in that it includes the following steps:
The process control module is started, and it sends a first instruction to the auxiliary function module to activate the auxiliary function module; the auxiliary function module drives the corresponding prompting device to prompt the tested user, and after the visual acuity level to be tested is set, the vision test loop is entered;
The process control module sends a second instruction to the visual target display module; after receiving the second instruction, the visual target display module drives the display screen to display a visual target of the corresponding size according to the set visual acuity level to be tested, providing visual target stimulation to the tested user;
The process control module sends a third instruction to the gesture recognition module; after receiving the third instruction, the gesture recognition module drives the camera to capture gesture images;
After receiving the prompts and the visual target stimulation, the tested user makes a corresponding static gesture;
The gesture recognition module calls the gesture recognition algorithm for a single visual target to detect the reaction time of the tested user to the visual target stimulus and to recognize the captured gesture images, generates detection and recognition results, and sends them to the process control module;
After receiving the detection and recognition results, the process control module calls the vision threshold determination mechanism to determine whether to continue the vision test loop; if the loop continues, the vision test loop is repeated; if the loop ends, the vision threshold is determined;
At the visual acuity level corresponding to the determined vision threshold, multiple optotype tests are performed to determine the optotype level, and the determined optotype level is the actual visual acuity value of the tested user;
A vision threshold determination mechanism based on a Naive Bayes classifier is used to determine the vision threshold, and this mechanism includes the following steps:
The gesture recognition module is executed, and its output includes the gesture recognition result and a feature vector containing the tested user's reaction time to the visual target stimulus;
The boundary difference of the vision threshold is updated;
It is determined whether the updated boundary difference of the vision threshold is equal to the set spacing of the vision thresholds; if so, the vision threshold is output and the process ends; otherwise, it is determined whether the gesture recognition result is equal to 1; if it is, the Naive Bayes classifier is executed, the current estimated visual acuity level is output, and the next visual acuity level to be tested is calculated using the summary formula; if it is not, the next visual acuity level to be tested is calculated directly using the summary formula;
After the next visual acuity level to be tested is calculated using the summary formula, the process returns to the gesture recognition module, until the updated boundary difference of the vision threshold is equal to the set spacing of the vision thresholds;
The summary formula for the next visual acuity level to be tested is expressed in terms of the gesture recognition result, the current visual acuity level tested, the visual acuity level from the previous test, and the current estimated visual acuity level; the output vision threshold is expressed in terms of the set of tested visual acuity levels whose gesture recognition result is 1 and the set of tested visual acuity levels whose gesture recognition result is 0.
2. The rapid and accurate automated vision testing method according to claim 1, characterized in that, before the process control module is started, the method further includes: positioning the tested user's eyes and the visual target displayed on the screen at the same horizontal level, with the horizontal distance between the eyes and the visual target set as the detection distance; and placing the camera directly facing the user's gesture, with the horizontal distance between the camera and the user's hand set as the camera's working distance; when the camera captures gesture images, only one hand is allowed to appear in the camera's field of view.
3. The rapid and accurate automated vision testing method according to claim 2, characterized in that, after the auxiliary function module is activated, a parameter input interface is entered to wait for the tested user to input the necessary parameters, including whether the user is nearsighted, the most recent visual acuity value measured before the current test, and the test distance.
4. The rapid and accurate automated vision testing method according to claim 2, characterized in that, after receiving the instruction, the optotype display module drives the display screen to display an optotype of the corresponding size according to the set visual acuity level to be tested; the direction of the optotype is randomly generated during display, and the size of the optotype is set according to the detection distance.
5. The rapid and accurate automated vision testing method according to claim 1, characterized in that the auxiliary function module driving the corresponding prompting device to prompt the tested user includes: the auxiliary function module drives a voice prompt device to provide voice prompts to the tested user, and/or the auxiliary function module drives a text prompt device to provide text prompts to the tested user.
6. The rapid and accurate automated vision testing method according to claim 1, characterized in that the gesture recognition module calling the gesture recognition algorithm for a single visual target to detect the tested user's reaction time to the visual target stimulus and to recognize the captured gesture images includes: defining static gesture rules; identifying the static gesture in a single gesture image; detecting the reaction time of the tested user to the visual target stimulus during gesture recognition; recognizing the gesture for a single visual target; and evaluating the accuracy of static gesture recognition and vision testing.
7. A rapid and accurate automated vision testing system for use in the rapid and accurate automated vision testing method according to any one of claims 1-6, characterized in that it includes:
The auxiliary function module, used to drive the corresponding prompting device to prompt the tested user after being activated, and to enter the vision test loop after the visual acuity level to be tested is set;
The optotype display module, used to drive the display screen, after receiving the second instruction, to display an optotype of the corresponding size according to the set visual acuity level to be tested, so as to provide optotype stimulation to the tested user;
The gesture recognition module, used to drive the camera to capture gesture images after receiving the third instruction; and, after the tested user has received the prompts and the optotype stimulation and made a corresponding static gesture, to call the gesture recognition algorithm for a single visual target to detect the tested user's reaction time to the optotype stimulus and recognize the captured gesture images, generate detection and recognition results, and send them to the process control module;
The process control module, used to send the first instruction to the auxiliary function module after being started, thereby activating the auxiliary function module; to send the second instruction to the optotype display module; to send the third instruction to the gesture recognition module; and, after receiving the detection and recognition results sent by the gesture recognition module, to call the vision threshold determination mechanism to determine whether to continue the vision test loop; if the loop continues, the vision test loop is repeated; if the loop ends, the vision threshold is determined; at the visual acuity level corresponding to the determined vision threshold, multiple optotype tests are performed to determine the optotype level, and the determined optotype level is the actual visual acuity value of the tested user.