A method, apparatus, system and storage medium for obtaining distance

By acquiring facial height, pupil distance, and facial pose feature values, and combining these with focal length to calculate the distance between the face and the display panel, the distance error caused by facial pose changes in existing technologies is solved, achieving more accurate distance assessment.

CN116086396B · Active · Publication Date: 2026-06-26 · BEIJING JIGAN TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: BEIJING JIGAN TECH CO LTD
Filing Date: 2021-11-08
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

In existing technologies, the calculation of the distance from a person's eye to the display screen using face detection and eye localization methods suffers from significant errors, especially when the face pose changes, leading to inaccurate distance assessment.

Method used

By acquiring face height, pupil distance, and face pose feature values, and combining them with focal length to calculate the distance between the face and the display panel, the degree of face rotation is identified using face pose feature values. An appropriate pupil distance or face height is selected for calculation, and an accurate focal length is obtained through calibration.

Benefits of technology

It improves the accuracy and stability of distance calculation between the face and the display panel, reduces errors caused by changes in face posture, and ensures the accuracy of distance assessment and the stability of the system.

✦ Generated by Eureka AI based on patent content.

Abstract

Embodiments of the present application provide a method, device, system and storage medium for obtaining distance, the method comprising: obtaining a face height, a pupil distance and a face posture feature value of a j-th object on an i-th frame image, wherein the face posture feature value is used to represent a face rotation attribute of the j-th object, and i and j are integers greater than or equal to 1; if an image distance is selected from the face height and the pupil distance according to the face posture feature value, then a distance between a face of the j-th object and a display panel is calculated according to at least the image distance and a focal length; if the image distance is not obtained according to the face posture feature value, then the above process is repeated for an (i+1)-th frame image to obtain the distance between the face of the j-th object and the display panel. The present application effectively overcomes the problem of large distance error caused by always calculating the distance between the eyes and the display screen according to the pupil distance, regardless of face posture.

Description

Technical Field

[0001] This application relates to the field of distance measurement, and more specifically, embodiments of this application relate to a method, apparatus, system, and storage medium for obtaining distance.

Background Technology

[0002] When watching content on a display screen such as a television, some people (e.g., children) may sit very close to the screen, which is an unhealthy viewing habit. Therefore, it is very important to detect the distance between the viewer and the display panel and to remind them when they are too close.

[0003] For example, in some related technologies, human experience is needed to roughly determine whether the distance between the viewer and the display screen is too close. Specifically, each person uses their own experience to determine whether the viewer is too far or too close to the display screen, and then guides the viewer (e.g., a child) to adjust the distance to better protect their eyesight.

[0004] In other related technologies, the pupil distance is obtained through face detection and eye localization methods. Then, the distance from the human eye to the screen is calculated by combining the calculated pupil distance with the camera focal length based on the similarity of triangles. However, the inventors found that the distance from the human eye to the display screen obtained by this method has a very large error in some cases.

[0005] Therefore, how to accurately assess the distance between the human eye and the display screen has become an urgent technical problem to be solved.

Summary of the Invention

[0006] The purpose of this application is to provide a method, apparatus, system, and storage medium for obtaining distance. The embodiments of this application can at least effectively solve the problem that the measured distance between the face and the display panel becomes inaccurate when the face posture changes, thereby improving the accuracy and stability of the system.

[0007] In a first aspect, some embodiments of this application provide a method for obtaining distance, the method comprising: obtaining the face height, pupil distance, and face pose feature value of the j-th object in the i-th frame image, wherein the face pose feature value is used to characterize the face rotation attribute of the j-th object, and i and j are integers greater than or equal to 1; if one of the face height and the pupil distance is selected as the image distance based on the face pose feature value, then the distance between the face of the j-th object and the display panel is calculated at least based on the image distance and the focal length, wherein the focal length is an attribute parameter value of the image acquisition unit used to acquire the image of the j-th object; if no image distance is obtained based on the face pose feature value, then the above process is repeated for the (i+1)-th frame image to obtain the distance between the face of the j-th object and the display panel.

[0008] By introducing face pose feature values, some embodiments of this application can identify different degrees of face rotation and, according to the degree of rotation, choose whether to calculate the distance between the human eye and the display screen from the pupil distance or from the face height. This effectively overcomes the defect that always calculating the distance from the pupil distance results in a large error between the calculated distance and the actual distance.

[0009] In some embodiments, the facial pose feature value includes a horizontal angle value, wherein the horizontal angle value is used to characterize the angle of left and right rotation of the face of the j-th object; selecting one of the face height and the pupil distance as the image distance based on the facial pose feature value includes: if it is confirmed that the horizontal angle value is less than or equal to a horizontal angle threshold, then selecting the pupil distance as the image distance; calculating the distance between the face and the display panel based at least on the image distance and the focal length includes: calculating the distance between the face and the display panel based on the actual pupil distance value, the pupil distance and the horizontal focal length, wherein the horizontal focal length is obtained through calibration.

[0010] In some embodiments of this application, the pupil distance is used to calculate the distance between the face and the display panel only when the angle of left and right rotation of the face is small, based on the facial pose feature value. This ensures the accuracy of the distance between the face and the display screen calculated using the pupil distance.

[0011] In some embodiments, the distance is calculated using the following formula:

[0012] Z = fx * GK / G'K'

[0013] Wherein fx denotes the horizontal focal length, which is obtained by having the image capturing unit photograph a calibration plate of fixed width multiple times, with a different distance between the calibration plate and the image capturing unit for each shot; G'K' denotes the pupil distance (measured on the image), and GK denotes the actual pupil distance value.

[0014] Some embodiments of this application provide a scheme for calculating the distance between a face and a display screen based on the pupil distance and the focal length obtained by the calibration method. This scheme not only quantifies the distance between the face and the display screen, but also improves the accuracy of the distance between the face and the display panel because the focal length obtained by the calibration method is more accurate.
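
As a minimal sketch of the formula Z = fx * GK / G'K', the following Python function computes the distance from a calibrated horizontal focal length, an actual pupil distance value, and the pupil distance measured on the image. The function name, the units (pixels and millimetres), and the numeric values are assumptions made for illustration and do not come from the application.

```python
def distance_from_pupil(fx_px: float, actual_pd_mm: float, image_pd_px: float) -> float:
    """Z = fx * GK / G'K' (similar triangles of the pinhole camera model).

    fx_px        -- horizontal focal length in pixels (assumed to come from calibration)
    actual_pd_mm -- GK, actual pupil distance value (assumed unit: millimetres)
    image_pd_px  -- G'K', pupil distance measured on the image, in pixels
    """
    if image_pd_px <= 0:
        raise ValueError("image pupil distance must be positive")
    return fx_px * actual_pd_mm / image_pd_px


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
z_mm = distance_from_pupil(fx_px=1000.0, actual_pd_mm=63.0, image_pd_px=42.0)
print(z_mm)  # 1500.0
```

The result carries the unit of the actual pupil distance value, because the two pixel quantities cancel.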

[0015] In some embodiments, obtaining the pupil distance of the j-th object in the i-th frame image includes: obtaining the left eye pupil point and the right eye pupil point of the j-th object in the i-th frame image according to the pupil recognition model; and calculating the Euclidean distance between the left eye pupil point and the right eye pupil point to obtain the pupil distance.

[0016] Some embodiments of this application use a technical solution to determine pupil distance by calculating the Euclidean distance between the left and right pupil points on an image. This method effectively reduces the amount of data calculation and improves data processing speed compared to related techniques for determining pupil distance.

[0017] In some embodiments, the Euclidean distance is obtained by calculating the pixel distance between the left pupil point and the right pupil point.

[0018] Some embodiments of this application obtain the pupil distance by directly calculating the pixel difference between two points on the image, which requires less computation and saves more resources.
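
The pixel-distance computation described above can be sketched as follows. The function name and the (x, y) pixel-coordinate convention for pupil points are assumptions; the pupil recognition model that would supply the points is not shown.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # assumed (x, y) pixel coordinates on the image


def pupil_distance_px(left_pupil: Point, right_pupil: Point) -> float:
    """Euclidean distance, in pixels, between the two pupil points."""
    dx = right_pupil[0] - left_pupil[0]
    dy = right_pupil[1] - left_pupil[1]
    return math.hypot(dx, dy)


# Hypothetical pupil points for illustration.
print(pupil_distance_px((100.0, 120.0), (140.0, 150.0)))  # 50.0
```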

[0019] In some embodiments, obtaining the pupil distance based on the Euclidean distance between the left eye pupil point and the right eye pupil point includes: calculating the Euclidean distance between the left eye pupil point and the right eye pupil point to obtain an initial pupil image distance; and correcting the initial pupil image distance based on the horizontal angle value to obtain the pupil distance.

[0020] Some embodiments of this application use the horizontal angle value to correct the initial pupil image distance obtained from the pixel difference. This can effectively eliminate the error caused by directly calculating the pupil distance from the pixel distance in the image after the face is rotated left or right.

[0021] In some embodiments, the initial pupil image distance is corrected using the following formula:

[0022] L = L' * cosβ

[0023] Where L denotes the corrected pupil distance, L' denotes the initial pupil image distance, and β denotes the horizontal angle value.

[0024] Some embodiments of this application provide a technical solution for correcting the initial pupil image distance using the cosine of the horizontal angle, thereby quantifying the conversion from the initial pupil image distance to the pupil distance.
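
The correction formula L = L' * cosβ can be written as a short function. This sketch applies the formula exactly as stated in the text; the function name and the assumption that the horizontal angle value is given in degrees are illustrative only.

```python
import math


def correct_pupil_distance(initial_px: float, horizontal_angle_deg: float) -> float:
    """Apply L = L' * cos(beta) as stated in the text.

    initial_px           -- L', the initial pupil image distance in pixels
    horizontal_angle_deg -- beta, the horizontal angle value (assumed to be in degrees)
    """
    return initial_px * math.cos(math.radians(horizontal_angle_deg))


# With no left-right rotation the initial value is returned unchanged.
print(correct_pupil_distance(50.0, 0.0))  # 50.0
```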

[0025] In some embodiments, the actual pupil distance value is a statistical value derived from the regular distribution of human pupil distance.

[0026] Some embodiments of this application provide a simple way to obtain the actual pupil distance value, that is, to use the empirical value as the actual pupil distance value, thereby improving the versatility of the technical solution.

[0027] In some embodiments, the facial pose feature value further includes a vertical angle value, which is used to characterize the angle of up and down rotation of the face of the j-th object; selecting one of the face height and the pupil distance as the image distance based on the facial pose feature value includes: if it is confirmed that the horizontal angle value is greater than the horizontal angle threshold and the vertical angle value is less than the vertical angle threshold, then selecting the face height as the image distance; calculating the distance between the face and the display panel based at least on the image distance and the focal length includes: calculating the distance between the face and the display panel based on the actual face height value, the face height, and the vertical focal length, wherein the vertical focal length is obtained through calibration.

[0028] In some embodiments of this application, when it is confirmed that the left-right rotation angle is greater than a set value and the up-down rotation angle is not too large, the distance between the face and the display panel is calculated using the face height. On the one hand, the pupil distance is not used when the left-right rotation angle of the face is large, because the resulting distance error would be very large. On the other hand, since the vertical angle is small, the error is very small when the distance is calculated using the face height, which improves the accuracy of the distance between the face and the display panel obtained when the face is in this posture.
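
The selection rule described in the preceding paragraphs can be sketched as a small decision function. The names, the threshold values, and the use of absolute angle values (so that left and right rotations are treated alike) are assumptions for illustration; the application does not give concrete thresholds here.

```python
from enum import Enum
from typing import Optional


class ImageDistanceSource(Enum):
    PUPIL_DISTANCE = "pupil_distance"
    FACE_HEIGHT = "face_height"


def select_image_distance(
    horizontal_angle: float,
    vertical_angle: float,
    horizontal_threshold: float,
    vertical_threshold: float,
) -> Optional[ImageDistanceSource]:
    """Return which measurement to use as the image distance, or None.

    None means neither measurement is usable for this frame, so the
    caller would move on to the next frame.
    """
    # Small left-right rotation: the pupil distance is reliable.
    if abs(horizontal_angle) <= horizontal_threshold:
        return ImageDistanceSource.PUPIL_DISTANCE
    # Large left-right rotation but small up-down rotation: use the face height.
    if abs(vertical_angle) < vertical_threshold:
        return ImageDistanceSource.FACE_HEIGHT
    return None


# Hypothetical thresholds of 15 and 20 degrees.
print(select_image_distance(5.0, 30.0, 15.0, 20.0))   # ImageDistanceSource.PUPIL_DISTANCE
print(select_image_distance(40.0, 5.0, 15.0, 20.0))   # ImageDistanceSource.FACE_HEIGHT
print(select_image_distance(40.0, 35.0, 15.0, 20.0))  # None
```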

[0029] In some embodiments, the distance is calculated using the following formula:

[0030] Z = fy * EF / E'F'

[0031] Wherein fy denotes the vertical focal length, which is obtained by having the image capturing unit photograph a calibration plate of fixed height multiple times, with a different distance between the calibration plate and the image capturing unit for each shot; E'F' denotes the face height (measured on the image), and EF denotes the actual face height value.

[0032] In some embodiments of this application, the vertical focal length when determining the distance between the face and the display screen based on the face height is obtained through calibration.
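
The following sketch illustrates one possible way to calibrate a focal length from several shots of a calibration plate and then apply Z = fy * EF / E'F'. The text only states that the plate is photographed at different distances; averaging the per-shot estimates, the function names, the units, and all numbers are assumptions made for illustration.

```python
from statistics import mean
from typing import Iterable, Tuple


def calibrate_focal_length(plate_size_mm: float, shots: Iterable[Tuple[float, float]]) -> float:
    """Estimate a focal length in pixels from (distance_mm, plate_size_px) pairs.

    Each shot gives f = Z * size_px / size_mm by similar triangles; the
    per-shot estimates are averaged (averaging is an assumption of this sketch).
    """
    estimates = [z_mm * size_px / plate_size_mm for z_mm, size_px in shots]
    if not estimates:
        raise ValueError("at least one calibration shot is required")
    return mean(estimates)


def distance_from_face_height(fy_px: float, actual_height_mm: float, image_height_px: float) -> float:
    """Z = fy * EF / E'F'."""
    if image_height_px <= 0:
        raise ValueError("image face height must be positive")
    return fy_px * actual_height_mm / image_height_px


# Hypothetical calibration: a 200 mm tall plate photographed at three distances.
shots = [(500.0, 400.0), (1000.0, 200.0), (2000.0, 100.0)]
fy = calibrate_focal_length(200.0, shots)
print(fy)  # 1000.0

# Hypothetical actual face height of 230 mm, measured as 115 px on the image.
print(distance_from_face_height(fy, 230.0, 115.0))  # 2000.0
```

The same calibration function would apply to the horizontal focal length if the plate width and its pixel width were supplied instead.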

[0033] In some embodiments, the face height is the height of the bounding box of the face of the j-th object obtained by a face detection algorithm on the i-th frame image.

[0034] Some embodiments of this application use the height of the face bounding box obtained by the face detection algorithm as the face height, which is simple to implement and requires less computation.

[0035] In some embodiments, the face height value is obtained by: obtaining the face bounding box of the j-th object on the i-th frame image using a face detection algorithm; correcting the height of the face bounding box according to the value of the vertical angle to obtain the face height.

[0036] The embodiments of this application correct the height of the rectangular frame of the face by using the value of the vertical angle, thereby obtaining a more accurate face height value and improving the accuracy of the distance between the face and the display screen.

[0037] In some embodiments, the method further includes: extracting the facial key points of the j-th object according to a facial key point detection model; wherein the facial pose feature value is obtained by inputting the facial key points and the facial bounding box into the facial pose acquisition model.

[0038] Some embodiments of this application obtain facial pose feature values through a facial pose acquisition algorithm.

[0039] In some embodiments, the method further includes: obtaining a historical distance value between the j-th object and the display panel through the previous frame or multiple previous frames of the i-th frame image; the step of calculating the distance between the face of the j-th object and the display panel based at least on the image distance and focal length includes: obtaining an i-th distance value between the face of the j-th object and the display panel based on the i-th frame image; and obtaining the distance based on the i-th distance value and the historical distance value.

[0040] Some embodiments of this application obtain the final distance by fusing the current i-th distance and the historical distance value, which can effectively suppress the defect of sudden distance changes caused by changes in the face angle at the same location.

[0041] In some embodiments, obtaining the distance based on the i-th distance and the historical distance value includes: taking a weighted average of the i-th distance value and the historical distance value to obtain the distance.

[0042] Some embodiments of this application fuse the i-th distance value and the historical distance values through a weighted average method. This not only effectively suppresses the defect of abrupt distance changes caused by changes in the face angle at the same location, but also allows for more flexible adjustment of the weight of each distance in the final distance, thereby improving the versatility of the technical solution of this application.
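
A weighted-average fusion of the current and historical distance values might look like the sketch below. The text does not specify how the weights are chosen, so the single `current_weight` parameter, the use of the plain mean of the history, and the fallback for an empty history are all assumptions.

```python
from typing import Sequence


def fuse_weighted(current: float, history: Sequence[float], current_weight: float = 0.5) -> float:
    """Blend the i-th distance value with the mean of the historical values.

    current_weight is a hypothetical tuning parameter in [0, 1]; the
    remaining weight is given to the historical mean.
    """
    if not 0.0 <= current_weight <= 1.0:
        raise ValueError("current_weight must be within [0, 1]")
    if not history:
        return current  # nothing to fuse with yet
    history_mean = sum(history) / len(history)
    return current_weight * current + (1.0 - current_weight) * history_mean


# Hypothetical values in millimetres: a sudden jump is damped by the history.
print(fuse_weighted(1000.0, [2000.0, 2000.0], current_weight=0.5))  # 1500.0
```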

[0043] In some embodiments, the method further includes: storing the historical distance values through a sliding window; wherein obtaining the distance based on the i-th distance value and the historical distance values includes: storing the i-th distance value in the sliding window; and calculating the average of all data stored in the sliding window to obtain the distance.

[0044] Some embodiments of this application employ a sliding window that can trigger the mean calculation immediately once the window is full, thereby improving data processing speed. Using the plain average as the final distance requires less computation than the weighted average method, resulting in lower resource consumption and faster processing.
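
The sliding-window averaging can be sketched with a bounded `collections.deque`. The class name and the window size are assumptions. This sketch also returns the mean of whatever is stored before the window is full, which is a simplification relative to triggering the calculation only once the window is full.

```python
from collections import deque


class SlidingWindowSmoother:
    """Keeps the most recent distance values and returns their mean."""

    def __init__(self, size: int = 5) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        # A deque with maxlen drops the oldest value automatically when full.
        self._window = deque(maxlen=size)

    def update(self, distance: float) -> float:
        """Store the i-th distance value and return the mean of the window."""
        self._window.append(distance)
        return sum(self._window) / len(self._window)


smoother = SlidingWindowSmoother(size=3)
for value in (1000.0, 2000.0, 3000.0, 4000.0):
    print(smoother.update(value))
# prints 1000.0, 1500.0, 2000.0, 3000.0 (the last mean no longer includes 1000.0)
```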

[0045] In some embodiments, the method further includes generating a warning message when it is confirmed that the distance is less than or equal to a set distance threshold.

[0046] Some embodiments of this application automatically remind objects (viewers) that are too close to the display screen by generating warning messages.
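
The threshold check can be illustrated in a few lines. The function name, the message wording, the millimetre unit, and the example threshold are hypothetical; only the comparison (distance less than or equal to the set threshold) comes from the text.

```python
from typing import Optional


def warning_message(distance_mm: float, threshold_mm: float) -> Optional[str]:
    """Return a warning when the distance is at or below the set threshold."""
    if distance_mm <= threshold_mm:
        return f"Too close to the screen: {distance_mm:.0f} mm (limit {threshold_mm:.0f} mm)"
    return None


print(warning_message(1500.0, 2000.0))  # Too close to the screen: 1500 mm (limit 2000 mm)
print(warning_message(2500.0, 2000.0))  # None
```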

[0047] In some embodiments, before obtaining the face height, pupil distance, and face pose feature values of the j-th object in the i-th frame image, the method further includes: confirming that all objects in the i-th frame image are detected; confirming that the face key points of all objects are extracted, wherein the face key points include at least the left eye pupil point and the right eye pupil point; and confirming that the face pose feature values of all objects are obtained.

[0048] Some embodiments of this application can ensure that all people watching programs on the smart terminal are detected, and that all objects that are too close to the display screen are detected in real time.

[0049] In a second aspect, some embodiments of this application provide an apparatus for obtaining the distance between a face and a display panel. The apparatus includes: a feature acquisition module configured to acquire the face height, pupil distance, and face pose feature value of a j-th object in the i-th frame image, wherein the face pose feature value is used to characterize the face rotation attribute, and i and j are integers greater than or equal to 1; and a distance acquisition module configured to perform the following operations: if one of the face height and the pupil distance is selected as the image distance based on the face pose feature value, then the distance between the face of the j-th object and the display panel is calculated based at least on the image distance and the focal length; if no image distance is obtained based on the face pose feature value, then the distance between the face of the j-th object and the display panel is obtained by acquiring the (i+1)-th frame image.

[0050] In a third aspect, some embodiments of this application provide a smart device, the smart device comprising: a display panel; a frame capture module configured to acquire frames of images, wherein the frames of images are captured by an image acquisition unit and show viewers watching a program on the display panel; a feature extraction module configured to detect faces in the frames of images, obtain a bounding box for each face, extract key point features from the detected faces, and obtain face pose feature values based on the face bounding boxes and the key point features; a distance detection module configured to perform the following operations on each object detected in each frame of images: acquire the face height, pupil distance, and face pose feature value of the j-th object in the i-th frame image, wherein the face pose feature value is used to characterize the face rotation attribute of the j-th object, and i and j are integers greater than or equal to 1; if one of the face height and the pupil distance is selected as the image distance based on the face pose feature value, calculate the distance between the face of the j-th object and the display panel based at least on the image distance and the focal length; if no image distance is obtained based on the face pose feature value, repeat the above process for the (i+1)-th frame image to obtain the distance between the face of the j-th object and the display panel; and a reminder module configured to generate and provide a warning message when the calculated distance between the face and the display panel is less than or equal to a set distance threshold.

[0051] In some embodiments, the feature extraction module further includes: a face detection module, configured to detect all faces in each frame image to obtain face bounding boxes for all faces; a key point detection module, configured to extract face features within the face bounding boxes to obtain face key points; and a face pose feature detection module, configured to obtain face pose feature values based on the face bounding boxes and the face key points.

[0052] In some embodiments, the distance detection module includes: an interpupillary distance measuring module configured to calculate the pupil distance between the left and right pupil points of the j-th object in the i-th frame image; a face measuring module configured to determine the distance between the face and the display panel based on either the pupil distance or the face height; and a distance smoothing module configured to smooth the distance calculated by the face measuring module to obtain a fused distance, so that the reminder module can determine whether to generate a warning message based on the fused distance.

[0053] In some embodiments, the distance detection module further includes an angle correction distance module, configured to: correct the Euclidean distance between the obtained left eye pupil point and the right eye pupil point according to the value of the horizontal angle to obtain the pupil distance, or correct the height of the obtained face bounding box according to the value of the vertical angle to obtain the face height.

[0054] In a fourth aspect, some embodiments of this application provide a system comprising one or more computers and one or more storage devices storing instructions, which, when executed by the one or more computers, cause the one or more computers to perform operations according to the corresponding methods described in the first aspect.

[0055] In a fifth aspect, some embodiments of this application provide one or more computer storage media for storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations according to the corresponding methods described in the first aspect.

[0056] In a sixth aspect, some embodiments of this application provide a computer program product, the computer program product including a computer program, which, when executed by a processor, implements the distance acquisition method described in any of the embodiments of the first aspect above.

Attached Figure Description

[0057] To more clearly illustrate the technical solutions of the embodiments of this application, the accompanying drawings used in the embodiments of this application will be briefly introduced below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.

[0058] Figure 1 is a schematic diagram of the composition of the distance acquisition system provided in the embodiments of this application;

[0059] Figure 2 is the first flowchart of the method for obtaining distance provided in the embodiments of this application;

[0060] Figure 3 is a schematic diagram illustrating the principle of obtaining focal length and ranging in an embodiment of this application;

[0061] Figure 4 is a block diagram illustrating the composition of a smart device provided in an embodiment of this application;

[0062] Figure 5 is the second flowchart of the method for obtaining distance provided in the embodiments of this application;

[0063] Figure 6 is the third flowchart of the method for obtaining distance provided in the embodiments of this application;

[0064] Figure 7 is a schematic diagram of facial pose feature values provided in the embodiments of this application;

[0065] Figure 8 is a facial landmark map obtained by the facial landmark detection algorithm provided in the embodiments of this application;

[0066] Figure 9 is a schematic diagram illustrating the correction of pupil distance provided in an embodiment of this application;

[0067] Figure 10 is the fourth flowchart of the method for obtaining distance provided in the embodiments of this application;

[0068] Figure 11 is a schematic diagram of a sliding window structure provided in an embodiment of this application;

[0069] Figure 12 is a block diagram of the apparatus for obtaining distance provided in an embodiment of this application.

Detailed Implementation

[0070] The technical solutions in the embodiments of this application will now be described with reference to the accompanying drawings.

[0071] It should be noted that similar reference numerals and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures. Furthermore, in the description of this application, terms such as "first," "second," etc., are used only to distinguish descriptions and should not be construed as indicating or implying relative importance.

[0072] As can be seen from the background section, the inventors of this application have discovered that the method of directly using pupil distance to obtain the distance between a person's face and the display screen does not take into account the rotation of the person's face relative to the display screen. Therefore, when the face rotates, the error in calculating the distance from the person's eye to the display screen using pupil distance is very large. Under normal circumstances, the viewer's face cannot be strictly parallel to the panel where the display screen is located, which severely limits the application scenarios of the relevant technical solution.

[0073] Unlike related technical solutions that directly determine the distance between the face and the display panel based on pupil distance, some embodiments of this application also introduce facial pose feature values (used to reflect the rotation of the viewer's face when viewing the display screen) to determine which image distance and actual distance to use to obtain the distance between the face and the display panel. Because the embodiments of this application consider facial pose feature values, the problem of inaccurate distance measurements between the face and the display panel caused by changes in facial pose can be effectively solved, improving system accuracy and stability.

[0074] Please refer to Figure 1, which is an exemplary application-scenario diagram of some embodiments of this application. The diagram illustrates a system for obtaining distance, which can determine the distance between the display panel 201 and each of the first user 101 and the second user 102 who are viewing content on the display screen.

[0075] In Figure 1, the system includes a camera 301, an electronic device with a display panel 201, and a server 401.

[0076] In some embodiments of this application, the camera 301 captures images of viewing objects within a certain area located in front of the display panel 201 (e.g., the dotted-line area shown in Figure 1) and transmits each captured frame to the server 401. This allows the server 401 to analyze each frame to determine the distance between each user or viewing object and the display panel 201. For example, by analyzing each frame, the server 401 determines a first distance D1 between the first user 101 and the display panel 201 and a second distance D2 between the second user 102 and the display panel 201. In other embodiments of this application, the server 401 can also generate a warning signal based on the confirmed distance between each user and the display panel 201 to alert users who are too close to the display panel.

[0077] In some embodiments of this application, the electronic device with the display panel 201 is a smart device such as a smart TV. This smart TV includes a memory and a processor, which can analyze each frame of images captured by the camera to obtain the distance between each user and the display panel. That is, in these embodiments, a separate server 401 is not required; the smart device itself can analyze each frame of images to obtain the distance between the viewing object and the display panel. In some embodiments of this application, these smart devices can generate warning messages based on the distance information between each user and the display panel to alert users who are too close to the display panel, thereby protecting their eyes.

[0078] It should be noted that, in the embodiment of Figure 1, the image capturing unit (or image acquisition unit) such as the camera 301 is located on the same plane as the display panel. For example, the image capturing unit is positioned above the display panel 201. In other embodiments of this application, the image capturing unit may also be positioned below or on the display panel. Because the camera is positioned around or on the display panel, the distance between the camera and the display panel is much smaller than the distance between the viewing object and the display panel. Therefore, in some embodiments of this application, the distance between each viewing object and the camera is directly used as the distance between the viewing object and the display panel. In other embodiments of this application, the distance between the viewing object and the camera can also be converted into the distance between the viewing object and the display panel according to relevant mathematical algorithms or formulas.

[0079] The method for obtaining distance provided in some embodiments of this application, executed by the server of Figure 1 or by the smart device of Figure 1, is described below by way of example with reference to Figure 2.

[0080] As shown in Figure 2, some embodiments of this application provide a method for obtaining distance, the method comprising: S101, obtaining the face height, pupil distance, and face pose feature value of the j-th object in the i-th frame image, wherein the face pose feature value is used to characterize the face rotation attribute of the j-th object, and i and j are integers greater than or equal to 1. S102, if one of the face height and the pupil distance is selected as the image distance based on the face pose feature value, then the distance between the face of the j-th object and the display panel is calculated based at least on the image distance and the focal length, wherein the focal length is an attribute parameter value of the image acquisition unit used to acquire the image of the j-th object. S103, if no image distance is obtained based on the face pose feature value, then the above process is repeated for the (i+1)-th frame image to obtain the distance between the face of the j-th object and the display panel. By introducing face pose feature values, some embodiments of this application can identify different degrees of face rotation and, according to the degree of rotation, choose whether to calculate the distance between the human eye and the display screen from the pupil distance or from the face height. This effectively overcomes the problem that always calculating the distance from the pupil distance may result in a large error between the calculated distance and the actual distance.

[0081] The face height obtained in S101 is the height value of the face of each viewing object in the captured image, in pixels. In some embodiments of this application, this face height is equal to the height of the face bounding box obtained by the face detection algorithm. In other embodiments of this application, this face height is a corrected value obtained after correcting the face bounding box according to the face rotation angle.

[0082] The pupil distance involved in S101 (i.e., the pupil distance on the image) is the distance between the two pupils of any viewing object in the captured image, measured in pixels. In some embodiments of this application, the pupil distance is the Euclidean distance between the left and right pupil points on the image. In other embodiments of this application, the pupil distance is a corrected value obtained by correcting the Euclidean distance based on the face rotation angle.

[0083] The face pose feature value involved in S101 characterizes the angle by which the face is rotated while viewing the content displayed on the display screen of Figure 1. In some embodiments of this application, the face pose feature value includes at least one of the left-right rotation angle value and the up-down rotation angle value of the face viewing the display panel.

[0084] In actual execution, only one of processes S102 and S103 is executed for a given frame. For example, when the face pose feature value meets the set requirements, one of the two parameters, pupil distance or face height, is selected to calculate the distance between the face and the display panel. If it is determined from the face pose feature value that neither parameter can be used to calculate the distance between the face and the display panel, the frame following the current frame is read to obtain the distance between the face and the display panel.

[0085] It is understandable that, in order to execute S101, the face height, pupil distance, and face pose feature value must be obtained in advance. In some other embodiments of this application, in order to promptly remind users who are too close to the display panel to move, the method may further include, after obtaining the distance between the face and the display panel, generating a warning message and providing it to the viewer. In order to detect the distance between as many viewers as possible and the display panel, in some embodiments of this application the method for obtaining distance further includes: confirming that all objects in the i-th frame image are detected; confirming that the facial key points of all objects are extracted, wherein the facial key points include at least the left-eye pupil point and the right-eye pupil point; and confirming that the face pose feature values of all objects are obtained.

[0086] The following illustrates the process of determining the distance between a face and the display panel based on pupil distance, taking the face of the j-th object in the i-th frame image as an example.

[0087] To determine whether the pupil distance can be used as the image distance for calculating the distance between the face and the display panel, in some embodiments of this application, the method for obtaining the distance includes: S101, obtaining the face height, pupil distance, and horizontal angle value (i.e., the face pose feature value includes the horizontal angle value) of the j-th object in the i-th frame image, where i and j are integers greater than or equal to 1, and the horizontal angle value is used to characterize the angle of left-right rotation of the j-th object's face. For example, the horizontal angle is measured relative to the vertical plane connecting the face and the camera, which is taken as 0 degrees; that is, when the person directly faces the vertical plane passing through the center of the camera, the horizontal angle is 0 degrees, and the value of the horizontal angle is greater than or equal to -90 degrees and less than or equal to +90 degrees. Similarly, the vertical (up-down) angle is measured relative to the horizontal plane and is 0 degrees when the face and the camera are on the same horizontal plane. Correspondingly, the process in S102 of selecting one of the face height and the pupil distance as the image distance based on the face pose feature value includes, for example, selecting the pupil distance as the image distance if the horizontal angle value is less than or equal to a horizontal angle threshold. The process in S102 of calculating the distance between the face and the display panel based at least on the image distance and the focal length includes, for example, calculating the distance between the face and the display panel based on the actual pupil distance value, the pupil distance, and the horizontal focal length, wherein the horizontal focal length is obtained through calibration.

[0088] In other words, some embodiments of this application use pupil distance to calculate the distance between the face and the display panel only when the horizontal angle value indicates that the left-right rotation of the face is small. This ensures the accuracy of calculating the distance between the face and the display screen using pupil distance.

[0089] In some embodiments of this application, S102 calculates the distance between the face and the display panel based on the pupil distance and the following formula:

[0090] Z = fx * GK / G'K' (1)

[0091] Here, fx is the horizontal focal length, which is obtained by having the image capturing unit photograph a calibration plate of fixed width multiple times, with a different distance between the calibration plate and the image capturing unit for each shot. G'K' is the pupil distance, and GK is the actual pupil distance value.
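Formula (1) can be illustrated with a minimal Python sketch. The function name, the choice of millimetres and pixels as units, and the numeric values are assumptions made for illustration only and are not taken from this application.

```python
def distance_from_pupil_distance(fx_px: float, actual_pd_mm: float, image_pd_px: float) -> float:
    """Formula (1): Z = fx * GK / G'K'.

    fx_px        -- horizontal focal length fx, in pixels (from calibration)
    actual_pd_mm -- actual pupil distance value GK, in millimetres
    image_pd_px  -- pupil distance G'K' measured on the image, in pixels
    The result Z has the same unit as actual_pd_mm.
    """
    if image_pd_px <= 0:
        raise ValueError("image pupil distance must be positive")
    return fx_px * actual_pd_mm / image_pd_px


# Illustrative values only: fx = 1000 px, GK = 65 mm, G'K' = 50 px.
print(distance_from_pupil_distance(1000.0, 65.0, 50.0))  # 1300.0 (mm)
```

Because the focal length is expressed in pixels and the image distance is in pixels, the two cancel and the result carries the physical unit of the actual pupil distance value.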

[0092] The process of obtaining the horizontal focal length fx through calibration is illustrated below by example with reference to Figure 3, and the principle of distance measurement is explained.

[0093] Figure 3 is based on the pinhole imaging principle. The imaging target is an object MN (corresponding, during actual ranging, to the actual distance between the pupils, i.e., the actual pupil distance value, or to the actual height of the face, i.e., the actual face height value). The object MN is projected through the pinhole plane O1 onto the imaging plane O2 as M'N' (corresponding to the pupil distance or face height during actual ranging). In Figure 3, the focal length of the camera is f, and the distance between the object MN and the pinhole plane O1 is Z (corresponding to the distance between the face and the display panel during actual ranging). To calculate the target distance Z in Figure 3, the principle of similar triangles gives the formula: Z = f * MN / M'N'. If the object MN is represented by the pupil distance, then MN is the actual pupil distance of the human eyes (i.e., the actual pupil distance value), and M'N' is the distance between the pupils on the imaging plane (i.e., the pupil distance). The actual pupil distance can be characterized by a statistically obtained empirical value X; for example, for an adult, X = 65 mm. In Figure 3, the imaged pupil distance is M'N' = P. It follows that once the camera's focal length f is known, Z (i.e., the distance between the face and the display panel) can be obtained. In some embodiments of this application, the camera's focal length f is obtained through calibration, i.e., f = M'N' * Z / MN. Specifically, in some embodiments of this application, a calibration plate of fixed width MN is placed vertically and photographed by the camera at different distances (i.e., Z). The number of pixels occupied by the plate's width in each captured image is then obtained; this number is M'N'. The horizontal focal length fx can be obtained from this width value M'N', the known distance Z, and the actual width MN of the plate. With the equivalent focal length obtained through calibration, if the pupil distance is used to calculate the distance between the face and the display panel, then Z = fx * GK / G'K', where G'K' represents the pupil distance and GK represents the actual pupil distance value. It should be noted that the center of the camera used to photograph the calibration plate must be on the same line as the center of the calibration plate, and the vertical plane of the camera must be parallel to the plane of the calibration plate.
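The calibration relation f = M'N' * Z / MN can be sketched in Python as follows. The text above states that the plate is photographed at several distances but does not state how the per-shot estimates are combined; taking their arithmetic mean is an assumption of this sketch, as are the function name and the sample measurements.

```python
from statistics import mean
from typing import Iterable, Tuple


def calibrate_focal_length(plate_size_mm: float, shots: Iterable[Tuple[float, float]]) -> float:
    """Estimate the equivalent focal length in pixels.

    plate_size_mm -- actual width (or height) MN of the calibration plate
    shots         -- (distance Z in mm, measured size M'N' in pixels) per shot
    Each shot gives f = M'N' * Z / MN; averaging the estimates is an
    assumption of this sketch, not a step stated in the source text.
    """
    estimates = [size_px * distance_mm / plate_size_mm for distance_mm, size_px in shots]
    if not estimates:
        raise ValueError("at least one calibration shot is required")
    return mean(estimates)


# Hypothetical measurements of a 200 mm wide plate at three distances.
shots = [(1000.0, 200.0), (2000.0, 100.0), (4000.0, 50.0)]
fx = calibrate_focal_length(200.0, shots)
print(fx)  # each shot yields 1000 px, so the mean is 1000 px
```

The same function applies to the vertical focal length fy when the plate height and its pixel height are used in place of the width.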

[0094] The process of obtaining pupil distance is illustrated below.

[0095] In some embodiments of this application, the process of obtaining the pupil distance of the j-th object on the i-th frame image before S101 includes, for example: obtaining the left and right pupil points of the j-th object on the i-th frame image according to a pupil recognition model; and calculating the Euclidean distance between the left and right pupil points to obtain the pupil distance. For example, this Euclidean distance is obtained by calculating the pixel distance between the left and right pupil points. That is, in these embodiments, the Euclidean distance between the two pupils on the image is used directly as the pupil distance.

[0096] To further correct the inaccurate pupil distance caused by rotation of the viewer's face relative to the display panel, in some embodiments of this application, the process of obtaining the pupil distance based on the Euclidean distance between the left and right pupil points described above includes, for example: calculating the Euclidean distance between the left and right pupil points on the image to obtain the initial pupil image distance; and correcting the initial pupil image distance based on the obtained horizontal angle value to obtain the pupil distance involved in S101. That is, in these embodiments, the horizontal angle value is used to correct the Euclidean distance, and the corrected Euclidean distance is used as the pupil distance.

[0097] For example, in some embodiments of this application, in order to correct the large calculation error caused by directly using the Euclidean distance between the two pupils in the image as the pupil distance after face rotation, the initial pupil image distance is corrected using the following formula:

[0098] L = L' * cosβ (2)

[0099] Where L is used to characterize the initial pupil image distance (i.e., the pixel distance between the left and right pupils in the i-th frame image), L' is used to characterize the corrected pupil distance, and β is used to characterize the obtained horizontal angle value; equivalently, L' = L / cosβ.

[0100] It should be noted that the embodiments of this application are not limited to using the cosine value to correct the initial pupil image distance. For example, in some other embodiments of this application, the pupil distance can also be obtained based on the initial pupil image distance and the sine value, tangent value, etc.

[0101] As with the use of an empirical value for the actual distance between the pupils in the description of the calibration method for obtaining the horizontal focal length, in some embodiments of this application the actual pupil distance value in formula (1) above is a typical value of the human pupil distance obtained through statistical methods. It should be noted that in other embodiments of this application, the actual pupil distance value can also be the actual distance between the pupils of the viewer's two eyes (in units of length such as meters or centimeters), collected in advance. For example, if a family has three children, the actual pupil distances of these three children can be collected in advance and entered into the storage unit of the smart TV through the interactive interface. When the distance between a face and the display panel needs to be measured, the algorithm can read the features of the three children and obtain the corresponding actual pupil distance to calculate the distance between each child and the display panel.

[0102] The following illustrates the process of calculating the distance between a face and the display panel based on the face height, taking the face of the j-th object in the i-th frame image as an example.

[0103] For example, in some embodiments of this application, the method for obtaining distance includes: S101, obtaining the face height, pupil distance, horizontal angle value, and vertical angle value of the j-th object in the i-th frame image, where i and j are integers greater than or equal to 1, the horizontal angle value is used to characterize the angle of left-right rotation of the face, and the vertical angle value is used to characterize the angle of up-down rotation of the face. The process in S102 of selecting one of the face height and the pupil distance as the image distance based on the face pose feature value includes: if the horizontal angle value is greater than the horizontal angle threshold and the vertical angle value is less than the vertical angle threshold, selecting the face height as the image distance. The process in S102 of calculating the distance between the face and the display panel based at least on the image distance and the focal length includes, for example, calculating the distance between the face and the display panel based on the actual face height value, the face height, and the vertical focal length, wherein the vertical focal length is obtained through calibration. It should be noted that the actual face height value can be obtained in several ways. In some embodiments of this application, the height from the forehead to the chin is measured with a ruler or other measuring tool as a rough value of the actual face height. In some embodiments of this application, the distance between the face and the screen, calculated from the pupil distance while the face is directly facing the display screen, is used as a known quantity z; combined with the pixel height m'n' of the face frame detected by face detection and the equivalent focal length fy (i.e., the vertical focal length), the actual face height value is calculated as mn = z * m'n' / fy. In some embodiments of this application, the actual face height value is obtained through calibration: an image is captured with the face at a fixed perpendicular distance z from the camera, a face detection algorithm is used to detect the face frame height m'n', and, combined with the equivalent focal length fy (i.e., the vertical focal length), the actual face height value mn = z * m'n' / fy is calculated.
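The relation mn = z * m'n' / fy for estimating the actual face height value can be sketched as follows. The function name, units, and sample numbers are hypothetical and serve only to make the arithmetic concrete.

```python
def actual_face_height(z_mm: float, face_box_height_px: float, fy_px: float) -> float:
    """mn = z * m'n' / fy.

    z_mm               -- known face-to-camera distance z, in millimetres
    face_box_height_px -- face frame height m'n' on the image, in pixels
    fy_px              -- vertical (equivalent) focal length fy, in pixels
    """
    if fy_px <= 0:
        raise ValueError("focal length must be positive")
    return z_mm * face_box_height_px / fy_px


# Illustrative values only: z = 1500 mm, m'n' = 160 px, fy = 1200 px.
print(actual_face_height(1500.0, 160.0, 1200.0))  # 200.0 (mm)
```

The known distance z may come either from a frontal pupil-distance measurement or from a fixed calibration distance, as described above; the computation is the same in both cases.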

[0104] In other words, some embodiments of this application use face height to calculate the distance between the face and the display panel when the left-right rotation angle is large and the up-down rotation angle is not too large. On the one hand, pupil distance is not used when the left-right rotation angle of the face is large because the distance calculated from it would have a very large error. On the other hand, since the vertical angle is small, the error when using face height to calculate the distance is very small, which improves the accuracy of the distance between the face and the display panel.
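The selection between pupil distance and face height, and the fallback to the next frame, can be summarized in a short Python sketch. The threshold values of 30 degrees, the function and label names, and the use of absolute values (so that rotations to either side are treated alike within the -90 to +90 degree range) are assumptions of this sketch rather than values given in this application.

```python
from typing import Optional, Tuple


def select_image_distance(
    yaw_deg: float,
    pitch_deg: float,
    pupil_distance_px: float,
    face_height_px: float,
    yaw_threshold_deg: float = 30.0,    # hypothetical horizontal angle threshold
    pitch_threshold_deg: float = 30.0,  # hypothetical vertical angle threshold
) -> Optional[Tuple[str, float]]:
    """Return (kind, image distance in pixels), or None to skip this frame."""
    # Small left-right rotation: the pupil distance is reliable.
    if abs(yaw_deg) <= yaw_threshold_deg:
        return ("pupil_distance", pupil_distance_px)
    # Large left-right rotation but small up-down rotation: use face height.
    if abs(pitch_deg) < pitch_threshold_deg:
        return ("face_height", face_height_px)
    # Neither parameter is usable; the caller moves on to the next frame.
    return None


print(select_image_distance(10.0, 50.0, 60.0, 180.0))  # ('pupil_distance', 60.0)
print(select_image_distance(45.0, 10.0, 60.0, 180.0))  # ('face_height', 180.0)
print(select_image_distance(45.0, 40.0, 60.0, 180.0))  # None
```

Returning None rather than raising an error keeps the per-frame loop simple: the caller reads the (i+1)-th frame whenever no image distance is selected.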

[0105] In some embodiments of this application, S102 calculates the distance between the face and the display panel based on the face height and the following formula:

[0106] Z = fy * EF / E'F' (3)

[0107] Here, fy represents the vertical focal length, which is obtained by having the image capturing unit photograph a calibration plate of fixed height multiple times, with a different distance between the calibration plate and the image capturing unit for each shot. E'F' represents the face height, and EF represents the actual face height value. In some embodiments of this application, the vertical focal length used to determine the distance between the face and the display screen based on the face height is obtained through calibration.

[0108] The process of obtaining the vertical focal length is further explained below with reference to Figure 3. Once the camera's focal length f is known, Z (i.e., the distance between the face and the display panel) can be obtained. In some embodiments of this application, the camera's focal length f is obtained through calibration, i.e., f = M'N' * Z / MN. Specifically, in some embodiments of this application, a calibration plate of fixed height MN is placed vertically and photographed by the camera at different distances (i.e., Z). The number of pixels occupied by the plate's height in each captured image is then obtained; this number is M'N'. Based on this height value M'N' and the actual height MN of the calibration plate, the vertical focal length fy can be obtained; that is, the equivalent focal length is obtained through calibration. If the face height is then used to calculate the distance between the face and the display panel, Z = fy * EF / E'F'.
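A brief Python sketch of formula (3) follows. The function name and the numeric values are illustrative assumptions; only the relation Z = fy * EF / E'F' comes from the text.

```python
def distance_from_face_height(fy_px: float, actual_height_mm: float, image_height_px: float) -> float:
    """Formula (3): Z = fy * EF / E'F'.

    fy_px            -- vertical focal length fy, in pixels (from calibration)
    actual_height_mm -- actual face height value EF, in millimetres
    image_height_px  -- face height E'F' on the image, in pixels
    """
    if image_height_px <= 0:
        raise ValueError("image face height must be positive")
    return fy_px * actual_height_mm / image_height_px


# Illustrative values only: fy = 1200 px, EF = 200 mm, E'F' = 160 px.
print(distance_from_face_height(1200.0, 200.0, 160.0))  # 1500.0 (mm)
```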

[0109] To reduce computational load and improve processing speed, in some embodiments of this application, the face height in formula (3) above is the height of the bounding box of the face of the j-th object obtained by the face detection algorithm on the i-th frame image. It is understood that the bounding box is, for example, a rectangular frame surrounding the face, and is obtained by inputting the i-th frame image into the model corresponding to the face detection algorithm. That is, in some embodiments of this application, the height of the bounding box obtained by the face detection algorithm can be used directly as the face height. It should be noted that in some embodiments of this application, the vertical distance between the lowest and highest points output by key point detection in the image captured by the front-facing camera can also be used as the face height value.

[0110] Unlike examples where the height of the face outline is used directly as the face height, in some embodiments of this application the face height in formula (3) above is corrected to reduce the large error that arises when the viewing object is rotated vertically relative to the display panel. The face height is obtained as follows: the face outline of the j-th object is obtained on the i-th frame image using a face detection algorithm; the height of the face outline is then corrected according to the vertical angle value to obtain the face height. By correcting the height of the face outline (e.g., the bounding rectangle of the face) with the vertical angle value, embodiments of this application obtain a more accurate face height, thereby improving the accuracy of the obtained distance between the face and the display screen. In other words, in these embodiments the corrected height of the face outline is used as the face height required by formula (3).

[0111] In some embodiments of this application, the method for obtaining distance further includes: extracting the facial key points of the j-th object according to a facial key point detection model; wherein the face pose feature value involved in S101 is obtained by inputting the facial key points and the face bounding box into the face pose acquisition model. It should be noted that the algorithm of the face pose acquisition model can be obtained from the following literature: "Fine-Grained Head Pose Estimation Without Keypoints", "img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation", "FSA-Net: Learning Fine-Grained Structure Aggregation for Head Pose Estimation from a Single Image", or "WHENet: Real-time Fine-Grained Estimation for Wide Range Head Pose".

[0112] It should be noted that, regardless of whether the distance between the face and the display panel is calculated using pupil distance or face height, in order to effectively suppress abrupt changes in the calculated distance caused by changes in the face angle at the same location, in some embodiments of this application the method for obtaining the distance further includes: obtaining a historical distance value between the j-th object and the display panel from the frame or frames preceding the i-th frame image. Correspondingly, the process in S102 of calculating the distance between the face of the j-th object and the display panel based at least on the image distance and the focal length includes, for example: obtaining the i-th distance value between the face of the j-th object and the display panel based on the i-th frame image; and obtaining the distance based on the i-th distance value and the historical distance value. For example, in some embodiments of this application, obtaining the distance based on the i-th distance value and the historical distance value includes performing a weighted average of the i-th distance value and the historical distance value. For example, some embodiments of this application fuse the i-th distance value and the historical distance value by averaging, thereby improving the speed of data processing.

[0113] To store historical distance values and trigger the averaging operation immediately so as to obtain the smoothed distance between a face and the display panel, some embodiments of this application store the historical distance values in a sliding window; wherein obtaining the distance based on the i-th distance value and the historical distance value includes: storing the i-th distance value in the sliding window; and calculating the average of all data stored in the sliding window to obtain the distance. Some embodiments of this application use the sliding window to trigger the averaging operation as soon as the sliding window is full, improving data processing speed.
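The sliding-window smoothing can be sketched in Python with a fixed-length queue. The class name and the window size of 3 are hypothetical, and this sketch averages whatever values are currently stored, including before the window is full, which is one possible reading of the description above.

```python
from collections import deque


class DistanceSmoother:
    """Keeps the most recent distance values and returns their average."""

    def __init__(self, window_size: int = 5) -> None:
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        # deque(maxlen=...) discards the oldest value automatically when full.
        self._window = deque(maxlen=window_size)

    def update(self, distance: float) -> float:
        """Store the i-th distance value and return the window average."""
        self._window.append(distance)
        return sum(self._window) / len(self._window)


smoother = DistanceSmoother(window_size=3)
for d in (1000.0, 2000.0, 3000.0, 4000.0):
    print(smoother.update(d))  # 1000.0, 1500.0, 2000.0, 3000.0
```

One smoother instance would be kept per viewing object so that the history of one face does not influence the distance reported for another.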

[0114] To promptly alert viewers who are too close to the display panel, some embodiments of this application further include generating and providing a warning message when the calculated distance between the face and the display panel is determined to be less than or equal to a set distance threshold. For example, the warning message may be played aloud or displayed on the display panel; the embodiments of this application do not limit the method of notification.

[0115] The following example illustrates some embodiments of the distance acquisition method of this application using the distance between a face and the display screen of a smart device.

[0116] Some embodiments of this application involve a smart device (e.g., a smart TV) that processes, in real time, the preview images from the camera 301 of Figure 1. By detecting the face of each person in the image, it extracts information such as facial key points and face angles, and further calculates the distance between the face and the display screen.

[0117] As shown in Figure 4, some embodiments of this application provide a smart device (e.g., a smart TV device) including an image capture module 210, a feature extraction module 220, a distance detection module 230, and an alert module 218.

[0118] Through the four modules of Figure 4, the distance between a face and the camera can be calculated efficiently and stably, which effectively solves the problem of inaccurate distance measurement caused by changes in face pose and thus improves the accuracy and stability of the system. When applied to smart TVs, this provides users with reliable distance detection, reminding them to use the TV safely and protect their eyesight.

[0119] The image capture module 210 of Figure 4 is configured to acquire image frames from the current camera. These image frames can be used for previewing, and are then sent to the feature extraction module 220 for feature extraction.

[0120] The feature extraction module 220 of Figure 4 may further include three sub-modules: a face detection module 211, a key point extraction module 212, and a face pose feature detection module 213. The face detection module 211 is configured to detect all faces in the image frames captured by the camera. The key point extraction module 212 is configured to calculate the key points of the face based on the face detection results. The specific key points of the face are shown in Figure 8, which displays 98 facial key points numbered from point 1 to point 98, where point 97 and point 98 are the left and right pupil points on the image. The face pose feature detection module 213 is configured to calculate the face pose feature value based on the face detection result and the result of the key point extraction module 212. Finally, the results of these three sub-modules are output to the distance detection module 230.

[0121] The distance detection module 230 is configured to receive the bounding rectangle of the face (obtained by the face detection module 211), the key points of the face (obtained by the key point extraction module 212), and the face pose feature values (obtained by the face pose feature detection module 213), to obtain the distance between the face and the display panel based on the input data, and then to pass the calculated distance to the alert module 218. As shown in Figure 4, in some embodiments of this application, the distance detection module 230 further includes four sub-modules: a pupil distance measurement module 214, a face distance measurement module 215, an angle correction distance module 216, and a distance smoothing module 217. The pupil distance measurement module 214 is used to determine the Euclidean distance between the pupils based on the facial key points. The face distance measurement module 215 is used to obtain the face bounding box. The angle correction distance module 216 is used to correct the Euclidean distance based on the horizontal angle value to obtain the pupil distance, or to correct the height of the face bounding box based on the vertical angle value to obtain the face height. The distance smoothing module 217 is used to smooth the i-th distance value and the historical distance value to obtain the final distance between the face and the display panel corresponding to the i-th frame image.

[0122] The alert module 218 is configured to compare a preset distance threshold with the calculated distance between the face and the display panel. When the calculated distance is less than or equal to the distance threshold, an alert message is generated to notify the relevant user. For example, the alert message can be output through sound, image, or text.
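The comparison performed by the alert module can be sketched as a small Python function. The function name, the 500 mm threshold, and the message wording are hypothetical; only the rule that an alert is produced when the calculated distance is less than or equal to the threshold comes from the text.

```python
from typing import Optional


def build_alert(distance_mm: float, threshold_mm: float) -> Optional[str]:
    """Return an alert message if the viewer is too close, otherwise None."""
    if distance_mm <= threshold_mm:
        return (
            f"Viewer is {distance_mm:.0f} mm from the screen "
            f"(limit {threshold_mm:.0f} mm); please move back."
        )
    return None


print(build_alert(400.0, 500.0))  # alert text
print(build_alert(600.0, 500.0))  # None
```

How the returned message is delivered (sound, image, or text) is left to the caller.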

[0123] As shown in Figure 5, a method for obtaining the distance between a face and the display screen of a smart TV includes the following steps:

[0124] S301, acquire a frame of image.

[0125] For example, the image capture module 210 of Figure 4 captures a single frame of image.

[0126] S302, extract facial features.

[0127] The feature extraction module 220 of Figure 4 is used to extract facial features. As mentioned above, the feature extraction module includes a face detection module 211, a key point extraction module 212, and a face pose feature detection module 213.

[0128] The face detection module 211 uses a pre-trained model to detect all faces in the current frame. The keypoint extraction module 212 uses the pre-trained model to detect the keypoints of each face detected by the face detection module 211 and saves the landmarks of each keypoint. The face pose feature detection module 213 uses the pre-trained model, taking the outputs of the face detection module 211 and the keypoint extraction module 212 as input, to detect the face pose feature values.

[0129] In this step, the detection criteria for faces and the total number of faces to be detected can be set. If no face is detected or the number of detected faces is less than the set number, the subsequent steps are not performed; instead, a new frame image is acquired and this step is repeated. Specifically, as shown in Figure 6, this process includes, for example, the following steps: S401, input a frame of image. S402, detect all faces that meet the face detection criteria. S403, determine whether there is a face; if so, proceed to S404; otherwise, return to S401 and wait for the next frame of image input. S404, detect facial landmarks. S405, determine whether facial landmarks have been detected; if so, proceed to S406; otherwise, return to S401 and wait for the next frame of image input. S406, detect face pose features. S407, determine whether face pose features have been detected; if so, proceed to S408, i.e., pass the results to the distance detection module to obtain the distance between the face and the display panel; otherwise, return to S401 and wait for the next frame of image input.
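The gating logic of steps S401 to S408 can be sketched in Python as follows. The three callables stand in for the pre-trained models and are hypothetical placeholders, not real library functions. The sketch also assumes that every detected face must yield landmarks and a pose for the frame to be accepted, which is one reading of the checks in S405 and S407.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Features = Tuple[Any, Any, Any]  # (face box, landmarks, pose) for one face


def extract_features(
    frame: Any,
    detect_faces: Callable[[Any], Sequence[Any]],
    detect_landmarks: Callable[[Any, Any], Optional[Any]],
    estimate_pose: Callable[[Any, Any], Optional[Any]],
) -> Optional[List[Features]]:
    """Return per-face features, or None so the caller fetches the next frame."""
    faces = detect_faces(frame)                    # S402
    if not faces:                                  # S403: no face found
        return None
    results: List[Features] = []
    for face in faces:
        landmarks = detect_landmarks(frame, face)  # S404
        if landmarks is None:                      # S405: landmarks missing
            return None
        pose = estimate_pose(face, landmarks)      # S406
        if pose is None:                           # S407: pose missing
            return None
        results.append((face, landmarks, pose))
    return results                                 # S408: hand to distance detection


# Usage with stub models that always succeed (for illustration only).
features = extract_features(
    frame="frame-1",
    detect_faces=lambda frame: ["box-1"],
    detect_landmarks=lambda frame, box: ["points"],
    estimate_pose=lambda box, points: (5.0, -3.0),  # (yaw, pitch) in degrees
)
print(features)  # [('box-1', ['points'], (5.0, -3.0))]
```

Passing the models in as callables keeps the control flow independent of any particular detection, landmark, or pose network.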

[0130] It should be noted that the face detection criteria involved in S402 include, but are not limited to, the size of the detected face being greater than a set threshold or the confidence level of the detected face being greater than a set threshold. The pre-trained models include, but are not limited to, any deep learning-based model capable of face detection, any model capable of detecting facial landmarks (including at least the pupil points), and any model capable of detecting face pose features (including at least the horizontal angle value yaw and the vertical angle value pitch). The facial landmark detection model can be, for example, one of the following: PFLD (A Practical Facial Landmark Detector), TCDCN (Tasks-Constrained Deep Convolutional Network), DAN (Deep Alignment Networks), or TCNN (Tweaked Convolutional Neural Networks).

[0131] S303: Determine whether the facial features have been successfully extracted. If so, proceed to S304; otherwise, proceed to S301 to obtain the next frame image.

[0132] S304, distance detection: the distance detection module calculates the distance between the face and the display panel based on the features extracted by the feature extraction module. As described above, the distance detection module includes the pupil distance measurement module 214, the face distance measurement module 215, the angle correction distance module 216, and the distance smoothing module 217.

[0133] In some embodiments, the Euclidean distance between the pupils in the image is used as the pupil distance, i.e., the pupil distance is obtained by the pupil distance measurement module 214. For example, using the results of S302 (which include the bounding rectangle of each face, the facial landmarks, and the face pose feature values), the face height, pupil distance, horizontal angle yaw, and vertical angle pitch of each person can be obtained. As shown in Figure 7, yaw represents the angle of left-right rotation of the face, and pitch represents the angle of up-down rotation of the face. Figure 8 is a schematic diagram of the key landmarks on a human face. After the landmarks are obtained, the pixel pupil distance P is calculated as the distance between point 97 (x97, y97) and point 98 (x98, y98) in Figure 8: P = sqrt((x98-x97)*(x98-x97)+(y98-y97)*(y98-y97)), where "sqrt" denotes the square root operation.
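The computation of P from the landmark coordinates can be sketched in Python. The sketch assumes the 98 landmarks are stored as a zero-based sequence of (x, y) pairs in the numbering order of Figure 8, so that points 97 and 98 sit at indices 96 and 97; that storage layout, the function name, and the sample coordinates are assumptions.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def pupil_pixel_distance(landmarks: Sequence[Point]) -> float:
    """P = sqrt((x98 - x97)^2 + (y98 - y97)^2) for the two pupil points."""
    if len(landmarks) < 98:
        raise ValueError("expected 98 facial key points")
    x97, y97 = landmarks[96]  # point 97: one pupil
    x98, y98 = landmarks[97]  # point 98: the other pupil
    return math.hypot(x98 - x97, y98 - y97)


# Illustrative landmark set: only the two pupil points matter here.
landmarks = [(0.0, 0.0)] * 96 + [(100.0, 200.0), (160.0, 280.0)]
print(pupil_pixel_distance(landmarks))  # offsets of 60 and 80 px give 100 px
```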

[0134] In some embodiments of this application, the height of the face bounding box is used as the face height.

[0135] Unlike the examples that directly use the Euclidean distance as the pupil distance or the height of the face bounding box as the face height, in some embodiments of this application, the Euclidean distance corrected by the horizontal angle value and the height of the face bounding box corrected by the vertical angle value are used as the pupil distance and the face height, respectively.

[0136] For example, in some embodiments of this application, the Euclidean distance between the left and right pupils obtained from the image is input to the angle correction distance module for angle mapping, which completes the correction. The function of the angle correction distance module 216 is explained with reference to Figure 9. In Figure 9, L represents the initial pupil image distance calculated on the imaging plane, i.e., the pixel distance between the left pupil keypoint 97' and the right pupil keypoint 98' in Figure 9, which is equal to the pixel distance between the left pupil keypoint 97 and the right pupil keypoint 98 in Figure 8. If the face has a yaw angle relative to the display panel, i.e., a horizontal angle yaw = β, then the pupil distance should be the distance between 97'' and 98'' in Figure 9, denoted L'. To eliminate the error caused by the angle, the pupil distance is L' = L / cosβ, where L is the initial pupil image distance. The corrected Euclidean distance obtained after angle mapping is used as the pupil distance required in formula (1) above. In some other embodiments of this application, the distance between the face and the display panel may also be calculated based on the face height. In this case, the face height can be corrected by the value of the vertical angle to obtain a corrected face height, which is then used as the face height in formula (3) above to calculate the distance between the face and the display panel.
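The angle mapping L' = L / cosβ can be illustrated with a short Python sketch. The function name `correct_pupil_distance`, the use of degrees, and the guard against angles near 90° are assumptions added for the example; the text itself only gives the formula for the pupil distance and does not state the corresponding formula for the face height.

```python
import math


def correct_pupil_distance(initial_px, yaw_deg, max_yaw_deg=80.0):
    """Apply L' = L / cos(beta) to the initial pupil image distance L.

    max_yaw_deg is a hypothetical guard: near 90 degrees cos(beta)
    approaches zero and the corrected value becomes unusable.
    """
    if abs(yaw_deg) >= max_yaw_deg:
        raise ValueError("yaw too large for a meaningful correction")
    return initial_px / math.cos(math.radians(yaw_deg))


# A 50-pixel pupil distance measured with the face turned 60 degrees
# maps to roughly 100 pixels after correction.
print(correct_pupil_distance(50.0, 60.0))
```

With yaw = 0 the function returns L unchanged, which matches the uncorrected embodiments.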

[0137] As an example, after the equivalent focal lengths (the horizontal focal length fx and the vertical focal length fy) are obtained through calibration, the distance between the face and the display panel can be calculated as follows. If the distance detection module uses the pupil distance, the formula is Z = fx * MN / M'N'. If the face height is used, the formula is Z = fy * EF / E'F'.
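As a worked illustration of the two formulas in [0137], the sketch below assumes that MN and EF are the actual pupil distance and actual face height, and that M'N' and E'F' are their lengths in the image in pixels, which is how claims 2 and 5 describe the calculation. The function names and all numeric values (a 63 mm pupil distance, a 200 mm face height, a 1000-pixel focal length) are hypothetical.

```python
def distance_from_pupils(fx_px, real_pupil_mm, pupil_px):
    """Z = fx * MN / M'N' (result in the units of real_pupil_mm)."""
    return fx_px * real_pupil_mm / pupil_px


def distance_from_face_height(fy_px, real_face_height_mm, face_height_px):
    """Z = fy * EF / E'F' (result in the units of real_face_height_mm)."""
    return fy_px * real_face_height_mm / face_height_px


# Hypothetical values: focal lengths in pixels, real sizes in millimetres.
print(distance_from_pupils(1000.0, 63.0, 30.0))         # 2100.0 mm
print(distance_from_face_height(1000.0, 200.0, 100.0))  # 2000.0 mm
```

Both functions are the same similar-triangles relation of the pinhole model; only the reference length and the focal length axis differ.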

[0138] In other words, in some embodiments of this application, when the value of the horizontal angle yaw of the face is less than the preset horizontal angle threshold, the distance is calculated directly from the pupil distance. When the value of the horizontal angle is greater than or equal to the preset horizontal angle threshold and the value of the vertical angle pitch is less than the preset vertical angle threshold, the distance is calculated from the obtained face height. If neither condition is met, no distance is calculated for the current frame and processing continues with the next frame.

[0139] The following describes, with reference to Figure 10, how a smart device such as a smart TV determines the distance between a face and the display panel based on the acquired features. As shown in Figure 10, in some embodiments of this application, the method for obtaining the distance between the display panel of a smart TV and a face includes: S501, obtaining the bounding rectangle of the face (in some embodiments, used to characterize the face height), the facial key points (used to determine the pupil distance), and the facial pose feature values. S502, calculating the pupil distance based on the facial key points, for example, obtaining the Euclidean distance as the pupil distance. S503, determining whether the value of the horizontal angle of the face is less than a set threshold; if so, executing S505, that is, calculating the initial distance between the face and the display screen using the Euclidean distance between the left and right pupils in the image as the pupil distance; otherwise, executing S507. Unlike these embodiments, in other embodiments of this application, after S503 is executed, the Euclidean distance obtained in S502 can be input into a first angle mapping module. This module corrects the Euclidean distance using the value of the horizontal angle to obtain a more accurate pupil distance, and S505 is then completed based on this corrected pupil distance.

[0140] In some embodiments of this application, S507 includes determining whether the value of the vertical angle of the face is less than a vertical angle threshold. If it is, S509 is executed to calculate the initial distance between the face and the display screen based on the face frame height value. Unlike these embodiments, in other embodiments, the mapping process of the second angle mapping module can be applied before S509 is executed, i.e., the face frame height value is corrected based on the vertical angle value, and the initial distance between the face and the display screen is calculated based on the corrected face frame height value. In this case, the face frame height value in S509 is the corrected face frame height value.
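The branching of S503, S505, S507, and S509 can be sketched as a single Python function. This is a minimal illustration rather than the claimed implementation: the function name, the threshold values (30° and 20°), the real-world reference sizes, and the use of absolute angle values are all assumptions, and the optional angle mapping steps are omitted. The comparison uses "less than" as in [0138].

```python
from typing import Optional


def estimate_distance(pupil_px: float, face_height_px: float,
                      yaw_deg: float, pitch_deg: float,
                      fx: float, fy: float,
                      real_pupil_mm: float = 63.0,          # hypothetical
                      real_face_height_mm: float = 200.0,   # hypothetical
                      yaw_threshold_deg: float = 30.0,      # hypothetical
                      pitch_threshold_deg: float = 20.0     # hypothetical
                      ) -> Optional[float]:
    """Pick pupil distance or face height by pose, then apply Z = f * real / image.

    Returns None when neither condition holds, meaning the caller should
    move on to the next frame.
    """
    if abs(yaw_deg) < yaw_threshold_deg:          # S503 -> S505
        return fx * real_pupil_mm / pupil_px
    if abs(pitch_deg) < pitch_threshold_deg:      # S507 -> S509
        return fy * real_face_height_mm / face_height_px
    return None                                   # no image distance selected


print(estimate_distance(30.0, 100.0, yaw_deg=10.0, pitch_deg=0.0, fx=1000.0, fy=1000.0))
print(estimate_distance(30.0, 100.0, yaw_deg=45.0, pitch_deg=5.0, fx=1000.0, fy=1000.0))
print(estimate_distance(30.0, 100.0, yaw_deg=45.0, pitch_deg=40.0, fx=1000.0, fy=1000.0))
```

Returning `None` rather than a sentinel number keeps the "no distance for this frame" case explicit for the caller.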

[0141] To further mitigate abrupt distance changes caused by face rotation, mean filtering can be applied to the calculated distance between the face and the display panel, i.e., the distance smoothing of S506 in Figure 10. Specifically, as shown in Figure 11, a sliding window length Len = 3 can be set. The distance between the face and the display panel, calculated after angle mapping, is used as the input of the sliding window, i.e., element 4. When the sliding window is full, the earliest element 1 in the sliding window is popped out. The sliding window then contains elements 2, 3, and 4 (these elements are three historical distance values). Distance smoothing averages elements 2, 3, and 4 and outputs the average to the reminder module. This step can reduce abrupt changes in distance to some extent. It should be noted that in some embodiments, the historical distance values are themselves distance-smoothed values. In other embodiments of this application, the elements in the sliding window can also be unsmoothed distance values.
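The sliding-window mean filter with Len = 3 can be sketched in Python with a bounded deque. The class name `DistanceSmoother` is hypothetical. The sketch stores unsmoothed values in the window (one of the two variants mentioned above), and averaging over fewer than three values while the window is still filling is an assumption, since the text only describes the full window.

```python
from collections import deque


class DistanceSmoother:
    """Mean filter over the most recent `length` distance values."""

    def __init__(self, length=3):
        # deque(maxlen=...) drops the oldest element automatically when full.
        self._window = deque(maxlen=length)

    def update(self, distance):
        """Push a new distance and return the mean of the current window."""
        self._window.append(distance)
        return sum(self._window) / len(self._window)


smoother = DistanceSmoother(length=3)
for d in (1.0, 2.0, 3.0, 4.0):   # elements 1 to 4 from the text
    smoothed = smoother.update(d)
print(smoothed)  # mean of elements 2, 3, 4
```

After the fourth update the window holds elements 2, 3, and 4, so the output is their mean, matching the description of Figure 11.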

[0142] S305, determine whether the distance detection was successful. If successful, proceed to S306; otherwise, return to S301 to obtain the next frame image.

[0143] S306, determine whether a reminder is needed. If yes, execute S307; otherwise, return to S301 to obtain the next frame image.

[0144] S307, remind the user, that is, output a warning message through the reminder module 218.

[0145] The averaged distance value is passed to the reminder module 218, which determines whether it is less than a preset threshold. If it is, the user is too close and the module reminds the user; the reminder method is not limited to voice prompts. If it is not, the module continues to process subsequent input distance values.

[0146] When applied to smart TVs, it can provide users with excellent distance detection capabilities, reminding them to use the TV safely and protecting their eyesight.

[0147] S308, End.
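The control flow of S301 to S307 can be summarized in a Python sketch. Everything here is a simplified assumption: `extract_features` and `measure_distance` are hypothetical callables standing in for the feature extraction and distance detection modules, a single viewer is assumed, and the reminder is represented by collecting the smoothed distances that would trigger a warning.

```python
from collections import deque
from typing import Callable, Iterable, List, Optional


def run_distance_loop(frames: Iterable[object],
                      extract_features: Callable[[object], Optional[object]],
                      measure_distance: Callable[[object], Optional[float]],
                      warn_below: float,
                      window_len: int = 3) -> List[float]:
    """Process frames one by one and return the distances that triggered a reminder."""
    window = deque(maxlen=window_len)
    warned: List[float] = []
    for frame in frames:                        # S301: obtain the next frame
        features = extract_features(frame)      # S302: feature extraction
        if features is None:                    # S303: extraction failed
            continue
        distance = measure_distance(features)   # S304: distance detection
        if distance is None:                    # S305: detection failed
            continue
        window.append(distance)
        smoothed = sum(window) / len(window)    # mean filtering
        if smoothed < warn_below:               # S306: reminder needed?
            warned.append(smoothed)             # S307: stand-in for the reminder
    return warned


# Toy run: frames are already distances; None and negative values simulate failures.
result = run_distance_loop([None, 3000.0, -1.0, 1000.0, 1000.0],
                           extract_features=lambda f: f,
                           measure_distance=lambda x: x if x > 0 else None,
                           warn_below=2000.0)
print(result)
```

Passing the modules in as callables keeps the sketch independent of any particular detector or camera interface.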

[0148] In other words, in some examples of this application, the Euclidean distance between the left and right pupil points among the facial landmarks in an image can be calculated directly. In some embodiments of this application, facial pose feature values are introduced (as shown in Figure 7, yaw represents the angle of left-right facial rotation and pitch represents the angle of up-down facial rotation) and combined with pupil distance measurement, so that yaw is used to eliminate abrupt distance changes caused by facial tilting at the same position. In some embodiments of this application, by introducing facial pose feature values and combining them with face height measurement, when the vertical angle value pitch is small and the horizontal angle value yaw is large, the problem of abrupt distance changes caused by the face looking up or down is solved. In some embodiments of this application, mean filtering is also introduced to smooth the distance and suppress abrupt distance changes caused by changes in facial angle at the same position. To calculate the distance, the equivalent focal lengths fx and fy of the camera need to be obtained, and a pinhole imaging model is used for calibration.
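The calibration procedure is not detailed in this passage. One possibility consistent with the pinhole relation Z = f * (real size) / (image size) is to place a reference of known size at a known distance and solve for f. The sketch below illustrates that assumption only; the function name and the numeric values are hypothetical.

```python
def calibrate_focal_length(known_distance, real_size, image_size_px):
    """Solve Z = f * real_size / image_size_px for f (in pixels).

    known_distance and real_size must share the same unit.
    """
    return known_distance * image_size_px / real_size


# Hypothetical calibration shot: a 63 mm reference imaged at 2000 mm
# spans 31.5 pixels horizontally, giving fx = 1000 pixels.
fx = calibrate_focal_length(2000.0, 63.0, 31.5)
print(fx)
```

The same function would give fy when the reference length and its pixel span are measured vertically.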

[0149] Understandably, unlike existing technologies that use the pupil distance to calculate the distance between the human eye and the display screen regardless of conditions, some embodiments of this application calculate that distance from the pupil distance only when the facial posture feature values (e.g., the angles representing left-right and up-down rotation of the face) indicate that the degree of left-right rotation of the face is small. This effectively overcomes the problem of abrupt distance changes caused by head tilting. In addition, when the facial posture feature values indicate that the degree of left-right rotation is large and the degree of up-down rotation is small, some embodiments of this application calculate the distance between the human eye and the display screen from the face height measurement instead. This effectively overcomes the low accuracy and abrupt distance changes that arise when the pupil distance is used despite face rotation. Some embodiments of this application also introduce mean filtering to smooth the distance and suppress abrupt distance changes caused by changes in the face angle at the same location.

[0150] Please refer to Figure 12, which shows the apparatus for obtaining distance provided in the embodiments of this application. It should be understood that this apparatus corresponds to the method embodiments of Figure 2 described above and can execute the various steps involved in those method embodiments. The specific functions of the apparatus can be found in the description above; to avoid repetition, detailed descriptions are omitted here where appropriate. The apparatus includes at least one software function module that can be stored in memory, or embedded in the operating system of the apparatus, in the form of software or firmware. The distance acquisition apparatus includes: a feature acquisition module 601, configured to acquire the face height, pupil distance, and face pose feature value of the j-th object in the i-th frame image, wherein the face pose feature value is used to characterize the face rotation attribute, and i and j are integers greater than or equal to 1; and a distance calculation module 602, configured to perform the following operations: if an image distance is selected from the face height and the pupil distance based on the face pose feature value, calculate the distance between the face of the j-th object and the display panel based at least on the image distance and the focal length; if no image distance is obtained based on the face pose feature value, acquire the (i+1)-th frame image and repeat the above process to obtain the distance between the face of the j-th object and the display panel.

[0151] It should be noted that, in some embodiments of this application, the distance acquisition device also includes the image capture module and the feature extraction module of Figure 4, and the distance calculation module 602 of Figure 12 corresponds to the distance detection module of Figure 4.

[0152] Some embodiments of this application provide a computer program product, the computer program product including a computer program, which, when executed by a processor, implements the distance acquisition method described in any of the embodiments of the first aspect above.

[0153] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working process of the device described above can be referred to the corresponding process in the aforementioned method, and will not be elaborated further here.

[0154] Some embodiments of this application provide a system comprising one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations according to the method for obtaining distance described above.

[0155] Some embodiments of this application provide one or more computer storage media for storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations according to the method for obtaining distance described above.

[0156] In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can also be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods, and computer program products according to various embodiments of this application. In this regard, each block in a flowchart or block diagram may represent a module, segment, or portion of code containing one or more executable instructions for implementing a specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in a different order than those marked in the drawings. For example, two consecutive blocks may actually be executed substantially in parallel, and they may sometimes be executed in reverse order, depending on the functions involved. It should also be noted that each block in a block diagram and / or flowchart, and combinations of blocks in block diagrams and / or flowcharts, can be implemented using a dedicated hardware-based system that performs the specified function or action, or using a combination of dedicated hardware and computer instructions.

[0157] In addition, the functional modules in the various embodiments of this application can be integrated together to form an independent part, or each module can exist independently, or two or more modules can be integrated to form an independent part.

[0158] If the aforementioned functions are implemented as software functional modules and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or a portion of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.

[0159] The above description is merely an embodiment of this application and is not intended to limit the scope of protection of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application. It should be noted that similar reference numerals and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.

[0160] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.

[0161] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.

Claims

1. A method for obtaining distance, characterized in that, The method includes: Obtain the face height, pupil distance, and face pose feature value of the j-th object in the i-th frame image, wherein the face pose feature value is used to characterize the face rotation attribute of the j-th object, and i and j are integers greater than or equal to 1; If, based on the facial pose feature value, it is determined that one of the facial height and the pupil distance is selected as the image distance, then the distance between the face of the j-th object and the display panel is calculated at least based on the image distance and the focal length, wherein the focal length is the attribute parameter value of the image acquisition unit used to acquire the image of the j-th object; If the image distance is not obtained based on the facial pose feature value, the above process is repeated for the (i+1)th frame image to obtain the distance between the face of the jth object and the display panel; The step of determining an image distance from the face height and the pupil distance based on the face pose feature value includes: determining the degree of face rotation based on the face pose feature value, and selecting an image distance from the face height and the pupil distance based on the degree of face rotation.

2. The method as described in claim 1, characterized in that, The facial pose feature value includes a horizontal angle value, wherein the horizontal angle value is used to characterize the angle of left and right rotation of the face of the j-th object; The step of determining the image distance from the face height and the pupil distance based on the face pose feature value includes: If it is confirmed that the value of the horizontal angle is less than or equal to the horizontal angle threshold, then the pupil distance is selected as the image distance. The calculation of the distance between the face and the display panel based at least on the image distance and focal length includes: The distance between the face and the display panel is calculated based on the actual pupil distance and the horizontal focal length, wherein the horizontal focal length is obtained through calibration.

3. The method as described in claim 2, characterized in that, The step of obtaining the pupil distance of the j-th object in the i-th frame image includes: The left and right pupil points of the j-th object in the i-th frame image are obtained according to the pupil recognition model; The pupil distance is obtained by calculating the Euclidean distance between the pupil points of the left and right eyes.

4. The method as described in claim 3, characterized in that, The calculation of the Euclidean distance between the pupil points of the left and right eyes to obtain the pupil distance includes: The initial image distance of the pupil is obtained by calculating the Euclidean distance between the pupil points of the left and right eyes; The pupil distance is obtained by correcting the initial image distance of the pupil based on the value of the horizontal angle.

5. The method as described in claim 2, characterized in that, The facial pose feature value also includes a vertical angle value, which is used to characterize the vertical rotation angle of the face of the j-th object; The step of determining the image distance from the face height and the pupil distance based on the face pose feature value includes: If it is confirmed that the horizontal angle is greater than the horizontal angle threshold and the vertical angle is less than the vertical angle threshold, then the face height is selected as the image distance. The calculation of the distance between the face and the display panel based at least on the image distance and focal length includes: The distance between the face and the display panel is calculated based on the actual height of the face, the face height, and the vertical focal length, wherein the vertical focal length is obtained through calibration.

6. The method as described in claim 5, characterized in that, The face height is the height of the bounding box of the face of the j-th object obtained by the face detection algorithm on the i-th frame image.

7. The method as described in claim 5, characterized in that, The face height is obtained in the following way: The bounding box of the face of the j-th object is obtained on the i-th frame image using a face detection algorithm; The face height is obtained by correcting the height of the bounding box based on the value of the vertical angle.

8. The method according to any one of claims 1-7, characterized in that, The method further includes: The facial key points of the j-th object are extracted based on the facial key point detection model; wherein the facial pose feature value is obtained by inputting the facial key points and the facial bounding box into the facial pose acquisition model.

9. The method according to any one of claims 1-7, characterized in that, The method further includes: obtaining the historical distance value between the j-th object and the display panel through the previous frame or multiple previous frames of the i-th frame image; The calculation of the distance between the face of the j-th object and the display panel, based at least on the image distance and focal length, includes: The i-th distance value between the face of the j-th object and the display panel is obtained based on the i-th frame image; The distance is obtained based on the i-th distance value and the historical distance value.

10. The method as described in claim 9, characterized in that, The method further includes: storing the historical distance values via a sliding window; wherein the step of obtaining the distance based on the i-th distance value and the historical distance value includes: Store the i-th distance value into the sliding window; The distance is obtained by calculating the average of all the data stored in the sliding window.

11. The method according to any one of claims 1-7, characterized in that, Before obtaining the face height, pupil distance, and face pose feature values of the j-th object in the i-th frame image, the method further includes: Confirm that all objects on the i-th frame image have been detected; Confirm that facial key points of all the objects have been extracted, wherein the facial key points include at least the left eye pupil point and the right eye pupil point; Confirm that the facial pose feature values of all the objects have been obtained.

12. A smart device, characterized in that, The smart device includes: a display panel; an image capture module configured to acquire each frame of images, wherein each frame of images is obtained by the image acquisition unit capturing the viewer who is watching the program on the display panel; a feature extraction module configured to: detect faces present in each frame of the image to obtain the bounding box of each face, extract key point features from the detected faces, and obtain face pose feature values based on the face bounding box and the key point features; a distance detection module configured to perform the following operations on each detected object in each frame of an image: obtain the face height, pupil distance, and face pose feature value of the j-th object in the i-th frame of the image, wherein the face pose feature value is used to characterize the face rotation attribute of the j-th object, and i and j are integers greater than or equal to 1; if, based on the face pose feature value, one of the face height and the pupil distance is selected as the image distance, then the distance between the face of the j-th object and the display panel is calculated based at least on the image distance and the focal length; if the image distance is not obtained based on the face pose feature value, then the above process is repeated for the (i+1)-th frame of the image to obtain the distance between the face of the j-th object and the display panel; and an alert module configured to generate and provide a warning message when the calculated distance between the face and the display panel is greater than or equal to a set distance threshold; wherein the step of determining an image distance from the face height and the pupil distance based on the face pose feature value includes: determining the degree of face rotation based on the face pose feature value, and selecting an image distance from the face height and the pupil distance based on the degree of face rotation.

13. A system comprising one or more computers and one or more storage devices storing instructions, wherein when the instructions are executed by the one or more computers, the one or more computers perform the method according to any one of claims 1-11.

14. A computer storage medium storing instructions that, when executed by one or more computers, cause the one or more computers to perform the method according to any one of claims 1-11.

15. A computer program product, characterized in that, The computer program product includes a computer program that, when executed by a processor, implements the distance acquisition method according to any one of claims 1-11.