Face recognition method and device with ultra-wide viewing angle and computer device

By dynamically adjusting computing power and fusing features, the method addresses the loss of recognition efficiency and accuracy that face recognition suffers when the spatiotemporal environment changes under an ultra-wide field of view, and achieves efficient and robust face recognition in complex environments.

CN122244987A · Pending · Publication Date: 2026-06-19 · ZHUHAI BCOM ELECTRONIC TECH CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
ZHUHAI BCOM ELECTRONIC TECH CO LTD
Filing Date
2026-05-20
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing facial recognition technologies suffer from severe image distortion under ultra-wide viewing angles, making it difficult to cover large monitoring areas. They also lack the ability to perceive the spatiotemporal environment, resulting in low recognition efficiency and high false recognition rates. In particular, they struggle to balance recognition speed and accuracy in scenarios with drastic changes in lighting or high concurrency.

Method used

By integrating multi-dimensional environmental information such as time, location, and lighting, the computing power configuration is dynamically adjusted. Combined with infrared supplementary lighting and binocular liveness detection, distortion correction and feature fusion are achieved, and computing power is dynamically allocated to improve recognition efficiency and robustness.

Benefits of technology

It achieves efficient and robust face recognition with an ultra-wide field of view, balancing recognition speed and accuracy, adapting to environmental changes in different scenarios, and improving the system's scene adaptability and recognition accuracy.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses an ultra-wide-angle face recognition method, device, and computer equipment. The method includes: acquiring current spatiotemporal environment data and determining target computing power configuration parameters based on the spatiotemporal environment data; capturing an ultra-wide-angle original image with a horizontal field of view of not less than 120 degrees through a wide-angle acquisition module, and having an infrared illumination module generate a preprocessed image according to the ambient light intensity and the target computing power configuration parameters; using a main control module to call a distortion correction model matched with the target computing power configuration parameters to perform distortion correction and wide dynamic range fusion processing on the preprocessed image to generate an enhanced image; and controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information acquired by the binocular liveness detection module, combined with the current time information. This application can improve the efficiency, robustness, and scene adaptability of face recognition under ultra-wide-angle conditions.

Description

Technical Field

[0001] This invention relates to the fields of computer vision and biometric recognition technology, and in particular to an ultra-wide-angle face recognition method, apparatus and computer equipment.

Background Technology

[0002] Existing facial recognition technologies typically use cameras with ordinary viewing angles, which are insufficient to cover large monitoring areas. Especially in ultra-wide viewing angle scenarios (horizontal field of view ≥ 90°), severe image distortion occurs, making facial feature extraction difficult. Furthermore, traditional methods lack the ability to perceive spatiotemporal environment (time, location, lighting), and with fixed computing power allocation, they are prone to low recognition efficiency and high false recognition rates in low-light, high-concurrency, or specific geographical locations. For example, during rush hour or in areas with severe light pollution, where lighting changes drastically and building structures are complex, existing technologies struggle to balance recognition speed and accuracy. Therefore, there is an urgent need for an adaptive facial recognition solution that can dynamically adjust computing power and algorithm parameters according to the environment.

Summary of the Invention

[0003] This invention aims to address at least one of the technical problems existing in the prior art. To this end, this invention proposes a spatiotemporally adaptive ultra-wide-view face recognition method, device, and computer equipment, which can achieve dynamic computing power scheduling, distortion correction, and feature fusion by fusing multi-dimensional environmental information such as time, location, and illumination, thereby improving the efficiency, robustness, and scene adaptability of face recognition under ultra-wide-view conditions.

[0004] In a first aspect, this application proposes an ultra-wide-angle face recognition method applied to a door station hardware platform. The door station hardware platform includes a main control module and, connected to it, a wide-angle acquisition module, an infrared fill light module, and a binocular liveness detection module. The method includes:

Acquiring current spatiotemporal environment data and determining target computing power configuration parameters based on the spatiotemporal environment data, wherein the spatiotemporal environment data includes current time information, device geographical location information, and ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network;

Capturing, through the wide-angle acquisition module, an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees, and instructing the infrared fill light module to provide fill light for the ultra-wide field-of-view original image according to the ambient light intensity and the target computing power configuration parameters, to obtain a preprocessed image;

Calling, by the main control module, a distortion correction model that matches the target computing power configuration parameters to perform distortion correction and wide dynamic range fusion processing on the preprocessed image to generate an enhanced image, wherein the distortion correction model is dynamically loaded according to the device geographical location information, and the nonlinear mapping parameters of the distortion correction models loaded for different geographical locations are different;

Controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information.

[0005] In one embodiment, controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, includes:

Fusing, at the feature level, the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image to obtain an anti-attack fusion feature;

Determining, based on the current time information, whether the current time falls within a preset high-concurrency period and, if so, starting the pre-allocated computing power channel to perform real-time comparison and liveness verification on the anti-attack fusion feature;

If identification succeeds and the person is verified as live, triggering the access control unlocking operation; otherwise, generating a corresponding exception handling strategy based on the geographical location information.

[0006] In one embodiment, determining the target computing power configuration parameters based on the spatiotemporal environment data includes:

Constructing a spatiotemporal mapping table that records the empirical values of device load corresponding to different time periods and different geographical locations;

If the current time information falls within morning or evening peak hours and the device geographical location information indicates a densely populated area, setting the target computing power configuration parameters to high load mode, in which the proportion of computing power allocated to feature extraction is higher than a preset threshold;

If the current time information falls within off-peak hours and the device geographical location information indicates a low-light environment, setting the target computing power configuration parameters to energy-saving enhancement mode, in which increasing the infrared fill light power takes priority and the recognition accuracy of non-critical areas is reduced.

[0007] In one embodiment, capturing, through the wide-angle acquisition module, an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees includes:

Matching a corresponding light environment prediction model according to the device geographical location information, the light environment prediction model being trained on historical illumination data;

If the device geographical location information corresponds to a preset geographical location and the predicted ambient light attenuation coefficient is greater than a preset value, extending the exposure time of the wide-angle acquisition module by a first preset duration and simultaneously adjusting the wavelength of the infrared fill light module to the 850 nm band.

[0008] In one embodiment, calling, by the main control module, a distortion correction model that matches the target computing power configuration parameters includes:

Obtaining the lens's physical distortion parameters and the current image resolution;

If the building layout corresponding to the device geographical location information is an arc-shaped structure, loading a preset arc compensation model, wherein the arc compensation model applies a non-uniform grid interpolation algorithm to map pixels at the image edges, compensating for the visual distortion caused by reflections from the arc-shaped building.

[0009] In one embodiment, determining whether the current time falls within a preset high-concurrency period based on the current time information includes:

Obtaining historical recognition logs and analyzing the frequency of face recognition requests for different date types;

If the current time information is within a preset time range and the device geographical location information is within a preset geographical location range, determining that it is a high-concurrency period and pre-activating a multi-threaded parallel processing queue;

If the length of the face request queue exceeds a preset value, reducing the inference accuracy of the neural network and the clock frequency of the image processing unit.
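The queue-based fallback in [0009] could be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `ComputeConfig`, `degrade_if_backlogged`, the precision ladder ordering, the queue limit of 32, and the 100 MHz step are all hypothetical. The source only states that inference accuracy and clock frequency are reduced when the queue length exceeds a preset value.

```python
from dataclasses import dataclass

# Hypothetical precision ladder, ordered from most to least precise.
PRECISIONS = ["FP32", "FP16", "INT8"]


@dataclass(frozen=True)
class ComputeConfig:
    ipu_clock_mhz: int   # image processing unit clock (hypothetical field name)
    precision: str       # neural network inference data format


def degrade_if_backlogged(cfg, queue_len, max_queue_len=32,
                          min_clock_mhz=400, clock_step_mhz=100):
    """Step precision and clock down once when the request queue is too long."""
    # "Exceeds" is read as strictly greater than the preset value.
    if queue_len <= max_queue_len:
        return cfg
    idx = PRECISIONS.index(cfg.precision)
    next_precision = PRECISIONS[min(idx + 1, len(PRECISIONS) - 1)]
    next_clock = max(min_clock_mhz, cfg.ipu_clock_mhz - clock_step_mhz)
    return ComputeConfig(next_clock, next_precision)
```

Calling the function once per scheduling tick steps the configuration down gradually, and the floor values keep it from dropping below a usable baseline.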

[0010] In one embodiment, triggering the access control unlocking operation if identification succeeds and the person is verified as live, and otherwise generating a corresponding exception handling strategy based on the geographical location information, includes:

If the device geographical location information indicates a residential area and recognition fails, activating a local cache of unfamiliar faces, which is automatically deleted after a preset time interval;

If the device geographical location information indicates an industrial park and recognition fails, uploading the abnormal data to the cloud management platform in real time and triggering an immediate alarm notification based on the current time information.
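The location-dependent failure handling in [0010] might look like the sketch below. All names here (`StrangerCache`, `handle_failure`, the site type strings, and the `upload` and `alarm` callbacks) are hypothetical placeholders; the source does not define an API, a cache lifetime, or the alarm transport.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class StrangerCache:
    """Local cache of unrecognized faces; entries expire after ttl_seconds."""
    ttl_seconds: float
    entries: Dict[str, float] = field(default_factory=dict)

    def add(self, face_id: str, now: float) -> None:
        self.entries[face_id] = now

    def purge(self, now: float) -> None:
        # Keep only entries younger than the preset time interval.
        self.entries = {k: t for k, t in self.entries.items()
                        if now - t < self.ttl_seconds}


def handle_failure(site_type: str, face_id: str, now: float,
                   cache: StrangerCache,
                   upload: Callable[[str], None],
                   alarm: Callable[[str, float], None]) -> str:
    """Pick an exception handling strategy from the site type."""
    cache.purge(now)
    if site_type == "residential":
        cache.add(face_id, now)          # cache locally, delete later
        return "cached_locally"
    if site_type == "industrial_park":
        upload(face_id)                  # real-time upload to the cloud platform
        alarm(face_id, now)              # alarm carries the current time
        return "uploaded_and_alarmed"
    return "no_action"
```

Passing the upload and alarm actions in as callables keeps the policy testable without a network connection.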

[0011] In one embodiment, the feature-level fusion of the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image includes:

Performing noise reduction on the infrared depth information to generate a depth confidence map;

Concatenating, with weights, the face feature vector of the enhanced image and the depth confidence map to generate a fused feature vector;

Evaluating the authenticity of the fused feature vector using a pre-trained generative adversarial network discriminator and, if the result is true, outputting a liveness verification pass signal.
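The three fusion steps in [0011] can be illustrated with NumPy as below. This is only a sketch under stated assumptions: the source does not specify the noise reduction method, how confidence is computed, or the weights, so the 3x3 mean filter, the valid-range confidence rule, the 2.0 m range, and the 0.7/0.3 weights are hypothetical. The discriminator is passed in as an arbitrary callable standing in for the pre-trained GAN discriminator.

```python
import numpy as np


def box_denoise(depth: np.ndarray) -> np.ndarray:
    """3x3 mean filter with edge replication (stand-in for the noise reduction)."""
    h, w = depth.shape
    padded = np.pad(depth.astype(np.float64), 1, mode="edge")
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0


def depth_confidence_map(depth_m: np.ndarray, max_range_m: float = 2.0) -> np.ndarray:
    """Assumed rule: confidence 1.0 where smoothed depth is valid and in range."""
    smooth = box_denoise(depth_m)
    return ((smooth > 0.0) & (smooth <= max_range_m)).astype(np.float64)


def fuse_features(face_vec, conf_map, w_face=0.7, w_depth=0.3) -> np.ndarray:
    """Weighted concatenation of the face vector and the flattened confidence map."""
    face = np.asarray(face_vec, dtype=np.float64)
    return np.concatenate([w_face * face, w_depth * conf_map.ravel()])


def liveness_passed(fused, discriminator, threshold=0.5) -> bool:
    """discriminator is any callable returning a realness score in [0, 1]."""
    return float(discriminator(fused)) >= threshold
```

A flat photo or screen would typically yield an implausible depth map, which is why the depth-derived part of the fused vector carries the anti-attack signal.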

[0012] In a second aspect, this application also proposes an ultra-wide-angle face recognition device applied to a door station hardware platform. The door station hardware platform includes a main control module and, connected to it, a wide-angle acquisition module, an infrared supplementary lighting module, and a binocular liveness detection module. The device includes:

A spatiotemporal awareness configuration module, used to acquire current spatiotemporal environment data and determine target computing power configuration parameters based on the spatiotemporal environment data, wherein the spatiotemporal environment data includes current time information, device geographical location information, and ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network;

An image acquisition module, used to capture an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees through the wide-angle acquisition module, and to instruct the infrared supplementary lighting module to provide supplementary lighting for the ultra-wide field-of-view original image according to the ambient light intensity and the target computing power configuration parameters, to obtain a preprocessed image;

An image processing module, used to call, through the main control module, a distortion correction model that matches the target computing power configuration parameters, and to perform distortion correction and wide dynamic range fusion processing on the preprocessed image to generate an enhanced image, wherein the distortion correction model is dynamically loaded according to the device geographical location information, and the nonlinear mapping parameters of the distortion correction models loaded for different geographical locations are different;

A control module, used to control the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information.

[0013] In a third aspect, this application also proposes a computer device, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the method described in the first aspect.

[0014] The ultra-wide-angle face recognition method, apparatus, and computer device according to embodiments of the present invention have at least the following beneficial effects: Current spatiotemporal environmental data is acquired, and target computing power configuration parameters are determined based on it, wherein the spatiotemporal environmental data includes current time information, device geographical location information, and ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network. An ultra-wide-angle original image with a horizontal field of view of not less than 120 degrees is captured by the wide-angle acquisition module, and the infrared supplementary lighting module is instructed to generate a preprocessed image according to the ambient light intensity and the target computing power configuration parameters. The main control module calls a distortion correction model matched with the target computing power configuration parameters to perform distortion correction and wide dynamic range fusion processing on the preprocessed image to generate an enhanced image. The distortion correction model is dynamically loaded according to the device geographical location information, and the nonlinear mapping parameters of the distortion correction models loaded for different geographical locations are different. Based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, the access control of the door station hardware platform is controlled. Thus, by fusing multi-dimensional environmental information such as time, location, and lighting, dynamic computing power scheduling, distortion correction, and feature fusion can be achieved, improving the efficiency, robustness, and scene adaptability of face recognition under an ultra-wide field of view and balancing recognition speed against accuracy.

[0015] Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the present application.

Attached Figure Description

[0016] The present invention will be further described below with reference to the accompanying drawings and embodiments, wherein:

Figure 1 is a flowchart illustrating the steps of the ultra-wide-angle face recognition method according to an embodiment of the present invention;

Figure 2 is a detailed flowchart illustrating the feature comparison stage and image preprocessing stage in the distortion correction and wide dynamic range fusion processing of this invention;

Figure 3 is a detailed flowchart illustrating the face detection stage, large-angle adaptive correction stage, and feature extraction stage in the distortion correction and wide dynamic range fusion processing of this invention;

Figure 4 is a schematic diagram of the structure of the ultra-wide-angle face recognition device according to an embodiment of the present invention;

Figure 5 is an internal structural diagram of a computer device according to an embodiment of the present invention.

Detailed Implementation

[0017] Embodiments of the present invention are described in detail below. Examples of these embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are exemplary and are only used to explain the present invention, and should not be construed as limiting the present invention.

[0018] In the description of this invention, it should be understood that the orientation descriptions, such as up, down, front, back, left, right, etc., are based on the orientation or positional relationship shown in the accompanying drawings. They are only for the convenience of describing this invention and simplifying the description, and do not indicate or imply that the device or element referred to must have a specific orientation, or be constructed and operated in a specific orientation. Therefore, they should not be construed as limiting this invention.

[0019] In the description of this invention, "several" means one or more, "multiple" means two or more, "greater than," "less than," and "exceeding" are understood to exclude the stated number, while "above," "below," and "within" are understood to include the stated number. The use of "first" and "second" in the description is merely for distinguishing technical features and should not be construed as indicating or implying relative importance, or implicitly indicating the number of indicated technical features, or implicitly indicating the order of the indicated technical features.

[0020] In the description of this invention, unless otherwise explicitly defined, terms such as "set up," "install," and "connect" should be interpreted broadly, and those skilled in the art can reasonably determine the specific meaning of the above terms in this invention in conjunction with the specific content of the technical solution.

[0021] In the description of this invention, the terms "one embodiment," "some embodiments," "illustrative embodiment," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples.

[0022] Existing facial recognition technologies typically use cameras with ordinary viewing angles, which are insufficient to cover large monitoring areas. Especially in ultra-wide viewing angle scenarios (horizontal field of view ≥ 90°), severe image distortion occurs, making facial feature extraction difficult. Furthermore, traditional methods lack the ability to perceive spatiotemporal environment (time, location, lighting), and with fixed computing power allocation, they are prone to low recognition efficiency and high false recognition rates in low-light, high-concurrency, or specific geographical locations. For example, during rush hour or in areas with severe light pollution, where lighting changes drastically and building structures are complex, existing technologies struggle to balance recognition speed and accuracy. Therefore, there is an urgent need for an adaptive facial recognition solution that can dynamically adjust computing power and algorithm parameters according to the environment.

[0023] To this end, this application proposes an ultra-wide-angle face recognition method, device, and computer equipment, which can achieve dynamic computing power scheduling, distortion correction, and feature fusion by integrating multi-dimensional environmental information such as time, location, and illumination, so as to improve the efficiency, robustness, and scene adaptability of face recognition under ultra-wide-angle conditions.

[0024] Referring to Figure 1, this application proposes an ultra-wide-angle face recognition method applied to a door station hardware platform. The door station hardware platform includes a main control module and, connected to it, a wide-angle acquisition module, an infrared supplementary lighting module, and a binocular liveness detection module. The method includes steps 100 to 400, wherein:

Step 100: Obtain the current spatiotemporal environment data and determine the target computing power configuration parameters based on the spatiotemporal environment data; wherein, the spatiotemporal environment data includes the current time information, the device's geographical location information, and the ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network.

[0025] Specifically, the current spatiotemporal environmental data is acquired and the target computing power configuration parameters are determined from it. The spatiotemporal environmental data includes current time information, device geographic location information, and ambient light intensity. The target computing power configuration parameters include the main frequency (clock speed) of the image processing unit and the inference accuracy of the neural network. The main frequency of the image processing unit is the operating frequency of the hardware pipeline that processes image signals, and it directly affects the frame rate of image correction. The inference accuracy of the neural network is the numeric data format used in model computation (such as FP32, FP16, or INT8). By acquiring environmental data in real time, this embodiment replaces the traditional fixed-frequency operation mode with a mapping between environmental perception and hardware performance, so that hardware performance adapts to the environment.

[0026] For example, the system pre-stores a "spatiotemporal load mapping table". When the device determines that it is located in a commercial center area (geographical location information) and the time is 09:00 on a weekday (time information), the system concludes that it will face heavy pedestrian traffic and immediately queries the table to obtain the corresponding "high load computing power parameters". The system then not only increases resources at the algorithm level but also adjusts the hardware clock directly through the underlying driver, raising the main frequency of the image processing unit from the base 400 MHz to 800 MHz so that the subsequent complex distortion correction algorithm can run in real time without stuttering.
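The table lookup in this example can be sketched as follows. The 400 MHz and 800 MHz values come from the paragraph above; everything else is an assumption made for illustration, including the key names, the 07:00 to 10:00 peak window, the FP16 precision entries, and the function names `time_slot` and `lookup_compute_params`.

```python
from datetime import datetime

# Hypothetical table: (area type, time slot) -> computing power parameters.
LOAD_TABLE = {
    ("commercial_center", "weekday_morning_peak"): {
        "ipu_clock_mhz": 800, "precision": "FP16",
    },
}
# Base clock from the source example; the precision value is assumed.
DEFAULT_PARAMS = {"ipu_clock_mhz": 400, "precision": "FP16"}


def time_slot(ts: datetime) -> str:
    """Classify a timestamp; the 07:00-10:00 weekday window is an assumption."""
    if ts.weekday() < 5 and 7 <= ts.hour < 10:
        return "weekday_morning_peak"
    return "off_peak"


def lookup_compute_params(area_type: str, ts: datetime) -> dict:
    """Return the table entry for this place and time, or the base parameters."""
    return LOAD_TABLE.get((area_type, time_slot(ts)), DEFAULT_PARAMS)
```

For instance, `lookup_compute_params("commercial_center", datetime(2026, 5, 20, 9, 0))` selects the high-load entry because 20 May 2026 is a Wednesday. Applying the returned clock value to real hardware would go through a platform-specific driver, which is outside this sketch.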

[0027] Step 200: Capture an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees using the wide-angle acquisition module, and instruct the infrared supplementary lighting module to supplement the ultra-wide field-of-view original image according to the ambient light intensity and target computing power configuration parameters to obtain a preprocessed image.

[0028] Specifically, a wide-angle acquisition module captures an ultra-wide field-of-view original image with a horizontal field of view of no less than 120 degrees, and an infrared illumination module generates a preprocessed image based on the ambient light intensity and target computing power configuration parameters. The wide-angle acquisition module uses a 1/2.7-inch sensor with a 2.8 mm focal length lens to ensure coverage of a field of view of over 135 degrees. The operating state of the infrared illumination module is no longer determined solely by the light sensor; it is also constrained by the target computing power configuration parameters from step 100. If the computing power configuration parameters indicate high load mode, the illumination module is forced into a high-power mode to obtain original data with the highest signal-to-noise ratio; if they indicate energy-saving mode, the illumination power is reduced and the system relies on algorithmic enhancement.

[0029] Optionally, in extremely low light, if computing power allows, the system controls the wide-angle acquisition module to perform a long exposure (e.g., 1/15 second) and simultaneously turns on the infrared fill light to capture more environmental texture information, providing a clear input source for subsequent ultra-wide-angle distortion correction.
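The capture rules in [0028] and [0029] could be combined roughly as shown below. The 1/15 second long exposure is taken from the text; the 1/60 second default exposure, the 1 lux threshold, the fill light power fractions, and the names `plan_capture` and `compute_headroom` are hypothetical values chosen only to make the sketch concrete.

```python
from fractions import Fraction


def plan_capture(ambient_lux: float, mode: str, compute_headroom: bool,
                 extreme_low_lux: float = 1.0) -> dict:
    """Choose infrared fill light power and exposure time for one frame.

    mode is "high_load" or "energy_saving" (labels assumed for this sketch).
    """
    if mode not in ("high_load", "energy_saving"):
        raise ValueError(f"unknown mode: {mode}")
    # High load mode forces high-power fill light; energy-saving mode lowers
    # the power and leaves the rest to algorithmic enhancement.
    ir_power = 1.0 if mode == "high_load" else 0.3
    exposure_s = Fraction(1, 60)  # assumed default exposure
    # Long exposure only in extremely low light and when compute allows.
    if ambient_lux < extreme_low_lux and compute_headroom:
        exposure_s = Fraction(1, 15)
    return {"ir_power": ir_power, "exposure_s": exposure_s}
```

Keeping the exposure as a `Fraction` avoids rounding when the value is later converted to sensor register units.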

[0030] Step 300: The main control module calls a distortion correction model that matches the target computing power configuration parameters to perform distortion correction and wide dynamic range fusion processing on the preprocessed image to generate an enhanced image; wherein, the distortion correction model is dynamically loaded according to the device's geographical location information, and the nonlinear mapping parameters of the distortion correction model loaded in different geographical locations are different.

[0031] Specifically, ordinary wide-angle lenses produce barrel distortion, while certain architectural environments (such as arched doorways and glass curtain walls) produce secondary optical distortion. Optionally, the main control module uses the "device geolocation information" obtained from GPS to match corresponding architectural environment features from a local database. When the device locates itself in a location with an arched architectural layout, the system loads a specialized "arch compensation model." The nonlinear mapping parameters within this model are specially trained to reverse-stretch pixels at the image edges. This geolocation-based dynamic model loading mechanism allows the same hardware to adapt to different architectural styles of entrances worldwide, eliminating the need for manual on-site adjustments, significantly reducing deployment costs and improving recognition accuracy.
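To make the idea of location-specific nonlinear mapping parameters concrete, the sketch below selects a parameter by building layout and applies a correction. It deliberately substitutes a much simpler technique than the one the source describes: a one-parameter radial polynomial model with nearest-neighbor sampling, rather than a trained model or non-uniform grid interpolation. The layout keys, the `k1` values, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical per-layout radial coefficients (negative k1 models barrel distortion).
MODEL_PARAMS = {"default": -0.20, "arched": -0.28}


def select_k1(layout: str) -> float:
    """Pick the mapping parameter for the building layout at this location."""
    return MODEL_PARAMS.get(layout, MODEL_PARAMS["default"])


def undistort(img: np.ndarray, k1: float) -> np.ndarray:
    """Inverse-map each output pixel into the distorted input image."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    norm = max(cx, cy, 1.0)
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    xu, yu = (xs - cx) / norm, (ys - cy) / norm      # normalized undistorted coords
    scale = 1.0 + k1 * (xu ** 2 + yu ** 2)           # r_d = r_u * (1 + k1 * r_u^2)
    src_x = np.clip(np.rint(xu * scale * norm + cx), 0, w - 1).astype(int)
    src_y = np.clip(np.rint(yu * scale * norm + cy), 0, h - 1).astype(int)
    return img[src_y, src_x]                         # nearest-neighbor sampling
```

With `k1 = 0` the mapping is the identity, and more negative values pull edge pixels in from closer to the center of the source image, which stretches the compressed periphery of a barrel-distorted frame.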

[0032] Step 400: Based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, control the door access control of the door station hardware platform.

[0033] Specifically, the access control of the door station hardware platform is controlled based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information. The image recognition result is thereby combined with a physical security policy, so that the access control logic depends not only on whether the person is recognized but also on the current time.

[0034] Optionally, during late-night hours (e.g., 00:00-06:00), if the system detects unfamiliar individuals loitering, it will automatically raise the alert level, trigger a more stringent liveness detection process, and prioritize alarm messages. During normal daytime hours, priority will be given to ensuring passage speed.
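A time-dependent policy of this kind might be expressed as below. The 00:00 to 06:00 window is the example given in the text; the policy fields, the liveness threshold values, and the name `access_policy` are hypothetical and only illustrate how the alert level could tighten verification.

```python
from datetime import time


def access_policy(now: time, stranger_loitering: bool) -> dict:
    """Return the verification policy for the current time of day."""
    late_night = time(0, 0) <= now < time(6, 0)
    if late_night and stranger_loitering:
        # Raised alert: stricter liveness check and prioritized alarms.
        return {"alert_level": "elevated",
                "liveness_threshold": 0.9,
                "alarm_priority": "high"}
    # Otherwise favor passage speed.
    return {"alert_level": "normal",
            "liveness_threshold": 0.5,
            "alarm_priority": "normal"}
```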

[0035] The aforementioned ultra-wide-angle face recognition method acquires current spatiotemporal environmental data and determines target computing power configuration parameters based on this data. The spatiotemporal environmental data includes current time information, device geographic location information, and ambient light intensity. The target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network. The method captures an ultra-wide-angle original image with a horizontal field of view of no less than 120 degrees using a wide-angle acquisition module, and instructs an infrared supplementary lighting module to generate a preprocessed image based on the ambient light intensity and target computing power configuration parameters. The main control module then calls a distortion correction model matched to the target computing power configuration parameters to perform distortion correction and wide dynamic range fusion processing on the preprocessed image, generating an enhanced image. The distortion correction model is dynamically loaded based on the device's geographical location information, and the nonlinear mapping parameters of the distortion correction models loaded for different geographical locations are different. Based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, the access control of the door station hardware platform is controlled. Thus, by fusing multi-dimensional environmental information such as time, location, and lighting, dynamic computing power scheduling, distortion correction, and feature fusion can be achieved, improving the efficiency, robustness, and scene adaptability of face recognition under an ultra-wide field of view and balancing recognition speed against accuracy.

[0036] In an exemplary embodiment, step 400, controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, includes:

Fusing, at the feature level, the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image to obtain an anti-attack fusion feature;

Determining, based on the current time information, whether the current time falls within a preset high-concurrency period and, if so, starting the pre-allocated computing power channel to perform real-time comparison and liveness verification on the anti-attack fusion feature;

If identification succeeds and the person is verified as live, triggering the access control unlocking operation; otherwise, generating a corresponding exception handling strategy based on the geographical location information.

[0037] Optionally, because the infrared depth map acquired by the binocular camera contains real spatial information, whereas photo or screen attacks cannot produce true depth, the feature vectors of the depth map and the RGB image are concatenated with weights to generate an anti-attack fusion feature. The system pre-loads historical pedestrian flow data for the location. When the current time information falls within a high-concurrency range of the historical data (such as the 6:00 PM-7:00 PM rush hour in office buildings), the system starts a dedicated pre-allocated computing power channel. This channel uses NPU resources exclusively, bypasses the regular task queue, and directly compares the generated anti-attack fusion feature, thereby achieving millisecond-level passage and alleviating congestion during rush hour. If the identification is successful and the person is verified as real, the access control unlocking operation is triggered; otherwise, a corresponding exception handling strategy is generated based on the geographical location information.
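The peak-period routing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the window list, the channel names, and the half-open interval convention are all assumptions made for the example.

```python
from datetime import time

# Hypothetical peak windows; in the text these come from historical pedestrian flow data.
PEAK_WINDOWS = [(time(18, 0), time(19, 0))]


def in_peak_window(now, windows=PEAK_WINDOWS):
    """Return True if `now` falls inside any half-open window [start, end)."""
    return any(start <= now < end for start, end in windows)


def select_channel(now):
    """Route comparison work to the dedicated channel during peak periods."""
    # Channel names are illustrative labels, not real device identifiers.
    return "preallocated_npu_channel" if in_peak_window(now) else "regular_queue"
```

A caller would pass the current local time, for example `select_channel(time(18, 30))`, and dispatch the fused feature to whichever queue is returned.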

[0038] In one exemplary embodiment, determining the target computing power configuration parameters based on spatiotemporal environmental data includes: A spatiotemporal mapping table is constructed, which records the empirical values of device load corresponding to different time periods and different geographical locations; If the current time information is during the morning or evening rush hour, and the device's geographical location information is in a densely populated area, then the target computing power configuration parameter will be set to high load mode, and the proportion of computing power allocated to feature extraction in high load mode will be higher than the preset threshold. If the current time information is during off-peak hours and the device's geographical location information is in a low-light environment, then the target computing power configuration parameters will be set to energy-saving enhancement mode. In energy-saving enhancement mode, the infrared supplementary light power will be increased first and the recognition accuracy of non-critical areas will be reduced.

[0039] The spatiotemporal mapping table can be a multidimensional array, with dimensions including geographic location type, time period, and light intensity. For example, the empirical load value corresponding to an industrial park, 08:00-10:00, and low light is 90 points (out of 100), indicating that full power operation is required.
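As an illustration of such a table, the sketch below uses a dictionary keyed by the three dimensions. Only the industrial park entry comes from the text; the second entry, the key spellings, and the fallback value are hypothetical.

```python
# Keys: (location type, time period, light level). Values: empirical load score out of 100.
LOAD_TABLE = {
    ("industrial_park", "08:00-10:00", "low"): 90,  # example given in the text
    ("residential", "00:00-05:00", "low"): 20,      # hypothetical entry for illustration
}

DEFAULT_LOAD = 50  # hypothetical fallback when no empirical value is recorded


def lookup_load(location_type, period, light_level):
    """Return the empirical load score for a combination of the three dimensions."""
    return LOAD_TABLE.get((location_type, period, light_level), DEFAULT_LOAD)
```

A dictionary is used here instead of a dense multidimensional array because it keeps sparse tables readable; a real device might prefer a fixed array indexed by enumerations.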

[0040] For example, under high load mode, the system requests the highest priority CPU / GPU / NPU resource lock from the operating system kernel and increases the clock speed of the image processing unit to 1.2GHz (assuming a base frequency of 600MHz) to ensure rapid capture and recognition of target faces even in crowded areas. Optionally, in residential areas late at night (low light, low pedestrian traffic), the system disables power-consuming algorithms such as background segmentation, retaining only high-precision recognition of the face region, while increasing the infrared illumination power to 100% to reduce gain noise of the image processing unit, thereby maintaining high-security access control monitoring with low power consumption.
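The mode selection in the preceding paragraphs can be sketched as a small decision function. The 600 MHz base clock, the 1.2 GHz high-load clock, and the 100% infrared power come from the text; the `ComputeConfig` type, the default mode, and the 50% infrared power are assumptions for illustration.

```python
from dataclasses import dataclass

BASE_CLOCK_MHZ = 600  # base frequency assumed in the text


@dataclass(frozen=True)
class ComputeConfig:
    mode: str
    isp_clock_mhz: int
    ir_power_pct: int
    background_segmentation: bool


def choose_config(rush_hour, dense_area, low_light):
    """Pick a configuration from the time, location, and lighting conditions."""
    if rush_hour and dense_area:
        # High load mode: raise the image processing unit clock to 1.2 GHz.
        return ComputeConfig("high_load", 1200, 50, True)
    if not rush_hour and low_light:
        # Energy-saving enhancement mode: full IR power, background segmentation off.
        return ComputeConfig("energy_saving_enhanced", BASE_CLOCK_MHZ, 100, False)
    # Hypothetical default for conditions the text does not cover.
    return ComputeConfig("default", BASE_CLOCK_MHZ, 50, True)
```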

[0041] In an exemplary embodiment, step 200, capturing an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees using the wide-angle acquisition module, includes: The corresponding light environment prediction model is matched according to the geographical location information of the equipment. The light environment prediction model is trained based on historical illumination data. If the device's geographical location information is the preset geographical location, the predicted ambient light attenuation coefficient is greater than the preset value. The exposure time of the wide-angle acquisition module is extended by the first preset duration, and the wavelength of the infrared fill light module is simultaneously adjusted to the 850nm band.

[0042] Specifically, since illumination variation curves differ across latitudes and longitudes, the system can use GPS coordinates to call the corresponding astronomical algorithm model, calculate the solar altitude angle at the current moment, and thereby predict the ambient light attenuation coefficient. For example, silicon-based sensors respond well to infrared light in the 850nm band, which can also effectively penetrate atmospheric haze at dusk. By extending the exposure time (e.g., from 1/30 s to 1/15 s) in conjunction with 850nm supplemental lighting, the clarity of ultra-wide-angle images in backlit scenes can be significantly improved without increasing the computational burden, solving the technical problem of unclear face edges.
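The solar altitude calculation can be illustrated with a standard textbook approximation. This is a sketch under stated assumptions, not the astronomical model used by the application: it takes local solar time directly rather than deriving it from GPS longitude, uses a simple cosine formula for declination, and the 10-degree threshold in the exposure rule is hypothetical.

```python
import math


def solar_declination_deg(day_of_year):
    """Approximate solar declination in degrees for a given day of the year."""
    return -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))


def solar_altitude_deg(latitude_deg, day_of_year, solar_hour):
    """Solar altitude from latitude, day of year, and local solar time in hours."""
    phi = math.radians(latitude_deg)
    delta = math.radians(solar_declination_deg(day_of_year))
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    sin_h = (math.sin(phi) * math.sin(delta)
             + math.cos(phi) * math.cos(delta) * math.cos(hour_angle))
    # Clamp to the valid asin domain to guard against rounding error.
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_h))))


def exposure_time_s(altitude_deg, low_sun_threshold_deg=10.0):
    """Hypothetical rule: extend exposure from 1/30 s to 1/15 s when the sun is low."""
    return 1 / 15 if altitude_deg < low_sun_threshold_deg else 1 / 30
```

A mapping from altitude to an attenuation coefficient would sit between these two functions; the text does not specify it, so it is left out here.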

[0043] In an exemplary embodiment, step 300, which involves using the main control module to call a distortion correction model that matches the target computing power configuration parameters, includes: Obtain the lens's physical distortion parameters and the current image resolution; If the building layout corresponding to the device's geographical location information is an arc-shaped structure, then a preset arc compensation model is loaded. The arc compensation model uses a non-uniform grid interpolation algorithm for pixel mapping of the image edge to compensate for the visual distortion caused by the reflection of the arc-shaped building.

[0044] Specifically, traditional bilinear interpolation is uniform and cannot solve the local compression caused by curved curtain walls. This application's embodiment uses a non-uniform grid interpolation algorithm to dynamically adjust the grid density of the right edge of the image (e.g., increasing it by 1.5 times) according to the radius of curvature of the curved building, thus re-expanding the compressed face pixels. It is understood that this process must be completed in real-time within the image processing unit, thus placing extremely high demands on the "main frequency of the image processing unit." This explains why the main frequency needs to be dynamically adjusted according to the scene in step 100: because the computational load of processing the curved compensation algorithm is 2-3 times that of ordinary correction, if the main frequency is not increased, the frame rate will drop significantly.
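To make the idea of a denser grid at one edge concrete, the sketch below places grid nodes along one image axis, with cells in a right-edge band about 1.5 times denser than elsewhere. It only illustrates node placement; the band width, the function name, and the omission of the curvature-radius dependence are simplifying assumptions, and the actual pixel interpolation between nodes is not shown.

```python
import math


def build_grid(width, n_main, edge_fraction=0.2, density_gain=1.5):
    """Return x-coordinates of grid nodes, denser inside the right-edge band."""
    edge_start = width * (1.0 - edge_fraction)
    main_step = edge_start / n_main
    # Uniform spacing would give (band width / main_step) cells; scale by the gain.
    n_edge = max(1, math.ceil(density_gain * (width - edge_start) / main_step - 1e-9))
    edge_step = (width - edge_start) / n_edge
    main_nodes = [i * main_step for i in range(n_main)]
    edge_nodes = [edge_start + j * edge_step for j in range(n_edge)]
    return main_nodes + edge_nodes + [float(width)]
```

For a 1000-pixel-wide image with `n_main=8`, the main region uses 100-pixel cells and the right 200 pixels are split into three cells instead of two.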

[0045] In an exemplary embodiment, determining whether a preset high-concurrency period is in place based on the current time information includes: Obtain historical recognition logs and analyze the frequency of face recognition requests for different date types; If the current time information is within a preset time range and the device's geographical location information is within a preset geographical location range, then it is determined to be a high-concurrency period, and a multi-threaded parallel processing queue is pre-started; If the length of the face request queue exceeds a preset value, the inference accuracy of the neural network and the clock speed of the image processing unit will be reduced.

[0046] For example, the system analyzes logs from the past week using machine learning. If a surge in requests is detected at a location between 07:30 and 09:00, Monday through Friday, the system pre-starts a multi-threaded parallel processing queue at 07:20 and allocates the face images to be processed to different thread cores. Optionally, when the queue is too long (e.g., more than 10 people), the system temporarily sacrifices a small amount of recognition accuracy for processing speed by switching the neural network from FP16 to INT8 (8-bit integer), while appropriately reducing the clock speed of the image processing unit to balance power consumption and heat generation. This ensures that the system does not overheat and throttle under high load, thus maintaining stable operation for extended periods.
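The pre-start and queue-length rules can be sketched as two small functions. The 10-minute lead and the queue limit of 10 follow the example in the text; the function names and the reduced clock value of 1000 MHz are hypothetical.

```python
from datetime import datetime, timedelta


def prestart_time(peak_start, lead=timedelta(minutes=10)):
    """Return when to pre-start the parallel queue ahead of a predicted peak."""
    return peak_start - lead


def adapt_to_queue(queue_len, precision="FP16", isp_clock_mhz=1200,
                   max_queue=10, reduced_clock_mhz=1000):
    """Lower inference precision and clock speed when the request queue is too long."""
    if queue_len > max_queue:
        # reduced_clock_mhz is an illustrative value, not taken from the text.
        return "INT8", min(isp_clock_mhz, reduced_clock_mhz)
    return precision, isp_clock_mhz
```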

[0047] In an exemplary embodiment, if the identification is successful and the person is verified as a real person, the access control unlocking operation is triggered; otherwise, a corresponding exception handling strategy is generated based on the geographic location information, including: If the device's geographical location information is a residential area and recognition fails, the local cache of unfamiliar faces will be activated and automatically deleted after a preset time interval. If the device's geographical location information is an industrial park and identification fails, the abnormal data will be uploaded to the cloud management platform in real time, and an immediate alarm notification will be triggered based on the current time information.

[0048] Specifically, considering the high sensitivity of home users to privacy, when identification fails (e.g., due to a visitor), the system only encrypts and stores a blurred thumbnail locally, automatically overwriting and deleting it after 24 hours, never uploading it to the cloud, thus protecting user privacy to the greatest extent. For high-security scenarios such as industrial parks, once identification fails, the system immediately uploads a high-definition captured image and on-site video footage to the security center. If this occurs late at night (outside of working hours), the system will also directly trigger an audible and visual alarm and push a message to the on-duty personnel's mobile phone, achieving tiered security.
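The tiered handling can be summarized as a policy lookup. The 24-hour retention, the no-upload rule for residential areas, and the off-hours alarm for industrial parks follow the text; the dictionary keys, the location labels, and the 08:00-18:00 working hours are assumptions for the example.

```python
def failure_policy(location_type, hour, work_start=8, work_end=18):
    """Return the handling policy for a failed identification."""
    if location_type == "residential":
        # Privacy first: encrypted blurred thumbnail, kept locally for 24 hours only.
        return {"store": "local_encrypted_blurred_thumbnail",
                "retention_hours": 24, "upload": False, "alarm": False}
    if location_type == "industrial_park":
        # Working hours are hypothetical; the text only says "outside of working hours".
        off_hours = hour < work_start or hour >= work_end
        return {"store": "hd_image_and_video",
                "retention_hours": None, "upload": True, "alarm": off_hours}
    raise ValueError(f"no policy defined for location type: {location_type}")
```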

[0049] In an exemplary embodiment, feature-level fusion of the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image includes: The infrared depth information is denoised to generate a depth confidence map; The face feature vector of the enhanced image is weighted and concatenated with the depth confidence map to generate a fused feature vector. A pre-trained generative adversarial network discriminator is used to determine whether the fused feature vector is true or false. If the result is true, a liveness verification pass signal is output.

[0050] Specifically, since infrared depth maps typically contain a lot of noise, the system uses a bilateral filtering algorithm to remove noise while preserving edge information, generating a depth confidence map that reflects the three-dimensionality of the face. The facial feature vector of the enhanced image is then weighted and concatenated with the depth confidence map to generate a fused feature vector. A pre-trained generative adversarial network discriminator is used to determine whether the fused feature vector is genuine. If the determination result is genuine, a liveness verification pass signal is output.
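Weighted concatenation can be sketched as below, assuming both inputs are already flattened into plain vectors. The L2 normalization step and the 0.7 / 0.3 weights are illustrative choices, not values from the application; bilateral filtering and the discriminator are outside the scope of this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse_features(face_vec, depth_conf, w_face=0.7, w_depth=0.3):
    """Concatenate the weighted face features with the weighted depth confidence values."""
    return ([w_face * v for v in l2_normalize(face_vec)]
            + [w_depth * v for v in l2_normalize(depth_conf)])
```

Normalizing each part before weighting keeps one modality from dominating the fused vector purely because of its numeric scale.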

[0051] The following describes in detail the distortion correction and wide dynamic range fusion processing flow of the present application embodiment, as well as the ultra-wide field-of-view face recognition system used to implement the method, using a specific embodiment.

[0052] Specifically, the ultra-wide-angle face recognition system uses the Yizhi SV826 high-performance security AI SoC as its main control chip. This chip integrates a dual-core CPU (1.5GHz) and a second-generation self-developed NPU, providing 1.5 TOPS of intelligent computing power, which can efficiently support the real-time operation of the face recognition algorithm. The front-end image acquisition uses the Gekko GC2093 CMOS image sensor, a high-quality 1080P image sensor with 2 megapixels (1920×1080) and a 1 / 2.9-inch optical format, combined with a 130-degree ultra-wide-angle lens to achieve a wide field of view coverage. The system is equipped with an 8-inch ISP full-view high-definition display that supports wide dynamic range glare suppression technology, ensuring image clarity in complex lighting environments such as backlight and strong light.

[0053] Figure 2 and Figure 3 show the workflow of the ultra-wide-angle face recognition door station. The system first performs power-on initialization, hardware initialization, network connection establishment, and cloud platform registration, then enters standby mode until it is woken by human body sensing. The image preprocessing stage includes wide dynamic range fusion (2-3 frame HDR), adaptive noise reduction (joint 2D / 3D), automatic white balance and exposure adjustment, and distortion correction (multi-level correction). The face detection stage includes Haar cascade / CNN detector detection, multi-scale sliding windows, face region selection and localization, and face angle estimation. The large-angle adaptive correction stage includes dynamic adjustment of distortion parameters based on face angle, multi-level distortion correction, viewpoint pairing encoding and extreme angle processing, frontal face transformation, and multi-view normalization. The feature extraction stage includes CNN feature extraction, inter-channel feature information fusion, viewpoint pairing encoder encoding, and generation of 128 / 512-dimensional feature vectors. In the feature comparison stage, either 1:N local feature library comparison or 1:1 cloud verification can be selected; angle matching evaluates the best pairing, a confidence score is calculated, and a threshold decision is made. If the comparison succeeds, the system unlocks the door lock, records the event, and uploads it to the cloud platform for synchronization. Finally, it returns to standby mode.

[0054] Specifically, this application employs a distortion correction and large-angle recognition algorithm to address the barrel distortion problem caused by a 130-degree ultra-wide-angle lens. The core process of this algorithm is as follows: Step 1: Face Region Localization: After receiving an image containing a face, the system quickly locates the face in the image using the Haar cascade classification algorithm and selects the face region. Step 2: Dynamic Distortion Correction: Based on the angle and position of the face image, the system dynamically adjusts distortion correction parameters, performing multi-level distortion correction. Traditional distortion correction methods, which force faces at extreme angles to be frontal, can lead to decreased recognition accuracy or even misjudgments. This system's dynamic correction strategy adaptively adjusts the correction intensity according to the actual angle. Step 3: Adaptive Image Enhancement: Through adaptive histogram equalization, the system dynamically adjusts the brightness and contrast parameters of the image based on image quality and environmental conditions, improving image clarity. Step 4: CNN Feature Extraction: The system extracts facial features from the corrected image using a CNN convolutional neural network model, outputting high-quality feature vectors.

[0055] Understandably, in wide-angle scenarios (such as looking up, looking down, or sideways), facial recognition faces the following main challenges: Facial occlusion: Parts of the face are obscured at extreme angles. Non-frontal angles: Traditional frontal face recognition methods are ineffective at extreme angles. Illumination variations: Differences in illumination distribution caused by different angles. Distortion: Stretching and deformation in the edge areas of wide-angle lenses.

[0056] Therefore, to address the aforementioned challenges, this application's system draws upon cutting-edge academic techniques such as Pose-Pairing Encoders. The core ideas of this technology are: Angle Classification and Pairing: Classifying faces based on their angles and establishing pairing relationships between different angles. Encoder Specialization: Designing specialized encoders for different angle categories, rather than using a single general-purpose encoder. Feature Encoding: Selecting the appropriate encoder based on the actual angle of the face for feature extraction, encoding more representative features.

[0057] The advantage of this embodiment is that, when the face is at a more extreme viewpoint, the system does not force the face into a frontal pose before recognition, but directly extracts the effective features at that angle through angle pairing encoding, thereby better recognizing faces at extreme viewpoints.

[0058] The system in this embodiment also introduces multi-view frontal normalization (Pose-Pairing Normalization) technology, which improves matching performance between viewpoints by generating higher-quality face images. Specifically, it includes: a multi-angle generator that generates multiple standard-viewpoint frontal images from a face at any angle; a multi-angle discriminator that determines the authenticity and identity consistency of the generated images; and frontal normalization selection, which automatically selects the optimal normalization strategy based on the input image quality.

[0059] Furthermore, in the feature comparison stage, the system employs a pose-pairing matching evaluation method, which differs from the traditional template feature averaging method. Traditional methods average all image features, leading to the dilution of features from representative images or images of poor quality. The pose-pairing matching evaluation method, however, involves: acquiring the angle information of all images within the template; establishing the optimal angle pairing with the target face; extracting pairing features using a corresponding face pairing encoder; and selecting the best pairing for matching determination.
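The contrast with template averaging can be illustrated with a simplified matcher that picks the template image whose angle is closest to the probe and compares against that one only. This sketch assumes a single yaw angle per image and precomputed feature vectors; the angle-specific pairing encoder from the text is not modeled, and the 0.8 threshold is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_pair_match(probe_angle, probe_vec, template, threshold=0.8):
    """Match the probe against the template entry with the closest angle.

    `template` is a list of (yaw_degrees, feature_vector) tuples.
    Returns (paired_angle, similarity, is_match).
    """
    angle, vec = min(template, key=lambda item: abs(item[0] - probe_angle))
    score = cosine(probe_vec, vec)
    return angle, score, score >= threshold
```

With a template holding a frontal image and a 40-degree image, a probe at 35 degrees is compared only against the 40-degree entry, so the frontal features cannot dilute the score.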

[0060] Optionally, the system in this embodiment employs wide dynamic range (WDR) strong light suppression technology, enabling it to adapt to various complex lighting environments such as backlighting and strong light. For example, the GC2093 sensor supports 2-3 frame WDR fusion, preserving details in both bright and dark areas through multi-frame exposure synthesis technology. Combined with professional security-grade ISP processing, it achieves adaptive noise reduction and automatic exposure adjustment, ensuring a high accuracy rate for face recognition in different scenarios and environments.
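Multi-frame exposure synthesis can be illustrated with a generic well-exposedness weighting, in which each pixel is a weighted average across frames and values near mid-gray receive the highest weight. This is a common textbook approach shown only as a sketch; it is not claimed to be what the GC2093 or the ISP implements, and the `mid` and `sigma` parameters are assumptions.

```python
import math


def well_exposedness(p, mid=0.5, sigma=0.2):
    """Weight a pixel intensity in [0, 1] by its closeness to mid-gray."""
    return math.exp(-((p - mid) ** 2) / (2 * sigma ** 2))


def fuse_exposures(frames):
    """Fuse equally sized frames (flat lists of intensities in [0, 1])."""
    fused = []
    for pixels in zip(*frames):
        weights = [well_exposedness(p) for p in pixels]
        total = sum(weights)  # strictly positive because exp() never returns zero here
        fused.append(sum(w * p for w, p in zip(weights, pixels)) / total)
    return fused
```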

[0061] The system in this embodiment also employs a GC2093+GC2053 binocular camera configuration, supporting binocular liveness detection technology. Through the cooperation of an infrared camera and a visible light camera, it can effectively distinguish real faces from photos, videos, masks, and other attack methods. Liveness detection algorithm analysis includes: facial temperature distribution in infrared images, three-dimensional information formed by binocular parallax, and detection of micro-expressions and micro-movements.

[0062] For example, the 1.5 TOPS NPU integrated in the Yizhi SV826 is specifically designed for accelerating the computation of face recognition algorithms. Key optimization strategies include: Model quantization: quantizing floating-point models to INT8 precision to improve inference speed while maintaining recognition accuracy; Operator fusion: fusing operators such as convolution, batch normalization, and activation functions to reduce memory access; Multi-threaded pipeline: parallel processing of image acquisition, preprocessing, face detection, feature extraction, and comparison decision. The system's recognition time is less than 0.3 seconds, the recognition distance is 0.3-3 meters, and it supports both 1:N and 1:1 comparison modes.
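Model quantization can be illustrated with symmetric per-tensor INT8 quantization of a list of weights. This is a generic sketch, assuming a single scale per tensor and a clamped range of -127 to 127; the NPU toolchain's actual quantization scheme is not described in the text.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]
```

The round trip loses at most half a quantization step per value, which is the accuracy cost traded for faster integer inference.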

[0063] The system workflow is explained in detail below with reference to Figure 2.

[0064] During standby and wake-up phases, the system employs a human body detection wake-up mechanism. When a human body approaches, the GC2093 sensor initiates image acquisition. This design significantly reduces system power consumption, making it suitable for security applications requiring long-term operation.

[0065] During the image acquisition and preprocessing stage, raw image data is acquired using the GC2093 CMOS sensor and transmitted to the SV826 chip via the MIPI interface. The ISP module performs preprocessing such as wide dynamic range fusion, noise reduction, and white balance to provide high-quality image data for distortion correction and face recognition.

[0066] In the distortion correction and angle adaptation stage, the system performs multi-level distortion correction for ultra-wide field-of-view images, specifically including: the first level: barrel distortion correction of the overall image; the second level: local fine correction based on the detected face position and angle; and the third level: adaptive adjustment before feature extraction.
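The first correction level can be illustrated with a single-coefficient radial distortion model in normalized image coordinates, where a negative `k1` produces barrel distortion. The model choice, the coefficient, and the fixed-point inversion are assumptions for this sketch; the second and third levels are not shown.

```python
def distort(xu, yu, k1):
    """Apply radial distortion to an undistorted normalized point."""
    factor = 1.0 + k1 * (xu * xu + yu * yu)
    return xu * factor, yu * factor


def undistort(xd, yd, k1, iterations=20):
    """Invert the radial model by fixed-point iteration."""
    xu, yu = xd, yd  # start from the distorted point as the initial guess
    for _ in range(iterations):
        factor = 1.0 + k1 * (xu * xu + yu * yu)
        xu, yu = xd / factor, yd / factor
    return xu, yu
```

The iteration converges quickly for moderate distortion; points far from the optical center under strong distortion may need more iterations or a different solver.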

[0067] In the facial feature extraction and comparison stage, the system uses a deep convolutional neural network to extract facial features, and the generated feature vectors are compared 1:N with the local feature database. The feature database supports a capacity of 50,000 faces, meeting the application needs of large communities. The comparison result is returned within 0.3 seconds, and the door lock is activated after successful recognition.

[0068] During the data synchronization and cloud management phase, identification records (including successful and failed attempts) are uploaded to the cloud platform in real time, supporting multiple communication protocols such as HTTP, MQTT, and WebSocket. Property management personnel can query access records, remotely configure device parameters, and distribute facial feature databases via a web interface.

[0069] The embodiments of this application have the following advantages: strong wide-angle recognition capability: through distortion correction, viewpoint pairing encoding and multi-viewpoint normalization technology, a high recognition rate is maintained for faces at extreme angles; good adaptability to complex lighting: wide dynamic range fusion and strong light suppression technology ensure clear recognition even in backlight and low-light environments; high level of anti-counterfeiting security: binocular liveness detection effectively prevents attacks using photos, videos and other means; excellent real-time performance: NPU hardware acceleration, recognition speed is less than 0.3 seconds; high system integration: the SV826 SoC integrates CPU, NPU and ISP, with a single chip completing all processing.

[0070] Furthermore, in an exemplary embodiment, the software program used in the method of this application is an ultra-wide-angle face recognition door station software, which is a comprehensive management platform for intelligent access control scenarios. It employs a 120° to 150° ultra-wide-angle camera combined with a deep learning algorithm to achieve wide-range, high-precision, and contactless personnel recognition and access control. The software integrates core functions such as face detection, liveness detection, audio and video intercom, and remote control, and is suitable for residential communities, office buildings, industrial parks, and other locations. Its core functional modules include:

1. Ultra-wide field of view face recognition

Wide-area face detection: Utilizing the advantages of ultra-wide-angle lenses, the software supports the simultaneous detection of up to 10 faces in the image within a distance range of 0.3m-3m, completely solving the pain points of traditional access control caused by limited field of view, such as tall people looking down, children looking up, and wheelchair users being unable to align.

[0071] Intelligent distortion correction: Built-in wide-angle distortion correction algorithm automatically corrects the stretching and deformation of the image edges, ensuring that all faces in the image meet the recognition standards and that the recognition accuracy in edge areas is not reduced.

[0072] Multi-posture adaptation: Supports face recognition with ±45° horizontal tilt and ±30° pitch angle, so users do not need to deliberately face the camera, achieving a seamless passage experience of "recognition as you approach".

2. High-precision identification and rapid comparison

[0073] Deep learning algorithm: A lightweight convolutional neural network is used to extract 512-dimensional facial features, with a recognition accuracy of over 99.5% and a single recognition time of less than 0.3 seconds.

[0074] Local face database with tens of thousands of entries: Supports storing face feature data of over 20,000 people, and can still function normally even without an internet connection. Employs multi-level indexing to accelerate retrieval; matching time for a database of tens of thousands is less than 0.5 seconds.

[0075] Multimodal fusion authentication: Supports multiple authentication methods such as face + IC card, face + password, and face + QR code to meet the needs of different security level scenarios.

3. Intelligent live-body anti-counterfeiting

[0076] Multi-dimensional liveness detection: It integrates liveness detection based on action commands (blinking, opening mouth, shaking head) with infrared stereo liveness detection (infrared imaging to distinguish real faces from photos / screens), effectively resisting attacks such as photos, videos, and 3D masks, with an attack pass rate of <0.1%.

[0077] Anti-playback detection: Automatically identifies video playback features such as screen borders, moiré patterns, and reflective dots to prevent recording attacks.

4. Adaptive to complex environments

[0078] Intelligent Wide Dynamic Range (WDR): Automatically triggers multi-frame fusion algorithm in backlit scenes to balance details in bright and dark areas, ensuring that faces are clearly identifiable in strong backlight conditions.

[0079] Infrared night vision enhancement: Infrared supplementary light is automatically turned on when the ambient light is below 10 Lux, and nighttime recognition is not affected.

[0080] Automatic exposure and white balance: Real-time statistics of image brightness, dynamically adjusting exposure parameters and color temperature to adapt to changes in lighting conditions at different times and in different weather.

5. Audio / video intercom and remote control

[0081] SIP Protocol Intercom: Supports standard SIP protocol, enabling high-definition audio and video intercom between the indoor unit, property center, or mobile APP when a visitor calls. Full-duplex communication and echo cancellation are supported.

[0082] Remote unlocking: During a call or via a mobile app, send an unlocking command to the access control system with one click, and record unlocking logs.

[0083] Call forwarding and call answering: Supports automatic transfer to the property management center when the indoor unit does not answer, or property management personnel to answer calls from designated door stations.

6. Event Management and Data Statistics

[0084] Video capture: Automatically captures and stores images locally for each identification, call, and unlocking operation, supporting event tracing.

[0085] Visitor Log: Records the visitor's identity, time, identification result, and access status to form a complete access log.

[0086] Data statistics and reports: Statistics on the number of people passing through, recognition success rate, and anomaly alarms are compiled by day, week, and month, and reports can be exported.

7. Equipment Management and Maintenance

[0087] Remote operation and maintenance: Supports OTA firmware upgrades, remote parameter configuration, and real-time monitoring of device status.

[0088] Offline caching: When the network is interrupted, the identified records and event data are cached locally, and automatically synchronized to the cloud after the network is restored.
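Offline caching with automatic resynchronization can be sketched as a small queue that buffers events while the network is down and flushes them in order once it returns. The class name, the `online` flag, and the `upload` callback are hypothetical; persistence to local storage is omitted.

```python
from collections import deque


class OfflineCache:
    """Buffer events while offline and replay them in order when back online."""

    def __init__(self):
        self.pending = deque()

    def flush(self, upload):
        # Remove an event only after its upload call returns without raising.
        while self.pending:
            upload(self.pending[0])
            self.pending.popleft()

    def record(self, event, online, upload):
        if online:
            self.flush(upload)  # send the backlog first to preserve ordering
            upload(event)
        else:
            self.pending.append(event)
```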

[0089] Self-diagnostic alarms: Automatically push alarm information when the device is abnormal (network offline, storage full, tampering triggered).

[0090] Ultimately, it is used to achieve the following performance metrics:

[0091] It is understood that the ultra-wide-angle face recognition door station software in this application solves the core pain points of traditional access control systems, such as numerous blind spots, poor user experience, and weak adaptability, through the deep integration of wide-angle optics and deep learning algorithms. Its organic combination of functions such as wide-range detection, high-precision recognition, multi-dimensional liveness detection, and adaptive capabilities in complex environments provides users with a safe, convenient, and intelligent access experience.

[0092] It should be understood that although the steps in the flowcharts of the above embodiments are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowcharts of the above embodiments may include multiple steps or multiple stages. These steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the steps or stages in other steps. It is understood that the steps in different embodiments can be freely combined as needed, and all non-contradictory solutions formed by such combinations are within the scope of protection of this application.

[0093] Based on the same inventive concept, this application also provides an ultra-wide-angle face recognition device for implementing the ultra-wide-angle face recognition method described above. The solution provided by this device is similar to the implementation described in the above method; therefore, the specific limitations in one or more ultra-wide-angle face recognition device embodiments provided below can be found in the limitations of the ultra-wide-angle face recognition method described above, and will not be repeated here.

[0094] In one exemplary embodiment, as shown in Figure 4, an ultra-wide-angle face recognition device 900 is provided, applied to a door station hardware platform. The door station hardware platform includes a main control module and, connected to it, a wide-angle acquisition module, an infrared supplementary lighting module, and a binocular liveness detection module. The device includes: The spatiotemporal perception configuration module 901 is used to acquire the current spatiotemporal environment data and determine the target computing power configuration parameters based on the spatiotemporal environment data; wherein, the spatiotemporal environment data includes the current time information, the device's geographical location information and the ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network; The image acquisition module 902 is used to capture an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees through the wide-angle acquisition module, and to enable the infrared supplementary lighting module to supplement the ultra-wide field-of-view original image according to the ambient light intensity and target computing power configuration parameters to obtain a pre-processed image; The image processing module 903 is used to call a distortion correction model that matches the target computing power configuration parameters using the main control module, to perform distortion correction and wide dynamic range fusion processing on the preprocessed image, and generate an enhanced image; wherein, the distortion correction model is dynamically loaded according to the device's geographical location information, and the nonlinear mapping parameters of the distortion correction model loaded in different geographical locations are different; The control module 904 is used to control the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information.

[0095] In one embodiment, controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, includes: fusing, at the feature level, the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image to obtain anti-attack fusion features; determining, based on the current time information, whether the current time falls within a preset high-concurrency period and, if so, starting a pre-allocated computing power channel to perform real-time comparison and liveness verification on the anti-attack fusion features; and, if the identification is successful and the person is verified as a real person, triggering the access control unlocking operation, and otherwise generating a corresponding exception handling strategy based on the device geographical location information.
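The control flow of this embodiment can be illustrated with a minimal Python sketch. All names here (`Decision`, `control_access`, the channel labels, and the strategy string) are hypothetical and not taken from the application; the comparison and liveness results are passed in as booleans, and the "standard" channel used outside high-concurrency periods is an assumption, since the text only describes the high-concurrency branch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    unlock: bool                       # whether the door is unlocked
    channel: str                       # which computing power channel was used
    exception_strategy: Optional[str]  # set only when access is refused


def control_access(matched: bool, live: bool,
                   high_concurrency: bool, location_type: str) -> Decision:
    """Combine comparison, liveness, time, and location into one decision."""
    # During a preset high-concurrency period, the pre-allocated channel is used.
    # The "standard" fallback is an assumption for this sketch.
    channel = "pre_allocated" if high_concurrency else "standard"
    if matched and live:
        return Decision(unlock=True, channel=channel, exception_strategy=None)
    # Otherwise an exception handling strategy is derived from the location.
    return Decision(unlock=False, channel=channel,
                    exception_strategy=f"strategy_for_{location_type}")
```

For example, `control_access(True, True, True, "residential")` yields an unlock decision on the pre-allocated channel, while a failed liveness check yields a refusal carrying a location-specific strategy label.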

[0096] In one embodiment, determining the target computing power configuration parameters based on the spatiotemporal environment data includes: constructing a spatiotemporal mapping table, which records empirical values of device load corresponding to different time periods and different geographical locations; if the current time information falls within the morning or evening rush hour and the device geographical location information indicates a densely populated area, setting the target computing power configuration parameters to a high-load mode, in which the proportion of computing power allocated to feature extraction is higher than a preset threshold; and, if the current time information falls within off-peak hours and the device geographical location information indicates a low-light environment, setting the target computing power configuration parameters to an energy-saving enhancement mode, in which the infrared supplementary lighting power is increased preferentially and the recognition accuracy of non-critical areas is reduced.
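The mode selection described above can be sketched as a small rule function. This is an illustration under stated assumptions, not the applicant's implementation: the rush-hour windows, the threshold value, the share values, and the "default" fallback mode are all hypothetical, as are the names `ComputeConfig`, `is_rush_hour`, and `select_config`.

```python
from dataclasses import dataclass
from datetime import time

# Hypothetical rush-hour windows and threshold; the application gives no values.
RUSH_HOURS = ((time(7, 0), time(9, 0)), (time(17, 0), time(19, 0)))
FEATURE_SHARE_THRESHOLD = 0.6


@dataclass(frozen=True)
class ComputeConfig:
    mode: str
    feature_extraction_share: float    # fraction of compute for feature extraction
    ir_power_boost: bool               # raise infrared supplementary lighting power
    reduce_noncritical_accuracy: bool  # lower accuracy outside critical regions


def is_rush_hour(now: time) -> bool:
    return any(start <= now < end for start, end in RUSH_HOURS)


def select_config(now: time, densely_populated: bool, low_light: bool) -> ComputeConfig:
    """Pick a configuration mode from time, location density, and lighting."""
    if is_rush_hour(now) and densely_populated:
        # High-load mode: feature extraction share exceeds the preset threshold.
        return ComputeConfig("high_load", 0.7, False, False)
    if not is_rush_hour(now) and low_light:
        # Energy-saving enhancement mode: boost IR power, relax non-critical areas.
        return ComputeConfig("energy_saving_enhancement", 0.4, True, True)
    # Fallback mode is an assumption; the text does not describe other cases.
    return ComputeConfig("default", 0.5, False, False)
```

A table-driven variant would replace the two `if` branches with lookups into the spatiotemporal mapping table keyed by time period and location.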

[0097] In one embodiment, capturing, through the wide-angle acquisition module, an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees includes: matching a corresponding light environment prediction model according to the device geographical location information, the light environment prediction model being trained on historical illumination data; and, if the device geographical location information is a preset geographical location and the predicted ambient light attenuation coefficient is greater than a preset value, extending the exposure time of the wide-angle acquisition module by a first preset duration and simultaneously adjusting the wavelength of the infrared supplementary lighting module to the 850 nm band.
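The exposure and wavelength adjustment can be sketched as follows. The prediction model itself is out of scope here, so the predicted attenuation coefficient is passed in as a number; the threshold, the extra exposure duration, and the names `CaptureSettings` and `adjust_capture` are hypothetical placeholders for the "preset value" and "first preset duration" in the text.

```python
from dataclasses import dataclass
from typing import AbstractSet


@dataclass(frozen=True)
class CaptureSettings:
    exposure_ms: float
    ir_wavelength_nm: int


def adjust_capture(settings: CaptureSettings,
                   location_id: str,
                   predicted_attenuation: float,
                   preset_locations: AbstractSet[str],
                   attenuation_threshold: float = 0.5,   # hypothetical preset value
                   extra_exposure_ms: float = 10.0       # hypothetical first preset duration
                   ) -> CaptureSettings:
    """Extend exposure and switch to 850 nm when both conditions hold."""
    if location_id in preset_locations and predicted_attenuation > attenuation_threshold:
        return CaptureSettings(
            exposure_ms=settings.exposure_ms + extra_exposure_ms,
            ir_wavelength_nm=850,
        )
    # Otherwise the capture settings are left unchanged.
    return settings
```

Passing the prediction in as a plain value keeps the rule testable independently of whichever model is matched to the location.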

[0098] In one embodiment, calling, via the main control module, a distortion correction model that matches the target computing power configuration parameters includes: obtaining the physical distortion parameters of the lens and the current image resolution; and, if the building layout corresponding to the device geographical location information is an arc-shaped structure, loading a preset arc compensation model, wherein the arc compensation model uses a non-uniform grid interpolation algorithm to perform pixel mapping on the image edges, so as to compensate for the visual distortion caused by reflection from the arc-shaped building.
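The application does not specify the non-uniform grid interpolation algorithm. As one possible reading, the sketch below builds a one-dimensional grid whose nodes are denser near the image edges (cosine spacing, an assumption) and interpolates a coordinate mapping piecewise-linearly between nodes. The function names are hypothetical, and a real pixel remapping would apply the same idea along both image axes.

```python
import bisect
import math
from typing import List, Sequence


def edge_dense_nodes(n: int) -> List[float]:
    """Return n + 1 nodes on [0, 1], spaced more densely near both edges."""
    return [0.5 * (1.0 - math.cos(math.pi * i / n)) for i in range(n + 1)]


def interpolate(nodes: Sequence[float], values: Sequence[float], x: float) -> float:
    """Piecewise-linear interpolation of values sampled at non-uniform nodes."""
    if not nodes[0] <= x <= nodes[-1]:
        raise ValueError("x lies outside the grid")
    # Locate the cell [nodes[j], nodes[j + 1]] containing x.
    j = min(bisect.bisect_right(nodes, x) - 1, len(nodes) - 2)
    t = (x - nodes[j]) / (nodes[j + 1] - nodes[j])
    return values[j] + t * (values[j + 1] - values[j])


def remap_row(nodes: Sequence[float], source_coords: Sequence[float],
              width: int) -> List[float]:
    """Map each output pixel centre of one row to a normalized source coordinate."""
    return [interpolate(nodes, source_coords, (px + 0.5) / width) for px in range(width)]
```

Here `source_coords` holds the compensated source position at each node, which would come from the arc compensation model; with an identity mapping (`source_coords == nodes`) the row is returned unchanged. Concentrating nodes at the edges spends the grid resolution where wide-angle distortion is typically largest.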

[0099] In one embodiment, determining, based on the current time information, whether the current time falls within a preset high-concurrency period includes: obtaining historical recognition logs and analyzing the frequency of face recognition requests for different date types; if the current time information is within a preset time range and the device geographical location information is within a preset geographical location range, determining that the current time is a high-concurrency period and pre-starting a multi-threaded parallel processing queue; and, if the length of the face request queue exceeds a preset value, reducing the inference accuracy of the neural network and the main frequency of the image processing unit.
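A minimal sketch of the high-concurrency logic follows. The representation of "inference accuracy" as a bit width, the halving and 20% clock reduction, and all names (`RuntimeConfig`, `request_frequency`, `is_high_concurrency`, `apply_policy`) are assumptions made for illustration; the application only states that both quantities are reduced when the queue is too long.

```python
from collections import Counter
from dataclasses import dataclass, replace
from datetime import time
from typing import AbstractSet, Iterable, Tuple


@dataclass(frozen=True)
class RuntimeConfig:
    precision_bits: int   # stand-in for the neural network inference accuracy
    ipu_clock_mhz: int    # main frequency of the image processing unit
    parallel_queue_started: bool = False


def request_frequency(log_date_types: Iterable[str]) -> Counter:
    """Count historical recognition requests per date type (one entry per request)."""
    return Counter(log_date_types)


def is_high_concurrency(now: time, location_id: str,
                        window: Tuple[time, time],
                        locations: AbstractSet[str]) -> bool:
    start, end = window
    return start <= now < end and location_id in locations


def apply_policy(cfg: RuntimeConfig, high_concurrency: bool,
                 queue_length: int, max_queue_length: int) -> RuntimeConfig:
    if high_concurrency:
        # Pre-start the multi-threaded parallel processing queue.
        cfg = replace(cfg, parallel_queue_started=True)
    if queue_length > max_queue_length:
        # Reduce both inference accuracy and clock, as the text describes.
        cfg = replace(cfg,
                      precision_bits=max(8, cfg.precision_bits // 2),
                      ipu_clock_mhz=cfg.ipu_clock_mhz * 4 // 5)
    return cfg
```

In practice the preset time range would be derived from `request_frequency` over the historical logs rather than hard-coded.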

[0100] In one embodiment, triggering the access control unlocking operation if the identification is successful and the person is verified as a real person, and otherwise generating a corresponding exception handling strategy based on the device geographical location information, includes: if the device geographical location information indicates a residential area and recognition fails, activating a local cache of unfamiliar faces, the cached data being automatically deleted after a preset time interval; and, if the device geographical location information indicates an industrial park and recognition fails, uploading the abnormal data to the cloud management platform in real time and triggering an immediate alarm notification based on the current time information.
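The two location-specific strategies can be sketched as below. The time-to-live cache, the in-memory `uploads` and `alarms` lists standing in for the cloud management platform and the alarm channel, and every name in the block are hypothetical; timestamps are passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StrangerCache:
    """Local cache of unfamiliar faces whose entries expire after ttl_seconds."""
    ttl_seconds: float
    entries: Dict[str, float] = field(default_factory=dict)  # face id -> expiry time

    def add(self, face_id: str, now: float) -> None:
        self.entries[face_id] = now + self.ttl_seconds

    def purge(self, now: float) -> None:
        # Drop every entry whose preset retention interval has elapsed.
        self.entries = {k: exp for k, exp in self.entries.items() if exp > now}


def handle_failure(location_type: str, face_id: str, now: float,
                   cache: StrangerCache,
                   uploads: List[str],
                   alarms: List[Tuple[str, float]]) -> str:
    """Choose an exception handling strategy from the location type."""
    if location_type == "residential":
        cache.add(face_id, now)
        return "cached_locally"
    if location_type == "industrial_park":
        uploads.append(face_id)        # stand-in for a real-time cloud upload
        alarms.append((face_id, now))  # stand-in for an immediate alarm
        return "uploaded_and_alarmed"
    return "no_action"  # other location types are not described in the text
```

Calling `cache.purge(now)` periodically implements the automatic deletion; a production system would more likely use a scheduled task than explicit calls.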

[0101] In one embodiment, fusing, at the feature level, the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image includes: performing noise reduction on the infrared depth information to generate a depth confidence map; weighting and concatenating the face feature vector of the enhanced image with the depth confidence map to generate a fused feature vector; and using a pre-trained generative adversarial network discriminator to judge the authenticity of the fused feature vector and, if the result is true, outputting a liveness verification pass signal.
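The fusion steps can be sketched on flat lists of numbers. The median filter, the confidence formula, the weights, and the treatment of the discriminator as an arbitrary scoring callable are all assumptions chosen for illustration; the application names the steps but not the formulas.

```python
from statistics import median
from typing import Callable, List, Sequence


def denoise(depth: Sequence[float], radius: int = 1) -> List[float]:
    """Median-filter the depth samples (one simple noise reduction choice)."""
    n = len(depth)
    return [median(depth[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def confidence_map(depth: Sequence[float], radius: int = 1) -> List[float]:
    """Confidence is high where the raw sample agrees with its denoised value."""
    smooth = denoise(depth, radius)
    return [1.0 / (1.0 + abs(raw - s)) for raw, s in zip(depth, smooth)]


def fuse(face_features: Sequence[float], confidence: Sequence[float],
         w_face: float = 0.7, w_depth: float = 0.3) -> List[float]:
    """Weighted concatenation of face features and the depth confidence map."""
    return [w_face * f for f in face_features] + [w_depth * c for c in confidence]


def verify_liveness(fused: Sequence[float],
                    discriminator: Callable[[Sequence[float]], float],
                    threshold: float = 0.5) -> bool:
    """Return True (pass signal) when the discriminator scores the vector as real."""
    return discriminator(fused) >= threshold
```

The `discriminator` argument is where a pre-trained GAN discriminator would be plugged in; here any function returning a score in [0, 1] will do.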

[0102] The modules in the aforementioned ultra-wide-angle face recognition device can be implemented entirely or partially through software, hardware, or a combination thereof. These modules can be embedded in the processor of a computer device in hardware form or independent of it, or stored in the memory of a computer device in software form, so that the processor can call and execute the corresponding operations of each module.

[0103] In one exemplary embodiment, a computer device is provided, which may be a server, and its internal structure may be as shown in Figure 5. The computer device includes a processor, a memory, an input/output (I/O) interface, and a communication interface. The processor, the memory, and the I/O interface are connected via a system bus, and the communication interface is also connected to the system bus via the I/O interface. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database stores ultra-wide-angle face recognition data. The I/O interface is used for information exchange between the processor and external devices. The communication interface is used for communication with external terminals via a network connection. When executed by the processor, the computer program implements an ultra-wide-angle face recognition method.

[0104] Those skilled in the art will understand that the structure shown in Figure 5 is merely a block diagram of the part of the structure related to the present application and does not limit the computer device to which the present application is applied. A specific computer device may include more or fewer components than those shown in the figure, may combine certain components, or may have a different arrangement of components.

[0105] In one exemplary embodiment, a computer device is provided, including a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to implement the steps of the method described above.

[0106] In one embodiment, a computer-readable storage medium is provided having a computer program stored thereon, the computer program, when executed by a processor, implementing the steps of the method described above.

The embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the present invention is not limited to the above embodiments, and various changes can be made within the scope of knowledge possessed by those skilled in the art without departing from the spirit of the present invention. Furthermore, the embodiments of the present invention and the features thereof can be combined with each other unless otherwise specified.

Claims

1. An ultra-wide viewing angle face recognition method, characterized in that the method is applied to a door station hardware platform, which includes a main control module and, connected to it, a wide-angle acquisition module, an infrared supplementary lighting module, and a binocular liveness detection module, the method comprising: acquiring current spatiotemporal environment data, and determining target computing power configuration parameters based on the spatiotemporal environment data, wherein the spatiotemporal environment data includes current time information, device geographical location information, and ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network; capturing, through the wide-angle acquisition module, an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees, and applying, through the infrared supplementary lighting module, supplementary lighting to the ultra-wide field-of-view original image according to the ambient light intensity and the target computing power configuration parameters, to obtain a preprocessed image; calling, by the main control module, a distortion correction model that matches the target computing power configuration parameters, to perform distortion correction and wide dynamic range fusion processing on the preprocessed image and generate an enhanced image, wherein the distortion correction model is dynamically loaded according to the device geographical location information, and the nonlinear mapping parameters of the distortion correction models loaded at different geographical locations are different; and controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information.

2. The method according to claim 1, characterized in that the step of controlling the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information, includes: fusing, at the feature level, the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image to obtain anti-attack fusion features; determining, based on the current time information, whether the current time falls within a preset high-concurrency period and, if so, starting a pre-allocated computing power channel to perform real-time comparison and liveness verification on the anti-attack fusion features; and, if the identification is successful and the person is verified as a real person, triggering the access control unlocking operation, and otherwise generating a corresponding exception handling strategy based on the device geographical location information.

3. The method according to claim 1, characterized in that the step of determining the target computing power configuration parameters based on the spatiotemporal environment data includes: constructing a spatiotemporal mapping table, which records empirical values of device load corresponding to different time periods and different geographical locations; if the current time information falls within morning or evening peak hours and the device geographical location information indicates a densely populated area, setting the target computing power configuration parameters to a high-load mode, in which the proportion of computing power allocated to feature extraction is higher than a preset threshold; and, if the current time information falls within off-peak hours and the device geographical location information indicates a low-light environment, setting the target computing power configuration parameters to an energy-saving enhancement mode, in which the infrared supplementary lighting power is increased preferentially and the recognition accuracy of non-critical areas is reduced.

4. The method according to claim 1, characterized in that the step of capturing, through the wide-angle acquisition module, an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees includes: matching a corresponding light environment prediction model according to the device geographical location information, the light environment prediction model being trained on historical illumination data; and, if the device geographical location information is a preset geographical location and the predicted ambient light attenuation coefficient is greater than a preset value, extending the exposure time of the wide-angle acquisition module by a first preset duration and simultaneously adjusting the wavelength of the infrared supplementary lighting module to the 850 nm band.

5. The method according to claim 1, characterized in that the step of calling, by the main control module, a distortion correction model that matches the target computing power configuration parameters includes: obtaining the physical distortion parameters of the lens and the current image resolution; and, if the building layout corresponding to the device geographical location information is an arc-shaped structure, loading a preset arc compensation model, wherein the arc compensation model uses a non-uniform grid interpolation algorithm to perform pixel mapping on the image edges, so as to compensate for the visual distortion caused by reflection from the arc-shaped building.

6. The method according to claim 2, characterized in that the step of determining, based on the current time information, whether the current time falls within a preset high-concurrency period includes: obtaining historical recognition logs and analyzing the frequency of face recognition requests for different date types; if the current time information is within a preset time range and the device geographical location information is within a preset geographical location range, determining that the current time is a high-concurrency period and pre-starting a multi-threaded parallel processing queue; and, if the length of the face request queue exceeds a preset value, reducing the inference accuracy of the neural network and the main frequency of the image processing unit.

7. The method according to claim 2, characterized in that the step of triggering the access control unlocking operation if the identification is successful and the person is verified as a real person, and otherwise generating a corresponding exception handling strategy based on the device geographical location information, includes: if the device geographical location information indicates a residential area and recognition fails, activating a local cache of unfamiliar faces, the cached data being automatically deleted after a preset time interval; and, if the device geographical location information indicates an industrial park and recognition fails, uploading the abnormal data to the cloud management platform in real time and triggering an immediate alarm notification based on the current time information.

8. The method according to claim 2, characterized in that the step of fusing, at the feature level, the infrared depth information acquired by the binocular liveness detection module with the facial features of the enhanced image includes: performing noise reduction on the infrared depth information to generate a depth confidence map; weighting and concatenating the face feature vector of the enhanced image with the depth confidence map to generate a fused feature vector; and judging the authenticity of the fused feature vector using a pre-trained generative adversarial network discriminator and, if the result is true, outputting a liveness verification pass signal.

9. A face recognition device with an ultra-wide viewing angle, characterized in that the device is applied to a door station hardware platform, which includes a main control module and, connected to it, a wide-angle acquisition module, an infrared supplementary lighting module, and a binocular liveness detection module, the device including: a spatiotemporal perception configuration module, used to acquire current spatiotemporal environment data and determine target computing power configuration parameters based on the spatiotemporal environment data, wherein the spatiotemporal environment data includes current time information, device geographical location information, and ambient light intensity, and the target computing power configuration parameters include the main frequency of the image processing unit and the inference accuracy of the neural network; an image acquisition module, used to capture, through the wide-angle acquisition module, an ultra-wide field-of-view original image with a horizontal field of view of not less than 120 degrees, and to instruct the infrared supplementary lighting module to apply supplementary lighting to the ultra-wide field-of-view original image according to the ambient light intensity and the target computing power configuration parameters, to obtain a preprocessed image; an image processing module, used to call, via the main control module, a distortion correction model that matches the target computing power configuration parameters, and to perform distortion correction and wide dynamic range fusion processing on the preprocessed image to generate an enhanced image, wherein the distortion correction model is dynamically loaded according to the device geographical location information, and the nonlinear mapping parameters of the distortion correction models loaded at different geographical locations are different; and a control module, used to control the access control of the door station hardware platform based on the enhanced image and the infrared depth information obtained by the binocular liveness detection module, combined with the current time information.

10. A computer device comprising a memory and a processor, wherein the memory stores a computer program, characterized in that, when the processor executes the computer program, the steps of the method according to any one of claims 1 to 8 are implemented.