Intelligent security device control method and system based on face recognition

By acquiring and processing facial image similarity judgments, the problem of low facial recognition efficiency in smart locks is solved, improving user experience and security.

CN115577337B · Active · Publication Date: 2026-06-26 · YUNDING NETWORK TECH BEIJING

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: YUNDING NETWORK TECH BEIJING
Filing Date: 2021-12-08
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

The facial recognition efficiency and user experience of existing smart locks are low, and keys or access cards are easily damaged, invalidated, and inconvenient to carry. Biometric identification methods need to be improved.

Method used

By acquiring environmental information, facial images are obtained and processed to determine their similarity. Once preset conditions are met, corresponding operations are performed, including repeated image acquisition to ensure accurate recognition.

Benefits of technology

It improves the efficiency and user experience of facial recognition, reduces misidentification and user inconvenience, and enhances the security and convenience of smart security devices.

Abstract

The embodiment of the specification discloses an intelligent security device control method and system based on face recognition, relates to the field of data processing, and the method comprises the following steps: acquiring environment information; acquiring a first image based on the environment information, wherein the first image comprises a first target face image; performing face recognition on the first target face image, controlling a device to perform a first response operation based on the recognition result of the first target face image; if the first response operation is not a target operation, repeatedly performing the steps of acquiring a second image, the second image comprising a second target face image, and determining the similarity between the first target face image and the second target face image until the similarity meets a preset condition; performing face recognition on the second target face image whose similarity meets the preset condition, and controlling the device to perform a second response operation based on the recognition result of the second target face image.

Description

[0001] Case Analysis

[0002] This application is a divisional application of Chinese Patent Application No. 202111493108.5, entitled "A control method and system for intelligent security equipment based on face recognition", filed on December 8, 2021.

Technical Field

[0003] This specification relates to the field of data processing, and in particular to a control method and system for intelligent security equipment based on facial recognition.

Background Technology

[0004] Currently, users need to carry keys or access cards to open smart locks, but keys or access cards are inconvenient to carry and are prone to damage, malfunction, or loss. With increasing consumer acceptance of smart locks, especially contactless biometric methods, facial recognition is gradually becoming the primary control method for high-end smart locks. However, facial recognition scenarios for smart locks are quite complex, and improvements are needed in facial recognition efficiency and user experience.

[0005] Therefore, there is a need to provide a control method and system for intelligent security devices based on facial recognition, in order to improve the efficiency of facial recognition and thus improve the user experience of intelligent security devices based on facial recognition.

Summary of the Invention

[0006] One embodiment of this specification provides a control method for an intelligent security device based on face recognition. The method includes: acquiring environmental information; acquiring a first image based on the environmental information, the first image including a first target face image; controlling the device to perform a first response operation based on the first target face image; if the first response operation is not a target operation, repeatedly acquiring a second image, the second image including a second target face image, and determining the similarity between the first target face image and the second target face image until the similarity meets a preset condition; when the similarity meets the preset condition, controlling the device to perform a second response operation based on the second target face image.
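The control flow described above can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the function name `run_control_flow`, the callables `recognize`, `similarity`, and `meets_condition`, the `"unlock"` target operation, and the `max_attempts` cap are all hypothetical. In particular, the preset condition is passed in as a predicate because this part of the specification does not say what it is.

```python
from typing import Callable, Iterable, TypeVar

Face = TypeVar("Face")


def run_control_flow(
    faces: Iterable[Face],
    recognize: Callable[[Face], str],
    similarity: Callable[[Face, Face], float],
    meets_condition: Callable[[float], bool],
    target_operation: str = "unlock",  # hypothetical name for the target operation
    max_attempts: int = 5,             # assumption: the text itself sets no retry cap
) -> str:
    """Return the response operation the device ends up performing."""
    stream = iter(faces)
    first_face = next(stream)               # first target face image
    first_response = recognize(first_face)  # first response operation
    if first_response == target_operation:
        return first_response

    # Keep acquiring second target face images until the similarity to the
    # first one meets the preset condition, then recognize that second image.
    for attempt, second_face in enumerate(stream):
        if attempt >= max_attempts:
            break
        if meets_condition(similarity(first_face, second_face)):
            return recognize(second_face)   # second response operation
    return first_response
```

Passing the preset condition as a predicate keeps the sketch neutral about whether the second face image must be similar to or different from the first.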

[0007] One embodiment of this specification provides a face recognition-based intelligent security device control system, comprising: an information acquisition module for acquiring environmental information; an image acquisition module for acquiring a first image based on the environmental information, the first image including a first target face image; and an intelligent security device control module for controlling the device to perform a first response operation based on the first target face image, and further for repeatedly controlling the image acquisition module to acquire a second image, the second image including a second target face image, if the first response operation is not a target operation, and determining the similarity between the first target face image and the second target face image until the similarity meets a preset condition, and further for controlling the device to perform a second response operation based on the second target face image when the similarity meets the preset condition.

[0008] One embodiment of this specification provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor executes the computer program to implement the above-described intelligent security device control method based on face recognition.

[0009] One embodiment of this specification provides a computer-readable storage medium that stores computer instructions, characterized in that when a computer reads the computer instructions in the storage medium, the computer executes the above-described intelligent security device control method based on face recognition.

Brief Description of the Drawings

[0010] This specification will be further described by way of exemplary embodiments, which will be described in detail with reference to the accompanying drawings. These embodiments are not limiting; in these embodiments, the same reference numerals denote the same structures, wherein:

[0011] Figure 1 is a schematic diagram illustrating application scenarios of a face recognition-based intelligent security equipment control system according to some embodiments of this specification;

[0012] Figure 2 is an exemplary block diagram of a face recognition-based intelligent security device control system according to some embodiments of this specification;

[0013] Figure 3 is an exemplary flowchart of a face recognition-based intelligent security device control method according to some embodiments of this specification;

[0014] Figure 4 is an exemplary flowchart illustrating the acquisition of a first image based on a target shooting area according to some embodiments of this specification;

[0015] Figure 5 is a schematic diagram of a third machine learning model according to some embodiments of this specification.

Detailed Implementation

[0016] To more clearly illustrate the technical solutions of the embodiments in this specification, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the drawings described below are merely some examples or embodiments of this specification. For those skilled in the art, this specification can be applied to other similar scenarios based on these drawings without creative effort. Unless obvious from the context or otherwise specified, the same reference numerals in the drawings represent the same structures or operations.

[0017] It should be understood that the terms “system,” “device,” “unit,” and / or “module” used herein are one way to distinguish different components, elements, parts, sections, or assemblies at different levels. However, if other terms can achieve the same purpose, they may be replaced by other expressions.

[0018] As indicated in this specification and claims, unless the context clearly indicates otherwise, the words "a," "an," and/or "the" do not specifically refer to the singular and may also include the plural. Generally speaking, the terms "comprising" and "including" only indicate the inclusion of expressly identified steps and elements, which do not constitute an exclusive list, and the method or apparatus may also include other steps or elements.

[0019] Flowcharts are used in this specification to illustrate the operations performed by the system according to embodiments of this specification. It should be understood that the preceding or following operations are not necessarily performed in exact order. Instead, the steps can be processed in reverse order or simultaneously. Furthermore, other operations can be added to these processes, or one or more steps can be removed from them.

[0020] Figure 1 is a schematic diagram illustrating application scenarios of a face recognition-based intelligent security equipment control system 100 according to some embodiments of this specification.

[0021] In some embodiments, the face recognition-based intelligent security device control system 100 can implement face recognition-based intelligent security device control by implementing the methods and / or processes disclosed in this specification.

[0022] As shown in Figure 1, the intelligent security equipment control system 100 based on face recognition may include a server 110, a network 120, a user terminal 130, a storage device 140, an information acquisition device 150, a camera acquisition device 160, and a smart lock 170.

[0023] Server 110 can be used to process data and / or information from at least one component of the face recognition-based intelligent security device control system 100 or an external data source (e.g., a cloud data center). For example, server 110 can be used to acquire environmental information of the detection area from information acquisition device 150. It can also be used to acquire a first image from camera acquisition device 160. Furthermore, server 110 can be used to control the device to perform a first response operation based on a first target face image. Additionally, server 110 can be used to repeatedly control camera acquisition device 160 to acquire a second image, including a second target face image, and determine the similarity between the first target face image and the second target face image until the similarity meets a preset condition. When the similarity meets the preset condition, server 110 controls the device to perform a second response operation based on the second image. In some embodiments, during processing, server 110 may obtain data (such as instructions) from storage device 140 or save data (e.g., the result of determining whether to control smart lock 170) to storage device 140, or may read data (e.g., environmental information, etc.) from other sources such as user terminal 130 via network 120 or output data (e.g., face recognition results, etc.) to user terminal 130.

[0024] In some embodiments, server 110 may include a central processing unit (CPU), a digital signal processor (DSP), and / or any combination thereof. In some embodiments, server 110 may be local, remote, or implemented on a cloud platform.

[0025] Network 120 can provide a channel for information exchange. In some embodiments, server 110, user terminal 130, storage device 140, information acquisition device 150, camera acquisition device 160, and smart lock 170 can exchange information via network 120. For example, server 110 can receive environmental information acquired by information acquisition device 150 via network 120. As another example, server 110 can read data stored in storage device 140 via network 120.

[0026] User terminal 130 refers to one or more terminal devices or software used by a user. In some embodiments, user terminal 130 may be one or any combination of mobile devices, tablet computers, laptop computers, desktop computers, and other devices with input and / or output functions. In some embodiments, user terminal 130 may serve as a user's display terminal, used to obtain and display the results of facial recognition from server 110 via network 120. The above examples are only used to illustrate the breadth of the range of user terminal 130 devices and are not intended to limit its scope.

[0027] Storage device 140 can be used to store data and/or instructions. In some embodiments, storage device 140 may obtain data and/or instructions from, for example, user terminal 130, information acquisition device 150, and camera acquisition device 160. In some embodiments, storage device 140 may store data and/or instructions that server 110 executes or uses to perform the exemplary methods described herein.

[0028] The information acquisition device 150 can be used to acquire environmental information, which may be information related to determining whether an obstacle exists within a detection area. In some embodiments, the information acquisition device 150 can send the environmental information to the server 110 via the network 120. Further description of the information acquisition device 150 can be found in Figure 3 and its related descriptions.

[0029] The camera acquisition device 160 can be used to acquire a first image, which may include image information of the detection area. In some embodiments, the camera acquisition device 160 may include a camera device and a light source, wherein the light source can be used to illuminate the detection area, and the camera device can be used to acquire the light signal illuminating the detection area. In some embodiments, the camera acquisition device 160 may include multiple camera devices, one of which is a main camera device, and the remaining camera devices are all auxiliary camera devices. The shooting area corresponding to the main camera device and the shooting areas corresponding to the auxiliary camera devices may partially overlap to reduce the monitoring blind spots of the camera acquisition device 160. For example, an auxiliary camera device can be installed below the main camera device. When the user is short (e.g., a child), the user is located below the shooting area corresponding to the main camera device and cannot be captured by the main camera device. The auxiliary camera device can then be used to acquire the image of the user. In some embodiments, the camera device may include a depth camera and/or a planar camera. The depth camera is used to acquire depth information of the detection area. The planar camera is used to acquire a two-dimensional image of the detection area. In some embodiments, a depth camera may include a sensor that scans the spatial three-dimensional information of a detection area to obtain depth information (e.g., point cloud data) of the detection area. One example is a sensor that scans the detection area spatially using white light interferometry. Another example is a sensor that scans the detection area spatially using white light confocal scanning. Yet another example is a structured light camera that uses a projector to project specific light information (e.g., crisscrossing laser lines, black and white squares, rings, etc.) onto the detection area. In some embodiments, the depth camera may also include a binocular camera, a TOF (Time of Flight) camera, etc. In some embodiments, a planar camera may include a monochrome camera, a color camera, a scanner, etc., or any combination thereof. In some embodiments, a light source may project light information onto the detection area so that the depth camera and/or the planar camera can acquire information about the detection area. In some embodiments, the light source may include a visible light source, which can be used to project light visible to the human eye. The visible light source may include monochromatic light sources and composite light sources. Monochromatic light sources can be used to project light of a single frequency (or wavelength) (e.g., red, orange, yellow, green, blue, violet, etc.), while composite light sources are used to project light composed of a mixture of monochromatic lights of different frequencies (or wavelengths); examples of composite light sources include incandescent lamps and fluorescent lamps. In some embodiments, the light source may also include an invisible light source, which can be used to project light invisible to the human eye (e.g., radio waves, microwaves, infrared light, ultraviolet light, X-rays, gamma rays, far-infrared rays, etc.). In some embodiments, there may be one or more light sources. In some embodiments, the light source may include monochromatic light or composite light. In some embodiments, the colors of multiple light sources may be the same or different. For more details on the camera acquisition device 160, please refer to Figure 3, Figure 4, and their related descriptions.

[0030] In some embodiments, the camera acquisition device 160 can send the first image to the server 110 via the network 120. Further description of the camera acquisition device 160 can be found in Figure 3 and its related descriptions.

[0031] The smart lock 170 can be used to perform operations and can be installed on a door. In some embodiments, the smart lock 170 may include an electronic control component and a locking component. The electronic control component can drive the locking component to perform certain operations, such as extending or retracting the bolt of the locking component. When the bolt of the mechanical lock is retracted, the smart lock 170 is in an open state; when the bolt of the mechanical lock is extended, the smart lock 170 is in a locked state. In some embodiments, the electronic control component may be a relay, an electromagnetic coil, etc. In some embodiments, the smart security device for performing a first response operation or a second response operation may include the smart lock 170, and may also include other components, such as a fingerprint recognition component, a prompting component for issuing prompt information (e.g., a speaker, an LED light), etc.

[0032] Figure 2 is an exemplary block diagram of a face recognition-based intelligent security device control system 200 according to some embodiments of this specification.

[0033] As shown in Figure 2, the intelligent security equipment control system 200 based on face recognition may include an information acquisition module 210, an image acquisition module 220, and an intelligent security equipment control module 230 based on face recognition.

[0034] The information acquisition module 210 can be used to acquire environmental information. For more details about the information acquisition module 210, please refer to Figure 3, Figure 4, and their related descriptions.

[0035] The image acquisition module 220 can be used to acquire a first image based on environmental information, the first image including a first target face image. In some embodiments, the image acquisition module 220 can determine whether a human body exists within a detection area based on environmental information; if a human body exists, the first image is acquired. For more descriptions of the detection area and the first image, see Figure 3 and its related descriptions, which will not be repeated here.

[0036] In some embodiments, before acquiring the first image, the image acquisition module 220 may acquire a pre-image based on environmental information, determine the target shooting area based on the pre-image, and acquire the first image based on the target shooting area. In some embodiments, the image acquisition module 220 may determine human feature information based on the pre-image and determine the target shooting area based on the human feature information. In some embodiments, the image acquisition module 220 may acquire the current position of the human body based on the pre-image, and determine whether the human body is located within the target shooting area based on the current position and the target shooting area; if so, the first image is acquired. Further descriptions of the pre-image, target shooting area, and human feature information can be found in Figure 3, Figure 4, and their related descriptions.
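The check of whether the human body is located within the target shooting area might look like the sketch below. It assumes, purely for illustration, that the target shooting area is an axis-aligned rectangle in pre-image pixel coordinates and that the current position is a single (x, y) point. The names `Region` and `should_capture_first_image` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical rectangular target shooting area, in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def should_capture_first_image(position: Tuple[float, float], target_area: Region) -> bool:
    """Acquire the first image only when the body lies inside the target area."""
    x, y = position
    return target_area.contains(x, y)
```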

[0037] The intelligent security device control module 230 can be used to control the intelligent security device to perform a first response operation based on a first target face image. In some embodiments, the intelligent security device control module 230 can also be used to control the image acquisition module to repeatedly acquire a second image, including a second target face image, when the first response operation is not a target operation, and to determine the similarity between the first target face image and the second target face image until the similarity meets a preset condition. It is also used to determine whether to control the device to perform an operation based on the second image when the similarity meets the preset condition. For further description of the first response operation, target operation, second image, second target face image, similarity, preset condition, and controlling the device to perform a second response operation based on the second image, see Figure 3 and its related descriptions, which will not be repeated here.

[0038] In some embodiments, the intelligent security device control module 230 can acquire face images of at least one face in a first image; determine face state information of at least one face based on the face images of at least one face; and determine a first target face image from the face images of at least one face based on the face state information of at least one face. In some embodiments, the intelligent security device control module 230 can determine the legitimacy of the first target face image based on a set of legitimate faces, wherein the set of legitimate faces includes at least one legitimate face image, and control the device to perform a first response operation based on the legitimacy. Further descriptions of face state information, the first target face image, and the determination of the legitimacy of the first target face image can be found in Figure 3 and its related descriptions, which will not be repeated here.
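One possible way to judge legitimacy against a set of legitimate faces is sketched below. This part of the specification does not state how faces are compared, so the sketch makes its own assumptions: each face is a numeric feature vector, the comparison is cosine similarity, and the threshold is 0.9. The names `is_legitimate` and `first_response` and the operation labels are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_legitimate(
    target: Sequence[float],
    legitimate_set: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value, not taken from the specification
) -> bool:
    """The target is legitimate if it matches at least one legitimate face."""
    return any(cosine_similarity(target, ref) >= threshold for ref in legitimate_set)


def first_response(target: Sequence[float], legitimate_set: Sequence[Sequence[float]]) -> str:
    """Map legitimacy to a hypothetical first response operation."""
    return "unlock" if is_legitimate(target, legitimate_set) else "stay_locked"
```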

[0039] Figure 3 is an exemplary flowchart of a face recognition-based intelligent security device control method 300 according to some embodiments of this specification. As shown in Figure 3, the face recognition-based intelligent security device control method 300 includes the following steps. In some embodiments, the face recognition-based intelligent security device control method 300 can be executed by the server 110.

[0040] Step 310: Obtain environmental information. In some embodiments, step 310 may be performed by the information acquisition module 210.

[0041] In some embodiments, the environmental information may be information related to determining whether an obstacle exists within the detection area, wherein the detection area may be an area in front of a door. In some embodiments, the information acquisition module 210 may acquire the environmental information through an information acquisition device (e.g., information acquisition device 150). In some embodiments, the information acquisition device 150 may integrate at least one sensor for acquiring obstacle information within the detection area. In some embodiments, the sensor may include an infrared sensor, an ultrasonic sensor, or a laser sensor, etc.

[0042] In some embodiments, to reduce false positives, the environmental information may also be information related to determining whether a living being exists in the detection area. In some embodiments, the sensor may also be used to acquire living information in the detection area. In some embodiments, living information may include body temperature, blood oxygen saturation, heart rate, finger veins, etc. In some embodiments, the sensor may also include an infrared pyroelectric sensor.

[0043] Step 320: Acquire a first image based on environmental information. In some embodiments, step 320 may be performed by the image acquisition module 220.

[0044] In some embodiments, the first image may be an image containing image information of the detection area. The format of the first image may include Joint Photographic Experts Group (JPEG), Tagged Image File Format (TIFF), Graphics Interchange Format (GIF), Kodak Flash PiX (FPX), Digital Imaging and Communications in Medicine (DICOM), etc. The first image may be a two-dimensional (2D) image or a three-dimensional (3D) image.

[0045] In some embodiments, the image acquisition module 220 may acquire a first image via a camera acquisition device (e.g., camera acquisition device 160). In some embodiments, when the information acquisition module 210 determines that an obstacle exists in the detection area based on the environmental information acquired by the information acquisition device 150, the camera acquisition device 160 may acquire an image or video, and the image acquisition module 220 may use the image acquired by the camera acquisition device 160 as the first image or extract at least one frame from the video acquired by the camera acquisition device 160 as the first image.

[0046] In some embodiments, to avoid non-living objects triggering the acquisition of the first image, when the information acquisition module 210 determines that there is a living body in the detection area based on the environmental information acquired by the information acquisition device 150, the camera acquisition device 160 can acquire the first image; when the information acquisition module 210 determines that there is no living body in the detection area based on the environmental information acquired by the information acquisition device 150, the camera acquisition device 160 may not acquire the first image.
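The trigger logic of the two preceding paragraphs can be summarized in a small sketch. It assumes the sensor readings have already been reduced to two booleans. The `EnvironmentInfo` type, its field names, and the `require_living_body` switch are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentInfo:
    """Hypothetical summary of what the information acquisition device sensed."""
    obstacle_detected: bool
    living_body_detected: bool


def should_acquire_first_image(info: EnvironmentInfo, require_living_body: bool = True) -> bool:
    """Decide whether the camera acquisition device should capture the first image."""
    if not info.obstacle_detected:
        return False
    # With the living-body check enabled, non-living obstacles do not trigger capture.
    return info.living_body_detected or not require_living_body
```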

[0047] In some embodiments, to avoid a living being other than a human (e.g., a pet) triggering the acquisition of the first image, the camera acquisition device 160 can acquire the first image when the information acquisition module 210 determines that a human is present in the detection area based on the environmental information acquired by the information acquisition device 150; when the information acquisition module 210 determines that no human is present in the detection area based on the environmental information acquired by the information acquisition device 150, the camera acquisition device 160 may not acquire the first image. For example, when the sensor of the information acquisition device 150 acquires at least one of blood oxygen, heart rate, and finger vein, the camera acquisition device 160 can acquire the first image; when the information acquisition module 210 determines that the information acquisition device 150 has not acquired any of blood oxygen, heart rate, and finger vein, the camera acquisition device 160 may not acquire the first image. In some embodiments, the camera device may include a binocular camera. When the first image needs to be acquired, the binocular camera can acquire at least two original images containing the detection area, using one of them as the main image and the other as the secondary image. The camera acquisition device 160 can calculate the disparity between corresponding pixels of the main image and the secondary image using a specific algorithm (e.g., the SGBM (Semi-Global Block Matching) algorithm or the BM (Block Matching) algorithm). Then, based on the geometric relationship of parallel binocular vision, it determines the conversion formula between disparity and depth value (e.g., depth = (f * baseline) / disp, where depth is the depth value, f is the normalized focal length, baseline is the distance between the optical centers of the two cameras of the binocular camera, and disp is the disparity value). Based on this conversion formula, the disparity is converted into the depth value of the corresponding pixel. Finally, based on the main image, the secondary image, and the depth value of the corresponding pixel, a first image containing the depth information of the detection area is obtained.
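As a minimal sketch of the disparity-to-depth conversion, the function below applies depth = (f * baseline) / disp to every pixel of a disparity map. It assumes the disparity map has already been computed (for example by a block-matching algorithm) and is given as a nested list. The function name and the choice to return `None` for non-positive disparities are illustrative assumptions.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity: List[List[float]],
    focal_length: float,
    baseline: float,
) -> List[List[Optional[float]]]:
    """Convert a disparity map to a depth map using depth = (f * baseline) / disp.

    Pixels with a non-positive disparity have no valid match, so their depth
    is reported as None (an assumption of this sketch).
    """
    return [
        [(focal_length * baseline) / d if d > 0 else None for d in row]
        for row in disparity
    ]
```

The depth values come out in whatever unit `baseline` is expressed in, provided `focal_length` and the disparities share the same pixel units.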

[0048] In some embodiments, the camera acquisition device 160 may also be a structured light camera, which includes a projector and a camera. When the first image needs to be acquired, the projector can project a pattern with a special structure (e.g., discrete light spots, striped light, coded structured light, etc.) onto the detection area, and the camera acquires a first image of the detection area with the projected pattern, wherein the first image is a two-dimensional color image.

[0049] In some embodiments, the camera acquisition device 160 may include a light source and a planar camera. When the first image needs to be acquired, the light source can emit visible light (e.g., monochromatic light or composite light) to illuminate the detection area, and the planar camera acquires the first image of the detection area illuminated by the visible light, wherein the first image can be a two-dimensional black-and-white image or a two-dimensional color image.

[0050] In some embodiments, the image acquisition module 220 may also preprocess the first image acquired by the camera acquisition device 160, wherein the preprocessing may include image denoising, image enhancement, etc.

[0051] Image denoising refers to removing interference information from a first image. Interference information in the first image degrades its quality. In some embodiments, the image acquisition module 220 can achieve image denoising using median filters, machine learning models, etc.
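To illustrate the median filter option, here is a small pure-Python sketch of a 3x3 median filter on a grayscale image stored as a nested list. The window size, the truncated windows at the image border, and the use of `median_low` to keep the output an actual pixel value are choices made for this example, not details from the specification.

```python
from statistics import median_low
from typing import List


def median_filter_3x3(image: List[List[int]]) -> List[List[int]]:
    """Replace each pixel with the median of its 3x3 neighbourhood.

    Border pixels use only the neighbours that exist (truncated window).
    The input image is left unchanged.
    """
    height, width = len(image), len(image[0])
    filtered = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            filtered[y][x] = median_low(window)
    return filtered
```

A single bright outlier surrounded by uniform pixels is removed, because it never reaches the middle of the sorted window.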

[0052] Image enhancement refers to adding missing information to a first image. Missing information in the first image can cause image blurring. In some embodiments, the image acquisition module 220 can achieve image enhancement using a smoothing filter, median filter, or the like.

[0053] In some embodiments, preprocessing may also include other operations (e.g., image segmentation, etc.).

[0054] In some embodiments, after acquiring a first image, the image acquisition module 220 can also determine whether the acquired first image meets the quality requirements. If it does not meet the quality requirements, the image acquisition module 220 can control the camera acquisition device 160 to acquire the first image again until the first image meets the quality requirements.
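The re-acquisition loop can be sketched as follows. The `capture` and `meets_quality` callables are hypothetical stand-ins for the camera acquisition device and the quality check. The `max_attempts` cap is an assumption added so the sketch always terminates, since the text above describes repeating until the requirement is met.

```python
from typing import Callable, Optional, TypeVar

Image = TypeVar("Image")


def acquire_until_acceptable(
    capture: Callable[[], Image],
    meets_quality: Callable[[Image], bool],
    max_attempts: int = 3,  # assumed cap, not part of the specification
) -> Optional[Image]:
    """Capture images until one meets the quality requirements.

    Returns None if no acceptable image was obtained within max_attempts.
    """
    for _ in range(max_attempts):
        image = capture()
        if meets_quality(image):
            return image
    return None
```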

[0055] In some embodiments, the quality of an image can be characterized by image quality features. In some embodiments, image quality features may include face features, noise features, grayscale distribution, global grayscale, resolution, and contrast, etc.

[0056] A face feature refers to a face image contained in the first image. If the first image does not contain at least one face image, subsequent face recognition based on the first target face image is not possible.

[0057] Noise is interfering information in an image. Noise in the first image not only degrades its quality and affects its visual effect, but also impacts the efficiency of subsequent processing, such as the recognition of the target face image. Noise features are used to describe the noise information in an image and are a numerical representation of noise-related information in the image. In some embodiments, noise features may include noise distribution, global noise intensity, noise level, and noise rate.

[0058] Gray-level distribution features reflect the distribution of gray values among pixels in an image. These features can be obtained through image processing. For example, the mean or standard deviation of the gray values in an image can be used as gray-level distribution features.

[0059] Global grayscale refers to the average or weighted average grayscale value of all pixels in an image. Under the usual convention in which a gray value of 0 represents black, the larger the global grayscale value, the brighter the image; the smaller the global grayscale value, the darker the image.

[0060] Resolution refers to the amount of information stored in an image. In some embodiments, resolution can be characterized by the number of pixels contained in an image per unit area. It is understood that the higher the resolution, the clearer the image.

[0061] Contrast refers to a measure of the difference between brightness levels in an image, representing the magnitude of the grayscale contrast. In some embodiments, contrast can be computed using formulas such as Weber contrast, root mean square (RMS) contrast, and Michelson contrast.
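
As an illustration of the grayscale and contrast features described above, the following sketch computes the global grayscale and the three named contrast measures using their standard definitions: Weber contrast is (I - I_b) / I_b for a target intensity I against a background intensity I_b, Michelson contrast is (I_max - I_min) / (I_max + I_min), and RMS contrast is the standard deviation of the normalized pixel intensities. The function names and the flat list of 8-bit gray values are assumptions made for the example.

```python
from statistics import mean, pstdev


def global_grayscale(pixels):
    """Unweighted average gray value of all pixels."""
    return mean(pixels)


def weber_contrast(target, background):
    """Weber contrast of a target intensity against a non-zero background."""
    return (target - background) / background


def michelson_contrast(pixels):
    """Michelson contrast from the darkest and brightest pixel."""
    lo, hi = min(pixels), max(pixels)
    return (hi - lo) / (hi + lo) if (hi + lo) else 0.0


def rms_contrast(pixels, max_value=255):
    """RMS contrast: standard deviation of intensities scaled to [0, 1]."""
    return pstdev([p / max_value for p in pixels])


pixels = [50, 100, 150, 200]
features = {
    "global_grayscale": global_grayscale(pixels),
    "michelson": michelson_contrast(pixels),
    "rms": rms_contrast(pixels),
    "weber": weber_contrast(target=150, background=100),
}
```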

[0062] In some embodiments, the image acquisition module 220 can analyze the quality features of the first image to determine whether the quality of the first image meets the requirements. For example, if the first image does not include facial features, then the first image does not meet the quality requirements. Also, for example, if the resolution of the first image is less than 1024x768, then the first image does not meet the quality requirements.
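
The quality check and re-acquisition described above might look like the sketch below. The `ImageQuality` record, the `capture` callable, and the `max_attempts` bound are hypothetical; the sketch also assumes that "less than 1024x768" means either dimension falls below its minimum, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class ImageQuality:
    face_count: int  # number of face images detected in the first image
    width: int       # image width in pixels
    height: int      # image height in pixels


MIN_WIDTH, MIN_HEIGHT = 1024, 768  # example resolution requirement from the text


def meets_quality(quality):
    """Return True if the image passes the example quality requirements."""
    if quality.face_count < 1:
        return False  # no face: later face recognition is impossible
    if quality.width < MIN_WIDTH or quality.height < MIN_HEIGHT:
        return False  # assumed per-dimension interpretation of the threshold
    return True


def acquire_until_valid(capture, max_attempts=5):
    """Re-acquire until an image passes; max_attempts is an added safety bound."""
    for _ in range(max_attempts):
        quality = capture()
        if meets_quality(quality):
            return quality
    return None


# Simulated camera: first frame has no face, second is too small, third passes.
frames = iter([
    ImageQuality(0, 1920, 1080),
    ImageQuality(1, 640, 480),
    ImageQuality(1, 1920, 1080),
])
accepted = acquire_until_valid(lambda: next(frames))
```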

[0063] In some embodiments, before recognizing the first target face in the first image, the image acquisition module 220 first determines whether the first image meets the quality requirements, thereby avoiding the recognition of invalid first images and improving the accuracy and efficiency of face recognition.

[0064] In some embodiments, the first image may include a first target face image.

[0065] The first target face image can be a face image in the first image used for subsequent face recognition. In some embodiments, the first image may include face images of at least one face, and the image acquisition module 220 can determine the first target face image from the face images of at least one face.

[0066] In some embodiments, the image acquisition module 220 can acquire a face image of at least one face from a first image. In some embodiments, the image acquisition module 220 can first extract multiple overlapping image patches from the first image. In some embodiments, the image acquisition module 220 can extract multiple image patches from the image using multi-scale sliding-window, selective search, neural networks, etc. Further, the image acquisition module 220 can further extract features from the multiple image patches, determine whether the image patches include a face, thereby acquiring at least one image patch containing a face, and using the at least one image patch containing a face as the face image of at least one face.
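
A multi-scale sliding window, one of the patch extraction options named above, can be sketched as follows. The window sizes, the stride ratio, and the `is_face` classifier argument are illustrative assumptions; in practice the classifier would be whatever face detector the module uses.

```python
def sliding_windows(width, height, window_sizes=(64, 128), stride_ratio=0.5):
    """Yield overlapping square patches as (left, top, size) at several scales."""
    for size in window_sizes:
        if size > width or size > height:
            continue  # window does not fit in the image at this scale
        stride = max(1, int(size * stride_ratio))
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                yield (left, top, size)


def face_patches(width, height, is_face, **kwargs):
    """Keep only the patches that the (hypothetical) classifier marks as a face."""
    return [w for w in sliding_windows(width, height, **kwargs) if is_face(w)]


patches = list(sliding_windows(128, 128))
# Toy classifier for demonstration only: accepts patches starting at the top-left corner.
faces = face_patches(128, 128, is_face=lambda w: w[0] == 0 and w[1] == 0)
```

With a stride of half the window size, neighbouring patches overlap by 50 percent, which matches the "multiple overlapping image patches" described in the paragraph.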

[0067] In some embodiments, the image acquisition module 220 may determine a first target face image from the face images of at least one face based on features of the face images of at least one face. For example, the image acquisition module 220 may determine the proportion of the face area in the face image to the total area of the face image, and select the face image with the largest proportion as the first target face. Alternatively, the image acquisition module 220 may determine whether the face image contains a complete image of facial features, and select the face image containing complete facial features as the first target face image.

[0068] In some embodiments, the image acquisition module 220 may also acquire a first target face image from the first image based on a first machine learning model. In some embodiments, the first machine learning model may include, but is not limited to, a Visual Geometry Group Network (VGG) model, an Inception NET model, a Fully Convolutional Network (FCN) model, a Segmentation Network (SegNet) model, and a Mask-Region Convolutional Neural Network (Mask-RCNN) model.

[0069] In some embodiments, when the image acquisition module 220 trains the first machine learning model, it can use multiple labeled sample images as training data to learn the model's parameters through common methods (e.g., gradient descent). The sample images may include face images containing at least one face, and the labels may be target face images of the sample images. In some embodiments, the first machine learning model may be trained in another device or module.

[0070] In some embodiments, the features of the face image may further include face state information, wherein the face state information can be used to characterize the state of the face when the first image is acquired.

[0071] In some embodiments, face state information may include face area, facial expression, and face angle, wherein the facial expression may be determined based on the facial feature points / key points (e.g., the eyes, nose, and mouth) of the face image, and the face angle may be the angle of the face relative to the camera acquisition device 160.

[0072] In some embodiments, the image acquisition module 220 can determine facial state information based on a feature extraction algorithm. In some embodiments, the feature extraction algorithms for extracting facial state information from a face image include, but are not limited to, Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP) algorithm, Scale Invariant Feature Transform (SIFT) algorithm, Haar-like algorithm, Gray-level Co-occurrence Matrix (GLCM) method, Hough transform, Fourier transform, Fourier shape descriptors, shape factor, Finite Element Method (FEM), Turning function, and Wavelet Descriptor, etc.

[0073] In some embodiments, the image acquisition module 220 can determine a first target face image from face images of at least one face based on face state information of at least one face. In some embodiments, the image acquisition module 220 can determine the recognition intention of a face image based on face state information, wherein the recognition intention is used to characterize the degree to which the person corresponding to the face image is willing to be the object of face recognition. In some embodiments, the image acquisition module 220 can determine the recognition intention of a face image based on at least one of face area, face expression, and face angle. For example, the larger the face area, the greater the recognition intention. Also, the calmer the face expression, the greater the recognition intention. Also, the smaller the face angle, the greater the recognition intention.

[0074] In some embodiments, the image acquisition module 220 can also normalize the face area, facial expression, and face angle, and determine the recognition intention based on the weighted result of the normalized face area, facial expression, and face angle. For example, the image acquisition module 220 can determine the recognition intention based on the following formula:

[0075] Q = aX + bY + cZ;

[0076] Where Q represents the recognition intention, X represents the normalized face area, Y represents the normalized facial expression, Z represents the normalized face angle, a represents the weight of the normalized face area, b represents the weight of the normalized facial expression, and c represents the weight of the normalized face angle.

[0077] In some embodiments, the image acquisition module 220 may determine the first target face image based on the recognition intention. For example, the image acquisition module 220 may use the face image with the highest recognition intention as the first target face image. Alternatively, the image acquisition module 220 may use a face image whose recognition intention is greater than a preset intention threshold as the first target face image.
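
The weighted formula Q = aX + bY + cZ and the selection of the target face can be sketched as below. The weights, the `FaceState` record, the calmness score, and the normalization are assumptions chosen for illustration. In particular, the sketch normalizes each term so that a larger value means a greater willingness to be recognized, so a frontal face (angle 0) maps to Z = 1.

```python
from dataclasses import dataclass


@dataclass
class FaceState:
    area_ratio: float  # face area / image area, in [0, 1]
    calmness: float    # hypothetical expression score: 1 = calm, 0 = agitated
    angle_deg: float   # angle of the face relative to the camera, 0 = frontal


def intention_score(face, a=0.5, b=0.2, c=0.3, max_angle=90.0):
    """Q = a*X + b*Y + c*Z with each term normalized to [0, 1]."""
    x = min(max(face.area_ratio, 0.0), 1.0)
    y = min(max(face.calmness, 0.0), 1.0)
    # Smaller angles should score higher, so invert the normalized angle.
    z = 1.0 - min(abs(face.angle_deg), max_angle) / max_angle
    return a * x + b * y + c * z


def select_target_face(faces, threshold=None):
    """Pick the highest-scoring face, optionally requiring Q > threshold."""
    if not faces:
        return None
    best = max(faces, key=intention_score)
    if threshold is not None and intention_score(best) <= threshold:
        return None
    return best


faces = [FaceState(0.4, 0.9, 10.0), FaceState(0.1, 0.5, 60.0)]
target = select_target_face(faces)
```

Here the first face wins because it is larger, calmer, and more frontal than the second; raising `threshold` above its score would leave no target face selected.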

[0078] In some embodiments, the image acquisition module 220 determines the recognition intention of each face image based on face state information, and then determines the first target face image from the face images of at least one face in the first image based on the recognition intention. This allows the person who needs to be recognized to be determined more accurately.

[0079] Step 330: Based on the first target face image, control the device to perform a first response operation. In some embodiments, step 330 may be performed by the intelligent security device control module 230.

[0080] In some embodiments, the intelligent security device control module 230 can determine whether the first target face image is a legitimate face image, and control the device to perform a first response operation based on the determination result. The first response operation can be an operation performed by the intelligent security device. For example, this could be unlocking the smart lock 170, further locking the smart lock 170 (e.g., controlling the second bolt to extend while the first bolt is extended), activating the fingerprint recognition component, or issuing a prompt message through the prompt component. For instance, when the intelligent security device control module 230 determines that the first target face is a legitimate target face, it can control the smart lock (e.g., smart lock 170) to unlock and / or activate the fingerprint recognition component; when the intelligent security device control module 230 determines that the first target face is not a legitimate target face, it can control the smart lock (e.g., smart lock 170) to further lock and / or issue a prompt message (e.g., voice message, light message, etc.). The legitimate face image can be a pre-stored face image.

[0081] In some embodiments, the intelligent security device control module 230 can determine whether the first target face image is legitimate based on a set of legitimate faces, wherein the set of legitimate faces may include at least one legitimate face image.

[0082] In some embodiments, the intelligent security device control module 230 may obtain a set of legitimate faces from one or more components of the intelligent security device control system 100 (e.g., user terminal 130, storage device 140, etc.) or from an external source (e.g., a database) via the network 120.

[0083] In some embodiments, the intelligent security device control module 230 can determine whether the first target face image is legitimate by judging whether there is a legitimate face image similar to the first target face image in the set of legitimate faces. For example, when there is a legitimate face image similar to the first target face image in the set of legitimate faces, the intelligent security device control module 230 can determine that the first target face image is legitimate; when there is no legitimate face image similar to the first target face image in the set of legitimate faces, the intelligent security device control module 230 can determine that the first target face image is not legitimate.
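
A simple way to picture the legitimacy check above is a similarity search over the set of legitimate faces. The sketch assumes each face image has already been reduced to a feature vector and uses cosine similarity with a hypothetical threshold of 0.8; the text does not specify the similarity measure or the threshold, and the operation names are placeholders.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_legitimate(target_features, legitimate_set, threshold=0.8):
    """True if any stored legitimate face is similar enough to the target."""
    return any(
        cosine_similarity(target_features, stored) >= threshold
        for stored in legitimate_set
    )


def first_response(target_features, legitimate_set):
    """Map the legitimacy result to an example response operation."""
    if is_legitimate(target_features, legitimate_set):
        return "unlock"
    return "issue_prompt"


legitimate_set = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
known = first_response([0.9, 0.1, 0.0], legitimate_set)
unknown = first_response([0.0, 0.0, 1.0], legitimate_set)
```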

[0084] In some embodiments, the intelligent security device control module 230 can determine whether the first target face image is legitimate based on the first image features of the first target face image and the second image features of at least one legitimate face image.

[0085] In some embodiments, the intelligent security device control module 230 can determine whether a first target face image is legitimate based on a first image feature of the first target face image and a second image feature of at least one legitimate face image using a second machine learning model. In some embodiments, the input to the second machine learning model is the first image feature of the first target face image and the second image feature of at least one legitimate face image, and the output of the second machine learning model is whether the first target face image is legitimate. In some embodiments, the second machine learning model may include, but is not limited to, Visual Geometry Group Network (VGG) models, Inception NET models, Fully Convolutional Networks (FCN) models, Segmentation Networks (SegNet) models, and Mask-Region Convolutional Neural Networks (Mask-RCNN) models, etc.

[0086] In some embodiments, when the image acquisition module 220 trains the second machine learning model, it can use multiple labeled sample images as training data to learn the model's parameters through common methods (e.g., gradient descent). The sample images may include face images containing at least one face, and the labels may indicate whether the sample image is legitimate. In some embodiments, the second machine learning model may be trained in another device or module.

[0087] Step 340: If the first response operation is not the target operation, the process of acquiring the second image (including the second target face image) is repeated, and the similarity between the first target face image and the second target face image is determined until the similarity meets a preset condition. When the similarity meets the preset condition, the intelligent security device is controlled to perform the second response operation based on the second target face image. In some embodiments, this step 340 can be performed by the intelligent security device control module 230.

[0088] In some embodiments, the target operation can be the operation performed by the smart security device after determining that the first target face image is a legitimate face image (e.g., unlocking the smart lock 170 and / or activating the fingerprint recognition component, etc.).

[0089] In some embodiments, if a user's face is obscured (e.g., wearing a mask), preventing the first target face image from being recognized as a legitimate face image, the user still needs to leave and return to the detection area even after removing the obscuration, triggering the next face recognition attempt, resulting in a poor user experience. To improve the user experience, after recognizing the first target face image as invalid, a second target face image can be acquired. This allows the intelligent security device control module 230 to directly recognize the second target face image after the user removes the face obscuration, without requiring the user to leave the detection area.

[0090] The second image can be an image containing image information of the detection area that is acquired again by the camera acquisition device (e.g., camera acquisition device 160) after acquiring the first image.

[0091] In some embodiments, if the first response operation is not the target operation, the intelligent security device control module 230 can control the camera acquisition device (e.g., camera acquisition device 160) to acquire the second image. The way the camera acquisition device (e.g., camera acquisition device 160) acquires the second image is similar to the way the camera acquisition device (e.g., camera acquisition device 160) acquires the first image. For more details on acquiring the second image, please refer to the relevant description on acquiring the first image.

[0092] The second image may include a second target face image, which may be a face image within the second image. In some embodiments, the second image may include face images of at least one face, and the image acquisition module 220 may determine the second target face image from the face images of at least one face. The method of determining the second target face from the second image is similar to the method of determining the first target face from the first image; for further description of acquiring the second target face, please refer to the relevant description of acquiring the first target face.

[0093] In some embodiments, the intelligent security device control module 230 can also determine the similarity between the first target face image and the second target face image. In some embodiments, the intelligent security device control module 230 can determine the similarity between the first target face image and the second target face image based on the face image features of the first target face image (i.e., the third image feature) and the face image features of the second target face image (i.e., the fourth image feature). In some embodiments, the third image feature may include facial feature points / key points (e.g., the eyes, nose, and mouth), face position, etc.

[0094] In some embodiments, the intelligent security device control module 230 can determine the similarity between a first target face image and a second target face image based on at least one of facial features / key facial features and face position. For example, the more similar the facial features / key facial features of the first target face image are to the facial features / key facial features of the second target face image, the higher the similarity between the first target face image and the second target face image. Also, for example, the closer the face position of the first target face image is to the face position of the second target face image, the higher the similarity between the first target face image and the second target face image.

[0095] In some embodiments, the preset condition can be a condition used to determine whether it is necessary to acquire the second image again. In some embodiments, the preset condition can be that the similarity between the first target face image and the second target face image is less than a preset similarity threshold (e.g., 50%). For example, when the similarity between the first target face image and the second target face image is less than the preset similarity threshold (e.g., 50%), the intelligent security device control module 230 can perform face recognition again based on the second target face image of the second image; when the similarity between the first target face image and the second target face image is greater than the preset similarity threshold (e.g., 50%), the intelligent security device control module 230 can control the camera acquisition device (e.g., camera acquisition device 160) to acquire another second image and determine the similarity between the second target face image in that second image and the first target face image, repeating until the similarity between the second target face image in a newly acquired second image and the first target face image is less than the preset similarity threshold.
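
The repeated acquisition described above amounts to a loop that keeps capturing second images until the face has changed enough. In the sketch below, the `capture_face` and `similarity` callables, the tuple representation of a face, and the `max_frames` bound are all hypothetical; only the 50% threshold comes from the text.

```python
def wait_for_changed_face(first_face, capture_face, similarity,
                          threshold=0.5, max_frames=100):
    """Capture second target faces until similarity to the first drops below threshold.

    Returns the second target face that met the preset condition, or None if
    max_frames (an added safety bound) is exhausted.
    """
    for _ in range(max_frames):
        second_face = capture_face()
        if similarity(first_face, second_face) < threshold:
            return second_face  # face changed enough: recognize this one
    return None


def match_ratio(a, b):
    """Toy similarity: fraction of positions at which two feature tuples agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


first_face = (1, 1, 1, 1)  # e.g., the masked face that failed recognition
frames = iter([(1, 1, 1, 1), (1, 1, 1, 0), (0, 0, 0, 1)])
second_face = wait_for_changed_face(first_face, lambda: next(frames), match_ratio)
```

The first two captured frames are too similar to the rejected face (similarity 1.0 and 0.75), so recognition is not repeated on them; the third frame (similarity 0.25) meets the preset condition and is passed on for recognition.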

[0096] In some embodiments, the intelligent security device control module 230 can determine the similarity between the first target face image and the second target face image based on a third machine learning model and determine whether the second image meets the preset conditions.

[0097] As shown in Figure 5, the third machine learning model may include a feature extraction layer 503, a similarity calculation layer 506, and a discrimination layer 508.

[0098] In some embodiments, the third machine learning model can analyze the first target face image and the second target face image respectively to obtain the similarity between the third image feature of the first target face image and the fourth image feature of the second target face image in the image pair.

[0099] The feature extraction layer 503 can be used to process the first target face image 501 and the second target face image 502 to obtain the third image features and the fourth image features, respectively.

[0100] The third image feature comprises the facial feature points / key points (e.g., the eyes, nose, and mouth) of the face in the first target face image 501. The fourth image feature comprises the facial feature points / key points (e.g., the eyes, nose, and mouth) of the face in the second target face image 502.

[0101] In some embodiments, the feature extraction layer 503 may include a convolutional neural network (CNN) model such as ResNet, ResNeXt, SE-Net, DenseNet, MobileNet, ShuffleNet, RegNet, EfficientNet, or Inception, or a recurrent neural network model.

[0102] The input to the feature extraction layer 503 can be a first target face image 501 and a second target face image 502. For example, the first target face image 501 and the second target face image 502 can be concatenated and then input into the feature extraction layer 503. The output of the feature extraction layer 503 can be a third image feature 504 of the first target face image and a fourth image feature 505 of the second target face image.

[0103] The similarity calculation layer 506 can be used to calculate the similarity between the third image feature 504 of the first target face image and the fourth image feature 505 of the second target face image. For example, the intelligent security device control module 230 can input the third image feature 504 of the first target face image and the fourth image feature 505 of the second target face image into the similarity calculation layer 506, and the similarity calculation layer 506 can output the similarity 507 between the first target face image 501 and the second target face image 502.

[0104] The discrimination layer 508 can determine whether the similarity 507 meets the preset condition. Specifically, the discrimination layer 508 can compare the similarity 507 with the similarity threshold. If the similarity 507 is greater than the similarity threshold, the similarity 507 does not meet the preset condition; if the similarity 507 is less than the similarity threshold, the similarity 507 meets the preset condition.
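
The data flow through the three layers of Figure 5 can be illustrated with simple stand-ins. The sketch below is not a trained model: the row-mean feature extractor and the distance-based similarity are placeholders chosen only to show how layers 503, 506, and 508 connect, and all function names are assumptions.

```python
import math


def extract_features(image):
    """Stand-in for feature extraction layer 503: per-row mean intensity in [0, 1]."""
    return [sum(row) / (255 * len(row)) for row in image]


def similarity_layer(third_feature, fourth_feature):
    """Stand-in for similarity calculation layer 506: maps distance into (0, 1]."""
    distance = math.sqrt(
        sum((a - b) ** 2 for a, b in zip(third_feature, fourth_feature))
    )
    return 1.0 / (1.0 + distance)


def discrimination_layer(similarity, threshold=0.5):
    """Stand-in for discrimination layer 508: condition met when similarity < threshold."""
    return similarity < threshold


def third_model(first_face_image, second_face_image, threshold=0.5):
    """Run both images through the three stages and return (similarity, condition_met)."""
    third_feature = extract_features(first_face_image)
    fourth_feature = extract_features(second_face_image)
    similarity = similarity_layer(third_feature, fourth_feature)
    return similarity, discrimination_layer(similarity, threshold)


dark = [[0, 0], [0, 0]]
bright = [[255, 255], [255, 255]]
same_result = third_model(dark, dark)
changed_result = third_model(dark, bright)
```

In the embodiments described here, the feature extraction stage would be a learned network (e.g., a CNN) rather than a fixed statistic, and all three stages could be trained jointly.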

[0105] In some embodiments, the third machine learning model can be a machine learning model with preset parameters. Preset parameters refer to the model parameters that the machine learning model learns during training. Taking a neural network as an example, model parameters include weights and biases. The preset parameters of the third machine learning model are generated through the training process. For example, the intelligent security device control module 230 can train an initial third machine learning model based on multiple labeled training samples to obtain the third machine learning model.

[0106] The training samples consist of one or more labeled pairs of sample images. Each pair includes a first sample image and a second sample image. The first sample image may contain a first face image. The second sample image contains a second face image. The labels on the training samples indicate whether the similarity between the second sample image and the first sample image meets a preset condition.

[0107] In some embodiments, the intelligent security device control module 230 can input training samples into an initial third machine learning model, and update the parameters of the initial feature extraction layer, initial similarity calculation layer, and initial discrimination layer through training until the updated third machine learning model meets preset conditions. The updated third machine learning model can be designated as the trained third machine learning model, wherein the preset conditions can be that the loss function of the updated third machine learning model is less than a threshold, that the loss function converges, or that the number of training iterations reaches a threshold.

[0108] In some embodiments, the intelligent security device control module 230 can train the initial feature extraction layer, initial similarity calculation layer, and initial discrimination layer in the third machine learning model through an end-to-end training method. End-to-end training means inputting training samples into the initial model, determining the loss value based on the output of the initial model, and updating the initial model based on the loss value. The initial model may contain multiple sub-models or modules for performing different data processing operations, which are treated as a whole and updated simultaneously during training. For example, in the initial third machine learning model, a first sample image and at least one second sample image can be input into the initial feature extraction layer, a loss function can be established based on the output of the initial discrimination layer and the label, and the parameters of each initial layer in the third machine learning model can be updated simultaneously based on the loss function.

[0109] In some embodiments, the third machine learning model may be pre-trained by a processing device or a third party and stored in a storage device, and the processing device may directly call the third machine learning model from the storage device.

[0110] In some embodiments, when the similarity between the first target face image and the second target face image is less than a preset similarity threshold (e.g., 50%), it can be determined that the face undergoing face recognition has changed significantly (e.g., the face corresponding to the first target face image has been unobstructed, or other faces requiring face recognition have appeared). The intelligent security device control module 230 can then control the intelligent security device to perform a second response operation based on the second target face image. This second response operation can be an operation performed by the intelligent security device based on the result of face recognition based on the second target face image. Examples include unlocking, further locking (e.g., controlling the second lock tongue to extend while the first lock tongue is extended), activating fingerprint recognition, and issuing a prompt message.

[0111] In some embodiments, the intelligent security device control module 230 can determine whether the second target face image is legitimate by judging whether there is a legitimate face image similar to the second target face image in the legitimate face set, and thus determine whether to control the device to perform a second response operation. For example, when there is a legitimate face image similar to the second target face image in the legitimate face set, the intelligent security device control module 230 can determine that the second target face image is legitimate and control the smart lock 170 to perform operations such as unlocking and / or activating the fingerprint recognition component; when there is no legitimate face image similar to the second target face image in the legitimate face set, the intelligent security device control module 230 can determine that the second target face image is not legitimate and control the smart lock 170 to perform operations such as further locking and / or controlling the prompting component to issue prompt information (e.g., voice information, light information, etc.).

[0112] In some embodiments, the intelligent security device control module 230 can also use a second machine learning model to determine whether the second target face image is legitimate based on a set of legitimate faces.

[0113] In some embodiments, the intelligent security device control module 230 can control the camera acquisition device to repeatedly acquire the second image until the similarity between the first target face image and the second target face image meets a preset condition. This avoids performing face recognition again after the user completes face recognition once, provided that the user's position or face has not changed significantly. This effectively reduces the number of face recognition attempts and improves unlocking efficiency.

[0114] In some embodiments, after determining that the first target face image is not a legitimate face image, the intelligent security device control module 230 can repeatedly acquire images for face recognition until face recognition passes (i.e., the face is legitimate). In some embodiments, a preset recognition count threshold can be used to prevent the intelligent security device control module 230 from performing too many face recognitions. If the number of face recognitions already completed is close to the recognition count threshold (e.g., the difference is less than 2), the intelligent security device control module 230 can stop acquiring images and performing face recognition. In some embodiments, after stopping face recognition, the intelligent security device control module 230 can also issue a prompt message (e.g., voice message, light message, etc.) to notify the user that face recognition has stopped.
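
The recognition count limit described above can be sketched as a bounded retry loop. The function name, the `recognize_once` callable, and the example threshold of 5 are hypothetical; the margin of 2 mirrors the "difference is less than 2" example in the text.

```python
def recognize_with_limit(recognize_once, count_threshold=5, margin=2):
    """Repeat face recognition until it passes or the count nears the threshold.

    Stops once (count_threshold - completed) < margin.
    Returns (passed, completed_attempts).
    """
    completed = 0
    while count_threshold - completed >= margin:
        completed += 1
        if recognize_once():
            return True, completed
    # At this point a prompt (voice, light, etc.) could notify the user.
    return False, completed


outcomes = iter([False, False, True])
success = recognize_with_limit(lambda: next(outcomes))
failure = recognize_with_limit(lambda: False)
```

With a threshold of 5 and a margin of 2, the loop gives up after the fourth failed attempt, because the difference between the threshold and the completed count has then dropped to 1.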

[0115] In some embodiments, the user can determine whether facial recognition has succeeded based on the response actions performed by the device. In some embodiments, the user can also determine the number of facial recognition attempts based on the number of response actions performed by the device. For example, if the smart security device sequentially issues two prompt messages and unlocks the door once, the user can determine that three facial recognition attempts were performed, and that the first and second attempts failed, while the third attempt succeeded.

[0116] In some embodiments, in order to improve the completeness and clarity of the first target face image and / or the second target face image acquired by the camera acquisition device (e.g., camera acquisition device 160) and facilitate subsequent face recognition based on the first target face image and / or the second target face image, the image acquisition module 220 may also determine the target shooting area before the first target face image is acquired by the camera acquisition device (e.g., camera acquisition device 160). When a human body stands in or near the target shooting area, the camera acquisition device can acquire the first target face image and / or the second target face image.

[0117] Figure 4 is an exemplary flowchart illustrating the acquisition of a first image based on a target shooting area, according to some embodiments of this specification. As shown in Figure 4, process 400 includes the following steps. In some embodiments, process 400 may be executed by server 110.

[0118] Step 410: Acquire a pre-image based on environmental information. In some embodiments, this step 410 may be performed by the image acquisition module 220.

[0119] In some embodiments, before acquiring the first image, the image acquisition module 220 may acquire a pre-image based on environmental information. The pre-image may be an image containing a human body located within the detection area acquired by the camera acquisition device (e.g., camera acquisition device 160) before acquiring the first image.

[0120] In some embodiments, when the information acquisition module 210 determines that there is an obstacle in the detection area based on the environmental information acquired by the information acquisition device 150, the image acquisition module 220 can control the camera acquisition device 160 to acquire a pre-image. In some embodiments, when the image acquired by the main camera of the camera acquisition device 160 does not include a human body, the camera acquisition device 160 can acquire a pre-image through the secondary camera. This makes it easier for the image acquisition module 220 to subsequently determine the current distance between the human body and the camera acquisition device based on the pre-image, and also reduces the possibility that the acquired pre-image does not contain a human body.

[0121] In some embodiments, to avoid non-living obstacles triggering the acquisition of pre-images, when the information acquisition module 210 determines that there is a living body in the detection area based on the environmental information acquired by the information acquisition device 150, the image acquisition module 220 can control the camera acquisition device 160 to acquire pre-images.

[0122] In some embodiments, to avoid non-human living beings (e.g., pets) triggering the acquisition of pre-images, when the information acquisition module 210 determines that a human body is present in the detection area based on the environmental information acquired by the information acquisition device 150, the image acquisition module 220 can control the camera acquisition device 160 to acquire a pre-image; when the information acquisition module 210 determines that no human body is present in the detection area based on the environmental information acquired by the information acquisition device 150, the camera acquisition device 160 may not acquire a pre-image. For example, when the sensor of the information acquisition device 150 acquires at least one of blood oxygen, heart rate, and finger vein, the image acquisition module 220 can control the camera acquisition device 160 to acquire a pre-image.
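
The trigger condition in the example above can be expressed as a small predicate over the sensed environment information. The `EnvironmentInfo` record and its field names are hypothetical; the sketch simply treats the presence of any of the three listed readings as evidence of a human body, as the example in the text does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EnvironmentInfo:
    blood_oxygen: Optional[float] = None   # None means no reading was acquired
    heart_rate: Optional[float] = None     # None means no reading was acquired
    finger_vein_detected: bool = False


def should_acquire_pre_image(env):
    """Trigger the pre-image when at least one human-body signal is acquired."""
    return (
        env.blood_oxygen is not None
        or env.heart_rate is not None
        or env.finger_vein_detected
    )


empty_area = should_acquire_pre_image(EnvironmentInfo())
person_present = should_acquire_pre_image(EnvironmentInfo(heart_rate=72.0))
```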

[0123] Step 420: Determine the target shooting area based on the pre-image. In some embodiments, this step 420 may be performed by the image acquisition module 220.

[0124] In some embodiments, the pre-image may include an image of at least one human body. In some embodiments, the image acquisition module 220 may determine the image of a target human body from the image of at least one human body. In some embodiments, the image acquisition module 220 may determine the target human body from the at least one human body based on features of the face image of the at least one human body. For example, the image acquisition module 220 may determine the proportion of the face area of each face in the face image to the total area of the pre-image, and take the human body corresponding to the face image with the largest proportion as the target human body. Alternatively, the image acquisition module 220 may determine whether the image of at least one human body contains a complete image of facial features, and take the human body corresponding to the image containing the complete facial features as the target human body.
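The first selection rule above (largest face-area proportion) can be sketched as follows. This is only an illustrative sketch: the `FaceBox` type, the `body_id` field, and the function name are hypothetical, and the face bounding boxes are assumed to come from some face detector that this specification does not name.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FaceBox:
    """Face bounding box in pixels (hypothetical detector output)."""
    body_id: int  # identifier of the human body this face belongs to
    width: int
    height: int


def select_target_body(
    faces: Sequence[FaceBox], image_width: int, image_height: int
) -> Optional[int]:
    """Return the body_id whose face covers the largest share of the pre-image.

    Returns None when there are no faces or the image area is not positive.
    """
    image_area = image_width * image_height
    if image_area <= 0 or not faces:
        return None
    # Proportion of the face area to the total area of the pre-image.
    best = max(faces, key=lambda f: (f.width * f.height) / image_area)
    return best.body_id
```

For example, given three detected faces of 40x50, 80x100, and 10x10 pixels in a 640x480 pre-image, the sketch selects the body associated with the 80x100 face.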

[0125] In some embodiments, the image acquisition module 220 can determine the identification intention of each human body based on an image of at least one human body, and determine the target human body from the at least one human body based on the identification intention. For a more detailed description of determining the identification intention, please refer to Figure 3 and its related description.

[0126] In some embodiments, the target shooting area is a region within the detection area. In some embodiments, when the target human body is located in the target shooting area, the first image and / or the second image acquired by the camera acquisition device 160 are more complete and clearer, making it easier for the subsequent intelligent security equipment control module 230 to perform face recognition based on the first target face image and / or the second target face image.

[0127] In some embodiments, the image acquisition module 220 may determine the target shooting area based on a pre-image.

[0128] In some embodiments, the image acquisition module 220 can determine human body feature information based on a pre-image, wherein the human body feature information can be information related to the body shape of the target human body, such as height.

[0129] In some embodiments, the pre-image may be a depth map containing depth information. In some embodiments, the image acquisition module 220 may determine human feature information based on the depth information of the depth map. For example, the image acquisition module 220 may use the camera acquisition device as the coordinate origin, convert the depth map into point cloud data based on coordinate transformation, and acquire the point cloud data of the target human body from the point cloud data. The image acquisition module 220 may acquire the three-dimensional coordinates of the highest point and the lowest point of the human body in the point cloud data of the target human body, and calculate the height difference (i.e., human height) between the highest and lowest points of the human body.
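A minimal sketch of the height estimate described above is given below. It assumes a pinhole camera model with hypothetical intrinsics (`fx`, `fy`, `cx`, `cy`), a depth map in meters in which 0 marks an invalid pixel, a precomputed mask of the target human body's pixels, and a camera whose optical axis is horizontal so that the image's vertical axis corresponds to height. None of these details are specified in the source.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def depth_to_points(
    depth: Sequence[Sequence[float]],
    mask: Sequence[Sequence[bool]],
    fx: float, fy: float, cx: float, cy: float,
) -> List[Point3D]:
    """Back-project masked depth pixels into camera coordinates.

    The camera is the coordinate origin; x points right, y points down,
    and z points along the optical axis (pinhole model, an assumption).
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if mask[v][u] and z > 0:  # skip background and invalid depth
                x = (u - cx) * z / fx
                y = (v - cy) * z / fy
                points.append((x, y, z))
    return points


def body_height(points: Sequence[Point3D]) -> float:
    """Height difference between the highest and lowest body points."""
    if not points:
        raise ValueError("no points for the target human body")
    ys = [p[1] for p in points]
    return max(ys) - min(ys)
```

In practice the mask would come from a human segmentation step, and the result is only as reliable as the depth data and the horizontal-axis assumption.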

[0130] In some embodiments, the image acquisition module 220 can determine the target shooting area based on the target angle of the target human body relative to the camera acquisition device 160 and the target distance between the target human body and the camera acquisition device 160. In some embodiments, the image acquisition module 220 can pre-store the target angle of the target human body relative to the camera acquisition device 160 (e.g., -5° to 5°). In some embodiments, the image acquisition module 220 can determine the target distance between the target human body and the camera acquisition device 160 based on human body feature information. For example, the image acquisition module 220 can determine the target distance between the target human body and the camera acquisition device 160 based on height. For example, when the height is 1 meter, the target distance is 1.2 meters. As another example, when the height is 1.8 meters, the target distance is 1.5 meters.

[0131] In some embodiments, the image acquisition module 220 can determine the target distance between the target human body and the camera acquisition device 160 based on the following formula:

[0132] L = aH + N;

[0133] Where L is the target distance, a is a preset coefficient (a is a positive number), H is the height, and N is a preset fixed distance (e.g., 1 meter).
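The formula can be evaluated directly, as in the sketch below. The coefficient value a = 0.2 is a hypothetical choice made only for illustration (the source states only that a is positive), and N = 1 meter follows the example given above.

```python
def target_distance(height_m: float, a: float = 0.2, n_m: float = 1.0) -> float:
    """Compute L = a * H + N, with all lengths in meters.

    a is the preset coefficient (0.2 here is a hypothetical value) and
    n_m is the preset fixed distance N (1 meter, per the example).
    """
    if a <= 0:
        raise ValueError("the preset coefficient a must be positive")
    return a * height_m + n_m
```

With these assumed values, a height of 1 meter gives a target distance of about 1.2 meters. A deployed system would calibrate a and N for its own camera.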

[0134] In some embodiments, the image acquisition module 220 determines the target shooting area based on human body feature information, so that the target shooting area can be adjusted according to different people, thereby making the acquired images of people with different human body features more complete, clear, and convenient for subsequent face recognition based on the first target face image and / or the second target face image.

[0135] Step 430: Acquire a first image based on the target shooting area. In some embodiments, this step 430 may be performed by the image acquisition module 220.

[0136] In some embodiments, when the target human body approaches the target shooting area, the image acquisition module 220 can acquire a first image.

[0137] In some embodiments, the image acquisition module 220 can acquire the current position of the target human body. In some embodiments, the image acquisition module 220 can determine the current position of the target human body based on a pre-image, wherein the current position of the target human body can be information related to the relative position of the target human body with respect to the camera acquisition device, such as the current distance between the target human body and the camera acquisition device, or the current angle between the target human body and the camera acquisition device.

[0138] In some embodiments, the pre-image may be a depth map containing depth information. In some embodiments, the image acquisition module 220 may determine the current distance between the target human body and the camera acquisition device based on the pixel values of the target human body image in the pre-image.

[0139] In some embodiments, the image acquisition module 220 can determine the angle between the target human body and the camera acquisition device based on the depth information of the depth map. For example, the image acquisition module 220 can use the camera acquisition device as the coordinate origin, convert the depth map into point cloud data based on coordinate transformation, and acquire the point cloud data of the target human body from the point cloud data. Based on the three-dimensional coordinates of the target human body's point cloud data, the angle between the target human body and the camera acquisition device is determined. For example, the image acquisition module 220 can randomly select several points from the point cloud data of the target human body as sampling points, and use the average of the three-dimensional coordinates of the several sampling points as the three-dimensional coordinates used to represent the target human body, and calculate the angle between the three-dimensional coordinates used to represent the target human body and the origin (i.e., the current angle between the target human body and the camera acquisition device).
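The sampling and averaging step above might look like the following sketch. The source does not state which angle is measured, so the sketch assumes the horizontal angle between the camera's optical axis (the z axis) and the representative point. The function names and the sample size are hypothetical.

```python
import math
import random
from typing import Optional, Sequence, Tuple

Point3D = Tuple[float, float, float]


def representative_point(
    points: Sequence[Point3D],
    sample_size: int,
    rng: Optional[random.Random] = None,
) -> Point3D:
    """Average the coordinates of randomly selected sampling points."""
    if not points or sample_size <= 0:
        raise ValueError("need at least one point and a positive sample size")
    rng = rng or random.Random()
    k = min(sample_size, len(points))
    sample = rng.sample(list(points), k)
    return (
        sum(p[0] for p in sample) / k,
        sum(p[1] for p in sample) / k,
        sum(p[2] for p in sample) / k,
    )


def horizontal_angle_deg(point: Point3D) -> float:
    """Angle in degrees between the optical axis (z) and the point, in the x-z plane."""
    x, _, z = point
    return math.degrees(math.atan2(x, z))
```

A point directly in front of the camera yields 0°, and a point with equal x and z offsets yields 45°.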

[0140] In some embodiments, the image acquisition module 220 may also acquire the current distance between the target human body and the camera acquisition device in other ways, such as by using laser, structured light, signal interference, etc.

[0141] In some embodiments, the image acquisition module 220 can determine whether the target human body is close to the target shooting area based on the current distance and current angle between the target human body and the camera acquisition device. For example, if the absolute value of the difference between the current angle between the target human body and the camera acquisition device and the target angle is greater than an angle difference threshold (e.g., 3°), the image acquisition module 220 can determine that the target human body is not close to the target shooting area. As another example, if the absolute value of the difference between the current distance between the target human body and the camera acquisition device and the target distance is greater than a distance difference threshold (e.g., 20cm), the image acquisition module 220 can determine that the target human body is not close to the target shooting area.
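The two threshold checks above can be combined as in the following sketch. Requiring both checks to pass is an assumption, as is treating the target angle as a single value rather than a range; the default thresholds of 3° and 20 cm are the example values from the text.

```python
def is_near_target_area(
    current_angle_deg: float,
    current_distance_m: float,
    target_angle_deg: float,
    target_distance_m: float,
    angle_threshold_deg: float = 3.0,
    distance_threshold_m: float = 0.20,
) -> bool:
    """Return True when both the angle and distance differences are within threshold.

    A difference greater than either threshold means the target human body
    is judged not close to the target shooting area.
    """
    angle_ok = abs(current_angle_deg - target_angle_deg) <= angle_threshold_deg
    distance_ok = abs(current_distance_m - target_distance_m) <= distance_threshold_m
    return angle_ok and distance_ok
```

For instance, a person at 2° and 1.3 m is judged close to a target of 0° and 1.2 m, while a person at 4° or at 1.5 m is not.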

[0142] In some embodiments, when the image acquisition module 220 determines that the target human body is not located in the target shooting area, the image acquisition module 220 can also generate a prompt message to prompt the user to move closer to the target shooting area based on the current position of the target human body and the target shooting area. In some embodiments, the image acquisition module 220 can determine the prompt message based on the difference between the current angle and the target angle between the target human body and the camera acquisition device, and the difference between the current distance and the target distance between the target human body and the camera acquisition device, so as to guide the user to quickly adjust from the current position to the target shooting area.

[0143] In some embodiments, the prompt information may be voice information, text information, or image information. For example, the prompt information may be the voice message "Please walk 10cm straight ahead." As another example, the prompt information may be the text message "Please walk 5cm to the left" displayed on the client's screen. As yet another example, the prompt information may be light projected onto the target shooting area.
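One way to turn the angle and distance differences into prompts of this kind is sketched below. The conversion is an assumption: positions are expressed in a camera frame with x to the camera's right and z along the optical axis, the user is assumed to face the camera (so the user's left is the camera's +x direction), and the function name and wording of the "backward" and "to the right" prompts are hypothetical.

```python
import math
from typing import List


def guidance_prompts(
    current_distance_m: float,
    current_angle_deg: float,
    target_distance_m: float,
    target_angle_deg: float,
    min_step_cm: int = 1,
) -> List[str]:
    """Build text prompts guiding the user from the current position to the target."""
    # Convert (distance, angle) pairs into x/z positions in the camera frame.
    cur_x = current_distance_m * math.sin(math.radians(current_angle_deg))
    cur_z = current_distance_m * math.cos(math.radians(current_angle_deg))
    tgt_x = target_distance_m * math.sin(math.radians(target_angle_deg))
    tgt_z = target_distance_m * math.cos(math.radians(target_angle_deg))

    dx_cm = round((tgt_x - cur_x) * 100)  # lateral move, in centimeters
    dz_cm = round((tgt_z - cur_z) * 100)  # move along the optical axis

    prompts: List[str] = []
    if abs(dz_cm) >= min_step_cm:
        # Negative dz means the user must move toward the camera.
        direction = "straight ahead" if dz_cm < 0 else "backward"
        prompts.append(f"Please walk {abs(dz_cm)}cm {direction}.")
    if abs(dx_cm) >= min_step_cm:
        # Camera +x corresponds to the left of a user facing the camera.
        side = "to the left" if dx_cm > 0 else "to the right"
        prompts.append(f"Please walk {abs(dx_cm)}cm {side}.")
    return prompts
```

An empty list means the user is already within one centimeter of the target position. The returned strings could equally be passed to a speech synthesizer or shown on the client's screen.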

[0144] In some embodiments, the image acquisition module 220 can also display the real-time image of the user captured by the camera acquisition device 160 on a display screen installed outside the door, and mark the target shooting area on the real-time image. The user can quickly adjust from the current position to the target shooting area by watching the real-time image marked with the target shooting area displayed on the display screen.

[0145] The basic concepts have been described above. Obviously, for those skilled in the art, the detailed disclosure above is merely illustrative and does not constitute a limitation of this specification. Although not explicitly stated herein, those skilled in the art may make various modifications, improvements, and corrections to this specification. Such modifications, improvements, and corrections are suggested in this specification and therefore remain within the spirit and scope of the exemplary embodiments described herein.

[0146] Furthermore, this specification uses specific terms to describe embodiments thereof. For example, "an embodiment," "one embodiment," and / or "some embodiments" refer to a particular feature, structure, or characteristic associated with at least one embodiment of this specification. Therefore, it should be emphasized and noted that references to "an embodiment," "one embodiment," or "an alternative embodiment" in different locations throughout this specification do not necessarily refer to the same embodiment. Moreover, certain features, structures, or characteristics in one or more embodiments of this specification can be appropriately combined.

[0147] Furthermore, unless expressly stated in the claims, the order of processing elements and sequences, the use of numbers and letters, or other names described in this specification are not intended to limit the order of the processes and methods described herein. Although various examples have been discussed in the foregoing disclosure of some embodiments of the invention that are currently considered useful, it should be understood that such details are for illustrative purposes only, and the appended claims are not limited to the disclosed embodiments; rather, the claims are intended to cover all modifications and equivalent combinations that conform to the spirit and scope of the embodiments described herein. For example, while the system components described above can be implemented using hardware devices, they can also be implemented solely using software solutions, such as installing the described system on existing servers or mobile devices.

[0148] Similarly, it should be noted that, in order to simplify the description disclosed herein and thus aid in the understanding of one or more embodiments of the invention, the foregoing description of embodiments in this specification may sometimes combine multiple features into a single embodiment, drawing, or description thereof. However, this method of disclosure does not imply that the subject matter of this specification requires more features than those mentioned in the claims. In fact, the features of an embodiment may be fewer than all the features of a single embodiment disclosed above.

[0149] In some embodiments, numbers describing the quantity of components and attributes are used. It should be understood that such numbers used in the description of embodiments are modified in some examples with the terms "about," "approximately," or "generally." Unless otherwise stated, "about," "approximately," or "generally" indicates that the numbers are allowed to vary by ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximate values, which may be changed depending on the characteristics required by individual embodiments. In some embodiments, numerical parameters should take into account the specified significant digits and employ a general method of digit retention. Although the numerical ranges and parameters used to define the breadth of their ranges in some embodiments of this specification are approximate values, in specific embodiments, such values are set as precisely as feasible.

[0150] The entire contents of each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, and documents, referenced in this specification are incorporated herein by reference. This excludes historical application documents that are inconsistent with or conflict with the content of this specification, as well as documents that limit the broadest scope of the claims in this specification (currently or subsequently appended to this specification). It should be noted that in the event of any inconsistency or conflict between the descriptions, definitions, and / or terminology used in the supplementary materials to this specification and the content of this specification, the descriptions, definitions, and / or terminology used in this specification shall prevail.

[0151] Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments described herein. Other variations may also fall within the scope of this specification. Therefore, alternative configurations of the embodiments described herein are intended to be illustrative rather than limiting, and should be considered consistent with the teachings of this specification. Accordingly, the embodiments described herein are not limited to those explicitly introduced and described herein.

Claims

1. A control method for intelligent security equipment based on face recognition, characterized in that the method includes: obtaining environmental information; obtaining a first image based on the environmental information, wherein the first image includes a first target face image; performing face recognition on the first target face image, and controlling the device to perform a first response operation based on the recognition result of the first target face image; if the first response operation is not the target operation, repeating the steps of obtaining a second image, which includes a second target face image, and determining the similarity between the first target face image and the second target face image, until the similarity meets a preset condition, wherein the target operation is the operation performed after determining that the first target face image is a legitimate face image, and the preset condition includes that the similarity between the first target face image and the second target face image is less than a preset similarity threshold; and performing face recognition on the second target face image whose similarity meets the preset condition, and controlling the device to perform a second response operation based on the recognition result of the second target face image.

2. The method according to claim 1, characterized in that acquiring the first image based on the environmental information includes: determining, based on the environmental information, whether there are living organisms within the detection area; and if it is determined that a living organism exists within the detection area, acquiring the first image.

3. The method according to claim 1, characterized in that the environmental information includes at least information related to determining whether there are obstacles in the detection area and / or information related to determining whether there are living beings in the detection area.

4. The method according to claim 1, characterized in that acquiring the first image based on the environmental information includes: determining, based on the environmental information, whether a human body is present in the detection area; and if a human body is present, acquiring the first image.

5. The method according to claim 4, characterized in that acquiring the first image includes: obtaining a pre-image based on the environmental information; determining the target shooting area based on the pre-image; and acquiring the first image based on the target shooting area.

6. The method according to claim 5, characterized in that determining the target shooting area based on the pre-image includes: determining human body feature information based on the pre-image; and determining the target shooting area based on the human body feature information.

7. The method according to claim 5, characterized in that acquiring the first image further includes: obtaining the current position of the human body based on the pre-image; and acquiring the first image based on the current position and the target shooting area.

8. The method according to claim 1, characterized in that controlling the device to perform the first response operation based on the first target face image includes: determining the legitimacy of the first target face image based on a set of legitimate faces, wherein the set of legitimate faces includes at least one legitimate face image; and controlling the device to perform the first response operation based on the legitimacy determination.

9. The method according to claim 1, characterized in that the first image includes at least one human face, and obtaining the first target face image from the first image includes: obtaining a face image of the at least one face in the first image; determining face state information of the at least one face based on the face image of the at least one face; and determining the first target face image from the face images of the at least one face based on the face state information of the at least one face.

10. A control system for intelligent security equipment based on face recognition, characterized in that the system includes: an information acquisition module used to acquire environmental information; an image acquisition module used to acquire a first image based on the environmental information, wherein the first image includes a first target face image; and an intelligent security equipment control module used to: perform face recognition on the first target face image and control the device to perform a first response operation based on the recognition result of the first target face image; if the first response operation is not the target operation, repeat the steps of acquiring a second image, which includes a second target face image, and determining the similarity between the first target face image and the second target face image, until the similarity meets a preset condition, wherein the target operation is the operation performed after determining that the first target face image is a legitimate face image, and the preset condition includes that the similarity between the first target face image and the second target face image is less than a preset similarity threshold; and perform face recognition on the second target face image whose similarity meets the preset condition, and control the device to perform a second response operation based on the recognition result of the second target face image.

11. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements the method according to any one of claims 1-9.

12. A computer-readable storage medium storing computer instructions, characterized in that after a computer reads the computer instructions from the storage medium, the computer executes the method according to any one of claims 1-9.