Face recognition method, electronic device, storage medium and program product
By introducing an active defense architecture based on instruction separation and behavioral counter-evidence into the face recognition system, exploiting the semantic opposition between the induced display system and the real command system, and combining this with the dual judgment of the arbitration system, the problem of low security in traditional face recognition is addressed, and the system's identity verification and intrusion detection capabilities are both improved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- INDUSTRIAL AND COMMERCIAL BANK OF CHINA
- Filing Date
- 2026-02-04
- Publication Date
- 2026-06-19
AI Technical Summary
Traditional facial recognition technology relies on liveness detection schemes that depend on a single hardware and software system, which are easily compromised by attackers, resulting in low security.
The system employs a proactive defense architecture that separates instructions and uses behavioral counter-evidence. It presents users with semantically opposite induced instructions and real instructions through a physically isolated induced display system and real command system, respectively. An arbitration system detects attack behavior in real time, enabling dual judgment of correct and incorrect behavior.
While completing identity verification, the system also simultaneously detects whether the system itself has been compromised, improving the overall robustness and proactive defense capabilities of the system and enhancing the security of facial recognition.
[Figure: abstract drawing, CN122244962A_ABST]
Description
Technical Field
[0001] This application relates to the field of artificial intelligence technology, and more particularly to a face recognition method, electronic device, storage medium, and program product.
Background Technology
[0002] Facial recognition technology, due to its convenience and contactless nature, has been widely used in fields with extremely high security requirements, such as financial payments and access control security. However, with the continuous evolution of attack methods, traditional face liveness detection solutions that rely on a single hardware and software system to complete instruction generation, presentation, collection, and comparison are facing increasingly severe security challenges.
[0003] In related technologies, the entire verification process is typically completed within an integrated system: the system displays random action commands on the screen, such as "blink" or "nod," and uses a camera on the same device to capture video of the user executing the command. Finally, an algorithm analyzes whether the actions in the video match the command to complete the liveness verification. Although this centralized architecture is simple to deploy, its command generation, display, acquisition, and judgment logic all run within the same trusted environment.
[0004] In the above process, the system has an extremely high single-point-of-failure risk. Once an attacker breaks through the single system by forging the interface, intruding into the backend, or hijacking the camera, they can completely control the instruction content and video data and thereby easily bypass liveness detection, resulting in low security of facial recognition.
Summary of the Invention
[0005] This application provides a face recognition method, electronic device, storage medium, and program product to address the issue of low security in face recognition technologies.
[0006] In a first aspect, this application provides a face recognition method, including:
[0007] In response to a trigger command, generating and displaying an induced action command through an induced display system and starting a video capture operation, and generating and displaying a real action command through a real command system, wherein the real action command and the induced action command are semantically contradictory;
[0008] In response to the user's execution of the real action command, acquiring a video stream containing the user's actions through the induced display system;
[0009] Obtaining, through an arbitration system, the video stream and information on the real action command, and performing authentication and intrusion detection to obtain an authentication result and an intrusion detection result, wherein authentication compares whether the user's actions in the video stream are consistent with the real action command, and intrusion detection determines whether the user's actions in the video stream are consistent with the induced action command;
[0010] Based on the authentication results and intrusion detection results, execute the corresponding security response actions.
[0011] In one possible implementation, before generating and displaying induced action commands via the induced display system in response to a trigger command, initiating video capture operations, and generating and displaying real action commands via the real command system, the method further includes:
[0012] In response to the user's physical button operation, the trigger system generates trigger commands and sends them to the induced display system and the real command system respectively.
[0013] In one possible implementation, obtaining the video stream and information on the real action command through the arbitration system, and performing authentication and intrusion detection to obtain the authentication result and the intrusion detection result, includes:
[0014] Extracting user facial motion feature sequences from video streams;
[0015] Semantic matching is performed between the facial action feature sequence and the real action command to calculate the first matching degree;
[0016] Determine whether the first matching degree is greater than or equal to the first preset threshold. If yes, the authentication result is determined to be successful; otherwise, the authentication result is determined to be unsuccessful.
[0017] Semantically match the facial action feature sequence with the induced action command, and calculate the second matching degree;
[0018] Determine whether the second matching degree is greater than or equal to the second preset threshold. If yes, determine that the intrusion detection result is to trigger an intrusion alarm; otherwise, determine that the intrusion detection result is not to trigger an intrusion alarm.
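The threshold steps above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: it assumes the facial action feature sequence has already been reduced to one action label per frame, and it uses the fraction of matching frames as a stand-in for the "matching degree", which the source does not define. The names `matching_degree`, `arbitrate`, and `ArbitrationResult`, and the default threshold value 0.8, are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ArbitrationResult:
    authenticated: bool    # first matching degree >= first preset threshold
    intrusion_alarm: bool  # second matching degree >= second preset threshold


def matching_degree(action_labels: Sequence[str], command: str) -> float:
    """Fraction of per-frame action labels equal to the command.

    Assumption: a stand-in metric; the source does not specify how
    semantic matching is scored.
    """
    if not action_labels:
        return 0.0
    hits = sum(1 for label in action_labels if label == command)
    return hits / len(action_labels)


def arbitrate(
    action_labels: Sequence[str],
    real_command: str,
    induced_command: str,
    first_threshold: float = 0.8,   # hypothetical value
    second_threshold: float = 0.8,  # hypothetical value
) -> ArbitrationResult:
    """Run both checks on the same label sequence."""
    first = matching_degree(action_labels, real_command)
    second = matching_degree(action_labels, induced_command)
    return ArbitrationResult(
        authenticated=first >= first_threshold,
        intrusion_alarm=second >= second_threshold,
    )
```

Because the two commands are semantically contradictory, a single label sequence cannot score highly against both, so a legitimate user and an attacker following the induced command fall into different result combinations.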
[0019] In one possible implementation, the method further includes, prior to extracting the user's facial motion feature sequence from the video stream:
[0020] The arbitration system obtains the induced action command and the corresponding first timestamp from the induced display system, and extracts the acquisition timestamp sequence from the video stream.
[0021] The arbitration system obtains the real action command and the corresponding second timestamp from the real command system.
[0022] Based on the first timestamp, the second timestamp, and the acquisition timestamp sequence, the video stream, the induced action command, and the real action command are aligned on a common time axis.
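One way to picture the time-axis alignment is to select the video frames captured after both commands were on screen. The sketch below is an assumption-laden illustration: it assumes all three systems report timestamps in seconds on a synchronized clock, that the acquisition timestamps are sorted in ascending order, and that a fixed analysis window follows the later of the two display times. `TimedCommand`, `align_frames`, and `window_s` are hypothetical names not found in the source.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TimedCommand:
    text: str
    shown_at: float  # seconds; clock assumed synchronized across systems


def align_frames(
    frame_timestamps: Sequence[float],  # acquisition timestamps, ascending
    induced: TimedCommand,              # carries the first timestamp
    real: TimedCommand,                 # carries the second timestamp
    window_s: float = 5.0,              # hypothetical analysis window
) -> Tuple[int, int]:
    """Return [start, end) frame indices covering the analysis window.

    The window opens once both commands are visible and closes
    window_s seconds later (inclusive of a frame exactly at the end).
    """
    start_time = max(induced.shown_at, real.shown_at)
    end_time = start_time + window_s
    start = bisect_left(frame_timestamps, start_time)
    end = bisect_right(frame_timestamps, end_time)
    return start, end
```

Only the frames in the returned index range would then be passed to feature extraction, so that actions performed before either command appeared are not scored.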
[0023] In one possible implementation, the real command system demonstrates its authority to the user through a physical anti-counterfeiting label attached to its display interface, and the label contains a unique identification code.
[0024] In one possible implementation, the induced action command is dynamically generated by the induced display system, and the induced action command is different from the real action command.
[0025] In one possible implementation, based on the authentication result and the intrusion detection result, corresponding security response operations are performed, including:
[0026] If the authentication result is successful and the intrusion detection result is that no intrusion alarm was triggered, the user is determined to be a legitimate user and authorized to perform subsequent operations.
[0027] If the authentication result is that the authentication failed and the intrusion detection result is that no intrusion alarm was triggered, then the authentication is deemed to have failed and authorization to perform subsequent operations is denied.
[0028] If the intrusion detection result is that an intrusion alarm was triggered, the user account or access permissions associated with the current verification session are frozen, and an alert message is sent that includes the intrusion alarm level and the identifier of the system suspected of being compromised.
[0029] In a second aspect, this application provides a face recognition device, comprising:
[0030] The trigger module is used to respond to the trigger command, generate and display the induced action command through the induced display system, and start the video acquisition operation, and generate and display the real action command through the real command system. The real action command and the induced action command have contradictory semantics.
[0031] The acquisition module is used to respond to the user's execution of the real action command by acquiring a video stream containing the user's actions through the induced display system;
[0032] The arbitration module is used to obtain information about video streams and real action commands through the arbitration system, perform authentication and intrusion detection, and obtain authentication results and intrusion detection results. Authentication is used to compare whether the user's actions in the video stream are consistent with the real action commands, and intrusion detection is used to determine whether the user's actions in the video stream are consistent with the induced action commands.
[0033] The execution module is used to perform corresponding security response operations based on the authentication results and intrusion detection results.
[0034] In one possible implementation, the triggering module is further configured to:
[0035] In response to the user's physical button operation, the trigger system generates trigger commands and sends them to the induced display system and the real command system respectively.
[0036] In one possible implementation, the arbitration module is specifically used for:
[0037] Extracting user facial motion feature sequences from video streams;
[0038] Semantic matching is performed between the facial action feature sequence and the real action command to calculate the first matching degree;
[0039] Determine whether the first matching degree is greater than or equal to the first preset threshold. If yes, the authentication result is determined to be successful; otherwise, the authentication result is determined to be unsuccessful.
[0040] Semantically match the facial action feature sequence with the induced action command, and calculate the second matching degree;
[0041] Determine whether the second matching degree is greater than or equal to the second preset threshold. If yes, determine that the intrusion detection result is to trigger an intrusion alarm; otherwise, determine that the intrusion detection result is not to trigger an intrusion alarm.
[0042] In one possible implementation, the arbitration module is also used for:
[0043] The arbitration system obtains the induced action command and the corresponding first timestamp from the induced display system, and extracts the acquisition timestamp sequence from the video stream.
[0044] The arbitration system obtains the real action command and the corresponding second timestamp from the real command system.
[0045] Based on the first timestamp, the second timestamp, and the acquisition timestamp sequence, the video stream, the induced action command, and the real action command are aligned on a common time axis.
[0046] In one possible implementation, the real command system demonstrates its authority to the user through a physical anti-counterfeiting label attached to its display interface, and the label contains a unique identification code.
[0047] In one possible implementation, the induced action command is dynamically generated by the induced display system, and the induced action command is different from the real action command.
[0048] In one possible implementation, the execution module is specifically used for:
[0049] If the authentication result is successful and the intrusion detection result is that no intrusion alarm was triggered, the user is determined to be a legitimate user and authorized to perform subsequent operations.
[0050] If the authentication result is that the authentication failed and the intrusion detection result is that no intrusion alarm was triggered, then the authentication is deemed to have failed and authorization to perform subsequent operations is denied.
[0051] If the intrusion detection result is that an intrusion alarm was triggered, the user account or access permissions associated with the current verification session are frozen, and an alert message is sent that includes the intrusion alarm level and the identifier of the system suspected of being compromised.
[0052] In a third aspect, this application provides an electronic device, including: a processor, and a memory communicatively connected to the processor;
[0053] The memory stores computer-executable instructions;
[0054] The processor executes computer-executable instructions stored in memory to implement any of the methods of the first aspect.
[0055] In a fourth aspect, this application provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement any of the methods of the first aspect.
[0056] In a fifth aspect, this application provides a computer program product, including a computer program that, when executed by a computer, implements any of the methods of the first aspect.
[0057] This application provides a face recognition method, electronic device, storage medium, and program product. In response to a trigger command, an induced action command is generated and displayed via an induced display system, and a video capture operation is initiated. A real action command, whose semantics contradict the induced action command, is also generated and displayed via a real command system. In response to the user's execution of the real action command, a video stream containing the user's actions is captured via the induced display system. The video stream and information on the real action command are obtained through an arbitration system, and authentication and intrusion detection are performed to obtain an authentication result and an intrusion detection result. Authentication compares the user's actions in the video stream with the real action command, while intrusion detection determines whether the user's actions in the video stream match the induced action command. Based on the authentication result and the intrusion detection result, the corresponding security response operations are executed. This upgrades the traditional single check for correct behavior to a dual judgment of both correct and incorrect behavior. While completing identity verification, the method simultaneously detects whether the system itself has been compromised, fundamentally improving the overall robustness and proactive defense capabilities of the system and enhancing the security of face recognition.
Description of the Drawings
[0058] The accompanying drawings, which are incorporated in and form part of this specification, illustrate embodiments consistent with this application and, together with the description, serve to explain the principles of this application.
[0059] Figure 1 is a schematic diagram of a face recognition architecture provided in an embodiment of this application;
[0060] Figure 2 is a first flowchart of a face recognition method provided in this application;
[0061] Figure 3 is a second flowchart of a face recognition method provided in this application;
[0062] Figure 4 is a schematic diagram of the structure of a face recognition device provided in this application;
[0063] Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application.
[0064] The accompanying drawings illustrate specific embodiments of this application, which will be described in more detail below. These drawings and descriptions are not intended to limit the scope of the concept in any way, but rather to illustrate the concept of this application to those skilled in the art through reference to particular embodiments.
Detailed Description
[0065] Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description relates to the drawings, unless otherwise indicated, the same numbers in different drawings denote the same or similar elements. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with this application. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this application as detailed in the appended claims.
[0066] It should be noted that the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data used for analysis, data stored, data displayed, etc.) involved in this application are all information and data authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of the relevant data all comply with the relevant laws, regulations, and standards of the relevant countries and regions, have taken necessary confidentiality measures, do not violate public order and good morals, and provide corresponding operation access points for users to choose to authorize or refuse.
[0067] Furthermore, where the technical solution of this application involves big data analysis of user information (including but not limited to personal biometrics, identity data, consumption data, asset data, electronic terminal operation data, etc.) and the use of artificial intelligence technology for automated decision-making, and decisions with a significant impact on personal rights are made based on the results of that automated decision-making, users are provided with corresponding operation entry points to accept or reject the results; if the user chooses to reject them, the process proceeds to an expert decision-making process.
[0068] It should be noted that the face recognition method, electronic device, storage medium and program product provided in this application can be used in the field of artificial intelligence, or in any field other than artificial intelligence. The application field of the face recognition method, electronic device, storage medium and program product in this application is not limited.
[0072] To address the aforementioned technical issues, this application provides a face recognition method that constructs an active defense architecture based on instruction separation and behavioral counter-evidence. Through a physically isolated induced display system and real command system, semantically opposite induced commands and real commands are presented to the user. The user only needs to execute the real commands, while the system uses the induced commands as traps to actively detect attack behavior. When an attacker intrudes into the induced display system and follows its commands, their behavior deviates from the real commands and is therefore detected and reported in real time by an independent arbitration system. This mechanism upgrades the traditional single verification of correct behavior to a dual judgment of both correct and incorrect behavior. While completing identity verification, it simultaneously detects whether the system itself has been intruded upon, thereby fundamentally improving the overall robustness and active defense capabilities of the system and enhancing the security of face recognition.
[0073] Below, the system architecture for face recognition is illustrated by example with reference to Figure 1.
[0074] Figure 1 is a schematic diagram of a face recognition architecture provided in an embodiment of this application. Referring to Figure 1, the architecture can include a trigger system, an induced display system, a real command system, and an arbitration system.
[0075] The trigger system serves as the initiation hub for the entire verification process and typically includes a physical button or another form of hardware trigger device. In response to a user action, such as pressing a button, the trigger system generates start signals (trigger commands) and sends them synchronously to both the induced display system and the real command system, ensuring that the two systems begin their respective processes nearly simultaneously.
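The fan-out performed by the trigger system can be sketched as below. This is a hedged illustration only: the source does not describe the trigger command's contents or transport, so the `TriggerCommand` fields (`session_id`, `issued_at`) and the callback-based delivery are assumptions made for the example.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class TriggerCommand:
    session_id: str   # hypothetical: ties both systems to one verification session
    issued_at: float  # hypothetical: time the button press was registered


def on_button_press(
    receivers: List[Callable[[TriggerCommand], None]],
    now: Callable[[], float] = time.time,
) -> TriggerCommand:
    """Create one trigger command and deliver the same command to every receiver.

    In the architecture described above, the receivers would be the
    induced display system and the real command system.
    """
    trigger = TriggerCommand(session_id=uuid.uuid4().hex, issued_at=now())
    for deliver in receivers:
        deliver(trigger)
    return trigger
```

Sending the identical command object to both systems gives the arbitration system a common session identifier to join on later, which is a design choice of this sketch rather than something the source states.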
[0076] The induced display system can be a physically standalone unit, typically comprising a display screen and a camera. Upon receiving a trigger command, the system dynamically generates an induced, deliberately misleading action command and immediately displays it on its integrated display screen. Simultaneously, the system activates the integrated camera to begin recording a video stream of the user facing the screen. This system does not handle verification; one of its core tasks is data acquisition.
[0077] The real command system can be a separate unit that is physically isolated from the induced display system. It typically consists of only a single display screen with no camera, and its display interface is affixed with a special physical anti-counterfeiting label. Upon receiving the same trigger command from the trigger system, this system generates a real action command that is semantically opposite to or different from the induced command and displays it on its dedicated screen. Through the physical label, this screen presents the user with an authoritative identifier, guiding the user to perform actions solely based on the commands on this screen. This system does not collect any data.
[0078] The arbitration system can be an independent back-end analysis and decision-making module. It independently accesses video stream data collected by the induced display system and real action command information obtained from the real command system via a network or dedicated communication link. The system performs dual-track parallel analysis: first, traditional authentication, analyzing whether the user's actions in the video match the real commands; second, innovative intrusion detection, analyzing whether the user's actions in the video abnormally match the induced commands. Based on the results of these two analyses, the arbitration system makes a final ruling and triggers the corresponding security response mechanism.
[0079] This architecture decouples functions such as command generation, command presentation, data collection, and security arbitration and distributes them across physically isolated systems, and it sets up adversarial logic between real and induced commands at the command level. The result is a high-security facial recognition framework that can not only verify user identity but also proactively detect whether the system itself has been compromised.
[0080] The technical solution of this application and how the technical solution of this application solves the above-mentioned technical problems are described in detail below with specific embodiments. These specific embodiments can be combined with each other, and the same or similar concepts or processes may not be described again in some embodiments. The embodiments of this application will now be described with reference to the accompanying drawings.
[0081] Figure 2 is a first flowchart of a face recognition method provided in this application. As shown in Figure 2, the method includes:
[0082] S201. In response to the trigger command, generate and display the induced action command through the induced display system, and start the video acquisition operation, and generate and display the real action command through the real command system.
[0083] The semantics of the real action command and the induced action command are contradictory.
[0084] The semantic contradiction between real and induced action commands is a key design feature of the proactive defense mechanism. It establishes different behavioral paths for normal users and potential attackers: normal users ignore the induced command and execute the real command, whereas an attacker who has hijacked the induced display system may drive a fake (spoofed) face to execute the induced command, thereby exposing the attack.
[0085] The induced action command is dynamically generated by the induced display system, and the induced action command is different from the real action command.
[0086] An induced display system can refer to a standalone computing device or module that integrates a display screen and a camera. It can be used to display induced commands and capture user video using its own camera, but the displayed commands are not necessarily the correct commands the user is expected to follow.
[0087] A real command system can refer to an independent computing device or module that is physically isolated from the induced display system and usually has only display functions.
[0088] The real command system displays on its screen the correct commands that the user should follow, i.e., the real action commands, and lets the user confirm its authenticity through a physical anti-counterfeiting label.
[0089] The real command system demonstrates its authority to users through the physical anti-counterfeiting label attached to its display interface. This label contains a unique identification code.
[0090] Semantic contradiction can refer to two action instructions that are opposite, different, or exclusive in terms of the expected human action.
[0091] Trigger commands can come from a variety of sources.
[0092] Optionally, before generating and displaying induced action instructions via the induced display system in response to the trigger command, initiating video capture operation, and generating and displaying real action instructions via the real command system, the method further includes:
[0093] In response to the user's physical button operation, the trigger system generates trigger commands and sends them to the induced display system and the real command system respectively.
[0094] Optionally, the trigger command can be initiated by the server at regular intervals or by the user clicking a virtual button on the front-end interface; there are no restrictions on this.
[0095] The induced display system is physically and logically isolated from the real command system, and the two operate independently.
[0096] In response to a trigger command, the induced display system can dynamically generate an induced action command locally and immediately display it on its built-in screen. Simultaneously, the system automatically activates the integrated camera to prepare for video capture. The real command system can generate, locally, a real action command that contradicts the semantics of the induced action command and display it on its dedicated screen. This screen typically carries a physical anti-counterfeiting label that is difficult to forge, guiding the user to trust and follow only this command.
[0097] For example, the induced action command is "Please turn your head to the left," while the real action command is "Please turn your head to the right."
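The left/right example can be generalized with a small lookup of opposite actions. The sketch below is illustrative only. The source does not say how two physically isolated systems agree on a contradictory pair, so this example simply computes both commands in one place; the `OPPOSITES` table and the function names are hypothetical.

```python
import random
from typing import Optional, Tuple

# Hypothetical table of mutually opposite head actions.
OPPOSITES = {
    "turn_head_left": "turn_head_right",
    "turn_head_right": "turn_head_left",
    "tilt_head_up": "tilt_head_down",
    "tilt_head_down": "tilt_head_up",
}


def is_contradictory(induced: str, real: str) -> bool:
    """True when the two commands are listed as opposites of each other."""
    return OPPOSITES.get(induced) == real


def make_command_pair(rng: Optional[random.Random] = None) -> Tuple[str, str]:
    """Pick an induced command at random and pair it with its opposite.

    Note: the random module is not suitable for security purposes; it is
    used here only so the example can be seeded and reproduced.
    """
    rng = rng or random.Random()
    induced = rng.choice(sorted(OPPOSITES))
    return induced, OPPOSITES[induced]
```

For instance, `is_contradictory("turn_head_left", "turn_head_right")` holds, matching the pair of prompts quoted above.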
[0098] S202. In response to the user's execution of a real action command, a video stream containing the user's actions is acquired through the induced display system.
[0099] The user views the screen of the real command system and performs the corresponding biometric action based on the real action command displayed there. During this time, the camera of the induced display system records continuously. Because the user faces both screens, the entire process of executing the real command is captured by the camera of the induced display system, forming a video stream that contains the user's live movements.
[0100] S203. Obtain information on video streams and real action commands through the arbitration system, perform authentication and intrusion detection, and obtain authentication results and intrusion detection results.
[0101] Authentication is used to compare whether user actions in the video stream are consistent with the real action command, while intrusion detection is used to determine whether user actions in the video stream are consistent with the induced action command.
[0102] An arbitration system can refer to a separate back-end service or module.
[0103] The arbitration system can be used to aggregate data from the induced display system and the real command system, and perform core logical analysis and security decisions.
[0104] Arbitration systems can be used for authentication and system intrusion detection.
[0105] The authentication result can refer to the conclusion of the arbitration system in determining whether a user is a legitimate user.
[0106] Intrusion detection results can refer to the arbitration system's conclusion that the system may have been compromised or manipulated by an attacker.
[0107] The arbitration system can acquire the video stream from the induced display system and the real action command information from the real command system via a secure communication channel. The arbitration system analyzes the video stream, extracts the characteristics of the user's actions, and compares the identified actions with the real action command obtained from the real command system. If they match, authentication succeeds; otherwise, authentication fails. This process completes the traditional liveness verification function. Simultaneously, the arbitration system determines whether the user's actions in the video stream are consistent with the induced action command. If the actions in the video closely match the induced command, the arbitration system determines that the induced display system may have been compromised and that an attacker is supplying fake video feedback based on the displayed induced command. In this case, the intrusion detection result is that an intrusion alarm is triggered; otherwise, no intrusion alarm is triggered.
[0108] S204. Based on the authentication result and intrusion detection result, execute the corresponding security response operation.
[0109] A mapping relationship can be obtained, and the corresponding security response operation is executed based on the mapping relationship, the authentication result, and the intrusion detection result.
[0110] Optionally, the corresponding security response operations can be performed based on the authentication result and the intrusion detection result as follows: if authentication succeeded and no intrusion alarm was triggered, the user is determined to be a legitimate user and is authorized to perform subsequent operations; if authentication failed and no intrusion alarm was triggered, the authentication is determined to have failed and authorization to perform subsequent operations is denied; if an intrusion alarm was triggered, the user account or access permissions associated with the current authentication session are frozen, and a warning message is sent, which includes the intrusion alarm level and the identifier of the suspected intruded system.
[0111] If the arbitration system determines that identity verification is successful and no intrusion alarm is triggered, it indicates that the system is currently secure and the user's identity has been successfully verified. In this case, the system determines that the current user is a legitimate user. The security response will authorize the user to perform subsequent preset operations.
[0112] If the arbitration system determines that authentication failed but no intrusion alarm was triggered, it indicates that the system itself has not been compromised, but the user failed to correctly complete the liveness verification. This situation is usually caused by user error, biometric mismatch, or simple impersonation attempts. The system classifies this as a normal authentication failure. The corresponding security response is to deny authorization, preventing the user from performing subsequent operations, and may return a standard failure message to the user interface, allowing the user to retry within a certain number of attempts.
[0113] If the arbitration system determines that an intrusion alarm has been triggered, then regardless of whether the "authentication" result is successful or unsuccessful, it is considered a high-risk event. This indicates that the system has a very high probability of having been attacked. At this point, the system will immediately initiate the highest level of security emergency response.
[0114] Security emergency response can include immediate freezing, proactive warning, initiating secondary authentication, and so on.
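The decision logic of S204 can be illustrated with a minimal sketch. The names `SecurityResponse` and `security_response` are hypothetical and chosen only for illustration; the sketch assumes the two results have already been reduced to booleans and does not represent the applicant's actual implementation.

```python
from enum import Enum


class SecurityResponse(Enum):
    """Hypothetical labels for the three outcomes described above."""
    AUTHORIZE = "authorize"
    DENY = "deny"
    FREEZE_AND_ALERT = "freeze_and_alert"


def security_response(authenticated: bool, intrusion_alarm: bool) -> SecurityResponse:
    """Map the two arbitration results to a security response.

    An intrusion alarm takes priority regardless of the authentication
    result, mirroring the high-risk handling described in the text.
    """
    if intrusion_alarm:
        return SecurityResponse.FREEZE_AND_ALERT
    if authenticated:
        return SecurityResponse.AUTHORIZE
    return SecurityResponse.DENY
```

Checking the alarm first means a successful liveness match can never override a suspected compromise of the induced display system.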
[0115] This embodiment provides a face recognition method that, in response to a trigger command, generates and displays induced action commands through an induced display system and initiates video capture operations, and generates and displays real action commands through a real command system, the real action commands being semantically contradictory to the induced action commands; in response to the user's execution of the real action commands, the induced display system captures a video stream containing the user's actions; an arbitration system obtains information from the video stream and the real action commands, performs identity verification and intrusion detection, and obtains identity verification results and intrusion detection results. Identity verification is used to compare whether the user's actions in the video stream are consistent with the real action commands, and intrusion detection is used to determine whether the user's actions in the video stream are consistent with the induced action commands; based on the identity verification results and intrusion detection results, corresponding security response operations are executed. This upgrades the traditional single check of correct behavior to a dual judgment of both correct and incorrect behavior, simultaneously detecting whether the system itself has been compromised while completing identity verification, thereby fundamentally improving the overall robustness and proactive defense capabilities of the system and enhancing the security of face recognition.
[0116] Below, in conjunction with Figure 3, the process of obtaining the video stream and the real action command information through the arbitration system, performing authentication and intrusion detection, and obtaining the authentication result and intrusion detection result is explained.
[0117] Figure 3 is a flowchart of a face recognition method provided in this application. As shown in Figure 3, this embodiment describes the face recognition method in detail on the basis of the embodiment of Figure 2, the method including:
[0118] S301. Extract the user's facial motion feature sequence from the video stream.
[0119] The user's face region can be located in each frame of the video and stably tracked across consecutive frames. For the tracked face region, dynamic features that can represent the action are extracted, and these time-varying features are combined to form a complete facial action feature sequence.
[0120] Extracting dynamic features can include extracting key point motion trajectories, local motion features, and appearance change features.
[0121] The key point motion trajectory extraction method can extract the coordinates of key facial points such as eyebrows, eyes, nose, mouth, and cheeks, and calculate the displacement, velocity, and acceleration sequences of these points over time.
[0122] Local motion features can be extracted by calculating the motion vector sequence of pixels in a specific region of a face using optical flow.
[0123] Appearance change features can be extracted by analyzing changes in the texture, shape, or area of a specific region over time.
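As an illustration of the key point motion trajectory extraction described above, the following minimal sketch derives displacement, velocity, and acceleration sequences for a single tracked key point. It assumes the key point coordinates and frame timestamps are already available from a face tracker; the function names and the use of simple finite differences are assumptions for illustration, not details given in this application.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def finite_difference(values: List[Point], timestamps: List[float]) -> List[Point]:
    """Rate of change between consecutive samples (one fewer than the input)."""
    rates = []
    for i in range(1, len(values)):
        dt = timestamps[i] - timestamps[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        rates.append(((values[i][0] - values[i - 1][0]) / dt,
                      (values[i][1] - values[i - 1][1]) / dt))
    return rates


def keypoint_kinematics(track: List[Point], timestamps: List[float]):
    """Return displacement, velocity, and acceleration sequences for one key point."""
    # Displacement is measured relative to the first observed position.
    displacement = [(x - track[0][0], y - track[0][1]) for x, y in track]
    velocity = finite_difference(track, timestamps)
    # Each velocity sample is assigned to the end of its frame interval.
    acceleration = finite_difference(velocity, timestamps[1:])
    return displacement, velocity, acceleration
```

Repeating this for each key point (eyebrows, eyes, nose, mouth, cheeks) would yield one possible form of the facial action feature sequence.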
[0124] Optionally, before extracting the user's facial motion feature sequence from the video stream, the method further includes time-axis alignment processing of the video stream, induced motion instructions, and real motion instructions: obtaining the induced motion instructions and corresponding first timestamps from the induced display system through an arbitration system, and extracting the acquisition timestamp sequence from the video stream; obtaining the real motion instructions and corresponding second timestamps from the real instruction system through an arbitration system; and performing time-axis alignment processing of the video stream, induced motion instructions, and real motion instructions based on the first timestamp, second timestamp, and acquisition timestamp sequence.
[0125] In a distributed system architecture, the induced display system, the real command system, and the arbitration system may have independent clock sources, and slight delays in command generation, display, video acquisition, and data transmission are unavoidable. Without alignment, the arbitration system may incorrectly compare a user's action at one moment in the video with a system command issued at another moment, leading to misjudgment.
[0126] The aforementioned timeline alignment process effectively solves the timing consistency problem in multi-system collaboration. It ensures that the "user actions" and "system instructions" compared by the arbitration system correspond strictly in time logic, significantly reducing the risk of false alarms or missed alarms caused by asynchrony between systems, thereby significantly improving the accuracy and robustness of the entire verification scheme in real complex network environments.
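One possible form of the time-axis alignment is sketched below. This application does not specify the alignment algorithm, so the sketch makes several assumptions: each system's clock offset relative to the arbitration system is already known (the hypothetical parameters `display_clock_offset` and `command_clock_offset`), and only frames captured within a fixed response window after both commands were displayed are retained (the hypothetical parameter `response_window`).

```python
from typing import List


def select_aligned_frames(
    first_timestamp: float,
    second_timestamp: float,
    acquisition_timestamps: List[float],
    display_clock_offset: float = 0.0,
    command_clock_offset: float = 0.0,
    response_window: float = 3.0,
) -> List[int]:
    """Return indices of frames captured after both commands were shown.

    Offsets convert each system's local clock to the arbitration clock.
    The video frames are stamped by the induced display system, so they
    share its offset.
    """
    induced_time = first_timestamp + display_clock_offset
    real_time = second_timestamp + command_clock_offset
    window_start = max(induced_time, real_time)
    window_end = window_start + response_window
    return [
        index
        for index, stamp in enumerate(acquisition_timestamps)
        if window_start <= stamp + display_clock_offset <= window_end
    ]
```

Frames outside the returned index range would be excluded from both comparisons, so that the user's action is only ever compared with commands that were actually on display at that moment.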
[0127] S302. Perform semantic matching between the facial action feature sequence and the real action command, and calculate the first matching degree.
[0128] After acquiring the facial motion feature sequence, the arbitration system can perform the core comparison operation for identity verification. The system parses the textual semantics of the real motion commands obtained from the real command system and transforms them into a pre-defined, standardized motion feature template. The algorithm then compares the actual extracted facial motion feature sequence with this pre-defined template.
[0129] This comparison is not a simple binary judgment, but rather an evaluation of the degree of consistency between the two through a specific similarity calculation model, outputting a quantitative numerical result, which is the first matching degree.
[0130] The higher the matching degree, the better the user's actions match the actual instructions.
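The text does not name the similarity calculation model. As one hedged example, the sketch below uses cosine similarity clamped to the range [0, 1], and assumes that both the extracted facial action feature sequence and the standardized motion feature template have been flattened into numeric vectors of equal length. The function name `matching_degree` is hypothetical.

```python
import math
from typing import Sequence


def matching_degree(features: Sequence[float], template: Sequence[float]) -> float:
    """Cosine similarity between features and template, clamped to [0, 1]."""
    if len(features) != len(template):
        raise ValueError("features and template must have the same length")
    dot = sum(a * b for a, b in zip(features, template))
    norm = (math.sqrt(sum(a * a for a in features))
            * math.sqrt(sum(b * b for b in template)))
    if norm == 0.0:
        # No motion in one of the vectors: treat as no match.
        return 0.0
    # Opposite motion (negative similarity) is reported as zero match.
    return max(0.0, dot / norm)
```

Under this assumption, a motion that is the exact opposite of the template (for example, turning left when the template encodes turning right) yields a matching degree of 0 rather than a negative value.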
[0131] S303. Determine whether the first matching degree is greater than or equal to the first preset threshold. If yes, determine that the authentication result is successful; otherwise, determine that the authentication result is unsuccessful.
[0132] The system can compare the calculated first matching degree with a predefined first preset threshold. This threshold is an empirical value set based on a large amount of experimental data, used to strike a balance between security and user experience. Its purpose is to distinguish between valid actions and invalid or erroneous actions. If the first matching degree reaches or exceeds the threshold, the user is deemed to have successfully responded to the genuine instruction, and the authentication result is considered successful. Conversely, if the matching degree is below the threshold, the authentication result is considered unsuccessful.
[0133] This threshold mechanism allows the system to tolerate a certain degree of action execution deviation or environmental interference, enhancing the robustness of the solution.
[0134] S304. Perform semantic matching between the facial action feature sequence and the induced action command, and calculate the second matching degree.
[0135] In parallel or sequentially, the arbitration system performs intrusion detection analysis. The system acquires induced action instructions and transforms their semantics into a feature template representing either an error or an induced action. Subsequently, the system performs the same similarity calculation between the same facial action feature sequence and this induced action template, obtaining a second quantified value, i.e., the second matching degree. This matching degree reflects the degree of consistency between the user's action and the induced instruction.
[0136] S305. Determine whether the second matching degree is greater than or equal to the second preset threshold. If yes, determine that the intrusion detection result is to trigger an intrusion alarm. If no, determine that the intrusion detection result is not to trigger an intrusion alarm.
[0137] The second matching degree can be compared with another independent second preset threshold, which may be set more sensitively or strictly to capture suspicious behaviors that clearly follow inducement instructions. If the second matching degree reaches or exceeds this threshold, it indicates that the user behavior in the video abnormally matches the inducement instructions. At this time, the system determines that the intrusion detection result is to trigger an intrusion alarm. If the second matching degree is lower than the threshold, it is considered that no such abnormal matching behavior has been found, and the intrusion detection result is that no intrusion alarm has been triggered.
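Steps S303 and S305 can be summarized in a minimal sketch. The threshold values 0.8 and 0.7 are placeholders chosen only for illustration (the text states only that they are empirical values), and the names `ArbitrationResult` and `arbitrate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArbitrationResult:
    authenticated: bool     # S303: first matching degree reached its threshold
    intrusion_alarm: bool   # S305: second matching degree reached its threshold


def arbitrate(
    first_matching_degree: float,
    second_matching_degree: float,
    first_threshold: float = 0.8,   # placeholder value, not from the source
    second_threshold: float = 0.7,  # placeholder value, not from the source
) -> ArbitrationResult:
    """Apply the two independent 'greater than or equal to' threshold checks."""
    return ArbitrationResult(
        authenticated=first_matching_degree >= first_threshold,
        intrusion_alarm=second_matching_degree >= second_threshold,
    )
```

For example, `arbitrate(0.2, 0.95)` reports a failed authentication together with an intrusion alarm, which corresponds to a video whose actions follow the induced command rather than the real one.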
[0138] For the implementation details of each step in this embodiment, refer to the description of the corresponding steps or operations in the above method embodiments; they are not repeated here.
[0139] This embodiment provides a face recognition method that extracts a user's facial action feature sequence from a video stream; performs semantic matching between the facial action feature sequence and the real action command to calculate a first matching degree; determines whether the first matching degree is greater than or equal to a first preset threshold, and if so, determines that authentication succeeded, and otherwise determines that authentication failed; performs semantic matching between the facial action feature sequence and the induced action command to calculate a second matching degree; and determines whether the second matching degree is greater than or equal to a second preset threshold, and if so, determines that the intrusion detection result is to trigger an intrusion alarm, and otherwise determines that the intrusion detection result is not to trigger an intrusion alarm.
[0140] Figure 4 is a schematic diagram of the structure of a face recognition device provided in this application. As shown in Figure 4, the face recognition device 400 provided in this embodiment includes a trigger module 401, an acquisition module 402, an arbitration module 403, and an execution module 404.
[0141] The trigger module 401 is used to, in response to a trigger command, generate and display an induced action command through the induced display system and start a video acquisition operation, and generate and display a real action command through the real command system, the real action command being semantically contradictory to the induced action command;
[0142] The acquisition module 402 is used to acquire, through the induced display system, a video stream containing the user's actions in response to the user's execution of the real action command;
[0143] The arbitration module 403 is used to obtain the video stream and the real action command information through the arbitration system, perform authentication and intrusion detection, and obtain an authentication result and an intrusion detection result. Authentication is used to compare whether the user's actions in the video stream are consistent with the real action command, and intrusion detection is used to determine whether the user's actions in the video stream are consistent with the induced action command;
[0144] Execution module 404 is used to perform corresponding security response operations based on the authentication result and the intrusion detection result.
[0145] In one possible implementation, the trigger module 401 is further configured to:
[0146] In response to the user's physical button operation, generate the trigger command through the trigger system and send it to the induced display system and the real command system respectively.
[0147] In one possible implementation, the arbitration module 403 is specifically used for:
[0148] Extracting user facial motion feature sequences from video streams;
[0149] Semantic matching is performed between the facial action feature sequence and the real action command to calculate the first matching degree;
[0150] Determine whether the first matching degree is greater than or equal to the first preset threshold. If yes, the authentication result is determined to be successful; otherwise, the authentication result is determined to be unsuccessful.
[0151] Semantically match the facial motion feature sequence with the induced motion command, and calculate the second matching degree;
[0152] Determine whether the second matching degree is greater than or equal to the second preset threshold. If yes, determine that the intrusion detection result is to trigger an intrusion alarm; otherwise, determine that the intrusion detection result is not to trigger an intrusion alarm.
[0153] In one possible implementation, the arbitration module 403 is also used for:
[0154] The arbitration system obtains the induced action instructions and corresponding first timestamps from the induced display system, and extracts the acquisition timestamp sequence from the video stream.
[0155] The arbitration system obtains the real action instructions and corresponding second timestamps from the real command system.
[0156] Based on the sequence of first timestamp, second timestamp, and acquisition timestamp, the video stream, induced action commands, and real action commands are time-axis aligned.
[0157] In one possible implementation, the real command system demonstrates its authority to the user through a physical anti-counterfeiting label attached to the display interface, and the label contains a unique identification code.
[0158] In one possible implementation, the induced action command is dynamically generated by the induced display system, and the induced action command is different from the real action command.
[0159] In one possible implementation, execution module 404 is specifically used for:
[0160] If the authentication result is successful and the intrusion detection result is that no intrusion alarm was triggered, the user is determined to be a legitimate user and authorized to perform subsequent operations.
[0161] If the authentication result is that the authentication failed and the intrusion detection result is that no intrusion alarm was triggered, then the authentication is deemed to have failed and authorization to perform subsequent operations is denied.
[0162] If the intrusion detection result is that an intrusion alarm has been triggered, the user account or access permissions associated with the current verification session are frozen, and an alert message is sent, which includes the intrusion alarm level and the identifier of the suspected intruded system.
[0163] The face recognition device provided in this embodiment can execute the method provided in the above method embodiment. Its implementation principle and technical effect are similar, and will not be described in detail here.
[0164] Figure 5 is a schematic diagram of the structure of an electronic device provided in an embodiment of this application. Referring to Figure 5, the electronic device 500 may include: memory 501, processor 502, and transceiver 503.
[0165] Memory 501 is used to store program instructions;
[0166] The processor 502 is used to execute the program instructions stored in the memory so that the electronic device 500 performs the above-described method.
[0167] Transceiver 503 may include a transmitter and/or a receiver. The transmitter may also be referred to as a sender, transmitting port, transmitting interface, or similar descriptions, and the receiver may also be referred to as a receiving port, receiving interface, or similar descriptions. Exemplarily, memory 501, processor 502, and transceiver 503 are interconnected via bus 504.
[0168] In the above embodiments, it should be understood that the processor can be a Central Processing Unit (CPU), or other general-purpose processors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), etc. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the method disclosed in this invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules within the processor.
[0169] The memory may include random access memory (RAM) and may also include non-volatile memory (NVM), such as at least one disk storage device.
[0170] The bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, etc. Buses can be categorized as address buses, data buses, control buses, etc. For ease of illustration, the buses shown in the accompanying drawings are not limited to a single bus or a single type of bus.
[0171] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that this application is not limited to the described order of actions, as some steps may be performed in other orders or simultaneously according to this application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily essential to this application.
[0172] It should be further noted that although the steps in the flowchart are shown sequentially according to the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they can be executed in other orders. Moreover, at least some steps in the flowchart may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential, but can be performed alternately or in turn with other steps or at least some of the sub-steps or stages of other steps.
[0173] It should be understood that the above-described device embodiments are merely illustrative, and the device of this application can also be implemented in other ways. For example, the division of units / modules in the above embodiments is only a logical functional division, and there may be other division methods in actual implementation. For example, multiple units, modules, or components may be combined, or integrated into another system, or some features may be ignored or not executed.
[0174] Furthermore, unless otherwise specified, the functional units / modules in the various embodiments of this application can be integrated into one unit / module, or each unit / module can exist physically separately, or two or more units / modules can be integrated together. The integrated units / modules described above can be implemented in hardware or as software program modules.
[0175] When integrated units / modules are implemented in hardware, the hardware can be digital circuits, analog circuits, etc. The physical implementation of the hardware structure includes, but is not limited to, transistors, memristors, etc. Unless otherwise specified, the processor can be any suitable hardware processor, such as a CPU, GPU, FPGA, DSP, and ASIC, etc. Unless otherwise specified, the storage unit can be any suitable magnetic or magneto-optical storage medium, such as Resistive Random Access Memory (RRAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Enhanced Dynamic Random Access Memory (EDRAM), High-Bandwidth Memory (HBM), Hybrid Memory Cube (HMC), etc.
[0176] If the integrated unit / module is implemented as a software program module and sold or used as an independent product, it can be stored in a computer-readable memory. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. This computer software product is stored in a memory and includes several instructions to cause a computer device (which may be a personal computer, server, or network device, etc.) to execute all or part of the steps of the methods of the various embodiments of this application. The aforementioned memory includes various media capable of storing program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), portable hard drive, magnetic disk, or optical disk.
[0177] In the above embodiments, the descriptions of each embodiment have their own emphasis. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments. The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described. However, as long as the combination of these technical features does not contradict each other, it should be considered within the scope of this specification.
[0178] Other embodiments of this application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of this application that follow the general principles of this application and include common knowledge or customary techniques in the art not disclosed herein. The specification and examples are to be considered exemplary only, and the true scope and spirit of this application are indicated by the following claims.
[0179] It should be understood that this application is not limited to the precise structure described above and shown in the accompanying drawings, and various modifications and changes can be made without departing from its scope. The scope of this application is limited only by the appended claims.
Claims
1. A face recognition method, characterized in that it includes: in response to a trigger command, generating and displaying an induced action command through an induced display system and initiating a video capture operation, and generating and displaying a real action command through a real command system, wherein the real action command is semantically contradictory to the induced action command; in response to the user's execution of the real action command, acquiring, through the induced display system, a video stream containing the user's actions; obtaining the video stream and the real action command information through an arbitration system, and performing authentication and intrusion detection to obtain an authentication result and an intrusion detection result, wherein the authentication is used to compare whether the user action in the video stream is consistent with the real action command, and the intrusion detection is used to determine whether the user action in the video stream is consistent with the induced action command; and based on the authentication result and the intrusion detection result, executing the corresponding security response operation.
2. The method according to claim 1, characterized in that, before generating and displaying the induced action command through the induced display system, initiating the video capture operation, and generating and displaying the real action command through the real command system in response to the trigger command, the method further includes: in response to the user's physical button operation, generating the trigger command through a trigger system and sending it to the induced display system and the real command system respectively.
3. The method according to claim 1, characterized in that obtaining the video stream and the real action command information through the arbitration system, and performing authentication and intrusion detection to obtain the authentication result and the intrusion detection result, includes: extracting the user's facial motion feature sequence from the video stream; semantically matching the facial motion feature sequence with the real action command, and calculating a first matching degree; determining whether the first matching degree is greater than or equal to a first preset threshold, and if so, determining that the authentication result is successful, and otherwise determining that the authentication result is unsuccessful; semantically matching the facial motion feature sequence with the induced action command, and calculating a second matching degree; and determining whether the second matching degree is greater than or equal to a second preset threshold, and if so, determining that the intrusion detection result is to trigger an intrusion alarm, and otherwise determining that the intrusion detection result is not to trigger an intrusion alarm.
4. The method according to claim 3, characterized in that, before extracting the user's facial motion feature sequence from the video stream, the method further includes: obtaining, through the arbitration system, the induced action command and a corresponding first timestamp from the induced display system, and extracting an acquisition timestamp sequence from the video stream; obtaining, through the arbitration system, the real action command and a corresponding second timestamp from the real command system; and performing time-axis alignment of the video stream, the induced action command, and the real action command based on the first timestamp, the second timestamp, and the acquisition timestamp sequence.
5. The method according to any one of claims 1-4, characterized in that the real command system demonstrates its authority to the user through a physical anti-counterfeiting label attached to the display interface, and the physical anti-counterfeiting label contains a unique identification code.
6. The method according to any one of claims 1-4, characterized in that the induced action command is dynamically generated by the induced display system, and the induced action command is different from the real action command.
7. The method according to any one of claims 1-4, characterized in that executing the corresponding security response operation based on the authentication result and the intrusion detection result includes: if the authentication result is successful and the intrusion detection result is that no intrusion alarm was triggered, determining that the user is a legitimate user and authorizing subsequent operations; if the authentication result is unsuccessful and the intrusion detection result is that no intrusion alarm was triggered, determining that the authentication has failed and denying authorization to perform subsequent operations; and if the intrusion detection result is that an intrusion alarm was triggered, freezing the user account or access permissions associated with the current verification session and sending a warning message, which includes the intrusion alarm level and the identifier of the suspected intruded system.
8. An electronic device, characterized in that it includes: a memory and a processor; the memory stores computer-executable instructions; and the processor executes the computer-executable instructions stored in the memory, causing the processor to perform the method according to any one of claims 1-7.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the method according to any one of claims 1-7.
10. A computer program product, characterized in that it includes a computer program that, when executed by a processor, implements the method according to any one of claims 1-7.