Device and method for biometric identification of meeting participants, meeting recording and summary generation
A multi-modal biometric system with facial and fingerprint recognition, along with encryption, addresses security and accuracy issues in meeting participant identification and documentation, ensuring secure and accurate meeting records.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- BIOMETRIC WITNESS SIA
- Filing Date
- 2025-12-15
- Publication Date
- 2026-06-18
AI Technical Summary
Existing meeting technologies lack robust security and accuracy in participant identification and documentation, leading to potential unauthorized access and inaccurate meeting records.
A multi-modal biometric system combining facial recognition and fingerprint scanning, with data encryption and secure storage, to authenticate participants and generate comprehensive, encrypted meeting records.
Enhances security and accuracy of participant identification, reduces false acceptance/rejection rates, and provides tamper-resistant, verifiable meeting documentation.
Abstract
Description
[0001] DEVICE AND METHOD FOR BIOMETRIC IDENTIFICATION OF MEETING PARTICIPANTS, MEETING RECORDING AND SUMMARY GENERATION
[0002] DESCRIPTION
[0003] Field of the invention
[0004]
[0001] The present invention relates to a portable meeting device with biometric identification of participants and its operating method.
[0005] Background of the invention
[0006]
[0002] US patent US 9064160 B2 discloses an arrangement and a corresponding method, the arrangement being configured to recognize a conference participant who is currently talking during a conference session. The arrangement comprises an identifying unit including a biometric detector adapted to capture at least one biometric characteristic of the participant and a comparison unit adapted to compare the biometric characteristic to stored biometric characteristics in a database, each stored characteristic being associated with an owner identity.
[0007]
[0003] US patent application US 2019190908 A1 discloses a system for managing access control of a meeting. The system may include a communication interface that receives video and audio of the meeting, a processor that executes instructions to generate a biometric characteristic for an attendee based on at least one of the video and the audio, and to associate identity information of the attendee with the biometric characteristic based on a comparison of the biometric characteristic with stored biometric characteristics of known users. The processor may also execute the instructions to generate a data stream that includes at least one of the video and the audio of the attendee, to tag the data stream with the identity information based on the associated biometric characteristic, and to selectively cause the data stream to be shown on a display based on selection of the tag.
[0008]
[0004] US patent US 11315366 B2 discloses a method of obtaining a multimedia file corresponding to a conference, wherein the multimedia file includes video data and audio data. Personal identity information of each person is identified according to the facial features and the voice features of each person. Once the audio data corresponding to each person is converted into text information, the posture language, the personal identity information, and the text information corresponding to each person are output.
[0005] The present invention provides an alternative improved way of meeting participation control and meeting overview generation.
[0009] Summary of the invention
[0010]
[0006] The present invention generally refers to a device and method for authorizing, initializing, conducting, and recording meetings, using biometrics technology for participant identification. It is an object of the present invention to provide an advanced meeting device and an associated method designed to improve the security, accuracy, and accountability of meetings by authenticating participants through their biometric data. It is hereby clarified that participants of the meeting are users of the meeting device described in the present invention.
[0011]
[0007] The meeting device described in the present application includes: a camera on each side of the device, a microphone on each side of the device, a central processing unit (CPU), a fingerprint scanner, an RFID reader, a memory card and a USB-C port. The device is capable of advanced biometric authentication that combines the results of facial recognition and fingerprint scanning to securely verify participants’ identities. The device allows for automated generation of detailed meeting minutes, derived by machine intelligence from recorded audio and video data, thereby capturing essential contributions and decisions made during the meeting.
[0012]
[0008] A method described in the present application includes: capturing image information by at least one said camera and biometric data by the fingerprint scanner about the participant joining the meeting; identifying or verifying the participant's identity by comparing the retrieved information with that stored in the database; and verifying the results in order to grant or deny access to the meeting. The method further includes a user-initiated meeting documentation step, in which the activated cameras and microphones of the meeting device capture video and audio of the meeting. The data collected are stored in the device's memory and subsequently used to create an overview of the meeting. Furthermore, the video captured by the cameras is stitched into a seamless panoramic view of the meeting, and the audio received from the microphones is transcribed and summarized by machine intelligence into meeting minutes.
[0013]
[0009] The device according to the present invention employs a multi-modal biometric recognition system combining facial recognition and fingerprint recognition to authenticate meeting participants. The comparison is carried out by a multi-modal biometric fusion algorithm, which combines matching scores obtained from facial recognition and from fingerprint recognition according to predefined weighting coefficients and matching thresholds. Only when both biometric modalities match the corresponding reference templates in accordance with these thresholds is the person authenticated as a meeting participant and granted access to the meeting. The resulting authenticated participant identifier is stored and used throughout the meeting to tag all subsequent speech segments and events associated with that participant, thereby providing a persistent and verifiable link between biometric authentication and the recorded content.
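By way of illustration only, the fusion rule described above can be sketched as follows. This is a minimal sketch and not the implementation of the invention: the weights, the threshold values, the assumption that matching scores are normalised to the range 0 to 1, and the names FusionConfig, fuse_scores and authenticate are all hypothetical. The sketch assumes one possible reading of the paragraph, in which each modality must reach its own matching threshold and the weighted sum must additionally reach a fused threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionConfig:
    # All values below are illustrative assumptions, not values from the disclosure.
    face_weight: float = 0.5
    finger_weight: float = 0.5
    face_threshold: float = 0.6
    finger_threshold: float = 0.6
    fused_threshold: float = 0.7


def fuse_scores(face_score: float, finger_score: float,
                cfg: FusionConfig = FusionConfig()) -> float:
    """Weighted sum of the two matching scores (scores assumed to lie in [0, 1])."""
    return cfg.face_weight * face_score + cfg.finger_weight * finger_score


def authenticate(face_score: float, finger_score: float,
                 cfg: FusionConfig = FusionConfig()) -> bool:
    """Accept only if both modalities match and the fused score is high enough."""
    both_match = (face_score >= cfg.face_threshold
                  and finger_score >= cfg.finger_threshold)
    return both_match and fuse_scores(face_score, finger_score, cfg) >= cfg.fused_threshold
```

With these assumed values, a strong face match cannot compensate for a failed fingerprint match, which reflects the requirement that both modalities match their reference templates.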
[0014]
[0010] Data encryption is fundamental for safeguarding sensitive information processed and stored by the meeting support device. After multi-modal biometric verification and during the meeting, the device records audio and video streams and associates time segments of these streams with identifiers of authenticated participants. Before storage, the device encrypts a data structure that comprises the recorded audio, the recorded video and metadata linking each time segment and each entry in the meeting minutes to the corresponding authenticated participant identifier. In this way, not only are the media files encrypted, but also the logical association between the spoken contributions, their transcription in the meeting minutes and the biometric identity of the speaker is cryptographically protected. Secure key management, for example using a hardware security module or a trusted key store, ensures that only authorized entities can decrypt and access the bound audio-visual and identity information.
[0015]
[0011] The verification process implemented in the device leverages the advantages of multimodal biometrics over single-modality systems. By jointly analysing facial features and fingerprint patterns of a participant, the device achieves a lower false acceptance rate and a lower false rejection rate than systems relying solely on face recognition or solely on fingerprint recognition. Furthermore, because the same authenticated participant identifier is reused for labelling each recorded speech segment and each item in the generated meeting minutes, attempts to impersonate another participant or to modify the attribution of statements after the meeting are technically hindered. The combination of multi-modal verification, continuous audio-visual recording and cryptographically protected identity tagging therefore provides an enhanced level of security and trustworthiness for applications requiring precise identity verification and accountable meeting documentation.
[0016]
[0012] As one of the essential features of the present invention, the meeting support device automatically documents the course of the meeting in the form of meeting minutes and associated audio-visual records. The synchronised audio and video streams are analysed to detect when each authenticated participant is speaking, and the spoken content is transcribed into text. The device then generates a time-ordered sequence of entries, each entry comprising a transcribed spoken contribution, a time stamp, and the identifier of the authenticated participant to whom the contribution is attributed. The meeting minutes are stored together with references to the corresponding encrypted audio-visual segments so that each textual entry can be verified against the underlying recorded media and the biometric authentication event of the speaker. This identity-annotated and media-linked documentation significantly improves the accuracy, integrity and comprehensiveness of the meeting record compared to conventional systems that lack a secure binding between participants, identities, their contributions and the recorded evidence.
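The structure of such a time-ordered, identity-annotated record may be illustrated by the following sketch. It is given by way of example only: the field names, the MinutesEntry type, the helper functions and the form of the media reference (a plain string such as a segment identifier) are assumptions made for illustration, and speaker detection, transcription and encryption are outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class MinutesEntry:
    start_s: float        # time stamp, in seconds from the start of the meeting
    participant_id: str   # identifier granted at biometric authentication
    text: str             # transcribed spoken contribution
    media_ref: str        # hypothetical reference to the encrypted media segment


def build_minutes(segments: Iterable[Tuple[float, str, str, str]]) -> List[MinutesEntry]:
    """Turn (start_s, participant_id, text, media_ref) tuples into time-ordered entries."""
    entries = [MinutesEntry(*segment) for segment in segments]
    return sorted(entries, key=lambda entry: entry.start_s)


def format_minutes(entries: Iterable[MinutesEntry]) -> str:
    """Render the entries as plain-text minutes, one line per contribution."""
    lines = []
    for entry in entries:
        minutes, seconds = divmod(int(entry.start_s), 60)
        lines.append(
            f"[{minutes:02d}:{seconds:02d}] {entry.participant_id}: "
            f"{entry.text} (media: {entry.media_ref})"
        )
    return "\n".join(lines)
```

Keeping the media reference in every entry is what allows each textual line to be checked later against the underlying recorded segment.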
[0017] Brief description of the drawings
[0018]
[0013] The drawings illustrate generally, by way of the example, but not by way of limitation, various embodiments of the invention.
[0019]
[0014] Fig. 1 is a process flowchart of the meeting device.
[0020]
[0015] Fig. 2 is a connectivity framework of the meeting device.
[0021]
[0016] Fig. 3 is an operational overview of the meeting device.
[0022]
[0017] Fig. 4 is a flowchart of the participant verification method step implemented by the device.
[0018] Fig. 5 is a flowchart of the meeting documentation method step implemented by the device.
[0023] Detailed description of the invention
[0024]
[0019] The present invention provides a meeting device and an associated method of operating it. By utilizing advanced biometric authentication, such as facial recognition and fingerprint scanning, the invention ensures that only authorized individuals gain access to confidential discussions and sensitive information, thereby mitigating the risk of unauthorized attendance. In conjunction with video and audio capture, the device and the associated computer-implemented method create a comprehensive and accurate record of the meeting, including contributions and interactions from all participants. This approach improves the efficiency of meetings. Through automatic recording and minute generation, the device enhances transparency, providing an accurate record that can be referenced for follow-up actions and accountability. This reliable framework allows users to maintain a consistent meeting history, aiding decision-making and reinforcing security standards. Ultimately, the combination of biometric identification, simultaneous capture of several data types, and intelligent summarization makes meetings more efficient, secure, and productive.
[0020] The meeting device can include, by way of example and not limitation, a number of different components, including meeting software and portable meeting hardware. The present invention encompasses meeting hardware designed to facilitate the execution of meetings in accordance with the features described herein. This hardware may include, but is not limited to, a CPU and at least one integrated wide-angle lens camera with a frame rate of at least 60 frames per second (FPS) for smooth video recording, positioned on each side of the device, which can be selectively activated or deactivated based on specific requirements for video capture and facial recognition, or as otherwise needed. Additionally, the meeting hardware incorporates at least one built-in omnidirectional microphone array with a sensitivity of at least about -40 dBV/Pa for audio capture, and a fingerprint scanner. The fingerprint scanner has a resolution of at least 500 DPI. The disclosed device operates on a rechargeable battery, enabling autonomous operation, and records data to an SDXC memory card of class 10 or higher with UHS-I or UHS-II support. To enhance the security of the information disclosed during meetings, robust encryption is provided for all data, ensuring its secure storage and transmission. Additionally, the device supports integration with multi-standard RFID readers supporting a wide range of RFID standards, such as HF and UHF frequencies, to enhance access control capabilities, and includes a USB-C port for charging and data transfer, providing versatility and ease of use in various operational environments. The USB-C port is directly connected to a power management system and the CPU of the meeting device. This combination of features ensures that the device is not only user-oriented but also prioritizes security and reliability in data management.
[0021] The interaction of the components can be divided into five main steps, as seen in the process flowchart of Fig. 1. The steps are as follows:
[0025] Initialization: the system powers on and initiates a self-check to ensure all components are operational. During this step, the CPU engages all connected peripherals, including said cameras, said microphones, the fingerprint scanner, and RFID reader, preparing them for subsequent data capture.
[0026] Data capture: once initialization is complete, said cameras and said omnidirectional microphones commence capturing video and audio data from the meeting environment. This ensures that all participants are clearly visible and audible. Meanwhile, the fingerprint scanner and RFID reader remain in standby mode, ready to activate when participant identification is required.
[0027] Data processing: the captured audio and video data is sent to the CPU for processing. The CPU employs algorithms to stitch the images captured by the cameras, creating a comprehensive panoramic view. Simultaneously, biometric processing occurs, utilizing facial recognition and fingerprint scanning to authenticate participants. This multi-faceted processing ensures accurate representation and verification of all attendees.
Data storage: following said data processing, the system encrypts the data to protect sensitive information during storage and transmission. The encrypted data is then securely stored on the memory card, facilitating efficient retrieval while maintaining compliance with data privacy regulations.
[0028] User interaction: finally, users can access the stored data through web or mobile interfaces, allowing for easy review and management of meeting content. This feature enhances user experience by providing flexibility in accessing meeting minutes, video recordings, and audio files, all from a convenient platform.
[0029]
[0022] The hardware of the meeting device operates in conjunction with the software of the meeting device to enable various meeting-related functions, such as participant identification, initiation of meetings, audio and video recording, and subsequent generation of meeting minutes. Moreover, the system allows footage of the meeting to be generated irrespective of the positioning of the participants during the meeting, contributing to a more dynamic and efficient post-meeting experience.
[0030]
[0023] The present invention discloses the meeting device that integrates multiple functionalities through a connectivity framework as seen in Fig. 2 accordingly.
[0031]
[0024] The components of the device are interconnected as described below. The device comprises four wide-angle lens cameras, each positioned on a different side of the unit. The cameras are interfaced with the CPU via high-speed serial connections, utilizing protocols such as MIPI (Mobile Industry Processor Interface) or USB. Each camera can be individually activated or deactivated as required by the meeting. Four omnidirectional microphones are integrated into the device to capture audio from all directions. Each microphone is connected to the CPU through either digital or analog interfaces, specifically I2S (Inter-IC Sound) or PDM (Pulse Density Modulation). A fingerprint scanner is incorporated into the device, connecting to the CPU via a serial interface such as UART, I2C or SPI. The multi-standard RFID reader, connected to the CPU via a serial connection (UART or I2C), allows for efficient access management and streamlines the identification of participants using RFID tags. The device is equipped with an SDXC memory card slot that interfaces with the CPU via the SD or SDIO interface. This permits the storage of various data types, including recorded audio, video, and meeting documentation, facilitating easy retrieval and management of meeting records. A USB-C port serves as a dual-purpose connection point, linking to both the power management system and the CPU. This port supports data transfer and charging, providing versatility for connectivity with other devices.
[0032]
[0025] Fig. 3 is an operational overview of the meeting device that illustrates the comprehensive functionality of the same, highlighting its capabilities in data capture, processing, storage, and user interaction.
[0026] Flowcharts shown in Figs. 4-5 depict exemplary methods of operation for various embodiments of the present invention. To improve clarity, these flowcharts highlight certain steps while omitting others that would be evident to those skilled in the art. As such, the flowcharts should not be interpreted as mandating all the illustrated steps or excluding any unillustrated steps. Additionally, the sequence of steps presented is not fixed, as many steps can be performed independently of one another.
[0033]
[0027] Fig. 4 illustrates a method of identity verification having a first step of initiation of identity verification by a user. The meeting device initializes the camera in front of each participant, which captures an image of the participant’s face and processes features such as the distance between the eyes, nose and mouth, and other unique facial attributes. At the same time, the system initializes the fingerprint scanner to scan the individual's fingerprint. Data obtained by both the camera and the fingerprint scanner are sent to the system. The system then compares the obtained data with pre-stored biometric characteristics of the participant. The CPU retrieves the pre-stored matching biometric data from the memory of the device or from a database. For the captured data, both the facial recognition score and the fingerprint matching score are evaluated, and the system combines these two scores to make a final determination as to whether the person is authentic. The multi-modal biometric fusion algorithm involves techniques such as weighted summation and machine learning models. The results from facial and fingerprint matching are assigned weights (importance), and the weighted scores from both modalities are summed. If the combined score, obtained by comparison with the pre-stored biometric data, exceeds a certain threshold, the identity is verified. The machine learning models determine how the scores from the different modalities are to be combined for a precise result. This process increases the reliability of identity verification, reducing the chance of false positives (incorrectly verifying an individual) and false negatives (incorrectly denying access). The facial recognition technology employs Convolutional Neural Networks (CNNs), with models such as FaceNet chosen for their accuracy under the variable lighting and facial angles frequently encountered in meeting settings.
For fingerprint identification, minutiae-based algorithms are used to identify key features, such as ridge endings and bifurcations, to create a precise match between the captured fingerprint and stored templates. The fingerprint scanners use UART, I2C, or SPI to interface with software libraries, such as Neurotechnology’s VeriFinger SDK, which provide tools for accurate, fast, and secure fingerprint identification. Together, these modalities provide a robust and reliable access control system, implemented using OpenCV with the TensorFlow or PyTorch frameworks for facial recognition and the VeriFinger SDK for fingerprint matching. To implement multi-modal biometric fusion, the BioAPI and OpenBR software platforms and frameworks are used to support the different authentication modalities and facilitate the combination of biometric data. Data encryption is fundamental for safeguarding sensitive biometric information during both processing and storage. The encryption process uses AES 256-bit encryption, a robust standard that protects against unauthorized access. Key management follows stringent protocols, with options to integrate Hardware Security Modules (HSM) to securely generate, store, and manage encryption keys. For further security, Public Key Infrastructure (PKI) standards are implemented to protect key exchange and digital signature processes, ensuring that only authorized devices and personnel can access the stored data. This multilayered security framework provides comprehensive data protection, ensuring confidentiality and compliance with data privacy regulations. After the step of data comparison, the system either verifies the identity of the participant or marks the participant as unrecognized; in the latter case, explicit consent from the other verified participants must be received before that participant is allowed to attend the meeting.
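The final access decision described above, including the consent step for an unrecognized participant, may be sketched as follows. The sketch is illustrative only and rests on assumptions not stated in the description: consent is taken to be required from every already-verified participant (unanimity), an unrecognized person is denied when no verified participant is present to consent, and the names Decision and decide_admission are hypothetical.

```python
from enum import Enum
from typing import Mapping


class Decision(Enum):
    ADMITTED = "admitted"
    ADMITTED_BY_CONSENT = "admitted_by_consent"
    DENIED = "denied"


def decide_admission(is_verified: bool, consents: Mapping[str, bool]) -> Decision:
    """Decide access for one person.

    is_verified: outcome of the biometric comparison step.
    consents: verified participant identifier -> explicit consent given or not.
    """
    if is_verified:
        return Decision.ADMITTED
    # Assumption: unanimous consent of at least one verified participant is required.
    if consents and all(consents.values()):
        return Decision.ADMITTED_BY_CONSENT
    return Decision.DENIED
```

Recording the consent-based admission as a separate outcome keeps it distinguishable from biometric verification in the meeting record.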
[0034]
[0028] Fig. 5 is a flowchart of yet another example procedure of the disclosed method, for obtaining event documentation and minutes of the meeting. The procedure has a first step of initiation of the event documentation by a user. The system then activates four cameras and four omnidirectional microphones and records video and audio. The recorded video and audio data are sent to the system and stored in the memory of the meeting device. The stored video and audio data are further processed by the system to obtain 360-degree video with synchronized audio. The 360-degree image is obtained by a stitching process that aligns and blends overlapping areas from multiple video frames captured simultaneously by the device’s cameras, creating a continuous 360-degree view. The stitching algorithms, including SIFT (Scale-Invariant Feature Transform) and SURF (Speeded-Up Robust Features), detect distinctive key points within each frame and match them across images for precise alignment. After alignment, multi-band blending techniques are applied to eliminate visible seams, resulting in a seamless panoramic view suitable for fully capturing meetings. This process is optimized for closed environments such as conference rooms, where overlapping visuals are essential for context. It is particularly useful for environments requiring 360-degree visual coverage, such as meeting room devices with multiple cameras. The process may be implemented using, but is not limited to, libraries such as OpenCV, which includes dedicated stitching modules. These modules provide the necessary tools for feature detection, image alignment, and blending; alternatively, specialized software solutions can be utilized for more complex stitching requirements. This technology enhances the visual continuity of multi-camera systems, offering comprehensive and unified video output for applications such as meeting recordings and surveillance.
[0035]
[0029] The present invention includes precise audio synchronization with captured video, ensuring cohesive alignment of multimedia data. The input for this synchronization process consists of timestamped audio data from multiple microphones and corresponding video frames. Digital Signal Processing (DSP) algorithms process these timestamps, aligning the audio and video to eliminate any discrepancies. This synchronization improves clarity and coherence, enabling accurate playback and documentation of recorded content. Technologies such as beamforming for directional focus and noise reduction methods such as Wiener filtering further enhance audio quality, isolating speaker voices from background noise for a more polished output. The technology that supports audio synchronization includes DSP processors and software libraries such as PortAudio or PyAudio, which provide the necessary tools for real-time audio processing and synchronization. By ensuring that audio and video are seamlessly aligned, the above-mentioned feature greatly enhances the user experience in applications such as video conferencing, event recording, and multimedia production, leading to more engaging and professional-quality content.
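Timestamp-based alignment of this kind may be sketched as follows. The sketch is a simplified illustration and not the DSP implementation of the device: it assumes that audio chunks and video frames carry timestamps in seconds on a common clock and that the audio timestamps are sorted in ascending order, and it pairs each video frame with the nearest audio chunk. The function names and the max_skew_s tolerance are hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional, Sequence, Tuple


def nearest_audio_index(audio_ts: Sequence[float], frame_t: float) -> int:
    """Index of the audio chunk whose timestamp is closest to frame_t.

    audio_ts must be non-empty and sorted in ascending order.
    """
    if not audio_ts:
        raise ValueError("audio_ts must not be empty")
    i = bisect_left(audio_ts, frame_t)
    if i == 0:
        return 0
    if i == len(audio_ts):
        return len(audio_ts) - 1
    before, after = audio_ts[i - 1], audio_ts[i]
    # Prefer the earlier chunk when the frame lies exactly between two chunks.
    return i if (after - frame_t) < (frame_t - before) else i - 1


def align(frame_ts: Sequence[float], audio_ts: Sequence[float],
          max_skew_s: float = 0.02) -> List[Tuple[int, Optional[int]]]:
    """Pair each frame index with an audio chunk index, or None if the skew is too large."""
    pairs: List[Tuple[int, Optional[int]]] = []
    for frame_index, frame_t in enumerate(frame_ts):
        audio_index = nearest_audio_index(audio_ts, frame_t)
        if abs(audio_ts[audio_index] - frame_t) <= max_skew_s:
            pairs.append((frame_index, audio_index))
        else:
            pairs.append((frame_index, None))
    return pairs
```

Frames that have no audio chunk within the tolerance are reported explicitly rather than silently paired, so that gaps in the recording remain visible in the documentation.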
[0036]
[0030] Speech transcription to written text is considered one of the essential features of the present invention. The software on the meeting room device enables meeting minutes to be generated and sent to the attendees. The meeting minutes include a timeline of events that took place during the meeting and, additionally, the content discussed throughout the meeting. In operation, when a meeting begins, the meeting room device, through its meeting software, automatically monitors and records events occurring during the meeting, generating a timeline of these events. A geotag may also be added. The generation of meeting minutes is closely linked to the processes of image stitching and audio synchronization. Video images captured from the four cameras are stitched together to create a cohesive visual representation of the meeting, which is then synchronized with audio data collected from the four microphones. Using artificial intelligence, the system generates the meeting minutes by identifying standard introductory and concluding phrases, which helps to produce a coherent and organized summary of the discussions held during the meeting. This combined approach of synchronized audio and video enhances the accuracy and comprehensiveness of the meeting documentation.
[0037]
[0031] In particular, the integration of multi-modal biometric identification of all meeting participants with 360-degree audio-visual recording and automatic generation of meeting minutes achieves a synergistic technical effect. Each participant is first authenticated in a reliable manner by means of both facial recognition and fingerprint scanning, after which recorded spoken contributions are securely linked to the corresponding authenticated identity in encrypted storage. As a result, misattribution of spoken content is avoided and false acceptance and false rejection rates are reduced in comparison with systems based on a single biometric modality or on audio-visual recording that is not identity-linked, thereby providing a technically verifiable and tamper-resistant record of the meeting.
[0032] While the invention may be susceptible to various modifications and alternative forms, specific embodiments of which have been shown by way of example in the figures and have been described in detail herein, it should be understood that the invention is not intended to be limited to the particular forms disclosed. Rather, the invention includes all modifications, equivalents, and alternatives falling within the scope of the invention as defined by the following claims.
Claims
1. A meeting device comprising: a body of the meeting device, at least one microphone on each side of the meeting device adapted to capture audio of meeting participants, at least one camera on each side of the meeting device adapted to capture video of the meeting participants, wherein each camera is deactivated when not in use, a fingerprint scanner arranged on one side of the meeting device, an RFID reader, a memory slot and a memory card, a central processing unit (CPU) in communication with said at least one microphone on each side of the meeting device, at least one camera on each side of the meeting device, fingerprint scanner, RFID reader and the memory slot and the card, wherein the CPU being configured to execute the computer-executable instructions to generate a visual biometric characteristic of the participant based on the image of a video stream captured; generate a biometric characteristic of the participant based on a fingerprint pattern captured by the fingerprint scanner; retrieve pre-stored matching biometric data from a memory of the device and send to the CPU; perform multi-modal biometric fusion of said generated biometric characteristics; compare the obtained biometric data with a pre-stored participant’s identity; verify identity of the participant or mark the participant as unrecognised; grant an identifier to an authenticated participant; based on the verification, allow or deny the participant an access to the meeting, wherein in case an access to the meeting is denied the CPU is being configured to request a permission for access from verified participants of the meeting, wherein the CPU executes the computer-executable instructions to detect key points in images or video frames of the video stream captured from each of said cameras; align said images or video frames and blend transitions between said images or video frames to obtain panoramic video stream of the meeting; synchronize audio captured by each of said microphones based on timing of video frames of said video captured; wherein the CPU executes the computer-executable instructions to monitor and record events; generate a timeline of the same; generate minutes of the meeting by converting the audio captured into text information, wherein the text is generated using a machine intelligence, extracting key information of the meeting, wherein the CPU is configured to store the identifier of the authenticated participant and to use said identifier during the meeting to associate recorded speech and time segments with the corresponding participant, and wherein the CPU encrypts a data structure comprising the recorded images or video frames and metadata, linking each speech and time segments to the corresponding authenticated participant identifier, wherein an access to the information of the minutes is participant identifier dependent.
2. A method of identification of meeting participants, meeting recording and meeting summary generation, comprising: generating a visual biometric characteristic of the participant based on the image of a video stream captured; generating a biometric characteristic of the participant based on a fingerprint pattern captured by the fingerprint scanner; retrieving pre-stored matching biometric data from a memory of the device and sending to the CPU; performing multi-modal biometric fusion of said generated biometric characteristics; comparing the obtained data with a pre-stored participant’s identity; verifying identity of the participant or marking the participant as unrecognised; allowing or denying the participant an access to the meeting, wherein in case an access to the meeting is denied the CPU is being configured to request a permission for access from verified participants of the meeting; wherein the method further comprises detecting key points in images or video frames of the video stream captured from each of said cameras; aligning said images or video frames and blending transitions between said images or video frames to obtain panoramic video stream of the meeting; synchronizing audio captured by each of said microphones based on timing of video frames of said video captured; wherein the method further comprises monitoring and recording events; generating a timeline of the same; generating minutes of the meeting by converting the audio captured into text information, wherein the text is generated using a machine intelligence, extracting key information of the meeting, wherein the CPU stores the identifier of the authenticated participant and uses said identifier during the meeting to associate recorded speech and time segments with the corresponding participant, and wherein the CPU encrypts a data structure comprising the recorded images or video frames and metadata, linking each speech and time segments to the corresponding authenticated participant identifier, wherein an access to the information of the minutes is participant identifier dependent.