A method, system and device for detecting sound leakage of a secure conference room and a storage medium
By acquiring image data and acoustic spectrum data, combining them with acoustic imaging data, marking the locations of sound sources, and performing noise reduction, the method addresses sound leakage outside secure conference rooms, enabling effective monitoring of outdoor sound conditions and assessment of leakage during confidential meetings.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- BEIJING TIANDAQINGYUAN COMM TECH
- Filing Date
- 2023-05-24
- Publication Date
- 2026-06-26
AI Technical Summary
How to monitor sound conditions within a limited area outside a secure meeting room, and how to address the issue that while secure meeting rooms are designed to minimize sound leakage during construction, leakage may still occur during actual meetings.
By acquiring image data and field of view data, combined with acoustic spectrum data and acoustic imaging data, and based on preset matching rules and image analysis rules, the location of the sound source is marked, and the target audio data is obtained through inverse transformation of the acoustic spectrum transformation model, thereby realizing the monitoring of the outdoor sound situation of the confidential meeting.
It enables effective monitoring of sound conditions within a limited area outside a confidential meeting room, and obtains clear audio data after noise reduction processing to determine whether there is any leakage.
Figure CN116559287B_ABST
Abstract
Description
Technical Field
[0001] This application relates to the technical field of sound leakage detection, and in particular to a method, system, device and storage medium for sound leakage detection in a secure conference room.
Background Technology
[0002] Confidential conference rooms are venues for holding secret meetings, effectively protecting against sound, light, and electromagnetic interference generated during the meeting. A secure conference room includes a shielding enclosure, ventilation, power distribution, fire protection, interior design, and integration of security equipment. Although various methods are employed during construction to minimize sound leakage, leaks can still occur during actual meetings. These leaks can be unintentional, such as sound transmission through the air or solids, which unauthorized personnel may unintentionally or intentionally overhear; or intentional, where eavesdropping devices are installed to actively transmit sound information for eavesdropping. Common sources of leakage include doors, windows, walls, floors, air conditioning ducts, ventilation ducts, heating pipes, water pipes, and vents. Doors, windows, and pipes are considered the weakest points.
[0003] However, sound propagation is limited to a certain range. How to monitor the sound situation in the limited area outside the confidential conference room and make a judgment on sound leakage in the conference room is a problem that needs to be solved.
[0004] The existing technical solutions mentioned above have the following drawback: they do not address how to monitor the sound situation in a limited area outside a confidential meeting room.
Summary of the Invention
[0005] To address the issue of monitoring sound conditions within a limited area outside a secure meeting room, this application provides a method, system, device, and storage medium for detecting sound leakage in a secure meeting room.
[0006] In a first aspect of this application, a method for detecting sound leakage in a secure conference room is provided. The method includes:
[0007] Acquire image data and corresponding field-of-view data, wherein the field-of-view data represents the shooting range of the image data;
[0008] Based on the image data, the corresponding sound data is retrieved. The sound data includes acoustic spectrum data and acoustic imaging data. The acoustic spectrum data is the acoustic spectrum diagram of the sound, and the acoustic imaging data can show the image of the distribution location of the sound source.
[0009] Based on preset matching rules, the location of the sound source within the field of view range data is matched according to the field of view range data and the acoustic imaging data, and the target sound source is marked in the acoustic imaging data.
[0010] Based on preset image analysis rules and the acoustic imaging data, the acoustic spectrum data is adjusted to obtain the target acoustic spectrum data;
[0011] Based on a preset acoustic spectrum transformation model, the target acoustic spectrum data is inversely transformed to obtain target audio data corresponding to the image data.
[0012] As described above, the technical solution involves acquiring image data, viewing angle data, and sound data. The sound data includes acoustic spectrum data and acoustic imaging data. Based on preset matching rules, the sound source locations within the viewing angle data are matched according to the acoustic imaging data in the sound data, and the target sound source is marked in the acoustic imaging data. The acoustic spectrum data is adjusted according to preset image analysis rules and the acoustic imaging data to obtain target acoustic spectrum data. Based on a preset acoustic spectrum transformation model, an inverse transformation is performed on the target acoustic spectrum data to obtain target audio data corresponding to the image data. Noise-related data is obtained by comparing the sound source locations and viewing angle data in the acoustic imaging data with the acoustic spectrum data. Then, the acoustic spectrum data is denoised, and the denoised target acoustic spectrum data is inversely transformed to obtain the target audio data. This process obtains the sound conditions within a limited area of the secure conference room, enabling monitoring of the sound conditions within a limited area outside the secure conference room.
[0013] In one possible implementation, the image data includes an image start time and an image end time; the sound data includes an audio start time and an audio end time.
[0014] In one possible implementation, retrieving the corresponding sound data based on the image data includes:
[0015] Based on the image start time and the image end time, the corresponding audio data is retrieved. The audio start time of the audio data is the same as the image start time, and the audio end time of the audio data is the same as the image end time.
[0016] In one possible implementation, the step of matching the sound source location within the viewpoint range data based on the preset matching rules, according to the viewpoint range data and the acoustic imaging data, and marking the target sound source in the acoustic imaging data, includes:
[0017] The acoustic imaging data is an image containing the locations of multiple sound sources and the changes in the corresponding sounds.
[0018] Sequentially determine whether the locations of multiple sound sources in the acoustic imaging data are within the range corresponding to the viewing angle range data;
[0019] If so, then mark the location of the sound source as a target sound source.
[0020] In one possible implementation, the method further includes: marking the sound source location as a noise source when the sound source location in the acoustic imaging data is not within the range corresponding to the viewing angle range data.
[0021] In one possible implementation, adjusting the acoustic spectrum data according to preset image analysis rules and the acoustic imaging data to obtain target acoustic spectrum data includes:
[0022] Obtain the noise data corresponding to the noise source markers in the acoustic imaging data;
[0023] Based on the noise data, the acoustic spectrum data is subjected to noise reduction processing to obtain the target acoustic spectrum data.
[0024] In one possible implementation, the step of performing an inverse transformation on the target acoustic spectrum data based on a preset acoustic spectrum transformation model to obtain target audio data corresponding to the image data includes:
[0025] Acquire historical sound data, which includes historical spectrogram data and historical audio data, and there is a corresponding relationship between the historical spectrogram data and the historical audio data;
[0026] The historical sound data is input into a preset machine learning model to obtain the acoustic spectrum transformation model;
[0027] The target acoustic spectrum data is input into the acoustic spectrum transformation model to obtain the target audio data.
[0028] In a second aspect of this application, a sound leakage detection system for a secure conference room is provided. The system includes:
[0029] The data acquisition module is used to acquire image data and the viewing angle range data corresponding to the image data, wherein the viewing angle range data represents the shooting range of the image data;
[0030] The data retrieval module is used to retrieve corresponding sound data based on the image data. The sound data includes acoustic spectrum data and acoustic imaging data. The acoustic spectrum data is an acoustic spectrum diagram of the sound, and the acoustic imaging data can show an image of the distribution location of the sound source.
[0031] The data analysis module is used to match the location of the sound source within the field of view data based on the preset matching rules, according to the field of view data and the acoustic imaging data, and to mark the target sound source in the acoustic imaging data.
[0032] The noise reduction processing module is used to adjust the acoustic spectrum data according to the preset image analysis rules and the acoustic imaging data to obtain the target acoustic spectrum data;
[0033] The data processing module is used to perform an inverse transformation on the target acoustic spectrum data based on a preset acoustic spectrum transformation model to obtain target audio data corresponding to the image data.
[0034] In a third aspect of this application, an electronic device is provided. The electronic device includes a memory and a processor, wherein the memory stores a computer program, and the processor executes the program to implement the method described above.
[0035] In a fourth aspect of this application, a computer-readable storage medium is provided having a computer program stored thereon that, when executed by a processor, implements the method according to the first aspect of this application.
[0036] In summary, this application includes at least one of the following beneficial technical effects:
[0037] By comparing the sound source location and viewing angle data in the acoustic imaging data and combining them with the acoustic spectrum data, the noise-related data is obtained. Then, the acoustic spectrum data is denoised, and the denoised target acoustic spectrum data is inversely transformed to obtain the target audio data, thus realizing the monitoring of the sound situation in a limited area outside the confidential conference room.
Attached Figure Description
[0038] Figure 1 is a flowchart illustrating the sound leakage detection method for the confidential conference room provided in this application.
[0039] Figure 2 is a schematic diagram of the sound leakage detection system for the confidential conference room provided in this application.
[0040] Figure 3 is a schematic diagram of the structure of the electronic device provided in this application.
[0041] In the diagrams, 200 is the sound leakage detection system for the secure conference room; 201 is the data acquisition module; 202 is the data retrieval module; 203 is the data analysis module; 204 is the noise reduction processing module; 205 is the data processing module; 301 is the CPU; 302 is the ROM; 303 is the RAM; 304 is the I/O interface; 305 is the input section; 306 is the output section; 307 is the storage section; 308 is the communication section; 309 is the drive; and 310 is the removable medium.
Detailed Implementation
[0042] To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0043] Furthermore, the term "and/or" in this document merely describes an association between related objects and indicates that three relationships can exist. For example, A and/or B can represent: A existing alone, A and B existing simultaneously, or B existing alone. Additionally, the character "/" in this document, unless otherwise specified, generally indicates an "or" relationship between the preceding and following related objects.
[0044] The embodiments of this application will now be described in further detail with reference to the accompanying drawings.
[0045] This application provides a method for detecting sound leakage in a secure conference room. The main process of the method is described below.
[0046] As shown in Figure 1:
[0047] Step S101: Obtain image data and the corresponding view range data.
[0048] Specifically, image data includes image start time and image end time. This image data refers to the video content captured by monitoring equipment or other devices with audio and video recording capabilities. For a segment of video content, there are a start time and an end time, which are the aforementioned image start time and image end time. The aforementioned viewpoint range data represents the shooting range of the image data. It can be understood that the actual space captured by the device is three-dimensional, but the image data obtained after the device captures the image is composed of multiple two-dimensional images. Therefore, establishing a planar coordinate system is beneficial for describing the viewpoint range data. Using the monitoring equipment as the origin, a planar coordinate system is established on a plane passing through the origin and parallel to the ground. The shooting range of the monitoring equipment, i.e., the viewpoint range data, is represented on this planar coordinate system. For example, the viewpoint range data is 2≤x≤5 and 2≤y≤5. After the planar coordinate system is established, the content that can be monitored in three-dimensional space is projected along a direction perpendicular to the plane containing the planar coordinate system. The position projected onto the planar coordinate system is the coordinate position corresponding to that content. The aforementioned planar coordinate system can also be established by capturing a frame of a two-dimensional image and establishing a planar coordinate system on that two-dimensional image. The above-mentioned planar coordinate system is established to facilitate the description of the shooting range, i.e., the viewing angle range data, of the monitoring equipment. Other methods can also be used to describe the viewing angle range data, and no restrictions are imposed here.
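The planar representation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the names `ViewRange` and `project_to_plane` are hypothetical, the shooting range is taken to be an axis-aligned rectangle as in the 2≤x≤5, 2≤y≤5 example, and the plane is the ground-parallel plane through the monitoring equipment, so projection simply drops the height coordinate.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ViewRange:
    """Axis-aligned shooting range on the ground-parallel plane (hypothetical type)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        # Inclusive bounds, matching the 2<=x<=5 and 2<=y<=5 example.
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def project_to_plane(point_3d: Tuple[float, float, float]) -> Tuple[float, float]:
    """Project a 3D point perpendicular to the plane by dropping its height z."""
    x, y, _z = point_3d
    return (x, y)


view = ViewRange(2, 5, 2, 5)
inside = view.contains(*project_to_plane((3.0, 4.0, 1.7)))
outside = view.contains(*project_to_plane((6.0, 4.0, 0.5)))
print(inside, outside)  # True False
```

A non-rectangular shooting range (for example a camera's wedge-shaped field of view) would need a different containment test; the text leaves the representation open.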
[0049] The aforementioned image data includes all image content within a limited area outside the secure conference room. This limited area is the range within which sound leaking from the secure conference room can be heard; beyond that range the sound is no longer audible. The limited area is set based on the actual conditions of the secure conference room. It can be understood that image monitoring has a limited viewing angle, whereas sound acquisition is omnidirectional, so it also captures sound from outside the image frame. This can affect the judgment of the sound corresponding to the image data. Therefore, further processing of the sound data corresponding to the image data is necessary.
[0050] Step S102: Retrieve the corresponding sound data based on the image data.
[0051] Specifically, the sound data represents the data related to the audio content corresponding to the aforementioned image data. The sound data includes the audio start time and audio end time. This means that an audio segment has a start time and an end time for recording. The sound data also includes spectrogram data and acoustic imaging data. The spectrogram data is the spectrogram of the sound, corresponding to the audio content. The acoustic imaging data shows the distribution of sound sources. This acoustic imaging data can be obtained using an acoustic imaging device. It is based on microphone array measurement technology. By measuring the phase difference of the sound waves reaching each microphone within a certain space, the location of the sound source is determined according to the phased array principle. The amplitude of the sound source is measured, and the spatial distribution of the sound source is displayed as an image, i.e., a spatial sound field distribution cloud map—an acoustic image—is obtained. The color and brightness of the image represent intensity. In other words, the acoustic imaging data is an image including the location of the sound sources, the amplitude of the sound sources, and the intensity of the sound source amplitude. The acoustic imaging data contains multiple sound source locations and the changes in the corresponding sounds.
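To illustrate how an arrival-time (phase) difference between microphones translates into a source direction, the following sketch uses only two microphones and a far-field assumption. It is a deliberate simplification, not the acoustic imaging device described above, which uses a full microphone array. The function names, the 0.2 m spacing, the 48 kHz sample rate, and the speed of sound of 343 m/s are all assumptions made for the example.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def estimate_delay(sig_a: np.ndarray, sig_b: np.ndarray, sample_rate: int) -> float:
    """Delay of sig_b relative to sig_a in seconds (positive when sig_b lags)."""
    corr = np.correlate(sig_b, sig_a, mode="full")
    # In "full" mode, index len(sig_a) - 1 corresponds to zero lag.
    lag = int(np.argmax(corr)) - (len(sig_a) - 1)
    return lag / sample_rate


def arrival_angle_deg(delay_s: float, mic_spacing_m: float) -> float:
    """Far-field angle from broadside: sin(theta) = c * delay / spacing."""
    ratio = np.clip(SPEED_OF_SOUND * delay_s / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


# Synthetic test signal: microphone B receives the same noise 14 samples later.
rng = np.random.default_rng(0)
sample_rate = 48_000
source = rng.standard_normal(2048)
shift = 14
mic_a = source
mic_b = np.concatenate([np.zeros(shift), source[:-shift]])

delay = estimate_delay(mic_a, mic_b, sample_rate)
angle = arrival_angle_deg(delay, mic_spacing_m=0.2)
print(f"delay = {delay * 1e6:.1f} us, angle = {angle:.1f} deg")
```

With more microphones, the same delay estimates feed a beamformer that scans candidate positions and produces the intensity map the text calls an acoustic image.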
[0052] Based on the image start time and the image end time, the corresponding audio data is retrieved. That is, when the audio start time of the audio data is the same as the image start time and the audio end time of the audio data is the same as the image end time, it indicates that there is a correspondence between the image data and the audio data.
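The time-based retrieval can be sketched as a lookup over stored recordings. The `Recording` type, the `find_matching_audio` function, and the file names are hypothetical; the sketch assumes exact equality of start and end times, as stated above, whereas a real recorder might need a small tolerance.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class Recording:
    """A stored video or audio segment with its recording interval (hypothetical type)."""
    name: str
    start: datetime
    end: datetime


def find_matching_audio(image: Recording, audio_store: Sequence[Recording]) -> Optional[Recording]:
    """Return the audio segment whose start and end times equal those of the image data."""
    for audio in audio_store:
        if audio.start == image.start and audio.end == image.end:
            return audio
    return None


video = Recording("cam1.mp4", datetime(2023, 5, 24, 9, 0), datetime(2023, 5, 24, 9, 30))
store = [
    Recording("a1.wav", datetime(2023, 5, 24, 8, 0), datetime(2023, 5, 24, 8, 30)),
    Recording("a2.wav", datetime(2023, 5, 24, 9, 0), datetime(2023, 5, 24, 9, 30)),
]
match = find_matching_audio(video, store)
print(match.name if match else "no matching audio")  # a2.wav
```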
[0053] Understandably, for a monitoring device with storage and recording functions, after recording sound and images, they will be stored in the corresponding storage space, and the sound and image data will be retrieved from that storage space.
[0054] Step S103: Based on the preset matching rules, match the sound source location within the view range data according to the view range data and the acoustic imaging data, and mark the target sound source in the acoustic imaging data.
[0055] Specifically, it is determined sequentially whether the locations of multiple sound sources in the aforementioned acoustic imaging data are within the range corresponding to the aforementioned viewpoint range data; if so, the sound source locations are marked as target sound sources. If the sound source locations in the aforementioned acoustic imaging data are not within the range corresponding to the aforementioned viewpoint range data, the sound source locations are marked as noise source locations.
[0056] It can be understood that the acoustic imaging data is a two-dimensional image, and a planar coordinate system corresponding to that of the image data needs to be established on it. The corresponding planar coordinate system can be established as follows: the monitoring device emits a sound clearly distinguishable from everyday environmental sounds, so that the position of the monitoring device can be located in the acoustic imaging data and used to establish the coordinate system; or the origin of the coordinate system can be determined from the relative position between the monitoring device and a device (or sound source) that emits a fixed sound in the monitored environment. Alternatively, the acoustic imaging data can be analyzed manually to establish the corresponding planar coordinate system. Other methods for establishing the planar coordinate system can also be used, and no restrictions are imposed here.
[0057] After the planar coordinate system is established, each sound source location has corresponding coordinates. A given sound source is not necessarily a single point; it may be a region, so a sound source location may correspond to a coordinate point or a coordinate range. When the coordinate point falls within the range corresponding to the viewpoint range data, or the coordinate range overlaps with that range, the sound source location is marked as a target sound source. When the coordinate point is outside the range corresponding to the viewpoint range data, or the coordinate range does not overlap with it, the sound source location is marked as a noise source.
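The marking rule for points and regions can be sketched as a single overlap test. The `Box` type and the `point` and `label_sources` helpers are hypothetical names, and both the shooting range and the source regions are assumed to be axis-aligned rectangles; a point source is treated as a rectangle of zero size, so containment and overlap reduce to the same check.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in the planar coordinate system (hypothetical type)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float

    def overlaps(self, other: "Box") -> bool:
        # Two rectangles overlap when their intervals intersect on both axes.
        return (self.x_min <= other.x_max and other.x_min <= self.x_max
                and self.y_min <= other.y_max and other.y_min <= self.y_max)


def point(x: float, y: float) -> Box:
    """A point source represented as a zero-size rectangle."""
    return Box(x, x, y, y)


def label_sources(view: Box, sources: Dict[str, Box]) -> Dict[str, str]:
    """Mark each sound source as 'target' (inside or overlapping the view) or 'noise'."""
    return {name: ("target" if view.overlaps(region) else "noise")
            for name, region in sources.items()}


view = Box(2, 5, 2, 5)
labels = label_sources(view, {
    "door": point(3, 4),              # point inside the view range
    "corridor": Box(4.5, 6, 1, 2.5),  # region partly overlapping the view range
    "street": point(8, 1),            # point outside the view range
    "hvac": Box(6, 7, 6, 7),          # region with no overlap
})
print(labels)
```

The example source names and coordinates are invented for illustration only.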
[0058] Step S104: Adjust the acoustic spectrum data according to the preset image analysis rules and acoustic imaging data to obtain the target acoustic spectrum data.
[0059] Specifically, noise data corresponding to the noise source markers in the aforementioned acoustic imaging data is obtained. This noise data includes the amplitude variation of the sound source. Based on the noise data, the acoustic spectrum data undergoes noise reduction processing to obtain the target acoustic spectrum data.
[0060] It is understandable that acoustic imaging data represents the relevant information of a sound source at a certain moment. Acoustic spectrum data includes time, frequency, and amplitude. After obtaining the time corresponding to the amplitude change of the noise in the acoustic imaging data, the frequency of the noise data with the corresponding amplitude change can be obtained from the acoustic spectrum data, thus yielding the noise acoustic spectrum data. The process of denoising the corresponding acoustic spectrum data based on the noise acoustic spectrum data is well-known to those skilled in the art and will not be elaborated here. Denoising methods include spectral subtraction, statistical real-time denoising, subspace algorithms, and machine learning-based denoising. Spectral subtraction further includes nonlinear spectral subtraction, multi-band spectral subtraction, MMSE spectral subtraction algorithm, extended spectral subtraction, adaptive gain averaging spectral subtraction, and selective spectral subtraction. Different denoising methods can be selected according to the specific monitoring environment, or the user can specify the denoising method.
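As one concrete instance of the denoising methods listed above, basic magnitude spectral subtraction can be sketched as follows. The text does not say which method is used, so this is only an illustration. It assumes that the acoustic imaging data has already identified the time frames in which only the noise source is active (`noise_frames`), and that the noise is roughly stationary so its average spectrum can be subtracted from every frame. The function name and the `floor` parameter are hypothetical.

```python
import numpy as np


def spectral_subtraction(magnitude: np.ndarray, noise_frames, floor: float = 0.0) -> np.ndarray:
    """Subtract an average noise spectrum from a magnitude spectrogram.

    magnitude:    array of shape (n_freq_bins, n_time_frames), non-negative.
    noise_frames: indices of frames assumed to contain only noise.
    floor:        fraction of the original magnitude kept as a lower bound.
    """
    noise_profile = magnitude[:, noise_frames].mean(axis=1, keepdims=True)
    cleaned = magnitude - noise_profile
    # Clamp so that magnitudes never become negative after subtraction.
    return np.maximum(cleaned, floor * magnitude)


# Toy spectrogram: steady noise in bin 1, a target sound in bin 2 during frames 3 to 5.
spec = np.zeros((4, 6))
spec[1, :] = 1.0
spec[2, 3:] = 3.0

target_spec = spectral_subtraction(spec, noise_frames=[0, 1, 2])
print(target_spec)
```

In the toy input the noise bin is removed entirely while the target bin is left untouched; with real recordings, noise and target overlap in frequency and the result is only an approximation.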
[0061] Step S105: Based on the preset acoustic spectrum transformation model, perform inverse transformation on the target acoustic spectrum data to obtain the target audio data corresponding to the image data.
[0062] Specifically, historical sound data is acquired, including historical spectrogram data and historical audio data. There is a correspondence between the historical spectrogram data and historical audio data; that is, for a piece of audio content, a spectrogram corresponding to the audio content can be obtained. This historical data is input into a preset machine learning model to obtain a spectrogram transformation model. The target spectrogram data is then input into the spectrogram transformation model to obtain the target audio data. The training process of the aforementioned machine learning model is well-known to those skilled in the art and will not be elaborated upon here. It is understood that after denoising the aforementioned spectrogram data, the target spectrogram data is obtained. The target audio data obtained from the target spectrogram data is the denoised audio content. Based on the correspondence between sound data and image data, the target audio data and image data are associated.
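The idea of learning the spectrum-to-audio mapping from historical pairs can be illustrated with a toy example. The text does not specify the machine learning model, so a plain least-squares linear fit is used here purely as a stand-in. The sketch assumes that the historical spectra are per-frame FFT coefficients (real and imaginary parts) of the historical audio; under that assumption the mapping is linear and a linear fit can recover it. All names, the frame length, and the synthetic data are hypothetical.

```python
import numpy as np

FRAME_LEN = 16  # assumed frame length for the toy example


def spectrum_features(frames: np.ndarray) -> np.ndarray:
    """Per-frame FFT coefficients, with real and imaginary parts stacked side by side."""
    spec = np.fft.rfft(frames, axis=1)
    return np.hstack([spec.real, spec.imag])


# "Historical sound data": audio frames paired with their spectra (synthetic here).
rng = np.random.default_rng(0)
historical_audio = rng.standard_normal((200, FRAME_LEN))
historical_spectra = spectrum_features(historical_audio)

# "Training": fit a linear map from spectrum features to audio samples.
weights, *_ = np.linalg.lstsq(historical_spectra, historical_audio, rcond=None)

# "Inference": apply the fitted map to the spectrum of an unseen frame.
unseen_audio = rng.standard_normal((1, FRAME_LEN))
predicted_audio = spectrum_features(unseen_audio) @ weights
max_error = float(np.max(np.abs(predicted_audio - unseen_audio)))
print(f"max reconstruction error: {max_error:.2e}")
```

A magnitude-only spectrogram discards phase, which makes the inverse mapping non-linear and is one reason a more expressive trained model may be needed in practice.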
[0063] For spectrogram data obtained using Fast Fourier Transform (FFT), the inverse Fast Fourier Transform (iFFT) can be used to convert the spectrogram data into the corresponding time-domain signal, i.e., the corresponding audio data. The inverse Fast Fourier Transform transforms the frame-by-frame FFT signal into small time-domain signals, and then concatenates them to obtain the corresponding audio data.
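The frame-by-frame inverse transform and concatenation can be sketched with NumPy. The sketch assumes non-overlapping frames and that the complex FFT coefficients (magnitude and phase) of each frame are available; with overlapping, windowed frames, overlap-add would replace simple concatenation. The function names and the 440 Hz test tone are assumptions for the example.

```python
import numpy as np


def frames_to_spectra(audio: np.ndarray, frame_len: int) -> np.ndarray:
    """Split audio into non-overlapping frames and take the FFT of each frame."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames, axis=1)


def spectra_to_audio(spectra: np.ndarray, frame_len: int) -> np.ndarray:
    """Inverse-transform each frame and concatenate the short time-domain pieces."""
    frames = np.fft.irfft(spectra, n=frame_len, axis=1)
    return frames.reshape(-1)


sample_rate = 8000
t = np.arange(1024) / sample_rate
audio = np.sin(2 * np.pi * 440.0 * t)

spectra = frames_to_spectra(audio, frame_len=256)
restored = spectra_to_audio(spectra, frame_len=256)
print(np.allclose(restored, audio))  # True
```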
[0064] Understandably, secure meeting rooms are built to maintain confidentiality. Theoretically, one should not be able to hear sounds from inside the secure meeting room from outside. However, practical situations, such as open windows or doors, can lead to sound leaks. By monitoring surveillance footage within a limited area outside the secure meeting room and processing the corresponding audio, the system can determine the audio situation within that limited area. This audio situation includes sounds emanating from the secure meeting room as well as sounds emanating from outside. By ultimately acquiring the audio and video data, it can be determined whether sounds emanating from the meeting room are present in the audio data. If so, it can be further examined to determine if any sounds emanating from outside the meeting room are suspicious, thus determining whether an actual leak has occurred regarding the content within the secure meeting room.
[0065] This application provides a sound leakage detection system 200 for a secure conference room. Referring to Figure 2, the sound leakage detection system 200 for the secure conference room includes:
[0066] Data acquisition module 201 is used to acquire image data and viewing angle range data corresponding to the image data, wherein the viewing angle range data represents the shooting range of the image data;
[0067] The data retrieval module 202 is used to retrieve corresponding sound data based on the image data. The sound data includes acoustic spectrum data and acoustic imaging data. The acoustic spectrum data is an acoustic spectrum diagram of the sound, and the acoustic imaging data can show an image that reflects the distribution location of the sound source.
[0068] The data analysis module 203 is used to match the location of the sound source within the field of view data based on the preset matching rules, according to the field of view data and the acoustic imaging data, and to mark the target sound source in the acoustic imaging data.
[0069] The noise reduction processing module 204 is used to adjust the acoustic spectrum data according to the preset image analysis rules and the acoustic imaging data to obtain the target acoustic spectrum data;
[0070] The data processing module 205 is used to perform an inverse transformation on the target acoustic spectrum data based on a preset acoustic spectrum transformation model to obtain target audio data corresponding to the image data.
[0071] Those skilled in the art will understand that, for the sake of convenience and brevity, the specific working process of the described module can be referred to the corresponding process in the foregoing method embodiments, and will not be repeated here.
[0072] This application discloses an electronic device. Referring to Figure 3, the electronic device includes a central processing unit (CPU) 301, which can perform various appropriate actions and processes based on a program stored in a read-only memory (ROM) 302 or a program loaded from a storage section 307 into a random access memory (RAM) 303. The RAM 303 also stores various programs and data required for system operation. The CPU 301, ROM 302, and RAM 303 are interconnected via a bus. An input/output (I/O) interface 304 is also connected to the bus.
[0073] The following components are connected to the I/O interface 304: an input section 305 including a keyboard, mouse, etc.; an output section 306 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and speakers, etc.; a storage section 307 including a hard disk, etc.; and a communication section 308 including a network interface card such as a LAN card, modem, etc. The communication section 308 performs communication processing via a network such as the Internet. A drive 309 is also connected to the I/O interface 304 as needed. A removable medium 310, such as a disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on the drive 309 as needed so that computer programs read from it can be installed into the storage section 307 as needed.
[0074] Specifically, according to embodiments of this application, the process described above with reference to the flowchart in Figure 1 can be implemented as a computer software program. For example, embodiments of this application include a computer program product comprising a computer program carried on a machine-readable medium, the computer program containing program code for performing the methods shown in the flowchart. In such embodiments, the computer program can be downloaded and installed from a network via the communication section 308, and/or installed from the removable medium 310. When the computer program is executed by the central processing unit (CPU) 301, it performs the functions defined in the apparatus of this application.
[0075] It should be noted that the computer-readable medium shown in this application can be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination thereof. In this application, a computer-readable storage medium can be any tangible medium containing or storing a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In this application, a computer-readable signal medium can include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such propagated data signals can take various forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium can be transmitted using any suitable medium, including but not limited to: wireless, wire, optical fiber, RF, etc., or any suitable combination thereof.
[0076] The above description is merely a preferred embodiment of this application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of this application is not limited to technical solutions formed by specific combinations of the above-described technical features, but should also cover other technical solutions formed by arbitrary combinations of the above-described technical features or their equivalents without departing from the foregoing application concept. For example, this covers technical solutions formed by interchanging the above features with technical features of similar function disclosed in (but not limited to) this application.
Claims
1. A method for detecting sound leakage in a secure conference room, characterized in that it comprises:
acquiring image data and corresponding field-of-view data, wherein the field-of-view data represents the shooting range of the image data;
retrieving corresponding sound data based on the image data, wherein the sound data includes acoustic spectrum data and acoustic imaging data, the acoustic spectrum data being an acoustic spectrum diagram of the sound, and the acoustic imaging data being an image that shows the distribution of sound source locations;
matching, based on preset matching rules and according to the field-of-view data and the acoustic imaging data, the sound source locations within the field-of-view data, and marking the target sound source in the acoustic imaging data;
adjusting the acoustic spectrum data based on preset image analysis rules and the acoustic imaging data to obtain target acoustic spectrum data;
performing, based on a preset acoustic spectrum transformation model, an inverse transformation on the target acoustic spectrum data to obtain target audio data corresponding to the image data;
wherein the acoustic imaging data is an image containing the locations of multiple sound sources and the changes in the corresponding sounds, and the step of matching the sound source locations within the field-of-view data and marking the target sound source in the acoustic imaging data comprises:
sequentially determining whether the location of each of the multiple sound sources in the acoustic imaging data is within the range corresponding to the field-of-view data;
if so, marking the location of the sound source as a target sound source;
and the method further comprises: when the location of a sound source in the acoustic imaging data is not within the range corresponding to the field-of-view data, marking the location of the sound source as a noise source.
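For illustration only, the matching and marking steps of claim 1 can be sketched in Python. This is a minimal sketch, not the claimed implementation: it assumes the field-of-view data can be expressed as an axis-aligned rectangle in the coordinate system of the acoustic image, and that each sound source is reduced to a single (x, y) point. The names `FieldOfView` and `mark_sources` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class FieldOfView:
    """Hypothetical field-of-view data: a rectangle in acoustic-image coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        # Points on the boundary are treated as inside (an assumption).
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def mark_sources(fov: FieldOfView, sources: List[Point]) -> List[Tuple[Point, str]]:
    """Check each source location in turn and label it 'target' or 'noise'."""
    marked = []
    for x, y in sources:
        label = "target" if fov.contains(x, y) else "noise"
        marked.append(((x, y), label))
    return marked


if __name__ == "__main__":
    fov = FieldOfView(0, 0, 100, 50)
    for location, label in mark_sources(fov, [(10, 10), (150, 20)]):
        print(location, label)
```

Sources labelled `noise` here correspond to the noise sources that the later noise reduction step works from.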
2. The sound leakage detection method for a secure conference room according to claim 1, characterized in that the image data includes an image start time and an image end time, and the sound data includes an audio start time and an audio end time.
3. The sound leakage detection method for a secure conference room according to claim 2, characterized in that the step of retrieving the corresponding sound data based on the image data comprises: retrieving, based on the image start time and the image end time, the corresponding sound data, wherein the audio start time of the sound data is the same as the image start time, and the audio end time of the sound data is the same as the image end time.
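As an illustration of the time alignment in claim 3, the sketch below cuts a segment out of a longer recording so that it starts and ends at the image start and end times. It assumes, beyond what the claim states, that the sound is stored as a flat sequence of samples with a known sample rate and recording start time, and that all times are in seconds on a shared clock. The function name `retrieve_sound_segment` and its parameters are hypothetical.

```python
from typing import Sequence


def retrieve_sound_segment(
    samples: Sequence[float],
    sample_rate: int,
    recording_start: float,
    image_start: float,
    image_end: float,
) -> Sequence[float]:
    """Return the samples whose time span matches [image_start, image_end)."""
    if image_end < image_start:
        raise ValueError("image_end must not precede image_start")
    # Convert the two timestamps to sample indices relative to the recording.
    first = round((image_start - recording_start) * sample_rate)
    last = round((image_end - recording_start) * sample_rate)
    if first < 0 or last > len(samples):
        raise ValueError("image time window lies outside the recording")
    return samples[first:last]


if __name__ == "__main__":
    recording = list(range(100))  # 10 seconds of dummy samples at 10 Hz
    segment = retrieve_sound_segment(recording, 10, 0.0, 2.0, 5.0)
    print(len(segment))
```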
4. The sound leakage detection method for a secure conference room according to claim 1, characterized in that the step of adjusting the acoustic spectrum data based on preset image analysis rules and the acoustic imaging data to obtain target acoustic spectrum data comprises:
obtaining the noise data corresponding to the noise source markers in the acoustic imaging data;
performing, based on the noise data, noise reduction processing on the acoustic spectrum data to obtain the target acoustic spectrum data.
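The claim does not specify the noise reduction algorithm. As one possible illustration, the sketch below uses plain spectral subtraction, a technique chosen here for simplicity and not stated in the source. It assumes the acoustic spectrum data is a magnitude spectrogram stored as rows of frequency bins by columns of time frames, and that the noise data has been reduced to one magnitude value per frequency bin. The name `spectral_subtract` is hypothetical.

```python
from typing import List, Sequence


def spectral_subtract(
    spectrum: Sequence[Sequence[float]],
    noise_profile: Sequence[float],
    floor: float = 0.0,
) -> List[List[float]]:
    """Subtract a per-frequency noise magnitude from every time frame.

    spectrum[i][j] is the magnitude of frequency bin i at time frame j.
    Values that would drop below `floor` are clamped to `floor`.
    """
    if len(spectrum) != len(noise_profile):
        raise ValueError("noise_profile needs one value per frequency bin")
    return [
        [max(value - noise, floor) for value in row]
        for row, noise in zip(spectrum, noise_profile)
    ]


if __name__ == "__main__":
    noisy = [[5.0, 6.0], [1.0, 2.0]]
    print(spectral_subtract(noisy, [2.0, 3.0]))
```

Clamping at a floor keeps magnitudes non-negative when the noise estimate exceeds the measured value in a bin.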
5. The sound leakage detection method for a secure conference room according to claim 1, characterized in that the step of performing an inverse transformation on the target acoustic spectrum data based on a preset acoustic spectrum transformation model to obtain target audio data corresponding to the image data comprises:
acquiring historical sound data, wherein the historical sound data includes historical acoustic spectrum data and historical audio data, and the historical acoustic spectrum data corresponds to the historical audio data;
inputting the historical sound data into a preset machine learning model to obtain the acoustic spectrum transformation model;
inputting the target acoustic spectrum data into the acoustic spectrum transformation model to obtain the target audio data.
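The claim leaves the "preset machine learning model" unspecified. Purely to illustrate the fit-then-apply flow of claim 5, the sketch below substitutes a linear least-squares map fitted on paired historical data. This is a deliberate simplification: a real spectrum-to-audio inversion is nonlinear and would need a far more capable model. The array shapes and the names `fit_spectrum_to_audio_model` and `inverse_transform` are assumptions made for this example.

```python
import numpy as np


def fit_spectrum_to_audio_model(historical_spectra, historical_audio):
    """Fit weights W that minimise ||S @ W - A||^2 over the historical pairs.

    historical_spectra: shape (n_frames, n_bins), one spectrum per frame.
    historical_audio:   shape (n_frames, frame_len), the matching audio frames.
    """
    spectra = np.asarray(historical_spectra, dtype=float)
    audio = np.asarray(historical_audio, dtype=float)
    if spectra.shape[0] != audio.shape[0]:
        raise ValueError("each spectrum frame needs a matching audio frame")
    weights, *_ = np.linalg.lstsq(spectra, audio, rcond=None)
    return weights


def inverse_transform(weights, target_spectra):
    """Apply the fitted map to target spectrum frames to estimate audio frames."""
    return np.asarray(target_spectra, dtype=float) @ weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spectra = rng.normal(size=(50, 4))            # synthetic stand-in data
    audio = spectra @ rng.normal(size=(4, 3))     # synthetic linear relation
    model = fit_spectrum_to_audio_model(spectra, audio)
    print(inverse_transform(model, spectra[:2]).shape)
```

The synthetic data is constructed to be exactly linear so that the fit is recoverable. It says nothing about how well a linear map would perform on real recordings.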
6. A sound leakage detection system for a secure conference room, characterized in that it comprises:
a data acquisition module (201), used to acquire image data and field-of-view data corresponding to the image data, wherein the field-of-view data represents the shooting range of the image data;
a data retrieval module (202), used to retrieve corresponding sound data based on the image data, wherein the sound data includes acoustic spectrum data and acoustic imaging data, the acoustic spectrum data being an acoustic spectrum diagram of the sound, and the acoustic imaging data being an image that shows the distribution of sound source locations;
a data analysis module (203), used to match, based on preset matching rules and according to the field-of-view data and the acoustic imaging data, the sound source locations within the field-of-view data, and to mark the target sound source in the acoustic imaging data;
a noise reduction processing module (204), used to adjust the acoustic spectrum data based on preset image analysis rules and the acoustic imaging data to obtain target acoustic spectrum data;
a data processing module (205), used to perform, based on a preset acoustic spectrum transformation model, an inverse transformation on the target acoustic spectrum data to obtain target audio data corresponding to the image data;
wherein the acoustic imaging data is an image containing the locations of multiple sound sources and the changes in the corresponding sounds, and the data analysis module (203) is specifically used to:
sequentially determine whether the location of each of the multiple sound sources in the acoustic imaging data is within the range corresponding to the field-of-view data;
if so, mark the location of the sound source as a target sound source;
and, when the location of a sound source in the acoustic imaging data is not within the range corresponding to the field-of-view data, mark the location of the sound source as a noise source.
7. An electronic device, characterized in that it includes a memory and a processor, wherein the memory stores a computer program that can be loaded by the processor to execute the method according to any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that it stores a computer program that can be loaded by a processor to execute the method according to any one of claims 1 to 5.