Audio processing method and device, computer device and computer readable storage medium
By receiving voice-changing requests and mixing audio, a multi-person conversation atmosphere is created, solving the problem of home conditions being exposed in smart door lock communication and improving users' sense of security.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- SHENZHEN TCL NEW-TECH CO LTD
- Filing Date
- 2022-05-20
- Publication Date
- 2026-06-16
AI Technical Summary
When users communicate through a smart door lock via the client, their home situation is easily exposed, creating security risks.
The method receives a voice-changing request, obtains multi-person conversation scenario information, mixes the input audio with preset mixed audio, and outputs multi-person conversation audio, creating a multi-person conversation atmosphere and avoiding revealing the real home situation.
This improves users' sense of security at home, reduces the dangers caused by exposure of their home situation, and enhances home security.
[Figure CN114882895B_ABST: abstract drawing]
Description
Technical Field
[0001] This application relates to the field of communication technology, specifically to an audio processing method, apparatus, computer device, and computer-readable storage medium.
Background Technology
[0002] With the increasing popularity of smart security devices, many people choose to install smart door locks. Users can view the peephole view of the smart door lock in real time through a client application, and users can also communicate with the smart door lock through the client application. However, the process of communicating with the smart door lock through the client application may expose the user's home situation, such as the user being a male living alone or only having a small child at home at the moment, which could put the user in danger.
Summary of the Invention
[0003] This application provides an audio processing method, apparatus, computer device, and computer-readable storage medium that can avoid exposing the user's actual home situation and reduce the dangers caused by the exposure of the user's home situation.
[0004] An audio processing method provided in this application includes:
[0005] Receive a voice-changing request for a smart door lock, the voice-changing request carrying information about a multi-person conversation scenario;
[0006] Obtain the input audio according to the voice-changing request, and obtain the preset mixed audio corresponding to the multi-person conversation scenario information;
[0007] Mix the input audio with the preset mixed audio to obtain multi-person conversation audio;
[0008] Output the multi-person conversation audio through the smart door lock.
[0009] Accordingly, this application also provides an audio processing apparatus, comprising:
[0010] A request receiving unit is used to receive a voice-changing request for a smart door lock, wherein the voice-changing request carries information about a multi-person conversation scenario.
[0011] An audio acquisition unit is used to acquire input audio according to the voice changing request, and to acquire a preset mixed audio corresponding to the multi-person conversation scenario information;
[0012] An audio mixing unit is used to mix the input audio with the preset mixed audio to obtain multi-person conversation audio.
[0013] An audio output unit is used to output the audio of the multi-person conversation through the smart door lock.
[0014] In one embodiment, the audio acquisition unit includes:
[0015] An initial audio acquisition subunit is used to acquire the initial audio input from the audio acquisition device according to the voice changing request;
[0016] An audio processing subunit is used to perform voice-changing processing on the initial audio according to the custom voice-changing parameters contained in the voice-changing mode to obtain the input audio.
[0017] In one embodiment, the multi-person conversation scenario information includes the number of roles and the role identities, and the audio acquisition unit includes:
[0018] A mixed audio acquisition subunit, used to acquire a corresponding number of initial mixed audio clips based on the number of roles included in the multi-person conversation scenario information;
[0019] A timbre adjustment subunit, used to adjust the timbre of the initial mixed audio based on the role identities to obtain the preset mixed audio.
[0020] In one embodiment, the audio processing device further includes:
[0021] The parameter configuration page display unit is used to display the voice changer parameter configuration page for the smart door lock;
[0022] A custom voice-changing parameter acquisition unit is used to acquire custom voice-changing parameters in response to parameter configuration operations on the voice-changing parameter configuration page.
[0023] The mode generation unit is used to generate the voice changing mode based on the custom voice changing parameters.
[0024] In one embodiment, the audio processing device further includes:
[0025] A mixing configuration page display unit is used to display a preset mixing configuration page for the smart door lock;
[0026] The mixing and voice changing parameter acquisition unit is used to acquire the text to be synthesized and the mixing and voice changing parameters in response to the mixing configuration operation of the preset mixing configuration page.
[0027] A speech synthesis unit is used to perform speech synthesis on the text to be synthesized based on the mixing and voice-changing parameters to obtain the preset mixed audio.
[0028] In one embodiment, the audio mixing unit includes:
[0029] A loudness acquisition subunit is used to acquire the input loudness of the input audio.
[0030] A loudness adjustment subunit is used to adjust the audio loudness of the preset mixed audio based on the input loudness to obtain the adjusted mixed audio;
[0031] The mixing subunit is used to mix the input audio and the adjusted mixed audio to obtain multi-person conversation audio.
[0032] In one embodiment, the request receiving unit includes:
[0033] The communication page display subunit is used to display the communication page in the client corresponding to the smart door lock. The communication page includes scene selection controls and audio input controls.
[0034] The information determination subunit is used to determine the selected multi-person conversation scene information in response to the selection operation of the scene selection control;
[0035] A request generation subunit is used to generate the voice changing request in response to input operations to the audio input control and the multi-person conversation scenario information.
[0036] Accordingly, this application also provides a computer device including a memory and a processor; the memory stores a computer program, and the processor is used to run the computer program in the memory to execute any of the audio processing methods provided in this application.
[0037] Accordingly, embodiments of this application also provide a computer-readable storage medium for storing a computer program, which is loaded by a processor to execute any of the audio processing methods provided in embodiments of this application.
[0038] This application embodiment receives a voice-changing request for a smart door lock, the voice-changing request carrying information about a multi-person conversation scenario; obtains input audio based on the voice-changing request, and obtains a preset mixed audio corresponding to the multi-person conversation scenario information; mixes the input audio with the preset mixed audio to obtain multi-person conversation audio; and outputs the multi-person conversation audio through the smart door lock.
[0039] This solution mixes input audio with preset mixed audio to create a multi-person conversation atmosphere, avoiding the exposure of the user's actual home situation. This enhances the user's sense of security at home and reduces the dangers caused by the exposure of the user's home situation, thereby improving the user's home security.
Attached Figure Description
[0040] To more clearly illustrate the technical solutions in the embodiments of this application, the accompanying drawings used in the description of the embodiments will be briefly introduced below. Obviously, the accompanying drawings described below are only some embodiments of this application. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort.
[0041] Figure 1 is a flowchart of the audio processing method provided in the embodiments of this application;
[0042] Figure 2 is a schematic diagram of the audio processing device provided in the embodiments of this application;
[0043] Figure 3 is a schematic diagram of the structure of the computer device provided in the embodiments of this application.
Detailed Implementation
[0044] The technical solutions of the embodiments of this application will be clearly and completely described below with reference to the accompanying drawings. Obviously, the described embodiments are only some embodiments of this application, and not all embodiments. Based on the embodiments of this application, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this application.
[0045] This application provides an audio processing method, apparatus, computer device, and computer-readable storage medium. The audio processing apparatus can be integrated into a computer device, which may be a server or a terminal, etc.
[0046] The terminal may include mobile phones, wearable smart devices, tablets, laptops, personal computers (PCs), and in-vehicle computers, etc.
[0047] The server can be a standalone physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms.
[0048] The following sections provide detailed descriptions of each example. It should be noted that the order in which the embodiments are described is not intended to limit the preferred order of the embodiments.
[0049] This embodiment will be described from the perspective of an audio processing device, which can be integrated into a computer device, such as a server or a terminal.
[0050] This application provides an audio processing method. As shown in Figure 1, the specific process of this audio processing method can be as follows:
[0051] 101. Receive a voice-changing request for the smart door lock. The voice-changing request carries information about a multi-person conversation scenario.
[0052] A smart door lock here is a door lock that supports communication functions. For example, the smart door lock can be controlled through a client installed on a computer device such as a terminal, and audio collected by the computer device can be output through the smart door lock.
[0053] A voice-changing request can be a request to change the voice of the input audio in order to create a multi-person conversation scenario.
[0054] The multi-person conversation scenario information may include information that indicates the need to create a multi-person conversation scenario, such as a multi-person conversation scenario identifier, or information such as the number of participants, their genders, and / or their voices.
[0055] For example, a user could trigger a voice-changing request for the smart lock via a client application on the terminal. This voice-changing request could be pre-loaded with information about a multi-person conversation scenario.
[0056] The terminal receives a voice-changing request and determines the operation to be performed based on the multi-person conversation scenario information carried in the voice-changing request.
[0057] 102. Obtain the input audio according to the voice changing request, and obtain the preset mixed audio corresponding to the multi-person conversation scene information.
[0058] The input audio can be audio input by the user through the terminal's audio acquisition device, or it can be audio obtained by synthesizing speech from text input by the user.
[0059] The preset mixed audio can be pre-set audio used to create a multi-person conversation scene.
[0060] For example, the terminal may respond to the voice changing request by acquiring the input audio through an audio acquisition device, and acquire the corresponding preset mixed audio based on the multi-person conversation scenario information. Optionally, the terminal may acquire the input audio based on the voice changing request, and acquire the preset mixed audio from the cloud.
[0061] 103. Mix the input audio with the preset mixed audio to obtain multi-person conversation audio.
[0062] For example, the input audio and the preset mixed audio can be mixed together to form a single audio file, resulting in a multi-person conversation audio file. This multi-person conversation audio file not only includes the user's input audio but also the preset mixed audio file, thus creating a scenario where multiple people are communicating.
[0063] Optionally, the terminal can obtain a preset mixed audio from the cloud and send the input audio to the cloud. The cloud then mixes the input audio with the preset mixed audio to obtain audio of a multi-person conversation.
[0064] 104. Output multi-person conversation audio via smart door lock.
[0065] For example, the terminal could send audio of a multi-person conversation to the smart lock, so that the smart lock can output the audio of the multi-person conversation.
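Steps 101 to 104 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: audio is modeled as a list of mono float samples in [-1.0, 1.0], and `VoiceChangeRequest`, `handle_request`, `capture`, `preset_library`, and `send_to_lock` are hypothetical names standing in for the request, the terminal logic, the audio acquisition device, the preset store, and the link to the smart door lock.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Samples = List[float]  # assumption: mono samples in the range [-1.0, 1.0]


@dataclass
class VoiceChangeRequest:
    """Hypothetical request shape; the source only says it carries scenario info."""
    lock_id: str
    scene_id: str


def mix(a: Samples, b: Samples) -> Samples:
    """Sum two tracks sample by sample, padding the shorter one with silence."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    # Clamp each sum to [-1, 1] so the mix stays inside the sample range.
    return [max(-1.0, min(1.0, x + y)) for x, y in zip(a, b)]


def handle_request(
    request: VoiceChangeRequest,
    capture: Callable[[], Samples],
    preset_library: Dict[str, Samples],
    send_to_lock: Callable[[str, Samples], None],
) -> Samples:
    input_audio = capture()                      # step 102: input audio
    preset = preset_library[request.scene_id]    # step 102: preset mixed audio
    conversation = mix(input_audio, preset)      # step 103: mix
    send_to_lock(request.lock_id, conversation)  # step 104: output via the lock
    return conversation
```

Passing the capture and output functions in as parameters keeps the sketch neutral about whether mixing runs on the terminal or in the cloud, both of which the description allows.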
[0066] In one embodiment, the user-input audio can be voice-changed so that the user's true gender and age cannot be identified from it, further improving security. That is, in step 102, obtaining the input audio according to the voice-changing request can specifically include:
[0067] Obtain the initial audio input from the audio acquisition device based on the voice changing request;
[0068] The initial audio is processed by changing the voice based on the custom voice changing parameters included in the voice changing mode to obtain the input audio.
[0069] The initial audio can be unprocessed audio input through an audio acquisition device (e.g., a microphone).
[0070] The voice changing mode can include multiple custom parameters, such as parameters corresponding to speed, timbre, and pitch.
[0071] For example, the audio acquisition device can be activated in response to the voice-changing request to capture the initial audio input by the user, and the initial audio is then processed according to the custom voice-changing parameters included in the voice-changing mode to obtain the input audio. Because the input audio differs from the initial audio, it is difficult to determine the speaker's identity from the input audio, which improves security.
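The source does not specify a voice-changing algorithm. As one simple illustration, the sketch below changes the voice by resampling with linear interpolation; `resample` and its `factor` parameter are hypothetical. Note that plain resampling changes pitch and speed together, so treating speed, pitch, and timbre as independent parameters, as the description does, would need a more capable technique (for example a phase vocoder), which is not shown here.

```python
from typing import List


def resample(samples: List[float], factor: float) -> List[float]:
    """Read the input at `factor` times normal speed using linear interpolation.

    Played back at the original sample rate, factor > 1 raises the pitch and
    shortens the clip; factor < 1 lowers the pitch and lengthens it.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(samples) < 2:
        return list(samples)
    out = []
    pos = 0.0
    last = len(samples) - 1
    while pos <= last:
        i = int(pos)          # index of the sample at or before the read position
        frac = pos - i        # fractional distance to the next sample
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
        pos += factor
    return out
```

For instance, `resample(initial_audio, 1.25)` would yield a higher, faster voice, and `resample(initial_audio, 0.8)` a lower, slower one.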
[0072] In one embodiment, the user can pre-set custom parameters to suit their own needs. That is, before step 101, the audio processing method provided in this application embodiment may further include:
[0073] Display the voice-changing parameter configuration page for the smart door lock;
[0074] In response to a parameter configuration operation on the voice-changing parameter configuration page, obtain custom voice-changing parameters;
[0075] Generate a voice-changing mode based on the custom voice-changing parameters.
[0076] The voice changer parameter configuration page can include a user interface for configuring custom voice changer parameters.
[0077] For example, the terminal can display the voice-changing parameter configuration page for the smart door lock. This page can offer several voice-changing methods, such as speed, pitch, and timbre. Users can enter parameters on this page with the keyboard or by dragging sliders (i.e., parameter configuration operations).
[0078] The terminal responds to the user's parameter configuration operation, obtains the custom voice-changing parameters entered by the user, and generates a voice-changing mode from the custom voice-changing parameters corresponding to the different voice-changing methods.
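Generating a voice-changing mode from the values entered on the configuration page could look like the sketch below. The field names, units, default values, and allowed ranges are all assumptions made for illustration; the source names speed, pitch, and timbre as parameters but gives no units or limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceChangeMode:
    """Hypothetical container for the custom voice-changing parameters."""
    speed: float = 1.0        # playback-speed multiplier, 1.0 = unchanged
    pitch: float = 0.0        # pitch shift in semitones, 0.0 = unchanged
    timbre: str = "neutral"   # label of a timbre preset


def build_mode(form_values: dict) -> VoiceChangeMode:
    """Validate raw page input (keyboard text or slider values) and build a mode."""
    speed = float(form_values.get("speed", 1.0))
    pitch = float(form_values.get("pitch", 0.0))
    timbre = str(form_values.get("timbre", "neutral"))
    # The ranges below are illustrative limits, not values from the source.
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed out of range")
    if not -12.0 <= pitch <= 12.0:
        raise ValueError("pitch out of range")
    return VoiceChangeMode(speed=speed, pitch=pitch, timbre=timbre)
```

Making the mode immutable (`frozen=True`) means a saved mode cannot be altered by accident while a call is in progress; a new mode is built whenever the user changes the settings.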
[0079] In one embodiment, the preset mixed audio can be pre-set by the developer during program development or set by the user. The user can pre-configure the preset mixed audio according to their needs so that the terminal can process the input audio based on the preset mixed audio, thereby improving the flexibility of audio processing. That is, before step 101, the audio processing method provided in this application embodiment may further include:
[0080] Display the preset mixing configuration page for the smart door lock;
[0081] In response to a mixing configuration operation on the preset mixing configuration page, obtain the text to be synthesized and the mixing and voice-changing parameters;
[0082] Based on the mixing and voice-changing parameters, perform speech synthesis on the text to be synthesized to obtain the preset mixed audio.
[0083] The preset mix configuration page may include a user interface for configuring preset mixed audio.
[0084] The mixing and voice changing parameters can include parameters such as speed, pitch, and timbre.
[0085] For example, the terminal could display a preset mixing configuration page corresponding to the smart door lock, allowing users to add preset mixed audio. For instance, users can upload audio as preset mixed audio on the preset mixing configuration page, and they can also input the text to be synthesized on the preset mixing configuration page, as well as set mixing and voice-changing parameters based on the characteristics of the audio to be synthesized. The terminal responds to the user's mixing configuration operation on the preset mixing configuration page, obtains the text to be synthesized input by the user, and the set mixing and voice-changing parameters.
[0086] The terminal then performs speech synthesis on the text to be synthesized, based on the mixing and voice-changing parameters, to obtain the preset mixed audio.
[0087] In one embodiment, before triggering a voice-changing request for the smart lock, the user can select the number of people, their genders, and their identities in a multi-person conversation scenario. For example, they can select three men and two women to participate in the conversation, or two adults and one child to participate, to increase the variability of the scenario and adapt to different users. That is, the multi-person conversation scenario information includes the number of roles and their identities. Step 102, "obtaining the preset mixed audio corresponding to the multi-person conversation scenario information," can specifically include:
[0088] Obtain a corresponding number of initial mixed audio clips based on the number of roles included in the multi-person conversation scenario information;
[0089] Adjust the timbre of the initial mixed audio based on the role identities to obtain the preset mixed audio.
[0090] For example, a corresponding number of initial mixed audio clips can be obtained according to the number of roles in the multi-person conversation scenario, and the timbre of each clip can then be adjusted according to the role identity so that the resulting preset mixed audio matches that identity.
[0091] Optionally, different roles can have corresponding voice-changing parameters, and based on these parameters, the initial mixed audio can be converted into a preset mixed audio that matches the role.
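The role-based selection and adjustment described above might be organized as in the following sketch. Everything named here is hypothetical: `SceneInfo`, the role labels, and the `ROLE_FACTOR` table are illustrative, and the single numeric factor per role is only a crude stand-in for whatever timbre parameters a real system would use. The actual timbre adjustment is passed in as a function because the source does not specify one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Samples = List[float]

# Hypothetical per-role adjustment factors; the values are illustrative only.
ROLE_FACTOR: Dict[str, float] = {
    "adult_male": 0.85,
    "adult_female": 1.15,
    "child": 1.4,
}


@dataclass
class SceneInfo:
    roles: List[str]  # one identity per role; len(roles) is the role count


def build_preset_tracks(
    scene: SceneInfo,
    base_clips: List[Samples],
    adjust_timbre: Callable[[Samples, float], Samples],
) -> List[Samples]:
    """Take one initial clip per role and adjust it to match that role."""
    if len(base_clips) < len(scene.roles):
        raise ValueError("not enough initial mixed audio clips")
    tracks = []
    for role, clip in zip(scene.roles, base_clips):
        factor = ROLE_FACTOR.get(role, 1.0)  # unknown roles are left unchanged
        tracks.append(adjust_timbre(clip, factor))
    return tracks
```

A scene with two adults and one child would then pass three role labels and receive three adjusted tracks, which can be mixed into a single preset mixed audio.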
[0092] In one embodiment, to prevent the preset mixed audio from being too loud and causing the input audio to be indistinguishable and information to be unable to be transmitted, the loudness of the preset mixed audio can be adjusted according to the loudness of the input audio. That is, step 103, "mixing the input audio with the preset mixed audio to obtain multi-person conversation audio," can specifically include:
[0093] Get the input loudness of the input audio;
[0094] The loudness of the preset mixed audio is adjusted based on the input loudness to obtain the adjusted mixed audio;
[0095] The input audio and the adjusted mixed audio are combined to obtain audio of a multi-person conversation.
[0096] The input loudness is the loudness of the input audio, and the audio loudness is the loudness of the preset mixed audio.
[0097] For example, the terminal can obtain the loudness of the input audio and lower the loudness of the preset mixed audio below the input loudness, for instance to half of it, so that when the smart door lock outputs the multi-person conversation audio, the input audio remains clearly audible and the scene sounds more realistic.
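The loudness adjustment can be illustrated with a short sketch. It assumes root-mean-square (RMS) amplitude as the loudness measure, which the source does not specify; perceptual loudness measures differ from RMS. The function names and the `ratio` parameter are hypothetical, with `ratio=0.5` corresponding to the "half" example in the text.

```python
import math
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square amplitude, used here as a simple loudness proxy."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def match_loudness(preset: List[float], input_loudness: float,
                   ratio: float = 0.5) -> List[float]:
    """Scale the preset so its RMS equals `ratio` times the input loudness."""
    preset_loudness = rms(preset)
    if preset_loudness == 0.0:
        return list(preset)  # silent preset: nothing to scale
    gain = (input_loudness * ratio) / preset_loudness
    return [x * gain for x in preset]


def mix_conversation(input_audio: List[float], preset: List[float],
                     ratio: float = 0.5) -> List[float]:
    """Adjust the preset against the input loudness, then mix the two tracks."""
    adjusted = match_loudness(preset, rms(input_audio), ratio)
    n = max(len(input_audio), len(adjusted))
    out = []
    for i in range(n):
        a = input_audio[i] if i < len(input_audio) else 0.0
        b = adjusted[i] if i < len(adjusted) else 0.0
        out.append(max(-1.0, min(1.0, a + b)))  # clamp to the sample range
    return out
```

Deriving the gain from the input loudness, rather than using a fixed attenuation, keeps the background voices quieter than the user regardless of how loudly the user speaks.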
[0098] In one embodiment, the terminal can display a communication page corresponding to the smart lock. The communication page can include various scenarios, such as family gatherings, friend gatherings, multi-person conversations, and conversations involving 3 or 5 people. Based on the user's selection of the scenario, multi-person conversation scenario information can be obtained. Specifically, the step "receiving a voice-changing request for the smart lock" can include:
[0099] Display the communication page in the client corresponding to the smart door lock, the communication page including a scene selection control and an audio input control;
[0100] In response to a selection operation on the scene selection control, determine the selected multi-person conversation scenario information;
[0101] In response to an input operation on the audio input control, generate the voice-changing request based on the selected multi-person conversation scenario information.
[0102] The scene selection control allows users to choose a scene.
[0103] For example, the terminal can display the communication page in the client corresponding to the smart door lock. In response to a selection operation on the scene selection control, it determines the multi-person scene selected by the user and obtains the corresponding multi-person conversation scenario information. In response to an input operation on the audio input control, it then generates the voice-changing request from that information.
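The interaction between the two controls and the request can be sketched as a small state holder. The class, method, and scene names below are hypothetical, as is the content of the `SCENES` table; the source describes the controls and the resulting request but not their data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical scene catalogue: scene identifier -> role identities.
SCENES: Dict[str, List[str]] = {
    "family_gathering": ["adult_male", "adult_female", "child"],
    "friends_gathering": ["adult_male", "adult_male", "adult_female"],
}


@dataclass
class VoiceChangeRequest:
    lock_id: str
    scene_id: str
    roles: List[str]


class CommunicationPage:
    """Tracks the selected scene and builds a request on audio input."""

    def __init__(self, lock_id: str) -> None:
        self.lock_id = lock_id
        self.selected_scene: Optional[str] = None

    def on_scene_selected(self, scene_id: str) -> None:
        """Handler for the scene selection control."""
        if scene_id not in SCENES:
            raise KeyError(f"unknown scene: {scene_id}")
        self.selected_scene = scene_id

    def on_audio_input(self) -> VoiceChangeRequest:
        """Handler for the audio input control; requires a selected scene."""
        if self.selected_scene is None:
            raise RuntimeError("select a scene before starting audio input")
        return VoiceChangeRequest(
            lock_id=self.lock_id,
            scene_id=self.selected_scene,
            roles=list(SCENES[self.selected_scene]),
        )
```

In this sketch the request is only created when the user starts audio input, so a scene can be changed freely beforehand without sending anything to the smart door lock.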
[0104] As can be seen from the above, the embodiments of this application receive a voice-changing request for a smart door lock, the voice-changing request carrying multi-person conversation scene information; obtain input audio according to the voice-changing request, and obtain a preset mixed audio corresponding to the multi-person conversation scene information; mix the input audio with the preset mixed audio to obtain multi-person conversation audio; and output the multi-person conversation audio through the smart door lock.
[0105] This solution mixes input audio with preset mixed audio to create a multi-person conversation atmosphere, avoiding the exposure of the user's actual home situation. This enhances the user's sense of security at home and reduces the dangers caused by the exposure of the user's home situation, thereby improving the user's home security.
[0106] To facilitate better implementation of the audio processing method provided in the embodiments of this application, an audio processing apparatus is also provided in one embodiment. The meanings of the terms used are the same as in the audio processing method described above, and specific implementation details can be found in the description of the method embodiments.
[0107] The audio processing device can be integrated into a computer device. As shown in Figure 2, the audio processing device may include: a request receiving unit 301, an audio acquisition unit 302, an audio mixing unit 303, and an audio output unit 304, as detailed below:
[0108] (1) Request receiving unit 301: used to receive voice changing requests for smart door locks, the voice changing requests carrying multi-person conversation scene information.
[0109] In one embodiment, the request receiving unit 301 may include a communication page display subunit, an information determination subunit, and a request generation subunit, specifically:
[0110] Communication page display sub-unit: Used to display the communication page in the corresponding client of the smart lock. The communication page includes scene selection controls and audio input controls;
[0111] Information Determination Subunit: Used to determine the selected multi-person conversation scene information in response to the selection operation of the scene selection control;
[0112] Request generation subunit: Used to generate a voice changing request in response to input operations on the audio input control and information from multi-person conversation scenarios.
[0113] (2) Audio acquisition unit 302: used to acquire input audio according to voice change request, and to acquire preset mixed audio corresponding to multi-person conversation scene information.
[0114] In one embodiment, the audio acquisition unit 302 may include an initial audio acquisition subunit and an audio processing subunit, specifically:
[0115] Initial audio acquisition subunit: used to acquire the initial audio input from the audio acquisition device according to the voice changing request;
[0116] Audio processing subunit: used to perform voice-changing processing on the initial audio based on the custom voice-changing parameters contained in the voice-changing mode, so as to obtain the input audio.
[0117] In one embodiment, the multi-person conversation scenario information includes the number of roles and the role identities. The audio acquisition unit 302 may include a mixed audio acquisition subunit and a timbre adjustment subunit, specifically:
[0118] Mixed audio acquisition subunit: used to acquire a corresponding number of initial mixed audio clips based on the number of roles included in the multi-person conversation scenario information;
[0119] The timbre adjustment subunit is used to adjust the timbre of the initial mixed audio based on the role identity to obtain the preset mixed audio.
[0120] (3) Audio mixing unit 303: used to mix the input audio with the preset mixed audio to obtain multi-person conversation audio.
[0121] In one embodiment, the audio mixing unit 303 may include a loudness acquisition subunit, a loudness adjustment subunit, and a mixing subunit, specifically:
[0122] Loudness acquisition subunit: used to acquire the input loudness of the input audio;
[0123] Loudness adjustment subunit: used to adjust the audio loudness of a preset mixed audio based on the input loudness, to obtain the adjusted mixed audio;
[0124] Mixing subunit: Used to mix the input audio and the adjusted mixed audio to obtain audio of multiple people talking.
[0125] (4) Audio output unit 304: used to output audio of multi-person conversations through the smart door lock.
[0126] In one embodiment, the audio processing device may further include a parameter configuration page display unit, a custom voice-changing parameter acquisition unit, and a mode generation unit, specifically:
[0127] Parameter configuration page display unit: used to display the voice-changing parameter configuration page for the smart door lock;
[0128] Custom voice-changing parameter acquisition unit: used to obtain custom voice-changing parameters in response to a parameter configuration operation on the voice-changing parameter configuration page;
[0129] Mode generation unit: used to generate the voice-changing mode based on the custom voice-changing parameters.
[0130] In one embodiment, the audio processing apparatus may further include a mixing configuration page display unit, a mixing and voice changing parameter acquisition unit, and a speech synthesis unit, specifically:
[0131] Mixing configuration page display unit: Used to display the preset mixing configuration page for smart door locks;
[0132] Mixing and voice changing parameter acquisition unit: used to respond to mixing configuration operations on the preset mixing configuration page and acquire the text to be synthesized and the mixing and voice changing parameters;
[0133] Speech synthesis unit: used to synthesize speech from the text to be synthesized based on mixing and voice-changing parameters, and obtain a preset mixed audio.
[0134] In this embodiment, the audio processing device receives a voice-changing request for a smart door lock via a request receiving unit 301. The voice-changing request carries information about a multi-person conversation scenario. The audio acquisition unit 302 acquires the input audio based on the voice-changing request, and also acquires a preset mixed audio corresponding to the multi-person conversation scenario information. The audio mixing unit 303 mixes the input audio with the preset mixed audio to obtain the multi-person conversation audio. Finally, the audio output unit 304 outputs the multi-person conversation audio through the smart door lock.
[0135] This solution mixes input audio with preset mixed audio to create a multi-person conversation atmosphere, avoiding the exposure of the user's actual home situation. This enhances the user's sense of security at home and reduces the dangers caused by the exposure of the user's home situation, thereby improving the user's home security.
[0136] This application also provides a computer device, which can be a terminal or a server. Figure 3 shows a structural schematic diagram of the computer device involved in the embodiments of this application, specifically:
[0137] The computer device may include components such as a processor 1001 with one or more processing cores, a memory 1002 with one or more computer-readable storage media, a power supply 1003, and an input unit 1004. Those skilled in the art will understand that the computer device structure shown in Figure 3 does not constitute a limitation on the computer device, which may include more or fewer components than shown, combine certain components, or arrange components differently. Wherein:
[0138] The processor 1001 is the control center of the computer device. It connects various parts of the computer device via various interfaces and lines, and performs various functions and processes data by running or executing software programs and / or modules stored in the memory 1002, and by calling data stored in the memory 1002, thereby providing overall monitoring of the computer device. Optionally, the processor 1001 may include one or more processing cores; preferably, the processor 1001 may integrate an application processor and a modem processor, wherein the application processor mainly handles the operating system, user interface, and computer programs, and the modem processor mainly handles wireless communication. It is understood that the modem processor may not be integrated into the processor 1001.
[0139] The memory 1002 can be used to store software programs and modules. The processor 1001 executes various functional applications and data processing by running the software programs and modules stored in the memory 1002. The memory 1002 may mainly include a program storage area and a data storage area. The program storage area may store the operating system, computer programs required for at least one function (such as sound playback function, image playback function, etc.), etc.; the data storage area may store data created according to the use of the computer device, etc. In addition, the memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as at least one disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 1002 may also include a memory controller to provide the processor 1001 with access to the memory 1002.
[0140] The computer device also includes a power supply 1003 that supplies power to the various components. Preferably, the power supply 1003 can be logically connected to the processor 1001 through a power management system, so that charging, discharging, and power consumption management are handled through the power management system. The power supply 1003 may also include one or more DC or AC power supplies, recharging systems, power fault detection circuits, power converters or inverters, power status indicators, and any other such components.
[0141] The computer device may also include an input unit 1004, which can be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control.
[0142] Although not shown, the computer device may also include a display unit, etc., which will not be described in detail here. Specifically, in this embodiment, the processor 1001 in the computer device loads the executable files corresponding to the processes of one or more computer programs into the memory 1002 according to the following instructions, and the processor 1001 runs the computer programs stored in the memory 1002 to realize various functions, as follows:
[0143] Receive a voice-changing request for a smart door lock, the voice-changing request carrying multi-person conversation scene information;
[0144] Obtain input audio according to the voice-changing request, and obtain the preset mixed audio corresponding to the multi-person conversation scene information;
[0145] Mix the input audio with the preset mixed audio to obtain multi-person conversation audio;
[0146] Output the multi-person conversation audio through the smart door lock.
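The four operations above can be illustrated with a minimal sketch. It is not the implementation of this application: the request shape `VoiceChangeRequest`, the scene key `"family_dinner"`, the `PRESET_MIXES` table, and the representation of audio as lists of float samples are all assumptions made for illustration. Delivering the result to the smart door lock is outside the sketch, so the mixed samples are simply returned.

```python
from dataclasses import dataclass
from typing import Dict, List

Samples = List[float]  # assumed format: mono samples in the range [-1.0, 1.0]


@dataclass
class VoiceChangeRequest:
    """Hypothetical request shape; the source only states that the request
    carries multi-person conversation scene information."""
    scene: str
    input_audio: Samples


# Hypothetical store of preset mixed audio, keyed by scene name.
PRESET_MIXES: Dict[str, Samples] = {
    "family_dinner": [0.10, -0.20, 0.15, 0.05],
}


def mix(input_audio: Samples, preset: Samples) -> Samples:
    """Add the two signals sample by sample.

    The preset is looped to cover the whole input, and each sum is clipped
    to [-1.0, 1.0] so the result stays in the assumed sample range.
    """
    if not preset:
        return list(input_audio)
    mixed = []
    for i, sample in enumerate(input_audio):
        total = sample + preset[i % len(preset)]
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


def handle_request(request: VoiceChangeRequest) -> Samples:
    """Look up the preset for the requested scene and mix it with the input."""
    preset = PRESET_MIXES[request.scene]
    return mix(request.input_audio, preset)
```

Calling `handle_request(VoiceChangeRequest("family_dinner", samples))` returns a list of the same length as `samples`; an unknown scene name raises `KeyError` in this sketch, whereas a real client would presumably restrict the choice to configured scenes.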
[0147] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.
[0148] As can be seen from the above, the computer device in this application embodiment can receive a voice-changing request for a smart door lock, the voice-changing request carrying multi-person conversation scene information; obtain input audio according to the voice-changing request, and obtain a preset mixed audio corresponding to the multi-person conversation scene information; mix the input audio with the preset mixed audio to obtain multi-person conversation audio; and output the multi-person conversation audio through the smart door lock.
[0149] This solution mixes input audio with preset mixed audio to create a multi-person conversation atmosphere, avoiding the exposure of the user's actual home situation. This enhances the user's sense of security at home and reduces the dangers caused by the exposure of the user's home situation, thereby improving the user's home security.
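The source also describes adjusting the loudness of the preset mixed audio based on the input loudness before mixing. The following is a hedged sketch of that idea only. It uses root-mean-square (RMS) level as a stand-in for loudness, and the `ratio` parameter (how loud the background voices are relative to the speaker) is an assumption; the source does not specify the loudness measure or any target ratio.

```python
import math
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square level, used here as a simple proxy for loudness."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_loudness(preset: List[float], input_audio: List[float],
                   ratio: float = 0.5) -> List[float]:
    """Scale the preset so its RMS equals ratio * RMS of the input.

    `ratio` is a hypothetical tuning parameter, not a value from the source.
    A silent preset is returned unchanged to avoid dividing by zero.
    """
    preset_rms = rms(preset)
    if preset_rms == 0.0:
        return list(preset)
    gain = ratio * rms(input_audio) / preset_rms
    return [s * gain for s in preset]


def mix_with_loudness(input_audio: List[float], preset: List[float],
                      ratio: float = 0.5) -> List[float]:
    """Adjust the preset loudness, then add it to the input with clipping."""
    adjusted = match_loudness(preset, input_audio, ratio)
    if not adjusted:
        return list(input_audio)
    return [
        max(-1.0, min(1.0, s + adjusted[i % len(adjusted)]))
        for i, s in enumerate(input_audio)
    ]
```

Scaling the preset relative to the input keeps the background voices audible without masking the speaker, whatever level the user happens to speak at.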
[0150] According to one aspect of this application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes the computer instructions, causing the computer device to perform the methods provided in the various optional implementations of the above embodiments.
[0151] Those skilled in the art will understand that all or part of the steps in the various methods of the above embodiments can be performed by a computer program, or by a computer program controlling related hardware. The computer program can be stored in a computer-readable storage medium and loaded and executed by a processor.
[0152] Therefore, embodiments of this application provide a computer-readable storage medium storing a computer program that can be loaded by a processor to execute any of the audio processing methods provided in embodiments of this application.
[0153] For details on the implementation of each of the above operations, please refer to the previous examples, which will not be repeated here.
[0154] The computer-readable storage medium may include: read-only memory (ROM), random access memory (RAM), disk or optical disk, etc.
[0155] Since the computer program stored in the computer-readable storage medium can execute any of the audio processing methods provided in the embodiments of this application, it can achieve the beneficial effects of any of those methods, as detailed in the preceding embodiments; these will not be repeated here.
[0156] The above provides a detailed description of the audio processing method, apparatus, computer device, and computer-readable storage medium provided in the embodiments of this application. Specific examples have been used to illustrate the principles and implementations of this application, and the description of the above embodiments is intended only to help in understanding the method and core ideas of this application. Those skilled in the art may, based on the ideas of this application, vary the specific implementations and the scope of application. Therefore, the content of this specification should not be construed as a limitation of this application.
Claims
1. An audio processing method, characterized in that it comprises: receiving a voice-changing request for a smart door lock, the voice-changing request carrying multi-person conversation scene information; obtaining input audio according to the voice-changing request, and obtaining preset mixed audio corresponding to the multi-person conversation scene information; mixing the input audio with the preset mixed audio to obtain multi-person conversation audio; and outputting the multi-person conversation audio through the smart door lock.
2. The method according to claim 1, characterized in that the step of obtaining the input audio according to the voice-changing request comprises: obtaining, according to the voice-changing request, initial audio input from an audio acquisition device; and performing voice-changing processing on the initial audio according to the custom voice-changing parameters included in the voice-changing mode, to obtain the input audio.
3. The method according to claim 2, characterized in that, before receiving the voice-changing request for the smart door lock, the method further comprises: displaying a voice-changing parameter configuration page for the smart door lock; obtaining custom voice-changing parameters in response to a parameter configuration operation on the voice-changing parameter configuration page; and generating the voice-changing mode based on the custom voice-changing parameters.
4. The method according to claim 1, characterized in that the multi-person conversation scene information includes a number of characters and character identities, and the step of obtaining the preset mixed audio corresponding to the multi-person conversation scene information comprises: obtaining a corresponding number of pieces of initial mixed audio based on the number of characters included in the multi-person conversation scene information; and adjusting the timbre of the initial mixed audio based on the character identities to obtain the preset mixed audio.
5. The method according to claim 1, characterized in that the step of mixing the input audio with the preset mixed audio to obtain multi-person conversation audio comprises: obtaining the input loudness of the input audio; adjusting the loudness of the preset mixed audio based on the input loudness to obtain adjusted mixed audio; and mixing the input audio with the adjusted mixed audio to obtain the multi-person conversation audio.
6. The method according to claim 1, characterized in that, before receiving the voice-changing request for the smart door lock, the method further comprises: displaying a preset mixing configuration page for the smart door lock; obtaining text to be synthesized and mixing and voice-changing parameters in response to a mixing configuration operation on the preset mixing configuration page; and performing speech synthesis on the text to be synthesized based on the mixing and voice-changing parameters to obtain the preset mixed audio.
7. The method according to any one of claims 1-6, characterized in that receiving a voice-changing request for a smart door lock comprises: displaying a communication page in the client corresponding to the smart door lock, the communication page including a scene selection control and an audio input control; determining the selected multi-person conversation scene information in response to a selection operation on the scene selection control; and generating the voice-changing request in response to an input operation on the audio input control and according to the multi-person conversation scene information.
8. An audio processing apparatus, characterized in that it comprises: a request receiving unit, used to receive a voice-changing request for a smart door lock, the voice-changing request carrying multi-person conversation scene information; an audio acquisition unit, used to obtain input audio according to the voice-changing request and to obtain preset mixed audio corresponding to the multi-person conversation scene information; an audio mixing unit, used to mix the input audio with the preset mixed audio to obtain multi-person conversation audio; and an audio output unit, used to output the multi-person conversation audio through the smart door lock.
9. A computer device, characterized in that it comprises a memory and a processor; the memory stores a computer program, and the processor is used to run the computer program in the memory to perform the audio processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium is used to store a computer program, which is loaded by a processor to perform the audio processing method according to any one of claims 1 to 7.