Control method of smart device, control apparatus of smart device, and smart device
By acquiring description text of smart device object models in a home setting and voice control commands, and performing inference to generate target control commands, the problem of smart devices' requirements for the form of voice control commands is solved, achieving higher recognition accuracy and user interaction experience.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- MIDEA GRP (SHANGHAI) CO LTD
- Filing Date
- 2026-03-10
- Publication Date
- 2026-06-12
AI Technical Summary
In existing technologies, smart devices have certain format requirements for voice control commands, which leads to insufficient generalization and understanding capabilities, making it impossible to accurately identify the user's control intentions and respond to user needs in a timely and effective manner.
By acquiring the description text of smart device object models in the home scene, and combining it with voice control commands for reasoning, target control commands are generated. It supports the conversion of irregular voice control commands into executable commands, including multi-dimensional reasoning based on location, device, function, and parameters, and performs verification to ensure accuracy.
It improves the recognition accuracy and success rate of voice control commands, enhances the user's interactive experience, and ensures the accuracy and efficiency of device response.
Figure CN122201286A_ABST (abstract drawing)
Description
Technical Field
[0001] This application relates to the field of voice interaction, and more specifically, to a control method for a smart device, a control device for a smart device, and a smart device in the field of voice interaction.
Background Technology
[0002] With the development of the intelligent era, more and more smart devices are entering people's lives, aiming to provide users with an intelligent living environment. For example, in a smart home scenario, users can directly control various smart devices in their home via voice commands.
[0003] In related technologies, the main problem when users control smart devices via voice is that the voice control commands have certain format requirements. When the user's voice control commands deviate from the fixed format, due to insufficient generalization understanding, the device may not be able to accurately recognize the user's control intentions, thus failing to respond to the user's control needs in a timely and effective manner.
[0004] Therefore, improving the control accuracy of smart devices to enhance user experience is an urgent problem to be solved.
Summary of the Invention
[0005] This application provides a control method for a smart device, a control device for a smart device, and a smart device. The control method can improve the recognition accuracy of voice control commands, thereby increasing the success rate of voice control of smart devices and enhancing the user's interactive experience.
[0006] In a first aspect, a method for controlling a smart device is provided, the method comprising: in response to a voice control command, obtaining the object model description texts of M smart devices in the home scene, where the object model description text represents the capabilities and basic attributes of a smart device and M is an integer greater than or equal to 1; performing reasoning on the voice control command and the object model description texts of the M smart devices to obtain a target control command for a target smart device among the M smart devices; and controlling the target smart device to operate according to the target control command.
[0007] In the above technical solution, this application provides a method for controlling smart devices. This method receives a user's voice control command, combines the device model of the current home scene with the voice control command, and infers the target control command for the target smart device in the home scene. There are no restrictions on the form of the voice control command in the above process. That is, regardless of the format of the user's voice control command, this method can generate accurate control commands to control the devices in the home scene. Therefore, this control method can improve the recognition accuracy of voice control commands, thereby increasing the success rate of voice-controlled smart devices and enhancing the user's interactive experience.
[0008] In conjunction with the first aspect, in some possible implementations, performing reasoning on the voice control command and the object model description texts of the M smart devices to obtain the target control command for the target smart device among the M smart devices includes: determining whether the voice control command is a preset format command; and, when the voice control command is not a preset format command, performing reasoning on the voice control command and the object model description text to obtain the target control command, where the target control command is a preset format command.
[0009] In the above technical solution, when reasoning about voice control commands, the control method first determines whether the format of the voice control command is a preset format. When the voice control command is not in the preset format, for such irregular voice control commands, the control method can use reasoning, combined with the capabilities of devices in the home environment, to deduce the target control command in the preset format. Therefore, this control method can ultimately convert irregular voice control commands into target control commands that can be used to control devices, thereby avoiding the problem of smart devices failing to respond when voice control commands are irregular, improving the success rate of voice interaction and the user's voice interaction experience.
[0010] In conjunction with the first aspect, in some possible implementations, the control method further includes: When the voice control command is a preset format command, the target smart device is controlled to operate according to the voice control command.
[0011] In the above technical solution, when the voice control command conforms to the preset format, there is no need to reason about the voice control command, but to directly control the target smart device to respond and execute, which can greatly improve the response efficiency of the smart device and improve the efficiency of voice interaction.
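The two branches described above (direct execution for preset-format commands, reasoning for everything else) can be sketched as follows. This is only an illustrative sketch: the application does not define the preset format, so the four-token pattern `PRESET_PATTERN`, the function names, and the pluggable `infer` callable are all assumptions.

```python
import re

# Assumed preset format: "<location> <device> <function> <parameter>",
# e.g. "living_room air_conditioner set_temperature 26". Hypothetical.
PRESET_PATTERN = re.compile(r"^(\w+) (\w+) (\w+) (\w+)$")


def is_preset_format(command_text):
    """Return True when the text already matches the assumed preset format."""
    return PRESET_PATTERN.match(command_text) is not None


def handle_command(command_text, object_models, infer):
    """Return the command to execute.

    Preset-format commands are passed through unchanged (fast path).
    Anything else is handed to `infer`, a caller-supplied reasoning step
    (for example a language-model call) that is not implemented here.
    """
    if is_preset_format(command_text):
        return command_text
    return infer(command_text, object_models)
```

Passing `infer` as a parameter keeps the fast path testable without committing to any particular reasoning backend.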
[0012] In conjunction with the first aspect and the above implementations, in some possible implementations, performing reasoning on the voice control command and the object model description text to obtain the target control command includes: performing recognition processing on the voice control command to obtain the text corresponding to the voice control command and a recognition result of the text; determining multiple pieces of inference information based on the recognition result, or based on the recognition result and the object model description text; and obtaining the target control command based on the multiple pieces of inference information.
[0013] In the above technical solution, when processing voice control commands, the corresponding text recognition result is obtained first, which can accurately capture the user's core needs and ensure that the generation of subsequent control commands is closely related to the user's core needs, thus avoiding the problem of inaccurate control of smart devices.
[0014] In conjunction with the first aspect and the above implementations, in some possible implementations, the recognition result includes at least one keyword and a target adjustment requirement. For any one of the multiple pieces of inference information, determining the inference information based on the recognition result, or based on the recognition result and the object model description text, includes: when the at least one keyword includes the target keyword corresponding to the inference information, determining the target keyword as the inference information; and, when the at least one keyword does not include the target keyword, obtaining the device information of the M smart devices and determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text.
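The keyword-first rule in this paragraph can be illustrated with a small sketch. The layout of the recognition result (a `keywords` dictionary keyed by slot name plus an `adjustment` string) and the `infer_from_context` callable are assumptions made for illustration; the application does not specify these structures.

```python
def resolve_slot(slot, recognition, infer_from_context):
    """Resolve one piece of inference information (a "slot").

    If the user explicitly said a keyword for this slot, use it directly.
    Otherwise fall back to `infer_from_context`, a caller-supplied function
    standing in for inference from the adjustment requirement, device
    information, and object model description text.
    """
    keywords = recognition["keywords"]
    if slot in keywords:
        return keywords[slot]
    return infer_from_context(slot, recognition["adjustment"])


# Hypothetical recognition result for "make the air conditioner cooler".
recognition = {
    "keywords": {"device_type": "air_conditioner"},
    "adjustment": "cooler",
}
```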
[0015] Combining the first aspect and the above implementation methods, in some possible implementation methods, the inference information includes the target function, and the object model description text includes the voice functions supported by the M smart devices. The inference information is determined based on at least one of the target adjustment requirements, the device information of the M smart devices, and the object model description text, including: Match the target adjustment needs with the voice functions supported by M smart devices to determine the matching functions; The matching function is identified as the target function.
[0016] In the above technical solution, during the reasoning process for the target function, the target adjustment requirement is matched with the voice functions supported by M smart devices to obtain the target function. This reasoning process uses the voice functions supported by the M smart devices as a constraint, minimizing the possibility that the target function is out of sync with the capabilities of the devices in the current home environment. It also ensures that the matched function can be controlled by voice, guaranteeing the effectiveness of voice interaction. Furthermore, by filtering and determining suitable functions based on the user's actual adjustment intentions, a high degree of binding between the user's adjustment intentions and the functions is achieved, improving the accuracy of smart device control.
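A minimal sketch of this matching step, assuming a hand-written lookup table from adjustment requirement to candidate function names. The table `REQUIREMENT_TO_FUNCTIONS` and all function names are hypothetical; in practice the association could come from the reasoning model rather than a fixed table.

```python
# Hypothetical association between an adjustment requirement and the
# function names that could satisfy it.
REQUIREMENT_TO_FUNCTIONS = {
    "cooler": ["set_temperature", "set_fan_speed"],
    "brighter": ["set_brightness"],
}


def match_target_functions(requirement, supported_by_device):
    """Return candidate functions that some device in the home supports.

    `supported_by_device` maps a device id to the set of voice functions
    listed in that device's object model description text.
    """
    supported = set().union(*supported_by_device.values())
    candidates = REQUIREMENT_TO_FUNCTIONS.get(requirement, [])
    return [f for f in candidates if f in supported]
```

Filtering by the union of supported functions is what keeps the result within the capabilities of the current home: `set_fan_speed` is dropped when no device offers it.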
[0017] Combining the first aspect and the above implementation methods, in some possible implementation methods, the inference information also includes the target device type, and the object model description text also includes the device types and models of M intelligent devices. Based on the target adjustment requirements, the device information of the M intelligent devices, and at least one of the object model description text, the inference information is determined, including: Based on the device type, device model, and supported voice functions of M smart devices, determine the mapping relationship between the device type and the supported voice functions of the M smart devices; Match the target function with the mapping relationship to determine the matching device type; The matching device type is determined as the target device type.
[0018] In the above technical solution, the target device type is matched through the mapping relationship between the target function and the function and device type. Since the target function reflects the user's current functional needs, the above process essentially transforms the user's functional needs into a clear device type reference, achieving a match between user needs and device capabilities. Furthermore, in the above matching process, the device types of M smart devices are used to constrain the inferred target device type as much as possible, ensuring that the inference result is as consistent as possible with the current home scenario, guaranteeing the rationality of the target device type, and providing a precise control object for subsequent device control.
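The mapping between device types and supported voice functions can be sketched as a dictionary built from the device list. The record fields (`type`, `functions`) are assumed names, not taken from the application.

```python
def build_type_function_map(devices):
    """Map each device type to the union of voice functions its devices support."""
    mapping = {}
    for device in devices:
        mapping.setdefault(device["type"], set()).update(device["functions"])
    return mapping


def match_device_types(target_function, mapping):
    """Return the device types whose function set contains the target function."""
    return sorted(t for t, funcs in mapping.items() if target_function in funcs)
```

If more than one type is returned, the device type remains ambiguous and would need to be resolved before a command is issued.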
[0019] In conjunction with the first aspect and the above implementations, in some possible implementations, the inference information also includes the target device location, and the device information includes the device location. Determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: matching the device types of the M smart devices with the target device type to determine N matching devices, where N is an integer greater than or equal to 1 and less than or equal to M; when N equals 1, determining the device location of the matching device as the target device location; and, when N is greater than 1, obtaining the user's historical preference information and environmental awareness information, determining the user's current location based on the environmental awareness information, the historical preference information, and the device locations of the N matching devices, and determining the user's current location as the target device location.
[0020] In the above technical solution, by matching the device types of M smart devices with the target device type, N matching devices are quickly selected, effectively narrowing down the range of device locations. When the number of matching devices is 1, the location of that device is directly determined as the target device location, avoiding the complex reasoning process for device location. When the number of matching devices is greater than 1, the user's current location is accurately inferred by combining the user's usage habits of devices in the home scenario and the real-time environmental status. This allows the determination of device location to be highly matched with the user's preferences, achieving personalized determination of device location.
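The N = 1 and N > 1 branches can be sketched as below. The field names and the `infer_user_location` callable (which stands in for combining historical preferences and environmental awareness information) are assumptions; the `None` return for zero matches is also an added convention, since the application assumes N is at least 1.

```python
def resolve_target_location(target_type, devices, infer_user_location):
    """Pick the target device location for the given device type."""
    matches = [d for d in devices if d["type"] == target_type]
    if not matches:
        return None  # not covered by the application; assumed convention
    if len(matches) == 1:
        # Only one candidate: its location is the target location.
        return matches[0]["location"]
    # Several candidates: fall back to the user's inferred current location.
    return infer_user_location([d["location"] for d in matches])
```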
[0021] In conjunction with the first aspect and the above implementations, in some possible implementations, the inference information also includes target parameters, and the device information includes operating status. Determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: obtaining the user's historical preference information and environmental awareness information; when the historical preference information includes the parameters the user previously used for the target function, determining those previously used parameters as the target parameters; or, when the historical preference information does not include previously used parameters, determining the operating status of the target function of the target smart device from the operating status of the M smart devices, and determining the target parameters of the target function based on the environmental awareness information, or based on the environmental awareness information and the operating status of the target function.
[0022] In the above technical solution, during the determination of target parameters, if there are previous usage parameters for the target function, these parameters are directly determined as the target parameters. This approach maximizes the consideration of users' historical usage habits, ensuring that the adjustment results align with users' personal preferences. When no previous usage parameters exist, the target parameters are determined by combining environmental perception information and the actual operating status of the target function on the target smart device. This ensures that the determination process of the target parameters closely matches the actual environmental scenario, guaranteeing the accuracy of the adjustment.
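The history-first rule for target parameters can be sketched as follows. The dictionaries, the `derive` callable, and the threshold rule in `derive_setpoint` are illustrative assumptions; the application does not give a concrete rule for deriving parameters from environmental awareness information.

```python
def resolve_target_parameter(function, history, environment, running_state, derive):
    """Return the parameter to use for `function`.

    A previously used parameter takes priority. Otherwise `derive` computes
    one from the environment and the function's current operating status.
    """
    if function in history:
        return history[function]
    return derive(environment, running_state.get(function))


def derive_setpoint(environment, current_setpoint):
    """Hypothetical rule: cool to 26 when the room is hotter than 28 degrees C."""
    if environment["indoor_temp_c"] > 28:
        return 26
    return current_setpoint
```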
[0023] In conjunction with the first aspect and the above implementations, in some possible implementations, obtaining the target control command based on the multiple pieces of inference information includes: when the number of results for each of the multiple pieces of inference information is equal to 1, generating a candidate control command containing the multiple pieces of inference information; and, when the candidate control command passes verification, determining the candidate control command as the target control command.
[0024] In the above technical solution, candidate control commands are generated when the number of results for each inference information is 1. This ensures that the candidate control commands have clear directionality and uniqueness, avoiding the generation of ambiguous commands that the device cannot respond to. Further verification of the candidate control commands to determine whether they can be used as target control commands effectively avoids and detects unreasonable results in the inference process, ensuring the feasibility of the commands ultimately issued to the device, improving the device's response success rate to commands, and thus improving the success rate of voice interaction.
[0025] In conjunction with the first aspect and the above implementations, in some possible implementations, the control method further includes: when the number of results for at least one of the multiple pieces of inference information is greater than 1, generating a target prompt message based on the at least one piece of inference information to prompt the user to input the voice command again; and updating the voice control command to the voice command input again by the user.
[0026] In the above technical solution, when the number of at least one result among multiple inference information is greater than 1, it indicates that there is ambiguity in the function, device type, device location, or parameters. In this case, proactively triggering a prompt instead of directly executing the command can fundamentally avoid the problem of inaccurate command execution. Furthermore, proactively triggering a prompt to guide the user to output voice again can prevent a decline in user experience caused by device unresponsiveness, thus optimizing the user's interactive experience.
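The choice between generating a candidate command and prompting the user can be sketched as below. The slot names and prompt wording are assumptions. Treating a slot with zero results the same as an ambiguous slot is also an assumption, since the application only describes the "greater than 1" case.

```python
def build_candidate_or_prompt(inference):
    """Return (candidate, prompt); exactly one of the two is not None.

    `inference` maps each slot name to the list of results inferred for it.
    """
    # Slots without exactly one result cannot go into a command.
    unresolved = [slot for slot, results in inference.items() if len(results) != 1]
    if unresolved:
        prompt = "Please specify the " + ", ".join(unresolved) + " and say the command again."
        return None, prompt
    candidate = {slot: results[0] for slot, results in inference.items()}
    return candidate, None
```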
[0027] In conjunction with the first aspect and the above implementations, in some possible implementations, the multiple pieces of inference information include the target device type, target device location, target function, and target parameters. The control method also includes: determining, based on the target device type, the target device location, and the device types and device locations of the M smart devices, whether the M smart devices include the target smart device; when the M smart devices include the target smart device and the communication status of the target smart device is normal, determining whether the voice functions supported by the target smart device include the target function; when the voice functions supported by the target smart device include the target function, determining whether the target parameters are within the preset parameter range corresponding to the target function; when the target parameters are within the preset parameter range, determining whether the candidate control command conflicts with the current operating mode of the target smart device; and, when the candidate control command does not conflict with the current operating mode of the target smart device, determining that the candidate control command passes verification.
[0028] In the above technical solution, after receiving candidate control commands, it verifies whether the target smart device is among the M smart devices. This verifies the actual existence of the target smart device and avoids issuing commands to non-existent smart devices. After confirming the existence of the target smart device, its communication status is verified. Verification continues only when the communication status is normal, preventing command execution failure due to smart device offline or network anomalies. It verifies whether the target function falls within the voice function range supported by the target smart device, constraining the function within the actual capability boundaries of the device and avoiding generating function commands that the device cannot perform. Finally, it verifies whether the candidate control command conflicts with the current operating mode of the target device, preventing performance degradation caused by mode conflicts and ensuring compatibility between the control command and the current operating state of the device. Therefore, these multi-dimensional verifications ultimately ensure that the control command is correctly responded to by the target smart device, improving the success rate of command execution.
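The verification sequence (existence, communication status, supported function, parameter range, mode conflict) can be sketched as a chain of early returns. All field names, the `param_ranges` table, and the representation of conflicts as `(mode, function)` pairs are assumptions for illustration.

```python
def verify_candidate(candidate, devices, param_ranges, conflicts):
    """Run the checks in order and return (passed, reason)."""
    # 1. The target device must exist among the home's devices.
    target = next(
        (d for d in devices
         if d["type"] == candidate["device_type"]
         and d["location"] == candidate["location"]),
        None,
    )
    if target is None:
        return False, "device not found"
    # 2. Its communication status must be normal.
    if not target["online"]:
        return False, "device offline"
    # 3. The function must be one the device supports by voice.
    if candidate["function"] not in target["functions"]:
        return False, "function unsupported"
    # 4. The parameter must fall in the function's preset range.
    low, high = param_ranges[candidate["function"]]
    if not low <= candidate["parameter"] <= high:
        return False, "parameter out of range"
    # 5. The command must not conflict with the current operating mode.
    if (target["mode"], candidate["function"]) in conflicts:
        return False, "mode conflict"
    return True, "ok"
```

Returning the failing reason alongside the boolean makes it straightforward to build a user-facing prompt from a failed check.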
[0029] In a second aspect, a control device for a smart device is provided, the control device comprising: an acquisition module, used to acquire the object model description texts of M smart devices in the home scene in response to a voice control command, where the object model description text represents the capabilities and basic attributes of a smart device and M is an integer greater than or equal to 1; a reasoning module, used to perform reasoning on the voice control command and the object model description texts of the M smart devices to obtain a target control command for a target smart device among the M smart devices; and a control module, used to control the target smart device to operate according to the target control command.
[0030] In conjunction with the second aspect, in some possible implementations, the reasoning module is specifically used to: determine whether the voice control command is a command in a preset format; when the voice control command is not a command in a preset format, reason about the voice control command and the object model description text to obtain the target control command, which is a command in a preset format.
[0031] In conjunction with the second aspect and the above implementation methods, in some possible implementation methods, the reasoning module is also used to: control the target smart device to operate according to the voice control command when the voice control command is a command in a preset format.
[0032] In combination with the second aspect and the above implementation methods, in some possible implementation methods, the reasoning module is specifically used for: recognizing and processing voice control commands to obtain the text corresponding to the voice control commands and the recognition result of the text; determining various reasoning information based on the recognition result, or based on the recognition result and the object model description text; and obtaining the target control command based on the various reasoning information.
[0033] Combining the second aspect and the above implementation methods, in some possible implementation methods, for any one of the multiple inference information, the identification result includes at least one keyword and the target adjustment requirement. The inference module is specifically used to: when at least one keyword includes the target keyword corresponding to the inference information, determine the target keyword as the inference information; when at least one keyword does not include the target keyword, obtain the device information of M smart devices; and determine the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text.
[0034] Combining the second aspect and the above implementation methods, in some possible implementation methods, the inference information includes the target function, and the object model description text includes the voice functions supported by M smart devices. The inference module is specifically used to: match the target adjustment requirements with the voice functions supported by the M smart devices to determine the matching function; and determine the matching function as the target function.
[0035] Combining the second aspect and the above implementation methods, in some possible implementation methods, the inference information also includes the target device type, and the object model description text also includes the device types and device models of M smart devices. The inference module is specifically used to: determine the mapping relationship between the device types and supported voice functions of the M smart devices based on the device types, device models and supported voice functions of the M smart devices; match the target function with the mapping relationship to determine the matching device type; and determine the matching device type as the target device type.
[0036] Combining the second aspect and the above implementation methods, in some possible implementation methods, the inference information also includes the target device location. The device information includes the device location. Specifically, the inference module is used to: match the device types of M smart devices with the target device type to determine N matching devices, where N is an integer greater than or equal to 1 and less than or equal to M; when the number N equals 1, determine the device locations of the N matching devices as the target device location; when the number N is greater than 1, obtain the user's historical preference information and environmental awareness information; determine the user's current location based on the environmental awareness information, historical preference information, and the device locations of the N matching devices; and determine the user's current location as the target device location.
[0037] Combining the second aspect and the above implementation methods, in some possible implementation methods, the inference information also includes target parameters, and the device information includes operating status. The inference module is specifically used to: obtain the user's historical preference information and environmental perception information; when the historical preference information includes the user's previous usage parameters for the target function, determine the previous usage parameters as the target parameters; or, when the historical preference information does not include the previous usage parameters, determine the operating status of the target function of the target smart device from the operating status of M smart devices; and determine the target parameters of the target function based on the environmental perception information, or based on the environmental perception information and the operating status of the target function.
[0038] In combination with the second aspect and the above implementation methods, in some possible implementation methods, the reasoning module is specifically used to: generate a candidate control instruction containing multiple reasoning information when the number of results of any reasoning information among multiple reasoning information is equal to 1; and determine the candidate control instruction as the target control instruction when the candidate control instruction passes the verification.
[0039] In conjunction with the second aspect and the above implementation methods, in some possible implementation methods, the reasoning module is also used to: when the number of results of at least one of the multiple reasoning information is greater than 1, generate target prompt information based on at least one reasoning information to prompt the user to input the voice command again; and update the voice control command to the voice command input by the user again.
[0040] Combining the second aspect and the above implementation methods, in some possible implementation methods, the multiple inference information includes the target device type, target device location, target function, and target parameters. The inference module is also used to: determine whether the M smart devices include the target smart device based on the target device type, target device location, and the device types and locations of the M smart devices; when the M smart devices include the target smart device, if the communication status of the target smart device is normal, determine whether the voice function supported by the target smart device includes the target function; when the voice function supported by the target smart device includes the target function, determine whether the target parameters are within the preset parameter range corresponding to the target function; when the target parameters are within the preset parameter range, determine whether the candidate control command conflicts with the current operating mode of the target smart device; when the candidate control command does not conflict with the current operating mode of the target smart device, determine that the candidate control command verification is successful.
[0041] In a third aspect, a smart device is provided, including a memory and a processor. The memory is used to store executable program code, and the processor is used to call and run the executable program code from the memory, causing the smart device to perform the control method in the first aspect or any possible implementation thereof.
[0042] In a fourth aspect, a computer program product is provided, comprising computer program code which, when executed on a computer, causes the computer to perform the control method described in the first aspect or any possible implementation thereof.
[0043] In a fifth aspect, a computer-readable storage medium is provided that stores computer program code which, when executed on a computer, causes the computer to perform the control method described in the first aspect or any possible implementation thereof.
Attached Figure Description
[0044] Figure 1 is a structural diagram of a smart home dialogue system provided in an embodiment of this application; Figure 2 is a structural diagram of a control system for a smart device provided in an embodiment of this application; Figure 3 is a schematic flowchart of a control method for a smart device provided in an embodiment of this application; Figure 4 is a schematic flowchart of another control method for a smart device provided in an embodiment of this application; Figure 5 is a schematic structural diagram of a control device for a smart device provided in an embodiment of this application; Figure 6 is a schematic structural diagram of a smart device provided in an embodiment of this application.
Detailed Implementation
[0045] The technical solutions in this application will be clearly and thoroughly described below with reference to the accompanying drawings. In the description of the embodiments of this application, unless otherwise stated, " / " means "or," for example, A / B can mean A or B. "And / or" in the text is merely a description of the relationship between related objects, indicating that three relationships can exist. For example, A and / or B can represent: A existing alone, A and B existing simultaneously, and B existing alone. Furthermore, in the description of the embodiments of this application, "multiple" refers to two or more than two.
[0046] Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as implying or suggesting relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
[0047] Before introducing the solutions of the embodiments of this application, the following is a definition of the technical terms that may be involved in the embodiments of this application.
[0048] Smart home: Utilizing advanced computer, network communication, and artificial intelligence (AI) technologies, various devices in the home (such as lights, air conditioners, curtains, door locks, refrigerators, etc.) are connected together, enabling them to sense the environment, understand instructions, make autonomous decisions, and work together to create a more comfortable, convenient, safe, and energy-efficient living environment for users.
[0049] Object model: Also known as device model or digital twin model, it is a core abstract concept in the fields of smart home and Internet of Things. It is a digital abstraction of physical devices, used to describe the functions, states, and interaction capabilities of devices in the cloud (i.e., cloud platform). In simple terms, an object model is a standardized template used to describe what a device is, what it can do, and what information it can provide.
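As an illustration of what such a template might contain, the sketch below shows a hypothetical object model record and a helper that renders it as a one-line description text. The field names and the example device are assumptions; the application does not prescribe a concrete schema.

```python
# Hypothetical object model for one device: what it is and what it can do.
AC_MODEL = {
    "device_type": "air_conditioner",
    "device_model": "AC-100",
    "voice_functions": ["set_temperature", "set_mode"],
}


def describe(model):
    """Render an object model as a short description text."""
    functions = ", ".join(sorted(model["voice_functions"]))
    return f'{model["device_type"]} (model {model["device_model"]}) supports: {functions}'
```

A text rendering like this is the kind of input that could be supplied to a reasoning step together with the voice control command.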
[0050] Large Language Model (LLM): A natural language processing model based on deep learning that learns patterns through pre-training on massive amounts of text data, enabling it to understand, generate, and reason about text.
[0051] Paradigm: refers to a set of theoretical frameworks, methods and values that are generally accepted by a discipline or scientific community and are used to guide research and practice.
[0052] After explaining the technical terms that may be involved in this application, the application scenarios of the embodiments of this application are described below.
[0053] With the development of the intelligent era, various smart devices based on IoT and AI technologies are gradually changing and updating people's lifestyles and quality of life, aiming to provide users with an intelligent living environment. For example, in smart home scenarios, common smart devices include, but are not limited to, smart speakers, smart air conditioners, smart refrigerators, and smart curtains.
[0054] The aforementioned smart devices, such as smart speakers, smart air conditioners, smart refrigerators, and smart curtains, differ primarily from traditional home devices in their interactivity. Specifically, traditional devices only support manual operation via remote control or physical buttons, lacking network connectivity. Smart devices, however, leverage network connectivity to integrate into the Internet of Things (IoT) ecosystem and support contactless control methods such as voice commands, mobile applications (APPs), or central control screens. They become a key component of the smart home's interactive system, possessing proactive responsiveness, perception, and decision-making capabilities.
[0055] The architecture and working process of an exemplary dialogue system provided in the embodiments of this application are described in detail below with reference to Figure 1.
[0056] Figure 1 is a structural diagram of a smart home dialogue system provided in an embodiment of this application.
[0057] For example, as shown in Figure 1, the smart home dialogue system (hereinafter referred to as the "dialogue system") 100 mainly consists of the following parts: voice control command 101, dialogue terminal 102, and multiple smart devices that support voice control, such as the smart light 103, smart curtains 104, smart air conditioner 105, and smart refrigerator 106 shown in Figure 1. The main functions of each component are as follows: Voice control command 101 is the user's natural-language voice input. It is the original interaction entry point of the dialogue system 100 and carries the user's control intentions and needs. For example, voice control command 101 can be "Turn on the master bedroom light for me" or "I'm a little hot, adjust the temperature for me".
[0058] As the name suggests, the dialogue terminal 102 is a terminal device used for voice dialogue or voice interaction with a user, and is responsible for receiving voice control commands 101 issued by the user. Optionally, as shown in Figure 1, the dialogue terminal 102 can be a smart speaker, or another type of device, such as a central control screen, a user's smartphone, a smart tablet, or a wearable device. The following description takes a smart speaker as an example of the dialogue terminal 102. Depending on the computing power of the dialogue terminal 102, its role in the dialogue system 100 also varies.
[0059] Optionally, when the computing power of the dialogue terminal 102 is limited, the dialogue terminal 102 acts as a "voice relay station" in the dialogue system 100. Specifically, the dialogue terminal 102 establishes a communication connection with the smart home cloud platform (not shown in Figure 1) and is responsible for receiving and recognizing voice control commands 101, obtaining the recognized text, and sending the recognized text to the cloud platform for processing. In this case, the cloud platform is the core processing hub and serves as the processing terminal for voice control commands 101.
[0060] When the dialogue terminal 102 has sufficient computing power, it functions as the core processing hub, acting as the processing terminal for voice control commands 101. Regardless of whether the processing terminal is the dialogue terminal 102 or a cloud platform, it can establish communication connections with the various controlled smart devices. After receiving the recognized text, the processing terminal processes the text to obtain specific control commands, and then sends these commands to the corresponding smart device, such as the smart air conditioner 105 in Figure 1.
[0061] Therefore, the entire workflow of the dialogue system 100 can be summarized as follows: user issues voice control command 101 → dialogue terminal 102 receives voice control command 101 → processing terminal processes voice control command 101 to obtain a control command → processing terminal sends the control command to the smart device.
[0062] In related technologies, the main problem when users control smart devices via voice is that the voice control commands have certain format requirements. That is, when the user's voice control commands meet these requirements, the smart device can respond correctly. When the user's voice control commands deviate from the fixed format, due to the insufficient generalization and understanding capabilities of the processing terminal, the terminal cannot understand what the user is trying to express. This results in the terminal's inability to accurately identify the user's control intent and thus cannot respond to the user's control needs in a timely and effective manner.
[0063] For example, a preset fixed format library might contain a template like, "Please turn on the kitchen light." The user's actual voice control command might be, "I want to cook, please turn on the light." Because the user's actual voice control command doesn't include the location of the light, the processing terminal cannot obtain an accurate control command when matching the text of the voice control command with the template.
[0064] In addition, the processing terminal lacks sufficient reasoning ability. For example, if the user issues a voice command such as "turn the air conditioner up a bit", the processing terminal cannot infer which function of the air conditioner needs to be adjusted.
[0065] In view of the technical problems existing in the above-mentioned related technologies, this application provides a control method, control device, and smart device for a smart device. The control method, upon receiving a user's voice control command, combines the device object model of the current home scene with the voice control command to infer the target control command for the target smart device in the home scene. There are no restrictions on the form of the voice control command in the above process. That is, regardless of the format of the voice control command output by the user, this control method can generate accurate control commands to control the devices in the home scene. Therefore, this control method can improve the recognition accuracy of voice control commands, thereby increasing the success rate of voice control of smart devices and enhancing the user's interactive experience.
[0066] After introducing the application scenarios of the embodiments of this application, the control method of the embodiments of this application will be introduced below.
[0067] It should be understood that, in the embodiments of this application, the smart device can specifically be a smart device in a smart home scenario (i.e., a home scenario), and may include, but is not limited to, the smart light 103, smart curtains 104, smart air conditioner 105, and smart refrigerator 106 shown in Figure 1, as well as a fresh air system, air purifier, humidifier, dehumidifier, robot vacuum cleaner, washing machine, television, water heater, audio system, smart door lock, smart socket, and other home appliances with network connectivity and automatic control functions. It should be noted that the control method provided in the embodiments of this application is applicable not only to smart home scenarios but also to other smart devices with voice interaction functions, such as mobile terminals and service robots.
[0068] It should also be understood that the control method of the embodiments of this application mainly relies on the control system of a smart device. Before introducing the method, the structure and working principle of the control system of the smart device are first described in detail below with reference to Figure 2.
[0069] Figure 2 is a structural diagram of a control system for a smart device provided in an embodiment of this application.
[0070] For example, as shown in Figure 2 and with reference to Figure 1, the control system 200 of the smart device (hereinafter referred to as the "control system") is integrated in the processing terminal.
[0071] The various components of the control system 200 are divided according to function, mainly including a configuration module 201, an inference module 202, and a verification module 203. The specific functions of each module are as follows: The configuration module 201 is a front-end module in the control system 200, mainly used to enable developers or technicians of intelligent devices to pre-configure relevant information about the intelligent devices. Optionally, the relevant information includes the device object model description file, device function conflict description, voice control function range, timing function range, and supported model range.
[0072] The device object model text is a structured text used to describe the device object model. Specifically, the device object model is a structured description of the capabilities supported or possessed by the intelligent device, its possible states, and its possible parameters. The object model consists of three components: attributes, operations (also known as services), and events.
[0073] The purpose of attributes is to describe specific information and states of a smart device during operation, supporting reading and setting. Examples include the current temperature of a smart air conditioner, the on / off status of a smart light, and the opening degree of smart curtains. In this embodiment, the developer defines the name, data type, value range, and alias of each attribute in the object model description text.
[0074] An operation refers to the capability or method of a smart device that can be invoked externally. Input and output parameters can be set in a service call. Input parameters are the parameters used during service execution, and output parameters are the results after service execution. Examples include "raise temperature" and "cooling mode" for a smart air conditioner, and "brightness adjustment," "on," and "off" for a smart light. In this embodiment, the developer will define the operation's function name, parameter list, value type, value range, and function alias in the object model description text.
[0075] An event refers to an event that a smart device can actively report during operation, such as device alarms and malfunction events. For example, an overload of current in a smart light will report an "overload alarm" event, and a malfunction of the temperature sensor in a smart air conditioner will report a "sensor malfunction" event. In this embodiment, the developer will define the event name, parameters, and triggering conditions in the object model description text.
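As a non-limiting illustration, the three components of an object model described above (attributes, operations, and events) can be sketched as plain data structures. The class names, field names, the model identifier "AC-MODEL-A", and the rendered text layout below are hypothetical and are not specified by this application; the sketch only shows one way a structured description text could be produced.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    data_type: str
    value_range: tuple              # (min, max) or the allowed values
    aliases: list = field(default_factory=list)


@dataclass
class Operation:
    name: str
    parameters: dict                # parameter name -> (value type, value range)
    aliases: list = field(default_factory=list)


@dataclass
class Event:
    name: str
    parameters: list
    trigger: str                    # free-text triggering condition


@dataclass
class ObjectModel:
    model: str
    attributes: list
    operations: list
    events: list

    def describe(self) -> str:
        """Render the object model as a structured description text."""
        lines = [f"model: {self.model}"]
        for a in self.attributes:
            lines.append(f"attribute: {a.name} type={a.data_type} "
                         f"range={a.value_range} aliases={a.aliases}")
        for o in self.operations:
            lines.append(f"operation: {o.name} params={o.parameters} aliases={o.aliases}")
        for e in self.events:
            lines.append(f"event: {e.name} params={e.parameters} trigger={e.trigger}")
        return "\n".join(lines)


# Hypothetical object model for one model of smart air conditioner.
air_conditioner = ObjectModel(
    model="AC-MODEL-A",
    attributes=[Attribute("current_temperature", "float", (-20.0, 50.0), ["room temperature"])],
    operations=[Operation("set_temperature", {"target": ("int", (16, 30))}, ["adjust temperature"])],
    events=[Event("sensor_malfunction", ["sensor_id"], "temperature sensor stops reporting")],
)

print(air_conditioner.describe())
```

Keeping the aliases next to each attribute and operation means the description text itself carries the vocabulary a user might speak, which is what later reasoning steps would rely on.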
[0076] It should be understood that in the embodiments of this application, when defining the device object model, the developers standardized the description of the smart device according to the model level, thereby defining the functional scope and functional constraints of the smart device. Here, the model level refers to the specific model of the smart device. For example, for a smart air conditioner of a certain brand, its model can be KFR-26GW / N8MJA3. That is to say, since smart devices of the same model can perform mostly the same functions, each model corresponds to one object model.
[0077] After configuring object models for all possible models of smart devices on the market in advance, object model description text for each model of smart device can be obtained.
[0078] Device function conflict descriptions are used to describe conflicts in device functions based on model level. For example, a specific model of smart air conditioner may be specified that the fan speed cannot be adjusted in automatic mode, or that the smart air conditioner cannot operate in cooling mode when it is turned off.
[0079] The voice control function scope is used to specify which functions in the device object model are applicable to voice control. As described above, the device object model defines all the functions that a smart device can perform, but not all of these functions support voice control. Therefore, the voice control function scope needs to be defined. For example, the object model of a certain model of smart air conditioner defines 1000 functions, but only 100 functions are available for user voice control. In this case, the voice control function scope of the smart air conditioner only includes these 100 functions.
[0080] The timed function scope is used to define and configure which functions within the voice control function scope support scheduled task planning. Since scheduled task execution is independent of real-time user supervision and occurs automatically without human intervention, it carries inherent risks. Improper use could lead to dangerous accidents. For example, scheduling a microwave oven to heat an empty machine or a washing machine to start automatically not only consumes electricity but could also cause a fire. Therefore, the timed functions supported by smart devices must be carefully configured to mitigate risks from the outset.
[0081] The supported model range is used to constrain the models of smart devices that can support access to the dialogue system. In other words, only smart devices whose models are within the supported model range can be recognized, processed, and have commands issued by the dialogue system.
[0082] The aforementioned descriptions of device function conflicts, voice control function ranges, timing function ranges, and supported model ranges also have a one-to-one correspondence with model levels. In other words, a device model is associated not only with a device object model but also with a description of device function conflicts, a range of voice control functions, a range of timing functions, and a range of supported models. The descriptions of device function conflicts, the range of voice control functions, the range of timing functions, and the range of supported models are essentially configuration tables. This information configured according to model levels can be stored in the processing terminal.
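By way of example only, the model-level configuration tables described above could be held as simple lookup structures. The table layout, the function names, and the model identifier "AC-MODEL-A" are assumptions made for illustration; the application states only that this information is configured per model and stored in the processing terminal.

```python
# Hypothetical per-model configuration tables; all names are illustrative.
SUPPORTED_MODELS = {"AC-MODEL-A"}

MODEL_CONFIG = {
    "AC-MODEL-A": {
        # A function is blocked when the device state matches "forbidden_when".
        "conflicts": [
            {"function": "set_fan_speed", "forbidden_when": {"mode": "auto"}},
            {"function": "set_mode_cool", "forbidden_when": {"power": "off"}},
        ],
        # Functions of the object model that may be invoked by voice.
        "voice_scope": {"power_on", "power_off", "set_temperature",
                        "set_fan_speed", "set_mode_cool"},
        # Functions within the voice scope that may also be scheduled.
        "timer_scope": {"power_on", "power_off"},
    },
}


def voice_controllable(model: str, function: str) -> bool:
    """True if the model may access the dialogue system and the function is in its voice scope."""
    if model not in SUPPORTED_MODELS:
        return False
    return function in MODEL_CONFIG.get(model, {}).get("voice_scope", set())


def schedulable(model: str, function: str) -> bool:
    """A timed function must lie inside the voice control function scope."""
    timer_scope = MODEL_CONFIG.get(model, {}).get("timer_scope", set())
    return voice_controllable(model, function) and function in timer_scope


print(voice_controllable("AC-MODEL-A", "set_temperature"))
print(schedulable("AC-MODEL-A", "set_temperature"))
```

Checking the timed scope only after the voice scope reflects the statement that the timed function scope is defined within the voice control function scope.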
[0083] The reasoning module 202 is the core component of the entire control system 200. It is used to perform reasonable reasoning on the user's voice control commands to obtain the final reasoning result.
[0084] With reference to Figure 1, in a specific home scenario, when a user issues a voice control command, the voice control command is received by the dialogue terminal 102 shown in Figure 1 and processed by the processing terminal, specifically by the inference module 202.
[0085] Specifically, the processing terminal can obtain device information for all smart devices in the current home environment. Device information includes, but is not limited to, device category name (i.e., device type), device name (i.e., the name of the device in the current home environment, such as the master bedroom air conditioner or the living room light), room name, floor name, and the current operating status of each smart device.
[0086] Based on the device information, the reasoning module 202 can further instantiate the device object model in the configuration module 201, that is, find the object models of all smart devices in the current home scene from the object model description file in the configuration module 201 according to the device information.
[0087] Preferably, to improve the accuracy of reasoning, the reasoning module 202 can also acquire environmental perception information, including but not limited to weather, air humidity, temperature, and time.
[0088] Device information, object models of all smart devices in the current home scenario, and environmental perception information are used as inputs to the LLM model. The LLM model combines multidimensional information to perform reasoning and obtain the reasoning result.
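As one possible illustration, the inputs listed above can be assembled into a single prompt for the LLM model. The function name `build_prompt`, the section headings, the instruction wording, and the sample data are all hypothetical; the application does not prescribe a prompt layout, and no LLM is actually called in this sketch.

```python
import json


def build_prompt(command_text, devices, object_model_texts, environment):
    """Assemble one prompt string from the command text and the three inputs above."""
    sections = [
        "You control smart home devices. Infer location, device, function and parameter.",
        "Devices in this home:\n" + json.dumps(devices, ensure_ascii=False, indent=2),
        "Object models:\n" + "\n\n".join(
            f"[{model}]\n{text}" for model, text in sorted(object_model_texts.items())
        ),
        "Environment:\n" + json.dumps(environment, ensure_ascii=False),
        "User command: " + command_text,
    ]
    return "\n\n".join(sections)


prompt = build_prompt(
    "I'm a little hot, adjust the temperature for me",
    [{"device_name": "master bedroom air conditioner", "device_type": "air conditioner",
      "room": "master bedroom", "model": "AC-MODEL-A", "state": {"power": "on"}}],
    {"AC-MODEL-A": "operation: set_temperature target=int range=(16, 30)"},
    {"weather": "sunny", "temperature_c": 31, "time": "14:00"},
)
print(prompt)
```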
[0089] As shown in Figure 2, the inference process of the LLM model is multi-dimensional, including location inference, device inference, function inference, and parameter inference. Location inference refers to determining which spatial area in the home needs to be controlled, such as the living room, master bedroom, bathroom, or kitchen. Device inference refers to determining which smart devices in the home need to be controlled, such as air conditioners, refrigerators, and televisions. Function inference refers to determining which function of the smart device needs to be controlled. Parameter inference refers to determining how much or to what extent the function needs to be adjusted.
[0090] After inference is complete, the LLM model outputs two types of inference results: candidate control commands and target prompts. Specifically, when clear inference information is obtained in each of the above dimensions, the LLM model obtains candidate control commands based on the multi-dimensional inference information. When the inference information in any of the above dimensions is unclear, the LLM model generates target prompts to prompt the user to clarify or re-enter the voice control command.
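The two types of inference result can be sketched as follows, purely as an illustration. The `Inference` class, the `to_output` function, and the prompt wording are hypothetical. The sketch also makes the simplifying assumption that a dimension is "unclear" exactly when its value is `None`, which would need refinement for functions that take no parameter.

```python
from dataclasses import dataclass
from typing import Optional

DIMENSIONS = ("location", "device", "function", "parameter")


@dataclass
class Inference:
    location: Optional[str] = None
    device: Optional[str] = None
    function: Optional[str] = None
    parameter: Optional[object] = None


def to_output(inference: Inference) -> dict:
    """Return a candidate control command, or a target prompt if any dimension is unclear."""
    missing = [d for d in DIMENSIONS if getattr(inference, d) is None]
    if missing:
        return {"type": "target_prompt",
                "text": "Please clarify: " + ", ".join(missing)}
    return {"type": "candidate_command",
            "command": {d: getattr(inference, d) for d in DIMENSIONS}}


print(to_output(Inference("master bedroom", "air conditioner", "set_temperature", 24)))
print(to_output(Inference(device="light", function="set_brightness", parameter=80)))
```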
[0091] When the inference result is a candidate control command, the verification module 203 needs to perform an executability verification on the candidate control command. It should be understood that, since the LLM model may hallucinate during the inference process, the obtained inference result may not be accurate or consistent with reality. Therefore, before actually controlling the smart device, the processing terminal also needs to verify the result through the verification module 203.
[0092] As shown in Figure 2, the verification performed by the verification module 203 includes device verification, function verification, parameter verification, and conflict verification. Device verification refers to verifying, based on device information, whether the inferred smart device exists in the current home scenario and whether the smart device is offline. Function verification refers to verifying, based on the device object model, whether the inferred function is possessed by the smart device in the current home scenario. Parameter verification refers to verifying, based on the object model, whether the inferred function parameters are reasonable, such as whether they are within a predefined value range. Conflict verification refers to verifying, based on the device function conflict description in the configuration module 201, whether there is a conflict between the inferred candidate control command and the current state of the smart device.
[0093] If all the above checks are successful, the verification module 203 confirms that the check has passed and further sends the candidate control command as the final target control command to the inferred smart device so that the smart device can respond in a timely manner. If any of the above checks fails, the verification module 203 determines that the check has failed, generates a failure message, and informs the user that the operation cannot be executed.
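The four checks can be illustrated with the following minimal sketch. The `verify` function, the dictionary layouts, and the failure messages are assumptions made for this example and are not taken from this application. For brevity, each function is assumed to take a single numeric parameter with a (min, max) range.

```python
def verify(command, devices, object_models, conflicts):
    """Return (passed, message) after the four checks, applied in order."""
    # 1. Device verification: the device exists in this home and is online.
    device = devices.get(command["device"])
    if device is None or not device["online"]:
        return False, "device verification failed"
    # 2. Function verification: the object model defines the function.
    functions = object_models.get(device["model"], {})
    if command["function"] not in functions:
        return False, "function verification failed"
    # 3. Parameter verification: the value lies in the predefined range.
    low, high = functions[command["function"]]
    if not low <= command["parameter"] <= high:
        return False, "parameter verification failed"
    # 4. Conflict verification: no rule forbids the function in the current state.
    for rule in conflicts.get(device["model"], []):
        blocked = all(device["state"].get(k) == v
                      for k, v in rule["forbidden_when"].items())
        if rule["function"] == command["function"] and blocked:
            return False, "conflict verification failed"
    return True, "verification passed"


# Hypothetical home state and configuration.
devices = {"living room air conditioner":
           {"model": "AC-MODEL-A", "online": True,
            "state": {"power": "on", "mode": "auto"}}}
object_models = {"AC-MODEL-A": {"set_temperature": (16, 30), "set_fan_speed": (1, 3)}}
conflicts = {"AC-MODEL-A": [{"function": "set_fan_speed",
                             "forbidden_when": {"mode": "auto"}}]}

candidate = {"device": "living room air conditioner",
             "function": "set_fan_speed", "parameter": 2}
print(verify(candidate, devices, object_models, conflicts))
```

In this example the fan speed command passes the device, function, and parameter checks but fails conflict verification, because the hypothetical conflict rule forbids fan speed adjustment in automatic mode.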
[0094] Therefore, the above process is the voice control process of the intelligent device based on the control system 200.
[0095] After introducing the control system of the smart device, the control method for a smart device provided in the embodiments of this application is described below with reference to Figure 3. It should be understood that this control method can be applied to the aforementioned processing terminal of Figure 1. Specifically, this processing terminal integrates the control system 200 shown in Figure 2, and the processing terminal executes the control method through the control system 200. Optionally, the processing terminal can be either a cloud platform or a dialogue terminal; the embodiments of this application do not limit this. In the following description, a cloud platform is used as the processing terminal and a smart speaker is used as the dialogue terminal for illustration.
[0096] Figure 3 is a schematic flowchart of a control method for a smart device provided in an embodiment of this application.
[0097] For example, as shown in Figure 3, the control method 300 includes the following steps S301 to S303.
[0098] S301, in response to a voice control command, obtain the object model description text of M smart devices in the home scene. The object model description text is used to represent the capabilities and basic attributes of the smart devices, where M is an integer greater than or equal to 1.
[0099] In the smart home dialogue system of a home shown in Figure 1, a user can issue a corresponding voice control command when a smart device needs to be adjusted. The voice control command is received by the smart speaker, which can forward it to the cloud platform for processing. After receiving the voice control command, the cloud platform triggers the voice control command processing flow.
[0100] It should be understood that the core point of this application embodiment is that, for the voice control commands output by the user, even if the voice control commands are not in a standard format or are ambiguous, the cloud platform is no longer limited to strictly matching preset keywords or command formats to output results, but can perform reasonable reasoning and verification to ultimately generate accurate and executable control commands, instead of completely ignoring them.
[0101] Based on this, and as explained with reference to Figure 2, the cloud platform pre-stores pre-configured object model description texts for various models of smart devices, defining the functions that these devices can perform. When reasoning about voice control commands, the cloud platform first needs to obtain the smart devices present in the current home scenario and their object model description texts to determine the functional scope of the smart devices in the current home scenario, providing a precise range and constraint basis for subsequent intent reasoning and command verification.
[0102] The following section details the process of obtaining the object model description text for M smart devices in a home scenario.
[0103] It should be understood that in a smart home system, each user (i.e., a family) is bound to a unique identifier (ID, i.e., a family ID) on the cloud platform. Each smart device in the home is bound to this family ID upon its first network connection and activation, and this information is recorded in the cloud platform. At the same time, the user also configures a unique name for each smart device, such as the living room air conditioner or the master bedroom air conditioner. During operation, smart devices also report their operational status to the cloud platform in real time or periodically. Furthermore, the smart speaker initiating voice interaction is itself a registered smart device bound to the family ID and possesses a unique device ID.
[0104] When forwarding a voice control command to the cloud platform, the smart speaker includes its own device ID. Upon receiving the smart speaker's device ID, the cloud platform queries the family ID bound to that device ID to determine the home scenario associated with the voice control command. After determining the family ID, the cloud platform can then query the device information of the M smart devices within that home scenario.
[0105] Device information is used to represent the basic attributes and current operating status of the device. Optionally, device information includes, but is not limited to, device ID, device type, device name, device location, operating status, and device model. The device name refers to the name of the smart device in a home setting, such as the living room air conditioner or master bedroom air conditioner mentioned above.
[0106] Based on the device model information of M smart devices, the cloud platform can retrieve the object model description text for each of the M smart devices. Referring to the aforementioned content of the object model description text, it represents the capabilities and basic attributes of the smart devices.
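The lookup chain of step S301 (speaker device ID, then family ID, then device information, then object model description text) can be sketched as follows. The in-memory tables, identifiers, and the function name `object_model_texts_for` are hypothetical stand-ins for whatever storage the cloud platform actually uses.

```python
# Hypothetical cloud-side tables; identifiers and contents are illustrative.
DEVICE_TO_FAMILY = {"speaker-001": "family-42"}

FAMILY_DEVICES = {
    "family-42": [
        {"device_id": "ac-01", "device_name": "living room air conditioner",
         "device_type": "air conditioner", "location": "living room",
         "status": {"power": "off"}, "model": "AC-MODEL-A"},
        {"device_id": "light-01", "device_name": "kitchen light",
         "device_type": "light", "location": "kitchen",
         "status": {"power": "on"}, "model": "LIGHT-MODEL-B"},
    ],
}

OBJECT_MODEL_TEXTS = {
    "AC-MODEL-A": "operation: set_temperature target=int range=(16, 30)",
    "LIGHT-MODEL-B": "operation: set_brightness level=int range=(0, 100)",
}


def object_model_texts_for(speaker_device_id):
    """Resolve speaker device ID -> family ID -> M devices -> object model texts."""
    family_id = DEVICE_TO_FAMILY[speaker_device_id]
    devices = FAMILY_DEVICES[family_id]
    # The device model is the key into the pre-configured description texts.
    return {d["device_id"]: OBJECT_MODEL_TEXTS[d["model"]] for d in devices}


texts = object_model_texts_for("speaker-001")
for device_id, text in texts.items():
    print(device_id, "->", text)
```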
[0107] S302, reason about the voice control command and the object model description text of M smart devices to obtain the target control command of the target smart device among the M smart devices.
[0108] Based on the core points of the aforementioned embodiments of this application, when processing voice control commands, the cloud platform can first identify whether the voice control command is a clear voice command, that is, whether it is a command with a preset format. A command with a preset format refers to a standardized command that follows a fixed phrase pre-configured by the cloud platform and contains a device and an operation action. This type of command can be accurately identified and executed through keyword matching. For example, a fixed phrase could be: "Open the xx (device type)," "Open the xx (device type) in xx (location)," or "Adjust the xx (device type) in xx (location) to xx (parameter)." Furthermore, the cloud platform can perform subsequent processing based on the recognition results of the voice control command.
[0109] In one possible implementation, reasoning about the voice control command and the object model description text of the M smart devices to obtain the target control command for the target smart device among the M smart devices includes: determining whether the voice control command is a preset format command; and, when the voice control command is not a preset format command, reasoning about the voice control command and the object model description text to obtain the target control command, where the target control command is a preset format command.
[0110] Specifically, the process of recognizing voice control commands involves first converting the voice control commands into text using Automatic Speech Recognition (ASR) technology, then matching the text with keywords in a fixed sentence structure to determine character similarity. If the character similarity is greater than a preset similarity, the voice control command is determined to be a command in a preset format.
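As a rough sketch of the character similarity test, the example below compares the recognized text against expanded fixed phrases using `difflib.SequenceMatcher` from the Python standard library. The choice of `SequenceMatcher` as the similarity measure, the phrase templates, the vocabulary, and the threshold of 0.8 are all assumptions; the application does not name a particular similarity metric or threshold. The ASR step is omitted, so the input is already text.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical fixed phrases, vocabulary, and preset similarity.
TEMPLATES = ["turn on the {device}", "turn on the {location} {device}"]
LOCATIONS = ["kitchen", "living room", "master bedroom"]
DEVICES = ["light", "air conditioner"]
THRESHOLD = 0.8


def expanded_phrases():
    """Fill every template slot with every known location and device."""
    phrases = set()
    for template, location, device in product(TEMPLATES, LOCATIONS, DEVICES):
        # str.format ignores keyword arguments that a template does not use.
        phrases.add(template.format(location=location, device=device))
    return phrases


def best_similarity(text):
    """Highest character-level similarity between the text and any fixed phrase."""
    text = text.lower().strip()
    return max(SequenceMatcher(None, text, p).ratio() for p in expanded_phrases())


def is_preset_format(text):
    """A command counts as preset format if its similarity exceeds the threshold."""
    return best_similarity(text) > THRESHOLD


print(is_preset_format("Turn on the kitchen light"))
print(is_preset_format("I want to cook, it is too dark in here"))
```

A command that matches a fixed phrase exactly scores 1.0, while free-form wording that shares few characters with any fixed phrase falls below the threshold and is routed to reasoning.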
[0111] When the voice control command is not in a preset format, the methods used in related technologies cannot effectively recognize it. However, in this embodiment, the cloud platform can perform reasonable reasoning by combining the object model description text with such voice control commands to obtain a target control command that can be accurately issued. Since the target control command can be directly issued and used to cause the device to perform the corresponding action, it is a preset format command.
[0112] In this embodiment, when reasoning about voice control commands, the control method first determines whether the format of the voice control command is a preset format. If the voice control command is not in the preset format, for such irregular voice control commands, the control method can use reasoning to deduce the target control command in the preset format, taking into account the capabilities of devices in the home environment. Therefore, this control method can ultimately convert irregular voice control commands into target control commands that can be used to control devices, thereby avoiding the problem of smart devices failing to respond when voice control commands are irregular, and improving the success rate of voice interaction and the user's voice interaction experience.
[0113] When the voice control command is a preset format command, the cloud platform does not need to perform inference and can directly control the smart device corresponding to the voice control command.
[0114] In one possible implementation, the control method further includes: when the voice control command is a preset format command, controlling the target smart device to operate according to the voice control command.
[0115] When the voice control command is a preset format command, the cloud platform can identify the smart device indicated by the voice control command among the M smart devices as the target smart device to be controlled, and send the command to the target smart device so that the target smart device can execute the command.
[0116] In this embodiment, when the voice control command conforms to the preset format, there is no need to reason about the voice control command; the target smart device can be directly controlled to respond and execute, which can greatly improve the response efficiency of the smart device and improve the efficiency of voice interaction.
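The branching between direct control and reasoning can be illustrated as a small dispatcher. The function `handle_command` and the four callables it receives are hypothetical placeholders. In particular, verification of an inferred command is assumed to happen inside the `infer` callable and is not shown here.

```python
def handle_command(text, is_preset_format, parse_preset, infer, send):
    """Route a recognized command text along the direct path or the reasoning path."""
    if is_preset_format(text):
        # Preset format: no reasoning needed, control the device directly.
        command = parse_preset(text)
        path = "direct"
    else:
        # Irregular wording: reason out a preset format target control command.
        command = infer(text)
        path = "inferred"
    send(command)
    return path, command


# Stub callables used only to exercise the dispatcher.
sent = []
path, command = handle_command(
    "turn on the kitchen light",
    is_preset_format=lambda t: t.startswith("turn on the "),
    parse_preset=lambda t: {"device": t[len("turn on the "):], "function": "power_on"},
    infer=lambda t: {"device": "kitchen light", "function": "power_on"},
    send=sent.append,
)
print(path, command)
```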
[0117] Specifically, when the voice control command is not a preset format command, the cloud platform's processing procedure is as follows.
[0118] Based on the foregoing introduction and as shown in Figure 2, the cloud platform needs to perform reasoning by combining the voice control command and the object model description text to obtain the target control command for the target smart device to be controlled. The target control command can be understood as a command that the target smart device can directly respond to; that is, the target control command is a command with a preset format.
[0119] In one possible implementation, reasoning about the voice control command and the object model description text of the M smart devices to obtain the target control command includes: processing the voice control command to obtain the corresponding text and the recognition result of the text; determining multiple pieces of inference information based on the recognition result, or based on the recognition result and the object model description text; and obtaining the target control command based on the multiple pieces of inference information.
[0120] With reference to Figure 2, the above reasoning process is implemented using an existing LLM model and mainly relies on the LLM model's deep understanding of natural language.
[0121] Voice control commands are essentially audio signals and cannot be directly processed by machines. The cloud platform needs to first recognize and process the voice control commands to obtain their corresponding text and the recognition results.
[0122] The process of obtaining the text corresponding to the voice control command relies on ASR technology. The determination of the text recognition result depends on rule or template matching, or is achieved through a Natural Language Understanding (NLU) model.
[0123] Optionally, the text recognition result includes at least one keyword and a target adjustment requirement. The at least one keyword can be extracted from the text through entity extraction and semantic parsing. This keyword may include device attribute keywords and action tendency keywords. Based on action tendency and device attributes, combined with common sense or semantic mapping rules in the smart home field, the LLM model can infer the user's target adjustment requirement.
[0124] For example, if the text obtained after voice control command processing using ASR technology is, "It's too dark at home, please brighten it a bit," the LLM model performs entity extraction on the text, extracting the keywords "brighten" and "a bit," thus determining the device attribute keyword as brightness and the action tendency keyword as "increase" or "raise." Furthermore, the LLM model, based on the attribute and action tendency, determines the target adjustment need as increasing the lighting brightness.
[0125] Another example: if the voice control command, after being processed by ASR technology, yields the text "Turn the temperature down a bit," the LLM model performs entity extraction on the text, extracting the keywords "temperature," "down," and "a bit." This identifies the device attribute keyword as "temperature" and the action tendency keyword as "reduce." Furthermore, based on the attribute and action tendency, the LLM model determines the target adjustment requirement as lowering the temperature.
[0126] Based on the above recognition result, or based on the recognition result and the object model description text, the LLM model can determine multiple types of inference information for the target smart device. These include the location reasoning, device reasoning, function reasoning, and parameter reasoning shown in Figure 2. Based on the obtained inference information, the LLM model can generate the target control command for the target smart device.
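The rule- or template-matching route mentioned in [0122] can be illustrated with a minimal sketch applied to the two example sentences above. The keyword tables, the `RecognitionResult` type, and the `recognize` function are hypothetical names introduced only for illustration; the application does not specify them, and an NLU model or the LLM model could produce the same result instead.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical keyword tables (assumptions for illustration only).
ATTRIBUTE_KEYWORDS = {"dark": "brightness", "brighten": "brightness",
                      "temperature": "temperature"}
ACTION_KEYWORDS = {"brighten": "increase", "up": "increase", "down": "decrease"}


@dataclass
class RecognitionResult:
    keywords: List[str]          # keywords found in the text
    attribute: Optional[str]     # device attribute keyword
    action: Optional[str]        # action tendency keyword

    @property
    def target_adjustment(self) -> Optional[str]:
        # The target adjustment requirement combines action and attribute.
        if self.attribute and self.action:
            return f"{self.action} {self.attribute}"
        return None


def recognize(text: str) -> RecognitionResult:
    # Lowercase each word and strip surrounding punctuation.
    words = [w.strip(".,!?\"'").lower() for w in text.split()]
    keywords = [w for w in words
                if w in ATTRIBUTE_KEYWORDS or w in ACTION_KEYWORDS]
    attribute = next((ATTRIBUTE_KEYWORDS[w] for w in keywords
                      if w in ATTRIBUTE_KEYWORDS), None)
    action = next((ACTION_KEYWORDS[w] for w in keywords
                   if w in ACTION_KEYWORDS), None)
    return RecognitionResult(keywords, attribute, action)
```

With these assumed tables, `recognize("Turn the temperature down a bit").target_adjustment` yields `"decrease temperature"`, matching the target adjustment requirement described in the second example.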
[0127] In this embodiment of the application, when processing voice control commands, the corresponding text recognition result is obtained first, which can accurately capture the user's core needs and ensure that the generation of subsequent control commands is closely related to the user's core needs, thus avoiding the problem of inaccurate control of smart devices.
[0128] Specifically, the process of determining multiple inference information can be divided into the following two scenarios.
[0129] In one possible implementation, the recognition result includes at least one keyword and the target adjustment requirement, and, for any one of the multiple types of inference information, determining the inference information based on the recognition result, or based on the recognition result and the object model description text, includes: when the at least one keyword includes the target keyword corresponding to the inference information, determining the target keyword as the inference information; when the at least one keyword does not include the target keyword, obtaining the device information of the M smart devices, and determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text.
[0130] The multiple types of inference information include location inference information, device inference information, function inference information, and parameter inference information. As the name suggests, each type is information obtained through inference by the LLM model. Based on the at least one keyword in the text recognition result, the LLM model in this embodiment adopts an on-demand inference mechanism: inference is triggered for a given type of inference information only if the at least one keyword does not contain the corresponding target keyword; if the at least one keyword does contain the target keyword, that target keyword is directly determined as the inference information.
[0131] For example, taking device inference information as the inference information, if at least one keyword contains the target keyword "air conditioner" which corresponds to the device inference information, the LLM model directly uses air conditioner as the device inference information.
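The on-demand mechanism described above can be sketched as follows. This is a minimal illustration, not the application's implementation: `resolve_inference`, the `target_keywords` table, and the `infer` callback (standing in for the LLM model's inference step) are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple


def resolve_inference(
    kind: str,
    keywords: List[str],
    target_keywords: Dict[str, Set[str]],
    infer: Callable[[str], Optional[str]],
) -> Tuple[Optional[str], bool]:
    """Return (inference information, whether inference was triggered)."""
    # If a recognized keyword is already a target keyword for this kind
    # of inference information, use it directly and skip inference.
    for keyword in keywords:
        if keyword in target_keywords.get(kind, set()):
            return keyword, False
    # Otherwise fall back to the (hypothetical) inference callback.
    return infer(kind), True
```

For instance, with `target_keywords = {"device": {"air conditioner"}}` and keywords containing "air conditioner", the function returns `("air conditioner", False)` without calling `infer`.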
[0132] When the at least one keyword does not include the target keyword, the LLM model requires, during inference, not only the target adjustment requirement in the recognition result and the object model description text, but also the device information of the M smart devices. The process for obtaining the device information of the M smart devices is described above.
[0133] Among these, the target adjustment requirement is used to reflect the user's core intent during the inference process, providing direction for reasoning. Device information for M smart devices provides the actual devices and their status in the current home scenario, offering a suitable scope for device inference. The object model description text limits the capabilities of the devices in the current home scenario, providing a suitable scope for functional inference.
[0134] The following sections describe the specific process by which the LLM model obtains inference information when at least one keyword does not include the target keyword corresponding to that inference information, according to the different types of inference information.
[0135] (1) The inference information is the target function. In one possible implementation, the inference information includes the target function, and the object model description text includes the voice functions supported by the M smart devices. Determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: matching the target adjustment requirement with the voice functions supported by the M smart devices to determine a matching function; and determining the matching function as the target function.
[0136] The target function corresponds to the function reasoning shown in Figure 2 above.
[0137] It should be understood that users' target adjustment needs are usually strongly correlated with functions. Therefore, when reasoning with multiple pieces of inference information, the LLM model can first determine which function the user currently wants to adjust. When the inference information is the target function, if at least one keyword does not include the target keyword corresponding to the target function, then the function the user wants to adjust is not directly given in the voice control command, and the LLM model needs to infer the target function.
[0138] It should also be understood that, as described above, the object model description text includes the functions supported by the M smart devices. Therefore, during the inference process, in addition to the target adjustment requirement, the LLM model also needs to constrain the inference result as much as possible based on the functions supported by the M smart devices.
[0139] It should also be understood that the embodiments of this application mainly focus on the voice-controllable functions of smart devices. Therefore, as shown in Figure 2, the LLM model can first select the supported voice functions from the functions supported by the M smart devices based on the supported model range and the voice control function range, and then match the supported voice functions with the target adjustment requirement to obtain the matching function.
[0140] During the inference process, the LLM model can match the target adjustment requirements with the voice functions supported by M smart devices to determine the matching functions.
[0141] Specifically, the LLM model obtains the matching function as follows: it converts the target adjustment requirement and each supported voice function (including function name and function alias) into high-dimensional semantic vectors, calculates the semantic similarity between the semantic vector of the target adjustment requirement and the semantic vector of each supported voice function, and sorts the resulting semantic similarities.
[0142] When one of the semantic similarities is greater than or equal to a first preset similarity, the LLM model can determine the voice function corresponding to that semantic similarity as the final matching function. Optionally, the first preset similarity can be 95%.
[0143] If none of the multiple semantic similarities has a similarity greater than or equal to the first preset similarity, it indicates that there is no suitable voice function among the voice functions supported by the M smart devices. The LLM model attempts to call upon the general knowledge of the domain learned during its internal training to generate a new function from its own training knowledge base that is semantically related to the target adjustment requirement but may not be within the scope of voice functions supported by the M smart devices, as the matching function.
[0144] After obtaining the matching function, the LLM model can determine the matching function as the target function. Therefore, the target function may belong to the voice functions supported by the M smart devices, or, due to model hallucination, it may not.
[0145] In this embodiment, during the reasoning process for the target function, the target adjustment requirement is matched with the voice functions supported by M smart devices to obtain the target function. The above reasoning process uses the voice functions supported by the M smart devices as a constraint to minimize the possibility that the target function is out of step with the capabilities of the devices in the current home environment. It also ensures that the matched function can be controlled by voice, guaranteeing the effectiveness of voice interaction. Furthermore, by filtering and determining suitable functions based on the user's actual adjustment intentions, a high degree of binding between the user's adjustment intentions and the functions is achieved, which improves the accuracy of smart device control.
[0146] The above process corresponds to the case where a target function exists. If none of the semantic similarities is greater than or equal to the first preset similarity, there is no suitable voice function among the voice functions supported by the M smart devices. In this case, if the LLM model still cannot generate a semantically related new function even after invoking the general domain knowledge learned during training, it directly outputs a null value, indicating that it cannot infer the target function.
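The threshold-based matching in [0141] to [0146] can be sketched as below. This is an illustration under stated assumptions: the semantic vectors are taken as given (the application does not name an embedding method), cosine similarity is assumed as the similarity measure, the highest-ranked function is chosen when several pass the threshold, and `fallback` is a hypothetical stand-in for the LLM model generating a function from its own knowledge. Only the 0.95 default comes from the text.

```python
import math
from typing import Callable, Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def match_function(
    requirement_vec: Sequence[float],
    function_vecs: Dict[str, Sequence[float]],
    threshold: float = 0.95,
    fallback: Optional[Callable[[], Optional[str]]] = None,
) -> Optional[str]:
    # Score every supported voice function and sort by similarity.
    ranked = sorted(
        ((cosine(requirement_vec, vec), name)
         for name, vec in function_vecs.items()),
        reverse=True,
    )
    # A supported function wins if it reaches the preset similarity.
    if ranked and ranked[0][0] >= threshold:
        return ranked[0][1]
    # Otherwise try the hypothetical fallback (may be outside the
    # supported set); None plays the role of the null value.
    return fallback() if fallback is not None else None
```

Returning `None` when neither the supported set nor the fallback yields a function mirrors the null-value output described in [0146].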
[0147] (2) The inference information is the target device type. In one possible implementation, the inference information further includes the target device type, and the object model description text further includes the device types and device models of the M smart devices. Determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: determining, based on the device types, device models, and supported functions of the M smart devices, the mapping relationship between the device types and the supported voice functions of the M smart devices; matching the target function against the mapping relationship to determine a matching device type; and determining the matching device type as the target device type.
[0148] After identifying the target function, which is a function possessed by a smart device, the LLM model can further infer the target device type. The target device type corresponds to the device reasoning shown in Figure 2 above. A device type is the name of a device category, such as air conditioner, refrigerator, or television.
[0149] In the object model description text of M smart devices, the same device type may correspond to at least one device model, and there is a certain mapping relationship between different device models and the voice functions supported by the smart devices. Based on this, the LLM model can first determine the correspondence between the device types and device models of the M smart devices, and then determine the correspondence between each device model and the supported voice functions, thereby indirectly obtaining the mapping relationship between device types and supported voice functions.
[0150] Based on the above mapping relationship, the LLM model can match the target function with the mapping relationship to obtain the matching device type.
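The indirect mapping from device type to supported voice functions can be sketched as follows. The dictionary layout of `devices` and `model_functions` is a hypothetical representation of the object model description text. For brevity, the sketch uses exact string matching on the function name in place of the semantic-similarity matching described in the text.

```python
from typing import Dict, Iterable, List, Set


def build_type_to_functions(
    devices: Iterable[Dict[str, str]],
    model_functions: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    # device type -> device model -> supported voice functions,
    # collapsed into device type -> supported voice functions.
    mapping: Dict[str, Set[str]] = {}
    for device in devices:
        functions = model_functions.get(device["model"], set())
        mapping.setdefault(device["type"], set()).update(functions)
    return mapping


def match_device_types(
    target_function: str, mapping: Dict[str, Set[str]]
) -> List[str]:
    # Exact-match stand-in for the semantic matching step.
    return sorted(t for t, funcs in mapping.items()
                  if target_function in funcs)
```

A result with exactly one device type corresponds to an unambiguous target device type; an empty list corresponds to the case where no device type among the M smart devices has the target function.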
[0151] Specifically, when matching the target function against the mapping relationship, the LLM model can convert the target function into a high-dimensional semantic vector, convert each supported voice function in the mapping relationship into a high-dimensional semantic vector, and calculate the semantic similarity between the semantic vector of the target function and the semantic vector of each supported voice function.
[0152] When a semantic similarity greater than or equal to a second preset similarity exists among multiple semantic similarities, the LLM model can determine the device type corresponding to the voice function corresponding to the semantic similarity greater than or equal to the second preset similarity as the final matching device type. Optionally, the second preset similarity can be the same as or different from the first preset similarity; this application embodiment does not limit this.
[0153] When none of the multiple semantic similarities has a similarity greater than or equal to the second preset similarity, it indicates that the device types of the M smart devices do not include device types with the target function. The LLM model attempts to call its internally pre-trained knowledge base of device types and functions, matches a function close to the target function from the functions in the knowledge base, and uses the device type associated with that function as the matched device type. In this case, the matched device type is semantically related to the target function but may not be within the range of device types of the M smart devices.
[0154] After obtaining the matching device type, the LLM model can determine the matching device type as the target device type. Therefore, the target device type may belong to the device types of the M smart devices, or, due to model hallucination, it may not.
[0155] In this embodiment, the target device type is matched through the mapping relationship between the target function and the function and device type. Since the target function reflects the user's current functional needs, the above process essentially transforms the user's functional needs into a clear device type reference, achieving a match between user needs and device capabilities. Furthermore, in the matching process, the device types of M smart devices are used to constrain the inferred target device type as much as possible, ensuring that the inference result conforms as closely as possible to the current home scenario, guaranteeing the rationality of the target device type, and providing a precise control object for subsequent device control.
[0156] The above process corresponds to the case where a target device type exists. Furthermore, if none of the multiple semantic similarities have a similarity greater than or equal to a second preset similarity, it indicates that the device types of the M smart devices do not include those possessing the target function. In this case, if the LLM model calls its internally pre-trained knowledge base of device types and functions, and still cannot match a function close to the target function from the knowledge base, the LLM model will directly output a null value, indicating that it cannot infer the target device type.
[0157] (3) The inference information is the target device location. In one possible implementation, the inference information further includes the target device location, and the device information includes the device location. Determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: matching the device types of the M smart devices with the target device type to determine N matching devices, where N is an integer greater than or equal to 1 and less than or equal to M; when N equals 1, determining the device location of the matching device as the target device location; when N is greater than 1, obtaining the user's historical preference information and environmental perception information, determining the user's current location based on the environmental perception information, the historical preference information, and the device locations of the N matching devices, and determining the user's current location as the target device location.
[0158] After inferring the target device type, since there may be multiple smart devices of the same type in a home setting (e.g., a smart air conditioner in the master bedroom and another in the living room), the LLM model needs to further infer the location of the target device. The target device location corresponds to the location reasoning (room reasoning) shown in Figure 2 above.
[0159] It should be understood that since the device information provides the locations of M smart devices in the current home scenario, the LLM model needs to combine the known locations of the M smart devices in the device information to make reasonable inferences when inferring the location of the target device.
[0160] Specifically, during the inference process, the LLM model can convert the target device type into a high-dimensional semantic vector, and also convert the device types of the M smart devices into high-dimensional semantic vectors. Furthermore, the LLM model calculates the semantic similarity between the semantic vector of the target device type and the semantic vectors of the device types of the M smart devices.
[0161] When a semantic similarity greater than or equal to a third preset similarity exists among multiple semantic similarities, the LLM model identifies the devices associated with the device type corresponding to the semantic similarity greater than or equal to the third preset similarity as N matching devices. This step is equivalent to determining at least one smart device with the target device type from M smart devices when the semantic similarity condition is met. Optionally, the third preset similarity can be the same as or different from the first and second preset similarities; this embodiment does not limit this.
[0162] When no semantic similarity greater than or equal to a third preset similarity exists among multiple semantic similarities, the LLM model attempts to match device types from its internally pre-trained knowledge base. It then matches a device type from the knowledge base that is close to the target device type, and identifies N matching devices associated with this close device type in the knowledge base. In this case, the matching device type is semantically related to the target device type but may not be among the M smart devices.
[0163] As mentioned earlier, in the same home setting, one device type may correspond to multiple smart devices. Therefore, the LLM model also needs to determine the location of the target device according to the different numbers N of matching devices.
[0164] In one scenario, when the number N equals 1, it means that there is only one device corresponding to a device type that is close to the target device type. The LLM model can directly determine the device position of the N matching devices as the target device position.
[0165] This situation can be further divided into two scenarios. The first scenario is where the N matching devices belong to a set of M smart devices, meaning that among multiple semantic similarities, there exists a semantic similarity greater than or equal to a third preset similarity. In this case, the device locations of the N matching devices must also fall within the range of device locations of the M smart devices. The second scenario is where the N matching devices do not belong to the set of M smart devices, meaning that among multiple semantic similarities, there is no semantic similarity greater than or equal to the third preset similarity. The N matching devices are devices generated by the LLM model based on its internal knowledge base. In this scenario, the device locations of the N matching devices refer to the device locations of the N matching devices in the knowledge base, and do not fall within the range of device locations of the M smart devices.
[0166] In another scenario, when the number N is greater than 1, it indicates that there is more than one device corresponding to the device type that is similar to the target device type. The LLM model needs to further combine the user's historical preference information and the current environmental perception information to reasonably infer the user's current location and determine the user's current location as the target device location.
[0167] Optionally, the environmental perception information may include the current time, and the user's historical preference information may include the user's historical usage time range for M smart devices.
[0168] Similar to the previous scenario, this situation can be further divided into two scenarios. The first scenario is that N matching devices belong to a set of M smart devices. In this case, the LLM model can first filter the historical usage time ranges of N matching devices from the user's historical usage time ranges of the M smart devices, and then match the current time with the historical usage time ranges of the N matching devices to determine which historical usage time range the current time belongs to. This determines the device location corresponding to the historical usage time range in which the current time is located as the user's current location.
[0169] For example, if the target device type is an air conditioner, the LLM model finds two matching devices (N = 2), whose device locations are the master bedroom and the living room, respectively. Based on the user's historical usage time ranges for the master bedroom air conditioner and the living room air conditioner, together with the current time, the LLM model determines that the user is currently in the living room, and thus the target device location is the living room.
[0170] The second scenario involves N matching devices that do not belong to the set of M smart devices. In this scenario, the LLM model can no longer refer to the user's historical preference information, but only determines the user's current location based on the current time and learned prior knowledge. Specifically, the LLM model can match the usage time ranges of the N matching devices with the current time based on prior knowledge in its internal knowledge base, determine the usage time range in which the current time falls, and then determine the user's current location from the device locations of the N matching devices that fall within the usage time range of the current time. In other words, in this case, the user's current location is determined based on environmental perception information and the device locations of the N matching devices.
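The location inference for N = 1 and N > 1 can be sketched as below. This is a simplified illustration: the device records, the `usage_ranges` table keyed by a hypothetical device `id`, and the half-open time ranges are assumptions. In the second scenario, the same function could be fed usage ranges drawn from prior knowledge instead of user history.

```python
from datetime import time
from typing import Dict, List, Optional, Tuple

TimeRange = Tuple[time, time]


def in_range(start: time, end: time, now: time) -> bool:
    # Half-open range [start, end); handles ranges that cross midnight.
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def infer_location(
    matches: List[Dict[str, str]],
    usage_ranges: Dict[str, List[TimeRange]],
    now: time,
) -> Optional[str]:
    if not matches:
        return None
    # N == 1: the single matching device's location is the target.
    if len(matches) == 1:
        return matches[0]["location"]
    # N > 1: pick the device whose usage time range contains the
    # current time; its location is taken as the user's location.
    for device in matches:
        for start, end in usage_ranges.get(device["id"], []):
            if in_range(start, end, now):
                return device["location"]
    return None
```

With a master bedroom air conditioner typically used from 22:00 to 07:00 and a living room air conditioner used from 18:00 to 22:00, a current time of 19:30 resolves to the living room, consistent with the example in [0169].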
[0171] In this embodiment, by matching the device types of M smart devices with the target device type, N matching devices are quickly selected, effectively narrowing down the range of device locations. When the number of matching devices is 1, the location of that device is directly determined as the target device location, avoiding the complex reasoning process for device location. When the number of matching devices is greater than 1, the user's current location is accurately inferred by combining the user's usage habits of devices in the home scenario and the real-time environmental status. This allows the determination of device location to be highly matched with the user's preferences, achieving personalized determination of device location.
[0172] (4) The inference information is the target parameter. In one possible implementation, the inference information further includes the target parameter, and the device information includes the operating status. Determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: obtaining the user's historical preference information and environmental perception information; when the historical preference information includes the parameters of the user's previous use of the target function (the previous usage parameters), determining the previous usage parameters as the target parameter; or, when the historical preference information does not include the previous usage parameters, determining the operating status of the target function from the operating statuses of the M smart devices, and determining the target parameter of the target function based on the environmental perception information, or based on the environmental perception information and the operating status of the target function.
[0173] The target parameter corresponds to the parameter reasoning shown in Figure 2 above.
[0174] This step, based on the identified target function, further deduces the target parameters required to control that function. To ensure that the adjustment of the target function aligns with user preferences, the determination of target parameters can be combined with the user's historical preference information, environmental perception information, and the operating status of M smart devices.
[0175] Optionally, historical preference information includes historical usage parameters of the user for the voice functions supported by the M smart devices. Environmental perception information includes at least one of air humidity, air temperature, ambient light intensity, current weather type, and current time. The operating status of the smart devices includes the current function of the smart device and parameters of that function.
[0176] The historical usage parameters of the voice functions supported by the M smart devices are stored on the cloud platform each time a user uses the function. Therefore, the cloud platform can directly access these parameters. Air humidity, air temperature, and current weather type can be obtained by the cloud platform through a third-party weather query interface. The cloud platform can obtain the operating status of the M smart devices in real time.
[0177] The LLM model can iterate through the historical usage parameters of the voice functions supported by M smart devices to determine whether they contain the previous usage parameters of the target function. If the historical usage parameters of the voice functions supported by the M smart devices contain the previous usage parameters of the target function, the LLM model can directly determine the previous usage parameters as the current target parameters.
[0178] When the historical usage parameters of the voice functions supported by the M smart devices do not include the previous usage parameters of the target function, there are two possibilities: in the first, the target function is within the range of voice functions supported by the M smart devices, but the user has not used it before; in the second, the target function is not within that range and is instead a function that the LLM model inferred from knowledge learned during training.
[0179] In the first case, where the target function is within the range of voice functions supported by the M smart devices, the LLM model can infer the target parameter by combining the environmental perception information with the operating statuses of the M smart devices.
[0180] Specifically, the LLM model can first filter the operating states of the target smart device from the operating states of M smart devices to determine the operating state of the target function. The target smart device can be directly determined by its type and location. The operating state of the target function includes whether the target function is enabled, and if so, its specific operating values (if any).
[0181] After obtaining the operational status of the target function of the target intelligent device, the LLM model can infer the target parameters of the target function based on environmental perception information and the operational status of the target function.
[0182] It should be understood that the content of environmental perception information during the inference process varies depending on the smart device. For example, when the smart device is a smart light, the environmental perception information includes ambient light intensity, current time, and current weather type. When inferring the target parameter, if the target function is brightness, the LLM model can combine ambient light intensity, current weather type, current time, the operating values of the target function, and the domain common sense learned during LLM model training to infer whether the adjustment direction of the target function is to increase or decrease brightness. Based on determining the adjustment direction, and combining the operating values of the target function, the target parameter that needs to be adjusted, i.e., the target brightness, is determined.
[0183] As another example, when the smart device is a smart refrigerator, the environmental perception information can include the current weather type and ambient temperature. The LLM model can infer the target parameter, i.e., the target temperature, based on the current weather type, the ambient temperature, the current operating temperature of the smart refrigerator, and the domain common sense learned by the LLM model during training.
[0184] When the target function is outside the range of voice functions supported by the M smart devices, the LLM model cannot refer to the operating status of the M smart devices and can only infer the target parameters by combining environmental perception information. The specific inference process still relies on the domain common sense learned by the LLM model during training, and is similar to the target parameter reasoning process in the first case mentioned above. The difference is that, in this case, due to the lack of constraints on the operating values of the target function, the actual inferred target parameters may not be accurate enough.
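The decision order for the target parameter can be summarized in a short sketch. The function and argument names are hypothetical, and the `infer` callback stands in for the LLM model's common-sense inference, whose internal logic the application does not specify.

```python
from typing import Any, Callable, Dict, Optional


def infer_target_parameter(
    target_function: str,
    history: Dict[str, Any],
    operating_status: Dict[str, Any],
    environment: Dict[str, Any],
    infer: Callable[[Dict[str, Any], Optional[Any]], Any],
) -> Any:
    # 1) Previous usage parameters take priority when they exist.
    if target_function in history:
        return history[target_function]
    # 2) Otherwise look up the operating status of the target function.
    #    It is None when the function is outside the supported set, in
    #    which case only environmental information is available.
    status = operating_status.get(target_function)
    # 3) Delegate to the (hypothetical) inference callback.
    return infer(environment, status)
```

Passing `status` as `None` in the unsupported-function case reflects the missing operating-value constraint described in [0184], which is why the inferred parameter may be less accurate there.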
[0185] In this embodiment, during the determination of target parameters, if there are previous usage parameters for the target function, these parameters are directly determined as the target parameters. This approach maximizes consideration of the user's historical usage habits, ensuring that the adjustment results align with the user's personal preferences. When no previous usage parameters exist, the target parameters are determined by combining environmental perception information and the actual operating status of the target function on the target smart device. This ensures that the determination process closely matches the actual environmental scenario, guaranteeing the accuracy of the adjustment.
[0186] Thus, through the above steps, the LLM model can obtain a variety of inference information.
[0187] The specific process for generating the target control command based on the multiple types of inference information is as follows.
[0188] In one possible implementation, obtaining the target control command based on the multiple types of inference information includes: when the number of results for each of the multiple types of inference information equals 1, generating a candidate control command containing the multiple types of inference information; and when the candidate control command passes verification, determining the candidate control command as the target control command.
[0189] As shown in Figure 2, the inference results are of two types: candidate control commands and target prompts. The above process describes the case where the inference result is a candidate control command.
[0190] Specifically, the prerequisite for generating a candidate control command is that the inference results of the LLM model are clear, meaning that each type of inference information is unique and explicit. Accordingly, the LLM model first confirms that the number of results for each of the multiple types of inference information equals 1, that is, there is exactly one target function, one target device type, one target device location, and one target parameter. At this point, there is no ambiguity among the types of inference information, and the LLM model can generate a candidate control command based on them.
[0191] As mentioned earlier, although the LLM model is constrained by the object model description text of the M smart devices during inference, it may hallucinate, causing the final inference result to deviate from those constraints and to contain functions or devices that do not belong to the object model description text of the M smart devices. Therefore, the cloud platform also needs to perform a rationality check on the candidate control command. Only when the candidate control command passes the check is it considered valid and determined as the target control command.
[0192] In this embodiment, when the number of inference information results is 1, a candidate control command is generated. This ensures that the candidate control command has a clear direction and uniqueness, avoiding the generation of ambiguous commands that the device cannot respond to. Further verification of the candidate control command to determine whether it can be used as the target control command effectively avoids and detects unreasonable results in the inference process, ensuring the feasibility of the command finally issued to the device, improving the device's response success rate to commands, and thus improving the success rate of voice interaction.
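The uniqueness condition for generating a candidate control command can be sketched as follows. The dictionary layout, the key names in `REQUIRED`, and the function `build_candidate` are assumptions made for illustration; the application does not specify a data format.

```python
from typing import Dict, List, Optional

# Each piece of inference information maps to its list of inferred results.
InferenceInfo = Dict[str, List[str]]

# Hypothetical key names for the four pieces of inference information.
REQUIRED = ("device_type", "device_location", "function", "parameter")


def build_candidate(info: InferenceInfo) -> Optional[Dict[str, str]]:
    """Return a candidate command only if every item has exactly one result."""
    if any(len(info.get(key, [])) != 1 for key in REQUIRED):
        return None  # ambiguous or missing: no candidate is generated
    return {key: info[key][0] for key in REQUIRED}
```

A returned candidate would still have to pass the rationality check before being treated as the target control command.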
[0193] Specifically, the steps for verifying the rationality of candidate control instructions are as follows.
[0194] In one possible implementation, the multiple pieces of inference information include the target device type, target device location, target function, and target parameters, and the control method further includes: determining, based on the target device type, the target device location, and the device types and device locations of the M smart devices, whether the M smart devices include the target smart device; when the M smart devices include the target smart device and the communication status of the target smart device is normal, determining whether the voice functions supported by the target smart device include the target function; when the voice functions supported by the target smart device include the target function, determining whether the target parameters are within the preset parameter range corresponding to the target function; when the target parameters are within the preset parameter range, determining whether the candidate control command conflicts with the current working mode of the target smart device; and when the candidate control command does not conflict with the current working mode of the target smart device, determining that the candidate control command has passed verification.
[0195] For example, as shown in Figure 2, the rationality verification of candidate control commands includes four dimensions: device verification, function verification, parameter verification, and conflict verification.
[0196] Device verification refers to the cloud platform first determining whether the target smart device is included among the M smart devices.
[0197] Specifically, device verification involves verifying two inference results. Whether the M smart devices include the target smart device has two meanings: First, whether the device types of the M smart devices include the target device type. Second, whether the device locations of the M smart devices include the target device location. Therefore, the cloud platform first needs to match the target device type with the device types of the M smart devices to determine if the device types of the M smart devices include the target device type. It also needs to match the target device location with the device locations of the M smart devices to determine if the device locations of the M smart devices include the target device location. Only when both results are "included" does the cloud platform determine that the M smart devices include the target smart device. Conversely, if at least one of the two results is "not included," the cloud platform determines that the M smart devices do not include the target smart device.
[0198] This step prevents the LLM model from generating, during inference, smart devices that do not belong to the M smart devices, and it is also a prerequisite for the entire rationality verification. Only after the device verification passes are the subsequent function verification, parameter verification, and conflict verification meaningful.
[0199] When the M smart devices include the target smart device, the cloud platform further determines the communication status of the target smart device to ensure that the control command can be correctly received by the target smart device.
[0200] When the cloud platform and the target smart device are communicating, the target smart device can send heartbeat data packets to the cloud platform at a preset period. When the cloud platform receives the heartbeat data packet sent by the target smart device, it determines that the communication status of the target smart device is normal; conversely, when the cloud platform does not receive the heartbeat data packet sent by the target smart device, it determines that the communication status of the target smart device is offline or abnormal.
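A heartbeat check of this kind can be sketched as below. The class `HeartbeatMonitor` and its methods are hypothetical, and the tolerance of two periods before a device is treated as offline is an assumption; the application only states that heartbeat packets are sent at a preset period.

```python
import time
from typing import Dict, Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat time of each device (illustrative only)."""

    def __init__(self, period_s: float, tolerance: float = 2.0) -> None:
        # Assumed rule: a device is abnormal once no heartbeat has arrived
        # for `tolerance` preset periods.
        self.timeout_s = period_s * tolerance
        self._last_seen: Dict[str, float] = {}

    def record(self, device_id: str, now: Optional[float] = None) -> None:
        """Store the arrival time of a heartbeat packet."""
        self._last_seen[device_id] = time.monotonic() if now is None else now

    def is_normal(self, device_id: str, now: Optional[float] = None) -> bool:
        """Return True if a heartbeat was received within the timeout."""
        now = time.monotonic() if now is None else now
        last = self._last_seen.get(device_id)
        return last is not None and (now - last) <= self.timeout_s
```

The optional `now` argument exists only so the timing logic can be exercised deterministically.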
[0201] When the communication status of the target smart device is normal, the cloud platform further performs function verification to determine whether the voice functions supported by the target smart device, as recorded in the object model description text of the M smart devices, include the target function.
[0202] If the target smart device supports voice functions that include the target function, then parameter verification is performed to determine whether the target parameters are within the preset parameter range associated with the target function.
[0203] When the target parameters are within the preset parameter range associated with the target function, conflict verification is performed: based on the device conflict description in configuration module 201, it is determined whether the current candidate control command conflicts with the current operating mode of the target smart device. If there is no conflict, the cloud platform ultimately determines that the candidate control command has passed verification.
[0204] Conversely, if the target smart device is not among the M smart devices, or the target smart device is offline or in an abnormal communication state, or the voice function supported by the target smart device does not include the target function, or the target parameters are not within the preset parameter range, or the candidate control command conflicts with the current working mode of the target smart device, the cloud platform determines that the candidate control command has failed verification. When the cloud platform determines that the candidate control command has failed verification, it will send a verification failure prompt to the user via voice to indicate that the interaction process cannot be completed.
[0205] In this embodiment, after a candidate control command is obtained, it is first verified whether the target smart device is among the M smart devices. This confirms that the target smart device actually exists and avoids issuing commands to non-existent smart devices. After the existence of the target smart device is confirmed, its communication status is verified. Verification continues only when the communication status is normal, which prevents command execution failures caused by the smart device being offline or by network anomalies. Verifying whether the target function falls within the range of voice functions supported by the target smart device keeps the function within the actual capabilities of the device and avoids generating functional commands that the device cannot perform. Finally, it is verified whether the candidate control command conflicts with the current operating mode of the target smart device, which prevents performance degradation due to mode conflicts and ensures compatibility between the control command and the current operating state of the device. These multi-dimensional verifications therefore ensure that the control command is correctly responded to by the target smart device, improving the success rate of command execution.
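The verification sequence above can be sketched as a chain of checks that stops at the first failure. The types `Device` and `Candidate`, the function `verify`, the reason strings, and the encoding of the device conflict description as a set of (working mode, function) pairs are all assumptions for illustration. The sketch also assumes that the device type and device location must match on the same device.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Device:
    device_type: str
    location: str
    online: bool   # communication status, e.g. derived from heartbeats
    mode: str      # current working mode
    # Supported voice function -> preset (min, max) parameter range.
    functions: Dict[str, Tuple[float, float]] = field(default_factory=dict)


@dataclass
class Candidate:
    device_type: str
    location: str
    function: str
    parameter: float


def verify(cmd: Candidate, devices: List[Device],
           conflicts: Set[Tuple[str, str]]) -> Tuple[bool, str]:
    """Run the checks in order and stop at the first failure."""
    # Device verification: both type and location must match.
    target = next((d for d in devices
                   if d.device_type == cmd.device_type
                   and d.location == cmd.location), None)
    if target is None:
        return False, "device not found"
    # Communication status check.
    if not target.online:
        return False, "device offline"
    # Function verification.
    if cmd.function not in target.functions:
        return False, "function not supported"
    # Parameter verification against the preset range.
    low, high = target.functions[cmd.function]
    if not low <= cmd.parameter <= high:
        return False, "parameter out of range"
    # Conflict verification against the current working mode.
    if (target.mode, cmd.function) in conflicts:
        return False, "conflicts with current mode"
    return True, "ok"
```

Returning a reason alongside the result makes it straightforward to compose the verification failure prompt that is sent to the user.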
[0206] Next, we will introduce another process for generating inference results.
[0207] In one possible implementation, the control method further includes: when the number of results of at least one of the multiple pieces of inference information is greater than 1, generating a target prompt message based on the at least one piece of inference information to prompt the user to enter the voice command again; and updating the voice control command to the voice command re-entered by the user.
[0208] When the number of results for a piece of inference information exceeds one, the results for that piece of inference information are ambiguous, and the LLM model cannot determine the specific object to be controlled. In this case, it generates a target prompt based on the at least one ambiguous piece of inference information to prompt the user to respond again, i.e., to re-enter the voice command. For example, if there are two target device types, the target prompt could be: "We have detected that you may want to control device types xx and xx. Please determine which device type you want to control." Alternatively, the target prompt could be: "We cannot currently infer the device type you want to control. Please repeat your request."
[0209] When the user inputs a voice command again, the steps of the above-described embodiments of this application are executed again.
[0210] In another scenario, as described above, when the LLM model outputs a null value for a certain type of inference information, the LLM model also generates a target prompt message so that the user can further clarify or restate their needs.
[0211] In this embodiment, when the number of results of at least one of the multiple pieces of inference information is greater than 1, it indicates that there is ambiguity regarding function, device type, device location, or parameters. In this case, proactively triggering a prompt instead of directly executing the instruction avoids inaccurate instruction execution. Furthermore, proactively prompting the user to input the voice command again prevents the decline in user experience that would result from the device not responding, thus optimizing the user's interactive experience.
[0212] S303, control the operation of the target smart device according to the target control command.
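Prompt generation for ambiguous or empty inference results can be sketched as follows. The function `build_prompt`, the key names in `LABELS`, and the exact prompt wording are assumptions; the wording is only loosely modeled on the example prompts given above.

```python
from typing import Dict, List, Optional

# Hypothetical keys and human-readable labels for the inference information.
LABELS = {
    "device_type": "device type",
    "device_location": "device location",
    "function": "function",
    "parameter": "parameter",
}


def build_prompt(info: Dict[str, List[str]]) -> Optional[str]:
    """Return a prompt for the first unclear item, or None if all are clear."""
    for key, label in LABELS.items():
        results = info.get(key, [])
        if len(results) > 1:
            # Several results: ask the user to choose among them.
            options = ", ".join(results)
            return (f"We have detected several possible values for the "
                    f"{label}: {options}. Please say which {label} you want.")
        if not results:
            # Null result: ask the user to restate the request.
            return (f"We cannot currently infer the {label} you want to "
                    f"control. Please repeat your request.")
    return None
```

When a prompt is returned, the user's next utterance replaces the original voice control command and the inference steps run again.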
[0213] After obtaining the target control command based on step S302, the cloud platform can send the target control command to the target smart device so that the target smart device can respond to the target control command in a timely manner and adjust the device status.
[0214] After the target smart device has been adjusted, adjustment feedback information can be generated and sent to the dialogue terminal so that the dialogue terminal can broadcast it to the user in the form of voice, so that the user can understand the device's processing of voice control commands in a timely manner.
[0215] To facilitate understanding of the entire implementation process of the embodiments of this application, the overall implementation process of the above scheme is introduced below with reference to Figure 4.
[0216] Figure 4 is a schematic flowchart of another smart device control method provided in the embodiments of this application.
[0217] For example, as shown in Figure 4, the control method 400 includes the following steps S401 to S413.
[0218] S401, in response to a voice control command, obtain the object model description text of M smart devices in a home scenario.
[0219] S402, determine whether the format of the voice control command is the preset format.
[0220] When the voice control command format is the preset format, proceed to step S403; when the voice control command format is not the preset format, proceed to step S404.
[0221] S403, control the target smart device according to the voice control command.
[0222] S404, determine the text of the voice control command and the recognition result of the text.
[0223] S405, for any kind of reasoning information, determine whether at least one keyword in the recognition result includes the target keyword corresponding to the reasoning information.
[0224] When at least one keyword includes the target keyword, proceed to step S406; If at least one keyword does not include the target keyword, proceed to step S407.
[0225] S406, the target keywords are identified as inference information.
[0226] S407, determine inference information based on at least one of the target adjustment requirements, device information of M smart devices, and object model description text.
[0227] S408, determine whether the number of results for each of the multiple pieces of inference information is 1.
[0228] When the numbers of results for the multiple pieces of inference information are not all 1, proceed to step S409; when the number of results for each of the multiple pieces of inference information is 1, proceed to step S410.
[0229] S409, Generate target prompt information based on at least one inference information with a result quantity greater than 1.
[0230] S410, generate a candidate control instruction based on the multiple pieces of inference information.
[0231] S411, verify whether the candidate control command passes.
[0232] If the candidate control command passes the verification, proceed to step S412; If a candidate control command fails to pass verification, a verification failure response is generated.
[0233] S412, determine the candidate control instruction as the target control instruction.
[0234] S413, control the operation of the target smart device according to the target control command.
[0235] Steps S401 to S413 in the control method 400 described above have the same inventive concept as steps S301 to S303 in the control method 300. For details, please refer to the description of the control method 300 above, which will not be repeated here.
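The branching of steps S402 to S413 can be summarized in a short sketch. The function `run_control_flow` and its callable parameters are hypothetical placeholders for the format check, the LLM-based inference, and the rationality verification; the returned dictionaries are an assumed way of reporting which branch was taken. The sketch treats an empty result list the same as multiple results, following the null-value case described earlier.

```python
from typing import Callable, Dict, List

Info = Dict[str, List[str]]


def run_control_flow(
    command: str,
    is_preset_format: Callable[[str], bool],
    infer: Callable[[str], Info],
    verify: Callable[[Dict[str, str]], bool],
) -> Dict[str, object]:
    # S402/S403: a command already in the preset format is executed directly.
    if is_preset_format(command):
        return {"action": "execute", "command": command}

    # S404 to S407: recognition and inference (delegated to `infer`).
    info = infer(command)

    # S408/S409: any item without exactly one result triggers a prompt.
    unclear = [key for key, results in info.items() if len(results) != 1]
    if unclear:
        return {"action": "prompt", "items": unclear}

    # S410: build the candidate control instruction.
    candidate = {key: results[0] for key, results in info.items()}

    # S411: a candidate that fails verification yields a failure response.
    if not verify(candidate):
        return {"action": "verification_failed"}

    # S412/S413: the candidate becomes the target control instruction.
    return {"action": "execute", "command": candidate}
```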
[0236] In summary, this application provides a method for controlling smart devices. Upon receiving a user's voice control command, this method combines the device model of the current home environment with the voice control command to infer the target control command for the target smart device in the home environment. There are no restrictions on the form of the voice control command in the above process. That is, regardless of the format of the user's voice control command, this method can generate accurate control commands to control devices in the home environment. Therefore, this control method can improve the recognition accuracy of voice control commands, thereby increasing the success rate of voice-controlled smart devices and enhancing the user's interactive experience.
[0237] The control method for the smart device provided in the embodiments of this application has been described in detail above with reference to Figures 3 and 4. The apparatus embodiments of this application are described in detail below with reference to Figures 5 and 6. It should be understood that the apparatus in the embodiments of this application can perform the various methods described in the foregoing embodiments of this application; that is, for the specific working processes of the various products described below, reference can be made to the corresponding processes in the foregoing method embodiments.
[0238] Figure 5 is a schematic diagram of the structure of a control device for a smart device provided in an embodiment of this application.
[0239] For example, as shown in Figure 5, the control device 500 includes: an acquisition module 501, used to acquire the object model description text of M smart devices in the home scene in response to a voice control command, where the object model description text is used to represent the capabilities and basic attributes of the smart devices, and M is an integer greater than or equal to 1; a reasoning module 502, used to reason about the voice control command and the object model description text of the M smart devices to obtain the target control command of the target smart device among the M smart devices; and a control module 503, used to control the operation of the target smart device according to the target control command.
[0240] In one possible implementation, the reasoning module 502 is specifically used to: determine whether the voice control command is a command in a preset format; when the voice control command is not a command in a preset format, reason about the voice control command and the object model description text to obtain the target control command, which is a command in a preset format.
[0241] In one possible implementation, the inference module 502 is further configured to: control the target smart device to operate according to the voice control command when the voice control command is a command in a preset format.
[0242] In one possible implementation, the reasoning module 502 is specifically used to: perform recognition processing on the voice control command to obtain the text corresponding to the voice control command and the recognition result of the text; determine multiple inference information based on the recognition result, or based on the recognition result and the object model description text; and obtain the target control command based on the multiple inference information.
[0243] In one possible implementation, for any one of the multiple inference information, the identification result includes at least one keyword and a target adjustment requirement. The inference module 502 is specifically used to: when at least one keyword includes the target keyword corresponding to the inference information, determine the target keyword as the inference information; when at least one keyword does not include the target keyword, obtain the device information of M smart devices; and determine the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text.
[0244] In one possible implementation, the inference information includes the target function, and the object model description text includes the voice functions supported by M smart devices. The inference module 502 is specifically used to: match the target adjustment requirements with the voice functions supported by the M smart devices to determine the matching function; and determine the matching function as the target function.
[0245] In one possible implementation, the inference information also includes the target device type, and the object model description text also includes the device types and device models of M smart devices. The inference module 502 is specifically used to: determine the mapping relationship between the device types and supported voice functions of the M smart devices based on the device types, device models and supported voice functions of the M smart devices; match the target function with the mapping relationship to determine the matching device type; and determine the matching device type as the target device type.
[0246] In one possible implementation, the inference information further includes the target device location. The device information includes the device location. Specifically, the inference module 502 is used to: match the device types of M smart devices with the target device type to determine N matching devices, where N is an integer greater than or equal to 1 and less than or equal to M; when the number N equals 1, determine the device locations of the N matching devices as the target device location; when the number N is greater than 1, obtain the user's historical preference information and environmental awareness information; determine the user's current location based on the environmental awareness information, historical preference information, and the device locations of the N matching devices; and determine the user's current location as the target device location.
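The location rule in this implementation can be sketched as below. The function `resolve_target_location` and the representation of devices as (device type, device location) pairs are hypothetical, and the sketch assumes the user's current location has already been derived from the environmental awareness and historical preference information.

```python
from typing import List, Optional, Tuple


def resolve_target_location(
    target_type: str,
    devices: List[Tuple[str, str]],   # (device type, device location)
    user_location: Optional[str],
) -> Optional[str]:
    """Return the target device location, or None if it stays unclear."""
    matches = [loc for dev_type, loc in devices if dev_type == target_type]
    # N == 1: the only matching device determines the location.
    if len(matches) == 1:
        return matches[0]
    # N > 1: use the user's current location if a matching device is there.
    if user_location in matches:
        return user_location
    return None
```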
[0247] In one possible implementation, the inference information further includes target parameters, and the device information includes operating status. Specifically, the inference module 502 is used to: acquire the user's historical preference information and environmental perception information; when the historical preference information includes the user's previous usage parameters for the target function, determine the previous usage parameters as the target parameters; or, when the historical preference information does not include the previous usage parameters, determine the operating status of the target function of the target smart device from the operating status of M smart devices; and determine the target parameters of the target function based on the environmental perception information, or based on the environmental perception information and the operating status of the target function.
[0248] In one possible implementation, the inference module 502 is specifically used to: generate a candidate control instruction containing multiple inference information when the number of results of any inference information among multiple inference information is equal to 1; and determine the candidate control instruction as the target control instruction when the candidate control instruction passes the verification.
[0249] In one possible implementation, the inference module 502 is further configured to: when the number of results of at least one inference information among multiple inference information is greater than 1, generate target prompt information based on at least one inference information to prompt the user to input the voice command again; and update the voice control command to the voice command input by the user again.
[0250] In one possible implementation, the multiple inference information includes the target device type, target device location, target function, and target parameters. The inference module 502 is further configured to: determine whether the M smart devices include the target smart device based on the target device type, target device location, and the device types and locations of the M smart devices; when the M smart devices include the target smart device, if the communication status of the target smart device is normal, determine whether the voice function supported by the target smart device includes the target function; when the voice function supported by the target smart device includes the target function, determine whether the target parameters are within the preset parameter range corresponding to the target function; when the target parameters are within the preset parameter range, determine whether the candidate control command conflicts with the current working mode of the target smart device; when the candidate control command does not conflict with the current working mode of the target smart device, determine that the candidate control command verification is successful.
[0251] It should be noted that the control device of the aforementioned intelligent device is embodied in the form of a functional unit. The term "module" here can be implemented in software and / or hardware, without specific limitations.
[0252] For example, a "module" can be a software program, a hardware circuit, or a combination of both that implements the above functions. The hardware circuit may include an application-specific integrated circuit (ASIC), electronic circuits, a processor (e.g., a shared processor, a proprietary processor, or a group processor) and memory for executing one or more software or firmware programs, integrated logic circuits, and / or other suitable components that support the described functions.
[0253] Therefore, the units of the various examples described in the embodiments of this application can be implemented in electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
[0254] Figure 6 is a schematic diagram of the structure of a smart device provided in an embodiment of this application. This smart device corresponds to the processing terminal that executes the aforementioned control method.
[0255] For example, as shown in Figure 6, the smart device 600 includes a memory 601 and a processor 602. The memory 601 stores executable program code 603, and the processor 602 is used to call and execute the executable program code 603 to perform a control method for the smart device.
[0256] Furthermore, embodiments of this application also protect an apparatus that may include a memory and a processor, wherein the memory stores executable program code, and the processor is used to call and execute the executable program code to perform a control method for a smart device provided in embodiments of this application.
[0257] This embodiment can divide the device into functional modules based on the above method example. For example, each module can correspond to a separate function, or two or more functions can be integrated into one processing module. The integrated module can be implemented in hardware. It should be noted that the module division in this embodiment is illustrative and only represents one logical functional division. In actual implementation, there may be other division methods.
[0258] When the functional modules are divided according to their respective functions, the device may further include an acquisition module, an inference module, and a control module. It should be noted that all relevant content regarding the steps involved in the above method embodiments can be referenced to the functional descriptions of the corresponding functional modules, and will not be repeated here.
[0259] It should be understood that the device provided in this embodiment is used to execute the above-described control method for a smart device, and therefore can achieve the same effect as the above-described implementation method.
[0260] When using integrated units, the device may include a processing module and a storage module. When applied to a smart device, the processing module can be used to control and manage the actions of the smart device. The storage module can be used to support the execution of relevant program code by the smart device.
[0261] The processing module may be a processor or a controller, which can implement or execute the various exemplary logic blocks, modules, and circuits described in conjunction with the disclosure of this application. The processor may also be a combination of components that implement computing functions, such as a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor. The storage module may be a memory.
[0262] In addition, the device provided in the embodiments of this application may specifically be a chip, component or module. The chip may include a connected processor and a memory. The memory is used to store instructions. When the processor calls and executes the instructions, the chip can execute a control method for an intelligent device provided in the above embodiments.
[0263] This embodiment also provides a computer-readable storage medium storing computer program code. When the computer program code is run on a computer, the computer executes the above-described related method steps to implement the control method for an intelligent device provided in the above embodiment.
[0264] The computer-readable storage medium may include, but is not limited to, any type of disk, including floppy disks, optical disks, Digital Video Discs (DVDs), Compact Disc Read-Only Memory (CD-ROMs), microdrives, and magneto-optical disks, read-only memory (ROMs), random access memory (RAMs), erasable programmable read-only memory (EPROMs), electrically erasable programmable read-only memory (EEPROMs), dynamic random access memory (DRAMs), video random access memory (VRAMs), flash memory devices, magnetic cards or optical cards, nanosystems (including molecular memory ICs), or any type of medium or device suitable for storing instructions and / or data.
[0265] This embodiment also provides a computer program product that, when run on a computer, causes the computer to perform the aforementioned related steps to implement a smart device control method provided in the above embodiment.
[0266] In this embodiment, the device, computer-readable storage medium, computer program product, or chip are all used to execute the corresponding methods provided above. Therefore, the beneficial effects they can achieve can be referred to the beneficial effects in the corresponding methods provided above, and will not be repeated here.
[0267] Through the above description of the embodiments, those skilled in the art will understand that, for the sake of convenience and brevity, only the division of the above functional modules is used as an example. In actual applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above.
[0268] In the embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division of modules or units is only a logical functional division, and in actual implementation there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical, or in other forms.
[0269] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
Claims
1. A control method for a smart device, characterized in that the control method includes: in response to a voice control command, obtaining the object model description text of M smart devices in the home scene, where the object model description text is used to represent the capabilities and basic attributes of the smart devices, and M is an integer greater than or equal to 1; performing reasoning on the voice control command and the object model description texts of the M smart devices to obtain the target control command of the target smart device among the M smart devices; and controlling the operation of the target smart device according to the target control command.
2. The control method according to claim 1, characterized in that performing reasoning on the voice control command and the object model description texts of the M smart devices to obtain the target control command for the target smart device among the M smart devices includes: determining whether the voice control command is a command in a preset format; and when the voice control command is not a command in the preset format, performing reasoning on the voice control command and the object model description text to obtain the target control command, the target control command being a command in the preset format.
3. The control method according to claim 2, characterized in that the control method further includes: when the voice control command is a command in the preset format, controlling the target smart device to operate according to the voice control command.
4. The control method according to claim 2, characterized in that performing reasoning on the voice control command and the object model description text to obtain the target control command includes: processing the voice control command to obtain the text corresponding to the voice control command and a recognition result of the text; determining multiple pieces of inference information based on the recognition result, or based on the recognition result and the object model description text; and obtaining the target control command based on the multiple pieces of inference information.
5. The control method according to claim 4, characterized in that the recognition result includes at least one keyword and a target adjustment requirement, and, for any one of the multiple pieces of inference information, determining the inference information based on the recognition result, or based on the recognition result and the object model description text, includes: when the at least one keyword includes a target keyword corresponding to the inference information, determining the target keyword as the inference information; and when the at least one keyword does not include the target keyword, obtaining device information of the M smart devices, and determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text.
6. The control method according to claim 5, characterized in that the inference information includes a target function, and the object model description text includes the voice functions supported by the M smart devices; determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: matching the target adjustment requirement against the voice functions supported by the M smart devices to determine a matching function; and determining the matching function as the target function.
7. The control method according to claim 6, characterized in that the inference information also includes a target device type, and the object model description text also includes the device types and device models of the M smart devices; determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: determining, based on the device types, device models, and supported voice functions of the M smart devices, a mapping relationship between the device types of the M smart devices and the voice functions they support; matching the target function against the mapping relationship to determine a matching device type; and determining the matching device type as the target device type.
8. The control method according to claim 7, characterized in that the inference information also includes a target device location, and the device information includes a device location; determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: matching the device types of the M smart devices against the target device type to determine N matching devices, where N is an integer greater than or equal to 1 and less than or equal to M; when N equals 1, determining the device location of the matching device as the target device location; and when N is greater than 1, obtaining the user's historical preference information and environmental awareness information, determining the user's current location based on the environmental awareness information, the historical preference information, and the device locations of the N matching devices, and determining the user's current location as the target device location.
9. The control method according to any one of claims 5 to 8, characterized in that the inference information further includes a target parameter, and the device information includes an operating status; determining the inference information based on at least one of the target adjustment requirement, the device information of the M smart devices, and the object model description text includes: obtaining the user's historical preference information and environmental awareness information; when the historical preference information includes the user's previous usage parameter for the target function, determining the previous usage parameter as the target parameter; or, when the historical preference information does not include the previous usage parameter, determining the operating status of the target function from the operating statuses of the M smart devices, and determining the target parameter of the target function based on the environmental awareness information, or based on the environmental awareness information and the operating status of the target function.
10. The control method according to claim 4, characterized in that obtaining the target control command based on the multiple pieces of inference information includes: when the number of results of any one of the multiple pieces of inference information is equal to 1, generating a candidate control command containing the multiple pieces of inference information; and when the candidate control command passes verification, determining the candidate control command as the target control command.
11. The control method according to claim 10, characterized in that the control method further includes: when the number of results of at least one of the multiple pieces of inference information is greater than 1, generating target prompt information based on the at least one piece of inference information to prompt the user to input a voice command again; and updating the voice control command to the voice command re-entered by the user.
12. The control method according to claim 10, characterized in that the multiple pieces of inference information include a target device type, a target device location, a target function, and a target parameter, and the control method further includes: determining, based on the target device type, the target device location, and the device types and device locations of the M smart devices, whether the M smart devices include the target smart device; when the M smart devices include the target smart device and the communication status of the target smart device is normal, determining whether the voice functions supported by the target smart device include the target function; when the voice functions supported by the target smart device include the target function, determining whether the target parameter is within the preset parameter range corresponding to the target function; when the target parameter is within the preset parameter range, determining whether the candidate control command conflicts with the current operating mode of the target smart device; and when the candidate control command does not conflict with the current operating mode of the target smart device, determining that the candidate control command passes verification.
13. A control device for a smart device, characterized in that the control device includes: an acquisition module, configured to obtain, in response to a voice control command, object model description texts of M smart devices in a home scene, where the object model description text represents the capabilities and basic attributes of a smart device, and M is an integer greater than or equal to 1; a reasoning module, configured to perform reasoning on the voice control command and the object model description texts of the M smart devices to obtain a target control command for a target smart device among the M smart devices; and a control module, configured to control the operation of the target smart device according to the target control command.
14. A smart device, characterized in that the smart device includes: a memory, configured to store executable program code; and a processor, configured to call and run the executable program code from the memory, causing the smart device to perform the control method according to any one of claims 1 to 12.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed, implements the control method according to any one of claims 1 to 12.
16. A computer program product, characterized in that the computer program product includes computer program code which, when run on a computer, implements the control method according to any one of claims 1 to 12.
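For illustration only, the candidate generation and verification steps recited in claims 10 to 12 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the names `Device`, `Command`, `build_candidate`, and `verify`, the field layout, the prompt wording, and the sample devices are all hypothetical. The sketch reads "any one of" in claim 10 as requiring exactly one result for every dimension, and it treats an empty result like an ambiguous one, a case the claims do not address.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Device:
    # Hypothetical record merging device information and object model fields.
    device_type: str
    location: str
    online: bool                  # communication status is normal
    functions: dict               # function name -> (min, max) preset parameter range
    mode: str = "normal"          # current operating mode
    blocked: dict = field(default_factory=dict)  # mode -> functions that conflict with it


@dataclass
class Command:
    device_type: str
    location: str
    function: str
    parameter: float


def build_candidate(inferred: dict) -> tuple:
    """Return (command, None) when each dimension has exactly one result,
    otherwise (None, prompt) asking the user to speak the command again."""
    # Assumption: an empty result list is handled like an ambiguous one.
    unclear = [name for name, results in inferred.items() if len(results) != 1]
    if unclear:
        return None, "Please specify: " + ", ".join(unclear)
    command = Command(
        inferred["device_type"][0],
        inferred["location"][0],
        inferred["function"][0],
        inferred["parameter"][0],
    )
    return command, None


def verify(cmd: Command, devices: list) -> tuple:
    """Run the checks of claim 12 in order and stop at the first failure."""
    target: Optional[Device] = next(
        (d for d in devices
         if d.device_type == cmd.device_type and d.location == cmd.location),
        None,
    )
    if target is None:
        return False, "target device not found"
    if not target.online:
        return False, "communication status abnormal"
    if cmd.function not in target.functions:
        return False, "function not supported"
    low, high = target.functions[cmd.function]
    if not low <= cmd.parameter <= high:
        return False, "parameter out of range"
    if cmd.function in target.blocked.get(target.mode, ()):
        return False, "conflicts with current operating mode"
    return True, "ok"


# Hypothetical home with two air conditioners; the bedroom unit is self-cleaning.
devices = [
    Device("air_conditioner", "bedroom", True, {"set_temperature": (16, 30)},
           mode="self_clean", blocked={"self_clean": {"set_temperature"}}),
    Device("air_conditioner", "living_room", True, {"set_temperature": (16, 30)}),
]
inferred = {
    "device_type": ["air_conditioner"],
    "location": ["living_room"],
    "function": ["set_temperature"],
    "parameter": [24],
}
cmd, prompt = build_candidate(inferred)
print(verify(cmd, devices))
```

With the sample data, every dimension resolves to a single result, so a candidate command is built and then checked against the living room unit. Listing two locations in `inferred` would instead return a prompt naming `location`, which corresponds to the re-prompt branch of claim 11.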