Electronic device and operating method of electronic device
The electronic device determines the appropriate AI service provider based on user commands and screen content, ensuring efficient command processing across devices with varying specifications.
Patent Information
- Authority / Receiving Office
- WO · WO
- Patent Type
- Applications
- Current Assignee / Owner
- SAMSUNG ELECTRONICS CO LTD
- Filing Date
- 2025-12-09
- Publication Date
- 2026-06-18
AI Technical Summary
In multi-device environments, it is unclear which AI service provider should execute user commands due to varying device specifications and services, necessitating more efficient speech command processing.
An electronic device determines itself or an external device as an AI service provider based on user speech commands and screen content source information, executing operations through the identified provider.
Efficient and effective processing of user commands by identifying the most suitable AI service provider, optimizing device utilization and command execution.
Abstract
Description
Electronic device and method of operation of electronic device
[0001] The present disclosure relates to an electronic device and a method of operating the electronic device. More specifically, it relates to an electronic device and a method of operating the electronic device for determining an AI service providing device.
[0002] Speech recognition technology is a technology that receives a user's spoken voice and provides actions optimized for the user's questions. With the advancement of AI (Artificial Intelligence) technologies, the number of devices providing AI services capable of speech recognition (hereinafter referred to as "AI service providers") is increasing. AI service providers can offer various services to users by receiving and processing voice signals corresponding to user utterances. For example, each AI service provider performs operations that recognize and analyze human language for speech recognition (e.g., speech recognition, synthesis, natural language understanding, generation, machine translation, dialogue systems).
[0003] When AI service providers are placed in the same user space within a home, it is unclear which device should execute a command spoken by the user. Furthermore, since the device specifications and services provided vary among the AI service providers, it is necessary to process spoken commands more efficiently and effectively.
[0004] An electronic device according to one embodiment of the present disclosure includes at least one processor and a memory comprising one or more storage media for storing one or more instructions.
[0005] According to one embodiment of the present disclosure, the at least one processor, individually or collectively, executes the one or more instructions to cause the electronic device to obtain a user's speech command.
[0006] According to one embodiment of the present disclosure, the at least one processor, individually or collectively, executes the one or more instructions to cause the electronic device to determine at least one of the electronic device or an external electronic device as an AI service providing device, based on the user's speech command and source information of content displayed on a screen of at least one of the electronic device or the external electronic device.
[0007] According to one embodiment of the present disclosure, the at least one processor, individually or collectively, executes the one or more instructions to cause the electronic device to perform an operation corresponding to the user's speech command through the determined AI service providing device.
[0008] A method of operation of an electronic device for determining an Artificial Intelligence (AI) service providing device according to one embodiment of the present disclosure comprises: acquiring a user’s speech command; determining at least one of the electronic device or an external electronic device as an AI service providing device based on the user’s speech command and source information of content displayed on a screen of at least one of the electronic device or the external electronic device; and controlling the electronic device to perform an operation corresponding to the user’s speech command through the determined AI service providing device.
[0009] The present disclosure can be easily understood from the combination of the following detailed description and the accompanying drawings, where reference numerals denote structural elements.
[0010] FIG. 1 is a drawing showing an AI service providing system according to one embodiment of the present disclosure.
[0011] FIG. 2 is a drawing for explaining information about a plurality of electronic devices according to one embodiment of the present disclosure.
[0012] FIG. 3 is a block diagram showing the configuration of an electronic device according to one embodiment of the present disclosure.
[0013] FIG. 4 is a flowchart illustrating a method for an electronic device according to one embodiment of the present disclosure to determine an AI service providing device.
[0014] FIG. 5 is a diagram illustrating the operation of a first electronic device according to one embodiment of the present disclosure providing an AI service based on the first electronic device or a second electronic device in response to a speech command.
[0015] FIG. 6 is a diagram illustrating the operation of a first electronic device according to one embodiment of the present disclosure providing an AI service based on a server in accordance with a speech command.
[0016] FIG. 7 is a flowchart illustrating a method in which an electronic device according to one embodiment of the present disclosure determines an AI service providing device based on content source information.
[0017] FIG. 8 is a flowchart illustrating a method in which a first electronic device and a second electronic device provide an AI service based on content source information according to one embodiment of the present disclosure.
[0018] FIG. 9 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure provides an AI service based on content source information of a second electronic device.
[0019] FIG. 10 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure provides an AI service based on content source information of a second electronic device.
[0020] FIG. 11 is a flowchart illustrating a method for an electronic device according to one embodiment of the present disclosure to determine a microphone activation device and an AI service providing device.
[0021] FIG. 12 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device.
[0022] FIG. 13 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device.
[0023] FIG. 14 is a detailed block diagram of an electronic device according to one embodiment of the present disclosure.
[0024] In the present disclosure, the expression “at least one of a, b, or c” may refer to “a”, “b”, “c”, “a and b”, “a and c”, “b and c”, “a, b, and c all”, or variations thereof.
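As a small illustration of the expression defined above, the following Python sketch enumerates every non-empty combination of three items. The function name `at_least_one_of` is a hypothetical helper introduced only for this example, and the "variations thereof" mentioned in the definition are not modeled.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the given items."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


COMBOS = at_least_one_of(["a", "b", "c"])
for combo in COMBOS:
    # Prints one line per combination, e.g. "a and b"
    print(" and ".join(combo))
```

For three items the sketch yields seven combinations, which correspond to the seven cases listed in the definition.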
[0025] Embodiments of the present disclosure are described below in detail with reference to the attached drawings so that those skilled in the art can easily implement them. However, the present disclosure may be embodied in various different forms and is not limited to the embodiments described herein.
[0026] The terms used in this disclosure are general terms currently in wide use, selected in consideration of the functions described herein; however, they may vary depending on the intent of those skilled in the art, case law, the emergence of new technologies, etc. Accordingly, the terms used in this disclosure should not be interpreted solely by their names, but should be interpreted based on the meaning of the terms and the overall content of this disclosure.
[0027] Furthermore, the terms used in this disclosure are used merely to describe specific embodiments and are not intended to limit this disclosure.
[0028] Throughout the specification, when a part is described as being "connected" to another part, this includes not only cases where they are "directly connected," but also cases where they are "electrically connected" with other components in between.
[0029] The term “the” and similar designations used in this specification, particularly in the claims, may indicate both singular and plural forms. Furthermore, unless the order of the steps of a method according to this disclosure is explicitly specified, the described steps may be performed in any suitable order. This disclosure is not limited by the order in which the steps are described.
[0030] Phrases such as "in some embodiments" or "in one embodiment" appearing in various places in this specification do not necessarily refer to the same embodiment.
[0031] Some embodiments of the present disclosure may be represented by functional block configurations and various processing steps. Some or all of these functional blocks may be implemented by various numbers of hardware and / or software configurations that execute specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a specific function. Additionally, for example, the functional blocks of the present disclosure may be implemented in various programming or scripting languages. The functional blocks may be implemented as algorithms executed on one or more processors. Furthermore, the present disclosure may employ prior art for electronic configuration, signal processing, and / or data processing, etc. Terms such as “mechanism,” “element,” “means,” and “configuration” may be used broadly and are not limited to mechanical and physical configurations.
[0032] Furthermore, the connecting lines or connecting members between the components depicted in the drawings are merely illustrative of functional connections and / or physical or circuit connections. In the actual device, connections between components may be represented by various alternative or added functional connections, physical connections, or circuit connections.
[0033] Additionally, terms such as "...part," "module," etc., as described in the specification refer to a unit that processes at least one function or operation, and this may be implemented in hardware or software, or as a combination of hardware and software.
[0034] In the present disclosure, "processor" may include various processing circuits and / or a plurality of processors. For example, the term "processor" as used herein, including in the claims, may include at least one processor and various processing circuits. In at least one processor, one or more processors may be configured to perform the various functions described herein in a distributed manner, individually and / or collectively. As used herein, "processor," "at least one processor," and "one or more processors" may be configured to perform various functions. However, these terms cover, without limitation, situations where one processor performs some of the functions and other processor(s) perform other parts of the functions, and situations where a single processor can perform all functions. Additionally, at least one processor may include a combination of processors performing various functions of the disclosed functions in a distributed manner. At least one processor may execute program instructions to achieve or perform various functions.
[0035] In the present disclosure, the term “user” refers to a person using a display device and may include a consumer, evaluator, viewer, administrator, or installer. Additionally, in the specification, “manufacturer” or “provider” may refer to a manufacturer that manufactures a display device and / or components included in the display device.
[0036] In the present disclosure, 'image' may include a still image, a graphic, a picture, a frame, a video composed of a plurality of consecutive still images, or a video.
[0037] In the present disclosure, functions related to 'Artificial Intelligence (AI)' are operated through a processor and memory. The processor may be composed of one or more processors. In this case, the one or more processors may be general-purpose processors such as CPUs, APs, and DSPs (Digital Signal Processors), graphics-dedicated processors such as GPUs and VPUs (Vision Processing Units), or AI-dedicated processors such as NPUs. The one or more processors control the processing of input data according to predefined operation rules or AI models stored in memory. Alternatively, if the one or more processors are AI-dedicated processors, the AI-dedicated processors may be designed with a hardware structure specialized for processing a specific AI model.
[0038] In the present disclosure, 'predefined operation rules' or 'artificial intelligence models' are characterized by being created through learning. Here, being created through learning means that a basic artificial intelligence model is trained using a number of training data by a learning algorithm, thereby creating predefined operation rules or an artificial intelligence model configured to perform a desired characteristic (or objective). Such learning may be performed on the device itself where the artificial intelligence according to the present disclosure is executed, or it may be performed through a separate server and / or system. Examples of learning algorithms include supervised learning, unsupervised learning, semi-supervised learning, or reinforcement learning, but are not limited to the examples described above.
[0039] In the present disclosure, an ‘artificial intelligence model’ may be a model that analyzes linear or non-linear correlations between a plurality of operands (which may also be referred to as variables or parameters). For example, the artificial intelligence model may include at least one of linear regression, polynomial regression, logistic regression, decision trees, support vector machines (SVM), and linear correlation neural networks, but the present disclosure is not limited thereto.
[0040] In one embodiment of the present disclosure, the 'artificial intelligence model' may include a neural network model. The neural network model may be composed of a plurality of neural network layers. Each of the plurality of neural network layers has a plurality of weight values and performs neural network operations through operations between the operation result of a previous layer and the plurality of weights. The plurality of weights possessed by the plurality of neural network layers may be optimized by the learning result of the artificial intelligence model. For example, the plurality of weights may be updated so that the loss value or cost value obtained from the artificial intelligence model during the learning process is reduced or minimized. The artificial neural network model may include a Deep Neural Network (DNN), such as a Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Bidirectional Recurrent Deep Neural Network (BRDNN), or Deep Q-Networks, but is not limited to the examples described above.
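The weight update described above can be illustrated with a deliberately tiny example. The following Python sketch is not the training procedure of the disclosure; it assumes a single linear unit with one weight, a mean squared error loss, and plain gradient descent, and all names (`train_linear_neuron`, `mean_squared_error`, `SAMPLES`) are hypothetical.

```python
def mean_squared_error(weight, samples):
    """Loss value: average squared difference between prediction and target."""
    return sum((weight * x - y) ** 2 for x, y in samples) / len(samples)


def train_linear_neuron(samples, learning_rate=0.05, epochs=200):
    """Repeatedly update one weight in the direction that reduces the loss."""
    weight = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to the weight
        grad = sum(2 * (weight * x - y) * x for x, y in samples) / len(samples)
        weight -= learning_rate * grad
    return weight


# Toy training data following y = 2x (an assumption for illustration)
SAMPLES = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

initial_loss = mean_squared_error(0.0, SAMPLES)
trained_weight = train_linear_neuron(SAMPLES)
final_loss = mean_squared_error(trained_weight, SAMPLES)
```

In this sketch the loss after training is lower than the initial loss and the weight approaches 2, which mirrors the statement that weights are updated so that the loss value is reduced or minimized.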
[0041] In the present disclosure, "AI service" refers to a service that enables the use of various AI functions based on an artificial intelligence model (hereinafter referred to as an AI model). The AI service is provided through on-device or cloud infrastructure and can provide services using various technologies such as machine learning (ML), natural language processing (NLP), natural language understanding (NLU), and computer vision.
[0042] In the present disclosure, an "AI service providing device" refers to a device for providing various AI services. The AI service providing device can provide various AI services through an on-device or cloud infrastructure.
[0043] In the present disclosure, an "on-device AI service" refers to an AI service that can be provided without an internet connection through an AI model running on the device itself. An on-device AI service may be a service that is available offline and has a fast response speed. For example, a device providing an on-device AI service may be equipped with a lightweight AI model, and the AI model may be executed within the device through an AI-dedicated processor. In the present disclosure, an "on-device AI service" may also be referred to as a "device-based AI service."
[0044] In the present disclosure, "Cloud AI service" refers to an AI service that processes data over the Internet and provides results to a user. The Cloud AI service may be a high-performance AI service that processes complex computations or large-scale data on a cloud server without relying on the performance of a local device. For example, a device may communicate with a cloud server to request a necessary AI service, and the cloud server may execute an AI model corresponding to the AI service and then transmit the execution result to the device. The device may provide the execution result received from the cloud server. In the present disclosure, the "Cloud AI service" may also be referred to as a "server-based AI service."
[0045] In the present disclosure, a 'user command' may include text input or voice input comprising one or more words and / or one or more sentences. A user command may refer to input for interacting with an AI model. A user command may be converted into natural language text through Natural Language Processing (NLP). For example, a user's spoken voice may be converted into text through Automatic Speech Recognition (ASR), and that text may then be processed into natural language text through Natural Language Processing (NLP). In the present disclosure, a 'user command' may be replaced with expressions such as "user input," "input," "input phrase," "directive," "starting sentence," "task query," "trigger sentence," "message," "prompt," etc., but is not limited to the examples described above.
[0046] In the present disclosure, 'content' may be received by the device from a content provider, such as a broadcast signal, a streaming service, a Blu-ray player, or a game console. It is not limited thereto.
[0047] FIG. 1 is a diagram illustrating an AI service provision operation in an AI service provision system according to one embodiment of the present disclosure.
[0048] Referring to FIG. 1, an AI service providing system according to one embodiment of the present disclosure may include a plurality of electronic devices (1001, 1002). Each of the plurality of electronic devices (1001, 1002) may be placed in a home. Each of the plurality of electronic devices (1001, 1002) may be a device that performs operations to recognize and analyze human language, such as speech recognition, synthesis, natural language understanding, generation, machine translation, and dialogue systems, by receiving and processing a user's spoken voice. Each of the plurality of electronic devices (1001, 1002) may provide various AI services capable of speech recognition to the user. Each of the plurality of electronic devices (1001, 1002) may be a device that provides at least one of a device-based AI service or a server-based AI service.
[0049] Each of the plurality of electronic devices (1001, 1002) according to one embodiment of the present disclosure may be implemented in various forms such as a TV, smart monitor, mobile phone, smartphone, tablet PC, digital camera, camcorder, laptop computer, desktop, e-book terminal, digital broadcasting terminal, PDA (Personal Digital Assistants), PMP (Portable Multimedia Player), navigation, MP3 player, DVD (Digital Video Disk) player, wearable device, video wall, digital signage, DID (Digital Information Display), projector display, refrigerator, washing machine, etc. Additionally, each of the plurality of electronic devices (1001, 1002) may be a fixed electronic device placed at a fixed location or a mobile electronic device having a portable form, and may be a digital broadcasting receiver capable of receiving digital broadcasts. However, it is not limited thereto.
[0050] In the present disclosure, the first electronic device (1001) is depicted as a PC and the second electronic device (1002) is depicted as a monitor, but is not limited thereto. For example, the first electronic device (1001) may be a monitor and the second electronic device (1002) may be a PC. Alternatively, for example, the first electronic device (1001) and the second electronic device (1002) may be devices of the same type.
[0051] A plurality of electronic devices (1001, 1002) according to one embodiment of the present disclosure may be connected to each other. For example, a plurality of electronic devices (1001, 1002) may be connected to each other through the same network. A plurality of electronic devices (1001, 1002) according to one embodiment of the present disclosure may be connected to the same user account. A plurality of electronic devices (1001, 1002) connected to the same network and / or the same user account may share their respective electronic device information with each other. For example, each of the plurality of electronic devices (1001, 1002) may store information of the first electronic device (1001) and information of the second electronic device (1002). A description regarding the information of the electronic devices is further explained in FIG. 2.
[0052] Each of the plurality of electronic devices (1001, 1002) according to one embodiment of the present disclosure may include a microphone for receiving a user's speech command. In an environment where the plurality of electronic devices (1001, 1002) are present in a home, when a user speaks, at least one of the plurality of electronic devices (1001, 1002) may receive the user's speech command through the microphone. At least one of the plurality of electronic devices (1001, 1002) may determine which device the user's speech command should be processed by based on the user's speech command and information of each of the plurality of electronic devices (1001, 1002). For example, at least one of the plurality of electronic devices (1001, 1002) can be determined as an AI service providing device by using information regarding the device specifications of each of the plurality of electronic devices (1001, 1002), the service that can be provided, the source information of the content currently being displayed, etc.
[0053] For example, in operation 10, at least one of the plurality of electronic devices (1001, 1002) can obtain a user's speech command. For example, at least one of the plurality of electronic devices (1001, 1002) can obtain a speech command such as "Send the PPT document I just created to Anthony of the UX team."
[0054] In operation 20, at least one device among the plurality of electronic devices (1001, 1002) that has obtained the user's speech command can determine an AI service provider corresponding to the user's speech command. The at least one device can determine at least one of the plurality of electronic devices (1001, 1002) as an AI service provider based on the user's speech command, information regarding the first electronic device (1001), and information regarding the second electronic device (1002). For example, the at least one device can determine the first electronic device (1001) (e.g., a PC), which is capable of running a PPT document creation program, an email program, a contact program, etc., as the AI service provider. Alternatively, for example, because the speech command is expressed in natural language, the at least one device can determine a device capable of performing natural language processing (NLP) as the AI service provider.
[0055] In operation 30, among the plurality of electronic devices (1001, 1002), the device determined as the AI service provider can perform an operation corresponding to the speech command. For example, if the first electronic device (1001) receives the user's speech command and determines itself as the AI service provider, the first electronic device (1001) can identify the context and intent through natural language processing of the user's speech command and perform an operation using a PPT document creation program, an email program, a contact program, etc. Alternatively, for example, if the first electronic device (1001) receives the user's speech command and determines the second electronic device (1002) as the AI service provider, the first electronic device (1001) can transmit the user's speech command to the second electronic device (1002). In this case, the second electronic device (1002) can analyze the user's speech command received from the first electronic device (1001) and perform an operation corresponding to the user's speech command.
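Operations 10 to 30 can be sketched roughly as follows. This Python example is only an illustration under stated assumptions: the `Device` record, the keyword table in `required_programs` (a toy stand-in for real natural language processing), and the scoring rule in `determine_provider` are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical record for one electronic device (e.g., 1001 or 1002)."""
    name: str
    programs: set = field(default_factory=set)  # programs the device can run
    supports_nlp: bool = True


def required_programs(command):
    """Toy keyword lookup standing in for real natural language processing."""
    keywords = {"ppt": "presentation", "send": "email", "team": "contacts"}
    lowered = command.lower()
    return {program for key, program in keywords.items() if key in lowered}


def determine_provider(command, devices):
    """Operation 20: pick the NLP-capable device covering the most needed programs.

    Assumes at least one device supports NLP; ties go to the first device listed.
    """
    needed = required_programs(command)
    candidates = [d for d in devices if d.supports_nlp]
    return max(candidates, key=lambda d: len(needed & d.programs))


def handle_command(receiver, command, devices):
    """Operations 10 to 30: return (provider name, whether the command is forwarded)."""
    provider = determine_provider(command, devices)
    forwarded = provider is not receiver  # forwarded when the receiver is not the provider
    return provider.name, forwarded


pc = Device("PC (1001)", {"presentation", "email", "contacts"})
monitor = Device("Monitor (1002)", {"video"})
command = "Send the PPT document I just created to Anthony of the UX team."

result_from_pc = handle_command(pc, command, [pc, monitor])
result_from_monitor = handle_command(monitor, command, [pc, monitor])
```

In this sketch the PC is selected in both cases; when the monitor is the device that received the command, the command would be forwarded to the PC rather than handled locally.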
[0056] In an AI service provision environment comprising a plurality of electronic devices (1001, 1002), the plurality of electronic devices (1001, 1002) may determine a device among the plurality of electronic devices (1001, 1002) capable of providing a more efficient and effective AI service based on a user's speech command and information of each of the plurality of electronic devices (1001, 1002). The determined electronic device can provide an AI service efficiently and effectively by performing an operation corresponding to the user's speech command. This is further explained in FIGS. 2 to 6.
[0057] For example, at least one of the plurality of electronic devices (1001, 1002) may acquire a user’s speech command and determine an AI service providing device based on source information of content displayed on at least one of the plurality of electronic devices (1001, 1002). This is further explained in FIGS. 7 to 10.
[0058] For example, if all of the plurality of electronic devices (1001, 1002) are equipped with microphones, at least one of the plurality of electronic devices (1001, 1002) may determine a device to activate the microphone and obtain a speech command from a user through the determined microphone. This is further explained in FIGS. 11 to 13.
[0059] FIG. 2 is a drawing for explaining information about a plurality of electronic devices according to an embodiment of the present disclosure. FIG. 2 illustrates a system including a plurality of electronic devices (1001, 1002) that provide AI services. When either the first electronic device (1001) or the second electronic device (1002) is referred to as an 'electronic device', the other of the first electronic device (1001) or the second electronic device (1002) may be referred to as an 'external electronic device'.
[0060] Referring to FIG. 2, a plurality of electronic devices (1001, 1002) according to one embodiment of the present disclosure may be connected to each other. For example, the plurality of electronic devices (1001, 1002) may be connected to each other through the same network. For example, the network may include at least one of a local area network, a wide area network, or a mobile network. For example, the plurality of electronic devices (1001, 1002) may be directly connected through a local area network.
[0061] According to one embodiment of the present disclosure, a plurality of electronic devices (1001, 1002) may be connected to the same user account. For example, a server (2000) may manage user account information and information of the plurality of electronic devices (1001, 1002) connected to the user account. For example, a user may create a user account by accessing the server (2000) through the plurality of electronic devices (1001, 1002). The user account may be identified by an ID and password set by the user. The server (2000) may register the plurality of electronic devices (1001, 1002) to the user account according to a defined procedure. For example, the server (2000) may register the plurality of electronic devices (1001, 1002) by connecting identification information (e.g., serial number or MAC address) of the plurality of electronic devices (1001, 1002) to the user account. Multiple electronic devices (1001, 1002) can be indirectly connected through a network and a server (2000).
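The registration of devices to a user account can be pictured with the following minimal Python sketch. The `AccountServer` class, its method names, and the sample identifiers are hypothetical; the sketch omits authentication (ID and password handling) entirely and only shows how identification information might be linked to an account.

```python
class AccountServer:
    """Hypothetical stand-in for the server (2000) that links devices to accounts."""

    def __init__(self):
        # Maps an account ID to the set of device identifiers registered to it
        self._devices_by_account = {}

    def register(self, account_id, device_identifier):
        """Link a device identifier (e.g., serial number or MAC address) to an account."""
        self._devices_by_account.setdefault(account_id, set()).add(device_identifier)

    def peers_of(self, account_id, device_identifier):
        """Return the other devices registered to the same account."""
        registered = self._devices_by_account.get(account_id, set())
        return registered - {device_identifier}


server = AccountServer()
server.register("user@example.com", "SN-PC-0001")          # example serial number
server.register("user@example.com", "AA:BB:CC:DD:EE:02")   # example MAC address

pc_peers = server.peers_of("user@example.com", "SN-PC-0001")
```

With such a lookup, a device could learn which other devices share its account even when the devices are not directly connected.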
[0062] Multiple electronic devices (1001, 1002) connected to a network and / or the same user account may share their respective electronic device information with one another. The multiple electronic devices (1001, 1002) may transmit their own information externally and store information received from the outside in their respective memories. According to one embodiment of the present disclosure, the multiple electronic devices (1001, 1002) may transmit, store, manage, or update each other's electronic device information. For example, the first electronic device (1001) may store information of the first electronic device (1001) and information of the second electronic device (1002). The second electronic device (1002) may store information of the first electronic device (1001) and information of the second electronic device (1002). The multiple electronic devices (1001, 1002) may periodically share information with one another after being connected once initially, but are not limited thereto.
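The mutual sharing and storing of device information described above might look like the following Python sketch. The `DeviceNode` class, the `sync` helper, and the sample fields are assumptions made for illustration; actual transport, scheduling of periodic sharing, and data formats are not specified here.

```python
class DeviceNode:
    """Hypothetical device that stores its own information and that of its peers."""

    def __init__(self, device_id, info):
        self.device_id = device_id
        # Memory of known device information, keyed by device ID
        self.known = {device_id: dict(info)}

    def own_info(self):
        return self.known[self.device_id]

    def share_with(self, peer):
        """Transmit a copy of this device's information to a peer, which stores it."""
        peer.known[self.device_id] = dict(self.own_info())


def sync(nodes):
    """One sharing round: every device sends its information to every other device."""
    for sender in nodes:
        for receiver in nodes:
            if sender is not receiver:
                sender.share_with(receiver)


pc = DeviceNode("1001", {"type": "PC", "npu_occupancy": 0.2})
monitor = DeviceNode("1002", {"type": "monitor", "npu_occupancy": 0.0})
sync([pc, monitor])

# A later change on the PC reaches the monitor on the next sharing round
pc.own_info()["npu_occupancy"] = 0.9
sync([pc, monitor])
```

Calling `sync` again after a local change models the periodic sharing mentioned above, so that each device keeps an up-to-date copy of the other's information.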
[0063] In one embodiment of the present disclosure, information of each of the plurality of electronic devices (1001, 1002) may be used to determine an AI service provider corresponding to a user's speech command. For example, each of the plurality of electronic devices (1001, 1002) may determine an AI service provider capable of processing a user's speech command more efficiently and effectively based on information of the first electronic device (1001) and information of the second electronic device (1002). For example, among the plurality of electronic devices (1001, 1002), the device that acquires the user's speech command may decide whether to designate itself or an external electronic device as the AI service provider, based on the user's speech command and information of each of the plurality of electronic devices (1001, 1002).
[0064] In one embodiment of the present disclosure, information of each of a plurality of electronic devices may be used to determine a method of providing an AI service (e.g., device-based or server-based) corresponding to a user's speech command. For example, a plurality of electronic devices (1001, 1002) may determine a method of providing an AI service that can process a user's speech command more efficiently and effectively based on information of a first electronic device (1001) and information of a second electronic device (1002). For example, a device among the plurality of electronic devices (1001, 1002) that receives a user's speech command may determine whether the method of providing the AI service is device-based or server-based based on the user's speech command and information of each of the plurality of electronic devices (1001, 1002). For example, an AI service providing device may provide at least one of a device-based AI service or a server-based AI service in response to a user's speech command.
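The choice between a device-based and a server-based AI service can be sketched as a simple rule. The disclosure does not state a specific rule at this point, so the inputs (`npu_occupancy`, `online`), the `occupancy_limit` threshold, and the ordering of the checks in the following Python example are all assumptions.

```python
def choose_service_method(supports_on_device, supports_server,
                          npu_occupancy, online, occupancy_limit=0.8):
    """Return "device-based", "server-based", or None if neither is possible.

    Hypothetical rule: prefer on-device processing while the AI-dedicated
    processor has headroom, otherwise fall back to the server when reachable.
    """
    if supports_on_device and npu_occupancy < occupancy_limit:
        return "device-based"
    if supports_server and online:
        return "server-based"
    if supports_on_device:
        # Processor is busy but no server is available, so run locally anyway
        return "device-based"
    return None


idle_device = choose_service_method(True, True, npu_occupancy=0.3, online=True)
busy_device = choose_service_method(True, True, npu_occupancy=0.95, online=True)
busy_offline = choose_service_method(True, True, npu_occupancy=0.95, online=False)
```

Preferring the on-device path first reflects the fast response and offline availability attributed to on-device AI services, while the server path covers cases where local resources are constrained.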
[0065] In one embodiment of the present disclosure, information of an electronic device may include information used to determine an AI service providing device and / or an AI service providing method corresponding to a user's speech command. In one embodiment of the present disclosure, information of an electronic device may include device specification information, capability information, on-device AI related information, and content source information.
[0066] In one embodiment of the present disclosure, device specification information may represent detailed information related to the hardware and performance of the device. For example, the device specification information may include device type information (e.g., PC, monitor, etc.), information regarding at least one processor included in the electronic device (e.g., NPU, CPU, GPU, etc.) (e.g., processor type information, processor capacity information, processor core count information, processor performance information), and information regarding at least one memory included in the electronic device (e.g., RAM) (e.g., memory type information, memory capacity information, memory performance information). For example, the processor performance information may include performance information of an AI-dedicated processor (e.g., NPU), and the performance information of the AI-dedicated processor may be determined by resource information (e.g., number of arithmetic units (M1, M2, M3, M4), memory capacity, frequency bandwidth, etc.). The greater the amount of resources, the better the performance of the processor may be. For example, the device specification information may further include AI service type information, on-device AI service information, performance information of an on-device AI model, etc. For example, information on the type of AI service may indicate the types of AI services that the device can provide (e.g., object recognition, natural language processing, etc.). For example, on-device AI service information may include at least one of information on whether the device supports device-based AI services, whether it supports server-based AI services, or information on the occupancy rate (or remaining resource amount) of an AI-dedicated processor (e.g., NPU). For example, performance information on an on-device AI model may include information on the number of parameters of the AI model. 
In general, an AI model with more parameters may perform better, but its computation time may also increase.
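As an illustration of the information categories in paragraphs [0065] and [0066], the following minimal Python sketch groups them into a single record. The class name, field names, and units are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class DeviceInfo:
    """Hypothetical record of the information held about one device."""
    device_id: str
    device_type: str                                          # e.g. "PC" or "monitor"
    ai_service_types: Set[str] = field(default_factory=set)   # e.g. {"nlp"}
    capabilities: Set[str] = field(default_factory=set)       # e.g. {"send_email"}
    supports_device_based_ai: bool = False
    supports_server_based_ai: bool = False
    npu_performance: float = 0.0     # illustrative score; higher is better
    npu_occupancy: float = 0.0       # fraction of the NPU in use, 0.0 to 1.0
    model_parameter_count: int = 0   # size of the on-device AI model
    content_source: Optional[str] = None   # id of the device sourcing the screen content

    @property
    def npu_remaining(self) -> float:
        """Remaining NPU resources as a fraction of the total."""
        return 1.0 - self.npu_occupancy


# Example record for a PC whose NPU is 25% occupied.
pc = DeviceInfo("pc", "PC", {"nlp"}, {"send_email"}, True, True, 40.0, 0.25)
```

Keeping the occupancy rate and deriving the remaining amount from it reflects that the description treats the two as alternative expressions of the same quantity.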
[0067] In one embodiment of the present disclosure, an AI service providing device corresponding to a user's utterance command may be determined based on device specification information. For example, an electronic device may identify the device specifications necessary for providing an AI service corresponding to a user's utterance command and determine a device that supports those device specifications as the AI service providing device. For example, if the user's utterance command is natural language text, a device capable of performing natural language processing (NLP) may be determined as the AI service providing device. Additionally, for example, if both devices are capable of performing natural language processing, a device capable of performing device-based natural language processing (i.e., an on-device AI device) may be determined as the AI service providing device. Furthermore, for example, if there are multiple on-device AI devices, the device among them with better NPU performance, better AI model performance, or a larger remaining amount of NPU resources may be determined as the AI service providing device. For example, if no device is equipped with the corresponding model, a device providing a server-based AI service may be determined as the AI service providing device. However, the disclosure is not limited to the above examples.
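The selection order in paragraph [0067] can be sketched as a short cascade. This is only an illustrative sketch: the `Candidate` type, its field names, and the tie-break order (NPU performance first, then remaining NPU resources) are assumptions, since the description lists the criteria as alternatives without ranking them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    on_device_nlp: bool       # can run NLP with an on-device model
    server_based_ai: bool     # can provide a server-based AI service
    npu_performance: float = 0.0
    npu_remaining: float = 0.0


def choose_provider(candidates: List[Candidate]) -> Optional[Candidate]:
    """Prefer on-device NLP devices; fall back to a server-based provider."""
    on_device = [c for c in candidates if c.on_device_nlp]
    if on_device:
        # Assumed tie-break: NPU performance, then remaining NPU resources.
        return max(on_device, key=lambda c: (c.npu_performance, c.npu_remaining))
    server_based = [c for c in candidates if c.server_based_ai]
    return server_based[0] if server_based else None


pc = Candidate("pc", on_device_nlp=True, server_based_ai=True,
               npu_performance=40.0, npu_remaining=0.5)
monitor = Candidate("monitor", on_device_nlp=True, server_based_ai=True,
                    npu_performance=10.0, npu_remaining=0.9)
chosen = choose_provider([pc, monitor])
```

With these hypothetical values, `chosen` is the PC because both devices support on-device NLP and the PC has the higher NPU performance score.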
[0068] In one embodiment of the present disclosure, capability information may represent functional information that can be processed by the device. For example, capability information may include information regarding the ability to perform specific tasks, such as sending emails, converting file extensions (e.g., converting PPT to DOC), running a meeting program and attending a meeting, providing weather information, providing stock market conditions, providing news, setting alarms, and running applications on the device.
[0069] In one embodiment of the present disclosure, an AI service providing device corresponding to a user's speech command may be determined based on capability information. For example, an electronic device may identify the type of capability required to provide an AI service corresponding to a user's speech command and determine a device that supports the corresponding capability as the AI service providing device.
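A minimal sketch of capability-based selection follows. It assumes the speech command has already been mapped to a single capability name; that mapping, the capability names, and the function name are hypothetical.

```python
from typing import Dict, List, Set


def devices_supporting(required: str, capabilities: Dict[str, Set[str]]) -> List[str]:
    """Return the ids of devices whose capability set contains `required`."""
    return sorted(dev for dev, caps in capabilities.items() if required in caps)


# Hypothetical capability information shared between two devices.
capabilities = {
    "pc": {"send_email", "convert_file", "join_meeting"},
    "monitor": {"weather", "set_alarm"},
}
alarm_devices = devices_supporting("set_alarm", capabilities)
```

When more than one device supports the capability, the other information categories (device specification, on-device AI-related, or content source information) could be used to choose among them.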
[0070] In one embodiment of the present disclosure, on-device AI-related information may include information for determining an AI service providing device and / or information for determining an AI service providing method. For example, on-device AI-related information may include at least one of whether device-based AI services are supported, whether server-based AI services are supported, performance information of an AI-dedicated processor, occupancy information of an AI-dedicated processor, and performance information of an on-device AI model. On-device AI-related information may include elements that overlap with some of the elements included in the device specification information described above. Additionally, on-device AI-related information may include network environment information of the electronic device. For example, if the network environment of the electronic device is offline, server-based AI services cannot be provided, so the electronic device may determine the AI service providing method to be device-based.
[0071] In one embodiment of the present disclosure, a method for providing AI services (AI service provision method) may be determined based on at least one of device specification information or on-device AI-related information. For example, an electronic device may determine a provision method regarding whether the AI service providing device provides AI services based on the device or based on the server. For example, if the network environment is offline, the provision method may be determined to be device-based. For example, if the user desires fast computation processing, the provision method may be determined to be server-based. For example, if the current amount of remaining NPU resources is low, the provision method may be determined to be server-based. For example, if the AI service providing device is in a state of paid subscription to a server-based AI service, the provision method may be determined to be server-based.
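The four examples in paragraph [0071] can be combined into one decision function, sketched below. The order in which the conditions are checked, the 0.2 threshold for a "low" remaining NPU amount, and the device-based default are assumptions made for illustration.

```python
from enum import Enum


class ProvisionMethod(Enum):
    DEVICE = "device-based"
    SERVER = "server-based"


def choose_method(online: bool, wants_fast: bool, npu_remaining: float,
                  has_server_subscription: bool,
                  low_npu_threshold: float = 0.2) -> ProvisionMethod:
    """Pick device-based or server-based provision from the listed conditions."""
    if not online:
        # Offline: a server-based service cannot be provided.
        return ProvisionMethod.DEVICE
    if wants_fast or has_server_subscription or npu_remaining < low_npu_threshold:
        return ProvisionMethod.SERVER
    # Assumed default when no condition points to the server.
    return ProvisionMethod.DEVICE
```

Checking the network state first means that an offline device stays device-based even when other conditions would favor the server.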
[0072] In one embodiment of the present disclosure, the content source information may represent information of a source device (or source) for content displayed on the screen of an electronic device. For example, a first electronic device (1001) and a second electronic device (1002) may be connected to each other via a wireless connection method such as mirroring or casting, or via a wired connection method through a cable (e.g., HDMI). In this case, one of the plurality of electronic devices (1001, 1002) may be a source device that transmits content, and the other of the plurality of electronic devices (1001, 1002) may be a sink device that receives and outputs content.
[0073] Alternatively, for example, the first electronic device (1001) and the second electronic device (1002) may each be a device equipped with an operating system (OS) and an internet connection function. In this case, the first electronic device (1001) and the second electronic device (1002) may each execute various types of applications using the internally installed operating system to output content. In this case, the source of the content on the electronic device screen may be the electronic device itself.
[0074] In one embodiment of the present disclosure, a first electronic device (1001) and a second electronic device (1002) may share content source information with each other. For example, the electronic device may store at least one of the content source information of the electronic device or the content source information of an external electronic device. For example, the external electronic device may transmit content source information to the electronic device whenever the source of the content being displayed changes. The electronic device may store the content source information received from the external electronic device in memory.
[0075] In one embodiment of the present disclosure, an AI service providing device corresponding to a user's speech command may be determined based on source information of content displayed on at least one screen among a plurality of electronic devices (1001, 1002). For example, an electronic device may determine the device corresponding to the source of content displayed on at least one screen among the electronic device or an external electronic device as the AI service providing device. For example, if the source of content output by the electronic device is an external electronic device, the external electronic device may be determined as the AI service providing device because it is more efficient for the external electronic device to process the user's speech command. Additionally, for example, if the source of content output by the electronic device is the electronic device itself, the electronic device may be determined as the AI service providing device because it is more efficient for the electronic device itself to process the user's speech command.
[0076] As an example, the first electronic device (1001) may be a PC and the second electronic device (1002) may be a monitor. The monitor and the PC can be connected to each other. The monitor can receive and output content from the PC via a wireless connection method or a wired connection method. In this case, the PC may be the source device and the monitor may be the sink device. Additionally, for example, the monitor may be a smart monitor equipped with an operating system and internet connection functions. In this case, in addition to receiving and outputting content from the PC, the monitor may run applications using its internally installed operating system. When the monitor receives and outputs content from the PC, the source of the content on the monitor screen may be the PC. Also, when the monitor runs an application through its internally installed operating system to output content, the source of the content on the monitor screen may be the monitor itself. Furthermore, the monitor and the PC may share content source information with each other. For example, the monitor may transmit content source information to the PC whenever the source of the displayed content changes. The PC can store the content source information received from the monitor in memory. When the PC obtains a user's speech command, if the source of the monitor's content is the PC, the PC can determine itself as the AI service provider corresponding to the user's speech command. If the source of the monitor's content is the monitor, the PC can determine the monitor as the AI service provider corresponding to the user's speech command. However, the disclosure is not limited to this example, and the same applies when the first electronic device (1001) is the monitor and the second electronic device (1002) is the PC.
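The PC and monitor example can be sketched as a small table that the PC updates whenever the monitor reports a change of content source. The class and method names are hypothetical, and the fallback to the PC itself when no report has been received is an assumption.

```python
from typing import Dict


class ContentSourceTable:
    """Hypothetical store kept by one device (here, the PC)."""

    def __init__(self, self_id: str) -> None:
        self.self_id = self_id
        self._source_of: Dict[str, str] = {}   # displaying device -> source device

    def on_source_changed(self, display_id: str, source_id: str) -> None:
        """Record the content source reported for a displaying device."""
        self._source_of[display_id] = source_id

    def provider_for(self, display_id: str) -> str:
        """Return the device that sources the content shown on `display_id`."""
        # Assumption: with no report yet, the device handles the command itself.
        return self._source_of.get(display_id, self.self_id)


table = ContentSourceTable("pc")
table.on_source_changed("monitor", "pc")        # monitor shows content from the PC
first = table.provider_for("monitor")
table.on_source_changed("monitor", "monitor")   # monitor runs its own application
second = table.provider_for("monitor")
```

Here `first` is the PC and `second` is the monitor, matching the two cases described in paragraph [0076].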
[0077] FIG. 3 is a block diagram showing the configuration of an electronic device according to one embodiment of the present disclosure.
[0078] Referring to FIG. 3, an electronic device (1000) according to one embodiment of the present disclosure may be any one of the plurality of electronic devices (1001, 1002) of FIG. 1 and FIG. 2. An external electronic device may be another of the plurality of electronic devices (1001, 1002) of FIG. 1 and FIG. 2.
[0079] An electronic device (1000) according to one embodiment of the present disclosure may include a processor (1100) (e.g., including a processing circuit), a communication unit (1200) (e.g., including a communication circuit), an input interface (1300) (e.g., including an input circuit), and a memory (1400). However, not all of the illustrated components are essential components. The electronic device (1000) may be implemented by more components than those illustrated, or by fewer components. In the present disclosure, a 'module' may be implemented by at least one processor included in the electronic device (1000) executing software such as program code, instructions, algorithms, and data structures stored in memory included in the electronic device (1000). In the following, operations described as being performed by a module of the electronic device (1000) may actually be performed by at least one processor included in the electronic device (1000).
[0080] The communication unit (1200) includes various communication circuits and can connect the electronic device (1000) to peripheral devices, external devices, servers, mobile terminals, etc. under the control of the processor (1100). The communication unit (1200) may include various communication circuits included in at least one communication module. The communication unit (1200) may include a short-range communication module, a wireless internet module, wired Ethernet, etc., corresponding to the performance and structure of the electronic device (1000).
[0081] A short-range communication module is a module for short-range communication and may include, but is not limited to, a WLAN module (Wi-Fi module), a Bluetooth module, a Zigbee module, an infrared (IrDA, infrared Data Association) module, a WFD (Wi-Fi Direct) module, etc.
[0082] A wireless internet module is a module for wireless internet access and may be built into or external to a device. The wireless internet module may include a WLAN module, a Wibro (Wireless broadband) module, etc. The wireless internet module may be used for the source device (100) to communicate with a server device. A WLAN module may be used as a wireless internet module if it serves to connect to the internet through an access point.
[0083] In one embodiment of the present disclosure, the electronic device (1000) may be connected to an external electronic device through a communication unit (1200). The electronic device (1000) may be connected to the same network as the external electronic device through the communication unit (1200).
[0084] In one embodiment of the present disclosure, an electronic device (1000) may be connected to a server (e.g., 2000 in FIG. 2) through a communication unit (1200). The electronic device (1000) may register a user account by accessing the server through the communication unit (1200). The electronic device (1000) may be indirectly connected to an external electronic device through the server.
[0085] In one embodiment of the present disclosure, an electronic device (1000) can share its respective electronic device information with an external electronic device through a communication unit (1200). A plurality of electronic devices can transmit their respective electronic device information to each other through the communication unit (1200), store it in a memory (1400), manage it, or update it.
[0086] In one embodiment of the present disclosure, when the electronic device (1000) determines an external electronic device as an AI service provider, it can transmit data regarding a user's speech command to the external electronic device through the communication unit (1200).
[0087] In one embodiment of the present disclosure, when the electronic device (1000) determines to provide a server-based AI service, it may transmit data regarding a user's speech command to an AI server through the communication unit (1200). The electronic device (1000) may request processing of the user's speech command from the AI server through the communication unit (1200) and receive analysis data of the user's speech command from the AI server.
[0088] The input interface (1300) includes various input circuits and can receive user input for controlling an electronic device (1000) under the control of a processor (1100). The input interface (1300) may include, but is not limited to, various forms of user input devices including a touch panel for detecting a user's touch, a button for receiving a user's push operation, a wheel for receiving a user's rotation operation, a keyboard, a dome switch, a microphone (1310) for voice recognition, a motion detection sensor for sensing motion, etc.
[0089] In one embodiment of the present disclosure, an electronic device (1000) may obtain a speech command from a user requesting an AI service through a microphone (1310). The user's speech command may include voice data. According to one embodiment of the present disclosure, the user's speech command may include specific details of a task or command requested by the user regarding the AI service.
[0090] However, the disclosure is not limited thereto, and an electronic device (1000) according to one embodiment of the present disclosure may receive a command from a user requesting an AI service through various types of user input devices (e.g., mouse, keyboard, touch panel, etc.). In this case, the user's speech command may include text data.
[0091] The processor (1100) includes various processing circuits and is electrically connected to components included in the electronic device (1000) to perform operations or data processing regarding the control and / or communication of components included in the electronic device (1000). In one embodiment of the present disclosure, the processor (1100) may load a request, command, or data received from at least one of the other components into memory for processing and store the processing result data in memory. According to one embodiment of the present disclosure, the processor (1100) may include at least one of a general-purpose processor such as a CPU (central processing unit), AP (application processor), DSP (Digital Signal Processor), a graphics-dedicated processor such as a GPU (graphic processing unit) or VPU (Vision Processing Unit), or an artificial intelligence-dedicated processor such as an NPU (neural processing unit). An artificial intelligence-dedicated processor may be a processor specialized for the computation of an AI model.
[0092] The processor (1100) can process input data or control other configurations to process it according to data, operation rules, algorithms, methods, or models stored in memory (1400). The processor (1100) can perform operations of predefined operation rules, algorithms, methods, or models stored in memory (1400) using the input data.
[0093] The memory (1400) is electrically connected to the processor (1100) and can store one or more modules, algorithms, operation rules, AI models, programs, instructions, or data related to the operation of components included in the electronic device (1000). For example, the memory (1400) can store one or more modules, algorithms, operation rules, AI models, programs, instructions, or data for processing and controlling the processor (1100). The memory (1400) may include, but is not limited to, at least one type of storage medium among flash memory type, hard disk type, multimedia card micro type, card type memory (e.g., SD or XD memory, etc.), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, magnetic disk, and optical disk.
[0094] In one embodiment of the present disclosure, the memory (1400) may include a voice preprocessing module (1410), a device determination module (1420), and a command processing module (1430), each of which may include various circuits and / or executable program instructions. The memory (1400) may store information (1440) of a plurality of electronic devices. The information (1440) of a plurality of electronic devices may include information of an electronic device and information of an external electronic device.
[0095] A voice preprocessing module (1410) according to one embodiment of the present disclosure may perform preprocessing (e.g., tokenizing) on a user's utterance command. The preprocessed user's utterance command may be data used to determine an AI service provider. For example, an electronic device (1000) may perform tokenization on a user's utterance command to obtain one or more tokens corresponding to the user's utterance command. Here, tokenization refers to the process of dividing text or voice data into tokens, which are smaller semantic units. Tokenization may be a preprocessing operation for data to be input into a natural language processing (NLP) model, which converts text or voice into a form that the device can understand. A token may be a small semantic unit extracted from a user's utterance command. For example, each token may contain at least a part of the user's utterance command.
[0096] A voice preprocessing module (1410) according to one embodiment of the present disclosure can perform Automatic Speech Recognition (ASR). An electronic device (1000) can convert a user's spoken voice into a user's spoken text through automatic speech recognition. An electronic device (1000) can perform tokenization using the user's spoken text. Alternatively, an electronic device (1000) according to one embodiment of the present disclosure may tokenize the user's spoken voice itself to obtain acoustic or semantic tokens.
[0097] A voice preprocessing module (1410) according to one embodiment of the present disclosure identifies whether a trigger command is included in a user's speech command, and if a trigger command is included, it can perform tokenization. The electronic device (1000) can perform tokenization of the user's speech command if the user's speech command includes a trigger command such as "Bixby". However, it is not limited thereto, and the electronic device (1000) can perform tokenization of the user's speech command even if there is no trigger command.
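The trigger check and tokenization in paragraphs [0095] to [0097] can be sketched as follows, assuming ASR has already produced the text. The regular-expression word split is a simple stand-in for a real NLP tokenizer, and the function name and `require_trigger` flag are hypothetical.

```python
import re
from typing import List, Optional

TRIGGER = "bixby"   # example trigger command taken from the description


def preprocess(text: str, require_trigger: bool = True) -> Optional[List[str]]:
    """Return tokens for the command, or None if a required trigger is missing."""
    lowered = text.lower()
    if require_trigger and TRIGGER not in lowered:
        return None
    # Stand-in tokenizer: lowercase runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", lowered)


tokens = preprocess("Bixby, summarize this page")
```

Passing `require_trigger=False` corresponds to the variant in which tokenization is performed even without a trigger command.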
[0098] A device determination module (1420) according to one embodiment of the present disclosure can determine an AI service providing device corresponding to a user's speech command. By executing one or more instructions stored in the device determination module (1420), the electronic device (1000) can determine an AI service providing device corresponding to a user's speech command among the electronic device (1000) and external electronic devices based on information (1440) of a plurality of electronic devices.
[0099] In the information (1440) of a plurality of electronic devices according to one embodiment of the present disclosure, the information of each electronic device may include information used to determine an AI service providing device corresponding to a user's speech command. For example, the information of the electronic device may include at least one of device specification information, capability information, on-device AI related information, or content source information.
[0100] An electronic device (1000) according to one embodiment of the present disclosure can determine an AI service providing device corresponding to a user's speech command based on information of the electronic device (1000) and information of an external electronic device through a device determination module (1420). For example, the electronic device (1000) can determine an AI service providing device corresponding to a user's speech command based on at least one of the device specification information, capability information, on-device AI-related information, or screen content information of the electronic device (1000), or the device specification information, capability information, on-device AI-related information, or screen content information of an external electronic device.
[0101] For example, the electronic device (1000) can identify the device specifications required to provide an AI service corresponding to a speech command based on the device specification information of the electronic device (1000) and the device specification information of an external electronic device, and determine a device that supports the said device specifications as an AI service providing device. For example, the electronic device (1000) can determine an AI service providing device based on at least one of device type information (e.g., PC or monitor), information about at least one processor, information about at least one memory, performance information of an AI dedicated processor, AI service type information, on-device AI service information (e.g., whether device-based AI service is supported, whether server-based AI service is supported, occupancy information of an AI dedicated processor), and performance information of an on-device AI model.
[0102] For example, the electronic device (1000) can select a device capable of providing a capability corresponding to a user's speech command based on the capability information of the electronic device (1000) and the capability information of an external electronic device.
[0103] For example, the electronic device (1000) may select a device capable of providing an on-device AI service in response to a user's speech command based on on-device AI-related information of the electronic device (1000) and on-device AI-related information of an external electronic device. For example, the electronic device (1000) may determine an AI service providing device based on at least one of whether device-based AI service is supported, whether server-based AI service is supported, performance information of an AI-dedicated processor, occupancy information of an AI-dedicated processor, performance information of an on-device AI model, or network environment information of the electronic device (1000). This is further explained in FIGS. 4 to 6.
[0104] For example, the electronic device (1000) may determine an AI service providing device in response to a user's speech command based on source information of content displayed on at least one screen of the electronic device (1000) or an external electronic device. For example, if content is displayed on at least one screen of the electronic device (1000) or an external electronic device and the user's speech command is related to the content, the electronic device (1000) may determine a source device that generates the content as an AI service providing device. This is further explained in FIGS. 7 and 8.
[0105] A device determination module (1420) according to one embodiment of the present disclosure can determine a method of providing an AI service corresponding to a user's speech command. By executing one or more instructions stored in the device determination module (1420), the electronic device (1000) can determine whether to provide the AI service based on a device or based on a server, based on information (1440) of a plurality of electronic devices.
[0106] A device determination module (1420) according to one embodiment of the present disclosure may determine a microphone activation device to receive a user's speech command among a plurality of electronic devices. By executing one or more instructions stored in the device determination module (1420), the electronic device (1000) may determine a microphone activation device to receive a user's speech command among the electronic device (1000) and an external electronic device based on information (1440) of the plurality of electronic devices. A detailed description thereof is provided in FIGS. 11 to 13.
[0107] A command processing module (1430) according to one embodiment of the present disclosure can provide an AI service corresponding to a user's utterance command. In a plurality of electronic device environments, when an electronic device (1000) is determined to be an AI service providing device, the electronic device (1000) can perform an operation to provide an AI service corresponding to a user's utterance command. The electronic device (1000) may use at least one AI model to identify the intention included in the user's utterance command. The electronic device (1000) may execute the AI model using one or more AI-dedicated processors provided in the electronic device (1000).
[0108] In a plurality of electronic device environments according to one embodiment of the present disclosure, when an external electronic device is determined to be an AI service providing device, the electronic device (1000) can transmit a user's speech command to the external electronic device through a communication unit (1200).
[0109] In a plurality of electronic device environments according to one embodiment of the present disclosure, when it is determined that a server-based AI service is provided, the electronic device (1000) can transmit a user's speech command to an AI server through a communication unit (1200). The electronic device (1000) can receive analysis information regarding the user's speech command from the AI server through the communication unit (1200).
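The three outcomes in paragraphs [0107] to [0109] (process locally, forward to the external device, or send to an AI server) can be sketched as a dispatch function. The transport and processing callables are injected placeholders, and all names are hypothetical. The sketch also assumes that an external provider decides its own provision method after receiving the command.

```python
from typing import Callable


def dispatch(command: str, provider_id: str, server_based: bool, self_id: str,
             run_local: Callable[[str], str],
             send_to_device: Callable[[str, str], str],
             send_to_server: Callable[[str], str]) -> str:
    """Route a command according to the chosen provider and provision method."""
    if provider_id != self_id:
        # An external device was chosen: forward the command to it.
        return send_to_device(provider_id, command)
    if server_based:
        # This device is the provider but uses a server-based service.
        return send_to_server(command)
    return run_local(command)


def demo(provider_id: str, server_based: bool) -> str:
    """Run dispatch on a PC with stub handlers that label the route taken."""
    return dispatch(
        "open the report", provider_id, server_based, "pc",
        run_local=lambda c: "local:" + c,
        send_to_device=lambda d, c: d + ":" + c,
        send_to_server=lambda c: "server:" + c,
    )
```

For example, `demo("monitor", False)` forwards the command to the monitor, while `demo("pc", True)` sends it to the server stub.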
[0110] Meanwhile, according to one embodiment of the present disclosure, the command processing module (1430) may be included in a separate server. Alternatively, the command processing module (1430) may be mounted only in the first electronic device and not in the second electronic device.
[0111] FIG. 4 is a flowchart illustrating a method by which an electronic device according to an embodiment of the present disclosure determines an AI service providing device. Specifically, FIG. 4 illustrates a method of operation in which a plurality of electronic devices according to an embodiment of the present disclosure provide an AI service based on a first electronic device or a second electronic device.
[0112] Referring to FIG. 4, in operation 405, the first electronic device (1001) and the second electronic device (1002) can exchange electronic device information with each other. That is, each electronic device can store and manage not only its own information but also information of an external electronic device. For example, the first electronic device (1001) can transmit information of the first electronic device (1001) to the second electronic device (1002). The second electronic device (1002) can store and manage the information of the first electronic device (1001) received from the first electronic device (1001) together with its own information, which is the information of the second electronic device (1002). The first electronic device (1001) and the second electronic device (1002) can be connected to a network and / or the same user account. A detailed description is omitted because this has been described with reference to FIG. 2.
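Operation 405 can be sketched as each device serializing its own information and storing what it receives from its peer. JSON as the exchange format, the `DeviceInfoStore` class, and the dictionary keys are assumptions; the network transport itself is omitted.

```python
import json
from typing import Dict


class DeviceInfoStore:
    """Hypothetical store holding a device's own info plus info of its peers."""

    def __init__(self, own_info: Dict) -> None:
        self.own_info = own_info
        self.peers: Dict[str, Dict] = {}

    def export(self) -> str:
        """Serialize this device's information for transmission."""
        return json.dumps(self.own_info)

    def receive(self, payload: str) -> None:
        """Store or update the information received from a peer."""
        info = json.loads(payload)
        self.peers[info["device_id"]] = info


first = DeviceInfoStore({"device_id": "1001", "device_type": "PC"})
second = DeviceInfoStore({"device_id": "1002", "device_type": "monitor"})
second.receive(first.export())   # 1001 -> 1002
first.receive(second.export())   # 1002 -> 1001
```

Keying the stored entries by device id means that a later message from the same peer replaces the earlier entry, which matches the store, manage, and update behavior described for the shared information.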
[0113] In operation 410, the first electronic device (1001) can obtain a user's speech command. A user's speech command according to one embodiment of the present disclosure may include specific details of a task or command requested by the user for an AI service.
[0114] A first electronic device (1001) according to one embodiment of the present disclosure can obtain a speech command from a user requesting an AI service through a microphone. The user's speech command may include voice data.
[0115] An electronic device (1000) according to one embodiment of the present disclosure may receive a command from a user requesting an AI service through various types of user input devices (e.g., mouse, keyboard, touch panel, etc.). In this case, the user's spoken command may include text data.
[0116] Meanwhile, the first electronic device (1001) according to one embodiment of the present disclosure may be a device determined to receive a user's speech command. For example, one of the microphones provided in each of the first electronic device (1001) and the second electronic device (1002) may be deactivated, and the other microphone may be activated. Whether the microphone is activated may be determined based on location information between the electronic device and the user, whether it is in low-power mode, content source information, etc., and this will be described later in FIG. 11.
[0117] In operation 415, the first electronic device (1001) can perform preprocessing for the user's speech command. Operation 415 can correspond to the operation of the voice preprocessing module (1410) of FIG. 3.
[0118] A first electronic device (1001) according to one embodiment of the present disclosure can convert a user's spoken voice into the user's spoken text through Automatic Speech Recognition (ASR). The first electronic device (1001) can obtain one or more tokens through tokenizing the user's spoken command. One or more tokens may be data used to determine an AI service provider.
[0119] A user’s speech command according to one embodiment of the present disclosure may include a trigger command. For example, the first electronic device (1001) may perform tokenization of the user’s speech command when the user’s speech command includes a trigger command such as “Bixby”. However, the disclosure is not limited thereto, and the electronic device (1000) may perform tokenization of the user’s speech command even without a trigger command and then perform operation 420 and the subsequent operations described below.
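As a non-limiting illustration, the preprocessing of operation 415 may be sketched as follows, assuming that ASR has already produced the spoken text. The function name `preprocess`, the regular-expression tokenizer, and the `require_trigger` flag are assumptions made for this sketch; the disclosure does not specify a tokenization algorithm.

```python
import re
from typing import List, Optional

TRIGGER_WORDS = {"bixby"}  # trigger taken from the text; the set is illustrative


def preprocess(spoken_text: str, require_trigger: bool = True) -> Optional[List[str]]:
    """Tokenize ASR output, optionally only when a trigger command is present.

    Returns None when a trigger is required but missing, otherwise the
    tokens with the trigger word itself removed.
    """
    tokens = re.findall(r"[a-z0-9']+", spoken_text.lower())
    has_trigger = any(token in TRIGGER_WORDS for token in tokens)
    if require_trigger and not has_trigger:
        return None
    return [token for token in tokens if token not in TRIGGER_WORDS]
```

Calling `preprocess(text, require_trigger=False)` corresponds to the variant in which tokenization is performed even without a trigger command.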
[0120] In operation 420, the first electronic device (1001) can determine an AI service providing device corresponding to the user's speech command based on the user's speech command, information of the first electronic device (1001), and information of the second electronic device (1002). According to one embodiment of the present disclosure, the first electronic device (1001) can determine at least one of the first electronic device (1001) or the second electronic device (1002) as an AI service providing device.
[0121] Information of an electronic device according to one embodiment of the present disclosure may include information used to determine an AI service providing device corresponding to a user's speech command. For example, the information of the electronic device may include at least one of device specification information, capability information, on-device AI related information, or content source information.
[0122] A first electronic device (1001) according to one embodiment of the present disclosure may determine an AI service providing device corresponding to a user's speech command based on at least one of the device specification information, capability information, on-device AI related information, or screen content information of the first electronic device (1001), or the device specification information, capability information, on-device AI related information, or screen content information of the second electronic device (1002).
[0123] For example, the first electronic device (1001) may determine as the AI service provider device a device with good processor performance, a device with a large memory capacity, a device with good performance of an AI dedicated processor (e.g., NPU), a device that provides device-based AI services, a device with a low current occupancy of an AI dedicated processor (e.g., NPU) (i.e., a device with a high remaining resource amount), or a device with good performance of an on-device AI model or a device with a stable network environment.
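As a non-limiting illustration, the criteria listed above may be combined into a single score per device. The weights, the field names, and the function names `score` and `choose_provider` are assumptions made for this sketch; the disclosure lists the criteria but does not state how they are weighed against each other.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DeviceSpec:
    device_id: str
    npu_performance: float       # assumed unit-free performance figure
    npu_occupancy: float         # fraction of the NPU in use, 0.0 to 1.0
    memory_gb: float
    supports_on_device_ai: bool
    network_stable: bool


def score(spec: DeviceSpec) -> float:
    """Higher is better. All weights below are illustrative assumptions."""
    total = spec.npu_performance * (1.0 - spec.npu_occupancy)  # remaining NPU capacity
    total += 0.5 * spec.memory_gb
    if spec.supports_on_device_ai:
        total += 10.0
    if spec.network_stable:
        total += 5.0
    return total


def choose_provider(specs: Iterable[DeviceSpec]) -> str:
    """Return the identifier of the device with the highest score."""
    return max(specs, key=score).device_id
```

With this scoring, a device with a strong but heavily occupied NPU can lose to a weaker device that has more remaining resources, which reflects the "high remaining resource amount" criterion.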
[0124] Operation 420 corresponds to the operation of the device determination module (1420) of FIG. 3 and is described in detail in FIG. 3.
[0125] In operations 420 and 425, if the first electronic device (1001) determines the second electronic device (1002) as an AI service provider, it may transmit a preprocessed user speech command to the second electronic device (1002). For example, the preprocessed user speech command may include one or more tokens. The second electronic device (1002) may receive the preprocessed user speech command. In operations 460 and 465, the second electronic device (1002) may use at least one AI model to analyze the preprocessed user speech command and perform an operation corresponding to the user speech command. This corresponds to operations 440 and 455 and will be described in detail below.
[0126] In operations 420 and 430, when the first electronic device (1001) determines the first electronic device (1001) as an AI service providing device, it may determine an AI service providing method corresponding to the user's speech command based on the user's speech command, information of the first electronic device (1001), and information of the second electronic device (1002). The first electronic device (1001) according to one embodiment of the present disclosure may determine whether to provide the AI service based on the device or based on the server.
[0127] Information of an electronic device according to one embodiment of the present disclosure may include information used to determine a method of providing an AI service corresponding to a user's speech command. For example, the first electronic device (1001) may determine whether to provide an AI service corresponding to a user's speech command on a device basis or on a server basis based on at least one of the device specification information, capability information, on-device AI related information, or screen content information of the first electronic device (1001), or the device specification information, capability information, on-device AI related information, or screen content information of the second electronic device (1002).
[0128] For example, if the AI service provider does not support on-device AI services, the provision method may be determined as server-based. For example, if the network environment is offline, the provision method may be determined as device-based. For example, if the performance of the embedded AI model is high and the user desires fast computation processing, the provision method may be determined as device-based. For example, if the current remaining NPU resources are low, the provision method may be determined as server-based. For example, if the AI service provider is subscribed to a paid server-based AI service, the provision method may be determined as server-based. However, the disclosure is not limited to the above examples.
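As a non-limiting illustration, four of the examples above (on-device support, offline network, low remaining NPU resources, paid server subscription) may be expressed as an ordered rule list. The rule order, the threshold `low_npu_threshold`, the device-based default, and all names are assumptions made for this sketch; the disclosure does not state how conflicting conditions are resolved.

```python
from enum import Enum


class ProvisionMethod(Enum):
    DEVICE = "device-based"
    SERVER = "server-based"


def choose_method(
    supports_on_device_ai: bool,
    online: bool,
    npu_remaining: float,               # fraction of NPU resources free, 0.0 to 1.0
    has_paid_server_subscription: bool,
    low_npu_threshold: float = 0.2,     # assumed cut-off for "low" resources
) -> ProvisionMethod:
    """Apply the example rules in an assumed priority order."""
    if not supports_on_device_ai:
        return ProvisionMethod.SERVER   # no on-device AI service available
    if not online:
        return ProvisionMethod.DEVICE   # server unreachable while offline
    if npu_remaining < low_npu_threshold:
        return ProvisionMethod.SERVER   # remaining NPU resources are low
    if has_paid_server_subscription:
        return ProvisionMethod.SERVER   # paid server-based service is available
    return ProvisionMethod.DEVICE       # assumed default when no rule applies
```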
[0129] In operations 430 and 435, if the first electronic device (1001) decides to provide AI services on a server basis, it may transmit a preprocessed user speech command to an AI server (3000). For example, the preprocessed user speech command may include one or more tokens. The AI server (3000) may be a cloud server equipped with an AI model. The AI server (3000) may be the same server as the server (2000) of FIG. 2, or it may be a different server. The AI server (3000) may receive the preprocessed user speech command. In operation 445, the AI server (3000) may analyze the preprocessed user speech command using at least one AI model. The operation of analyzing the user speech command is described later in operation 440. In operation 450, the AI server (3000) may transmit analysis information to the first electronic device (1001). The first electronic device (1001) can receive analysis information from the AI server (3000) and perform an operation corresponding to the analysis information.
[0130] In operations 430, 440, and 455, if the first electronic device (1001) determines to provide an AI service based on the device, it may operate to provide an AI service corresponding to a user's speech command. Operations 440 and 455 may correspond to the operation of the command processing module (1430) of FIG. 3.
[0131] In operation 440, the first electronic device (1001) may use at least one AI model to determine the intent included in the user's utterance command. For example, the first electronic device (1001) may use a natural language processing (NLP) and / or natural language understanding (NLU) model. The first electronic device (1001) may perform an analysis on the tokenized user's utterance command to determine the meaning, context, intent, etc. of words within the sentence. The user's utterance command may be interpreted through a natural language processing (NLP) and / or natural language understanding (NLU) model and extracted as information, numerical values, parameters, etc., representing the user's utterance intent. In operation 455, the first electronic device (1001) may perform an operation corresponding to the user's utterance command based on the analyzed information.
[0132] Meanwhile, in the method of operation of an electronic device according to one embodiment of the present disclosure, operations 430 and 435 may be omitted. That is, the determination of the method of providing AI services may be omitted, or it may be performed as a separate operation from the operation of determining the AI service providing device. For example, if the first electronic device (1001) determines the first electronic device (1001) as an AI service providing device, it may operate to provide an AI service corresponding to a user's speech command. If the first electronic device (1001) determines the second electronic device (1002) as an AI service providing device, it may transmit the user's speech command to the second electronic device (1002).
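As a non-limiting illustration, the branching of FIG. 4 after operation 420 may be sketched as a routing function. The function name `route_command`, the string labels, and the returned tuple format are hypothetical; the `method` argument defaults to device-based to model the variant in which the determination of the provision method is omitted.

```python
from typing import Tuple


def route_command(provider_id: str, own_id: str, method: str = "device") -> Tuple[str, str]:
    """Return (action, target) for a preprocessed speech command.

    provider_id: device determined as the AI service providing device (operation 420)
    method:      "device" or "server" (operation 430); defaults to "device"
    """
    if provider_id != own_id:
        # Operation 425: hand the tokens to the other electronic device.
        return ("transmit_tokens", provider_id)
    if method == "server":
        # Operation 435: hand the tokens to the AI server for analysis.
        return ("transmit_tokens", "ai_server")
    # Operations 440 and 455: analyze and execute on this device.
    return ("analyze_and_execute", own_id)
```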
[0133] FIG. 5 is a diagram illustrating the operation of a first electronic device according to one embodiment of the present disclosure providing an AI service based on the first electronic device or a second electronic device in response to a speech command.
[0134] Referring to FIG. 5, the operation of the first electronic device (1001) receiving a user's speech command and determining an AI service provider is described. The operation of the first electronic device (1001) determining the first electronic device (1001) as an AI service provider and the operation of determining the second electronic device (1002) as an AI service provider are described separately.
[0135] A first electronic device (1001) according to one embodiment of the present disclosure includes a microphone (1311), a voice preprocessing module (1411), a device determination module (1421), a command processing module (1431), and a communication unit (1201) (e.g., including a communication circuit), each of which may include various circuits and / or executable program instructions. The first electronic device (1001) may store information (1440) of a plurality of electronic devices. A second electronic device (1002) according to one embodiment of the present disclosure includes a microphone (1312), a voice preprocessing module (1412), a device determination module (1422), a command processing module (1432), and a communication unit (1202) (e.g., including a communication circuit), each of which may include various circuits and / or executable program instructions. The second electronic device (1002) may store information (1440) of a plurality of electronic devices. These can each correspond to the microphone (1310), voice preprocessing module (1410), device determination module (1420), command processing module (1430), communication unit (1200), and information (1440) of the plurality of electronic devices of the electronic device (1000) of FIG. 3.
[0136] The first electronic device (1001) can receive a user's speech command through a microphone (1311). For example, the first electronic device (1001) can receive a speech command such as "Send the PPT document I just created to Anthony of the UX team."
[0137] The first electronic device (1001) can perform preprocessing on the user's speech command through a voice preprocessing module (1411). For example, the first electronic device (1001) can tokenize the speech command to generate tokens such as “just,” “write,” “PPT document,” “UX Anthony,” and “send.”
[0138] The first electronic device (1001) can determine, through a device determination module (1421), whether the first electronic device (1001) or the second electronic device (1002) is to be the AI service provider, based on information (1440) of a plurality of electronic devices. For example, the first electronic device (1001) can identify, based on the tokens, necessary device specification information (e.g., NPU performance, NPU occupancy rate, whether on-device AI service is supported), necessary types of AI services (e.g., Natural Language Processing (NLP)), and necessary capability information (e.g., PPT document creation program, email program, and contact program). For example, the first electronic device (1001) can match the information identified in response to the user's speech command against the information (1440) of a plurality of electronic devices. For example, the first electronic device (1001) can identify whether each electronic device can execute a device-based NLP model, the NPU performance of each electronic device, the NPU occupancy rate of each electronic device, etc., based on information (1440) of a plurality of electronic devices. Additionally, for example, the first electronic device (1001) can identify an electronic device having a capability corresponding to a PPT document creation program, an email program, a contact program, etc., based on information (1440) of a plurality of electronic devices.
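As a non-limiting illustration, matching the capabilities implied by the tokens against the stored device information may be sketched as follows. The mapping `CAPABILITIES_BY_TOKEN`, the capability labels, and the function names are hypothetical; the disclosure does not specify how tokens are mapped to required capabilities.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical mapping from tokens to the capabilities they imply.
CAPABILITIES_BY_TOKEN: Dict[str, Set[str]] = {
    "ppt document": {"ppt_program"},
    "send": {"email_program", "contact_program"},
}


def required_capabilities(tokens: Iterable[str]) -> Set[str]:
    """Union of the capabilities implied by all recognized tokens."""
    required: Set[str] = set()
    for token in tokens:
        required |= CAPABILITIES_BY_TOKEN.get(token.lower(), set())
    return required


def capable_devices(tokens: Iterable[str], device_capabilities: Dict[str, Set[str]]) -> List[str]:
    """Identifiers of devices whose capabilities cover everything required."""
    required = required_capabilities(tokens)
    return sorted(
        device_id
        for device_id, capabilities in device_capabilities.items()
        if required <= capabilities
    )
```

The resulting candidate list could then be narrowed further using specification information such as NPU performance or occupancy rate.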
[0139] When the first electronic device (1001) determines the first electronic device (1001) as an AI service provider through the device determination module (1421), it can transmit token data to the command processing module (1431). The first electronic device (1001) can provide device-based AI services. The first electronic device (1001) can provide AI services by executing an AI model through the command processing module (1431) and executing one or more applications (or programs) related to AI services. For example, the first electronic device (1001) can identify the intent included in the user's utterance command through natural language processing and perform context analysis. Based on the user's intent, the first electronic device (1001) can execute a PPT document creation program, an email program, and a contact program. The first electronic device (1001) can perform the action of identifying a stored PPT document, identifying the contact of UX team member Anthony, and sending the identified PPT document to Anthony via email.
[0140] Alternatively, if the first electronic device (1001) determines the second electronic device (1002) as an AI service provider through the device determination module (1421), it may transmit token data to the second electronic device (1002) through the communication unit (1201). The second electronic device (1002) may receive token data through the communication unit (1202) and perform an AI service provision operation corresponding to a user's speech command through the command processing module (1432). In this case, the microphone (1312), the voice preprocessing module (1412), and the device determination module (1422) may not operate, but are not limited thereto.
[0141] FIG. 6 is a diagram illustrating the operation of a first electronic device according to one embodiment of the present disclosure providing an AI service based on a server in accordance with a speech command.
[0142] Referring to FIG. 6, the operation of the first electronic device (1001) determining the first electronic device (1001) as an AI service providing device and providing a server-based AI service is described. In FIG. 6, descriptions that overlap with FIG. 5 are omitted.
[0143] The first electronic device (1001) can determine, through the device determination module (1421), that the AI service is to be provided on a server basis. For example, the first electronic device (1001) can provide AI services on a server basis if the current NPU occupancy rate is high or the NPU performance is low. The first electronic device (1001) can transmit token data to the AI server (3000) through the communication unit (1201). The token data is input into the AI model of the AI server (3000), and analysis information of the speech command can be extracted from the AI model. The first electronic device (1001) can receive the analysis information of the speech command from the AI server (3000) through the communication unit (1201).
[0144] The first electronic device (1001) can provide AI services by executing one or more applications (or programs) related to AI services through the analysis information of speech commands received from the AI server (3000) and the command processing module (1431). The first electronic device (1001) can provide server-based AI services. For example, the first electronic device (1001) can execute a PPT document creation program, an email program, and a contact program based on the user's intent, and transmit related data. The first electronic device (1001) can perform the action of identifying a stored PPT document, identifying the contact of UX team member Anthony, and sending the identified PPT document to Anthony via email. In this case, the second electronic device (1002) does not operate, so AI services can be provided efficiently.
[0145] Hereinafter, with reference to FIGS. 7 to 10, an operation in which an electronic device (1000) according to one embodiment of the present disclosure determines an AI service providing device based on content source information will be described.
[0146] FIG. 7 is a flowchart illustrating a method in which an electronic device according to one embodiment of the present disclosure determines an AI service providing device based on content source information. The electronic device (1000) according to FIG. 7 may be any one of the plurality of electronic devices (1001, 1002) of FIG. 1, FIG. 2, FIG. 4, FIG. 5, and FIG. 6. The external electronic device may be another one of the plurality of electronic devices (1001, 1002).
[0147] Referring to FIG. 7, in operation 710, the electronic device (1000) can obtain a speech command from a user.
[0148] A user’s speech command according to one embodiment of the present disclosure may include specific details of a task or command requested by the user regarding an AI service. An electronic device (1000) according to one embodiment of the present disclosure may obtain a user’s speech command requesting an AI service through a microphone. The user’s speech command may include voice data. This is the same as described in operation 410 of FIG. 4.
[0149] An electronic device (1000) according to one embodiment of the present disclosure can convert a user's spoken voice into the user's spoken text through automatic speech recognition (ASR). The electronic device (1000) can obtain one or more tokens through tokenizing the user's spoken command. One or more tokens may be data used to determine an AI service provider. In this regard, it is the same as described in the voice preprocessing module (1410) of FIG. 3 and operation 415 of FIG. 4.
[0150] Meanwhile, an electronic device (1000) according to one embodiment of the present disclosure may be a device for receiving a user's speech command. For example, one of the microphones provided in the electronic device (1000) and the external electronic device may be deactivated, and the other microphone may be activated. Whether the microphone is activated may be determined based on location information between the electronic device and the user, whether it is in low-power mode, content source information, etc., and this will be described later in FIG. 11.
[0151] In operation 720, the electronic device (1000) can determine an AI service provider based on the user's speech command and source information of the content displayed on the screen.
[0152] In one embodiment of the present disclosure, content may be displayed on at least one screen of an electronic device (1000) or an external electronic device. The electronic device (1000) may store source information of the content displayed on at least one screen of the electronic device (1000) or an external electronic device. For example, the electronic device (1000) may store at least one of source information of the content of the electronic device (1000) or source information of the content of the external electronic device. Source information of the content of the external electronic device may be received from the external electronic device and stored in the memory of the electronic device (1000). The electronic device (1000) and the external electronic device are connected via a network and / or the same user account and may share content source information with each other. This is described in FIG. 2.
[0153] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service providing device in response to a user's speech command based on source information of content displayed on at least one screen of the electronic device (1000) or an external electronic device. For example, when content is displayed on at least one screen of the electronic device (1000) or an external electronic device, the electronic device (1000) may determine a source device that generates the content as an AI service providing device.
[0154] For example, assume a case where content is displayed on the screen of an external electronic device. The electronic device (1000) can identify the source information of the content displayed on the screen of the external electronic device. If the source of the content displayed by the external electronic device is the electronic device (1000) itself, the electronic device (1000) can determine itself as the AI service provider. If the source of the content displayed by the external electronic device is the external electronic device, the electronic device (1000) can determine the external electronic device as the AI service provider.
[0155] Alternatively, for example, assume a case where content is displayed on the screen of an electronic device (1000). The electronic device (1000) can identify the source information of the content displayed on the screen of the electronic device (1000). If the source of the content displayed by the electronic device (1000) is the electronic device (1000) itself, the electronic device (1000) can determine itself as the AI service provider. If the source of the content displayed by the electronic device (1000) is an external electronic device, the electronic device (1000) can determine the external electronic device as the AI service provider.
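As a non-limiting illustration, the two cases above reduce to a lookup of the source device for the screen on which the content is displayed. The dictionary `content_sources` (mapping a displaying device to the device that generates its content), the function names, and the returned labels are hypothetical.

```python
from typing import Dict, Optional


def provider_from_source(displaying_id: str, content_sources: Dict[str, str]) -> Optional[str]:
    """Return the device that generates the content shown on `displaying_id`."""
    return content_sources.get(displaying_id)


def decide(own_id: str, displaying_id: str, content_sources: Dict[str, str]) -> str:
    """Describe what this device does once the source device is known."""
    provider = provider_from_source(displaying_id, content_sources)
    if provider is None:
        return "no_source_info"            # fall back to other device information
    if provider == own_id:
        return "process_locally"           # this device is the AI service provider
    return "transmit_to:" + provider       # the external device is the provider
```

The same function covers content displayed on the external device and content displayed on the electronic device itself, because only the source recorded for the displaying screen matters.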
[0156] In one embodiment of the present disclosure, if a user’s utterance command is related to the control of content, it may be efficient to have the source device that generates the content control the content. To perform content control operations at the sink device, additional data may be requested from the source device or additional communication may be required. Conversely, if the source device performs the content control operations, the task can be processed immediately without unnecessary data transmission. Furthermore, if the user’s utterance command is related to content, the source device can better understand the context and intent of the command. For example, if a user commands, "Save the video I am currently watching," the sink device is merely outputting the video and therefore does not know information regarding the save location or format. On the other hand, since the source device is the creator of the video, it can process the command more accurately.
[0157] An electronic device (1000) according to one embodiment of the present disclosure can identify whether a user's speech command is related to content. For example, the electronic device (1000) can identify whether content-related keywords are included in the tokenized user's speech command. The electronic device (1000) can identify source information of content only when content-related keywords are included in the tokenized user's speech command. However, the disclosure is not limited thereto, and the electronic device (1000) can identify source information of content in all cases, even if content-related keywords are not included in the tokenized user's speech command, and can determine an AI service provider using the source information of content. Since the user is likely to issue a command regarding content when content is displayed, the electronic device (1000) can use source information of content by default.
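As a non-limiting illustration, the keyword check and the variant that always consults source information may be sketched as follows. The keyword set `CONTENT_KEYWORDS`, the `always_check` flag, and the function names are assumptions made for this sketch.

```python
from typing import Iterable

# Hypothetical set of content-related keywords.
CONTENT_KEYWORDS = frozenset({"screen", "video", "watching", "playing"})


def is_content_related(tokens: Iterable[str]) -> bool:
    """True if any token is a content-related keyword."""
    return any(token.lower() in CONTENT_KEYWORDS for token in tokens)


def should_check_source(tokens: Iterable[str], content_displayed: bool,
                        always_check: bool = False) -> bool:
    """Decide whether content source information is consulted.

    always_check=True models the variant in which source information is used
    whenever content is displayed, regardless of keywords.
    """
    if not content_displayed:
        return False
    return always_check or is_content_related(tokens)
```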
[0158] Meanwhile, an electronic device (1000) according to one embodiment of the present disclosure may determine at least one of a plurality of electronic devices as an AI service providing device by further utilizing information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information) in addition to source information of the content. This is the same as described in the device determination module (1420) of FIG. 3 and operation 420 of FIG. 4.
[0159] Meanwhile, an electronic device (1000) according to one embodiment of the present disclosure may determine an AI service provision method (e.g., device-based, server-based) by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI-related information, content source information). This is the same as described in the device determination module (1420) of FIG. 3 and operation 430 of FIG. 4.
[0160] In operation 730, the electronic device (1000) can be controlled to perform an action corresponding to a user's speech command through an AI service provider. For example, if the electronic device (1000) determines the electronic device (1000) as an AI service provider, it can process the user's speech command and perform an action corresponding to the user's speech command. This is as described in the command processing module (1430) of FIG. 3 and in operations 440 and 455 of FIG. 4. Alternatively, for example, if the electronic device (1000) determines an external electronic device as an AI service provider, it can control a communication unit to transmit the user's speech command to the external electronic device.
[0161] It is assumed that content is displayed on the screen of an external electronic device. An electronic device (1000) according to one embodiment of the present disclosure determines the external electronic device as an AI service provider based on the fact that the content source of the external electronic device is the external electronic device, and can transmit a user's speech command to the external electronic device through a communication unit. Based on the fact that the content source of the external electronic device is the electronic device (1000), the electronic device (1000) determines the electronic device (1000) as an AI service provider and can perform an operation corresponding to the user's speech command.
[0162] It is assumed that content is displayed on the screen of an electronic device (1000). An electronic device (1000) according to one embodiment of the present disclosure can determine the electronic device (1000) as an AI service provider based on the fact that the content source of the electronic device (1000) is the electronic device (1000) itself, and can perform an operation corresponding to a user's speech command. The electronic device (1000) can determine the external electronic device as an AI service provider based on the fact that the content source of the electronic device (1000) is an external electronic device, and can transmit a user's speech command to the external electronic device through a communication unit.
[0163] FIG. 8 is a flowchart illustrating a method in which a first electronic device and a second electronic device provide an AI service based on content source information according to one embodiment of the present disclosure.
[0164] Referring to FIG. 8, in operation 810, the first electronic device (1001) and the second electronic device (1002) can exchange content source information with each other. For example, the first electronic device (1001) may store at least one of the source information of content displayed on the screen of the first electronic device (1001) or the source information of content displayed on the screen of the second electronic device (1002). The first electronic device (1001) and the second electronic device (1002) may be connected to a network and / or the same user account. A detailed description is omitted here because it has been given with reference to FIG. 2.
[0165] In operation 820, the first electronic device (1001) can obtain a user's speech command. Operation 820 can correspond to operation 410 of FIG. 4.
[0166] In operation 830, the first electronic device (1001) can perform preprocessing for the user's speech command. Operation 830 can correspond to operation 415 of FIG. 4.
[0167] In operation 840, the first electronic device (1001) can identify source information of content displayed on at least one screen of the first electronic device (1001) or the second electronic device (1002). In operations 840 and 850, if the source of content displayed on at least one screen of the first electronic device (1001) or the second electronic device (1002) is the second electronic device (1002), the first electronic device (1001) can transmit a preprocessed user speech command to the second electronic device (1002). The second electronic device (1002) can receive the preprocessed user speech command. In operations 880 and 890, the second electronic device (1002) can use at least one AI model to analyze the preprocessed user speech command and perform an operation corresponding to the user speech command. Operations 880 and 890 correspond to operations 860 and 870.
[0168] In operations 840 and 860, the first electronic device (1001) can process a user's speech command through the first electronic device (1001) when the source of the content displayed on the screen of at least one of the first electronic device (1001) or the second electronic device (1002) is the first electronic device (1001).
[0169] In operations 860 and 870, when the first electronic device (1001) determines the first electronic device (1001) as an AI service provider, it may operate to provide an AI service corresponding to a user's speech command. In operation 860, the first electronic device (1001) may use at least one AI model to determine the intent included in the user's speech command. In operation 870, the first electronic device (1001) may perform an operation corresponding to the user's speech command based on the analyzed information. Operations 860 and 870 may correspond to operations 440 and 455 of FIG. 4.
[0170] Referring to FIGS. 9 and 10, an operation in which the first electronic device (1001) determines an AI service provider device when the first electronic device (1001) receives a user’s speech command and the second electronic device (1002) displays content is described. Although the first electronic device (1001) is exemplified as a PC and the second electronic device (1002) is a monitor, it is not limited thereto.
[0171] FIG. 9 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure provides an AI service based on content source information of a second electronic device. In FIG. 9, the content source of the second electronic device (1002) is exemplified as the first electronic device (1001).
[0172] In operation 910, the first electronic device (1001) can transmit content data to the second electronic device (1002). In operation 920, the second electronic device (1002) can receive content data and display the content on a screen. In this case, the first electronic device (1001) and the second electronic device (1002) can be connected to each other via a wireless connection method such as mirroring or casting, or a wired connection method via a cable (e.g., HDMI). The first electronic device (1001) is a source device that transmits content to the second electronic device (1002), and the second electronic device (1002) may be a sink device that outputs the received content.
[0173] In operation 930, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the first electronic device (1001). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
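As a non-limiting illustration, sharing content source information whenever the source changes may be sketched as follows. The class name `ContentSourceReporter` and the `send` callback standing in for the communication unit are hypothetical; periodic sharing is not modeled in this sketch.

```python
from typing import Callable, Optional


class ContentSourceReporter:
    """Notifies a peer device only when the displayed content's source changes."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send                      # stand-in for the communication unit
        self._last_source: Optional[str] = None

    def update(self, source_id: str) -> bool:
        """Record the current source; transmit it if it differs from the last one.

        Returns True when a notification was sent.
        """
        if source_id == self._last_source:
            return False
        self._last_source = source_id
        self._send(source_id)
        return True
```

A sink device could call `update` each time its input changes, for example from an HDMI input fed by the first electronic device to an internally executed application.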
[0174] In operation 940, the first electronic device (1001) can receive a speech command from a user. For example, the first electronic device (1001) can receive a speech command such as "Tell me who is on the screen right now." The first electronic device (1001) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 950, the first electronic device (1001) can determine the first electronic device (1001) as an AI service provider because the content source of the second electronic device (1002) is the first electronic device (1001) itself. Additionally, the first electronic device (1001) can identify the type of AI service required (e.g., natural language processing, object recognition, etc.) based on the tokens. Since the first electronic device (1001) has stored its own device specification information, capability information, on-device AI related information, etc., it can determine whether the first electronic device (1001) can support the necessary AI service and determine the first electronic device (1001) as an AI service providing device.
[0175] In operation 960, the first electronic device (1001) can provide AI services by analyzing a user's speech command through a natural language processing model and performing an operation to recognize an object on the screen through an object recognition model. For example, the first electronic device (1001) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
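The tokenization and service-type identification described for operations 940 to 960 can be sketched as follows. This is only an illustration: the keyword table `KEYWORD_TO_SERVICE` and the functions `tokenize` and `required_services` are hypothetical, and the disclosure does not state how tokens are mapped to AI service types. A real system would likely use a trained language model rather than a keyword filter.

```python
import re
from typing import List, Set

# Hypothetical keyword table: the source does not say how tokens map to service types.
KEYWORD_TO_SERVICE = {
    "who": "natural_language_processing",
    "person": "object_recognition",
    "screen": "object_recognition",
}


def tokenize(command: str) -> List[str]:
    """Lowercase the command and keep only the keywords listed in the table above."""
    words = re.findall(r"[a-z]+", command.lower())
    return [word for word in words if word in KEYWORD_TO_SERVICE]


def required_services(tokens: List[str]) -> Set[str]:
    """Collect the AI service types implied by the tokens."""
    return {KEYWORD_TO_SERVICE[token] for token in tokens}
```

For a command such as "Tell me who the person is on the screen right now.", this sketch keeps the tokens "who", "person", and "screen" and reports that both natural language processing and object recognition are needed.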
[0176] FIG. 10 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure provides an AI service based on content source information of a second electronic device. In FIG. 10, the content source of the second electronic device (1002) is exemplified as the second electronic device (1002).
[0177] In operation 1010, the second electronic device (1002) can display content on a screen. The second electronic device (1002) may be a device equipped with an operating system (OS) and an internet connection function. For example, the second electronic device (1002) may run an OTT (Over-The-Top) application using an internally installed operating system to output content. In this case, the source of the content on the screen of the second electronic device (1002) may be the second electronic device (1002) itself.
[0178] In operation 1020, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the second electronic device (1002). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
[0179] In operation 1030, the first electronic device (1001) can receive a user's speech command. For example, the first electronic device (1001) can receive a speech command such as "Tell me who the person is on the screen right now." The first electronic device (1001) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1040, the first electronic device (1001) can determine the second electronic device (1002) as an AI service provider because the content source of the second electronic device (1002) is the second electronic device (1002). Additionally, the first electronic device (1001) can identify the type of AI service required (e.g., natural language processing, object recognition, etc.) based on the tokens. Since the first electronic device (1001) and the second electronic device (1002) have already stored each other's device specification information, capability information, on-device AI-related information, etc., the first electronic device (1001) can determine whether the second electronic device (1002) can support the necessary AI service and can determine the second electronic device (1002) as an AI service provider. In operation 1050, the first electronic device (1001) can transmit a user's speech command to the second electronic device (1002). The second electronic device (1002) can receive the user's speech command.
[0180] In operation 1060, the second electronic device (1002) can provide AI services by performing an operation of analyzing a user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the second electronic device (1002) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
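The two cases of FIG. 9 and FIG. 10 follow the same rule: the device that is the source of the displayed content is chosen as the AI service provider, provided it supports the required services. A minimal sketch under that reading is shown below. `DeviceInfo`, `choose_provider`, and the service names are hypothetical, and returning `None` when the source device cannot support the service is an assumption, since this part of the description does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set


@dataclass(frozen=True)
class DeviceInfo:
    """Stored capability information for one device (hypothetical structure)."""

    device_id: str
    supported_services: FrozenSet[str]


def choose_provider(
    content_source_id: str,
    required: Set[str],
    devices: Dict[str, DeviceInfo],
) -> Optional[str]:
    """Pick the content source device if it supports every required service."""
    source = devices.get(content_source_id)
    if source is not None and required <= source.supported_services:
        return source.device_id
    # Assumption: the fallback behaviour is not defined here, so report "no provider".
    return None
```

With stored information for devices 1001 and 1002, passing the content source reported by the second electronic device yields 1001 in the FIG. 9 case and 1002 in the FIG. 10 case.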
[0181] Hereinafter, with reference to FIGS. 11 to 13, an operation in which an electronic device (1000) according to one embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information will be described.
[0182] FIG. 11 is a flowchart illustrating a method for an electronic device according to one embodiment of the present disclosure to determine a microphone activation device and an AI service providing device. The electronic device (1000) according to FIG. 11 may be any one of the plurality of electronic devices (1001, 1002) of FIG. 1, FIG. 2, FIG. 4, FIG. 5, FIG. 6, FIG. 9, and FIG. 10. The external electronic device may be another one of the plurality of electronic devices (1001, 1002).
[0183] Referring to FIG. 11, in operation 1110, the electronic device (1000) may decide to activate at least one microphone of the electronic device (1000) or an external electronic device.
[0184] In one embodiment of the present disclosure, both the electronic device (1000) and the external electronic device may be equipped with a microphone. In this case, it may be more efficient to activate only one of the microphones rather than activating the microphones of both the electronic device (1000) and the external electronic device.
[0185] An electronic device (1000) according to one embodiment of the present disclosure may determine to activate the microphone of at least one of the electronic device (1000) or an external electronic device based on information of a plurality of electronic devices. The information of the plurality of electronic devices may include information for determining whether to activate a microphone. For example, it may include at least one of location information of each electronic device relative to the user, whether a low-power mode is set, or content source information.
[0186] For example, the electronic device (1000) may decide to activate the microphone of at least one of the electronic device (1000) or an external electronic device based on location information of the electronic device (1000) relative to the user. For example, in an environment where a monitor and a PC are used, if the monitor is located closer to the user, only the monitor's microphone may be activated and the PC's microphone may be deactivated. The location information of each device relative to the user may be predetermined or measured by a distance sensor, etc.
[0187] For example, when a low-power mode is set, the electronic device (1000) may decide which microphone of the electronic device (1000) or an external electronic device to activate so as to reduce power consumption. For example, the electronic device (1000) may decide to activate the microphone of a device connected to a power source, a device with low power consumption, or a device with high performance.
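The distance-based and low-power criteria can be combined in many ways; the disclosure does not fix a priority between them. The sketch below assumes one possible policy: in low-power mode, prefer devices connected to a power source, and otherwise choose the device closest to the user. `MicCandidate`, `pick_microphone`, and the field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MicCandidate:
    """One device that has a microphone (hypothetical structure)."""

    device_id: str
    distance_m: float      # distance between the device and the user
    on_mains_power: bool   # True if the device is connected to a power source


def pick_microphone(candidates: List[MicCandidate], low_power_mode: bool) -> str:
    """Return the id of the single device whose microphone should be active."""
    if not candidates:
        raise ValueError("at least one candidate device is required")
    pool = candidates
    if low_power_mode:
        # Assumed policy: restrict to powered devices when any exist.
        powered = [c for c in candidates if c.on_mains_power]
        pool = powered or candidates
    # Among the remaining devices, the one closest to the user wins.
    return min(pool, key=lambda c: c.distance_m).device_id
```

Other orderings, such as ranking by measured power draw or processing performance, would fit the description equally well.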
[0188] For example, the electronic device (1000) may decide to activate the microphone of at least one of the electronic device (1000) or the external electronic device based on source information of content displayed on the screen of at least one of the electronic device (1000) or the external electronic device. If the user's speech command is related to the control of the content, it may be efficient to have the source device generating the content receive the user's speech command. For example, the electronic device (1000) may activate the microphone of the source device generating the content when content is displayed on the screen of at least one of the electronic device (1000) or the external electronic device.
[0189] For example, assume a case where content is displayed on the screen of an external electronic device. The electronic device (1000) can identify the source information of the content displayed on the screen of the external electronic device. If the source of the content displayed by the external electronic device is the electronic device (1000) itself, the electronic device (1000) may decide to activate its own microphone. If the source of the content displayed by the external electronic device is the external electronic device, the electronic device (1000) may decide to activate the microphone of the external electronic device.
[0190] Alternatively, for example, assume that content is displayed on the screen of an electronic device (1000). The electronic device (1000) can identify the source information of the content displayed on the screen of the electronic device (1000). If the source of the content displayed by the electronic device (1000) is the electronic device (1000) itself, the electronic device (1000) may decide to activate its own microphone. If the source of the content displayed by the electronic device (1000) is an external electronic device, the electronic device (1000) may decide to activate the microphone of the external electronic device.
[0191] The electronic device (1000) can generate a microphone activation signal and activate its own microphone. Additionally, the electronic device (1000) can transmit a microphone deactivation signal to an external electronic device through a communication unit to deactivate the external electronic device's microphone. The external electronic device can receive the microphone deactivation signal from the electronic device (1000) through its communication unit and deactivate its microphone.
[0192] Alternatively, the electronic device (1000) may generate a microphone deactivation signal and deactivate its own microphone. To activate the microphone of an external electronic device, the electronic device (1000) may transmit a microphone activation signal to the external electronic device through a communication unit. The external electronic device may receive the microphone activation signal from the electronic device (1000) through its communication unit and activate its own microphone.
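The content-source rule and the resulting signals can be summarized in a short sketch. It assumes exactly two devices and models the signal as an enum value; `MicSignal` and `plan_microphones` are hypothetical names, and the actual signal format and transport are not specified in the disclosure.

```python
from enum import Enum
from typing import Tuple


class MicSignal(Enum):
    """Signal sent to the external electronic device (hypothetical encoding)."""

    ACTIVATE = "activate"
    DEACTIVATE = "deactivate"


def plan_microphones(
    local_id: str, external_id: str, content_source_id: str
) -> Tuple[bool, MicSignal]:
    """Return (local microphone active?, signal to send to the external device)."""
    if content_source_id == local_id:
        # The local device generates the content: listen locally, mute the other.
        return True, MicSignal.DEACTIVATE
    if content_source_id == external_id:
        # The external device generates the content: hand the microphone over.
        return False, MicSignal.ACTIVATE
    raise ValueError("content source is neither of the two known devices")
```

The returned pair corresponds to the two alternatives described above: activating the local microphone while sending a deactivation signal, or deactivating it while sending an activation signal.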
[0193] In the present disclosure, the entity determining the microphone activation device is exemplified as an electronic device (1000), but is not limited thereto. For example, the entity determining the microphone activation device may be an external electronic device, or a server (e.g., 2000 of FIG. 2) that manages information between the electronic device (1000) and the external electronic device.
[0194] In operation 1120, the electronic device (1000) can obtain a user's speech command through the microphone when its microphone is activated. This is the same as described in operation 710 of FIG. 7.
[0195] In operation 1130, the electronic device (1000) can determine an AI service providing device based on the user's speech command, information of the electronic device (1000), and information of an external electronic device. This is the same as described in the device determination module (1420) of FIG. 3.
[0196] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service providing device in response to a user's speech command based on source information of content displayed on at least one screen of the electronic device (1000) or an external electronic device. For example, when content is displayed on at least one screen of the electronic device (1000) or an external electronic device, the electronic device (1000) may determine a source device that generates the content as an AI service providing device. This is the same as described in the device determination module (1420) of FIG. 3 and operation 720 of FIG. 7.
[0197] An electronic device (1000) according to one embodiment of the present disclosure may determine at least one of a plurality of electronic devices as an AI service providing device by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information) in addition to source information of content. This is the same as described in the device determination module (1420) of FIG. 3 and operation 420 of FIG. 4.
[0198] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service provision method by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information, content source information). This is the same as described in the device determination module (1420) of FIG. 3 and operation 430 of FIG. 4.
[0199] In operation 1140, the electronic device (1000) can perform control so that an operation corresponding to the user's speech command is performed through the determined AI service providing device.
[0200] For example, if the electronic device (1000) determines itself as the AI service provider, it can process the user's speech command and perform an operation corresponding to the user's speech command. This is as described for the command processing module (1430) of FIG. 3 and operations 440 and 455 of FIG. 4.
[0201] When the electronic device (1000) determines an external electronic device as an AI service provider, it can control the communication unit to transmit the user's speech command to the external electronic device.
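Operation 1140 amounts to routing the command: process it locally when the electronic device is the provider, and forward it otherwise. The sketch below shows that routing with injected callables. `dispatch_command`, `process_locally`, and `send_to_external` are hypothetical names standing in for the command processing module and the communication unit.

```python
from typing import Callable


def dispatch_command(
    command: str,
    provider_id: str,
    local_id: str,
    process_locally: Callable[[str], str],
    send_to_external: Callable[[str, str], None],
) -> str:
    """Run the command on this device or forward it to the chosen provider."""
    if provider_id == local_id:
        # This device is the AI service provider: handle the command here.
        return process_locally(command)
    # Another device is the provider: transmit the command to it.
    send_to_external(provider_id, command)
    return "forwarded"
```

Passing the two callables in keeps the routing decision separate from the speech processing and transport code, which makes the rule easy to test in isolation.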
[0202] FIGS. 12 and 13 illustrate examples in which, when the second electronic device (1002) displays content, the first electronic device (1001) determines a microphone activation device, and the first electronic device (1001) or the second electronic device (1002) determines an AI service providing device. Although the first electronic device (1001) is exemplified as a PC and the second electronic device (1002) as a monitor, the present disclosure is not limited thereto.
[0203] FIG. 12 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device. In FIG. 12, the content source of the second electronic device (1002) is exemplified as the first electronic device (1001).
[0204] In operation 1210, the first electronic device (1001) can transmit content data to the second electronic device (1002). In operation 1220, the second electronic device (1002) can receive content data and display the content on a screen. In this case, the first electronic device (1001) and the second electronic device (1002) can be connected to each other via a wireless connection method such as mirroring or casting, or a wired connection method via a cable (e.g., HDMI). The first electronic device (1001) is a source device that transmits content to the second electronic device (1002), and the second electronic device (1002) may be a sink device that outputs the received content.
[0205] In operation 1230, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the first electronic device (1001). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
[0206] In operation 1240, the first electronic device (1001) can determine the first electronic device (1001) as a microphone activation device because the content source of the second electronic device (1002) is the first electronic device (1001) itself. In operation 1250, the first electronic device (1001) can transmit a microphone deactivation signal to the second electronic device (1002) through a communication unit. The microphone of the first electronic device (1001) can be activated, and the microphone of the second electronic device (1002) can be deactivated.
[0207] In operation 1260, the first electronic device (1001) can receive a speech command from a user. For example, the first electronic device (1001) can receive a speech command such as "Tell me who the person is on the screen right now." The first electronic device (1001) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1270, the first electronic device (1001) can determine the first electronic device (1001) as an AI service provider because the content source of the second electronic device (1002) is the first electronic device (1001) itself. In operation 1280, the first electronic device (1001) can provide an AI service by performing an operation of analyzing the user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the first electronic device (1001) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
[0208] FIG. 13 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device. In FIG. 13, the content source of the second electronic device (1002) is exemplified as the second electronic device (1002).
[0209] In operation 1310, the second electronic device (1002) can display content on a screen. The second electronic device (1002) may be a device equipped with an operating system (OS) and an internet connection function. For example, the second electronic device (1002) may run an OTT (Over-The-Top) application using an internally installed operating system to output content. In this case, the source of the content on the screen of the second electronic device (1002) may be the second electronic device (1002) itself.
[0210] In operation 1320, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the second electronic device (1002). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
[0211] In operation 1330, the first electronic device (1001) can determine the second electronic device (1002) as a microphone activation device because the content source of the second electronic device (1002) is the second electronic device (1002). In operation 1340, the first electronic device (1001) can transmit a microphone activation signal to the second electronic device (1002) through a communication unit. The microphone of the first electronic device (1001) can be deactivated, and the microphone of the second electronic device (1002) can be activated.
[0212] In operation 1350, the second electronic device (1002) can receive a speech command from a user. For example, the second electronic device (1002) can receive a speech command such as "Tell me who the person is on the screen right now." The second electronic device (1002) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1360, the second electronic device (1002) can determine the second electronic device (1002) as an AI service provider because the content source of the second electronic device (1002) is the second electronic device (1002) itself. In operation 1370, the second electronic device (1002) can provide an AI service by performing an operation of analyzing the user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the second electronic device (1002) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
[0213] FIG. 14 is a detailed block diagram of an electronic device according to one embodiment of the present disclosure.
[0214] Referring to FIG. 14, the electronic device (1000) may include a processor (1100) (e.g., including a processing circuit), memory (1400), a tuner unit (1403) (e.g., including a tuner), a communication unit (1200) (e.g., including a communication circuit), a sensing unit (1404) (e.g., including a circuit), an input / output unit (1405) (e.g., including an input / output circuit), a video processing unit (1450) (e.g., including various circuits and / or executable program instructions), a display (1460), an audio processing unit (1470) (e.g., including various circuits and / or executable program instructions), an audio output unit (1480) (e.g., including an audio output circuit), and an input interface (1300) (e.g., including an input circuit).
[0215] The tuner unit (1403) includes various circuits and can select only the frequency of the channel to be received by the electronic device (1000) from among many radio wave components by tuning through amplification, mixing, resonance, etc. of broadcast content received via wired or wireless connection. The content received through the tuner unit (1403) is decoded and separated into audio, video, and / or additional information. The separated audio, video, and / or additional information can be stored in memory (1400) under the control of the processor (1100).
[0216] The communication unit (1200) includes various communication circuits and can connect the electronic device (1000) to peripheral devices, external devices, servers, mobile terminals, etc. under the control of the processor (1100). The communication unit (1200) may include at least one communication module capable of performing wireless communication. The communication unit (1200) may include at least one of a wireless LAN module (1421), a Bluetooth module (1422), and a wired Ethernet (1423) in accordance with the performance and structure of the electronic device (1000).
[0217] The wireless LAN module (1421) can transmit and receive Wi-Fi signals with a peripheral device according to the Wi-Fi communication standard. The Bluetooth module (1422) can receive Bluetooth signals transmitted from a peripheral device according to the Bluetooth communication standard.
[0218] The sensing unit (1404) includes various circuits and detects the user's voice, the user's image, or user interaction, and may include a microphone, a camera unit, a light receiver, and a sensor.
[0219] The input / output unit (1405) includes various circuits and can receive video (e.g., dynamic image signal or still image signal), audio (e.g., voice signal or music signal), and additional information from external devices under the control of the processor (1100). The input / output unit (1405) may include one of an HDMI port (High-Definition Multimedia Interface port), a component jack, a PC port, and a USB port. In addition to these, the input / output unit (1405) may further include a DisplayPort (DP), Thunderbolt, and MHL (Mobile High-Definition Link). The input / output unit (1405) may further include ports for separate output of video and audio.
[0220] The video processing unit (1450) includes various circuits and / or executable program instructions, processes video data to be displayed by the display (1460), and can perform various image processing operations such as decoding, rendering, scaling, noise filtering, frame rate conversion, and resolution conversion on the video data. For example, the video processing unit (1450) may include various image processing circuits. For example, the video processing unit (1450) may include a media codec for processing video content.
[0221] The display (1460) can receive content from a broadcasting station, receive content from an external device such as an external server or external storage media, or output content provided by various apps, such as an OTT service provider or a content provider. The display (1460) can display video-processed content.
[0222] The audio processing unit (1470) includes various circuits and / or executable program instructions and performs processing on audio data. Various processing such as decoding, amplification, and noise filtering on audio data can be performed in the audio processing unit (1470).
[0223] The audio output unit (1480) includes various circuits and can output, under the control of the processor (1100), audio included in content received through the tuner unit (1403), audio input through the communication unit (1200) or input / output unit (1405), and audio stored in memory (1400). The audio output unit (1480) may include at least one of a speaker, headphones, or an S/PDIF (Sony/Philips Digital Interface) output terminal.
[0224] The input interface (1300) includes various input circuits and can receive user input for controlling the electronic device (1000). The input interface (1300) may include various forms of user input devices, including but not limited to a touch panel for detecting a user's touch, a button for receiving a user's push operation, a wheel for receiving a user's rotation operation, a keyboard, a dome switch, a microphone for voice recognition, and a motion detection sensor for sensing motion.
[0225] An electronic device according to one embodiment of the present disclosure includes at least one processor and a memory comprising one or more storage media for storing one or more instructions.
[0226] According to one embodiment of the present disclosure, the electronic device obtains a user's speech command by having at least one processor execute the one or more instructions individually or collectively.
[0227] According to one embodiment of the present disclosure, by having at least one processor execute the one or more instructions individually or collectively, the electronic device determines at least one of the electronic device or the external electronic device as an AI service providing device based on the user's speech command and source information of content displayed on the screen of at least one of the electronic device or the external electronic device.
[0228] According to one embodiment of the present disclosure, by having the at least one processor execute the one or more instructions individually or collectively, the electronic device controls the electronic device to perform an operation corresponding to the user's speech command through the determined AI service providing device.
[0229] The electronic device according to one embodiment of the present disclosure may further include a communication unit.
[0230] According to one embodiment of the present disclosure, the electronic device can identify source information of content displayed on the screen of the external electronic device by the at least one processor executing the one or more instructions individually or in combination.
[0231] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device can transmit the user's speech command to the external electronic device through the communication unit, based on the fact that the content source of the external electronic device is the external electronic device.
[0232] According to one embodiment of the present disclosure, by having at least one processor execute the one or more instructions individually or in combination, the electronic device can perform an operation corresponding to the user's speech command based on the fact that the content source of the external electronic device is the electronic device.
[0233] Source information of the content of the external electronic device according to one embodiment of the present disclosure may be received from the external electronic device and stored in the memory.
[0234] According to one embodiment of the present disclosure, the electronic device can identify source information of content displayed on the screen of the electronic device by having the at least one processor execute the one or more instructions individually or in combination.
[0235] According to one embodiment of the present disclosure, by having at least one processor execute one or more instructions individually or in combination, the electronic device can perform an operation corresponding to the user's speech command based on the fact that the content source of the electronic device is the electronic device.
[0236] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device can transmit the user's speech command to the external electronic device through the communication unit, based on the fact that the content source of the electronic device is the external electronic device.
[0237] According to one embodiment of the present disclosure, by having at least one processor execute one or more instructions individually or in combination, the electronic device can analyze the user's intent regarding the user's speech command from one or more tokens corresponding to the user's speech command, based on the fact that the electronic device is determined to be the AI service provider.
[0238] According to one embodiment of the present disclosure, the electronic device can execute an operation corresponding to the user's intention by having the at least one processor execute the one or more instructions individually or in combination.
[0239] An AI service providing device corresponding to a user's speech command according to one embodiment of the present disclosure may be determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
[0240] According to one embodiment of the present disclosure, the device specification information may include at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
[0241] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
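The choice between a device-based and a server-based provision method can be sketched with a subset of the listed inputs. The structure `OnDeviceAiStatus`, the function `choose_provision_method`, and the 0.8 occupancy threshold are assumptions made for illustration; the disclosure names the inputs but gives no thresholds or decision order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnDeviceAiStatus:
    """Subset of the inputs named in the text (hypothetical structure)."""

    supports_on_device_ai: bool  # whether a device-based AI service is supported
    npu_occupancy: float         # share of the AI-dedicated processor in use, 0.0 to 1.0
    network_available: bool      # simplified stand-in for network environment information


def choose_provision_method(status: OnDeviceAiStatus, max_occupancy: float = 0.8) -> str:
    """Return "device" or "server" under an assumed threshold policy."""
    if status.supports_on_device_ai and status.npu_occupancy <= max_occupancy:
        # The AI-dedicated processor has headroom: run the model on the device.
        return "device"
    if status.network_available:
        # Otherwise fall back to a server-based service when the network allows it.
        return "server"
    raise RuntimeError("neither device-based nor server-based provision is possible")
```

Performance information of the AI-dedicated processor and of the on-device AI model, which the text also lists, is left out of this sketch and could be added as further conditions.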
[0242] According to one embodiment of the present disclosure, by having the at least one processor execute the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on the screen of at least one of the electronic device or the external electronic device.
[0243] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of the external electronic device based on the fact that the content source of the external electronic device is the external electronic device.
[0244] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of the electronic device based on the fact that the content source of the external electronic device is the electronic device.
[0245] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate at least one of the microphone of the electronic device or the microphone of the external electronic device based on at least one of the locations of the electronic device and the user, the locations of the external electronic device and the user, or whether the electronic device or the external electronic device is in a low-power mode.
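One possible way to combine location and low-power mode when choosing a microphone is sketched below. The disclosure does not specify how these factors are weighed; the rule used here (prefer devices that are not in low-power mode, then pick the one nearest the user) and all names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class MicCandidate:
    device_id: str             # hypothetical device identifier
    distance_to_user_m: float  # assumed distance between device and user, in metres
    low_power_mode: bool       # whether the device is in a low-power mode


def pick_microphone(candidates: Iterable[MicCandidate]) -> Optional[str]:
    """Return the id of the device whose microphone should be activated."""
    all_candidates = list(candidates)
    awake = [c for c in all_candidates if not c.low_power_mode]
    # Fall back to every candidate if all devices are in low-power mode.
    pool = awake or all_candidates
    if not pool:
        return None
    return min(pool, key=lambda c: c.distance_to_user_m).device_id
```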
[0246] A method of operation of an electronic device for determining an Artificial Intelligence (AI) service providing device according to one embodiment of the present disclosure comprises: acquiring a user’s speech command; determining at least one of the electronic device or the external electronic device as an AI service providing device based on the user’s speech command and source information of content displayed on the screen of at least one of the electronic device or the external electronic device; and controlling the electronic device to perform an operation corresponding to the user’s speech command through the determined AI service providing device.
[0247] The method according to one embodiment of the present disclosure may further include the steps of: identifying source information of content displayed on the screen of the external electronic device; transmitting a speech command of the user to the external electronic device through a communication unit based on the fact that the content source of the external electronic device is the external electronic device; and performing an operation corresponding to the speech command of the user based on the fact that the content source of the external electronic device is the electronic device.
[0248] Source information of the content of the external electronic device according to one embodiment of the present disclosure may be received from the external electronic device and stored in the memory.
[0249] The method according to one embodiment of the present disclosure may include the steps of: identifying source information of content displayed on the screen of the electronic device; performing an operation corresponding to the user's speech command based on the fact that the content source of the electronic device is the electronic device; and transmitting the user's speech command to the external electronic device through the communication unit based on the fact that the content source of the electronic device is the external electronic device.
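The routing rule in the method above reduces to a comparison between the content source and the device handling the command. The following is a minimal sketch; the function name, the device identifiers, and the returned action strings are hypothetical.

```python
from typing import Optional, Tuple


def route_speech_command(
    content_source_id: str, self_id: str
) -> Tuple[str, Optional[str]]:
    """Decide where a speech command is handled.

    Returns ("execute_locally", None) when this device is the content
    source, otherwise ("forward", <source id>) so the command can be sent
    to the external electronic device through the communication unit.
    """
    if content_source_id == self_id:
        return ("execute_locally", None)
    return ("forward", content_source_id)


# The PC sources the content shown on its own screen, so it handles the command.
print(route_speech_command("pc", self_id="pc"))
# The content comes from the monitor, so the PC forwards the command.
print(route_speech_command("monitor", self_id="pc"))
```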
[0250] The step of controlling the electronic device to perform an operation corresponding to the user’s speech command according to one embodiment of the present disclosure may include, based on the fact that the electronic device is determined to be the AI service provider, analyzing the user’s intent regarding the user’s speech command from one or more tokens corresponding to the user’s speech command, and executing an operation corresponding to the user’s intent.
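To make the token-to-intent step concrete, the sketch below uses keyword overlap as a toy stand-in for a natural language processing model. The intent names, the keyword table, and the scoring rule are assumptions for illustration; the disclosure does not prescribe this method.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical intent table: each intent is described by a set of keywords.
INTENT_KEYWORDS: Dict[str, Set[str]] = {
    "identify_person_on_screen": {"screen", "person", "who"},
    "adjust_volume": {"volume"},
}


def analyze_intent(tokens: Iterable[str]) -> Optional[str]:
    """Return the intent whose keywords overlap most with the tokens."""
    token_set = {t.lower() for t in tokens}
    best_intent, best_overlap = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        overlap = len(token_set & keywords)
        if overlap > best_overlap:
            best_intent, best_overlap = intent, overlap
    return best_intent
```

For example, the tokens "screen", "person", and "who" map to the hypothetical `identify_person_on_screen` intent, after which the device would execute the matching operation.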
[0251] An AI service providing device corresponding to a user's speech command according to one embodiment of the present disclosure may be determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
[0252] According to one embodiment of the present disclosure, the device specification information may include at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
[0253] The method according to one embodiment of the present disclosure may further include the step of determining to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
[0254] The method according to one embodiment of the present disclosure may further include the steps of: determining to activate a microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on a screen of at least one of the electronic device or the external electronic device; determining to activate the microphone of the external electronic device based on the fact that the content source of the external electronic device is the external electronic device; and determining to activate the microphone of the electronic device based on the fact that the content source of the external electronic device is the electronic device.
[0255] A device-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory storage medium' simply means that it is a tangible device and does not contain a signal (e.g., electromagnetic waves), and the term does not distinguish between cases where data is stored semi-permanently and cases where it is stored temporarily. For example, a 'non-transitory storage medium' may include a buffer in which data is stored temporarily.
[0256] According to one embodiment, the method according to the various embodiments disclosed herein may be provided by being included in a computer program product. The computer program product may be traded between a seller and a buyer as a product. The computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read-only memory (CD-ROM)), or distributed online (e.g., download or upload) through an application store or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a portion of the computer program product (e.g., downloadable app) may be temporarily stored or temporarily created on a device-readable storage medium, such as the memory of a manufacturer's server, an application store's server, or a relay server.
[0257] Alternatively, for example, assume that content is displayed on the screen of an electronic device (1000). The electronic device (1000) can identify the source information of the content displayed on the screen of the electronic device (1000). If the source of the content displayed by the electronic device (1000) is the electronic device (1000) itself, the electronic device (1000) may decide to activate its own microphone. If the source of the content displayed by the electronic device (1000) is an external electronic device, the electronic device (1000) may decide to activate the microphone of the external electronic device.
[0258] The electronic device (1000) can generate a microphone activation signal and activate its own microphone. Additionally, to deactivate the microphone of the external electronic device, the electronic device (1000) can transmit a microphone deactivation signal to the external electronic device through a communication unit. The external electronic device can receive the microphone deactivation signal from the electronic device (1000) through the communication unit and deactivate its microphone.
[0259] Alternatively, the electronic device (1000) may generate a microphone deactivation signal and deactivate its own microphone. To activate the microphone of the external electronic device, the electronic device (1000) may transmit a microphone activation signal to the external electronic device through the communication unit. The external electronic device may receive the microphone activation signal from the electronic device (1000) through the communication unit and activate its own microphone.
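The two signalling cases described above can be sketched as one function that derives both signals from the content source. This assumes exactly two devices and that the content source is one of them; the `MicSignal` type and the identifiers are hypothetical, and the actual transport through the communication unit is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MicSignal:
    target_id: str  # device whose microphone the signal addresses
    activate: bool  # True for an activation signal, False for deactivation


def mic_signals_for_source(
    content_source_id: str, self_id: str, external_id: str
) -> List[MicSignal]:
    """Build the local and remote microphone signals for a content source.

    The first signal is applied locally; the second would be transmitted
    to the external electronic device.
    """
    local_active = content_source_id == self_id
    return [
        MicSignal(self_id, local_active),
        MicSignal(external_id, not local_active),
    ]
```

With the electronic device as the content source, the result is a local activation signal and a deactivation signal for the external device; with the external device as the source, the two are reversed.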
[0260] In the present disclosure, the entity determining the microphone activation device is exemplified as an electronic device (1000), but is not limited thereto. For example, the entity determining the microphone activation device may be an external electronic device, or a server (e.g., 2000 of FIG. 2) that manages information between the electronic device (1000) and the external electronic device.
[0261] In operation 1120, the electronic device (1000) can obtain a user's speech command through the microphone when its microphone is activated. This is the same as described in operation 710 of FIG. 7.
[0262] In operation 1130, the electronic device (1000) can determine an AI service providing device based on the user's speech command, information of the electronic device (1000), and information of an external electronic device. This is the same as described in the device determination module (1420) of FIG. 3.
[0263] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service providing device in response to a user's speech command based on source information of content displayed on the screen of at least one of the electronic device (1000) or an external electronic device. For example, when content is displayed on the screen of at least one of the electronic device (1000) or an external electronic device, the electronic device (1000) may determine the source device that generates the content as the AI service providing device. This is the same as described in the device determination module (1420) of FIG. 3 and operation 720 of FIG. 7.
[0264] An electronic device (1000) according to one embodiment of the present disclosure may determine at least one of a plurality of electronic devices as an AI service providing device by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information) in addition to source information of content. This is the same as described in the device determination module (1420) of FIG. 3 and operation 420 of FIG. 4.
[0265] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service provision method by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information, content source information). This is the same as described in the device determination module (1420) of FIG. 3 and operation 430 of FIG. 4.
[0266] In operation 1140, the electronic device (1000) can be controlled to perform an action corresponding to a user's speech command through an AI service provider.
[0267] For example, if the electronic device (1000) determines itself to be the AI service provider, it can process a user's speech command and perform an operation corresponding to the user's speech command. This is as described for the command processing module (1430) of FIG. 3 and operations 440 and 455 of FIG. 4.
[0268] When the electronic device (1000) determines an external electronic device as an AI service provider, it can control the communication unit to transmit the user's speech command to the external electronic device.
[0269] Referring to FIGS. 12 and 13, when the second electronic device (1002) displays content, the first electronic device (1001) determines a microphone activation device, and the first electronic device (1001) or the second electronic device (1002) determines an AI service providing device. Although the first electronic device (1001) is exemplified as a PC and the second electronic device (1002) is a monitor, it is not limited thereto.
[0270] FIG. 12 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device. In FIG. 12, the content source of the second electronic device (1002) is exemplified as the first electronic device (1001).
[0271] In operation 1210, the first electronic device (1001) can transmit content data to the second electronic device (1002). In operation 1220, the second electronic device (1002) can receive content data and display the content on a screen. In this case, the first electronic device (1001) and the second electronic device (1002) can be connected to each other via a wireless connection method such as mirroring or casting, or a wired connection method via a cable (e.g., HDMI). The first electronic device (1001) is a source device that transmits content to the second electronic device (1002), and the second electronic device (1002) may be a sink device that outputs the received content.
[0272] In operation 1230, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the first electronic device (1001). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
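Storing shared content source information and detecting a change of source can be sketched as a small registry. The class name, method names, and device identifiers are hypothetical, and the periodic-sharing path is not modelled here.

```python
from typing import Dict, Optional


class ContentSourceRegistry:
    """Hypothetical store of which device sources each display's content."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}

    def update(self, display_id: str, source_id: str) -> bool:
        """Record the source for a display; return True if it changed."""
        changed = self._sources.get(display_id) != source_id
        self._sources[display_id] = source_id
        return changed

    def source_of(self, display_id: str) -> Optional[str]:
        """Return the last known content source, or None if unknown."""
        return self._sources.get(display_id)


registry = ContentSourceRegistry()
registry.update("monitor", "pc")       # the monitor reports the PC as its source
registry.update("monitor", "monitor")  # the source later changes to the monitor itself
```

The boolean returned by `update` is one way a device could decide to re-evaluate the microphone activation device only when the source actually changes.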
[0273] In operation 1240, the first electronic device (1001) can determine the first electronic device (1001) as a microphone activation device because the content source of the second electronic device (1002) is the first electronic device (1001) itself. In operation 1250, the first electronic device (1001) can transmit a microphone deactivation signal to the second electronic device (1002) through a communication unit. The microphone of the first electronic device (1001) can be activated, and the microphone of the second electronic device (1002) can be deactivated.
[0274] In operation 1260, the first electronic device (1001) can receive a speech command from a user. For example, the first electronic device (1001) can receive a speech command such as "Tell me who the person is on the screen right now." The first electronic device (1001) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1270, the first electronic device (1001) can determine the first electronic device (1001) as an AI service provider because the content source of the second electronic device (1002) is the first electronic device (1001) itself. In operation 1280, the first electronic device (1001) can provide an AI service by performing an operation of analyzing the user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the first electronic device (1001) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
[0275] FIG. 13 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device. In FIG. 13, the content source of the second electronic device (1002) is exemplified as the second electronic device (1002) itself.
[0276] In operation 1310, the second electronic device (1002) can display content on a screen. The second electronic device (1002) may be a device equipped with an operating system (OS) and an internet connection function. For example, the second electronic device (1002) may run an OTT (Over-The-Top) application using an internally installed operating system to output content. In this case, the source of the content on the screen of the second electronic device (1002) may be the second electronic device (1002) itself.
[0277] In operation 1320, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the second electronic device (1002). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
[0278] In operation 1330, the first electronic device (1001) can determine the second electronic device (1002) as a microphone activation device because the content source of the second electronic device (1002) is the second electronic device (1002). In operation 1340, the first electronic device (1001) can transmit a microphone activation signal to the second electronic device (1002) through a communication unit. The microphone of the first electronic device (1001) can be deactivated, and the microphone of the second electronic device (1002) can be activated.
[0279] In operation 1350, the second electronic device (1002) can receive a speech command from a user. For example, the second electronic device (1002) can receive a speech command such as "Tell me who the person is on the screen right now." The second electronic device (1002) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1360, the second electronic device (1002) can determine the second electronic device (1002) as an AI service provider because the content source of the second electronic device (1002) is the second electronic device (1002) itself. In operation 1370, the second electronic device (1002) can provide an AI service by performing an operation of analyzing the user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the second electronic device (1002) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
[0280] FIG. 14 is a detailed block diagram of an electronic device according to one embodiment of the present disclosure.
[0281] Referring to FIG. 14, the electronic device (1000) may include a processor (1100) (e.g., including a processing circuit), memory (1400), a tuner unit (1403) (e.g., including a tuner), a communication unit (1200) (e.g., including a communication circuit), a sensing unit (1404) (e.g., including a circuit), an input / output unit (1405) (e.g., including an input / output circuit), a video processing unit (1450) (e.g., including various circuits and / or executable program instructions), a display (1460), an audio processing unit (1470) (e.g., including various circuits and / or executable program instructions), an audio output unit (1480) (e.g., including an audio output circuit), and an input interface (1300) (e.g., including an input circuit).
[0282] The tuner unit (1403) includes various circuits and can select only the frequency of the channel to be received by the electronic device (1000) from among many radio wave components by tuning through amplification, mixing, resonance, etc. of broadcast content received via wired or wireless connection. The content received through the tuner unit (1403) is decoded and separated into audio, video, and / or additional information. The separated audio, video, and / or additional information can be stored in memory (1400) under the control of the processor (1100).
[0283] The communication unit (1200) includes various communication circuits and can connect the electronic device (1000) to peripheral devices, external devices, servers, mobile terminals, etc. under the control of the processor (1100). The communication unit (1200) may include at least one communication module capable of performing wireless communication. The communication unit (1200) may include at least one of a wireless LAN module (1421), a Bluetooth module (1422), and a wired Ethernet (1423) in accordance with the performance and structure of the electronic device (1000).
[0284] The wireless LAN module (1421) can transmit and receive Wi-Fi signals with a peripheral device according to the Wi-Fi communication standard. The Bluetooth module (1422) can receive Bluetooth signals transmitted from a peripheral device according to the Bluetooth communication standard.
[0285] The sensing unit (1404) includes various circuits and detects the user's voice, user image, or user interaction, and may include a microphone, a camera unit, a light receiver, and a sensor.
[0286] The input / output unit (1405) includes various circuits and can receive video (e.g., dynamic image signal or still image signal), audio (e.g., voice signal or music signal), and additional information from external devices under the control of the processor (1100). The input / output unit (1405) may include one of an HDMI port (High-Definition Multimedia Interface port), a component jack, a PC port, and a USB port. In addition to these, the input / output unit (1405) may further include a DisplayPort (DP), Thunderbolt, and MHL (Mobile High-Definition Link). The input / output unit (1405) may further include ports for separate output of video and audio.
[0287] The video processing unit (1450) includes various circuits and / or executable program instructions, processes video data to be displayed by the display (1460), and can perform various image processing operations such as decoding, rendering, scaling, noise filtering, frame rate conversion, and resolution conversion on the video data. For example, the video processing unit (1450) may include various image processing circuits. For example, the video processing unit (1450) may include a media codec for processing video content.
[0288] The display (1460) can receive content from a broadcasting station, receive content from an external device such as an external server or external storage media, or output content provided by various apps, such as an OTT service provider or a content provider. The display (1460) can display video-processed content.
[0289] The audio processing unit (1470) includes various circuits and / or executable program instructions and performs processing on audio data. Various processing such as decoding, amplification, and noise filtering on audio data can be performed in the audio processing unit (1470).
[0290] The audio output unit (1480) includes various circuits and can output audio included in content received through the tuner unit (1403) under the control of the processor (1100), audio input through the communication unit (1200) or input / output unit (1405), and audio stored in memory (1400). The audio output unit (1480) may include at least one of a speaker, headphones, or S / PDIF (Sony / Philips Digital Interface: output terminal).
[0291] The input interface (1300) includes various input circuits and can receive user input for controlling the electronic device (1000). The input interface (1300) may include, but is not limited to, various forms of user input devices, including a touch panel for detecting a user's touch, a button for receiving a user's push operation, a wheel for receiving a user's rotation operation, a keyboard, a dome switch, a microphone for voice recognition, and a motion detection sensor for sensing motion.
[0292] An electronic device according to one embodiment of the present disclosure includes at least one processor and a memory comprising one or more storage media for storing one or more instructions.
[0293] According to one embodiment of the present disclosure, the electronic device obtains a user's speech command by having at least one processor execute the one or more instructions individually or collectively.
[0294] According to one embodiment of the present disclosure, by having at least one processor execute the one or more instructions individually or collectively, the electronic device determines at least one of the electronic device or the external electronic device as an AI service providing device based on the user's speech command and source information of content displayed on the screen of at least one of the electronic device or the external electronic device.
[0295] According to one embodiment of the present disclosure, the electronic device controls the electronic device to perform an operation corresponding to the user's speech command through the determined AI service provider by executing the one or more instructions individually or collectively.
[0296] The electronic device according to one embodiment of the present disclosure may further include a communication unit.
[0297] According to one embodiment of the present disclosure, the electronic device can identify source information of content displayed on the screen of the external electronic device by the at least one processor executing the one or more instructions individually or in combination.
[0298] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device can transmit the user's speech command to the external electronic device through the communication unit, based on the fact that the content source of the external electronic device is the external electronic device.
[0299] According to one embodiment of the present disclosure, by having at least one processor execute the one or more instructions individually or in combination, the electronic device can perform an operation corresponding to the user's speech command based on the fact that the content source of the external electronic device is the electronic device.
[0300] Source information of the content of the external electronic device according to one embodiment of the present disclosure may be received from the external electronic device and stored in the memory.
[0301] According to one embodiment of the present disclosure, the electronic device can identify source information of content displayed on the screen of the electronic device by having the at least one processor execute the one or more instructions individually or in combination.
[0302] According to one embodiment of the present disclosure, by having at least one processor execute one or more instructions individually or in combination, the electronic device can perform an operation corresponding to the user's speech command based on the fact that the content source of the electronic device is the electronic device.
[0303] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device can transmit the user's speech command to the external electronic device through the communication unit, based on the fact that the content source of the electronic device is the external electronic device.
[0304] According to one embodiment of the present disclosure, by having at least one processor execute one or more instructions individually or in combination, the electronic device can analyze the user's intent regarding the user's speech command from one or more tokens corresponding to the user's speech command, based on the fact that the electronic device is determined to be the AI service provider.
[0305] According to one embodiment of the present disclosure, the electronic device can execute an operation corresponding to the user's intention by having the at least one processor execute the one or more instructions individually or in combination.
[0306] An AI service providing device corresponding to a user's speech command according to one embodiment of the present disclosure may be determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
[0307] According to one embodiment of the present disclosure, the device specification information may include at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
[0308] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
[0309] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate a microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on the screen of at least one of the electronic device or the external electronic device.
[0310] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of the external electronic device based on the fact that the content source of the external electronic device is the external electronic device.
[0311] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of the electronic device based on the fact that the content source of the external electronic device is the electronic device.
[0312] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate at least one of the microphone of the electronic device or the microphone of the external electronic device based on at least one of the locations of the electronic device and the user, the locations of the external electronic device and the user, or whether the electronic device or the external electronic device is in a low-power mode.
[0313] A method of operation of an electronic device for determining an Artificial Intelligence (AI) service providing device according to one embodiment of the present disclosure comprises: acquiring a user’s speech command; determining at least one of the electronic device or the external electronic device as an AI service providing device based on the user’s speech command and source information of content displayed on the screen of at least one of the electronic device or the external electronic device; and controlling the electronic device to perform an operation corresponding to the user’s speech command through the determined AI service providing device.
[0314] The method according to one embodiment of the present disclosure may further include the steps of: identifying source information of content displayed on the screen of the external electronic device; transmitting a speech command of the user to the external electronic device through a communication unit based on the fact that the content source of the external electronic device is the external electronic device; and performing an operation corresponding to the speech command of the user based on the fact that the content source of the external electronic device is the electronic device.
[0315] Source information of the content of the external electronic device according to one embodiment of the present disclosure may be received from the external electronic device and stored in the memory.
[0316] The method according to one embodiment of the present disclosure may include the steps of: identifying source information of content displayed on the screen of the electronic device; performing an operation corresponding to the user's speech command based on the fact that the content source of the electronic device is the electronic device; and transmitting the user's speech command to the external electronic device through the communication unit based on the fact that the content source of the electronic device is the external electronic device.
[0317] The step of controlling the electronic device to perform an operation corresponding to the user’s speech command according to one embodiment of the present disclosure may include, based on the fact that the electronic device is determined to be the AI service provider, analyzing the user’s intent regarding the user’s speech command from one or more tokens corresponding to the user’s speech command, and executing an operation corresponding to the user’s intent.
[0318] An AI service providing device corresponding to a user's speech command according to one embodiment of the present disclosure may be determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
[0319] According to one embodiment of the present disclosure, the device specification information may include at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
[0320] The method according to one embodiment of the present disclosure may further include the step of determining to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
[0321] The method according to one embodiment of the present disclosure may further include the step of determining to activate a microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on a screen of at least one of the electronic device or the external electronic device. This step may include determining to activate the microphone of the external electronic device based on the fact that the content source of the external electronic device is the external electronic device, and determining to activate the microphone of the electronic device based on the fact that the content source of the external electronic device is the electronic device.
[0322] A device-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory storage medium' simply means that it is a tangible device and does not contain a signal (e.g., electromagnetic waves), and the term does not distinguish between cases where data is stored semi-permanently and cases where it is stored temporarily. For example, a 'non-transitory storage medium' may include a buffer in which data is stored temporarily.
[0323] According to one embodiment, the method according to the various embodiments disclosed herein may be provided by being included in a computer program product. The computer program product may be traded between a seller and a buyer as a product. The computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read-only memory (CD-ROM)), or distributed online (e.g., download or upload) through an application store or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a portion of the computer program product (e.g., downloadable app) may be temporarily stored or temporarily created on a device-readable storage medium, such as the memory of a manufacturer's server, an application store's server, or a relay server.
[0324] Alternatively, for example, assume that content is displayed on the screen of an electronic device (1000). The electronic device (1000) can identify the source information of the content displayed on the screen of the electronic device (1000). If the source of the content displayed by the electronic device (1000) is the electronic device (1000) itself, the electronic device (1000) may decide to activate its own microphone. If the source of the content displayed by the electronic device (1000) is an external electronic device, the electronic device (1000) may decide to activate the microphone of the external electronic device.
[0325] The electronic device (1000) can generate a microphone activation signal and activate its own microphone. In addition, to deactivate the microphone of the external electronic device, the electronic device (1000) can transmit a microphone deactivation signal to the external electronic device through a communication unit. The external electronic device can receive the microphone deactivation signal from the electronic device (1000) through its communication unit and deactivate its microphone.
[0326] Alternatively, the electronic device (1000) may generate a microphone deactivation signal and deactivate its own microphone. To activate the microphone of the external electronic device, the electronic device (1000) may transmit a microphone activation signal to the external electronic device through the communication unit. The external electronic device may receive the microphone activation signal from the electronic device (1000) through its communication unit and activate its own microphone.
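The microphone decision described in the preceding paragraphs can be illustrated with a minimal sketch. It assumes only two devices, so any content source other than the deciding device is treated as the external electronic device. The names `MicSignal`, `MicPlan`, and `plan_microphones` are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass
from enum import Enum


class MicSignal(Enum):
    ACTIVATE = "activate"
    DEACTIVATE = "deactivate"


@dataclass(frozen=True)
class MicPlan:
    local_signal: MicSignal   # applied to the deciding device's own microphone
    remote_signal: MicSignal  # sent to the external device via the communication unit


def plan_microphones(content_source_id: str, local_id: str) -> MicPlan:
    """Activate the microphone of whichever device is the source of the displayed content.

    Assumption: exactly two devices take part, so a source that is not the
    local device is taken to be the external electronic device.
    """
    if content_source_id == local_id:
        # The deciding device is the content source: listen locally, silence the peer.
        return MicPlan(MicSignal.ACTIVATE, MicSignal.DEACTIVATE)
    # The external device is the content source: silence locally, wake the peer.
    return MicPlan(MicSignal.DEACTIVATE, MicSignal.ACTIVATE)
```

In this sketch the returned `remote_signal` stands for the activation or deactivation signal that would be transmitted through the communication unit.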
[0327] In the present disclosure, the entity determining the microphone activation device is exemplified as an electronic device (1000), but is not limited thereto. For example, the entity determining the microphone activation device may be an external electronic device, or a server (e.g., 2000 of FIG. 2) that manages information between the electronic device (1000) and the external electronic device.
[0328] In operation 1120, the electronic device (1000) can obtain a user's speech command through the microphone when its microphone is activated. This is the same as described in operation 710 of FIG. 7.
[0329] In operation 1130, the electronic device (1000) can determine an AI service providing device based on the user's speech command, information of the electronic device (1000), and information of an external electronic device. This is the same as described in the device determination module (1420) of FIG. 3.
[0330] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service providing device in response to a user's speech command based on source information of content displayed on the screen of at least one of the electronic device (1000) or an external electronic device. For example, when content is displayed on the screen of at least one of the electronic device (1000) or an external electronic device, the electronic device (1000) may determine the source device that generates the content as the AI service providing device. This is the same as described in the device determination module (1420) of FIG. 3 and operation 720 of FIG. 7.
[0331] An electronic device (1000) according to one embodiment of the present disclosure may determine at least one of a plurality of electronic devices as an AI service providing device by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information) in addition to source information of content. This is the same as described in the device determination module (1420) of FIG. 3 and operation 420 of FIG. 4.
[0332] An electronic device (1000) according to one embodiment of the present disclosure may determine an AI service provision method by using information of a plurality of electronic devices (e.g., device specification information, capability information, on-device AI related information, content source information). This is the same as described in the device determination module (1420) of FIG. 3 and operation 430 of FIG. 4.
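As one possible illustration of choosing between a device-based and a server-based AI service, the following sketch applies simple thresholds to a few of the factors named above (on-device AI support, AI-dedicated processor performance and occupancy, and network environment). The field names, the threshold values, and the fallback order are all assumptions made for illustration; the disclosure does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceAiInfo:
    supports_on_device_ai: bool
    npu_tops: float       # hypothetical performance figure for the AI-dedicated processor
    npu_occupancy: float  # fraction of the AI-dedicated processor in use, 0.0 to 1.0


def choose_provision_method(
    info: DeviceAiInfo,
    network_ok: bool,
    min_tops: float = 10.0,      # assumed threshold, not from the disclosure
    max_occupancy: float = 0.8,  # assumed threshold, not from the disclosure
) -> str:
    """Return "device", "server", or "unavailable" for the AI service provision method."""
    can_run_locally = (
        info.supports_on_device_ai
        and info.npu_tops >= min_tops
        and info.npu_occupancy <= max_occupancy
    )
    if can_run_locally:
        return "device"
    if network_ok:
        return "server"
    # No usable network: fall back to a degraded on-device path if one exists (assumed policy).
    return "device" if info.supports_on_device_ai else "unavailable"
```

A real implementation could weigh these factors differently or combine both methods, since the disclosure allows "at least one of" the device-based or server-based manner.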
[0333] In operation 1140, the electronic device (1000) can perform control so that an action corresponding to the user's speech command is performed through the determined AI service provider.
[0334] For example, if the electronic device (1000) determines itself to be the AI service provider, it can process the user's speech command and perform an action corresponding to the user's speech command. This is as described in the command processing module (1430) of FIG. 3 and in operation 440 and operation 455 of FIG. 4.
[0335] When the electronic device (1000) determines an external electronic device as an AI service provider, it can control the communication unit to transmit the user's speech command to the external electronic device.
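Operations 1130 and 1140 can be sketched as a small dispatch routine, assuming that the content source alone decides the AI service provider (the disclosure also allows device specification and capability information to be considered). The function name and the two callbacks are hypothetical stand-ins for the command processing module and the communication unit.

```python
from typing import Callable


def handle_speech_command(
    command: str,
    content_source_id: str,
    local_id: str,
    run_locally: Callable[[str], None],       # stand-in for local command processing
    send_to_external: Callable[[str], None],  # stand-in for the communication unit
) -> str:
    """Pick the content source as AI service provider and route the command to it.

    Returns the identifier of the device chosen as the AI service provider.
    """
    if content_source_id == local_id:
        # This device generated the content, so it processes the command itself.
        run_locally(command)
        return local_id
    # Another device generated the content, so the command is forwarded to it.
    send_to_external(command)
    return content_source_id
```

For example, calling the function with `content_source_id="monitor"` and `local_id="pc"` forwards the command and returns `"monitor"`.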
[0336] Referring to FIGS. 12 and 13, when the second electronic device (1002) displays content, the first electronic device (1001) determines a microphone activation device, and the first electronic device (1001) or the second electronic device (1002) determines an AI service providing device. Although the first electronic device (1001) is exemplified as a PC and the second electronic device (1002) is a monitor, it is not limited thereto.
[0337] FIG. 12 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device. In FIG. 12, the content source of the second electronic device (1002) is exemplified as the first electronic device (1001).
[0338] In operation 1210, the first electronic device (1001) can transmit content data to the second electronic device (1002). In operation 1220, the second electronic device (1002) can receive content data and display the content on a screen. In this case, the first electronic device (1001) and the second electronic device (1002) can be connected to each other via a wireless connection method such as mirroring or casting, or a wired connection method via a cable (e.g., HDMI). The first electronic device (1001) is a source device that transmits content to the second electronic device (1002), and the second electronic device (1002) may be a sink device that outputs the received content.
[0339] In operation 1230, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the first electronic device (1001). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
[0340] In operation 1240, the first electronic device (1001) can determine the first electronic device (1001) as a microphone activation device because the content source of the second electronic device (1002) is the first electronic device (1001) itself. In operation 1250, the first electronic device (1001) can transmit a microphone deactivation signal to the second electronic device (1002) through a communication unit. The microphone of the first electronic device (1001) can be activated, and the microphone of the second electronic device (1002) can be deactivated.
[0341] In operation 1260, the first electronic device (1001) can receive a speech command from a user. For example, the first electronic device (1001) can receive a speech command such as "Tell me who the person is on the screen right now." The first electronic device (1001) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1270, the first electronic device (1001) can determine the first electronic device (1001) as an AI service provider because the content source of the second electronic device (1002) is the first electronic device (1001) itself. In operation 1280, the first electronic device (1001) can provide an AI service by performing an operation of analyzing the user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the first electronic device (1001) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
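The disclosure does not state how the speech command is tokenized. As a minimal sketch, assuming the command has already been transcribed to text, a simple stop-word filter is enough to reproduce the example tokens. The stop-word list and the function name `tokenize_command` are assumptions made for illustration.

```python
import re

# Hypothetical stop-word list chosen only to cover the example utterance.
STOP_WORDS = {"tell", "me", "the", "is", "on", "right", "now"}


def tokenize_command(text: str) -> list[str]:
    """Lowercase the transcribed command, split it into words, and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


tokens = tokenize_command("Tell me who the person is on the screen right now.")
print(tokens)  # ['who', 'person', 'screen']
```

A production system would more likely rely on the tokenizer of its natural language processing model, so this filter only illustrates the kind of tokens involved.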
[0342] FIG. 13 is a diagram illustrating an example of an operation in which a first electronic device according to an embodiment of the present disclosure determines a microphone activation device and an AI service providing device based on content source information of a second electronic device. In FIG. 13, the content source of the second electronic device (1002) is exemplified as the second electronic device (1002) itself.
[0343] In operation 1310, the second electronic device (1002) can display content on a screen. The second electronic device (1002) may be a device equipped with an operating system (OS) and an internet connection function. For example, the second electronic device (1002) may run an OTT (Over-The-Top) application using an internally installed operating system to output content. In this case, the source of the content on the screen of the second electronic device (1002) may be the second electronic device (1002) itself.
[0344] In operation 1320, the second electronic device (1002) can transmit, to the first electronic device (1001), content source information indicating that the source of the content is the second electronic device (1002). The second electronic device (1002) can share the content source information with the first electronic device (1001) periodically or whenever the source of the displayed content changes. The first electronic device (1001) and the second electronic device (1002) can store each other's content source information.
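The sharing and storing of content source information can be sketched as a small in-memory registry. This is an illustrative assumption about the data structure only; the class name `ContentSourceRegistry` and the device identifiers are hypothetical, and the transport between devices is omitted.

```python
from typing import Dict, Optional


class ContentSourceRegistry:
    """Stores, per displaying device, which device is the source of its on-screen content."""

    def __init__(self) -> None:
        self._sources: Dict[str, str] = {}

    def update(self, display_device_id: str, source_device_id: str) -> bool:
        """Record the reported source; return True only if it differs from the stored value."""
        changed = self._sources.get(display_device_id) != source_device_id
        self._sources[display_device_id] = source_device_id
        return changed

    def source_of(self, display_device_id: str) -> Optional[str]:
        """Return the last known content source, or None if nothing was reported yet."""
        return self._sources.get(display_device_id)


registry = ContentSourceRegistry()
registry.update("monitor", "pc")       # FIG. 12 case: the PC is the content source
registry.update("monitor", "monitor")  # FIG. 13 case: the monitor runs its own app
```

Returning whether the value changed lets a caller re-run the microphone and AI service provider decisions only when the source actually changes, while periodic reports with the same value are ignored.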
[0345] In operation 1330, the first electronic device (1001) can determine the second electronic device (1002) as a microphone activation device because the content source of the second electronic device (1002) is the second electronic device (1002). In operation 1340, the first electronic device (1001) can transmit a microphone activation signal to the second electronic device (1002) through a communication unit. The microphone of the first electronic device (1001) can be deactivated, and the microphone of the second electronic device (1002) can be activated.
[0346] In operation 1350, the second electronic device (1002) can receive a speech command from a user. For example, the second electronic device (1002) can receive a speech command such as "Tell me who the person is on the screen right now." The second electronic device (1002) can tokenize the speech command to generate tokens such as "screen," "person," and "who." In operation 1360, the second electronic device (1002) can determine the second electronic device (1002) as an AI service provider because the content source of the second electronic device (1002) is the second electronic device (1002) itself. In operation 1370, the second electronic device (1002) can provide an AI service by performing an operation of analyzing the user's speech command through a natural language processing model and recognizing an object on the screen through an object recognition model. For example, the second electronic device (1002) can output a response such as "The person appearing on the screen is actor X." The response can be output in the form of voice data, text data, etc. through an output interface such as a speaker or a display.
[0347] FIG. 14 is a detailed block diagram of an electronic device according to one embodiment of the present disclosure.
[0348] Referring to FIG. 14, the electronic device (1000) may include a processor (1100) (e.g., including a processing circuit), memory (1400), a tuner unit (1403) (e.g., including a tuner), a communication unit (1200) (e.g., including a communication circuit), a sensing unit (1404) (e.g., including a circuit), an input / output unit (1405) (e.g., including an input / output circuit), a video processing unit (1450) (e.g., including various circuits and / or executable program instructions), a display (1460), an audio processing unit (1470) (e.g., including various circuits and / or executable program instructions), an audio output unit (1480) (e.g., including an audio output circuit), and an input interface (1300) (e.g., including an input circuit).
[0349] The tuner unit (1403) includes various circuits and can select only the frequency of the channel to be received by the electronic device (1000) from among many radio wave components by tuning through amplification, mixing, resonance, etc. of broadcast content received via wired or wireless connection. The content received through the tuner unit (1403) is decoded and separated into audio, video, and / or additional information. The separated audio, video, and / or additional information can be stored in memory (1400) under the control of the processor (1100).
[0350] The communication unit (1200) includes various communication circuits and can connect the electronic device (1000) to peripheral devices, external devices, servers, mobile terminals, etc. under the control of the processor (1100). The communication unit (1200) may include at least one communication module capable of performing wireless communication. The communication unit (1200) may include at least one of a wireless LAN module (1421), a Bluetooth module (1422), and a wired Ethernet (1423) in accordance with the performance and structure of the electronic device (1000).
[0351] The wireless LAN module (1421) can transmit and receive Wi-Fi signals with a peripheral device according to the Wi-Fi communication standard. The Bluetooth module (1422) can receive Bluetooth signals transmitted from a peripheral device according to the Bluetooth communication standard.
[0352] The sensing unit (1404) includes various circuits, detects the user's voice, the user's image, or user interaction, and may include a microphone, a camera unit, a light receiver, and a sensor.
[0353] The input / output unit (1405) includes various circuits and can receive video (e.g., dynamic image signal or still image signal), audio (e.g., voice signal or music signal), and additional information from external devices under the control of the processor (1100). The input / output unit (1405) may include at least one of an HDMI port (High-Definition Multimedia Interface port), a component jack, a PC port, or a USB port. In addition to these, the input / output unit (1405) may further include a DisplayPort (DP), Thunderbolt, and MHL (Mobile High-Definition Link). The input / output unit (1405) may further include ports for separate output of video and audio.
[0354] The video processing unit (1450) includes various circuits and / or executable program instructions, processes video data to be displayed by the display (1460), and can perform various image processing operations such as decoding, rendering, scaling, noise filtering, frame rate conversion, and resolution conversion on the video data. For example, the video processing unit (1450) may include various image processing circuits. For example, the video processing unit (1450) may include a media codec for processing video content.
[0355] The display (1460) can receive content from a broadcasting station, receive content from an external device such as an external server or external storage media, or output content provided by various apps, such as an OTT service provider or a content provider. The display (1460) can display video-processed content.
[0356] The audio processing unit (1470) includes various circuits and / or executable program instructions and performs processing on audio data. Various processing such as decoding, amplification, and noise filtering on audio data can be performed in the audio processing unit (1470).
[0357] The audio output unit (1480) includes various circuits and can output audio included in content received through the tuner unit (1403) under the control of the processor (1100), audio input through the communication unit (1200) or input / output unit (1405), and audio stored in memory (1400). The audio output unit (1480) may include at least one of a speaker, headphones, or S / PDIF (Sony / Philips Digital Interface: output terminal).
[0358] The input interface (1300) includes various input circuits and can receive user input for controlling the electronic device (1000). The input interface (1300) may include various forms of user input devices, including but not limited to a touch panel for detecting a user's touch, a button for receiving a user's push operation, a wheel for receiving a user's rotation operation, a keyboard, a dome switch, a microphone for voice recognition, and a motion detection sensor for sensing motion.
[0359] An electronic device according to one embodiment of the present disclosure includes at least one processor and a memory comprising one or more storage media for storing one or more instructions.
[0360] According to one embodiment of the present disclosure, the electronic device obtains a user's speech command by having at least one processor execute the one or more instructions individually or collectively.
[0361] According to one embodiment of the present disclosure, by having at least one processor execute the one or more instructions individually or collectively, the electronic device determines at least one of the electronic device or the external electronic device as an AI service providing device based on the user's speech command and source information of content displayed on the screen of at least one of the electronic device or the external electronic device.
[0362] According to one embodiment of the present disclosure, the electronic device controls the electronic device to perform an operation corresponding to the user's speech command through the determined AI service provider by executing the one or more instructions individually or collectively.
[0363] The electronic device according to one embodiment of the present disclosure may further include a communication unit.
[0364] According to one embodiment of the present disclosure, the electronic device can identify source information of content displayed on the screen of the external electronic device by the at least one processor executing the one or more instructions individually or in combination.
[0365] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device can transmit the user's speech command to the external electronic device through the communication unit, based on the fact that the content source of the external electronic device is the external electronic device.
[0366] According to one embodiment of the present disclosure, by having at least one processor execute the one or more instructions individually or in combination, the electronic device can perform an operation corresponding to the user's speech command based on the fact that the content source of the external electronic device is the electronic device.
[0367] Source information of the content of the external electronic device according to one embodiment of the present disclosure may be received from the external electronic device and stored in the memory.
[0368] According to one embodiment of the present disclosure, the electronic device can identify source information of content displayed on the screen of the electronic device by having the at least one processor execute the one or more instructions individually or in combination.
[0369] According to one embodiment of the present disclosure, by having at least one processor execute one or more instructions individually or in combination, the electronic device can perform an operation corresponding to the user's speech command based on the fact that the content source of the electronic device is the electronic device.
[0370] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device can transmit the user's speech command to the external electronic device through the communication unit, based on the fact that the content source of the electronic device is the external electronic device.
[0371] According to one embodiment of the present disclosure, by having at least one processor execute one or more instructions individually or in combination, the electronic device can analyze the user's intent regarding the user's speech command from one or more tokens corresponding to the user's speech command, based on the fact that the electronic device is determined to be the AI service provider.
[0372] According to one embodiment of the present disclosure, the electronic device can execute an operation corresponding to the user's intention by having the at least one processor execute the one or more instructions individually or in combination.
[0373] An AI service providing device corresponding to a user's speech command according to one embodiment of the present disclosure may be determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
[0374] According to one embodiment of the present disclosure, the device specification information may include at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
[0375] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
[0376] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate a microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on the screen of at least one of the electronic device or the external electronic device.
[0377] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of the external electronic device based on the fact that the content source of the external electronic device is the external electronic device.
[0378] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate the microphone of the electronic device based on the fact that the content source of the external electronic device is the electronic device.
[0379] According to one embodiment of the present disclosure, by executing the one or more instructions individually or in combination, the electronic device may determine to activate at least one of the microphone of the electronic device or the microphone of the external electronic device based on at least one of the positions of the electronic device and the user, the positions of the external electronic device and the user, or whether a low-power mode is in effect.
[0380] A method of operation of an electronic device for determining an Artificial Intelligence (AI) service providing device according to one embodiment of the present disclosure comprises: acquiring a user’s speech command; determining at least one of the electronic device or the external electronic device as an AI service providing device based on the user’s speech command and source information of content displayed on the screen of at least one of the electronic device or the external electronic device; and controlling the electronic device to perform an operation corresponding to the user’s speech command through the determined AI service providing device.
[0381] The method according to one embodiment of the present disclosure may further include the steps of: identifying source information of content displayed on the screen of the external electronic device; transmitting a speech command of the user to the external electronic device through a communication unit based on the fact that the content source of the external electronic device is the external electronic device; and performing an operation corresponding to the speech command of the user based on the fact that the content source of the external electronic device is the electronic device.
[0382] Source information of the content of the external electronic device according to one embodiment of the present disclosure may be received from the external electronic device and stored in the memory.
[0383] The method according to one embodiment of the present disclosure may include the steps of: identifying source information of content displayed on the screen of the electronic device; performing an operation corresponding to the user's speech command based on the fact that the content source of the electronic device is the electronic device; and transmitting the user's speech command to the external electronic device through the communication unit based on the fact that the content source of the electronic device is the external electronic device.
[0384] The step of controlling the electronic device to perform an operation corresponding to the user’s speech command according to one embodiment of the present disclosure may include, based on the fact that the electronic device is determined to be the AI service provider, analyzing the user’s intent regarding the user’s speech command from one or more tokens corresponding to the user’s speech command, and executing an operation corresponding to the user’s intent.
[0385] An AI service providing device corresponding to a user's speech command according to one embodiment of the present disclosure may be determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
[0386] According to one embodiment of the present disclosure, the device specification information may include at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
[0387] The method according to one embodiment of the present disclosure may further include the step of determining to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
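The choice between device-based and server-based service can be sketched for a single device as below. The thresholds, the units, the fallback order, and the names `AiRuntimeInfo` and `choose_service_mode` are all assumptions made for illustration; the disclosure names the inputs to the decision but does not fix a rule.

```python
from dataclasses import dataclass


@dataclass
class AiRuntimeInfo:
    """Inputs to the decision for one device (hypothetical fields and units)."""
    supports_on_device_ai: bool
    npu_performance: float       # performance of the AI-dedicated processor, arbitrary units
    npu_occupancy: float         # current occupancy of the AI-dedicated processor, 0.0 to 1.0
    model_performance: float     # performance of the on-device AI model, 0.0 to 1.0
    network_available: bool      # simplified network environment information


def choose_service_mode(
    info: AiRuntimeInfo,
    min_npu_performance: float = 10.0,   # assumed threshold
    max_npu_occupancy: float = 0.8,      # assumed threshold
    min_model_performance: float = 0.5,  # assumed threshold
) -> str:
    """Return "device", "server", or "unavailable" for the given device."""
    can_run_on_device = (
        info.supports_on_device_ai
        and info.npu_performance >= min_npu_performance
        and info.npu_occupancy <= max_npu_occupancy
        and info.model_performance >= min_model_performance
    )
    if can_run_on_device:
        return "device"
    if info.network_available:
        return "server"
    # No network: fall back to on-device processing if it is supported at all.
    return "device" if info.supports_on_device_ai else "unavailable"
```

Evaluating this function once for the electronic device and once for the external electronic device gives a per-device result that a higher-level policy could then compare.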
[0388] The method according to one embodiment of the present disclosure may further include the step of determining to activate a microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on a screen of at least one of the electronic device or the external electronic device. This step may include determining to activate the microphone of the external electronic device based on the content source of the external electronic device being the external electronic device, and determining to activate the microphone of the electronic device based on the content source of the external electronic device being the electronic device.
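A minimal sketch of the microphone selection follows, using the mapping recited in claim 9: the microphone of whichever device is the source of the content shown on the external electronic device's screen is activated. The names `Device` and `microphone_to_activate` are hypothetical.

```python
from enum import Enum


class Device(Enum):
    """The two devices involved (illustrative names)."""
    ELECTRONIC_DEVICE = "electronic_device"
    EXTERNAL_ELECTRONIC_DEVICE = "external_electronic_device"


def microphone_to_activate(external_screen_content_source: Device) -> Device:
    """Pick the device whose microphone is activated, given the content source
    of the external electronic device's screen."""
    if external_screen_content_source is Device.EXTERNAL_ELECTRONIC_DEVICE:
        # The external device shows its own content: it listens for the command.
        return Device.EXTERNAL_ELECTRONIC_DEVICE
    # The external device shows content supplied by the electronic device:
    # the electronic device listens for the command.
    return Device.ELECTRONIC_DEVICE
```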
[0389] A device-readable storage medium may be provided in the form of a non-transitory storage medium. Here, 'non-transitory storage medium' simply means that it is a tangible device and does not contain a signal (e.g., electromagnetic waves), and the term does not distinguish between cases where data is stored semi-permanently and cases where it is stored temporarily. For example, a 'non-transitory storage medium' may include a buffer in which data is stored temporarily.
[0390] According to one embodiment, the method according to the various embodiments disclosed herein may be provided by being included in a computer program product. The computer program product may be traded between a seller and a buyer as a product. The computer program product may be distributed in the form of a device-readable storage medium (e.g., compact disc read-only memory (CD-ROM)), or distributed online (e.g., download or upload) through an application store or directly between two user devices (e.g., smartphones). In the case of online distribution, at least a portion of the computer program product (e.g., downloadable app) may be temporarily stored or temporarily created on a device-readable storage medium, such as the memory of a manufacturer's server, an application store's server, or a relay server.
Claims
1. An electronic device for determining an AI (Artificial Intelligence) service providing device, the electronic device comprising: at least one processor including a processing circuit; and memory comprising one or more storage media that store one or more instructions, wherein the at least one processor executes the one or more instructions individually or collectively to cause the electronic device to: acquire a user's speech command; determine, based on the user's speech command and source information of content displayed on a screen of at least one of the electronic device or an external electronic device, at least one of the electronic device or the external electronic device to be the AI service providing device; and control the electronic device to perform an operation corresponding to the user's speech command through the determined AI service providing device.
2. The electronic device of claim 1, further comprising a communication unit including a communication circuit, wherein the at least one processor causes the electronic device to: identify source information of content displayed on the screen of the external electronic device; transmit the user's speech command to the external electronic device through the communication unit based on the content source of the external electronic device being the external electronic device; and perform an operation corresponding to the user's speech command based on the content source of the external electronic device being the electronic device.
3. The electronic device of claim 2, wherein the source information of the content of the external electronic device is received from the external electronic device and stored in the memory.
4. The electronic device of any one of claims 1 to 3, further comprising a communication unit including a communication circuit, wherein the at least one processor causes the electronic device to: identify source information of content displayed on the screen of the electronic device; perform an operation corresponding to the user's speech command based on the content source of the electronic device being the electronic device; and transmit the user's speech command to the external electronic device through the communication unit based on the content source of the electronic device being the external electronic device.
5. The electronic device of any one of claims 1 to 4, wherein the at least one processor causes the electronic device to: based on the electronic device being determined to be the AI service providing device, analyze the user's intent regarding the user's speech command from one or more tokens corresponding to the user's speech command; and execute an operation corresponding to the user's intent.
6. The electronic device of any one of claims 1 to 5, wherein the AI service providing device corresponding to the user's speech command is determined based on at least one of device specification information, capability information, or on-device AI related information of the electronic device, or device specification information, capability information, or on-device AI related information of the external electronic device.
7. The electronic device of claim 6, wherein the device specification information includes at least one of device type information, information about a processor, information about memory, performance information of an AI-dedicated processor, information on the type of AI service, information on whether a device-based AI service is supported, occupancy information of an AI-dedicated processor, or performance information of an on-device AI model.
8. The electronic device of any one of claims 1 to 7, wherein the at least one processor causes the electronic device to determine to provide an AI service in at least one of a device-based or server-based manner based on at least one of performance information of an AI-dedicated processor of each of the electronic device and the external electronic device, occupancy information of an AI-dedicated processor, information on whether a device-based AI service is supported, performance information of an on-device AI model, or network environment information.
9. The electronic device of any one of claims 1 to 8, wherein the at least one processor causes the electronic device to: determine to activate a microphone of at least one of the electronic device or the external electronic device based on source information of content displayed on the screen of at least one of the electronic device or the external electronic device; determine to activate the microphone of the external electronic device based on the content source of the external electronic device being the external electronic device; and determine to activate the microphone of the electronic device based on the content source of the external electronic device being the electronic device.
10. The electronic device of any one of claims 1 to 9, wherein the at least one processor causes the electronic device to determine to activate at least one of the microphone of the electronic device or the microphone of the external electronic device based on at least one of the locations of the electronic device and the user, the locations of the external electronic device and the user, or whether a low power mode is in use.
11. A method of operating an electronic device for determining an AI (Artificial Intelligence) service providing device, the method comprising the steps of: obtaining a user's speech command; determining at least one of the electronic device or an external electronic device to be the AI service providing device based on the user's speech command and source information of content displayed on a screen of at least one of the electronic device or the external electronic device; and controlling the electronic device to perform an operation corresponding to the user's speech command through the determined AI service providing device.
12. The method of claim 11, further comprising the steps of: identifying source information of content displayed on the screen of the external electronic device; transmitting the user's speech command to the external electronic device through a communication unit including a communication circuit based on the content source of the external electronic device being the external electronic device; and performing an operation corresponding to the user's speech command based on the content source of the external electronic device being the electronic device.
13. The method of claim 12, wherein the source information of the content of the external electronic device is received from the external electronic device and stored in memory.
14. The method of any one of claims 11 to 13, further comprising the steps of: identifying source information of content displayed on the screen of the electronic device; performing an operation corresponding to the user's speech command based on the content source of the electronic device being the electronic device; and transmitting the user's speech command to the external electronic device through a communication unit based on the content source of the electronic device being the external electronic device.
15. A computer-readable recording medium having recorded thereon a program for executing, on a computer, a method of operating an electronic device, the method comprising the steps of: obtaining a user's speech command; determining at least one of the electronic device or an external electronic device to be an AI service providing device based on the user's speech command and source information of content displayed on a screen of at least one of the electronic device or the external electronic device; and controlling the electronic device to perform an operation corresponding to the user's speech command through the determined AI service providing device.