Behavior control system

The behavior control system enhances robot interaction by recognizing user and robot emotions to adjust its actions, addressing the challenge of inappropriate responses in existing systems, thereby improving interaction quality.

JP7883466B2 · Active · Publication Date: 2026-07-01 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2023-04-12
Publication Date
2026-07-01

AI Technical Summary

Technical Problem

Existing robot behavior control systems struggle to execute appropriate actions in response to user reactions effectively.

Method used

The behavior control system includes an emotion determination unit that recognizes user and robot emotions, and an action determination unit that uses a dialogue function to generate robot behavior content and decides robot actions based on the user and robot emotions. Sentiment analysis and response rules are used to adjust the actions accordingly.

Benefits of technology

Enhances the robot's ability to respond appropriately to user emotions and actions, improving interaction quality and user satisfaction by adapting its behavior dynamically.

✦ Generated by Eureka AI based on patent content.


Abstract

PROBLEM TO BE SOLVED: To make a robot take an appropriate action in response to an action of a user.
SOLUTION: An action control system disclosed herein comprises an emotion determination unit configured to identify emotions of a user and emotions of a robot, and an action determination unit configured to use a dialog function for allowing the user and the robot to have a dialog in order to generate action details of the robot in response to an action of the user and the emotions of the user or the robot and determine an action of the robot corresponding to the action details. The action determination unit determines an action of the robot according to an answer from a chat engine, configured to conduct dialogs using characters and images, obtained by inputting a fixed phrase for asking the chat engine about the emotions of the user, the emotions of the robot, and an action that should be taken by the robot.
SELECTED DRAWING: Figure 2

Description

Technical Field

[0001] The present invention relates to a behavior control system.

Background Art

[0002] Patent Document 1 discloses a technique for determining appropriate behavior of a robot with respect to the state of a user. The prior art of Patent Document 1 recognizes the user's reaction when the robot executes a specific action, and if it cannot determine the robot's action with respect to the recognized user's reaction, it receives information on actions suitable for the recognized user's state from a server to update the robot's action.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] However, in the prior art, there is room for improvement in causing a robot to execute appropriate actions with respect to the actions of a user.

Means for Solving the Problems

[0005] According to a first aspect of the present invention, a behavior control system is provided. The behavior control system includes an emotion determination unit that determines the emotion of a user and the emotion of a robot, and an action determination unit that, based on a dialogue function that causes the user and the robot to converse, generates the action content of the robot with respect to the behavior of the user and the emotions of the user and the robot, and determines the action of the robot corresponding to the action content. The action determination unit determines the robot's action according to the response from a chat engine obtained by inputting a sentiment value showing the time-series changes in the user's emotions, the emotion of the robot, and user images, together with a fixed sentence asking what action the robot should take.

Brief Description of the Drawings

[0006]
[Figure 1] An example of system 5 according to this embodiment is schematically shown.
[Figure 2] The functional configuration of robot 100 is schematically shown.
[Figure 3] An example of the operation flow of robot 100 is schematically shown.
[Figure 4] An example of the hardware configuration of computer 1200 is schematically shown.

Modes for Carrying Out the Invention

[0007] The present invention will be described below through embodiments, but these embodiments are not intended to limit the scope of the claims. Furthermore, not all combinations of features described in the embodiments are necessarily essential to the solution of the invention.

[0008] Figure 1 schematically shows an example of system 5 according to this embodiment. System 5 comprises robot 100, robot 101, robot 102, and server 300. Users 10a, 10b, 10c, and 10d are users of robot 100. Users 11a, 11b, and 11c are users of robot 101. Users 12a and 12b are users of robot 102. In the description of this embodiment, users 10a, 10b, 10c, and 10d may be collectively referred to as user 10. Also, users 11a, 11b, and 11c may be collectively referred to as user 11. Also, users 12a and 12b may be collectively referred to as user 12. Robots 101 and 102 have substantially the same functions as robot 100. Therefore, system 5 will be described mainly focusing on the functions of robot 100.

[0009] Robot 100 engages in conversation with user 10 and provides user 10 with video. At this time, robot 100 cooperates with server 300 or the like, with which it can communicate via the communication network 20, to converse with user 10 and provide user 10 with video and other information. For example, robot 100 not only learns appropriate conversation on its own, but also learns to converse with user 10 more appropriately by coordinating with server 300. In addition, robot 100 has server 300 record video data of user 10 that it has captured, and requests video data from server 300 as needed to provide it to user 10.

[0010] Furthermore, robot 100 has emotion values that represent the emotions it is feeling. For example, robot 100 has emotion values that represent the intensity of each of the following emotions: "joy," "anger," "sadness," "pleasure," "pleasantness," "unpleasantness," "anxiety," "grief," "excitement," "worry," "relief," "fulfillment," "emptiness," and "neutral." For example, when robot 100 is conversing with user 10 with a high emotion value for excitement, it will speak at a fast pace. In this way, robot 100 can express its emotions through its actions.

[0011] Furthermore, the robot 100 may be configured to determine its actions in response to user 10's emotions by combining an AI (Artificial Intelligence) chat engine with an emotion engine. Specifically, the robot 100 may be configured to recognize user 10's actions, determine user 10's emotions in response to those actions, and determine its actions in response to the determined emotions.

[0012] More specifically, when the robot 100 recognizes the user 10's actions, it uses a pre-configured chat engine to automatically generate the actions the robot 100 should take in response to the user 10's actions. The chat engine can be interpreted as an algorithm and computation for automatic text-based dialogue processing. The chat engine is publicly known, as disclosed in, for example, Japanese Patent Publication No. 2018-081444 and ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>), so a detailed explanation is omitted. Such a chat engine is composed of a Large Language Model (LLM). In summary, this embodiment makes it possible to reflect the emotions of user 10 and robot 100, as well as various linguistic information, in the actions of robot 100 by combining a large language model and an emotion engine. In other words, according to this embodiment, a synergistic effect can be obtained by combining a chat engine and an emotion engine.
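As a minimal sketch of how the fixed sentence described in the first aspect could be combined with the emotion values and passed to a chat engine, the following Python example builds such a query and returns the reply as the robot's action. The function names, the prompt wording, and `stub_chat_engine` are hypothetical. The stub stands in for a real large language model call, which the embodiment does not specify.

```python
from typing import Callable, Dict


def build_fixed_prompt(user_emotion: int,
                       robot_emotion: Dict[str, int],
                       user_action: str) -> str:
    """Fill a fixed sentence with the current emotions and the user's action."""
    robot_part = ", ".join(f"{name}={value}" for name, value in robot_emotion.items())
    return (
        f"The user's emotion value is {user_emotion}. "
        f"The robot's emotion values are {robot_part}. "
        f"The user's action was '{user_action}'. "
        "What action should the robot take?"
    )


def decide_action(prompt: str, chat_engine: Callable[[str], str]) -> str:
    """Send the prompt to the chat engine and use its reply as the action."""
    return chat_engine(prompt).strip()


def stub_chat_engine(prompt: str) -> str:
    # Hypothetical stand-in for an LLM call; a real engine would generate text.
    return "speak to the user" if "being sad" in prompt else "answer"


prompt = build_fixed_prompt(
    -2, {"joy": 0, "anger": 0, "sadness": 1, "happiness": 0}, "being sad"
)
action = decide_action(prompt, stub_chat_engine)
```

Passing the chat engine in as a callable keeps the prompt construction testable without any network access.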

[0013] Furthermore, the robot 100 has the function of recognizing the user 10's actions. The robot 100 recognizes the user 10's actions by analyzing the user 10's facial image acquired by the camera function and the user 10's voice acquired by the microphone function. Based on the recognized user 10's actions, the robot 100 decides what action to perform.

[0014] Robot 100 stores rules that define the actions it will perform based on User 10's emotions, Robot 100's emotions, and User 10's actions, and performs various actions according to these rules.

[0015] Specifically, robot 100 has response rules for determining its actions based on user 10's emotions, robot 100's emotions, and user 10's actions. For example, if user 10's action is "laugh," the response rule stipulates that robot 100 should "laugh." Similarly, if user 10's action is "anger," the response rule stipulates that robot 100 should "apologize." Furthermore, if user 10's action is "ask a question," the response rule stipulates that robot 100 should "answer." And if user 10's action is "sadness," the response rule stipulates that robot 100 should "speak to" the user.

[0016] Based on the response rules, if robot 100 recognizes that user 10's behavior is "anger," it selects "apologize," as defined in the response rules, as the action to perform. For example, if robot 100 selects "apologize," it performs an apologizing motion and outputs speech expressing words of apology.
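The four example rules above can be sketched as a simple lookup table. This is only an illustration: the dictionary name, the function name, and the fallback action `"do nothing"` are assumptions, and the actual rules also depend on the emotions of user 10 and robot 100.

```python
# Example response rules taken from the text: user action -> robot action.
RESPONSE_RULES = {
    "laugh": "laugh",
    "anger": "apologize",
    "ask a question": "answer",
    "sadness": "speak to",
}


def select_robot_action(user_action: str, default: str = "do nothing") -> str:
    """Return the robot action for a recognized user action.

    The default for unknown actions is an assumption, not part of the source.
    """
    return RESPONSE_RULES.get(user_action, default)


selected = select_robot_action("anger")
```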

[0017] Furthermore, it is stipulated that if robot 100's emotions are "normal" (i.e., "joy"=0, "anger"=0, "sadness"=0, "happiness"=0) and user 10's state is "alone and looks lonely", then robot 100's emotions will change to "worried" and it will be able to perform the action of "speaking to" the user.

[0018] Based on the response rules, if robot 100 recognizes that its current emotion is "normal" and that user 10 is lonely, it increases its "sadness" emotion value. Furthermore, robot 100 selects the action "speak to" as defined in the response rules to perform on user 10. For example, if robot 100 selects the action "speak to," it will output the phrase "What's wrong?" in a worried tone of voice.

[0019] Also, as a result of this action, the robot 100 transmits to the server 300 user reaction information indicating that a positive reaction has been obtained from the user 10. The user reaction information includes, for example, the user action of "getting angry", the action of the robot 100 of "apologizing", the fact that the reaction of the user 10 is positive, and the attributes of the user 10.

[0020] The server 300 stores the user reaction information received from the robot 100. Note that the server 300 receives and stores user reaction information not only from the robot 100 but also from each of the robot 101 and the robot 102. Then, the server 300 analyzes the user reaction information from the robot 100, the robot 101, and the robot 102 and updates the reaction rule.

[0021] The robot 100 receives the updated reaction rule from the server 300 by inquiring the server 300 about the updated reaction rule. The robot 100 incorporates the updated reaction rule into the reaction rule stored in the robot 100. Thereby, the robot 100 can incorporate the reaction rules acquired by the robot 101, the robot 102, etc. into its own reaction rule.
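The text does not state how server 300 analyzes the collected user reaction information. As one possible sketch, the example below counts positive reactions per pair of user action and robot action, keeps the most frequent robot action for each user action, and merges the result into a robot's local rules. The record fields, function names, and the counting strategy are all assumptions.

```python
from collections import Counter, defaultdict


def update_rules_from_reactions(reactions):
    """For each user action, pick the robot action with the most positive reactions."""
    counts = defaultdict(Counter)
    for record in reactions:
        if record["positive"]:
            counts[record["user_action"]][record["robot_action"]] += 1
    return {
        user_action: counter.most_common(1)[0][0]
        for user_action, counter in counts.items()
    }


def merge_rules(local_rules, updated_rules):
    """Return a copy of the local rules with the server's updates applied."""
    merged = dict(local_rules)
    merged.update(updated_rules)
    return merged


# Hypothetical reaction records reported by robots 100, 101, and 102.
reactions = [
    {"robot": "100", "user_action": "getting angry", "robot_action": "apologizing", "positive": True},
    {"robot": "101", "user_action": "getting angry", "robot_action": "apologizing", "positive": True},
    {"robot": "102", "user_action": "getting angry", "robot_action": "laughing", "positive": False},
    {"robot": "102", "user_action": "laughing", "robot_action": "laughing", "positive": True},
]
updated = update_rules_from_reactions(reactions)
local = {"getting angry": "staying silent", "asking a question": "answering"}
merged = merge_rules(local, updated)
```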

[0022] Figure 2 schematically shows the functional configuration of the robot 100. The robot 100 includes a sensor unit 200, a sensor module unit 210, a storage unit 220, a user state recognition unit 230, an emotion determination unit 232, a behavior recognition unit 234, an action determination unit 236, a memory control unit 238, a behavior control unit 250, a controlled object 252, and a communication processing unit 280.

[0023] The controlled object 252 includes a display device, a speaker, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 100 are controlled by controlling the motors in the arms, hands, and feet. Some of the robot 100's emotions can be expressed by controlling these motors. The robot 100's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes. The posture, gestures, and facial expressions of the robot 100 are examples of the robot 100's attitude.

[0024] The sensor unit 200 includes a microphone 201, a 3D depth sensor 202, a 2D camera 203, and a distance sensor 204. The microphone 201 continuously detects sound and outputs sound data. The microphone 201 is mounted on the head of the robot 100 and may have a function for binaural recording. The 3D depth sensor 202 detects the outline of an object by continuously irradiating it with an infrared pattern and analyzing the infrared pattern from infrared images continuously captured by an infrared camera. The 2D camera 203 is an example of an image sensor. The 2D camera 203 captures images using visible light and generates visible light image information. The distance sensor 204 detects the distance to an object by irradiating it with, for example, a laser or ultrasound. The sensor unit 200 may also include other components such as a clock, a gyro sensor, a touch sensor, and a motor feedback sensor.

[0025] Note that, among the components of the robot 100 shown in Figure 2, the components other than the controlled object 252 and the sensor unit 200 are examples of components of the behavior control system of the robot 100. The behavior control system of the robot 100 controls the controlled object 252.

[0026] The storage unit 220 includes reaction rules 221 and history data 222. The history data 222 includes the user 10's past emotional values and behavioral history. This emotional value and behavioral history is recorded for each user 10, for example, by associating it with the user 10's identification information. At least a portion of the storage unit 220 is implemented by a storage medium such as memory. The storage unit 220 may also include a person database that stores the user 10's facial image, user 10's attribute information, etc. Of the components of the robot 100 shown in Figure 2, the functions of the components other than the controlled object 252, the sensor unit 200, and the storage unit 220 can be realized by a CPU operating based on a program. For example, the functions of these components can be implemented as CPU operations by the operating system (OS) and programs that run on the OS.

[0027] The sensor module unit 210 includes a voice emotion recognition unit 211, a speech understanding unit 212, a facial expression recognition unit 213, and a face recognition unit 214. The sensor module unit 210 receives information detected by the sensor unit 200. The sensor module unit 210 analyzes the information detected by the sensor unit 200 and outputs the analysis results to the user state recognition unit 230.

[0028] The voice emotion recognition unit 211 of the sensor module unit 210 analyzes the voice of user 10 detected by microphone 201 and recognizes the emotions of user 10. For example, the voice emotion recognition unit 211 extracts feature quantities such as frequency components of the voice and recognizes the emotions of user 10 based on the extracted feature quantities. The speech understanding unit 212 analyzes the voice of user 10 detected by microphone 201 and outputs character information representing the content of user 10's speech.

[0029] The facial expression recognition unit 213 recognizes the facial expressions and emotions of user 10 from images of user 10 captured by the 2D camera 203. For example, the facial expression recognition unit 213 recognizes the facial expressions and emotions of user 10 based on the shape and positional relationship of the eyes and mouth.

[0030] The face recognition unit 214 recognizes the face of user 10. The face recognition unit 214 recognizes user 10 by matching the face images stored in the person database (not shown) with the face images of user 10 captured by the 2D camera 203.

[0031] The user state recognition unit 230 recognizes the state of user 10 based on the information analyzed by the sensor module unit 210. For example, it uses the analysis results from the sensor module unit 210 to perform processing mainly related to perception. For example, it generates perceptual information such as "Dad is alone" or "There is a 90% chance that Dad is not smiling." It then processes the meaning of the generated perceptual information. For example, it generates semantic information such as "Dad is alone and looks lonely."

[0032] The emotion determination unit 232 determines an emotion value indicating the user 10's emotions based on the information analyzed by the sensor module unit 210 and the user state recognized by the user state recognition unit 230. For example, the information analyzed by the sensor module unit 210 and the recognized user state of the user 10 are input into a pre-trained neural network to obtain an emotion value indicating the user 10's emotions.

[0033] Here, the emotion value representing user 10's emotions is a value indicating whether the user's emotion is positive or negative. For example, if the user's emotion is a positive emotion accompanied by pleasure or comfort, such as "joy," "pleasure," "pleasantness," "security," "excitement," "relief," and "fulfillment," it will show a positive value, and the more positive the emotion, the larger the value. If the user's emotion is an unpleasant emotion, such as "anger," "sadness," "discomfort," "anxiety," "grief," "worry," and "emptiness," it will show a negative value, and the more unpleasant the emotion, the larger the absolute value of the negative value. If the user's emotion is none of the above ("neutral"), it will show a value of 0.
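The sign convention above can be illustrated with a small helper. In the embodiment the value comes from a pre-trained neural network; the lookup below is only a sketch of the sign convention, and the function name and the separate `intensity` argument are assumptions.

```python
POSITIVE_EMOTIONS = {
    "joy", "pleasure", "pleasantness", "security",
    "excitement", "relief", "fulfillment",
}
NEGATIVE_EMOTIONS = {
    "anger", "sadness", "discomfort", "anxiety",
    "grief", "worry", "emptiness",
}


def signed_emotion_value(emotion: str, intensity: int) -> int:
    """Map an emotion label and a non-negative intensity to a signed value."""
    if intensity < 0:
        raise ValueError("intensity must be non-negative")
    if emotion in POSITIVE_EMOTIONS:
        return intensity      # stronger positive emotion -> larger value
    if emotion in NEGATIVE_EMOTIONS:
        return -intensity     # stronger negative emotion -> larger absolute value
    return 0                  # "neutral" or any unlisted emotion


value = signed_emotion_value("grief", 2)
```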

[0034] Furthermore, the emotion determination unit 232 determines an emotion value indicating the robot 100's emotions based on the information analyzed by the sensor module unit 210 and the state of the user 10 recognized by the user state recognition unit 230. Robot 100's emotional value includes emotional values for each of several emotional categories, such as values (0-5) indicating the intensity of "joy," "anger," "sadness," and "happiness."

[0035] Specifically, the emotion determination unit 232 determines an emotion value indicating the robot 100's emotions according to a rule for updating the robot 100's emotion value, which is determined in association with the information analyzed by the sensor module unit 210 and the state of user 10 recognized by the user state recognition unit 230.

[0036] For example, if the user state recognition unit 230 recognizes that user 10 looks lonely, the emotion determination unit 232 increases the "sadness" emotion value of robot 100. Also, if the user state recognition unit 230 recognizes that user 10 has smiled, it increases the "joy" emotion value of robot 100.

[0037] Furthermore, the emotion determination unit 232 may take the state of the robot 100 into further consideration when determining the emotion value that indicates the robot 100's emotions. For example, the emotion value of "sadness" of the robot 100 may be increased when the robot 100's battery level is low or when the surrounding environment of the robot 100 is completely dark. In addition, if the user 10 continues to talk to the robot despite the low battery level, the emotion value of "anger" may be increased.
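The update rules in the two preceding paragraphs can be sketched as follows. The step size of 1, the clamping to the 0-5 range, and the function and parameter names are assumptions made for illustration. The text only states which emotion value is increased under which condition.

```python
MAX_VALUE = 5  # emotion values range from 0 to 5 in the text


def update_robot_emotion(emotion, user_state, battery_low=False,
                         dark=False, user_talking=False):
    """Return updated robot emotion values without modifying the input."""
    updated = dict(emotion)

    def bump(name):
        updated[name] = min(MAX_VALUE, updated[name] + 1)

    if user_state == "lonely":
        bump("sadness")
    if user_state == "smiling":
        bump("joy")
    if battery_low or dark:
        bump("sadness")
    if battery_low and user_talking:
        bump("anger")
    return updated


neutral = {"joy": 0, "anger": 0, "sadness": 0, "happiness": 0}
after_lonely = update_robot_emotion(neutral, "lonely")
after_low_battery = update_robot_emotion(
    neutral, "smiling", battery_low=True, user_talking=True
)
```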

[0038] The behavior recognition unit 234 recognizes the user 10's actions based on the information analyzed by the sensor module unit 210 and the user 10's state recognized by the user state recognition unit 230. For example, the information analyzed by the sensor module unit 210 and the recognized user 10's state are input into a pre-trained neural network to obtain the probability of each of several predetermined behavior classifications (e.g., "laughing," "getting angry," "asking a question," "being sad"), and the behavior classification with the highest probability is recognized as the user 10's action.
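Selecting the classification with the highest probability amounts to an argmax over the network's outputs. The sketch below assumes the probabilities are already available as a dictionary; the probability values and the function name are hypothetical.

```python
def recognize_behavior(probabilities):
    """Return the behavior classification with the highest probability."""
    return max(probabilities, key=probabilities.get)


# Hypothetical output of the pre-trained network for one observation.
probabilities = {
    "laughing": 0.10,
    "getting angry": 0.05,
    "asking a question": 0.15,
    "being sad": 0.70,
}
recognized = recognize_behavior(probabilities)
```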

[0039] As described above, in this embodiment, the robot 100 identifies the user 10 and acquires the content of the user 10's speech. However, when acquiring and using the content of the speech, the robot 100 obtains the necessary consent from the user 10 in accordance with the law, and the behavior control system of the robot 100 according to this embodiment takes into consideration the protection of the user 10's personal information and privacy.

[0040] The action decision unit 236 determines an action corresponding to the user 10's actions recognized by the action recognition unit 234, based on the user 10's current emotion value determined by the emotion decision unit 232, the history data 222 of past emotion values determined by the emotion decision unit 232 before the user 10's current emotion value was determined, and the robot 100's emotion value. In this embodiment, the action decision unit 236 uses the most recent emotion value included in the history data 222 as the user 10's past emotion value, but the disclosed technology is not limited to this embodiment. For example, the action decision unit 236 may use multiple most recent emotion values as the user 10's past emotion value, or it may use an emotion value from a unit period earlier, such as one day ago. Furthermore, the action decision unit 236 may determine an action corresponding to the user 10's actions by further considering not only the robot 100's current emotion value but also the history of the robot 100's past emotion values. The actions determined by the action decision unit 236 include gestures performed by the robot 100 or the content of speech uttered by the robot 100.

[0041] In this embodiment, the action decision unit 236 determines the robot 100's action based on a combination of the user 10's past and current emotional values, the robot 100's emotional value, the user 10's action, and the reaction rule 221, as an action corresponding to the user 10's action. For example, if the user 10's past emotional value is positive and the current emotional value is negative, the action decision unit 236 determines an action to change the user 10's emotional value to positive, as an action corresponding to the user 10's action.

[0042] Response rule 221 defines the actions of robot 100 in response to combinations of user 10's past and current emotional values, robot 100's emotional value, and user 10's actions. For example, if user 10's past emotional value is positive and their current emotional value is negative, and user 10's action is sad, then the rule defines a combination of gestures and speech content for robot 100 to ask encouraging questions to user 10, accompanied by gestures.

[0043] For example, the response rule 221 defines the robot 100's actions for all combinations of the robot 100's emotional value patterns (1296 patterns, that is, 6 to the power of 4: six values from "0" to "5" for each of the four emotions "joy," "anger," "sadness," and "happiness"), the combinations of the user 10's past and current emotional values, and the user 10's behavior patterns. In other words, for each emotional value pattern of the robot 100, the robot 100's actions are defined according to the user 10's behavior patterns for each of the multiple combinations of the user 10's past and current emotional values, such as negative and negative, negative and positive, positive and negative, positive and positive, negative and neutral, and neutral and neutral. The action determination unit 236 may also transition to an operation mode in which it determines the robot 100's actions using history data 222 when the user 10 makes an utterance intended to continue a conversation from a past topic, such as "I want to talk about that topic we talked about before." Furthermore, response rule 221 may specify at least one of a gesture and a statement as an action for robot 100 for each of the 1296 patterns of robot 100's emotional value. Alternatively, response rule 221 may specify at least one of a gesture and a statement as an action for robot 100 for each group of patterns of robot 100's emotional value.
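The count of 1296 patterns follows from four emotions with six levels each. The sketch below enumerates them and builds a lookup key from the robot's pattern, the polarity of the user's past and current emotion values, and the user's action. The key layout and the helper names are assumptions about how such a rule table might be indexed.

```python
from itertools import product

EMOTIONS = ("joy", "anger", "sadness", "happiness")
LEVELS = range(6)  # values 0 through 5

# Every combination of levels across the four emotions: 6 ** 4 patterns.
robot_patterns = list(product(LEVELS, repeat=len(EMOTIONS)))


def polarity(value):
    """Classify a user emotion value as positive, negative, or neutral."""
    if value > 0:
        return "positive"
    if value < 0:
        return "negative"
    return "neutral"


def rule_key(robot_emotion, past_value, current_value, user_action):
    """Build a hypothetical lookup key for a reaction rule table."""
    pattern = tuple(robot_emotion[name] for name in EMOTIONS)
    return (pattern, polarity(past_value), polarity(current_value), user_action)


key = rule_key({"joy": 0, "anger": 0, "sadness": 1, "happiness": 0}, 3, -2, "being sad")
```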

[0044] Each gesture included in the actions of robot 100 as defined in response rule 221 has a predetermined intensity. Each utterance included in the actions of robot 100 as defined in response rule 221 has a predetermined intensity.

[0045] The memory control unit 238 decides whether or not to store data including the user 10's actions in the history data 222, based on the predetermined intensity of the action determined by the action decision unit 236 and the emotion value of the robot 100 determined by the emotion decision unit 232. Specifically, if the sum of the emotion values for each of the multiple emotion classifications of the robot 100, the predetermined intensity for the gestures included in the actions determined by the action decision unit 236, and the predetermined intensity for the speech content included in the actions determined by the action decision unit 236 is greater than or equal to a threshold, it is decided to store the data including the user 10's actions in the history data 222.
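The storage decision described above is a sum compared against a threshold. The following sketch shows the arithmetic; the function names and the example threshold of 6 are assumptions, since the text does not give a threshold value.

```python
def total_intensity(robot_emotion, gesture_intensity, speech_intensity):
    """Sum the robot's emotion values and the intensities of the chosen action."""
    return sum(robot_emotion.values()) + gesture_intensity + speech_intensity


def should_store(robot_emotion, gesture_intensity, speech_intensity, threshold):
    """Store the episode only when the total reaches the threshold."""
    return total_intensity(robot_emotion, gesture_intensity, speech_intensity) >= threshold


robot_emotion = {"joy": 1, "anger": 0, "sadness": 3, "happiness": 0}
total = total_intensity(robot_emotion, gesture_intensity=1, speech_intensity=2)
store = should_store(robot_emotion, 1, 2, threshold=6)
```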

[0046] When the memory control unit 238 decides to store data including the user 10's actions in the history data 222, it stores the actions determined by the action determination unit 236, information analyzed by the sensor module unit 210 from the present time up to a certain period of time prior (for example, all surrounding information such as sound, images, smells, etc.), and the user 10's state recognized by the user state recognition unit 230 (for example, the user 10's facial expressions, emotions, etc.) in the history data 222.

[0047] The behavior control unit 250 controls the controlled object 252 based on the action determined by the action decision unit 236. For example, if the action decision unit 236 determines an action that includes speaking, the behavior control unit 250 causes the speaker included in the controlled object 252 to output sound. At this time, the behavior control unit 250 may determine the speech output speed based on the emotion value of the robot 100. For example, the behavior control unit 250 determines a faster speech output speed the greater the emotion value of the robot 100. In this way, the behavior control unit 250 determines the execution form of the action determined by the action decision unit 236 based on the emotion value determined by the emotion decision unit 232.
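The text only says that a greater emotion value leads to faster speech. One simple way to realize that monotonic relation is a clamped linear mapping, sketched below. The base rate, the gain of 0.1 per level, and the function name are assumptions.

```python
def speech_rate(emotion_value, base_rate=1.0, gain=0.1, max_value=5):
    """Return a speech rate multiplier that grows with the emotion value."""
    clamped = max(0, min(max_value, emotion_value))
    return base_rate * (1.0 + gain * clamped)


calm_rate = speech_rate(0)
excited_rate = speech_rate(5)
```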

[0048] The behavior control unit 250 may recognize changes in user 10's emotions in response to the execution of the action determined by the action decision unit 236. For example, changes in emotions may be recognized based on user 10's voice or facial expressions. Alternatively, changes in user 10's emotions may be recognized based on the detection of an impact by the touch sensor included in the sensor unit 200. If an impact is detected by the touch sensor included in the sensor unit 200, it may be recognized that user 10's emotions have worsened. Conversely, if the detection result of the touch sensor included in the sensor unit 200 indicates that user 10 is laughing or happy, it may be recognized that user 10's emotions have improved. Information indicating user 10's reaction is output to the communication processing unit 280.

[0049] Furthermore, after the behavior control unit 250 executes the action determined by the action decision unit 236 in an execution mode determined according to the robot 100's emotions, the emotion decision unit 232 further changes the robot 100's emotion value based on the user's reaction to the execution of the action. Specifically, the emotion decision unit 232 increases the robot 100's "joy" emotion value if the user's reaction to the action, which was determined by the action decision unit 236 and performed in the execution mode determined by the behavior control unit 250, was not unfavorable. Likewise, the emotion decision unit 232 increases the robot 100's "sadness" emotion value if the user's reaction to that action was unfavorable.
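This feedback step can be sketched as a single update on the robot's emotion values. The increment of 1 and the upper bound of 5 are assumptions based on the 0-5 range given for the robot's emotion values; the function name is hypothetical.

```python
def apply_user_reaction(robot_emotion, reaction_unfavorable, max_value=5):
    """Raise "sadness" after an unfavorable reaction, otherwise raise "joy"."""
    updated = dict(robot_emotion)
    name = "sadness" if reaction_unfavorable else "joy"
    updated[name] = min(max_value, updated[name] + 1)
    return updated


before = {"joy": 2, "anger": 0, "sadness": 1, "happiness": 0}
after_good = apply_user_reaction(before, reaction_unfavorable=False)
after_bad = apply_user_reaction(before, reaction_unfavorable=True)
```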

[0050] Furthermore, the behavior control unit 250 expresses the emotions of the robot 100 based on the determined emotion values of the robot 100. For example, when the "joy" emotion value of the robot 100 is increased, the behavior control unit 250 controls the controlled object 252 to make the robot 100 perform joyful gestures. Also, when the "sadness" emotion value of the robot 100 is increased, it controls the controlled object 252 so that the robot 100 assumes a dejected posture.

[0051] The communication processing unit 280 is responsible for communication with the server 300. As described above, the communication processing unit 280 transmits user response information to the server 300. The communication processing unit 280 also receives updated response rules from the server 300. When the communication processing unit 280 receives updated response rules from the server 300, it updates the response rule 221.

[0052] Server 300 communicates with robots 100, 101, and 102, receives user response information transmitted from robot 100, and updates the response rules based on the response rules that include actions for which a positive response was obtained.

[0053] Figure 3 schematically shows an example of an operation flow for determining the behavior of the robot 100. The operation flow shown in Figure 3 is executed repeatedly. It is assumed that information analyzed by the sensor module unit 210 is being input. In the operation flow, "S" denotes a step that is executed.

[0054] First, in step S100, the user state recognition unit 230 recognizes the state of user 10 based on the information analyzed by the sensor module unit 210.

[0055] In step S102, the emotion determination unit 232 determines an emotion value indicating the emotion of user 10 based on the information analyzed by the sensor module unit 210 and the state of user 10 recognized by the user state recognition unit 230.

[0056] In step S103, the emotion determination unit 232 determines an emotion value indicating the robot 100's emotions based on the information analyzed by the sensor module unit 210 and the state of user 10 recognized by the user state recognition unit 230. The emotion determination unit 232 adds the determined emotion value of user 10 to the history data 222.

[0057] In step S104, the behavior recognition unit 234 recognizes the behavior classification of user 10 based on the information analyzed by the sensor module unit 210 and the state of user 10 recognized by the user state recognition unit 230.

[0058] In step S106, the action decision unit 236 determines the action of the robot 100 based on the combination of the user 10's current emotion value determined in step S102, the past emotion values included in the history data 222, the robot 100's emotion value, the user 10's actions recognized by the action recognition unit 234, and the reaction rule 221.

[0059] In step S108, the action control unit 250 controls the controlled object 252 based on the action determined by the action decision unit 236.

[0060] In step S110, the memory control unit 238 calculates a total intensity value based on the predetermined intensity of the action determined by the action decision unit 236 and the emotion value of the robot 100 determined by the emotion decision unit 232.

[0061] In step S112, the memory control unit 238 determines whether the total intensity value is equal to or greater than a threshold. If the total intensity value is less than the threshold, the process is terminated without storing data including the user 10's actions in the history data 222. On the other hand, if the total intensity value is equal to or greater than the threshold, the process proceeds to step S114.

[0062] In step S114, the action determined by the action decision unit 236, the information analyzed by the sensor module unit 210 from the present time up to a certain period of time prior, and the state of user 10 recognized by the user state recognition unit 230 are stored in the history data 222.
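Steps S110 to S114 can be sketched as follows. The source states only that the total intensity value is calculated "based on" the action intensity and the robot's emotion value; the plain sum used here, the threshold value, and all names (`HistoryData`, `maybe_store`, `STORE_THRESHOLD`) are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

STORE_THRESHOLD = 5.0  # assumed value; the source only says "a threshold"


@dataclass
class HistoryData:
    """Stand-in for the history data 222."""
    records: List[Dict[str, Any]] = field(default_factory=list)


def total_intensity(action_intensity: float, robot_emotion: float) -> float:
    """Assumed combination for S110: a plain sum of the two inputs."""
    return action_intensity + robot_emotion


def maybe_store(history: HistoryData, action: str, action_intensity: float,
                robot_emotion: float, sensor_info: Dict[str, Any],
                user_state: str) -> bool:
    """S112/S114: store the record only when the total reaches the threshold."""
    total = total_intensity(action_intensity, robot_emotion)
    if total < STORE_THRESHOLD:
        return False  # S112: below the threshold, end without storing
    history.records.append({
        "action": action,
        "sensor_info": sensor_info,  # information from the recent period
        "user_state": user_state,
    })
    return True
```

Under these assumptions, an action of intensity 1.0 with a robot emotion value of 2.0 is discarded, while intensity 3.0 with the same emotion value reaches the threshold and is stored.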

[0063] As explained above, the robot 100 determines an emotion value indicating its own emotions based on the user's state, and then decides whether or not to store data including user 10's actions in the history data 222 based on the robot 100's emotion value. This reduces the storage capacity required for the history data 222, which stores data including user 10's actions. In addition, for example, if the robot 100 determines that the user's state is the same as it was 10 years ago, it can read the history data 222 from 10 years ago and present the user 10 with all kinds of surrounding information, such as user 10's state at that time (e.g., user 10's facial expressions, emotions, etc.), as well as data such as sounds, images, and smells from that time.

[0064] Furthermore, robot 100 can perform actions appropriate to user 10's actions. Traditionally, user actions were classified, and actions including the robot's facial expressions and appearance were determined based on that. In contrast, robot 100 determines user 10's current emotional state and performs actions for user 10 based on past and current emotional states. Therefore, for example, if user 10 was cheerful yesterday but depressed today, robot 100 can say something like, "You were cheerful yesterday, what's wrong today?" Robot 100 can also perform speech with gestures. For example, if user 10 was depressed yesterday but cheerful today, robot 100 can say something like, "You were depressed yesterday, but you seem cheerful today?" For example, if user 10 was cheerful yesterday and is more cheerful today than yesterday, robot 100 can say something like, "You seem more cheerful today than yesterday. Did something better happen today?" Furthermore, for example, robot 100 can make statements such as, "You seem to be in a stable mood lately, which is good," to user 10 whose emotional value is 0 or higher and whose emotional value fluctuations remain within a certain range.
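The comparisons in this paragraph can be sketched as a small selection function. It assumes, purely for illustration, that a positive emotion value means "cheerful" and a negative one means "depressed", and that the "certain range" of fluctuation is a fixed width `STABLE_RANGE`; the function names are hypothetical.

```python
from typing import Optional, Sequence

STABLE_RANGE = 1.0  # assumed width of the "certain range" of fluctuation


def choose_remark(yesterday: float, today: float) -> Optional[str]:
    """Pick a remark from yesterday's and today's emotion values.

    Assumes positive values mean cheerful and negative values mean depressed.
    """
    if yesterday > 0 and today < 0:
        return "You were cheerful yesterday, what's wrong today?"
    if yesterday < 0 and today > 0:
        return "You were depressed yesterday, but you seem cheerful today?"
    if yesterday > 0 and today > yesterday:
        return ("You seem more cheerful today than yesterday. "
                "Did something better happen today?")
    return None  # no remark defined for the remaining combinations


def is_stable(values: Sequence[float]) -> bool:
    """True when every value is 0 or higher and the spread stays in range."""
    return (bool(values) and min(values) >= 0
            and max(values) - min(values) <= STABLE_RANGE)
```

When `is_stable` holds for the recent emotion values, the robot could make the "stable mood" remark quoted above.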

[0065] Furthermore, for example, if robot 100 asks user 10, "Did you finish the homework we talked about yesterday?", and user 10 replies, "Yes, I did it," robot 100 can make positive statements such as "Great job!" and perform positive gestures such as applause or a thumbs-up. Also, for example, if user 10 says, "The presentation I gave the day before yesterday went well," robot 100 can make positive statements such as "You did a great job!" and perform the aforementioned positive gestures. In this way, by having robot 100 act based on user 10's history, it is expected that user 10 will feel a sense of familiarity with robot 100.

[0066] In the above embodiment, the case in which the robot 100 recognizes user 10 using a facial image of user 10 was described, but the disclosed technology is not limited to this embodiment. For example, the robot 100 may recognize user 10 using voice spoken by user 10, user 10's email address, user 10's SNS ID, or an ID card with a built-in wireless IC tag possessed by user 10.

[0067] Robot 100 is an example of an electronic device equipped with a behavior control system. The behavior control system is not limited to robot 100, and can be applied to various electronic devices. Furthermore, the functions of server 300 may be implemented by one or more computers. At least some of the functions of server 300 may be implemented by a virtual machine. Also, at least some of the functions of server 300 may be implemented in the cloud.

[0068] Figure 4 schematically shows an example of the hardware configuration of a computer 1200 that functions as a robot 100 and a server 300. A program installed on the computer 1200 can cause the computer 1200 to function as one or more "parts" of the apparatus according to this embodiment, or to cause the computer 1200 to execute operations associated with the apparatus according to this embodiment or such one or more "parts", and / or to cause the computer 1200 to execute a process or a stage of such process according to this embodiment. Such a program may be executed by the CPU 1212 to cause the computer 1200 to execute specific operations associated with some or all of the blocks in the flowcharts and block diagrams described herein.

[0069] The computer 1200 according to this embodiment includes a CPU 1212, RAM 1214, and a graphics controller 1216, which are interconnected by a host controller 1210. The computer 1200 also includes input / output units such as a communication interface 1222, a storage device 1224, a DVD drive 1226, and an IC card drive, which are connected to the host controller 1210 via an input / output controller 1220. The DVD drive 1226 may be a DVD-ROM drive, a DVD-RAM drive, or the like. The storage device 1224 may be a hard disk drive, a solid-state drive, or the like. The computer 1200 also includes legacy input / output units such as a ROM 1230 and a keyboard, which are connected to the input / output controller 1220 via an input / output chip 1240.

[0070] The CPU 1212 operates according to the programs stored in the ROM 1230 and RAM 1214, thereby controlling each unit. The graphics controller 1216 acquires the image data generated by the CPU 1212 and stores it in the frame buffer provided in RAM 1214 or within itself, so that the image data is displayed on the display device 1218.

[0071] The communication interface 1222 communicates with other electronic devices via a network. The storage device 1224 stores programs and data used by the CPU 1212 in the computer 1200. The DVD drive 1226 reads programs or data from a DVD-ROM 1227, etc., and provides them to the storage device 1224. The IC card drive reads programs and data from an IC card and / or writes programs and data to an IC card.

[0072] The ROM 1230 stores boot programs and / or hardware-dependent programs of the computer 1200, which are executed by the computer 1200 upon activation. The input / output chip 1240 may also connect various input / output units to the input / output controller 1220 via USB ports, parallel ports, serial ports, keyboard ports, mouse ports, etc.

[0073] The program is provided on a computer-readable storage medium such as a DVD-ROM 1227 or an IC card. The program is read from the computer-readable storage medium and installed on a storage device 1224, RAM 1214, or ROM 1230, which are examples of computer-readable storage media, and executed by the CPU 1212. The information processing described within these programs is read by the computer 1200, resulting in coordination between the program and the various types of hardware resources described above. The apparatus or method may be configured to realize the operation or processing of information in accordance with the use of the computer 1200.

[0074] For example, when communication is performed between a computer 1200 and an external device, the CPU 1212 may execute a communication program loaded into RAM 1214 and, based on the processing described in the communication program, instruct the communication interface 1222 to perform communication processing. Under the control of the CPU 1212, the communication interface 1222 reads transmission data stored in a transmission buffer area provided in a recording medium such as RAM 1214, storage device 1224, DVD-ROM 1227, or IC card, transmits the read transmission data to the network, or writes received data received from the network to a reception buffer area or the like provided on the recording medium.

[0075] Furthermore, the CPU 1212 may read all or necessary parts of files or databases stored on external recording media such as the storage device 1224, DVD drive 1226 (DVD-ROM 1227), or IC card into the RAM 1214, and perform various types of processing on the data in the RAM 1214. The CPU 1212 may then write the processed data back to the external recording media.

[0076] Various types of information, such as various types of programs, data, tables, and databases, may be stored on the recording medium and subjected to information processing. The CPU 1212 may perform various types of processing on the data read from RAM 1214, including various types of operations, information processing, conditional judgments, conditional branching, unconditional branching, information retrieval / replacement, etc., as described throughout this disclosure and specified by the program instruction sequence, and write the results back to RAM 1214. The CPU 1212 may also retrieve information in files, databases, etc., within the recording medium. For example, if multiple entries are stored in the recording medium, each having an attribute value of a first attribute associated with an attribute value of a second attribute, the CPU 1212 may search among the multiple entries for an entry that matches the specified condition for the attribute value of the first attribute, read the attribute value of the second attribute stored in that entry, and thereby obtain the attribute value of the second attribute associated with the first attribute that satisfies the predetermined condition.
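The entry search described at the end of this paragraph can be sketched as follows. The dictionary keys `"first"` and `"second"`, the function name, and the choice to return the first matching entry are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Optional

# Each entry associates an attribute value of a first attribute with an
# attribute value of a second attribute (key names are hypothetical).
Entry = Dict[str, object]


def lookup_second_attribute(entries: List[Entry],
                            condition: Callable[[object], bool]
                            ) -> Optional[object]:
    """Return the second attribute of the first entry whose first
    attribute satisfies the given condition, or None if none matches."""
    for entry in entries:
        if condition(entry["first"]):
            return entry["second"]
    return None
```

For example, with entries `[{"first": 1, "second": "a"}, {"first": 7, "second": "b"}]` and the condition `lambda v: v > 5`, the function returns `"b"`.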

[0077] The program or software module described above may be stored on or near the computer 1200 in a computer-readable storage medium. Alternatively, a recording medium such as a hard disk or RAM provided within a server system connected to a dedicated communication network or the Internet can be used as a computer-readable storage medium, thereby providing the program to the computer 1200 via the network.

[0078] In this embodiment, blocks in the flowchart and block diagram may represent a stage in a process in which an operation is performed or a "part" of a device that has the role of performing an operation. A particular stage and "part" may be implemented by a dedicated circuit, a programmable circuit supplied with computer-readable instructions stored on a computer-readable storage medium, and / or a processor supplied with computer-readable instructions stored on a computer-readable storage medium. The dedicated circuit may include digital and / or analog hardware circuits, and may include integrated circuits (ICs) and / or discrete circuits. The programmable circuit may include reconfigurable hardware circuits, such as field-programmable gate arrays (FPGAs) and programmable logic arrays (PLAs), which include logical AND, logical OR, exclusive OR, negated AND, negated OR, and other logical operations, flip-flops, registers, and memory elements.

[0079] A computer-readable storage medium may include any tangible device capable of storing instructions to be executed by a suitable device, and as a result, a computer-readable storage medium having instructions stored therein will comprise a product that includes instructions that can be executed to create means for performing operations specified in a flowchart or block diagram. Examples of computer-readable storage media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, semiconductor storage media, etc. More specific examples of computer-readable storage media may include floppy disks, diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), electrically erasable programmable read-only memory (EEPROM), static random access memory (SRAM), compact disc read-only memory (CD-ROM), digital versatile disc (DVD), Blu-ray® disc, memory stick, integrated circuit card, etc.

[0080] Computer-readable instructions may include assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk, Java®, and C++, and traditional procedural programming languages such as the C programming language or similar languages.

[0081] Computer-readable instructions may be provided to a processor or programmable circuit of a general-purpose computer, a special-purpose computer, or other programmable data processing device, either locally or via a local area network (LAN) or a wide area network (WAN) such as the Internet, so that the processor or programmable circuit executes the computer-readable instructions in order to generate means for performing operations specified in a flowchart or block diagram. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, microcontrollers, and the like.

[0082] Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in the above embodiments. It will be apparent to those skilled in the art that various modifications or improvements can be made to the above embodiments. It will be clear from the claims that such modified or improved forms may also be included in the technical scope of the present invention.

[0083] It should be noted that the operations, procedures, steps, and stages in the apparatus, systems, programs, and methods shown in the claims, specification, and drawings can be executed in any order, unless the order is explicitly indicated by terms such as "before" or "prior to," or unless the output of a previous process is used in a later process. Even if the operation flow in the claims, specification, and drawings is described using phrases such as "first" and "next" for convenience, this does not mean that it is essential to perform the operations in that order.

[0084] (Other embodiments) The action decision unit 236 outputs the emotions of user 10, determined from user 10's actions, and the emotions of robot 100, determined by the emotion determination unit 232, to a text file. In this case, the action decision unit 236 adds a fixed sentence to the text file containing the emotions of user 10 and robot 100. The fixed sentence is a predefined phrase asking what action robot 100 should take, such as "What action should the robot take in this situation?".

[0085] The action decision unit 236 inputs the text file with the fixed sentence added, together with an image of user 10 taken by the 2D camera 203, into the chat engine via the communication processing unit 280. This allows the chat engine to provide a response indicating the action robot 100 should take, based on user 10's emotions, robot 100's emotions, and information obtained from user 10's image. The chat engine accepts images as well as text, and the input images also serve as reference information for determining the action robot 100 should take.
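A minimal sketch of how the chat engine input might be assembled is shown below. The text layout, the `ChatRequest` container, and the function name are assumptions for illustration; the source specifies neither a file format nor a particular chat engine interface, so no engine call is shown.

```python
from dataclasses import dataclass
from typing import Optional

# Fixed sentence quoted from the description above.
FIXED_SENTENCE = "What action should the robot take in this situation?"


@dataclass
class ChatRequest:
    """Hypothetical container for the text and image sent to the chat engine."""
    text: str
    image: Optional[bytes]


def build_chat_request(user_emotion: str, robot_emotion: str,
                       user_image: Optional[bytes] = None) -> ChatRequest:
    """Write both emotions as text, then append the fixed sentence."""
    text = (
        f"User emotion: {user_emotion}\n"
        f"Robot emotion: {robot_emotion}\n"
        f"{FIXED_SENTENCE}"
    )
    return ChatRequest(text=text, image=user_image)
```

The resulting `ChatRequest` would then be passed to whatever multimodal chat engine the system uses, and the response interpreted as the robot's action.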

[0086] Therefore, the action decision unit 236 determines the action of the robot 100 according to the content of the response obtained from the chat engine.

[Explanation of Symbols]

[0087] 5 System, 10, 11, 12 User, 20 Communication Network, 100, 101, 102 Robot, 200 Sensor Unit, 201 Microphone, 202 Depth Sensor, 203 Camera, 204 Distance Sensor, 210 Sensor Module Unit, 211 Voice Emotion Recognition Unit, 212 Speech Understanding Unit, 213 Facial Expression Recognition Unit, 214 Face Recognition Unit, 220 Storage Unit, 221 Response Rules, 222 History Data, 230 User State Recognition Unit, 232 Emotion Determination Unit, 234 Action Recognition Unit, 236 Action Determination Unit, 238 Memory Control Unit, 250 Action Control Unit, 252 Controlled Object, 280 Communication Processing Unit, 300 Server, 1200 Computer, 1210 Host Controller, 1212 CPU, 1214 RAM, 1216 Graphics Controller, 1218 Display Device, 1220 Input / Output Controller, 1222 Communication Interface, 1224 Storage Device, 1226 DVD Drive, 1227 DVD-ROM, 1230 ROM, 1240 Input / Output Chip

Claims

[Claim 1] A behavior control system comprising: an emotion determination unit that determines the user's emotions and the robot's emotions; and an action determination unit that, based on a dialogue function that allows the user and the robot to interact, generates action content of the robot in response to the user's actions and the user's and robot's emotions, and determines the robot's action corresponding to the action content, wherein the action determination unit determines the robot's action based on a response from a chat engine that engages in dialogue using text and images, the response being obtained by inputting to the chat engine a fixed sentence asking what action the robot should take, along with an emotion value indicating the time-series change of the user's emotions, the robot's emotions, and the user's image.