Information processing device and information processing method

The information processing apparatus enhances virtual space interactions by personalizing NPC dialogue through context and value-based recommendations, addressing the low appeal of existing systems by generating tailored product promotions.

WO2026140031A1 · PCT designated stage · Publication Date: 2026-07-02 · NTT DOCOMO INC

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
NTT DOCOMO INC
Filing Date
2024-12-23
Publication Date
2026-07-02

AI Technical Summary

Technical Problem

Existing communication systems in virtual spaces, such as the metaverse, fail to effectively personalize dialogue content with NPCs, leading to low appeal when promoting purchases of products or services.

Method used

An information processing apparatus that acquires context and value information from user interactions with NPCs, generating personalized speech recommendations using a large language model based on user values and dialogue context to enhance the appeal of product promotions.

Benefits of technology

Improves the effectiveness of promoting purchases by generating personalized product recommendations tailored to the user's context and values, enhancing the appeal of virtual space interactions.

✦ Generated by Eureka AI based on patent content.


Abstract

This information processing device comprises: a context acquisition unit that acquires, on the basis of the contents of a dialog between a user and a non-player character (NPC) in a virtual space, context information indicating the context of the dialog; and a generation unit that generates, on the basis of the context information and a sense-of-value segment that is information indicating the sense of value of the user, generation information for generating an utterance sentence by the NPC for recommending, to the user, a product and a service (product or the like) designated for recommendation to the user.

Description

Information Processing Apparatus and Information Processing Method

[0001] The present invention relates to an information processing apparatus and an information processing method.

[0002] In a virtual space exemplified by a metaverse, communication by an NPC (Non Player Character) is provided to a user. The NPC communicates with the user based on, for example, a stereotyped dialogue script. Also, a technique is known in which the operation of an avatar by a user in a virtual space is learned, and the avatar is autonomously made to act based on the learning result (see, for example, Patent Document 1).

[0003] Patent Document 1: Japanese Unexamined Patent Application Publication No. 2020-101950

[0004] Through communication such as dialogue in a virtual space, promotion of purchases of services, products, etc. by users has been carried out. However, since the content of the dialogue did not correspond to individual users, the appealing effect was low.

[0005] Therefore, an object of the present disclosure is to improve the appealing effect when promoting purchases of products, services, etc. through communication with a user in a virtual space.

[0006] In order to solve the above problems, an information processing apparatus according to one aspect of the present disclosure includes a context acquisition unit that acquires context information indicating a context in a dialogue based on the dialogue content between a user and an NPC (non-player character) in a virtual space, and a generation unit that generates generation information for generating a speech sentence by an NPC for recommending products and services (products, etc.) designated as objects to be recommended to the user based on a value segment, which is information indicating the user's values, and the context information.

[0007] According to the above aspect, generation information is generated based on the context information estimated based on the dialogue content between the user and the NPC and the value segment of the user. By inputting the generation information into a generation AI such as a large language model (LLM), it becomes possible to generate a speech sentence for recommending a product to the user that is personalized according to the situation and scene of the dialogue represented by the context information.

[0008] This makes it possible to improve the effectiveness of appeals when promoting the purchase of products and services through communication with users in a virtual space.

[0009] Figure 1 is a block diagram showing the functional configuration of the information processing device of this embodiment.
Figure 2 is a diagram showing the configuration of the value information storage unit and an example of the value information stored therein.
Figure 3 is a diagram showing an example of the value segment estimation process using a value estimation model.
Figure 4 is a diagram showing an example of the context information estimation process using a context estimation model.
Figure 5 is a diagram showing the configuration of the product information storage unit and an example of the product information stored therein.
Figure 6 is a diagram showing an example of a prompt (generation information) generated by the generation unit.
Figure 7 is a flowchart showing the processing content of the information processing method in the information processing system.
Figure 8 is a diagram showing the configuration of the information processing program.
Figure 9 is a hardware block diagram of the information processing device.

[0010] Embodiments of the information processing apparatus according to the present invention will be described with reference to the drawings. Where possible, the same parts will be denoted by the same reference numerals, and redundant descriptions will be omitted.

[0011] Figure 1 is a block diagram showing the device configuration and functional configuration of an information processing system including an information processing device according to this embodiment. Information processing system 1 is a system that generates generation information (prompts) to be input to a generation AI, which then generates the utterances with which NPCs (non-player characters) in a virtual space recommend products and services to users. Information processing system 1 may be configured as part of a system that controls a virtual space exemplified by the metaverse, or it may be configured as a separate system.

[0012] As shown in Figure 1, the information processing system 1 may be composed of an information processing device 10. Functionally, the information processing device 10 includes a value acquisition unit 11, a context estimation unit 12 (context acquisition unit), a product information acquisition unit 13, a generation unit 14, a speech generation unit 15, and a speech output unit 16.

[0013] Each functional unit of the information processing device 10 is configured to access storage means such as a value information storage unit 17 and a product information storage unit 18. The value information storage unit 17 is a storage means that stores value segments, which indicate and identify the user's values, in association with a user ID that identifies the user. The product information storage unit 18 is a storage means that stores product information indicating products and services that are recommended to the user. In the example shown in Figure 1, the value information storage unit 17 and the product information storage unit 18 are provided in external devices accessible from the information processing device 10, but they may also be provided within the information processing device 10.

[0014] The information processing device 10 is configured to access the server 20, which hosts the generation AI model md1. In this embodiment, the generation AI model md1 is provided on the server 20, which is a different device from the information processing device 10, but it may also be provided on the information processing device 10. Also, in the example shown in Figure 1, the functional units 11 to 16 are configured on a single information processing device 10, but they may be distributed across multiple devices.

[0015] Furthermore, the information processing device 10 can obtain the content D of the conversation between the user and the NPC from the system that controls the virtual space. In addition, the information processing device 10 can cause the NPC to speak the generated utterance by outputting it to the system that controls the virtual space.

[0016] Next, the functional units of the information processing device 10 will be described. The value acquisition unit 11 acquires a value segment that represents the user's values. In this embodiment, the value acquisition unit 11 acquires the user's value segment from the value information storage unit 17.

[0017] Figure 2 shows the configuration of the value information storage unit 17 and an example of the value information stored therein. As shown in Figure 2, the value information storage unit 17 stores value segments, which indicate and identify a user's values, in association with a user ID that identifies the user. For example, a user with user ID "A" has the value segment "I want to act like everyone else," which is identified by the identifier "v01."
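
The lookup described for Figure 2 can be sketched as a small in-memory mapping. This is a minimal illustration rather than the disclosed storage design: the dictionary names, the function name `acquire_value_segment`, and the second row (user "B", identifier "v04") are assumptions, and only the user "A" / "v01" row comes from the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the value information storage unit 17.
SEGMENT_LABELS: Dict[str, str] = {
    "v01": "I want to act like everyone else",  # example given in the text
    "v04": "want to avoid losses",  # identifier "v04" is an assumed placeholder
}
USER_SEGMENTS: Dict[str, str] = {"A": "v01", "B": "v04"}  # user "B" is assumed


def acquire_value_segment(user_id: str) -> Optional[Tuple[str, str]]:
    """Return (segment identifier, segment label) for a user, or None if unknown."""
    segment_id = USER_SEGMENTS.get(user_id)
    if segment_id is None:
        return None
    return segment_id, SEGMENT_LABELS[segment_id]
```

Returning `None` for an unregistered user is a design choice of this sketch; the text does not say how a missing value segment is handled.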

[0018] Value segments are information estimated based on user information, including at least the user's behavioral history. Specifically, value segments may be estimated by a value estimation model that outputs value segments in response to user information input.

[0019] Figure 3 shows an example of the value segment estimation process using a value estimation model. The value estimation model md2 is constructed using machine learning with training data that associates user information A with value segment SV.

[0020] User information A includes at least the user's behavioral history, and may further include purchase history, activity history, application usage history, location information, and hobbies and preferences. A value segment is a predefined label that classifies users according to the values they hold and is associated with each user. In this embodiment, the variations include, for example, "want to act like everyone else," "evaluate benefits as challenges," "emphasize immediate value," "want to avoid losses," and "fan of the service."

[0021] The construction (generation) of the value estimation model md2 is described below. User information A is collected for training the value estimation model md2. User information A includes, for example, behavioral history, which is information associated with the user ID, date and time, and location information. Alternatively, user information A may be information associated with the user ID, date and time, and content, and the content may be information such as places visited and applications used.

[0022] The value segments SV, which serve as the correct labels for the training data, may be obtained, for example, based on a survey of users. By conducting a survey with users consisting of a set of questions for deriving each value segment, calculating the probability that a user belongs to each value segment, and associating the value segment with the highest probability with that user, training data can be obtained that associates each user's user information A with the value segment SV.
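
The labeling step described above (compute a per-segment probability from the survey, then keep the most probable segment) can be sketched as follows. The function names, the data shapes, and the idea of representing user information A as a list of strings are assumptions made for illustration; how the probabilities are derived from survey answers is not specified in the text and is taken as given here.

```python
from typing import Dict, List, Tuple


def assign_segment(probabilities: Dict[str, float]) -> str:
    """Pick the value segment with the highest membership probability."""
    if not probabilities:
        raise ValueError("no segment probabilities given")
    return max(probabilities, key=lambda segment: probabilities[segment])


def build_training_data(
    user_info: Dict[str, List[str]],
    survey_probabilities: Dict[str, Dict[str, float]],
) -> List[Tuple[List[str], str]]:
    """Pair each surveyed user's information A with the assigned segment SV."""
    return [
        (user_info[user_id], assign_segment(probs))
        for user_id, probs in survey_probabilities.items()
        if user_id in user_info  # skip survey answers with no matching user information
    ]
```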

[0023] The type of model that constitutes the value estimation model md2 is not limited, but it may be a model that includes a neural network or a transformer. By performing machine learning on the value estimation model md2, which is composed of transformers, etc., using training data consisting of associations between user information A and value segment SV, a trained value estimation model md2 is obtained. Then, by inputting user information A of a user whose value segment SV is unknown into the trained value estimation model md2, the value segment SV of that user is obtained.

[0024] The value estimation model md2 may be generated for each type of value segment SV. That is, for one type of value segment SV, machine learning using training data consisting of user information A and a label indicating whether the user has that value segment SV yields a value estimation model md2 that classifies whether a user whose value segment SV is unknown has that value segment SV.
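
For the per-segment variant, the training labels become yes/no flags for one target segment. A minimal sketch of that relabeling is shown below; deriving the flags from single-label data, the function name, and the data shapes are all assumptions, since the text only states that each example carries a label indicating whether the user has the segment.

```python
from typing import List, Sequence, Tuple


def binary_labels_for_segment(
    training_data: Sequence[Tuple[List[str], str]], target_segment: str
) -> List[Tuple[List[str], bool]]:
    """Relabel (user information, segment) pairs as 'has target segment' or not."""
    return [(info, segment == target_segment) for info, segment in training_data]
```

One binary data set per segment type would then be fed to whatever classifier is chosen for md2.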

[0025] The context estimation unit 12 acquires contextual information indicating the context of the dialogue based on the content of the dialogue between the user and the NPC in the virtual space. Contextual information is information indicating the state of the dialogue between the user and the NPC and the phase of the dialogue. The context estimation unit 12 may also estimate contextual information based on the content of the dialogue between the user and the NPC. The context estimation unit 12 supplies the estimated contextual information for use in generating the generation information.

[0026] Figure 4 shows an example of contextual information estimation processing by a context estimation model. The context estimation model md3 is a model that outputs contextual information CX in response to the input of the dialogue content D between the user and the NPC. The context estimation model md3 is constructed by machine learning using training data that associates the dialogue content D with the contextual information CX as the correct label.

[0027] Dialogue content D may include at least two utterances. Furthermore, dialogue content may include at least two consecutive utterances between the user and the NPC. For example, dialogue content d1, an example of dialogue content D, includes the user's utterance "Hello" followed by the NPC's utterance "Hello." Similarly, dialogue content d2, another example of dialogue content D, includes the user's utterance "That's a nice outfit," followed by the NPC's utterance "I think so," and the user's utterance "Nice."
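
One way to assemble dialogue content D from a running conversation is to take the most recent consecutive utterances. The sketch below assumes a hypothetical `Utterance` record and a window size parameter; the text only requires that D contain at least two utterances.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Utterance:
    speaker: str  # "user" or "npc" (naming is an assumption of this sketch)
    text: str


def latest_dialogue_window(history: List[Utterance], size: int = 2) -> List[Utterance]:
    """Return the last `size` consecutive utterances, or [] if there are too few."""
    if size < 2:
        raise ValueError("dialogue content D needs at least two utterances")
    if len(history) < size:
        return []
    return history[-size:]
```

With the d2 example above, a window of three returns all three utterances and a window of two returns the NPC's "I think so" followed by the user's "Nice."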

[0028] Contextual information is a given label associated with the content of a dialogue, indicating the state and phase of the dialogue, and has variations such as "introduction to the conversation," "introduction to the event," "final push," and "free conversation."

[0029] The type of model that constitutes the context estimation model md3 is not limited, but it may be a model that includes a neural network or a transformer. By performing machine learning on the context estimation model md3, which is composed of transformers, etc., using training data consisting of associations between dialogue content D and context information CX, a trained context estimation model md3 is obtained. Then, by inputting dialogue content D for which the context information CX is unknown into the trained context estimation model md3, the context information CX of the dialogue content D is obtained.

[0030] The product information acquisition unit 13 acquires product information indicating the products to be recommended to the user. Specifically, the product information acquisition unit 13 may acquire product information from the product information storage unit 18.

[0031] Figure 5 shows the configuration of the product information storage unit 18 and an example of the product information stored therein. As shown in Figure 5, the product information storage unit 18 stores product information that associates product names with product knowledge. The product name identifies the product. Product knowledge includes, for example, information such as price, characteristics, and uses for the product. The information included in the product knowledge is not limited to the example shown in Figure 5.
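
The product information record can be pictured as a product name plus a free-form bag of product knowledge. The sketch below is illustrative only: the class name, the store, and the function name are assumptions, and the knowledge values are placeholders because the actual contents of Figure 5 are not reproduced in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ProductInfo:
    name: str  # identifies the product
    knowledge: Dict[str, str] = field(default_factory=dict)  # price, characteristics, uses, ...


# Hypothetical stand-in for the product information storage unit 18.
PRODUCT_STORE: Dict[str, ProductInfo] = {
    "penlight": ProductInfo(
        name="penlight",
        knowledge={
            "price": "<price>",  # placeholder values, not from the source
            "characteristics": "<characteristics>",
            "uses": "<uses>",
        },
    ),
}


def acquire_product_info(product_name: str) -> Optional[ProductInfo]:
    """Look up the product designated for recommendation; None if not registered."""
    return PRODUCT_STORE.get(product_name)
```

Keeping product knowledge as a dictionary reflects the statement that the included information is not limited to the fields shown in the figure.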

[0032] The products to be recommended to the user may be specified by the administrator or other users in the virtual space. The product information acquisition unit 13 acquires the corresponding product information based on the specification of the products to be recommended to the user.

[0033] The generation unit 14 generates generation information for generating speech statements by an NPC to recommend products designated as items to be recommended to the user, based on the value segment SV and contextual information CX. Specifically, the generation unit 14 generates prompts as generation information for input into the generation AI model md1.

[0034] The generation unit 14 obtains contextual information CX, estimated from the content of the conversation between the user and the NPC, from the context estimation unit 12, based on the user ID that identifies the user to whom the product is recommended. The generation unit 14 also obtains the user's value segment SV from the value acquisition unit 11, based on the user ID. The generation unit 14 also obtains product information for the product to be recommended to the user from the product information acquisition unit 13. Finally, the generation unit 14 generates a prompt based on the obtained contextual information CX, value segment SV, and product information.

[0035] Figure 6 shows an example of a prompt generated by the generation unit 14. As shown in Figure 6, the prompt PT includes instruction information for instructing the generation AI model md1 to generate an utterance by the NPC.

[0036] The generation unit 14 may generate a prompt PT that includes contextual information CX and at least one of the sentences and phrases selected according to the contextual information CX. In the example of the prompt PT shown in Figure 6, the generation unit 14 includes the phrase cx1 (first meeting) selected based on the contextual information CX in the prompt PT. The generation unit 14 may also include an example sentence selected based on the contextual information CX in the prompt PT.

[0037] The generation unit 14 may include product information in the prompt PT that indicates the product to be recommended to the user. As shown in Figure 6, the generation unit 14 includes product information it1 (penlight) indicating the product name in the prompt PT. The generation unit 14 may also include product knowledge from the product information in the prompt PT.

[0038] The generation unit 14 may include the value segment SV in the prompt PT. As shown in Figure 6, the generation unit 14 includes the value segment sv1 acquired by the value acquisition unit 11 as information indicating the user's values in the prompt PT.

[0039] As illustrated in Figure 6, the generation unit 14 may generate a prompt PT by embedding contextual information CX, product information, and value segment SV, etc., into the fields of various information in a pre-prepared prompt PT template.
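
Template filling of this kind can be sketched with ordinary string formatting. The template wording below is invented for illustration and is not the prompt shown in Figure 6; the field names and the function name `build_prompt` are likewise assumptions.

```python
# Hypothetical prompt template with one field per kind of information.
PROMPT_TEMPLATE = (
    "You are an NPC talking with a user in a virtual space.\n"
    "Dialogue phase: {context}\n"
    "User's values: {value_segment}\n"
    "Product to recommend: {product_name}\n"
    "Write one utterance for the NPC that recommends the product."
)


def build_prompt(context: str, value_segment: str, product_name: str) -> str:
    """Embed contextual information CX, value segment SV, and product name in the template."""
    return PROMPT_TEMPLATE.format(
        context=context, value_segment=value_segment, product_name=product_name
    )
```

For example, `build_prompt("first meeting", "want to avoid losses", "penlight")` yields a five-line prompt whose second line reads "Dialogue phase: first meeting".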

[0040] Furthermore, the generation unit 14 may select example sentences based on contextual information CX and value segment SV, and include the selected example sentence ex1 in the prompt PT. In this case, the generation unit 14 may, for example, obtain example sentences by referring to a database that has previously stored example sentences associated with combinations of contextual information CX and value segment SV.
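
A database of example sentences keyed by the combination of contextual information and value segment can be approximated by a dictionary with tuple keys. Everything in this sketch is assumed for illustration: the table contents are placeholders, and the function names and the "Example sentence:" line format are not from the source.

```python
from typing import Dict, Optional, Tuple

# Hypothetical example-sentence table keyed by (contextual information, value segment).
EXAMPLE_SENTENCES: Dict[Tuple[str, str], str] = {
    ("final push", "want to avoid losses"): "<example sentence 1>",
    ("introduction to the event", "want to act like everyone else"): "<example sentence 2>",
}


def select_example_sentence(context: str, value_segment: str) -> Optional[str]:
    """Return the stored example for this combination, or None if there is none."""
    return EXAMPLE_SENTENCES.get((context, value_segment))


def append_example(prompt: str, example: Optional[str]) -> str:
    """Add the selected example sentence to a prompt; leave the prompt as-is if absent."""
    if example is None:
        return prompt
    return f"{prompt}\nExample sentence: {example}"
```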

[0041] Furthermore, the generation unit 14 may include action information ac1 for instructing the NPC's actions and emotions in the prompt PT.

[0042] Referring again to Figure 1, the speech generation unit 15 inputs the generated information to the generation AI model md1 and acquires the content output from the generation AI model md1 as a speech sentence. This makes it possible to generate speech sentences that are personalized to the user and recommend products to the user, according to the situation and phase of the dialogue represented by the contextual information CX.

[0043] The speech output unit 16 causes the NPC to output the speech sentence u acquired by the speech generation unit 15. By having the NPC issue speech sentence u that is personalized to the user according to the situation and context of the conversation and recommends products, it is possible to promote the user's purchase of products.

[0044] Figure 7 is a flowchart showing the information processing method performed in the information processing system 1 and the information processing apparatus 10, covering the generation of the generation information (prompt PT) and the output of the NPC's utterance text.

[0045] In step S1, the value acquisition unit 11 acquires a value segment indicating the user's values. In step S2, the context estimation unit 12 acquires context information based on the dialogue content between the user and the NPC in the virtual space.

[0046] In step S3, the product information acquisition unit 13 acquires product information indicating the product to be recommended to the user. Note that the processing in steps S1 to S3 may be performed in any order.

[0047] In step S4, the generation unit 14 generates generation information (prompt PT) for generating an utterance text by the NPC for recommending the product indicated in the product information to the user, based on the value segment SV and the context information CX.

[0048] In step S5, the utterance generation unit 15 inputs the generation information into the generation AI model md1. In step S6, the utterance generation unit 15 acquires the content output from the generation AI model md1 as an utterance text.

[0049] In step S7, the utterance output unit 16 causes the NPC to output the utterance text u acquired by the utterance generation unit 15.
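
Steps S1 to S7 can be read as a single orchestration function that wires the six functional units together. The sketch below passes each unit in as a callable so that it stays independent of any particular model or virtual-space API; the function name `run_turn`, the parameter names, and the simplified string types are assumptions, not the disclosed interfaces.

```python
from typing import Callable, List


def run_turn(
    user_id: str,
    dialogue: List[str],
    product_name: str,
    *,
    get_value_segment: Callable[[str], str],
    estimate_context: Callable[[List[str]], str],
    get_product_info: Callable[[str], str],
    build_prompt: Callable[[str, str, str], str],
    generate: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one recommendation turn and return the utterance given to the NPC."""
    value_segment = get_value_segment(user_id)  # S1
    context = estimate_context(dialogue)  # S2
    product = get_product_info(product_name)  # S3 (S1 to S3 may run in any order)
    prompt = build_prompt(context, value_segment, product)  # S4
    utterance = generate(prompt)  # S5 and S6: input the prompt, take the output
    speak(utterance)  # S7: have the NPC output the utterance
    return utterance
```

Injecting `generate` as a callable mirrors the statement that the generation AI model md1 may live on a separate server or on the same device.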

[0050] Next, referring to Figure 8, an information processing program for causing a computer to function as the information processing apparatus 10 of the present embodiment will be described. Figure 8 is a diagram showing the configuration of the information processing program. The information processing program P1 includes a main module m10 that comprehensively controls the information processing in the information processing apparatus 10, a value acquisition module m11, a context estimation module m12, a product information acquisition module m13, a generation module m14, an utterance generation module m15, and an utterance output module m16. The modules m11 to m16 realize the functions of the functional units 11 to 16, respectively.

[0051] The information processing program P1 may be transmitted via a transmission medium such as a communication line, or it may be stored in a recording medium M1, as shown in Figure 8.

[0052] According to the information processing system 1, information processing device 10, information processing method, and information processing program P1 of this embodiment described above, a prompt PT is generated based on contextual information CX estimated from the content of the dialogue between the user and the NPC, and the user's value segment SV. When the prompt PT is input to a generation AI such as a large language model (LLM), it becomes possible to generate utterances that are personalized to the user and recommend products to the user, according to the situation and phase of the dialogue represented by the contextual information CX. When the NPC then speaks the generated utterances, the appeal of the purchase promotion is improved.

[0053] The information processing apparatus and information processing method relating to this disclosure may have the following configurations. The operation and effects of each configuration are described below.

[0054] An information processing device relating to one aspect of this disclosure includes a context acquisition unit that acquires contextual information indicating the context of a dialogue based on the content of a dialogue between a user and an NPC (non-player character) in a virtual space, and a generation unit that generates generation information for generating speech sentences by an NPC to recommend products and services (products, etc.) designated as items to be recommended to the user, based on a value segment, which is information indicating the user's values, and the contextual information.

[0055] An information processing method relating to one aspect of this disclosure is a method executed by a processor, comprising: a context acquisition step of acquiring contextual information indicating the context of a dialogue based on the content of a dialogue between a user and an NPC (non-player character) in a virtual space; and a generation step of generating generation information for generating speech sentences by an NPC to recommend products and services (products, etc.) designated as items to be recommended to the user, based on a value segment, which is information indicating the user's values, and the contextual information.

[0056] Based on the aspects described above, generation information is created based on contextual information estimated from the content of the conversation between the user and the NPC, and the user's value segment. When this generation information is input into a generation AI such as a large language model (LLM), it becomes possible to generate utterances that are personalized to the user and recommend products to the user, according to the situation and phase of the conversation represented by the contextual information. When the NPC then speaks these generated utterances, the appeal of the purchase promotion is improved.

[0057] Furthermore, in information processing devices relating to other aspects, the value segments may be estimated based on user information, including at least the user's behavioral history, and may be estimated by a value estimation model constructed using machine learning with training data that associates user information with value segments, and which outputs value segments in response to user information input.

[0058] Based on the above aspects, it becomes possible to easily obtain value segments that appropriately represent and identify the user's values. Then, by inputting the generated information based on the value segments into the generating AI, it becomes possible to generate personalized speech sentences according to the user's values.

[0059] Furthermore, in information processing devices relating to other aspects, contextual information may be estimated based on at least two consecutive utterances made by the user or NPC prior to the present, and may be estimated by a context estimation model constructed by machine learning using training data that associates at least two utterances with context, and which outputs contextual information in response to input of at least two utterances.

[0060] Based on the aspects described above, it becomes possible to easily obtain contextual information that appropriately represents the state and situation of the interaction between the user and the NPC. Then, by inputting the generated information generated based on the contextual information into the generating AI, it becomes possible to generate appropriate utterances according to the state and situation of the interaction.

[0061] Furthermore, in information processing devices relating to other aspects, the generation unit may generate generated information that includes contextual information and at least one of the sentences and phrases selected according to the contextual information.

[0062] Based on the above aspects, it is possible to reliably include information representing the state and stage of the interaction between the user and the NPC in the generated information.

[0063] Furthermore, information processing devices relating to other aspects may further include a speech generation unit that inputs generated information into a generation AI model and acquires the content output from the generation AI model as spoken text.

[0064] Based on the aspects described above, it becomes possible to generate personalized utterances that recommend products to the user, depending on the situation and context of the dialogue as represented by contextual information.

[0065] Furthermore, in information processing devices relating to other aspects, the generation unit may include product information in the generated information that indicates the products to be recommended to the user.

[0066] Based on the above aspects, when the generated information is input into the generating AI, it becomes possible to obtain speech statements that recommend the purchase of products represented by the product information to the user.

[0067] Furthermore, in information processing devices relating to other aspects, the generation unit may include value segments in the generated information.

[0068] Based on the above aspects, by inputting the generated information into the generation AI, it becomes possible to obtain utterances that reflect the user's values.

[0069] The block diagram shown in Figure 1 represents functional units. These functional blocks (components) are realized by any combination of at least one of hardware and software. Furthermore, the method of realizing each functional block is not particularly limited. That is, each functional block may be realized using one device that is physically or logically coupled, or it may be realized using two or more physically or logically separated devices that are directly or indirectly connected (for example, using wired or wireless connections). A functional block may also be realized by combining software with the one or more devices described above.

[0070] Functions include, but are not limited to, judging, deciding, determining, calculating, computing, processing, deriving, investigating, searching, confirming, receiving, transmitting, outputting, accessing, resolving, selecting, choosing, establishing, comparing, assuming, expecting, considering, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating (mapping), and assigning. For example, a functional block (component) that performs transmission is called a transmitting unit or transmitter. In all cases, as mentioned above, the method of implementation is not particularly limited.

[0071] For example, the information processing device 10 in one embodiment of the present invention may function as a computer. Figure 9 is a diagram showing an example of the hardware configuration of the information processing device 10 according to this embodiment. Physically, the information processing device 10 may be configured as a computer device including a processor 1001, memory 1002, storage 1003, communication device 1004, input device 1005, output device 1006, bus 1007, etc.

[0072] In the following explanation, the term "apparatus" can be read as "circuit," "device," "unit," etc. The hardware configuration of the information processing device 10 may include one or more of each of the devices shown in Figure 9, or it may omit some of the devices.

[0073] Each function in the information processing device 10 is realized by loading predetermined software (programs) onto hardware such as the processor 1001 and memory 1002, so that the processor 1001 performs calculations and controls communication by the communication device 1004 as well as the reading and/or writing of data in the memory 1002 and storage 1003.

[0074] The processor 1001 controls the entire computer, for example, by running an operating system. The processor 1001 may consist of a central processing unit (CPU) that includes interfaces with peripheral devices, control devices, arithmetic units, registers, etc. For example, the various functional units 11 to 16 shown in Figure 1 may be implemented by the processor 1001.

[0075] Furthermore, the processor 1001 reads programs (program code), software modules, and data from the storage 1003 and/or communication device 1004 into the memory 1002, and executes various processes accordingly. The program used is one that causes the computer to execute at least a part of the operations described in the above embodiment. For example, each of the functional units 11 to 16 of the information processing device 10 may be implemented by a control program that is stored in the memory 1002 and runs on the processor 1001. Although the above-described processes have been explained as being executed by one processor 1001, they may be executed simultaneously or sequentially by two or more processors 1001. The processor 1001 may be implemented on one or more chips. The program may also be transmitted from a network via a telecommunications line.

[0076] The memory 1002 is a computer-readable recording medium and may consist of at least one of the following: ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), RAM (Random Access Memory), etc. The memory 1002 may also be called a register, cache, main memory, etc. The memory 1002 can store executable programs (program code), software modules, etc., for carrying out an information processing method according to one embodiment of the present invention.

[0077] The storage 1003 is a computer-readable recording medium and may consist of at least one of the following: an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (e.g., a compact disc, a digital versatile disc, a Blu-ray® disc), a smart card, flash memory (e.g., a card, a stick, a key drive), a floppy® disk, a magnetic strip, etc. The storage 1003 may also be called an auxiliary storage device. The above-mentioned storage medium may be, for example, a database, server, or other suitable medium including memory 1002 and / or storage 1003.

[0078] The communication device 1004 is hardware (a transmitting and receiving device) for communicating between computers via a wired and / or wireless network, and is also referred to as a network device, network controller, network card, communication module, etc.

[0079] The input device 1005 is an input device that accepts input from an external source (e.g., a keyboard, mouse, microphone, switch, button, sensor, etc.). The output device 1006 is an output device that outputs to an external source (e.g., a display, speaker, LED lamp, etc.). The input device 1005 and the output device 1006 may be configured as an integrated unit (e.g., a touch panel).

[0080] Furthermore, each device, such as the processor 1001 and the memory 1002, is connected by a bus 1007 for communicating information. The bus 1007 may consist of a single bus, or different buses may be used for communication between devices.

[0081] Furthermore, the information processing device 10 may be configured to include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array), and some or all of each functional block may be realized by such hardware. For example, the processor 1001 may be implemented using at least one of these hardware components.

[0082] The notification of information is not limited to the embodiments described herein and may be carried out by other means. For example, the notification of information may be carried out by physical layer signaling (e.g., DCI (Downlink Control Information), UCI (Uplink Control Information)), upper layer signaling (e.g., RRC (Radio Resource Control) signaling, MAC (Medium Access Control) signaling, broadcast information (MIB (Master Information Block), SIB (System Information Block))), other signals, or combinations thereof. RRC signaling may also be called RRC messages, and may be, for example, RRC Connection Setup messages, RRC Connection Reconfiguration messages, etc.

[0083] Each aspect / embodiment described in this disclosure may be applied to at least one of the following systems: LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G (4th generation mobile communication system), 5G (5th generation mobile communication system), FRA (Future Radio Access), NR (New Radio), W-CDMA®, GSM®, CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi®), IEEE 802.16 (WiMAX®), IEEE 802.20, UWB (Ultra-WideBand), Bluetooth®, and other appropriate systems, as well as next-generation systems extended based thereon. Furthermore, multiple systems may be applied in combination (for example, a combination of at least one of LTE and LTE-A with 5G).

[0084] The processing procedures, sequences, flowcharts, etc., of each aspect / embodiment described in this disclosure may be reordered, provided they do not contradict each other. For example, the methods described in this disclosure present various step elements using exemplary order and are not limited to the specific order presented.

[0085] The specific operations described in this disclosure as being performed by a base station may, in some cases, be performed by its upper node. In a network consisting of one or more network nodes having a base station, it is clear that various operations performed for communication with a terminal can be performed by the base station and at least one other network node (for example, an MME or S-GW, but not limited to these). Although the above example illustrates the case where there is one other network node besides the base station, it may also be a combination of multiple other network nodes (for example, an MME and an S-GW).

[0086] Information can be output from a higher layer (or lower layer) to a lower layer (or higher layer). Input and output may also occur via multiple network nodes.

[0087] Input and output information may be stored in a specific location (e.g., memory) or managed in a management table. Input and output information may be overwritten, updated, or appended to. Output information may be deleted. Input information may be sent to other devices.

[0088] The determination may be made by a value represented by one bit (0 or 1), by a boolean value (true or false), or by a numerical comparison (for example, a comparison with a predetermined value).
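
The three forms of determination listed above can be illustrated with a minimal Python sketch. The function names, the choice of "greater than or equal to" as the comparison, and the threshold used in the example are assumptions made for illustration only; the disclosure does not specify them.

```python
def decide_by_bit(bit: int) -> bool:
    """Determination from a value represented by one bit (0 or 1)."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return bit == 1


def decide_by_boolean(flag: bool) -> bool:
    """Determination from a boolean value (true or false)."""
    return flag


def decide_by_comparison(value: float, predetermined_value: float) -> bool:
    """Determination by numerical comparison with a predetermined value.

    The direction of the comparison (>=) is an assumption of this sketch.
    """
    return value >= predetermined_value


if __name__ == "__main__":
    print(decide_by_bit(1))
    print(decide_by_boolean(False))
    print(decide_by_comparison(0.8, 0.5))
```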

[0089] Each aspect / embodiment described in this disclosure may be used individually, in combination, or switched between as needed during implementation. Furthermore, notification of specific information (e.g., notification that "it is X") is not limited to explicit notification, but may also be implicit (e.g., by not providing such notification).

[0090] Although the present disclosure has been described in detail above, it will be clear to those skilled in the art that the present disclosure is not limited to the embodiments described herein. The present disclosure can be implemented in modified and altered forms without departing from the intent and scope of the present disclosure as defined by the claims. Therefore, the descriptions in the present disclosure are illustrative and not intended to be restrictive in any way.

[0091] Software should be broadly interpreted to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, execution threads, procedures, functions, and so on, whether they are called software, firmware, middleware, microcode, hardware description languages, or by any other name.

[0092] Furthermore, software, instructions, etc., may be transmitted and received via a transmission medium. For example, if software is transmitted from a website, server, or other remote source using wired technologies such as coaxial cable, fiber optic cable, twisted pair, and digital subscriber lines (DSL) and / or wireless technologies such as infrared, radio, and microwave, these wired and / or wireless technologies are included in the definition of a transmission medium.

[0093] The information, signals, etc. described in this disclosure may be represented using any of the various different techniques. For example, the data, instructions, commands, information, signals, bits, symbols, chips, etc. that may be referred to throughout the above description may be represented by voltage, current, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.

[0094] In addition, terms described in this disclosure and / or terms necessary for understanding this specification may be replaced with terms having the same or similar meaning.

[0095] The terms “system” and “network” as used in this disclosure are interchangeable.

[0096] Furthermore, the information, parameters, etc., described in this disclosure may be expressed as absolute values, relative values from a given value, or by other corresponding information. For example, wireless resources may be indicated by an index.

[0097] The names used for the parameters described above are not restrictive in any way. Furthermore, the formulas and other expressions using these parameters may differ from those expressly disclosed in this disclosure. Various channels (e.g., PUCCH, PDCCH, etc.) and information elements can be identified by any suitable name, and therefore, the various names assigned to these various channels and information elements are not restrictive in any way.

[0098] As used in this disclosure, the terms “determining” and “deciding” may encompass a wide variety of actions. “Determining” and “deciding” may include, for example, regarding an act of judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (e.g., searching in a table, database, or other data structure), or ascertaining as having “determined” or “decided” something. They may also include regarding an act of receiving (e.g., receiving information), transmitting (e.g., sending information), inputting, outputting, or accessing (e.g., accessing data in memory) as having “determined” or “decided” something. Furthermore, they may include regarding an act of resolving, selecting, choosing, establishing, comparing, etc., as having “determined” or “decided” something. In other words, “determining” and “deciding” may include regarding some action as having “determined” or “decided” something. “Determining (deciding)” may also be read as “assuming,” “expecting,” or “considering.”

[0099] As used in this disclosure, the phrase "based on" does not mean "based solely on" unless otherwise specified. In other words, the phrase "based on" means both "based solely on" and "based at least on."

[0100] Where the terms “first,” “second,” etc., are used in this disclosure, any reference to those elements does not generally limit their quantity or order. These terms may be used herein as a convenient way to distinguish between two or more elements. Accordingly, references to first and second elements do not imply that only two elements may be employed, or that the first element must precede the second element in some way.

[0101] In the configuration of each of the above devices, "means" may be replaced with "part," "circuit," "device," etc.

[0102] To the extent that “include,” “including,” and their variations are used herein or in the claims, these terms are intended to be inclusive, as is the term “comprising.” Furthermore, the term “or” as used herein or in the claims is not intended to be an exclusive OR.

[0103] In this disclosure, where articles such as a, an, and the in English are added through translation, this disclosure may include cases where the nouns following these articles are plural.

[0104] In this disclosure, the term "A and B are different" may mean "A and B are different from each other." The term may also mean "A and B are each different from C." Terms such as "separate" and "combine" may be interpreted similarly to "different."

[0105] The information processing device 10 and information processing method of this disclosure may have the following configurations.

[0106] [1] An information processing device comprising: a context acquisition unit that acquires contextual information indicating the context of a dialogue based on the content of a dialogue between a user and an NPC (non-player character) in a virtual space; and a generation unit that generates generation information for generating speech statements by the NPC to recommend products and services (products, etc.) designated as items to be recommended to the user, based on a value segment which is information indicating the user's values and the contextual information.

[0107] [2] The information processing device according to [1], wherein the value segment is information estimated based on user information including at least the user's behavioral history, and is estimated by a value estimation model that is constructed by machine learning using training data that associates the user information with the value segment and that outputs the value segment in response to the input of the user information.

[0108] [3] The information processing device according to [1] or [2], wherein the contextual information is information estimated based on at least two consecutive utterances made by the user or NPC prior to the present, and is estimated by a context estimation model that is constructed by machine learning using training data that associates the at least two utterances with the context and that outputs the contextual information in response to the input of the at least two utterances.

[0109] [4] The information processing device according to any one of [1] to [3], wherein the generation unit generates the generation information so as to include the contextual information and at least one of the sentences and phrases selected according to the contextual information.

[0110] [5] The information processing device according to any one of [1] to [4], further comprising: an utterance generation unit that inputs the generation information to a generation AI model and acquires the content output from the generation AI model as the speech statements.

[0111] [6] The information processing device according to any one of [1] to [5], wherein the generation unit includes, in the generation information, product information indicating the product to be recommended to the user.

[0112] [7] The information processing device according to any one of [1] to [6], wherein the generation unit includes the value segment in the generation information.

[0113] [8] An information processing method executed by a processor, comprising: a context acquisition step of acquiring contextual information indicating the context of a dialogue based on the content of a dialogue between a user and an NPC (non-player character) in a virtual space; and a generation step of generating generation information for generating speech statements by the NPC to recommend products and services (products, etc.) designated as items to be recommended to the user, based on a value segment which is information indicating the user's values and the contextual information.
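
As a non-limiting illustration, the flow of configurations [1] and [4] to [7] might be sketched in Python as follows. Everything named here is hypothetical and not taken from the disclosure: the `Product` record, the phrase table, the context labels, the keyword rule that stands in for the machine-learned context estimation model, the prompt wording, and the callable that stands in for the generation AI model. The value segment is simply passed in as a string, since its estimation (configuration [2]) is outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Product:
    """Hypothetical record for a product or service designated for recommendation."""
    name: str
    description: str


# Hypothetical table of phrases selected according to the context (cf. [4]).
PHRASES_BY_CONTEXT: Dict[str, List[str]] = {
    "product_inquiry": ["Since you asked,"],
    "small_talk": ["By the way,"],
}


def acquire_context(utterances: Sequence[str]) -> str:
    """Return a context label from the two most recent utterances.

    A keyword rule is used only as a stand-in (assumption) for the
    machine-learned context estimation model described in [3].
    """
    if len(utterances) < 2:
        raise ValueError("at least two consecutive utterances are required")
    recent = " ".join(utterances[-2:]).lower()
    if any(word in recent for word in ("buy", "price", "recommend")):
        return "product_inquiry"
    return "small_talk"


def build_generation_info(context: str, value_segment: str, product: Product) -> str:
    """Assemble generation information (here, a text prompt) from the context,
    context-dependent phrases, the value segment, and product information."""
    phrases = PHRASES_BY_CONTEXT.get(context, [])
    return "\n".join([
        "Write one line of NPC dialogue that recommends the product below.",
        f"Dialogue context: {context}",
        f"Suggested phrases: {', '.join(phrases) if phrases else '(none)'}",
        f"User value segment: {value_segment}",
        f"Product: {product.name} ({product.description})",
    ])


def generate_utterance(generation_info: str, model: Callable[[str], str]) -> str:
    """Pass the generation information to a generation AI model (any callable
    in this sketch) and return its output as the NPC's speech statement."""
    return model(generation_info)


if __name__ == "__main__":
    def stub_model(prompt: str) -> str:
        # Placeholder for a real generation AI model (assumption).
        return f"[model output for a {len(prompt)}-character prompt]"

    dialogue = ["Hello there.", "What would you recommend for a hike?"]
    info = build_generation_info(
        acquire_context(dialogue),
        "eco-conscious",  # hypothetical value segment
        Product("Trail Bottle", "a reusable water bottle"),
    )
    print(info)
    print(generate_utterance(info, stub_model))
```

Passing the model as a callable keeps the sketch independent of any particular generation AI service; a real system would substitute its own model client and learned estimators.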

[0114] 1... Information processing system, 10... Information processing device, 11... Value acquisition unit, 12... Context estimation unit, 13... Product information acquisition unit, 14... Generation unit, 15... Utterance generation unit, 16... Utterance output unit, 17... Value information storage unit, 18... Product information storage unit, 20... Server, M1... Recording medium, m11... Value acquisition module, m12... Context estimation module, m13... Product information acquisition module, m14... Generation module, m15... Utterance generation module, m16... Utterance output module, md1... Generation AI model, md2... Value estimation model, md3... Context estimation model, P1... Information processing program.

Claims

1. An information processing device comprising: a context acquisition unit that acquires contextual information indicating the context of a dialogue based on the content of a dialogue between a user and an NPC (non-player character) in a virtual space; and a generation unit that generates generation information for generating speech statements by the NPC to recommend products and services (products, etc.) designated as items to be recommended to the user, based on a value segment which is information indicating the user's values and the contextual information.

2. The information processing device according to claim 1, wherein the value segment is information estimated based on user information including at least the user's behavioral history, and is estimated by a value estimation model that is constructed by machine learning using training data that associates the user information with the value segment and that outputs the value segment in response to the input of the user information.

3. The information processing device according to claim 1, wherein the contextual information is information estimated based on at least two consecutive utterances made by the user or NPC in the present or prior to the present, and is estimated by a context estimation model that is constructed by machine learning using training data that associates the at least two utterances with the context and that outputs the contextual information in response to the input of the at least two utterances.

4. The information processing device according to claim 1, wherein the generation unit generates the generation information so as to include the contextual information and at least one of the sentences and phrases selected according to the contextual information.

5. The information processing device according to claim 1, further comprising an utterance generation unit that inputs the generation information into a generation AI model and acquires the content output from the generation AI model as the speech statements.

6. The information processing device according to claim 1, wherein the generation unit includes, in the generation information, product information indicating the product or other item to be recommended to the user.

7. The information processing device according to claim 1, wherein the generation unit includes the value segment in the generation information.

8. An information processing method executed by a processor, comprising: a context acquisition step of acquiring contextual information indicating the context of a dialogue based on the content of a dialogue between a user and an NPC (non-player character) in a virtual space; and a generation step of generating generation information for generating speech statements by the NPC to recommend products and services (products, etc.) designated as items to be recommended to the user, based on a value segment which is information indicating the user's values and the contextual information.