system
The system addresses the challenge of uniform responses by using generative AI to create personalized call scripts, analyze call results, and provide tailored customer service, enhancing service quality and satisfaction.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-16
- Publication Date
- 2026-06-26
AI Technical Summary
Conventional customer support systems provide uniform responses, failing to reflect individual customer needs and interests, and do not effectively utilize post-call information for personalized service improvements.
A system that collects customer reservation and response information, generates personalized call content using generative AI, automatically makes calls, analyzes results, and summarizes key information to tailor customer service.
Enables personalized customer service by generating tailored call scripts, improving service quality and customer satisfaction through efficient data utilization.
Abstract
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, the method including the steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] In a conventional customer support system, every customer receives the same uniform response when called in advance, so the differing needs and interests of each customer cannot be fully reflected. In addition, the information obtained from the call is not used effectively and cannot inform customer service when the customer visits the store. As a result, customer service quality varies and improvement in customer satisfaction is limited.
Means for Solving the Problems
[0005] This invention solves these problems by providing a system that collects customer reservation and response information and generates optimal call content using generative AI technology based on that information. Furthermore, it automatically makes calls using the generated call content, analyzes the results, and summarizes important information. Based on this summarized information, it notifies staff of customer service suggestions, providing a means to realize customer service tailored to each customer's needs.
[0006] "Customer" refers to any person who uses a store or system for the purpose of receiving goods or services.
[0007] "Reservation information" refers to data that represents the date and time of a customer's planned visit and related details.
[0008] "Response information" refers to the data of responses that customers provide to surveys and questions.
[0009] "Generative AI" refers to a technology that uses machine learning models to generate desired outputs from specific data.
[0010] "Call content" refers to the pre-planned speech prepared for customer service.
[0011] "Automated calling" refers to the act of automatically transmitting pre-set speech content to a customer through a machine.
[0012] "Call results" refer to the result data obtained after a call has been made, including information such as whether the call was successful or not and the customer's response.
[0013] A "summary" refers to a document that extracts and compiles important information and key points from a vast amount of data.
[0014] "Staff" refers to individuals who have received the necessary training to handle customer interactions and who provide customer service using the system.
[0015] "Proposal" refers to an instruction or advice that suggests the optimal actions or countermeasures in a specific situation.
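As a non-limiting illustration, the terms defined above could be represented as simple records. The following Python sketch is not part of the disclosure; the class names and field names (ReservationInfo, ResponseInfo, CallResult, customer_id, and so on) are hypothetical and chosen only to mirror the definitions.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ReservationInfo:
    """Date and time of a planned visit plus related details (hypothetical fields)."""
    customer_id: str
    visit_at: datetime
    details: dict = field(default_factory=dict)


@dataclass
class ResponseInfo:
    """Answers a customer gave to surveys and questions, keyed by question."""
    customer_id: str
    answers: dict = field(default_factory=dict)


@dataclass
class CallResult:
    """Result data obtained after one automated call."""
    customer_id: str
    connected: bool              # whether the call was successful
    customer_response: str = ""  # what the customer answered or selected


reservation = ReservationInfo("c-001", datetime(2025, 1, 11, 14, 0), {"party_size": 4})
result = CallResult("c-001", connected=True, customer_response="interested in campaign")
```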
Brief Explanation of Drawings
[0016]
[Figure 1] It is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which multiple emotions are mapped.
[Figure 10] It shows an emotion map to which multiple emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.
Embodiments for Carrying Out the Invention
[0017] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0018] First, the terms used in the following description will be explained.
[0019] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be one arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be one type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.
[0020] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.
[0021] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), magnetic tapes, and the like.
[0022] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0023] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."
[0024] [First Embodiment]
[0025] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0026] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0027] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0028] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0029] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0030] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0031] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0032] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0033] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0034] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0035] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with the specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0036] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0037] In an embodiment of this invention, the system operates according to the following procedure.
[0038] First, the user makes a reservation to visit the store through an online platform. During this process, they enter information such as the date and time, family composition, and products of interest in a reservation form, which is immediately saved to a database. Next, the terminal sends the user's input to the server.
[0039] The server activates a specific generative AI model to generate appropriate call content based on the received reservation and response information. This generative AI model is trained to generate customized call scripts tailored to the customer's characteristics. For example, if a user is planning a visit with their family, the AI model will create a call script that includes information about a "special family package."
[0040] The generated call content is sent from the server to an automated calling system. This system automatically dials the customer's phone number and plays back the generated call content as synthesized speech or a recording. The user receives this call and can listen to specific instructions or make choices in real time as needed.
[0041] After the call, the server analyzes the call results obtained from the automated calling system and uses generative AI to summarize the results. The summary includes, for example, whether the call was successful, the user's response, and the options selected. This summary information is then formatted for use in subsequent customer service.
[0042] Finally, the server notifies staff based on the analyzed data. These notifications appear on terminals and provide suggestions on how staff should interact with customers upon their arrival. This enables staff to provide high-quality customer service tailored to each customer's individual needs.
[0043] Operating in this way, the system can improve the efficiency and accuracy of traditional customer service processes. For example, if a user expresses interest in a campaign, staff can use that information to make specific suggestions, thus improving customer satisfaction.
[0044] The following describes the processing flow.
[0045] Step 1:
[0046] Users make reservations to visit the store via an online platform. The date and time entered in the reservation form, family composition, and information about products of interest are checked by the reservation system's terminal and sent to the server.
[0047] Step 2:
[0048] The server stores the received reservation information in a database and sends the stored data to a generative AI model. This AI model is trained to generate call content optimized for the customer based on the collected information.
[0049] Step 3:
[0050] The server delivers the call content received from the generative AI model to the automated calling system. This call content is customized according to the customer's specific needs, and the AI incorporates appropriate guidance and suggestions.
[0051] Step 4:
[0052] The automated calling system uses the call content received from the server to automatically dial the registered customer's phone number. The user receives this automated call and listens to the information provided.
[0053] Step 5:
[0054] After the call ends, the server collects call result data from the automated calling system. This includes whether the call was successful, the user's response, and the options selected.
[0055] Step 6:
[0056] The server analyzes the call results and uses a generative AI to create a summary, picking out meaningful information. This summary includes key points that should be used for future actions.
[0057] Step 7:
[0058] The summarized data is sent from the server to the staff's terminals. This information is displayed on the terminals, allowing staff to understand the specific suggestions needed to assist customers during their visits.
[0059] Step 8:
[0060] Staff use the suggestions displayed on their terminals to serve each customer, which lets them offer a higher-quality service experience.
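As a non-limiting illustration, Steps 2 to 7 above can be sketched as a single function that chains the individual components. This Python sketch is an assumption made for explanation only: the function name run_pre_visit_flow, the dictionary keys, and the stand-in lambdas are hypothetical, and a real system would pass in the generative AI model, the automated calling system, and the staff notification function instead.

```python
from typing import Callable


def run_pre_visit_flow(
    reservation: dict,
    generate_script: Callable[[dict], str],
    place_call: Callable[[str, str], dict],
    summarize: Callable[[dict], str],
    notify_staff: Callable[[str], None],
) -> str:
    """Run Steps 2 to 7 for one reservation and return the summary sent to staff."""
    script = generate_script(reservation)                    # Steps 2-3: call content
    call_result = place_call(reservation["phone"], script)   # Steps 4-5: call and result
    summary = summarize(call_result)                         # Step 6: summary
    notify_staff(summary)                                    # Step 7: staff notification
    return summary


# Stand-in components so the sketch runs without any external service.
sent_to_staff = []
summary = run_pre_visit_flow(
    {"phone": "+810000000000", "family": "family of four"},
    generate_script=lambda r: f"Hello. We look forward to welcoming your {r['family']}.",
    place_call=lambda phone, script: {"connected": True, "selected": "campaign"},
    summarize=lambda res: f"connected={res['connected']}; selected={res['selected']}",
    notify_staff=sent_to_staff.append,
)
```

Passing the components in as arguments keeps the flow itself independent of any particular AI model or telephony service.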
[0061] (Example 1)
[0062] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0063] Traditional customer service processes face challenges in generating personalized call content and improving accuracy and efficiency in subsequent customer interactions. In particular, standardized call scripts may not adequately address the need for flexible responses tailored to individual customer needs. Furthermore, manual analysis of call results and summarization of information increases workload and slows response times.
[0064] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0065] In this invention, the server includes means for collecting customer reservation information, means for creating personalized call content using a generative AI model based on the collected information, and means for analyzing the outcome of the call and summarizing the key points. This enables the rapid generation of personalized call content and improved accuracy in customer service.
[0066] "Customer reservation information" refers to information provided by customers when they make a reservation to visit a store or use a service, including the planned date, time, number of people, and products or services they are interested in.
[0067] "Means of data collection" refers to systems that store information entered by customers on online platforms, etc., in databases or other similar systems.
[0068] A "generative AI model" is an artificial intelligence model that automatically generates personalized call scripts based on diverse customer information.
[0069] "Personalized call content" refers to a customized call script tailored to each customer's characteristics and needs.
[0070] "Means of creation" refers to a system that utilizes generative AI models to design and generate personalized call content.
[0071] "Means for executing automated calls" refers to a system that, based on the generated call content, makes a phone call to the customer and transmits information using synthesized or recorded voice.
[0072] "Means for analyzing call outcomes" refers to a system that collects call results as data and analyzes information such as customer reactions and selected options.
[0073] "Methods for summarizing key points" refers to the process of extracting important information from analyzed call data and summarizing it concisely.
[0074] A "means for providing customer service suggestions" is a system that provides staff with specific suggestions on how to interact with customers, based on summarized information.
[0075] "Means of communicating with the person in charge" refers to a system that notifies staff members of the details of the proposal to the customer via their terminals and issues instructions so that they can respond appropriately based on the information received.
[0076] "Means of providing to customers through audio output devices" refers to the process of using devices such as speakers and telephone receivers to deliver the generated call content to the customer as audio.
[0077] This invention is a system that streamlines and improves the accuracy of customer service. This system is achieved by generating call scripts tailored to individual customer needs based on online customer reservation information and incorporating them into automated calls.
[0078] First, the user accesses the online booking platform and enters the necessary booking information. This includes information such as the planned visit date, time, number of people, and products or services of interest. This information is then stored in the system's database via the user's device.
[0079] Next, the terminal sends the user-entered information to the server in an appropriate data format such as JSON. The server receives this information and activates a generative AI model to generate a personalized call script. For example, a high-performance AI model utilizing natural language processing technology is used as the generative AI model. This model has the ability to generate a script that is appropriate for a specific situation when given a prompt.
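As a non-limiting illustration of the JSON exchange described above, the following Python sketch serializes the form input on the terminal side and validates it on the server side. The field names (visit_date, visit_time, party_size, interests) and both function names are hypothetical; the disclosure does not specify a schema.

```python
import json

REQUIRED_FIELDS = {"visit_date", "visit_time", "party_size", "interests"}


def encode_reservation(visit_date: str, visit_time: str, party_size: int, interests: list) -> str:
    """Terminal side: serialize the form input as a JSON string."""
    payload = {
        "visit_date": visit_date,
        "visit_time": visit_time,
        "party_size": party_size,
        "interests": interests,
    }
    return json.dumps(payload, ensure_ascii=False)


def decode_reservation(body: str) -> dict:
    """Server side: parse the JSON body and check that required keys are present."""
    data = json.loads(body)
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data


body = encode_reservation("2025-01-11", "14:00", 4, ["new arrivals"])
reservation = decode_reservation(body)
```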
[0080] The generated script is sent from the server to the automated calling system. The automated calling system uses synthesized speech technology to convert the script into speech and automatically dials the customer's phone number. The call is placed to the user's registered phone number, and the customer receives automated guidance.
[0081] After the call, the server analyzes the call data obtained from the automated calling system and uses generative AI to summarize the results. The summary includes the success or failure of the call, the options selected, and the user's responses, which helps staff prepare appropriate responses for the customer. The summary information is then transmitted back to the terminal and displayed as customer support instructions.
[0082] For example, if a user has booked a family visit for the weekend, the script generation process can include a suggestion for a "special family package." This ensures the call is tailored to the user's needs and provides a better customer experience. An example of a prompt would be: "Generate a call script for a customer planning a family visit. Please include detailed information about the special family package."
[0083] This system makes it possible to expedite and improve the accuracy of individualized responses, which was difficult with conventional methods.
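As a non-limiting illustration, the prompt given as an example above could be assembled from the reservation information as follows. The function name build_script_prompt and the keys party_type and interests are hypothetical; only the wording of the family-visit prompt is taken from the example in this description.

```python
def build_script_prompt(reservation: dict) -> str:
    """Compose a prompt for the generative AI model from reservation details."""
    family = reservation.get("party_type") == "family"
    visit = "family visit" if family else "visit"
    parts = [f"Generate a call script for a customer planning a {visit}."]
    if family:
        parts.append("Please include detailed information about the special family package.")
    for item in reservation.get("interests", []):
        parts.append(f"Mention products or services related to: {item}.")
    return " ".join(parts)


prompt = build_script_prompt({"party_type": "family", "interests": []})
```

The resulting string would then be passed to whichever generative AI model the server uses; that call is omitted here because the disclosure does not name a specific model or API.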
[0084] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0085] Step 1:
[0086] The user accesses the online platform and enters their reservation information.
[0087] Input: Planned visit date, family composition, product information of interest, etc.
[0088] The user enters these details into the reservation form and clicks the submit button. This action saves the entered information to the database.
[0089] Step 2:
[0090] The terminal sends the input information to the server in a data format (such as JSON).
[0091] Input: Reservation information entered by the user.
[0092] Output: JSON formatted data sent to the server.
[0093] The device uses its transmission function to send the collected information to the API endpoint on the server side.
[0094] Step 3:
[0095] The server activates an AI model based on the information it receives, and generates a call script according to the prompt.
[0096] Input: User reservation information (JSON data).
[0097] Output: Personalized call script.
[0098] The server prompts the AI model to generate a script tailored to the user's characteristics. For example, in the case of a family visit, a script is created that includes details of a special package.
[0099] Step 4:
[0100] The server sends the generated call script to the automated calling system.
[0101] Input: A call script created by a generative AI model.
[0102] Output: Synthesized voice or recorded data used by the automated calling system.
[0103] The server converts the call script into appropriate audio data and provides it to the automated calling system.
[0104] Step 5:
[0105] The user receives a call from an automated calling system and interacts with various options.
[0106] Input: Phone call from the server.
[0107] Output: User selections, responses, and feedback.
[0108] The user answers the call and uses push buttons to select options of interest. This information is returned to the server.
[0109] Step 6:
[0110] The server analyzes the call results and uses a generative AI model to summarize them.
[0111] Input: Call logs and user selection data.
[0112] Output: Summarized call results.
[0113] The server analyzes the call content, extracts key points, and creates a summary.
[0114] Step 7:
[0115] The server uses the summary information to generate response instructions for staff and notifies their terminals.
[0116] Input: Summary of call results.
[0117] Output: Action instructions displayed to staff.
[0118] The terminal displays specific instructions on how to interact with the customer, and the staff member uses these instructions to handle the customer interaction.
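As a non-limiting illustration of Step 5 above, the push-button input returned by the call can be converted into a call-result record before it is analyzed in Step 6. The menu contents, the function name record_call_result, and the record keys are hypothetical assumptions made for this sketch.

```python
# Hypothetical menu: which push-button digit stands for which option.
MENU = {"1": "special family package", "2": "new arrivals", "9": "no interest"}


def record_call_result(connected: bool, digits: str) -> dict:
    """Turn the raw outcome of an automated call into a call-result record."""
    return {
        "connected": connected,
        # Options the customer chose, in the order the buttons were pressed.
        "selected_options": [MENU[d] for d in digits if d in MENU],
        # Key presses that do not correspond to any menu entry.
        "unrecognized_input": [d for d in digits if d not in MENU],
    }


result = record_call_result(True, "1#2")
```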
[0119] (Application Example 1)
[0120] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0121] Traditional appointment booking systems have had drawbacks, such as insufficient guidance tailored to individual customer characteristics and inadequate provision of personalized services. Furthermore, their guidance on special plans and campaigns is generally generic, making it difficult to provide information specific to customers' interests and circumstances. Against this backdrop, there is a need for a means to improve customer satisfaction and achieve more efficient staff response.
[0122] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0123] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for analyzing the call results and summarizing important points, means for making customer response suggestions based on the summarized information, means for generating customized guidance based on customer characteristics, and means for providing information based on the guidance to staff. This makes it possible to provide customers with individualized guidance and special plans, thereby improving customer satisfaction.
[0124] "Customer reservation information" refers to information entered by users through an online platform, such as the date and time of their visit, family composition, and products they are interested in.
[0125] "Response information" refers to information that shows responses such as survey responses and feedback obtained from customers.
[0126] "Optimal call content" refers to individualized and customized conversation scripts generated based on the customer's characteristics and needs.
[0127] "Means of making automated calls" refers to a mechanism that uses auto-call technology to automatically initiate pre-generated calls to customers.
[0128] "Call results" refer to the outcomes and information obtained after a call, such as whether the call was successful, the customer's reaction, and the options selected.
[0129] "Customer service suggestions" refer to providing store staff with instructions and advice based on analyzed call results to help them respond appropriately to customers.
[0130] "Customized guidance" refers to the provision of information and offers that are specifically tailored to the customer's characteristics and interests.
[0131] "Information based on guidance" refers to information that utilizes generated, customized guidance and is intended to be provided to customers.
[0132] To implement this invention, the system has the following functions: The server receives reservation and response information entered by the user on their smartphone and stores it in a database. By using MongoDB for the database, efficient management of the information is possible.
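As a non-limiting illustration, storing the reservation and response information could look like the following Python sketch. To keep the sketch runnable without a database, it uses a small in-memory stand-in that mimics the insert_one and find_one methods of a PyMongo collection; in an actual deployment a real collection object would be passed to save_reservation instead. The function name and document layout are hypothetical.

```python
class InMemoryCollection:
    """Stand-in that mimics the insert_one / find_one calls of a PyMongo collection."""

    def __init__(self):
        self._docs = []

    def insert_one(self, document: dict) -> None:
        self._docs.append(dict(document))

    def find_one(self, query: dict):
        # Return the first document whose fields equal every field in the query.
        for doc in self._docs:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None


def save_reservation(collection, reservation: dict, responses: dict) -> None:
    """Store reservation and response information together as one document."""
    collection.insert_one({**reservation, "responses": responses})


reservations = InMemoryCollection()
save_reservation(
    reservations,
    {"customer_id": "c-001", "family": "family of four"},
    {"interest": "new arrivals"},
)
stored = reservations.find_one({"customer_id": "c-001"})
```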
[0133] The server uses a generative AI model to generate optimal call content based on the collected information. This generative AI model is built in Python and generates customized scripts tailored to the customer's characteristics. The generated call script is sent to the automated calling system using the Twilio API, and the call is automatically made to the customer. The call content is played back as synthesized speech.
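As a non-limiting illustration, handing the script to Twilio could be sketched as follows. The sketch wraps the script in a TwiML Say element and passes it to the calls.create method of a client object, which is assumed here to be a twilio.rest.Client created elsewhere with valid credentials. The helper names build_twiml and place_call are hypothetical, and the disclosure does not state which Twilio features are used.

```python
import xml.etree.ElementTree as ET


def build_twiml(script: str) -> str:
    """Wrap the call script in TwiML so it is read out as synthesized speech."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = script  # special characters are escaped
    return ET.tostring(response, encoding="unicode")


def place_call(client, to: str, from_: str, script: str):
    """Request an outbound call; `client` is assumed to be a twilio.rest.Client."""
    return client.calls.create(to=to, from_=from_, twiml=build_twiml(script))


twiml = build_twiml("Hello & welcome. We have new arrivals for families.")
```

Building the TwiML with an XML library rather than string concatenation ensures that characters such as "&" in a generated script do not produce invalid markup.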
[0134] After the call ends, the server analyzes the results and summarizes the key points. This summarized information is then formatted as suggestions for how staff can appropriately respond to the customer and sent to their terminals. This notification system allows staff to improve the quality of service they provide.
[0135] As a concrete example, in one apparel store, when a family makes an appointment to visit, a generative AI model prepares a call informing them about the latest collections and special offers for families. This information is communicated to the customer in advance, and staff are notified so they can provide the best possible service based on the customer's interests.
[0136] An example of a prompt message would be: "This user has entered 'family of four' as their family structure and indicated interest in 'new arrivals' for the day of their visit. What call content would be appropriate?"
[0137] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0138] Step 1:
[0139] Users enter their reservation and response information using a smartphone application. This information includes the date and time of visit, products of interest, and family composition. The entered information is sent from the device to the server and stored in a MongoDB database. This enables centralized management of reservation information.
[0140] Step 2:
[0141] The server retrieves information stored in the database and activates the generative AI model. Using reservation and response information as input, the model creates an optimized call script. Specifically, it uses a Python®-based text generation algorithm to generate a customized script tailored to the customer's characteristics. The output is the generated call content.
[0142] Step 3:
[0143] The server sends the generated call script to the automated calling system via the Twilio API. The automated calling system automatically dials the customer's phone number and plays back the call using synthesized speech. The input here is the generated call script, and the output is the execution of the call to the customer.
[0144] Step 4:
[0145] Once a call ends, the server retrieves the call results from the automated calling system and analyzes them. Using a specific analysis algorithm, it extracts information such as the success or failure of the call, customer responses, and selected options. This analysis result is formatted as a summary and used as input for the next step.
[0146] Step 5:
[0147] The server generates customer service suggestions based on the summarized call results. These suggestions are then sent to staff members' terminals. The notifications include information on special offers and customer service tips, providing guidance for staff to tailor their responses to each customer. The input is the analysis results, and the output is the notification to the staff.
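As a non-limiting illustration of Steps 4 and 5 above, the following Python sketch extracts the success or failure of the call, the customer responses, and the selected options, and then turns them into suggestions for staff. It uses simple rules as a stand-in for the analysis algorithm, which the disclosure does not specify. The call-log keys, the "completed" status value, and the wording of the suggestions are assumptions.

```python
def analyze_call(call_log: dict) -> dict:
    """Extract success or failure, customer responses, and selected options."""
    return {
        "success": call_log.get("status") == "completed",
        "responses": call_log.get("responses", []),
        "selected_options": call_log.get("selected_options", []),
    }


def suggest_for_staff(analysis: dict) -> list:
    """Turn the analysis into short customer service suggestions."""
    if not analysis["success"]:
        return ["Call did not connect: confirm the reservation details on arrival."]
    suggestions = [
        f"Customer chose '{option}': prepare related information."
        for option in analysis["selected_options"]
    ]
    return suggestions or ["No option selected: ask about interests at reception."]


analysis = analyze_call({"status": "completed", "selected_options": ["new arrivals"]})
suggestions = suggest_for_staff(analysis)
```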
[0148] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0149] In an embodiment of this invention, the system operates according to the following procedure.
[0150] Users first make a reservation through an online platform. This includes entering information about the date and time of their visit, family composition, and products they are interested in. This information is transmitted to the server via the terminal and stored in a database.
[0151] The server uses the stored reservation information to run a generative AI model and generate optimal call content. This AI model considers not only the ordinary reservation information but also the user's emotional state, in conjunction with an emotion engine. The emotion engine estimates the user's emotions from past data and input information and uses the results to customize the call content. For example, if the user is showing anxiety, the generated call content will include suggestions to reassure them.
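The anxiety example above may be illustrated by the following minimal Python sketch. The keyword list, the function names, and the reassurance sentence are hypothetical; a keyword match merely stands in for the emotion engine, whose internals the disclosure does not specify.

```python
# Hypothetical keyword list; an actual emotion engine would use a trained model.
ANXIETY_WORDS = ("worried", "nervous", "anxious", "first time")


def estimate_emotion(free_text: str) -> str:
    """Return 'anxious' if any anxiety keyword appears, otherwise 'neutral'."""
    lowered = free_text.lower()
    return "anxious" if any(word in lowered for word in ANXIETY_WORDS) else "neutral"


def build_call_content(name: str, visit_date: str, free_text: str) -> str:
    """Compose call content, adding a reassuring line for anxious users."""
    lines = [f"Hello {name}, this is a reminder of your visit on {visit_date}."]
    if estimate_emotion(free_text) == "anxious":
        lines.append("Our staff will guide you from arrival, so please feel at ease.")
    return " ".join(lines)
```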
[0152] The generated call content is sent from the server to the auto-dialing system. The auto-dialing system automatically dials the registered user's phone number and plays the generated call content back through an audio output device. The user can listen to this call and choose options as needed.
[0153] After a call ends, the server retrieves call result data. This data includes the user's emotion recognition results from the emotion engine, which are then analyzed to record changes in the user's emotions. Furthermore, based on this analysis, particularly important points are extracted and a summary is created.
[0154] The summarized information is sent from the server to the staff's terminal. The terminal receives this information and provides the staff with more personalized customer service suggestions tailored to the user's mood. For example, if the user appears happy, it is recommended to prioritize providing information about special promotions or discounts for their next visit.
[0155] This approach allows for more personalized customer service, enabling responses tailored to each user's different emotional state. As a result, the customer experience is improved, and deeper relationships are built.
[0156] The following describes the processing flow.
[0157] Step 1:
[0158] Users make appointments to visit the store using an online platform. During the booking process, they answer questionnaires regarding the date and time of their visit, family composition, and products of interest. This data is transmitted in real time via the device to a server and stored in a database.
[0159] Step 2:
[0160] The server processes the stored reservation information and survey responses and uses them to launch a generative AI model. This model also acquires data for the emotion engine, which is responsible for estimating the user's emotional state. The emotion data includes latent emotional information that can be inferred from the text.
[0161] Step 3:
[0162] The server uses the results of the generative AI model and emotion engine to generate the most appropriate call content. For example, if the AI determines that the user is nervous, it will create reassuring messages such as, "We have prepared an environment where you can relax and enjoy yourself."
[0163] Step 4:
[0164] The generated call content is sent from the server to the auto-dialing system. This system automatically dials the registered user's phone number and plays back the call content as synthesized speech or a recording.
[0165] Step 5:
[0166] The user receives a call from the automated calling system and listens to the information provided. After listening, if the user makes a selection, that selection is also recorded.
[0167] Step 6:
[0168] After a call ends, the server receives the call results from the auto-call system and performs analysis. This analysis includes detailed information about the user's choices and emotional changes.
[0169] Step 7:
[0170] Using the analysis results, the server uses a generative AI to perform summarization and extract important information based on the user's emotions and choices.
[0171] Step 8:
[0172] The summarized information is sent from the server to the staff member's terminal. The terminal displays specific suggestions to help the staff member provide better service at their next interaction with the user. For example, it might say, "The user has shown interest in a new product line. Please introduce that product to them when they visit the store."
[0173] In this way, the entire system works together, making it possible to provide more personalized and emotionally resonant customer service.
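Steps 6 and 7 of the above flow, in which the user's choices and emotional changes are analyzed and summarized, may be sketched as follows. The emotion labels, the ordinal scores, and the `CallAnalysis` structure are assumptions made for illustration only, and a fixed template stands in for the summarization by the generative AI.

```python
from dataclasses import dataclass

# Hypothetical ordinal scale used to compare emotion labels.
EMOTION_SCORE = {"anxious": -1, "neutral": 0, "happy": 1}


@dataclass
class CallAnalysis:
    selections: list[str]
    emotion_before: str
    emotion_after: str

    @property
    def emotion_shift(self) -> int:
        """Positive when the user's mood improved over the call."""
        return EMOTION_SCORE[self.emotion_after] - EMOTION_SCORE[self.emotion_before]


def summarize_for_staff(analysis: CallAnalysis) -> str:
    """Summarize the emotional change and the user's selections in one line."""
    shift = analysis.emotion_shift
    trend = "improved" if shift > 0 else "worsened" if shift < 0 else "unchanged"
    chosen = ", ".join(analysis.selections) or "none"
    return (
        f"Mood {trend} ({analysis.emotion_before} -> {analysis.emotion_after}); "
        f"selections: {chosen}."
    )
```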
[0174] (Example 2)
[0175] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0176] Traditional customer management systems have struggled to provide personalized services that take customer emotions into consideration. Furthermore, the generation of call content and subsequent responses were generic and uniform, resulting in insufficient responses tailored to the individual circumstances and emotions of each user, thus limiting improvements in customer satisfaction.
[0177] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0178] In this invention, the server includes means for acquiring and storing user reservation data, means for using a generative AI model that generates optimal call content based on the acquired data and past emotion analysis data, and means for inputting generated prompt sentences into the generative AI model and customizing the call content in cooperation with the emotion engine. This enables the generation of optimized call content that takes the user's emotions into consideration, and personalized responses based on that content.
[0179] "User" refers to a person who uses the system to receive services such as making reservations or making phone calls.
[0180] "Reservation data" refers to information related to the planned date and time of visit and products of interest entered by the user within the system.
[0181] A "generative AI model" refers to artificial intelligence technology that uses natural language processing based on input data to generate optimal call content.
[0182] A "prompt message" refers to a string of characters given to a generative AI model to instruct it on what content it should generate.
[0183] An "emotion engine" refers to an algorithm that estimates a user's emotional state based on their past data and current input information, and outputs the result.
[0184] A "voice output mechanism" refers to a device or technology that reproduces the generated call content as audio and provides it to the user.
[0185] "Automated calling" refers to a function that executes a call based on call content generated by the system, without direct human intervention.
[0186] "Summary information" refers to information that is compressed and represented by extracting important parts based on analysis of call results and acquired data.
[0187] "Staff" refers to individuals within an organization who provide services to users and receive support from the system.
[0188] This invention utilizes an integrated system in which a server, terminal, and user work together. The following details how each element collaborates to implement the invention.
[0189] Users enter reservation data through an online platform using an internet-connected device. During this process, users provide information such as their planned visit date and time, family composition, and products of interest. This information is transmitted to the server via the device. For security reasons, encryption protocols are used to protect the data during this process. The server stores the received reservation data in a database.
[0190] The server activates a generative AI model based on information stored in the database. This generative AI model estimates the user's emotional state through an emotion engine that works in conjunction with the input data. By using generative prompts, more personalized call content is generated. For example, a specific prompt such as "Generate a welcome message for this user that includes information about new products" can be used.
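As one possible illustration, the prompt described above may be assembled from the reservation data and the estimated emotional state as in the following Python sketch. The dictionary keys and the layout of the prompt are assumptions; only the instruction sentence is taken from the example above.

```python
def build_prompt(reservation: dict, emotion: str) -> str:
    """Embed reservation data and the estimated emotion in a prompt."""
    products = ", ".join(reservation["products_of_interest"])
    return (
        "Generate a welcome message for this user that includes "
        "information about new products.\n"
        f"Visit date and time: {reservation['visit_datetime']}\n"
        f"Family composition: {reservation['family_composition']}\n"
        f"Products of interest: {products}\n"
        f"Estimated emotional state: {emotion}"
    )
```

Keeping the instruction and the data on separate lines makes it straightforward to log the exact prompt that was input into the generative AI model.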
[0191] The generated call content is sent from the server to an automated call system. This system utilizes an audio output mechanism to automatically dial the registered user's phone number. The user can receive this call in audio format and make appropriate selections or responses based on its content.
[0192] After the call ends, the server automatically processes the results and analyzes the user's emotional changes using an emotion engine. Summary information is extracted from this analysis, and based on this, personalized suggestions for user interaction are created. These suggestions are then communicated via the terminal to staff, who can then serve the user accordingly. For example, if a user showed interest in a new menu item, it might be recommended to offer a tasting coupon for their next visit.
[0193] Thus, in order to implement this invention, the entire system must be highly integrated, and each component must work together seamlessly. As a result, it becomes possible to provide personalized services that take into account the user's emotions and significantly improve customer satisfaction.
[0194] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0195] Step 1:
[0196] Users access the online platform via the internet using their devices and enter their reservation information. This information includes the planned date and time of visit, family composition, and products of interest. This entered data is converted into packets on the device and sent to the server via a secure connection.
[0197] Step 2:
[0198] The server receives reservation data sent from the terminal and stores it in the database. A database management system (DBMS) operates during this process, ensuring the data is stored in the appropriate tables and fields. Transactional processing maintains data integrity.
[0199] Step 3:
[0200] The server extracts reservation information stored in the database and activates a generative AI model. The reservation information and the emotion estimate from the emotion engine are included in the prompt text and input into the AI model. The prompt text includes instructions such as "Generate content introducing new products to visiting customers." Based on these inputs, the AI model generates the optimal call content and outputs text data.
[0201] Step 4:
[0202] The server sends the generated call content to the automated calling system. Using VoIP technology, the server automatically places a call to the user's registered phone number. The audio output mechanism converts the text data of the call content into audio format and plays it back to the user.
[0203] Step 5:
[0204] After a call ends, the server automatically collects the call's results. It then utilizes the emotion engine again to analyze the user's emotional changes during the call. Based on this, it extracts key points and generates a summary. The analysis data and summary are output in digital format.
[0205] Step 6:
[0206] The server sends the generated summary information and analysis results to the terminal used by the staff. The terminal then notifies the staff of personalized response suggestions that take the user's emotions into consideration. For example, this might include a suggestion such as, "If the user shows interest in the new menu, offer a tasting coupon on their next visit."
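The transactional storage described in Step 2 of this flow may be sketched with Python's standard `sqlite3` module as follows. The table name, the column names, and the use of SQLite itself are assumptions made for illustration; the disclosure does not name a particular DBMS.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) reservations table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reservations ("
        "id INTEGER PRIMARY KEY, "
        "user_id TEXT NOT NULL, "
        "visit_datetime TEXT NOT NULL, "
        "family_composition TEXT, "
        "product TEXT)"
    )
    return conn


def store_reservation(conn, user_id, visit_datetime, family_composition, product) -> int:
    """Insert one reservation inside a transaction and return its row id."""
    # Used as a context manager, the connection commits on success
    # and rolls back if the insert raises an exception.
    with conn:
        cur = conn.execute(
            "INSERT INTO reservations "
            "(user_id, visit_datetime, family_composition, product) "
            "VALUES (?, ?, ?, ?)",
            (user_id, visit_datetime, family_composition, product),
        )
    return cur.lastrowid
```

Parameter placeholders (`?`) are used so that user-entered values are never interpolated into the SQL text.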
[0207] (Application Example 2)
[0208] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0209] Currently, customer service tends to be uniform, making it difficult to provide services that cater to the individual emotions and needs of each customer. Furthermore, recommendations for products and services in stores are not commonly offered, highlighting the need for improved customer experience.
[0210] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0211] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for estimating the customer's emotional state and adjusting the call content based on the estimated emotional state, and means for suggesting recommended products and services at the store. This makes it possible to provide personalized services that correspond to each customer's specific emotional state, thereby improving the customer experience and satisfaction.
[0212] "Customer reservation information" refers to information about the date, time, and content of a reservation that a customer provides when booking a service or product.
[0213] "Response information" refers to information provided by customers by answering questions or surveys related to the service.
[0214] "Optimal call content" refers to call content that is tailored to the customer's needs and emotional state, and is designed to effectively convey information.
[0215] "Automated calling" is a process that mechanically makes phone calls based on pre-set call content and transmits information.
[0216] "Call results" refer to data obtained after a call with a customer, such as the content of the conversation and the customer's reactions.
[0217] A "customer service proposal" is a suggestion that outlines the optimal way to respond to a customer based on their needs and emotions.
[0218] "Emotional state" refers to the psychological and sensory responses that a customer exhibits in a particular situation.
[0219] "Means of estimation" refers to methods of analyzing data to determine the emotional state and circumstances of customers.
[0220] "Recommended products and services in stores" are products and services offered to customers specifically with the intention of promoting sales.
[0221] To implement this invention, the main components of the system are a server, a terminal, and a user. The server first stores customer reservation information and response information in a database. Reservation information includes the date and time of visit and family composition, while response information is information about products of interest and emotions provided by the customer in advance.
[0222] The server uses a generative AI model to generate optimal call content based on this information. The generative AI model employs natural language processing technologies such as OpenAI® GPT to customize call content according to the information and the user's emotional state. To construct call content that provides greater reassurance to the customer based on the estimated emotional state, an emotion engine utilizing Microsoft® Azure® Emotion API and other technologies is applied.
[0223] The generated call content is delivered to the user's phone number via an auto-dialing system, which dials the number automatically. This auto-dialing system utilizes services such as Twilio and Nexmo, and the call content is played through an audio output device. The user receives this call and customizes the service by selecting options.
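Where Twilio is used, the call content is typically supplied to the service as TwiML, an XML document in which a `<Say>` element is read aloud by synthesized speech. The following sketch builds such a document with the Python standard library only; the function name and the default language are assumptions, and no Twilio account or network access is involved.

```python
import xml.etree.ElementTree as ET


def build_twiml(call_content: str, language: str = "en-US") -> str:
    """Wrap call content in a TwiML <Response><Say> document."""
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say", {"language": language})
    # ElementTree escapes characters such as & and < when serializing.
    say.text = call_content
    return ET.tostring(response, encoding="unicode")
```

With Twilio's Python helper library, a string of this form could, for example, be passed as the `twiml` argument of `client.calls.create(to=..., from_=..., twiml=...)`; credentials and phone numbers depend on the deployment and are omitted here.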
[0224] Once the call ends, the server receives the call results and summarizes the key points. This summarized information is sent to the staff member's terminal, where personalized customer service suggestions are made. For example, if the estimated emotional state was positive, a special discount for the next visit can be offered.
[0225] For example, if a user makes a reservation to visit a theme park with their family on the weekend, the server will generate a call message based on the reservation information, including information about family plans and special events. An example of a prompt would be, "Please recommend family plans and suggest information about special events in an emotionally sensitive tone."
[0226] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0227] Step 1:
[0228] The server receives reservation and response information entered by users via an online platform. This information includes details such as the date and time of visit, family composition, and product interests. The server stores this input information in a database. The data format includes date and time information, text information, and choice information.
[0229] Step 2:
[0230] The server inputs information stored in the database into the generative AI model. During this process, the generative AI model compares past user data with current data to generate the optimal call content. Data processing includes organizing text information and preparing data for sentiment analysis. The output is a customized call script.
[0231] Step 3:
[0232] The server sends the generated call content to the emotion engine for analysis to estimate the user's emotional state. This engine uses the Microsoft Azure Emotion API to generate emotion tags from the input text information, providing insights into how the call content should be adjusted. The output is the emotion-adjusted call content.
[0233] Step 4:
[0234] The server sends the adjusted call content to the auto-dialing system. If the auto-dialing system uses the Twilio API, an audio file of the call content is generated, and the user's phone number is dialed automatically. The user listens to this call and selects their response according to the prompts. The output at this point is the user's selection information.
[0235] Step 5:
[0236] After the call ends, the server analyzes the call results based on the user's choices and emotional state, and summarizes the key points. This is done using a data summarization algorithm to extract information and generate a suggestion statement indicating what staff should actually prioritize. The output is summarized suggestion information.
[0237] Step 6:
[0238] The server sends summary information to staff terminals. This information is made available on a staff dashboard and used as part of customer service. Specifically, the terminals are configured to receive notifications in real time and display recommended actions.
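The data format mentioned in Step 1 of this flow (date and time information, text information, and choice information) may be represented, for example, by a typed record such as the following. The field names and the ISO 8601 input format are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ReservationRecord:
    visit_at: datetime                 # date and time information
    free_text: str                     # text information (e.g. survey comments)
    product_choices: tuple[str, ...]   # choice information


def parse_reservation(form: dict) -> ReservationRecord:
    """Validate raw form fields and convert them to typed values."""
    # fromisoformat raises ValueError for a malformed date and time.
    visit_at = datetime.fromisoformat(form["visit_at"])
    choices = tuple(
        c.strip() for c in form.get("product_choices", []) if c.strip()
    )
    return ReservationRecord(visit_at, form.get("free_text", "").strip(), choices)
```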
[0239] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0240] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.
[0241] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0242] [Second Embodiment]
[0243] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0244] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0245] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0246] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0247] The microphone 238 receives instructions from the user 20 by capturing the voice uttered by the user 20. The microphone 238 converts the captured voice into audio data and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0248] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0249] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0250] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0251] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0252] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0253] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0254] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0255] In an embodiment of this invention, the system operates according to the following procedure.
[0256] First, the user makes a reservation to visit the store through an online platform. During this process, they enter information such as the date and time, family composition, and products of interest in a reservation form, which is immediately saved to a database. Next, the terminal sends the user's input to the server.
[0257] The server activates a specific generative AI model to generate appropriate call content based on the received reservation and response information. This generative AI model is trained to generate customized call scripts tailored to the customer's characteristics. For example, if a user is planning a visit with their family, the AI model will create a call script that includes information about a "special family package."
[0258] The generated call content is sent from the server to an automated calling system. This system automatically dials the customer's phone number and plays back the generated call content as synthesized speech or a recording. The user receives this call and can listen to specific instructions or make choices in real time as needed.
[0259] After the call, the server analyzes the call results obtained from the auto-dialing system and uses generative AI to summarize the results. The summary includes, for example, whether the call was successful, the user's response, and the options selected. This summary information is then formatted for use in subsequent customer service.
[0260] Finally, the server notifies staff based on the analyzed data. These notifications appear on terminals and provide suggestions on how staff should interact with customers upon their arrival. This enables staff to provide high-quality customer service tailored to each customer's individual needs.
[0261] Operating in this way, the system can improve the efficiency and accuracy of conventional customer service processes. For example, if a user expresses interest in a campaign, staff can use that information to make specific suggestions, thus improving customer satisfaction.
[0262] The following describes the processing flow.
[0263] Step 1:
[0264] Users make reservations to visit the store via an online platform. The date and time entered in the reservation form, family composition, and information about products of interest are checked by the reservation system's terminal and sent to the server.
[0265] Step 2:
[0266] The server stores the received reservation information in a database and sends the stored data to a generative AI model. This AI model is trained to generate call content optimized for the customer based on the collected information.
[0267] Step 3:
[0268] The server delivers the call content received from the generative AI model to the auto-call system. This call content is customized according to the customer's specific needs, and the AI incorporates appropriate guidance and suggestions.
[0269] Step 4:
[0270] The automated calling system uses the call content received from the server to automatically dial the registered customer's phone number. The user receives this automated call and listens to the information provided.
[0271] Step 5:
[0272] After the call ends, the server collects call result data from the auto-call system. This includes whether the call was successful, the user's response, and the options selected.
[0273] Step 6:
[0274] The server analyzes the call results and uses a generative AI to create a summary, picking out meaningful information. This summary includes key points that should be used for future actions.
[0275] Step 7:
[0276] The summarized data is sent from the server to the staff's terminal. These pieces of information are displayed on the terminal, and the staff can grasp the specific proposals necessary for customer service when customers visit the store.
[0277] Step 8:
[0278] Based on the information displayed on the terminal, the staff can provide optimized customer service to customers who visit the store, giving them a higher-quality service experience.
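The order of the server-side steps above may be summarized by the following orchestration sketch. Every component is passed in as a function, so the sketch makes no assumption about which generative AI model or auto-call system is used; the parameter names and the `phone` key are hypothetical.

```python
from typing import Callable


def run_pipeline(
    reservation: dict,
    generate_script: Callable[[dict], str],
    place_call: Callable[[str, str], dict],
    summarize: Callable[[dict], str],
    notify_staff: Callable[[str], None],
) -> str:
    """Run the server-side steps in order for one stored reservation."""
    script = generate_script(reservation)               # Step 2: generate call content
    result = place_call(reservation["phone"], script)   # Steps 3 to 5: call and collect results
    summary = summarize(result)                         # Step 6: analyze and summarize
    notify_staff(summary)                               # Step 7: send to the staff terminal
    return summary
```

Passing the components as arguments also allows each step to be replaced by a stub when the flow is tested without placing real calls.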
[0279] (Example 1)
[0280] Next, Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".
[0281] In the conventional customer service process, there are problems such as difficulty in generating individualized call content and improving the accuracy and efficiency in subsequent customer service. In particular, when flexible responses according to the needs of each customer are required, a standardized call script may not be sufficient to meet the requirements. Furthermore, there is also a problem that the workload increases and the response speed decreases due to the manual analysis of call results and summary of information.
[0282] The specific processing by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0283] In this invention, the server includes means for accumulating information regarding customer reservations, means for creating individualized call content using a generative AI model based on the accumulated information, and means for analyzing the results of the call and summarizing the key points. As a result, it becomes possible to quickly generate individualized call content and improve the accuracy of customer service.
[0284] The "information regarding customer reservations" refers to information on the scheduled visit date, time, number of people, and interested products or services provided by the customer when making a reservation for a store visit or service use.
[0285] The "means of accumulating" is a system that stores information input by customers on an online platform or the like in a database or the like.
[0286] The "generative AI model" is an artificial intelligence model that automatically generates individualized call scripts based on diverse customer information.
[0287] The "individualized call content" is a customized call script according to the characteristics and needs of each customer.
[0288] The "means of creating" is a system that executes a process of designing and generating individualized call content by utilizing a generative AI model.
[0289] The "means of executing an automatic call" is a system that makes a phone call to a customer based on the generated call content and conveys information in a synthetic voice or a recorded voice.
[0290] The "means of analyzing the results of a call" is a system that collects call results as data and analyzes information such as customer reactions and selected options.
[0291] The "means of summarizing key points" is a process of extracting important information from the analyzed call data and summarizing it concisely.
[0292] The "means of making proposals for customer service" is a system that provides specific proposals to staff on how to handle customers based on the information summarizing the key points.
[0293] The "means of communicating to the person in charge" is a system that notifies the content of the proposal for the customer to the staff's terminal and gives instructions so that they can respond appropriately based on the received information.
[0294] The "means of providing to the customer through an acoustic output device" is a process of using devices such as speakers and telephone handsets to deliver the created call content to the customer as voice.
[0295] This invention is a system that streamlines and improves the accuracy of customer service. This system is achieved by generating call scripts tailored to individual customer needs based on online customer reservation information and incorporating them into automated calls.
[0296] First, the user accesses the online booking platform and enters the necessary booking information. This includes information such as the planned visit date, time, number of people, and products or services of interest. This information is then stored in the system's database via the user's device.
[0297] Next, the terminal sends the user-entered information to the server in an appropriate data format such as JSON. The server receives this information and activates a generative AI model to generate a personalized call script. For example, a high-performance AI model utilizing natural language processing technology is used as the generative AI model. This model has the ability to generate a script that is appropriate for a specific situation when given a prompt.
[0298] The generated script is sent from the server to the automated calling system. The automated calling system uses synthesized speech technology to convert the script into speech and automatically dials the customer's phone number. The call is placed to the user's registered phone number, and the customer receives automated guidance.
[0299] After the call, the server analyzes the call data obtained from the auto-call system and uses a generative AI model to summarize the results. The summary includes the success or failure of the call, the options selected, and the user's responses, which helps staff prepare appropriate responses for the customer. The summary information is then transmitted back to the terminal and displayed as customer support instructions.
[0300] As a specific example, when a user reserves a visit with their family on the weekend, a proposal for a "special package for families" can be included during script generation. This makes the content of the call align with the user's needs and enables the provision of a better customer experience. Also, examples of prompt sentences include "Please generate a call script for customers who are planning to visit with their family. In particular, please provide a detailed introduction about the special package for families."
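The prompt construction described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Reservation` fields and the `build_prompt` function are hypothetical names, and the rule that a family visit adds the special-package instruction simply mirrors the example prompt sentence in the preceding paragraph.

```python
from dataclasses import dataclass, field


@dataclass
class Reservation:
    # Hypothetical fields based on the reservation items listed in the description.
    visit_date: str
    party_type: str                      # e.g. "family", "solo"
    interests: list[str] = field(default_factory=list)


def build_prompt(reservation: Reservation) -> str:
    """Assemble a prompt sentence to hand to a generative AI model."""
    parts = [
        f"Please generate a call script for a customer visiting on {reservation.visit_date}."
    ]
    if reservation.party_type == "family":
        # Mirrors the example prompt sentence for family visits.
        parts.append(
            "The customer is planning to visit with their family. "
            "In particular, please provide a detailed introduction about "
            "the special package for families."
        )
    if reservation.interests:
        parts.append(
            "Products or services of interest: " + ", ".join(reservation.interests) + "."
        )
    return " ".join(parts)


example_prompt = build_prompt(Reservation("2025-01-11", "family", ["sofas"]))
print(example_prompt)
```

The resulting string would be passed to whichever generative AI model the server uses; the model call itself is omitted because the description does not specify it.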
[0301] This system makes individualized responses faster and more accurate, which was difficult to achieve with conventional methods.
[0302] The flow of the specific process in Example 1 will be described using Figure 11.
[0303] Step 1:
[0304] The user accesses the online platform and enters reservation information.
[0305] Input: Scheduled visit date, family composition, product information of interest, etc.
[0306] The user enters these details in the reservation form and presses the send button. By this operation, the input information is saved in the database.
[0307] Step 2:
[0308] The terminal sends the input information to the server in a data format (such as JSON).
[0309] Input: Reservation information entered by the user.
[0310] Output: Data in JSON format sent to the server.
[0311] The terminal uses the sending function to send the collected information to the API endpoint on the server side.
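As an illustration of Step 2, the reservation information could be serialized to JSON before it is sent to the server. The field names below are assumptions chosen to match the items listed in Step 1; the description does not define a schema, and the transport to the API endpoint is not shown.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ReservationForm:
    # Hypothetical schema; the description only lists the kinds of items entered.
    visit_date: str
    visit_time: str
    party_size: int
    family_composition: str
    interests: list[str]


def to_json_payload(form: ReservationForm) -> str:
    """Serialize the form on the terminal side."""
    return json.dumps(asdict(form), ensure_ascii=False)


def from_json_payload(payload: str) -> ReservationForm:
    """Rebuild the form on the server side from the received JSON."""
    return ReservationForm(**json.loads(payload))


form = ReservationForm("2025-01-11", "14:00", 4, "family of four", ["new arrivals"])
payload = to_json_payload(form)
print(payload)
```

`ensure_ascii=False` keeps non-ASCII text such as Japanese product names readable in the payload rather than escaping it.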
[0312] Step 3:
[0313] The server activates an AI model based on the information it receives, and generates a call script according to the prompt.
[0314] Input: User reservation information (JSON data).
[0315] Output: Personalized call script.
[0316] The server prompts the AI model to generate a script tailored to the user's characteristics. For example, in the case of a family visit, a script is created that includes details of a special package.
[0317] Step 4:
[0318] The server sends the generated call script to the auto-call system.
[0319] Input: A call script created by a generative AI model.
[0320] Output: Synthesized voice or recorded data used by the auto-call system.
[0321] The server converts the call script into appropriate audio data and provides it to the automated calling system.
[0322] Step 5:
[0323] The user receives a call from an automated calling system and interacts with various options.
[0324] Input: Phone call from the server.
[0325] Output: User selections, responses, and feedback.
[0326] The user answers the call and uses push buttons to select options of interest. This information is returned to the server.
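The push-button selections in Step 5 can be interpreted on the server with a simple lookup. The sketch below assumes a hypothetical menu that maps keypad digits to options; the actual options offered during the call are not specified in the description.

```python
# Hypothetical keypad menu; the real options depend on the generated call script.
MENU = {
    "1": "special package for families",
    "2": "new arrivals",
    "9": "speak to staff",
}


def interpret_digits(digits: str, menu: dict[str, str] = MENU) -> dict[str, list[str]]:
    """Split the pressed keys into recognized options and unrecognized keys."""
    selected = [menu[d] for d in digits if d in menu]
    unrecognized = [d for d in digits if d not in menu]
    return {"selected": selected, "unrecognized": unrecognized}


result = interpret_digits("12#")
print(result)
```

Keeping unrecognized keys separate lets the later analysis step distinguish a customer who chose nothing from one who pressed a key outside the menu.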
[0327] Step 6:
[0328] The server analyzes the call results and uses a generative AI model to summarize them.
[0329] Input: Call logs and user selection data.
[0330] Output: Summarized call results.
[0331] The server analyzes the call content, extracts key points, and creates a summary.
[0332] Step 7:
[0333] The server uses the summary information to generate response instructions for staff and notifies their terminals.
[0334] Input: Summary of call results.
[0335] Output: Action instructions displayed to staff.
[0336] The terminal displays specific instructions on how to interact with the customer, and the staff member uses these instructions to handle the customer interaction.
[0337] (Application Example 1)
[0338] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0339] Traditional appointment booking systems have had drawbacks, such as insufficient guidance tailored to individual customer characteristics and insufficient provision of personalized services. Furthermore, they generally focus on special plans and campaigns, making it difficult to provide information specific to customers' interests and circumstances. Against this backdrop, there is a need for a means to improve customer satisfaction and achieve more efficient staff response.
[0340] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0341] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for analyzing the call results and summarizing important points, means for making customer response suggestions based on the summarized information, means for generating customized guidance based on customer characteristics, and means for providing information based on the guidance to staff. This makes it possible to provide customers with individualized guidance and special plans, thereby improving customer satisfaction.
[0342] "Customer reservation information" refers to information entered by users through an online platform, such as the date and time of their visit, family composition, and products they are interested in.
[0343] "Response information" refers to information that shows responses such as survey responses and feedback obtained from customers.
[0344] "Optimal call content" refers to individualized and customized conversation scripts generated based on the customer's characteristics and needs.
[0345] "Means of making automated calls" refers to a mechanism that uses auto-call technology to automatically initiate pre-generated calls to customers.
[0346] "Call results" refer to the outcomes and information obtained after a call, such as whether the call was successful, the customer's reaction, and the options selected.
[0347] "Customer service suggestions" refer to providing store staff with instructions and advice based on analyzed call results to help them respond appropriately to customers.
[0348] "Customized guidance" refers to the provision of information and offers that are specifically tailored to the customer's characteristics and interests.
[0349] "Information based on guidance" refers to information that utilizes generated, customized guidance and is intended to be provided to customers.
[0350] To implement this invention, the system has the following functions: The server receives reservation and response information entered by the user on their smartphone and stores it in a database. By using MongoDB for the database, efficient management of the information is possible.
[0351] The server uses a generative AI model to generate optimal call content based on the collected information. This generative AI model is built in Python and generates customized scripts tailored to the customer's characteristics. The generated call script is sent to the auto-call system using the Twilio API, and the call is automatically made to the customer. The call content is played back as synthesized speech.
[0352] After the call ends, the server analyzes the results and summarizes the key points. This summarized information is then formatted as suggestions for how staff can appropriately respond to the customer and sent to their terminals. This notification system allows staff to improve the quality of service they provide.
[0353] As a concrete example, in one apparel store, when a family makes an appointment to visit, a generative AI model prepares a call informing them about the latest collections and special offers for families. This information is communicated to the customer in advance, and staff are notified so they can provide the best possible service based on the customer's interests.
[0354] An example of a prompt message would be: "This user has entered 'family of four' as their family structure and indicated interest in 'new arrivals' for the day of their visit. What call content would be appropriate?"
[0355] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0356] Step 1:
[0357] Users enter their reservation and response information using a smartphone application. This information includes the date and time of visit, products of interest, and family composition. The entered information is sent from the device to the server and stored in a MongoDB database. This enables centralized management of reservation information.
[0358] Step 2:
[0359] The server retrieves information stored in the database and activates a generative AI model. Using reservation and response information as input, the AI creates an optimized call script based on this data. Specifically, it uses a Python-based text generation algorithm to generate a customized script tailored to the customer's characteristics. The output is the generated call content.
[0360] Step 3:
[0361] The server sends the generated call script to the auto-dialing system via the Twilio API. The auto-dialing system automatically dials the customer's phone number and plays back the call using synthesized speech. The input here is the generated call script, and the output is the execution of the call to the customer.
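One way to hand a generated script to a Twilio-based auto-dialing system is to express it as TwiML, the XML instruction format that Twilio interprets during a call. The description states only that the Twilio API is used, so the use of TwiML here, the function name `script_to_twiml`, and the `/call-result` callback path are assumptions. The sketch builds the XML with the standard library and does not place a call; doing so would additionally require Twilio's client library and account credentials.

```python
import xml.etree.ElementTree as ET


def script_to_twiml(script: str, language: str = "en-US",
                    action_url: str = "/call-result") -> str:
    """Wrap a call script in TwiML that reads it aloud and collects one keypress."""
    response = ET.Element("Response")
    # <Gather> collects keypad input and posts it to the (hypothetical) action URL.
    gather = ET.SubElement(response, "Gather",
                           {"numDigits": "1", "action": action_url})
    # <Say> converts the text to synthesized speech in the given language.
    say = ET.SubElement(gather, "Say", {"language": language})
    say.text = script
    return ET.tostring(response, encoding="unicode")


script = "We have new arrivals & special offers for families. Press 1 to hear more."
twiml = script_to_twiml(script)
print(twiml)
```

Building the document with an XML library rather than string formatting ensures that characters such as `&` in the generated script are escaped correctly.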
[0362] Step 4:
[0363] Once a call ends, the server retrieves the call results from the auto-call system and analyzes them. Using a specific analysis algorithm, it extracts information such as the success or failure of the call, customer responses, and selected options. This analysis result is formatted as a summary and used as input for the next step.
[0364] Step 5:
[0365] The server generates customer service suggestions based on the summarized call results. These suggestions are then sent to staff members' terminals. The notifications include information on special offers and customer service tips, providing guidance for staff to tailor their responses to each customer. The input is the analysis results, and the output is the notification to the staff.
[0366] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform specific processing using the user's emotions.
[0367] In an embodiment of this invention, the system operates according to the following procedure.
[0368] Users first make a reservation through an online platform. This includes entering information about the date and time of their visit, family composition, and products they are interested in. This information is transmitted to the server via the terminal and stored in a database.
[0369] The server uses the stored reservation information to run a generative AI model and generate optimal call content. Working with an emotion engine, this AI model considers the user's emotional state in addition to the regular reservation information. The emotion engine estimates the user's emotions from past data and input information and uses the results to customize the call content. For example, if the user is showing anxiety, the generated call content will include suggestions to reassure them.
[0370] The generated call content is sent from the server to the auto-dialing system. The auto-dialing system automatically dials the registered user's phone number and plays the generated call content back through an audio output device. The user can listen to this call and choose options as needed.
[0371] After a call ends, the server retrieves call result data. This data includes the user's emotion recognition results from the emotion engine, which are then analyzed to record changes in the user's emotions. Furthermore, based on this analysis, particularly important points are extracted and a summary is created.
[0372] The summarized information is sent from the server to the staff's terminal. The terminal receives this information and provides the staff with more personalized customer service suggestions tailored to the user's mood. For example, if the user appears happy, it is recommended to prioritize providing information about special promotions or discounts for their next visit.
[0373] This approach allows for more personalized customer service, enabling responses tailored to each user's different emotional state. As a result, the customer experience is improved, and deeper relationships are built.
[0374] The following describes the processing flow.
[0375] Step 1:
[0376] Users make appointments to visit the store using an online platform. During the booking process, they answer questionnaires regarding the date and time of their visit, family composition, and products of interest. This data is transmitted in real time via the device to a server and stored in a database.
[0377] Step 2:
[0378] The server processes the stored reservation information and survey responses, and uses this to launch a generative AI model. This model also acquires data for the emotion engine and is responsible for estimating the user's emotional state. The emotion data includes latent emotional information that can be analyzed from the text.
[0379] Step 3:
[0380] The server uses the results of the generative AI model and emotion engine to generate the most appropriate call content. For example, if the AI determines that the user is nervous, it will create reassuring messages such as, "We have prepared an environment where you can relax and enjoy yourself."
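The adjustment in Step 3 can be illustrated with a deliberately simple stand-in for the emotion engine. The keyword matching below is an assumption made only so the example runs on its own; it is not the emotion identification model of this disclosure, and the cue words are hypothetical. The reassuring message is the one quoted in Step 3.

```python
REASSURANCE = "We have prepared an environment where you can relax and enjoy yourself."

# Hypothetical cue words; a real emotion engine would use a trained model.
ANXIETY_CUES = ("nervous", "worried", "anxious")


def estimate_emotion(text: str) -> str:
    """Toy stand-in for the emotion engine: label text as 'anxious' or 'neutral'."""
    lowered = text.lower()
    return "anxious" if any(cue in lowered for cue in ANXIETY_CUES) else "neutral"


def adjust_call_content(base_script: str, survey_text: str) -> str:
    """Append a reassuring message when the survey text suggests anxiety."""
    if estimate_emotion(survey_text) == "anxious":
        return f"{base_script} {REASSURANCE}"
    return base_script


adjusted = adjust_call_content(
    "Thank you for your reservation.",
    "I'm a little nervous because it is our first visit.",
)
print(adjusted)
```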
[0381] Step 4:
[0382] The generated call content is sent from the server to the auto-dialing system. This system automatically dials the registered user's phone number and plays back the call content as synthesized speech or a recording.
[0383] Step 5:
[0384] The user receives a call from the automated calling system and listens to the information provided. After listening, if the user makes a selection, that selection is also recorded.
[0385] Step 6:
[0386] After a call ends, the server receives the call results from the auto-call system and performs analysis. This analysis includes detailed information about the user's choices and emotional changes.
[0387] Step 7:
[0388] Using the analysis results, the server uses a generative AI to perform summarization and extract important information based on the user's emotions and choices.
[0389] Step 8:
[0390] The summarized information is sent from the server to the staff member's terminal. The terminal displays specific suggestions to help the staff member provide better service at their next interaction with the user. For example, it might say, "The user has shown interest in a new product line. Please introduce that product to them when they visit the store."
[0391] In this way, the entire system works together, making it possible to provide more personalized and emotionally resonant customer service.
[0392] (Example 2)
[0393] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0394] Traditional customer management systems have struggled to provide personalized services that take customer emotions into consideration. Furthermore, the generation of call content and subsequent responses were generic and uniform, resulting in insufficient responses tailored to the individual circumstances and emotions of each user, thus limiting improvements in customer satisfaction.
[0395] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0396] In this invention, the server includes means for acquiring and storing user reservation data, means for using a generative AI model that generates optimal call content based on the acquired data and past sentiment analysis data, and means for inputting generated prompt sentences into the generative AI model and customizing the call content in cooperation with the emotion engine. This enables the generation of optimized call content that takes the user's emotions into consideration, and personalized responses based on that content.
[0397] "User" refers to a person who uses the system to receive services such as making reservations or making phone calls.
[0398] "Reservation data" refers to information related to the planned date and time of visit and products of interest entered by the user within the system.
[0399] A "generative AI model" refers to artificial intelligence technology that uses natural language processing based on input data to generate optimal call content.
[0400] A "prompt message" refers to a string of characters given to a generative AI model to instruct it on what content it should generate.
[0401] An "emotion engine" refers to an algorithm that estimates a user's emotional state based on their past data and current input information, and outputs the result.
[0402] A "voice output mechanism" refers to a device or technology that reproduces the generated call content as audio and provides it to the user.
[0403] "Automated calling" refers to a function that executes a call based on call content generated by the system, without direct human intervention.
[0404] "Summary information" refers to information that is compressed and represented by extracting important parts based on analysis of call results and acquired data.
[0405] "Staff" refers to individuals within an organization who provide services to users and receive support from the system.
[0406] This invention utilizes an integrated system in which a server, terminal, and user work together. The following details how each element collaborates to implement the invention.
[0407] Users enter reservation data through an online platform using an internet-connected device. During this process, users provide information such as their planned visit date and time, family composition, and products of interest. This information is transmitted to the server via the device. For security reasons, encryption protocols are used to protect the data during this process. The server stores the received reservation data in a database.
[0408] The server activates a generative AI model based on information stored in the database. This generative AI model estimates the user's emotional state through an emotion engine that works in conjunction with the input data. By using generative prompts, more personalized call content is generated. For example, a specific prompt such as "Generate a welcome message for this user that includes information about new products" can be used.
[0409] The generated call content is sent from the server to an automated call system. This system utilizes an audio output mechanism to automatically dial the registered user's phone number. The user can receive this call in audio format and make appropriate selections or responses based on its content.
[0410] After the call ends, the server automatically processes the results and analyzes the user's emotional changes using an emotion engine. Summary information is extracted from this analysis, and based on this, personalized suggestions for user interaction are created. These suggestions are then communicated to staff via the terminal, who then provide the user with complete service. For example, if a user showed interest in a new menu item, it might be recommended to offer a tasting coupon for their next visit.
[0411] Thus, in order to implement this invention, the entire system must be highly integrated, and each component must work together seamlessly. As a result, it becomes possible to provide personalized services that take into account the user's emotions and significantly improve customer satisfaction.
[0412] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0413] Step 1:
[0414] Users access the online platform via the internet using their devices and enter their reservation information. This information includes the planned date and time of visit, family composition, and products of interest. This entered data is converted into packets on the device and sent to the server via a secure connection.
[0415] Step 2:
[0416] The server receives reservation data sent from the terminal and stores it in the database. A database management system (DBMS) operates during this process, ensuring the data is stored in the appropriate tables and fields. Transactional processing maintains data integrity.
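The transactional storage in Step 2 can be sketched with SQLite from the Python standard library. The choice of SQLite and the table layout are assumptions for illustration; the description does not name the DBMS used in Example 2. The point shown is that the reservation row and its related rows are written in one transaction, so a failure leaves no partial record.

```python
import sqlite3


def store_reservation(conn: sqlite3.Connection, user_id: str,
                      visit_date: str, interests: list) -> int:
    """Insert a reservation and its products of interest atomically."""
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        cur = conn.execute(
            "INSERT INTO reservations (user_id, visit_date) VALUES (?, ?)",
            (user_id, visit_date),
        )
        reservation_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO interests (reservation_id, product) VALUES (?, ?)",
            [(reservation_id, product) for product in interests],
        )
    return reservation_id


# Hypothetical schema, held in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reservations (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    visit_date TEXT NOT NULL
);
CREATE TABLE interests (
    reservation_id INTEGER NOT NULL REFERENCES reservations(id),
    product TEXT NOT NULL
);
""")

first_id = store_reservation(conn, "user-1", "2025-01-11", ["sofa", "lamp"])
```

If any insert violates a constraint, for example a missing product name, the whole reservation is rolled back rather than stored without its related rows.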
[0417] Step 3:
[0418] The server extracts reservation information stored in the database and activates a generative AI model. The reservation information and the emotion estimate from the emotion engine are included in the prompt text and input into the AI model. The prompt text includes instructions such as "Generate content introducing new products to visiting customers." Based on these inputs, the AI model generates the optimal call content and outputs text data.
[0419] Step 4:
[0420] The server sends the generated call content to the automated calling system. Using VoIP technology, the server automatically places a call to the user's registered phone number. The audio output mechanism converts the text data of the call content into audio format and plays it back to the user.
[0421] Step 5:
[0422] After a call ends, the server automatically collects the call's results. It then utilizes the emotion engine again to analyze the user's emotional changes during the call. Based on this, it extracts key points and generates a summary. The analysis data and summary are output in digital format.
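The analysis of emotional change in Step 5 could, for instance, compare emotion estimates taken at the start and end of the call. The sketch below assumes the emotion engine yields a numeric valence score per utterance in the range -1 (negative) to 1 (positive); that representation, the threshold, and the labels are all hypothetical and are not specified in the description.

```python
def emotion_change(valence_scores: list[float], threshold: float = 0.2) -> str:
    """Classify how the customer's emotion moved between the first and last score."""
    if len(valence_scores) < 2:
        return "unknown"  # not enough data to speak of a change
    delta = valence_scores[-1] - valence_scores[0]
    if delta > threshold:
        return "improved"
    if delta < -threshold:
        return "worsened"
    return "stable"


print(emotion_change([-0.4, 0.1, 0.5]))
```

A label of this kind is one possible input to the summary that the server generates for staff.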
[0423] Step 6:
[0424] The server sends the generated summary information and analysis results to the terminal used by the staff. The terminal then presents the staff with personalized response suggestions that take the user's emotions into consideration. For example, this might include a suggestion such as, "If the user shows interest in the new menu, offer a tasting coupon on their next visit."
[0425] (Application Example 2)
[0426] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart glasses 214 as the "terminal".
[0427] Currently, customer service tends to be uniform, making it difficult to provide services that cater to the individual emotions and needs of each customer. Furthermore, recommendations for products and services in stores are not commonly offered, highlighting the need for improved customer experience.
[0428] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0429] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for estimating the customer's emotional state and adjusting the call content based on the estimated emotional state, and means for suggesting recommended products and services at the store. This makes it possible to provide personalized services that correspond to each customer's specific emotional state, thereby improving the customer experience and satisfaction.
[0430] "Customer reservation information" refers to information about the date, time, and content of a reservation that a customer provides when booking a service or product.
[0431] "Response information" refers to information provided by customers by answering questions or surveys related to the service.
[0432] "Optimal call content" refers to call content that is tailored to the customer's needs and emotional state, and is designed to effectively convey information.
[0433] "Automated calling" is a process that mechanically makes phone calls based on pre-set call content and transmits information.
[0434] "Call results" refer to data obtained after a call with a customer, such as the content of the conversation and the customer's reactions.
[0435] A "customer service proposal" is a suggestion that outlines the optimal way to respond to a customer based on their needs and emotions.
[0436] "Emotional state" refers to the psychological and sensory responses that a customer exhibits in a particular situation.
[0437] "Means of estimation" refers to methods of analyzing data to determine the emotional state and circumstances of customers.
[0438] "Recommended products and services in stores" are products and services offered to customers specifically with the intention of promoting sales.
[0439] To implement this invention, the main components of the system are a server, a terminal, and a user. The server first stores customer reservation information and response information in a database. Reservation information includes the date and time of visit and family composition, while response information is information about products of interest and emotions provided by the customer in advance.
[0440] The server uses a generative AI model to generate optimal call content based on this information. The generative AI model employs natural language processing technologies such as OpenAI GPT to customize the call content according to the information and the user's emotional state. To construct call content that provides greater reassurance to the customer based on the estimated emotional state, an emotion engine utilizing Microsoft Azure's Emotion API is applied.
[0441] The generated call content is delivered to the user's phone number by an auto-dialing system, which places the call automatically. This auto-dialing system utilizes services such as Twilio and Nexmo, and the call content is played through an audio output device. The user receives this call and customizes the service by selecting options.
[0442] Once the call ends, the server receives the call results and summarizes the key points. This summarized information is sent to the staff member's terminal, where personalized customer service suggestions are made. For example, if the estimated emotional state was positive, a special discount for the next visit can be offered.
[0443] For example, if a user makes a reservation to visit a theme park with their family on the weekend, the server will generate a call message based on the reservation information, including information about family plans and special events. An example of a prompt would be, "Please recommend family plans and suggest information about special events in an emotionally sensitive tone."
[0444] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0445] Step 1:
[0446] The server receives reservation and response information entered by users via an online platform. This information includes details such as the date and time of visit, family composition, and product interests. The server stores this input information in a database. The data format includes date and time information, text information, and choice information.
[0447] Step 2:
[0448] The server inputs information stored in the database into the generative AI model. During this process, the generative AI model compares past user data with current data to generate the optimal call content. Data processing includes organizing text information and preparing data for sentiment analysis. The output is a customized call script.
[0449] Step 3:
[0450] The server sends the generated call content to the emotion engine for analysis to estimate the user's emotional state. This engine uses the Microsoft Azure Emotion API to generate emotion tags from the input text information, providing insights into how the call content should be adjusted. The output is the emotion-adjusted call content.
[0451] Step 4:
[0452] The server sends the adjusted call content to the auto-dialing system. If the auto-dialing system uses the Twilio API, an audio file of the call content is generated, and the system automatically dials the user's phone number to play it. The user listens to this call and selects their response according to the prompts. The output at this point is the user's selection information.
[0453] Step 5:
[0454] After the call ends, the server analyzes the call results based on the user's choices and emotional state, and summarizes the key points. This is done using a data summarization algorithm to extract information and generate a suggestion statement indicating what staff should actually prioritize. The output is summarized suggestion information.
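The generation of suggestion statements in Step 5 can be illustrated with a small rule-based sketch. In the described system a summarization algorithm or generative AI model produces these statements; the fixed rules, the `CallResult` fields, and the wording below are assumptions for illustration. The rule for a positive emotional state follows the discount example given earlier in this Application Example.

```python
from dataclasses import dataclass, field


@dataclass
class CallResult:
    # Hypothetical summary of one automated call.
    connected: bool
    selected_options: list[str] = field(default_factory=list)
    emotion: str = "neutral"   # e.g. "positive", "neutral", "anxious"


def suggest_for_staff(result: CallResult) -> list[str]:
    """Turn a call result into suggestion statements for store staff."""
    if not result.connected:
        return ["The call did not connect; confirm the customer's interests in person."]
    suggestions = [
        f"The customer showed interest in {option}; introduce it during the visit."
        for option in result.selected_options
    ]
    if result.emotion == "positive":
        suggestions.append("Offer a special discount for the next visit.")
    return suggestions


for line in suggest_for_staff(CallResult(True, ["family plans"], "positive")):
    print(line)
```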
[0455] Step 6:
[0456] The server sends summary information to staff terminals. This information is made available on a staff dashboard and used as part of customer service. Specifically, the terminals are configured to receive notifications in real time and display recommended actions.
[0457] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0458] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0459] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0460] [Third Embodiment]
[0461] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0462] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0463] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0464] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0465] The microphone 238 receives instructions from the user 20 by receiving the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0466] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0467] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0468] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0469] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0470] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0471] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0472] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0473] In an embodiment of this invention, the system operates according to the following procedure.
[0474] First, the user makes a reservation to visit the store through an online platform. During this process, they enter information such as the date and time, family composition, and products of interest in a reservation form, which is immediately saved to a database. Next, the terminal sends the user's input to the server.
[0475] The server activates a specific generative AI model to generate appropriate call content based on the received reservation and response information. This generative AI model is trained to generate customized call scripts tailored to the customer's characteristics. For example, if a user is planning a visit with their family, the AI model will create a call script that includes information about a "special family package."
[0476] The generated call content is sent from the server to an automated calling system. This system automatically dials the customer's phone number and plays back the generated call content as synthesized speech or a recording. The user receives this call and can listen to specific instructions or make choices in real time as needed.
[0477] After the call, the server analyzes the call results obtained from the auto-dialing system and uses generative AI to summarize the results. The summary includes, for example, whether the call was successful, the user's response, and the options selected. This summary information is then formatted for use in subsequent customer service.
[0478] Finally, the server notifies staff based on the analyzed data. These notifications appear on terminals and provide suggestions on how staff should interact with customers upon their arrival. This enables staff to provide high-quality customer service tailored to each customer's individual needs.
[0479] Operating in this way, the system can improve the efficiency and accuracy of traditional customer service processes. For example, if a user expresses interest in a campaign, staff can use that information to make specific suggestions, thus improving customer satisfaction.
[0480] The following describes the processing flow.
[0481] Step 1:
[0482] Users make reservations to visit the store via an online platform. The date and time, family composition, and products of interest entered in the reservation form are checked by the reservation system's terminal and sent to the server.
[0483] Step 2:
[0484] The server stores the received reservation information in a database and sends the stored data to a generative AI model. This AI model is trained to generate call content optimized for the customer based on the collected information.
[0485] Step 3:
[0486] The server delivers the call content received from the generative AI model to the auto-call system. This call content is customized according to the customer's specific needs, and the AI incorporates appropriate guidance and suggestions.
[0487] Step 4:
[0488] The automated calling system uses the call content received from the server to automatically dial the registered customer's phone number. The user receives this automated call and listens to the information provided.
[0489] Step 5:
[0490] After the call ends, the server collects call result data from the auto-call system. This includes whether the call was successful, the user's response, and the options selected.
[0491] Step 6:
[0492] The server analyzes the call results and uses a generative AI to create a summary, picking out meaningful information. This summary includes key points that should be used for future actions.
[0493] Step 7:
[0494] The summarized data is sent from the server to the staff's terminals. This information is displayed on the terminals, allowing staff to understand the specific suggestions needed to assist customers during their visits.
[0495] Step 8:
[0496] Staff use the information displayed on their terminals to serve each customer, which allows them to offer a higher-quality service experience.
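The flow from Step 2 through Step 7 can be sketched as a single orchestration function. This is a minimal illustration, not the disclosed implementation: the `Reservation` and `CallResult` types, the `run_pipeline` function, and the `fake_*` stand-ins for the generative AI model, the auto-call system, and the staff terminal are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Reservation:
    customer_phone: str
    visit_datetime: str
    family_composition: str
    products_of_interest: list[str]


@dataclass
class CallResult:
    connected: bool
    responses: list[str]
    selected_options: list[str]


def run_pipeline(
    reservation: Reservation,
    generate_script: Callable[[Reservation], str],
    place_call: Callable[[str, str], CallResult],
    summarize: Callable[[CallResult], str],
    notify_staff: Callable[[str], None],
) -> str:
    """Run Steps 2 to 7: generate, call, collect, summarize, notify."""
    script = generate_script(reservation)                    # Steps 2-3
    result = place_call(reservation.customer_phone, script)  # Steps 4-5
    summary = summarize(result)                              # Step 6
    notify_staff(summary)                                    # Step 7
    return summary


# Hypothetical stand-ins; a real system would call the generative AI model,
# the auto-call system, and the staff terminal here.
def fake_generate(r: Reservation) -> str:
    return f"Hello. We look forward to your visit on {r.visit_datetime}."


def fake_call(phone: str, script: str) -> CallResult:
    return CallResult(connected=True, responses=["yes"], selected_options=["campaign"])


def fake_summarize(res: CallResult) -> str:
    status = "connected" if res.connected else "not connected"
    return f"Call {status}; selected: {', '.join(res.selected_options) or 'none'}"


staff_inbox: list[str] = []  # stands in for the staff terminal display

reservation = Reservation(
    "+81-00-0000-0000",  # placeholder phone number
    "2025-01-11 14:00",
    "family of four",
    ["new arrivals"],
)
summary = run_pipeline(reservation, fake_generate, fake_call, fake_summarize, staff_inbox.append)
print(summary)
```

Passing each stage in as a callable keeps the sketch independent of any particular model or telephony provider, so each stage can be swapped without touching the flow itself.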
[0497] (Example 1)
[0498] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0499] Traditional customer service processes face challenges in generating personalized call content and improving accuracy and efficiency in subsequent customer interactions. In particular, standardized call scripts may not adequately address the need for flexible responses tailored to individual customer needs. Furthermore, manual analysis of call results and summarization of information increases workload and slows response times.
[0500] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0501] In this invention, the server includes means for collecting customer reservation information, means for creating personalized call content using a generative AI model based on the collected information, and means for analyzing the outcome of the call and summarizing the key points. This enables the rapid generation of personalized call content and improved accuracy in customer service.
[0502] "Customer reservation information" refers to information provided by customers when they make a reservation to visit a store or use a service, including the planned date, time, number of people, and products or services they are interested in.
[0503] "Means of data collection" refers to systems that store information entered by customers on online platforms, etc., in databases or other similar systems.
[0504] A "generative AI model" is an artificial intelligence model that automatically generates personalized call scripts based on diverse customer information.
[0505] "Personalized call content" refers to a customized call script tailored to each customer's characteristics and needs.
[0506] "Means of creation" refers to a system that utilizes generative AI models to design and generate personalized call content.
[0507] "Means for executing automated calls" refers to a system that, based on the generated call content, makes a phone call to the customer and transmits information using synthesized or recorded voice.
[0508] "Means for analyzing call outcomes" refers to a system that collects call results as data and analyzes information such as customer reactions and selected options.
[0509] "Means for summarizing key points" refers to a process that extracts important information from analyzed call data and summarizes it concisely.
[0510] A "means for providing customer service suggestions" is a system that provides staff with specific suggestions on how to interact with customers, based on summarized information.
[0511] "Means of communicating with the person in charge" refers to a system that notifies staff members of the details of the proposal to the customer via their terminals and issues instructions so that they can respond appropriately based on the information received.
[0512] "Means of providing to customers through audio output devices" refers to the process of using devices such as speakers and telephone receivers to deliver the generated call content to the customer as audio.
[0513] This invention is a system that streamlines and improves the accuracy of customer service. This system is achieved by generating call scripts tailored to individual customer needs based on online customer reservation information and incorporating them into automated calls.
[0514] First, the user accesses the online booking platform and enters the necessary booking information. This includes information such as the planned visit date, time, number of people, and products or services of interest. This information is then stored in the system's database via the user's device.
[0515] Next, the terminal sends the user-entered information to the server in an appropriate data format such as JSON. The server receives this information and activates a generative AI model to generate a personalized call script. For example, a high-performance AI model utilizing natural language processing technology is used as the generative AI model. This model has the ability to generate a script that is appropriate for a specific situation when given a prompt.
[0516] The generated script is sent from the server to the automated calling system. The automated calling system uses synthesized speech technology to convert the script into speech and automatically dials the customer's phone number. The call is placed to the user's registered phone number, and the customer receives automated guidance.
[0517] After the call, the server analyzes the call data obtained from the auto-call system and uses generative AI to summarize the results. The summary includes the success or failure of the call, the options selected, and the user's responses, which helps staff prepare appropriate responses for the customer. The summary information is then transmitted back to the terminal and displayed as customer support instructions.
[0518] For example, if a user has booked a family visit for the weekend, the script generation process can include a suggestion for a "special family package." This ensures the call is tailored to the user's needs and provides a better customer experience. An example of a prompt would be: "Generate a call script for a customer planning a family visit. Please include detailed information about the special family package."
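As an illustration of how such a prompt might be assembled from reservation fields, the following sketch builds the prompt text as a plain string. The function name `build_call_script_prompt`, the dictionary keys, and the `visit_type == "family"` rule are assumptions made for this example; the disclosure does not specify a prompt format.

```python
def build_call_script_prompt(reservation: dict) -> str:
    """Turn reservation fields into a prompt for the generative AI model."""
    lines = [
        "Generate a call script for a customer with the following reservation.",
        f"Planned visit: {reservation['visit_datetime']}",
        f"Number of people: {reservation['party_size']}",
        f"Products or services of interest: {', '.join(reservation['interests'])}",
    ]
    # Hypothetical rule mirroring the family-visit example in the text.
    if reservation.get("visit_type") == "family":
        lines.append("Please include detailed information about the special family package.")
    return "\n".join(lines)


family_reservation = {
    "visit_datetime": "Saturday 14:00",
    "party_size": 4,
    "interests": ["new arrivals"],
    "visit_type": "family",
}
prompt = build_call_script_prompt(family_reservation)
print(prompt)
```

The resulting string would then be passed to whichever generative AI model the server uses; that call is omitted because the disclosure does not name a specific interface.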
[0519] This system makes it possible to expedite and improve the accuracy of individualized responses, which was difficult with conventional methods.
[0520] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0521] Step 1:
[0522] The user accesses the online platform and enters their reservation information.
[0523] Input: Planned visit date, family composition, product information of interest, etc.
[0524] The user enters these details into the reservation form and clicks the submit button. This action saves the entered information to the database.
[0525] Step 2:
[0526] The terminal sends the input information to the server in a data format (such as JSON).
[0527] Input: Reservation information entered by the user.
[0528] Output: JSON formatted data sent to the server.
[0529] The device uses its transmission function to send the collected information to the API endpoint on the server side.
[0530] Step 3:
[0531] The server activates an AI model based on the information it receives, and generates a call script according to the prompt.
[0532] Input: User reservation information (JSON data).
[0533] Output: Personalized call script.
[0534] The server prompts the AI model to generate a script tailored to the user's characteristics. For example, in the case of a family visit, a script is created that includes details of a special package.
[0535] Step 4:
[0536] The server sends the generated call script to the auto-call system.
[0537] Input: A call script created by a generative AI model.
[0538] Output: Synthesized voice or recorded data used by the auto-call system.
[0539] The server converts the call script into appropriate audio data and provides it to the automated calling system.
[0540] Step 5:
[0541] The user receives a call from an automated calling system and interacts with various options.
[0542] Input: Phone call from the server.
[0543] Output: User selections, responses, and feedback.
[0544] The user answers the call and uses push buttons to select options of interest. This information is returned to the server.
[0545] Step 6:
[0546] The server analyzes the call results and uses a generative AI model to summarize them.
[0547] Input: Call logs and user selection data.
[0548] Output: Summarized call results.
[0549] The server analyzes the call content, extracts key points, and creates a summary.
[0550] Step 7:
[0551] The server uses the summary information to generate response instructions for staff and notifies their terminals.
[0552] Input: Summary of call results.
[0553] Output: Action instructions displayed to staff.
[0554] The terminal displays specific instructions on how to interact with the customer, and the staff member uses these instructions to handle the customer interaction.
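The JSON exchange in Steps 2 and 3 can be sketched as follows. The field names, the `ReservationPayload` type, and the validation rule are assumptions for illustration; the disclosure only states that a data format such as JSON is used.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical field names for the reservation form.
REQUIRED_FIELDS = ("visit_datetime", "family_composition", "products_of_interest")


@dataclass
class ReservationPayload:
    visit_datetime: str
    family_composition: str
    products_of_interest: list


def encode_reservation(payload: ReservationPayload) -> str:
    """Terminal side (Step 2): serialize the reservation form to JSON."""
    return json.dumps(asdict(payload), ensure_ascii=False)


def decode_reservation(body: str) -> ReservationPayload:
    """Server side (input to Step 3): parse and validate the JSON body."""
    data = json.loads(body)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return ReservationPayload(**{name: data[name] for name in REQUIRED_FIELDS})


sent = ReservationPayload("2025-01-11T14:00", "family of four", ["new arrivals"])
body = encode_reservation(sent)
received = decode_reservation(body)
print(body)
```

Rejecting incomplete payloads on the server side keeps malformed reservations from reaching the script-generation step.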
[0555] (Application Example 1)
[0556] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0557] Traditional appointment booking systems have had drawbacks, such as insufficient guidance tailored to individual customer characteristics and insufficient provision of personalized services. Furthermore, special plans and campaigns are generally presented in the same way to everyone, making it difficult to provide information specific to customers' interests and circumstances. Against this backdrop, there is a need for a means to improve customer satisfaction and achieve more efficient staff response.
[0558] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0559] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for analyzing the call results and summarizing important points, means for making customer response suggestions based on the summarized information, means for generating customized guidance based on customer characteristics, and means for providing information based on the guidance to staff. This makes it possible to provide customers with individualized guidance and special plans, thereby improving customer satisfaction.
[0560] "Customer reservation information" refers to information entered by users through an online platform, such as the date and time of their visit, family composition, and products they are interested in.
[0561] "Response information" refers to information that shows responses such as survey responses and feedback obtained from customers.
[0562] "Optimal call content" refers to individualized and customized conversation scripts generated based on the customer's characteristics and needs.
[0563] "Means of making automated calls" refers to a mechanism that uses auto-call technology to automatically initiate pre-generated calls to customers.
[0564] "Call results" refer to the outcomes and information obtained after a call, such as whether the call was successful, the customer's reaction, and the options selected.
[0565] "Customer service suggestions" refer to providing store staff with instructions and advice based on analyzed call results to help them respond appropriately to customers.
[0566] "Customized guidance" refers to the provision of information and offers that are specifically tailored to the customer's characteristics and interests.
[0567] "Information based on guidance" refers to information that utilizes generated, customized guidance and is intended to be provided to customers.
[0568] To implement this invention, the system has the following functions: The server receives reservation and response information entered by the user on their smartphone and stores it in a database. By using MongoDB for the database, efficient management of the information is possible.
[0569] The server uses a generative AI model to generate optimal call content based on the collected information. This generative AI model is built in Python and generates customized scripts tailored to the customer's characteristics. The generated call script is sent to the auto-call system using the Twilio API, and the call is automatically made to the customer. The call content is played back as synthesized speech.
[0570] After the call ends, the server analyzes the results and summarizes the key points. This summarized information is then formatted as suggestions for how staff can appropriately respond to the customer and sent to their terminals. This notification system allows staff to improve the quality of service they provide.
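A rule-based sketch of this analysis step is shown below. In the described system the summary is produced by a generative AI model; the fixed rules here are only a stand-in so the data flow is visible. The `CallLog` fields, the status strings, and the keypad `MENU` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CallLog:
    status: str  # hypothetical values: "completed", "no-answer", "busy"
    digits_pressed: list = field(default_factory=list)
    transcript: str = ""


# Hypothetical keypad menu offered during the automated call.
MENU = {"1": "family offers", "2": "new arrivals"}


def analyze_call(log: CallLog) -> dict:
    """Extract success or failure, the response, and the selected options."""
    return {
        "success": log.status == "completed",
        "selected_options": [MENU[d] for d in log.digits_pressed if d in MENU],
        "response": log.transcript.strip(),
    }


def to_staff_suggestion(analysis: dict) -> str:
    """Format the analysis as a short suggestion for the staff terminal."""
    if not analysis["success"]:
        return "Call did not connect. Confirm the customer's interests on arrival."
    if analysis["selected_options"]:
        topics = " and ".join(analysis["selected_options"])
        return f"Customer selected {topics}. Lead with this on arrival."
    return "Call connected but no option was selected. Ask about interests on arrival."


analysis = analyze_call(CallLog("completed", ["2", "9"], " Sounds good. "))
suggestion = to_staff_suggestion(analysis)
print(suggestion)
```

Unknown keypad digits (such as "9" above) are ignored rather than treated as errors, which is one possible design choice among several.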
[0571] As a concrete example, in one apparel store, when a family makes an appointment to visit, a generative AI model prepares a call informing them about the latest collections and special offers for families. This information is communicated to the customer in advance, and staff are notified so they can provide the best possible service based on the customer's interests.
[0572] An example of a prompt would be: "This user has entered 'family of four' as their family composition and indicated interest in 'new arrivals' on the day of their visit. What call content would be appropriate?"
[0573] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0574] Step 1:
[0575] Users enter their reservation and response information using a smartphone application. This information includes the date and time of visit, products of interest, and family composition. The entered information is sent from the device to the server and stored in a MongoDB database. This enables centralized management of reservation information.
[0576] Step 2:
[0577] The server retrieves information stored in the database and activates a generative AI model. Using reservation and response information as input, the AI creates an optimized call script based on this data. Specifically, it uses a Python-based text generation algorithm to generate a customized script tailored to the customer's characteristics. The output is the generated call content.
[0578] Step 3:
[0579] The server sends the generated call script to the auto-dialing system via the Twilio API. The auto-dialing system automatically dials the customer's phone number and plays back the call using synthesized speech. The input here is the generated call script, and the output is the execution of the call to the customer.
[0580] Step 4:
[0581] Once a call ends, the server retrieves the call results from the auto-call system and analyzes them. Using a specific analysis algorithm, it extracts information such as the success or failure of the call, customer responses, and selected options. This analysis result is formatted as a summary and used as input for the next step.
[0582] Step 5:
[0583] The server generates customer service suggestions based on the summarized call results. These suggestions are then sent to staff members' terminals. The notifications include information on special offers and customer service tips, providing guidance for staff to tailor their responses to each customer. The input is the analysis results, and the output is the notification to the staff.
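For Step 3, one way to hand the generated script to a Twilio-based auto-call system is as a TwiML document, which Twilio reads aloud with its `<Say>` verb. The sketch below only builds that document using the Python standard library. The `<Gather>` wrapper for keypad input, the `ja-JP` language setting, and the function name `build_twiml` are assumptions for illustration, and the request that actually places the call is intentionally omitted.

```python
import xml.etree.ElementTree as ET


def build_twiml(script: str, language: str = "ja-JP") -> str:
    """Wrap a call script in TwiML so it is spoken and one keypad digit is collected."""
    response = ET.Element("Response")
    # <Gather> collects keypad (DTMF) input while the nested <Say> is playing.
    gather = ET.SubElement(response, "Gather", {"numDigits": "1", "input": "dtmf"})
    say = ET.SubElement(gather, "Say", {"language": language})
    say.text = script  # ElementTree escapes characters such as & and < on output
    return ET.tostring(response, encoding="unicode")


script = "Thank you for your reservation. Press 1 for family offers & new arrivals."
twiml = build_twiml(script)
print(twiml)
```

Building the document with an XML library rather than string formatting avoids malformed TwiML when the generated script contains reserved characters.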
[0584] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0585] In an embodiment of this invention, the system operates according to the following procedure.
[0586] Users first make a reservation through an online platform. This includes entering information about the date and time of their visit, family composition, and products they are interested in. This information is transmitted to the server via the terminal and stored in a database.
[0587] The server uses the stored reservation information to run a generative AI model and generate optimal call content. Working in conjunction with an emotion engine, this AI model considers not only the reservation information but also the user's emotional state. The emotion engine estimates the user's emotions from past data and input information and uses the results to customize the call content. For example, if the user is showing anxiety, the generated call content will include suggestions to reassure them.
[0588] The generated call content is sent from the server to the auto-dialing system. The auto-dialing system automatically dials the registered user's phone number and plays the generated call content back through an audio output device. The user can listen to this call and choose options as needed.
[0589] After a call ends, the server retrieves call result data. This data includes the user's emotion recognition results from the emotion engine, which are then analyzed to record changes in the user's emotions. Furthermore, based on this analysis, particularly important points are extracted and a summary is created.
[0590] The summarized information is sent from the server to the staff's terminal. The terminal receives this information and provides the staff with more personalized customer service suggestions tailored to the user's mood. For example, if the user appears happy, it is recommended to prioritize providing information about special promotions or discounts for their next visit.
[0591] This approach allows for more personalized customer service, enabling responses tailored to each user's different emotional state. As a result, the customer experience is improved, and deeper relationships are built.
[0592] The following describes the processing flow.
[0593] Step 1:
[0594] Users make appointments to visit the store using an online platform. During the booking process, they answer questionnaires regarding the date and time of their visit, family composition, and products of interest. This data is transmitted in real time via the device to a server and stored in a database.
[0595] Step 2:
[0596] The server processes the stored reservation information and survey responses, and uses this to launch a generative AI model. This model also acquires data for the emotion engine and is responsible for estimating the user's emotional state. The emotion data includes latent emotional information that can be analyzed from the text.
[0597] Step 3:
[0598] The server uses the results of the generative AI model and emotion engine to generate the most appropriate call content. For example, if the AI determines that the user is nervous, it will create reassuring messages such as, "We have prepared an environment where you can relax and enjoy yourself."
[0599] Step 4:
[0600] The generated call content is sent from the server to the auto-dialing system. This system automatically dials the registered user's phone number and plays back the call content as synthesized speech or a recording.
[0601] Step 5:
[0602] The user receives a call from the automated calling system and listens to the information provided. After listening, if the user makes a selection, that selection is also recorded.
[0603] Step 6:
[0604] After a call ends, the server receives the call results from the auto-call system and performs analysis. This analysis includes detailed information about the user's choices and emotional changes.
[0605] Step 7:
[0606] Using the analysis results, the server uses a generative AI to perform summarization and extract important information based on the user's emotions and choices.
[0607] Step 8:
[0608] The summarized information is sent from the server to the staff member's terminal. The terminal displays specific suggestions to help the staff member provide better service at their next interaction with the user. For example, it might say, "The user has shown interest in a new product line. Please introduce that product to them when they visit the store."
[0609] In this way, the entire system works together, making it possible to provide more personalized and emotionally resonant customer service.
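Steps 2 and 3 of this flow can be illustrated with a toy example in which an estimated emotion changes the prompt given to the generative AI model. The keyword lists below are a deliberately simple stand-in for the emotion identification model 59, whose internals are not described; the cue words, function names, and prompt wording are all hypothetical.

```python
# Hypothetical cue lists; a real emotion engine would use a trained model.
ANXIETY_CUES = ("worried", "nervous", "first time", "unsure")
JOY_CUES = ("excited", "looking forward", "can't wait")


def estimate_emotion(free_text: str) -> str:
    """Return a coarse emotion label from questionnaire text."""
    text = free_text.lower()
    if any(cue in text for cue in ANXIETY_CUES):
        return "anxious"
    if any(cue in text for cue in JOY_CUES):
        return "happy"
    return "neutral"


def build_prompt(reservation: dict, emotion: str) -> str:
    """Build a prompt whose instructions depend on the estimated emotion."""
    prompt = (
        f"Generate call content for a customer visiting on {reservation['visit_datetime']} "
        f"who is interested in {reservation['interest']}."
    )
    if emotion == "anxious":
        prompt += " The customer seems nervous, so include a reassuring message."
    elif emotion == "happy":
        prompt += " The customer seems happy, so mention special promotions."
    return prompt


emotion = estimate_emotion("This is our first time and I am a little nervous.")
prompt = build_prompt({"visit_datetime": "2025-01-11 14:00", "interest": "new arrivals"}, emotion)
print(prompt)
```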
[0610] (Example 2)
[0611] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0612] Traditional customer management systems have struggled to provide personalized services that take customer emotions into consideration. Furthermore, the generation of call content and subsequent responses were generic and uniform, resulting in insufficient responses tailored to the individual circumstances and emotions of each user, thus limiting improvements in customer satisfaction.
[0613] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0614] In this invention, the server includes means for acquiring and storing user reservation data, means for using a generative AI model that generates optimal call content based on the acquired data and past emotion analysis data, and means for inputting generated prompt messages into the generative AI model and customizing the call content in cooperation with the emotion engine. This enables the generation of optimized call content that takes the user's emotions into consideration, and personalized responses based on that content.
[0615] "User" refers to a person who uses the system to receive services such as making reservations or making phone calls.
[0616] "Reservation data" refers to information related to the planned date and time of visit and products of interest entered by the user within the system.
[0617] A "generative AI model" refers to artificial intelligence technology that uses natural language processing based on input data to generate optimal call content.
[0618] A "prompt message" refers to a string of characters given to a generative AI model to instruct it on what content it should generate.
[0619] An "emotion engine" refers to an algorithm that estimates a user's emotional state based on their past data and current input information, and outputs the result.
[0620] A "voice output mechanism" refers to a device or technology that reproduces the generated call content as audio and provides it to the user.
[0621] "Automated calling" refers to a function that executes a call based on call content generated by the system, without direct human intervention.
[0622] "Summary information" refers to information that is compressed and represented by extracting important parts based on analysis of call results and acquired data.
[0623] "Staff" refers to individuals within an organization who provide services to users and receive support from the system.
[0624] This invention utilizes an integrated system in which a server, terminal, and user work together. The following details how each element collaborates to implement the invention.
[0625] Users enter reservation data through an online platform using an internet-connected device. During this process, users provide information such as their planned visit date and time, family composition, and products of interest. This information is transmitted to the server via the device. For security reasons, encryption protocols are used to protect the data during this process. The server stores the received reservation data in a database.
[0626] The server activates a generative AI model based on information stored in the database. This generative AI model estimates the user's emotional state through an emotion engine that works in conjunction with the input data. By using generative prompts, more personalized call content is generated. For example, a specific prompt such as "Generate a welcome message for this user that includes information about new products" can be used.
[0627] The generated call content is sent from the server to an automated call system. This system utilizes an audio output mechanism to automatically dial the registered user's phone number. The user can receive this call in audio format and make appropriate selections or responses based on its content.
[0628] After the call ends, the server automatically processes the results and analyzes the user's emotional changes using the emotion engine. Summary information is extracted from this analysis, and based on this, personalized suggestions for user interaction are created. These suggestions are then communicated via the terminal to staff, who serve the user accordingly. For example, if a user showed interest in a new menu item, it might be recommended to offer a tasting coupon for their next visit.
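The analysis of emotional change can be sketched as a comparison of emotion scores at the start and end of the call. The sketch assumes, hypothetically, that the emotion engine emits one score per user utterance in the range -1 (negative) to 1 (positive); the 0.2 threshold and the suggestion wording are also illustrative choices, not values from the disclosure.

```python
def emotion_change(scores: list, threshold: float = 0.2) -> str:
    """Classify the trend between the first and last per-utterance scores."""
    if len(scores) < 2:
        return "unknown"
    delta = scores[-1] - scores[0]
    if delta > threshold:
        return "improved"
    if delta < -threshold:
        return "worsened"
    return "stable"


def staff_suggestion(change: str, interest: str) -> str:
    """Turn the trend and the user's stated interest into a staff suggestion."""
    if change == "worsened":
        return (
            "The user's mood dropped during the call. "
            f"Greet them warmly before mentioning {interest}."
        )
    if change == "improved":
        return (
            f"The user responded well to {interest}. "
            "Consider offering a tasting coupon on their next visit."
        )
    return f"The user showed interest in {interest}. Introduce it during the visit."


change = emotion_change([-0.4, 0.1, 0.5])
suggestion = staff_suggestion(change, "the new menu item")
print(change, "->", suggestion)
```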
[0629] Thus, in order to implement this invention, the entire system must be highly integrated, and each component must work together seamlessly. As a result, it becomes possible to provide personalized services that take into account the user's emotions and significantly improve customer satisfaction.
[0630] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0631] Step 1:
[0632] Users access the online platform via the internet using their devices and enter their reservation information. This information includes the planned date and time of visit, family composition, and products of interest. This entered data is converted into packets on the device and sent to the server via a secure connection.
[0633] Step 2:
[0634] The server receives reservation data sent from the terminal and stores it in the database. A database management system (DBMS) operates during this process, ensuring the data is stored in the appropriate tables and fields. Transactional processing maintains data integrity.
[0635] Step 3:
[0636] The server extracts reservation information stored in the database and activates a generative AI model. The reservation information and sentiment estimation from the sentiment engine are included in the generative prompt text and input into the AI model. The prompt text includes instructions such as "Generate content introducing new products to visiting customers." Based on these inputs, the AI model generates the optimal call content and outputs text data.
[0637] Step 4:
[0638] The server sends the generated call content to the automated calling system. Using VoIP technology, the server automatically places a call to the user's registered phone number. The audio output mechanism converts the text data of the call content into audio format and plays it back to the user.
[0639] Step 5:
[0640] After a call ends, the server automatically collects the call's results. It then utilizes the emotion engine again to analyze the user's emotional changes during the call. Based on this, it extracts key points and generates a summary. The analysis data and summary are output in digital format.
[0641] Step 6:
[0642] The server sends the generated summary information and analysis results to the terminal used by the staff. The terminal then presents the staff with personalized response suggestions that take the user's emotions into consideration. For example, this might include a suggestion such as, "The customer showed interest in the new menu, so offer a tasting coupon on their next visit."
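The prompt assembly described in Step 3 can be sketched as follows. This is an illustration only: the dictionary keys (`visit_datetime`, `family`, `interests`) and the function name `build_prompt` are hypothetical, and the emotion label is assumed to be supplied by the emotion engine as a plain string.

```python
def build_prompt(reservation: dict, emotion: str) -> str:
    """Assemble a generative prompt from reservation data and an emotion label."""
    lines = [
        # Instruction sentence taken from the description of Step 3.
        "Generate content introducing new products to visiting customers.",
        f"Visit date and time: {reservation['visit_datetime']}",
        f"Family composition: {reservation['family']}",
        f"Products of interest: {', '.join(reservation['interests'])}",
        # Emotion estimate included so the model can adapt its tone.
        f"Estimated emotional state: {emotion}",
    ]
    return "\n".join(lines)
```

The resulting text would then be passed to the generative AI model, whose interface is left open here.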
[0643] (Application Example 2)
[0644] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0645] Currently, customer service tends to be uniform, making it difficult to provide services that cater to the individual emotions and needs of each customer. Furthermore, recommendations for products and services in stores are not commonly offered, highlighting the need for improved customer experience.
[0646] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0647] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for estimating the customer's emotional state and adjusting the call content based on the estimated emotional state, and means for suggesting recommended products and services at the store. This makes it possible to provide personalized services that correspond to each customer's specific emotional state, thereby improving the customer experience and satisfaction.
[0648] "Customer reservation information" refers to information about the date, time, and content of a reservation that a customer provides when booking a service or product.
[0649] "Response information" refers to information provided by customers by answering questions or surveys related to the service.
[0650] "Optimal call content" refers to call content that is tailored to the customer's needs and emotional state, and is designed to effectively convey information.
[0651] "Automated calling" is a process that mechanically makes phone calls based on pre-set call content and transmits information.
[0652] "Call results" refer to data obtained after a call with a customer, such as the content of the conversation and the customer's reactions.
[0653] A "customer service proposal" is a suggestion that outlines the optimal way to respond to a customer based on their needs and emotions.
[0654] "Emotional state" refers to the psychological and sensory responses that a customer exhibits in a particular situation.
[0655] "Means of estimation" refers to methods of analyzing data to determine the emotional state and circumstances of customers.
[0656] "Recommended products and services in stores" are products and services offered to customers specifically with the intention of promoting sales.
[0657] To implement this invention, the main components of the system are a server, a terminal, and a user. The server first stores customer reservation information and response information in a database. Reservation information includes the date and time of visit and family composition, while response information is information about products of interest and emotions provided by the customer in advance.
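One way the storage of reservation information and response information might look is sketched below using SQLite from the Python standard library. The table names, column names, and the function `store_reservation` are assumptions made for illustration; the disclosure does not specify a schema or a particular database product for this application example.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Create two hypothetical tables for reservations and responses."""
    conn.execute(
        "CREATE TABLE reservations "
        "(id INTEGER PRIMARY KEY, visit_datetime TEXT, family TEXT)"
    )
    conn.execute(
        "CREATE TABLE responses "
        "(reservation_id INTEGER, interest TEXT, emotion TEXT)"
    )


def store_reservation(conn: sqlite3.Connection, reservation: dict, response: dict) -> int:
    """Store reservation and response information in a single transaction."""
    # "with conn" commits on success and rolls back if an exception is raised,
    # so the two inserts are stored together or not at all.
    with conn:
        cur = conn.execute(
            "INSERT INTO reservations (visit_datetime, family) VALUES (?, ?)",
            (reservation["visit_datetime"], reservation["family"]),
        )
        reservation_id = cur.lastrowid
        conn.execute(
            "INSERT INTO responses (reservation_id, interest, emotion) VALUES (?, ?, ?)",
            (reservation_id, response["interest"], response["emotion"]),
        )
    return reservation_id
```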
[0658] The server uses a generative AI model to generate optimal call content based on this information. The generative AI model employs natural language processing technologies such as OpenAI GPT to customize the call content according to the information and the user's emotional state. To construct call content that provides greater reassurance to the customer based on the estimated emotional state, an emotion engine utilizing Microsoft Azure's Emotion API is applied.
[0659] A call carrying the generated call content is automatically placed to the user's phone number via an auto-dialing system. This auto-dialing system utilizes services such as Twilio and Nexmo, and the call content is delivered through an audio output device. The user receives this call and customizes the service by selecting options.
[0660] Once the call ends, the server receives the call results and summarizes the key points. This summarized information is sent to the staff member's terminal, where personalized customer service suggestions are made. For example, if the estimated emotional state was positive, a special discount for the next visit can be offered.
[0661] For example, if a user makes a reservation to visit a theme park with their family on the weekend, the server will generate a call message based on the reservation information, including information about family plans and special events. An example of a prompt would be, "Please recommend family plans and suggest information about special events in an emotionally sensitive tone."
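The staff-suggestion step described in paragraph [0660] can be illustrated with a simple rule table. Only the rule for a positive emotional state (offering a special discount for the next visit) comes from the description; the other rules, the emotion labels, and the function name `staff_suggestion` are assumptions added to make the sketch complete.

```python
def staff_suggestion(emotion: str, interests: list[str]) -> str:
    """Return a customer service suggestion for staff from an emotion label."""
    topic = ", ".join(interests) if interests else "the reserved service"
    if emotion == "positive":
        # Rule stated in the description.
        return f"Offer a special discount for the next visit and introduce {topic}."
    if emotion == "negative":
        # Assumed rule: address concerns before making any offer.
        return f"Begin by confirming any concerns, then explain {topic} briefly."
    # Assumed default for neutral or unknown emotional states.
    return f"Introduce {topic} and ask about the customer's preferences."
```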
[0662] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0663] Step 1:
[0664] The server receives reservation and response information entered by users via an online platform. This information includes details such as the date and time of visit, family composition, and product interests. The server stores this input information in a database. The data format includes date and time information, text information, and choice information.
[0665] Step 2:
[0666] The server inputs information stored in the database into the generative AI model. During this process, the generative AI model compares past user data with current data to generate the optimal call content. Data processing includes organizing text information and preparing data for sentiment analysis. The output is a customized call script.
[0667] Step 3:
[0668] The server sends the generated call content to the emotion engine for analysis to estimate the user's emotional state. This engine uses the Microsoft Azure Emotion API to generate emotion tags from the input text information, providing insights into how the call content should be adjusted. The output is the emotion-adjusted call content.
[0669] Step 4:
[0670] The server sends the adjusted call content to the auto-dialing system. If the auto-dialing system uses the Twilio API, an audio file of the call content is generated, and a call is automatically placed to the user's phone number. The user listens to this call and selects their response according to the prompts. The output at this point is the user's selection information.
[0671] Step 5:
[0672] After the call ends, the server analyzes the call results based on the user's choices and emotional state, and summarizes the key points. This is done using a data summarization algorithm to extract information and generate a suggestion statement indicating what staff should actually prioritize. The output is summarized suggestion information.
[0673] Step 6:
[0674] The server sends summary information to staff terminals. This information is made available on a staff dashboard and used as part of customer service. Specifically, the terminals are configured to receive notifications in real time and display recommended actions.
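The emotion-based adjustment in Step 3 of the above flow can be sketched as follows. The keyword lookup in `tag_emotion` is only a stand-in for the external emotion engine, whose actual interface is not reproduced here; the word lists, tag names, and added phrases are all hypothetical.

```python
# Hypothetical keyword lists standing in for an external emotion engine.
NEGATIVE_WORDS = ("worried", "anxious", "nervous", "concerned")
POSITIVE_WORDS = ("excited", "happy", "looking forward")


def tag_emotion(text: str) -> str:
    """Return a coarse emotion tag for the user's response text."""
    lowered = text.lower()
    if any(word in lowered for word in NEGATIVE_WORDS):
        return "anxious"
    if any(word in lowered for word in POSITIVE_WORDS):
        return "positive"
    return "neutral"


def adjust_script(script: str, tag: str) -> str:
    """Adjust the call script according to the emotion tag."""
    if tag == "anxious":
        # Lead with reassurance for an anxious customer.
        return "There is no need to prepare anything in advance. " + script
    if tag == "positive":
        return script + " We look forward to welcoming you."
    return script  # neutral: leave the script unchanged
```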
[0675] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0676] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.
[0677] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0678] [Fourth Embodiment]
[0679] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0680] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0681] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0682] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0683] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts it into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0684] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0685] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0686] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
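As an illustration of how an emotion might be mapped to the controlled object 443, the following sketch uses a lookup table from an emotion label to an LED state and an arm angle. The emotion labels, field names, and numeric values are hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    eye_led: str        # illumination state of the eye LEDs
    arm_angle_deg: int  # target angle for the arm motors


# Hypothetical mapping from emotion labels to expressions.
EXPRESSIONS = {
    "joy": Expression(eye_led="bright", arm_angle_deg=60),
    "sadness": Expression(eye_led="dim", arm_angle_deg=-20),
}
NEUTRAL = Expression(eye_led="steady", arm_angle_deg=0)


def expression_for(emotion: str) -> Expression:
    """Look up the expression for an emotion, defaulting to neutral."""
    return EXPRESSIONS.get(emotion, NEUTRAL)
```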
[0687] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0688] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0689] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0690] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0691] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0692] In an embodiment of this invention, the system operates according to the following procedure.
[0693] First, the user makes a reservation to visit the store through an online platform. During this process, they enter information such as the date and time, family composition, and products of interest in a reservation form, which is immediately saved to a database. Next, the terminal sends the user's input to the server.
[0694] The server activates a specific generative AI model to generate appropriate call content based on the received reservation and response information. This generative AI model is trained to generate customized call scripts tailored to the customer's characteristics. For example, if a user is planning a visit with their family, the AI model will create a call script that includes information about a "special family package."
[0695] The generated call content is sent from the server to an automated calling system. This system automatically dials the customer's phone number and plays back the generated call content as synthesized speech or a recording. The user receives this call and can listen to specific instructions or make choices in real time as needed.
[0696] After the call, the server analyzes the call results obtained from the auto-dialing system and uses generative AI to summarize the results. The summary includes, for example, whether the call was successful, the user's response, and the options selected. This summary information is then formatted for use in subsequent customer service.
[0697] Finally, the server notifies staff based on the analyzed data. These notifications appear on terminals and provide suggestions on how staff should interact with customers upon their arrival. This enables staff to provide high-quality customer service tailored to each customer's individual needs.
[0698] The specific way this system operates can dramatically improve the efficiency and accuracy of traditional customer service processes. For example, if a user expresses interest in a campaign, staff can use that information to make specific suggestions, thus improving customer satisfaction.
[0699] The following describes the processing flow.
[0700] Step 1:
[0701] Users make reservations to visit the store via an online platform. The date and time entered in the reservation form, family composition, and information about products of interest are checked by the reservation system's terminal and sent to the server.
[0702] Step 2:
[0703] The server stores the received reservation information in a database and sends the stored data to a generative AI model. This AI model is trained to generate call content optimized for the customer based on the collected information.
[0704] Step 3:
[0705] The server delivers the call content received from the generative AI model to the auto-call system. This call content is customized according to the customer's specific needs, and the AI incorporates appropriate guidance and suggestions.
[0706] Step 4:
[0707] The automated calling system uses the call content received from the server to automatically dial the registered customer's phone number. The user receives this automated call and listens to the information provided.
[0708] Step 5:
[0709] After the call ends, the server collects call result data from the auto-call system. This includes whether the call was successful, the user's response, and the options selected.
[0710] Step 6:
[0711] The server analyzes the call results and uses a generative AI to create a summary, picking out meaningful information. This summary includes key points that should be used for future actions.
[0712] Step 7:
[0713] The summarized data is sent from the server to the staff's terminals. This information is displayed on the terminals, allowing staff to understand the specific suggestions needed to assist customers during their visits.
[0714] Step 8:
[0715] Using the suggestions displayed on the terminals, staff can provide customers with a higher-quality service experience.
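The server-side portion of the above flow (Steps 2 to 7) can be sketched as a single function that chains the stages. Each stage is passed in as a callable so that the generative AI model, the auto-call system, and the staff notification can be replaced independently; the function name `run_call_flow`, its parameters, and the `phone` key are hypothetical.

```python
from typing import Callable


def run_call_flow(
    reservation: dict,
    generate: Callable[[dict], str],
    place_call: Callable[[str, str], dict],
    summarize: Callable[[dict], str],
    notify: Callable[[str], None],
) -> str:
    """Run the call flow from script generation to staff notification."""
    script = generate(reservation)                     # Steps 2-3: generate call content
    result = place_call(reservation["phone"], script)  # Steps 4-5: call and collect result
    summary = summarize(result)                        # Step 6: summarize the call result
    notify(summary)                                    # Step 7: send summary to staff terminals
    return summary
```

Passing the stages as arguments also makes it straightforward to exercise the flow with stub functions before any external service is connected.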
[0716] (Example 1)
[0717] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0718] Traditional customer service processes face challenges in generating personalized call content and improving accuracy and efficiency in subsequent customer interactions. In particular, standardized call scripts may not adequately address the need for flexible responses tailored to individual customer needs. Furthermore, manual analysis of call results and summarization of information increases workload and slows response times.
[0719] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0720] In this invention, the server includes means for collecting customer reservation information, means for creating personalized call content using a generative AI model based on the collected information, and means for analyzing the outcome of the call and summarizing the key points. This enables the rapid generation of personalized call content and improved accuracy in customer service.
[0721] "Customer reservation information" refers to information provided by customers when they make a reservation to visit a store or use a service, including the planned date, time, number of people, and products or services they are interested in.
[0722] "Means of data collection" refers to systems that store information entered by customers on online platforms, etc., in databases or other similar systems.
[0723] A "generative AI model" is an artificial intelligence model that automatically generates personalized call scripts based on diverse customer information.
[0724] "Personalized call content" refers to a customized call script tailored to each customer's characteristics and needs.
[0725] "Means of creation" refers to a system that utilizes generative AI models to design and generate personalized call content.
[0726] "Means for executing automated calls" refers to a system that, based on the generated call content, makes a phone call to the customer and transmits information using synthesized or recorded voice.
[0727] "Means for analyzing call outcomes" refers to a system that collects call results as data and analyzes information such as customer reactions and selected options.
[0728] "Means for summarizing key points" refers to a process that extracts important information from analyzed call data and summarizes it concisely.
[0729] A "means for providing customer service suggestions" is a system that provides staff with specific suggestions on how to interact with customers, based on summarized information.
[0730] "Means of communicating with the person in charge" refers to a system that notifies staff members of the details of the proposal to the customer via their terminals and issues instructions so that they can respond appropriately based on the information received.
[0731] "Means of providing to customers through audio output devices" refers to the process of using devices such as speakers and telephone receivers to deliver the generated call content to the customer as audio.
[0732] This invention is a system that streamlines and improves the accuracy of customer service. This system is achieved by generating call scripts tailored to individual customer needs based on online customer reservation information and incorporating them into automated calls.
[0733] First, the user accesses the online booking platform and enters the necessary booking information. This includes information such as the planned visit date, time, number of people, and products or services of interest. This information is then stored in the system's database via the user's device.
[0734] Next, the terminal sends the user-entered information to the server in an appropriate data format such as JSON. The server receives this information and activates a generative AI model to generate a personalized call script. For example, a high-performance AI model utilizing natural language processing technology is used as the generative AI model. This model has the ability to generate a script that is appropriate for a specific situation when given a prompt.
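A minimal sketch of the JSON exchange between the terminal and the server is shown below. The field names (`visit_date`, `visit_time`, `party_size`, `interests`) and both function names are assumptions chosen to match the reservation items listed in the description; the actual data format is not specified.

```python
import json

# Hypothetical set of fields the server expects in every payload.
REQUIRED_FIELDS = {"visit_date", "visit_time", "party_size", "interests"}


def encode_reservation(visit_date: str, visit_time: str,
                       party_size: int, interests: list[str]) -> str:
    """Terminal side: serialize the reservation form to a JSON string."""
    return json.dumps({
        "visit_date": visit_date,
        "visit_time": visit_time,
        "party_size": party_size,
        "interests": interests,
    }, ensure_ascii=False)


def decode_reservation(payload: str) -> dict:
    """Server side: parse the JSON payload and check required fields."""
    data = json.loads(payload)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```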
[0735] The generated script is sent from the server to the automated calling system. The automated calling system uses synthesized speech technology to convert the script into speech and automatically dials the customer's phone number. The call is placed to the user's registered phone number, and the customer receives automated guidance.
[0736] After the call, the server analyzes the call data obtained from the auto-call system and uses generative AI to summarize the results. The summary includes the success or failure of the call, the options selected, and the user's responses, which helps staff prepare appropriate responses for the customer. The summary information is then transmitted back to the terminal and displayed as customer support instructions.
[0737] For example, if a user has booked a family visit for the weekend, the script generation process can include a suggestion for a "special family package." This ensures the call is tailored to the user's needs and provides a better customer experience. An example of a prompt would be: "Generate a call script for a customer planning a family visit. Please include detailed information about the special family package."
[0738] This system makes it possible to expedite and improve the accuracy of individualized responses, which was difficult with conventional methods.
[0739] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0740] Step 1:
[0741] The user accesses the online platform and enters their reservation information.
[0742] Input: Planned visit date, family composition, product information of interest, etc.
[0743] The user enters these details into the reservation form and clicks the submit button. This action saves the entered information to the database.
[0744] Step 2:
[0745] The terminal sends the input information to the server in a data format (such as JSON).
[0746] Input: Reservation information entered by the user.
[0747] Output: JSON formatted data sent to the server.
[0748] The device uses its transmission function to send the collected information to the API endpoint on the server side.
[0749] Step 3:
[0750] The server activates an AI model based on the information it receives, and generates a call script according to the prompt.
[0751] Input: User reservation information (JSON data).
[0752] Output: Personalized call script.
[0753] The server prompts the AI model to generate a script tailored to the user's characteristics. For example, in the case of a family visit, a script is created that includes details of a special package.
[0754] Step 4:
[0755] The server sends the generated call script to the auto-call system.
[0756] Input: A call script created by a generative AI model.
[0757] Output: Synthesized voice or recorded data used by the auto-call system.
[0758] The server converts the call script into appropriate audio data and provides it to the automated calling system.
[0759] Step 5:
[0760] The user receives a call from an automated calling system and interacts with various options.
[0761] Input: Phone call from the server.
[0762] Output: User selections, responses, and feedback.
[0763] The user answers the call and uses push buttons to select options of interest. This information is returned to the server.
[0764] Step 6:
[0765] The server analyzes the call results and uses a generative AI model to summarize them.
[0766] Input: Call logs and user selection data.
[0767] Output: Summarized call results.
[0768] The server analyzes the call content, extracts key points, and creates a summary.
[0769] Step 7:
[0770] The server uses the summary information to generate response instructions for staff and notifies their terminals.
[0771] Input: Summary of call results.
[0772] Output: Action instructions displayed to staff.
[0773] The terminal displays specific instructions on how to interact with the customer, and the staff member uses these instructions to handle the customer interaction.
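The handling of push-button selections in Steps 5 and 6 can be illustrated as follows. This sketch covers only the deterministic extraction of selected options from the pressed digits; the summarization by the generative AI model is not reproduced. The menu contents and the function name `summarize_call` are hypothetical.

```python
# Hypothetical menu announced during the automated call.
OPTION_MENU = {
    "1": "special family package",
    "2": "new products",
    "3": "speak to staff",
}


def summarize_call(answered: bool, digits: str) -> dict:
    """Turn raw call results into a structured summary for later steps."""
    return {
        "call_succeeded": answered,
        # Digits that match a menu entry, in the order they were pressed.
        "selected_options": [OPTION_MENU[d] for d in digits if d in OPTION_MENU],
        # Digits that do not correspond to any announced option.
        "unrecognized_inputs": [d for d in digits if d not in OPTION_MENU],
    }
```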
[0774] (Application Example 1)
[0775] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0776] Traditional appointment booking systems have had drawbacks, such as insufficient guidance tailored to individual customer characteristics and inadequate provision of personalized services. Furthermore, they generally focus on special plans and campaigns, making it difficult to provide information specific to customers' interests and circumstances. Against this backdrop, there is a need for a means to improve customer satisfaction and achieve more efficient staff response.
[0777] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0778] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for analyzing the call results and summarizing important points, means for making customer response suggestions based on the summarized information, means for generating customized guidance based on customer characteristics, and means for providing information based on the guidance to staff. This makes it possible to provide customers with individualized guidance and special plans, thereby improving customer satisfaction.
[0779] "Customer reservation information" refers to information entered by users through an online platform, such as the date and time of their visit, family composition, and products they are interested in.
[0780] "Response information" refers to information that shows responses such as survey responses and feedback obtained from customers.
[0781] "Optimal call content" refers to individualized and customized conversation scripts generated based on the customer's characteristics and needs.
[0782] "Means of making automated calls" refers to a mechanism that uses auto-call technology to automatically initiate pre-generated calls to customers.
[0783] "Call results" refer to the outcomes and information obtained after a call, such as whether the call was successful, the customer's reaction, and the options selected.
[0784] "Customer service suggestions" refer to providing store staff with instructions and advice based on analyzed call results to help them respond appropriately to customers.
[0785] "Customized guidance" refers to the provision of information and offers that are specifically tailored to the customer's characteristics and interests.
[0786] "Information based on guidance" refers to information that utilizes generated, customized guidance and is intended to be provided to customers.
[0787] To implement this invention, the system has the following functions: The server receives reservation and response information entered by the user on their smartphone and stores it in a database. By using MongoDB for the database, efficient management of the information is possible.
[0788] The server uses a generative AI model to generate optimal call content based on the collected information. This generative AI model is built in Python and generates customized scripts tailored to the customer's characteristics. The generated call script is sent to the auto-call system using the Twilio API, and the call is automatically made to the customer. The call content is played back as synthesized speech.
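Where the auto-call system is driven through Twilio, call instructions are commonly expressed in TwiML, an XML format. The sketch below builds such a document with the Python standard library only, using the `Say` verb to speak the script and the `Gather` verb to collect one key press. It is an illustration under assumptions: the function name `build_twiml` and the action URL are hypothetical, no Twilio library call is shown, and the exact verbs and attributes should be confirmed against the Twilio documentation.

```python
import xml.etree.ElementTree as ET


def build_twiml(script: str, menu_prompt: str, action_url: str) -> str:
    """Build a TwiML document that speaks a script and gathers one digit."""
    response = ET.Element("Response")
    # Speak the generated call script as synthesized speech.
    ET.SubElement(response, "Say").text = script
    # Collect a single key press and post it to the (hypothetical) action URL.
    gather = ET.SubElement(
        response, "Gather", {"numDigits": "1", "action": action_url}
    )
    ET.SubElement(gather, "Say").text = menu_prompt
    return ET.tostring(response, encoding="unicode")
```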
[0789] After the call ends, the server analyzes the results and summarizes the key points. This summarized information is then formatted as suggestions for how staff can appropriately respond to the customer and sent to their terminals. This notification system allows staff to improve the quality of service they provide.
[0790] As a concrete example, in one apparel store, when a family makes an appointment to visit, a generative AI model prepares a call informing them about the latest collections and special offers for families. This information is communicated to the customer in advance, and staff are notified so they can provide the best possible service based on the customer's interests.
[0791] An example of a prompt would be: "This user has entered 'family of four' as their family composition and indicated interest in 'new arrivals' for the day of their visit. What call content would be appropriate?"
[0792] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0793] Step 1:
[0794] Users enter their reservation and response information using a smartphone application. This information includes the date and time of visit, products of interest, and family composition. The entered information is sent from the device to the server and stored in a MongoDB database. This enables centralized management of reservation information.
[0795] Step 2:
[0796] The server retrieves information stored in the database and activates a generative AI model. Using reservation and response information as input, the AI creates an optimized call script based on this data. Specifically, it uses a Python-based text generation algorithm to generate a customized script tailored to the customer's characteristics. The output is the generated call content.
[0797] Step 3:
[0798] The server sends the generated call script to the auto-dialing system via the Twilio API. The auto-dialing system automatically dials the customer's phone number and plays back the call using synthesized speech. The input here is the generated call script, and the output is the execution of the call to the customer.
[0799] Step 4:
[0800] Once a call ends, the server retrieves the call results from the auto-call system and analyzes them. Using a specific analysis algorithm, it extracts information such as the success or failure of the call, customer responses, and selected options. This analysis result is formatted as a summary and used as input for the next step.
[0801] Step 5:
[0802] The server generates customer service suggestions based on the summarized call results. These suggestions are then sent to staff members' terminals. The notifications include information on special offers and customer service tips, providing guidance for staff to tailor their responses to each customer. The input is the analysis results, and the output is the notification to the staff.
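Steps 2 to 5 above can be sketched as a single function that chains script generation, the automated call, result analysis, and staff notification. This is only an illustration under stated assumptions: the generative AI model, the auto-dialing system, and the staff terminal are replaced by stand-in functions (`fake_generate`, `fake_call`, and a plain list), and the dictionary keys are hypothetical. No real telephony or database API is called.

```python
from typing import Callable


def run_call_pipeline(
    reservation: dict,
    generate_script: Callable[[dict], str],
    place_call: Callable[[str, str], dict],
    notify_staff: Callable[[str], None],
) -> str:
    """Run Steps 2-5 for one stored reservation and return the staff suggestion."""
    script = generate_script(reservation)              # Step 2: generate call script
    result = place_call(reservation["phone"], script)  # Step 3: automated call
    # Step 4: reduce the raw call result to a short summary.
    if result.get("answered"):
        options = ", ".join(result.get("selected_options", [])) or "none"
        summary = f"call answered; selected options: {options}"
    else:
        summary = "call not answered"
    # Step 5: turn the summary into a suggestion and send it to staff.
    suggestion = f"{reservation['name']}: {summary}"
    notify_staff(suggestion)
    return suggestion


# Stand-ins so the sketch runs without a model, a telephony account, or terminals.
def fake_generate(reservation: dict) -> str:
    return f"Hello {reservation['name']}, we have new arrivals for your visit."


def fake_call(phone: str, script: str) -> dict:
    return {"answered": True, "selected_options": ["family offer"]}


sent: list[str] = []
suggestion = run_call_pipeline(
    {"name": "Sato", "phone": "+81-00-0000-0000", "interest": "new arrivals"},
    fake_generate,
    fake_call,
    sent.append,
)
print(suggestion)
```

Passing the three collaborators as functions keeps the flow testable; in a deployment they would be replaced by the actual model call, the auto-dialing client, and the terminal notification.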
[0803] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0804] In an embodiment of this invention, the system operates according to the following procedure.
[0805] Users first make a reservation through an online platform. This includes entering information about the date and time of their visit, family composition, and products they are interested in. This information is transmitted to the server via the terminal and stored in a database.
[0806] The server uses the stored reservation information to run a generative AI model and generate optimal call content. In conjunction with an emotion engine, this AI model considers not only the reservation information but also the user's emotional state. The emotion engine estimates the user's emotions from past data and input information, and the results are used to customize the call content. For example, if the user is showing anxiety, the generated call content will include wording intended to reassure them.
[0807] The generated call content is sent from the server to the auto-dialing system. The auto-dialing system automatically dials the registered user's phone number and plays the generated call content back through an audio output device. The user can listen to this call and choose options as needed.
[0808] After a call ends, the server retrieves call result data. This data includes the user's emotion recognition results from the emotion engine, which are then analyzed to record changes in the user's emotions. Furthermore, based on this analysis, particularly important points are extracted and a summary is created.
[0809] The summarized information is sent from the server to the staff's terminal. The terminal receives this information and provides the staff with more personalized customer service suggestions tailored to the user's mood. For example, if the user appears happy, it is recommended to prioritize providing information about special promotions or discounts for their next visit.
[0810] This approach allows for more personalized customer service, enabling responses tailored to each user's different emotional state. As a result, the customer experience is improved, and deeper relationships are built.
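The emotion-dependent customization described above can be illustrated with the following sketch. It is not the emotion engine of the disclosure: the estimator here is a simple keyword match standing in for a trained model, and the keyword lists, emotion labels, and added sentences are all hypothetical.

```python
# Hypothetical keyword lists; a real emotion engine would be a trained model.
ANXIETY_WORDS = {"worried", "nervous", "anxious", "unsure"}
HAPPY_WORDS = {"excited", "happy", "looking forward"}


def estimate_emotion(text: str) -> str:
    """Very rough stand-in for the emotion engine: keyword matching only."""
    lowered = text.lower()
    if any(word in lowered for word in ANXIETY_WORDS):
        return "anxious"
    if any(word in lowered for word in HAPPY_WORDS):
        return "happy"
    return "neutral"


def customize_call(base_script: str, emotion: str) -> str:
    """Append an emotion-specific sentence to the base call script."""
    extras = {
        "anxious": " Our staff will guide you from the entrance, so please feel at ease.",
        "happy": " We also have a special promotion ready for your visit.",
    }
    return base_script + extras.get(emotion, "")


emotion = estimate_emotion("I'm a little nervous about bringing small children.")
print(customize_call("Thank you for your reservation.", emotion))
```

Keeping estimation and customization as separate functions mirrors the split between the emotion engine and the generative AI model in the text.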
[0811] The following describes the processing flow.
[0812] Step 1:
[0813] Users make appointments to visit the store using an online platform. During the booking process, they answer questionnaires regarding the date and time of their visit, family composition, and products of interest. This data is transmitted in real time via the device to a server and stored in a database.
[0814] Step 2:
[0815] The server processes the stored reservation information and survey responses, and uses this to launch a generative AI model. This model also acquires data for the emotion engine and is responsible for estimating the user's emotional state. The emotion data includes latent emotional information that can be analyzed from the text.
[0816] Step 3:
[0817] The server uses the results of the generative AI model and emotion engine to generate the most appropriate call content. For example, if the AI determines that the user is nervous, it will create reassuring messages such as, "We have prepared an environment where you can relax and enjoy yourself."
[0818] Step 4:
[0819] The generated call content is sent from the server to the auto-dialing system. This system automatically dials the registered user's phone number and plays back the call content as synthesized speech or a recording.
[0820] Step 5:
[0821] The user receives a call from the automated calling system and listens to the information provided. After listening, if the user makes a selection, that selection is also recorded.
[0822] Step 6:
[0823] After a call ends, the server receives the call results from the auto-call system and performs analysis. This analysis includes detailed information about the user's choices and emotional changes.
[0824] Step 7:
[0825] Using the analysis results, the server uses a generative AI to perform summarization and extract important information based on the user's emotions and choices.
[0826] Step 8:
[0827] The summarized information is sent from the server to the staff member's terminal. The terminal displays specific suggestions to help the staff member provide better service at their next interaction with the user. For example, it might say, "The user has shown interest in a new product line. Please introduce that product to them when they visit the store."
[0828] In this way, the entire system works together, making it possible to provide more personalized and emotionally resonant customer service.
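Steps 6 to 8 above (analysis of choices and emotional changes, summarization, and the staff suggestion) might look roughly like the following. The emotion labels, the dictionary layout, and the rule for treating a change as an improvement are assumptions made for this sketch; in the disclosure the summarization is performed by a generative AI, which is replaced here by fixed formatting.

```python
def summarize_call(before: str, after: str, selections: list[str]) -> dict:
    """Record the emotional change and keep the user's choices as key points."""
    return {
        "emotion_change": f"{before} -> {after}",
        "improved": before in {"anxious", "neutral"} and after == "happy",
        "key_points": list(selections),
    }


def staff_suggestion(summary: dict) -> str:
    """Format the summary as a one-line suggestion for the staff terminal."""
    if summary["key_points"]:
        topic = summary["key_points"][0]
        advice = f"Please introduce '{topic}' when the user visits the store."
    else:
        advice = "No option was selected; ask about interests on arrival."
    return f"Emotion {summary['emotion_change']}. {advice}"


summary = summarize_call("anxious", "happy", ["new product line"])
print(staff_suggestion(summary))
```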
[0829] (Example 2)
[0830] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0831] Traditional customer management systems have struggled to provide personalized services that take customer emotions into consideration. Furthermore, the generation of call content and subsequent responses were generic and uniform, resulting in insufficient responses tailored to the individual circumstances and emotions of each user, thus limiting improvements in customer satisfaction.
[0832] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0833] In this invention, the server includes means for acquiring and storing user reservation data, means for using a generative AI model that generates optimal call content based on the acquired data and past emotion analysis data, and means for inputting generated prompt sentences into the generative AI model and customizing the call content in cooperation with the emotion engine. This enables the generation of optimized call content that takes the user's emotions into consideration, and personalized responses based on that content.
[0834] "User" refers to a person who uses the system to receive services such as making reservations or making phone calls.
[0835] "Reservation data" refers to information related to the planned date and time of visit and products of interest entered by the user within the system.
[0836] A "generative AI model" refers to artificial intelligence technology that uses natural language processing based on input data to generate optimal call content.
[0837] A "prompt message" refers to a string of characters given to a generative AI model to instruct it on what content it should generate.
[0838] An "emotion engine" refers to an algorithm that estimates a user's emotional state based on their past data and current input information, and outputs the result.
[0839] A "voice output mechanism" refers to a device or technology that reproduces the generated call content as audio and provides it to the user.
[0840] "Automated calling" refers to a function that executes a call based on call content generated by the system, without direct human intervention.
[0841] "Summary information" refers to information that is compressed and represented by extracting important parts based on analysis of call results and acquired data.
[0842] "Staff" refers to individuals within an organization who provide services to users and receive support from the system.
[0843] This invention utilizes an integrated system in which a server, terminal, and user work together. The following details how each element collaborates to implement the invention.
[0844] Users enter reservation data through an online platform using an internet-connected device. During this process, users provide information such as their planned visit date and time, family composition, and products of interest. This information is transmitted to the server via the device. For security reasons, encryption protocols are used to protect the data during this process. The server stores the received reservation data in a database.
[0845] The server activates a generative AI model based on information stored in the database. This generative AI model estimates the user's emotional state through an emotion engine that works in conjunction with the input data. By using generative prompts, more personalized call content is generated. For example, a specific prompt such as "Generate a welcome message for this user that includes information about new products" can be used.
[0846] The generated call content is sent from the server to an automated call system. This system utilizes an audio output mechanism to automatically dial the registered user's phone number. The user can receive this call in audio format and make appropriate selections or responses based on its content.
[0847] After the call ends, the server automatically processes the results and analyzes the user's emotional changes using an emotion engine. Summary information is extracted from this analysis, and based on this, personalized suggestions for user interaction are created. These suggestions are then communicated to staff via the terminal, who then provide the user with complete service. For example, if a user showed interest in a new menu item, it might be recommended to offer a tasting coupon for their next visit.
[0848] Thus, in order to implement this invention, the entire system must be highly integrated, and each component must work together seamlessly. As a result, it becomes possible to provide personalized services that take into account the user's emotions and significantly improve customer satisfaction.
[0849] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0850] Step 1:
[0851] Users access the online platform via the internet using their devices and enter their reservation information. This information includes the planned date and time of visit, family composition, and products of interest. This entered data is converted into packets on the device and sent to the server via a secure connection.
[0852] Step 2:
[0853] The server receives reservation data sent from the terminal and stores it in the database. A database management system (DBMS) operates during this process, ensuring the data is stored in the appropriate tables and fields. Transactional processing maintains data integrity.
[0854] Step 3:
[0855] The server extracts reservation information stored in the database and activates a generative AI model. The reservation information and the emotion estimate from the emotion engine are included in the prompt text and input into the AI model. The prompt text includes instructions such as "Generate content introducing new products to visiting customers." Based on these inputs, the AI model generates the optimal call content and outputs it as text data.
[0856] Step 4:
[0857] The server sends the generated call content to the automated calling system. Using VoIP technology, the server automatically places a call to the user's registered phone number. The audio output mechanism converts the text data of the call content into audio format and plays it back to the user.
[0858] Step 5:
[0859] After a call ends, the server automatically collects the call's results. It then utilizes the emotion engine again to analyze the user's emotional changes during the call. Based on this, it extracts key points and generates a summary. The analysis data and summary are output in digital format.
[0860] Step 6:
[0861] The server sends the generated summary information and analysis results to the terminal, where they are received by the staff. The terminal then notifies the staff of personalized response suggestions that take the user's emotions into consideration. For example, this might include a suggestion such as, "The user showed interest in the new menu. Offer a tasting coupon on their next visit."
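Step 2 above states that a DBMS stores the reservation data in the appropriate tables and fields and that transactional processing maintains data integrity. The following sketch illustrates that idea with SQLite from the Python standard library, chosen here only because it runs without a server; the disclosure does not name a DBMS for Example 2. The table layout, column names, and `store_reservation` are hypothetical.

```python
import sqlite3
from typing import Optional


def store_reservation(conn: sqlite3.Connection, user_id: Optional[str],
                      visit_at: str, family: str, interest: str) -> int:
    """Insert one reservation atomically and return its row id."""
    # Using the connection as a context manager commits on success and
    # rolls back if an exception is raised inside the block.
    with conn:
        cur = conn.execute(
            "INSERT INTO reservations (user_id, visit_at, family, interest) "
            "VALUES (?, ?, ?, ?)",
            (user_id, visit_at, family, interest),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations ("
    "id INTEGER PRIMARY KEY, user_id TEXT NOT NULL, visit_at TEXT NOT NULL, "
    "family TEXT, interest TEXT)"
)
row_id = store_reservation(conn, "user-001", "2025-01-11T14:00",
                           "family of four", "new menu")

# A row that violates NOT NULL is rolled back and leaves the table unchanged.
try:
    store_reservation(conn, None, "2025-01-12T10:00", "couple", "desserts")
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM reservations").fetchone()[0]
print(row_id, count)
```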
[0862] (Application Example 2)
[0863] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0864] Currently, customer service tends to be uniform, making it difficult to provide services that cater to the individual emotions and needs of each customer. Furthermore, recommendations for products and services in stores are not commonly offered, highlighting the need for improved customer experience.
[0865] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0866] In this invention, the server includes means for collecting customer reservation information and response information, means for generating optimal call content based on the collected information, means for making an automated call using the generated call content, means for estimating the customer's emotional state and adjusting the call content based on the estimated emotional state, and means for suggesting recommended products and services at the store. This makes it possible to provide personalized services that correspond to each customer's specific emotional state, thereby improving the customer experience and satisfaction.
[0867] "Customer reservation information" refers to information about the date, time, and content of a reservation that a customer provides when booking a service or product.
[0868] "Response information" refers to information provided by customers by answering questions or surveys related to the service.
[0869] "Optimal call content" refers to call content that is tailored to the customer's needs and emotional state, and is designed to effectively convey information.
[0870] "Automated calling" is a process that mechanically makes phone calls based on pre-set call content and transmits information.
[0871] "Call results" refer to data obtained after a call with a customer, such as the content of the conversation and the customer's reactions.
[0872] A "customer service proposal" is a suggestion that outlines the optimal way to respond to a customer based on their needs and emotions.
[0873] "Emotional state" refers to the psychological and sensory responses that a customer exhibits in a particular situation.
[0874] "Means of estimation" refers to methods of analyzing data to determine the emotional state and circumstances of customers.
[0875] "Recommended products and services in stores" are products and services offered to customers specifically with the intention of promoting sales.
[0876] To implement this invention, the main components of the system are a server, a terminal, and a user. The server first stores customer reservation information and response information in a database. Reservation information includes the date and time of visit and family composition, while response information is information about products of interest and emotions provided by the customer in advance.
[0877] The server uses a generative AI model to generate optimal call content based on this information. The generative AI model employs natural language processing technologies such as OpenAI GPT to customize the call content according to the information and the user's emotional state. To construct call content that provides greater reassurance to the customer based on the estimated emotional state, an emotion engine utilizing Microsoft Azure's Emotion API is applied.
[0878] The generated call content is delivered to the user by an auto-dialing system that automatically dials the user's phone number. This auto-dialing system utilizes services such as Twilio and Nexmo, and the call content is played back through an audio output device. The user receives this call and customizes the service by selecting options.
[0879] Once the call ends, the server receives the call results and summarizes the key points. This summarized information is sent to the staff member's terminal, where personalized customer service suggestions are made. For example, if the estimated emotional state was positive, a special discount for the next visit can be offered.
[0880] For example, if a user makes a reservation to visit a theme park with their family on the weekend, the server will generate a call message based on the reservation information, including information about family plans and special events. An example of a prompt would be, "Please recommend family plans and suggest information about special events in an emotionally sensitive tone."
[0881] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0882] Step 1:
[0883] The server receives reservation and response information entered by users via an online platform. This information includes details such as the date and time of visit, family composition, and product interests. The server stores this input information in a database. The data format includes date and time information, text information, and choice information.
[0884] Step 2:
[0885] The server inputs information stored in the database into the generative AI model. During this process, the generative AI model compares past user data with current data to generate the optimal call content. Data processing includes organizing text information and preparing data for sentiment analysis. The output is a customized call script.
[0886] Step 3:
[0887] The server sends the generated call content to the emotion engine for analysis to estimate the user's emotional state. This engine uses the Microsoft Azure Emotion API to generate emotion tags from the input text information, providing insights into how the call content should be adjusted. The output is the emotion-adjusted call content.
[0888] Step 4:
[0889] The server sends the emotion-adjusted call content to the auto-dialing system. If the auto-dialing system uses the Twilio API, an audio file of the call content is generated, the user's phone number is dialed automatically, and the audio is played back. The user listens to this call and selects a response according to the guidance in the call. The output at this point is the user's selection information.
[0890] Step 5:
[0891] After the call ends, the server analyzes the call results based on the user's choices and emotional state, and summarizes the key points. This is done using a data summarization algorithm to extract information and generate a suggestion statement indicating what staff should actually prioritize. The output is summarized suggestion information.
[0892] Step 6:
[0893] The server sends summary information to staff terminals. This information is made available on a staff dashboard and used as part of customer service. Specifically, the terminals are configured to receive notifications in real time and display recommended actions.
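Step 1 above notes that the stored data format includes date and time information, text information, and choice information. One way such a record could be represented and serialized is sketched below. The class name `ReservationRecord`, its field names, and the use of JSON are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime


@dataclass
class ReservationRecord:
    visit_at: datetime                                  # date and time information
    notes: str                                          # free-text information
    choices: list[str] = field(default_factory=list)    # choice information

    def to_json(self) -> str:
        """Serialize the record, writing the date and time in ISO 8601 form."""
        data = asdict(self)
        data["visit_at"] = self.visit_at.isoformat()
        return json.dumps(data, ensure_ascii=False)

    @classmethod
    def from_json(cls, raw: str) -> "ReservationRecord":
        """Rebuild a record from the JSON produced by to_json."""
        data = json.loads(raw)
        return cls(datetime.fromisoformat(data["visit_at"]),
                   data["notes"], data["choices"])


record = ReservationRecord(
    datetime(2025, 5, 3, 10, 30),
    "Visiting with two children",
    ["family plan", "special events"],
)
restored = ReservationRecord.from_json(record.to_json())
print(restored)
```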
[0894] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0895] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0896] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0897] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping. Specifically, this mapping may be an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0898] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0899] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0900] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion is located, the more visible it becomes (that is, the more it is expressed in actions).
[0901] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
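The relationship between balances and pleasure or discomfort described above can be pictured with a small numerical sketch. The scoring formula below is an assumption introduced only for illustration and does not appear in the disclosure: each quantity is taken to be normalized to the range 0 to 1, and the score moves toward 1 (pleasure) as the balances approach the ideal and toward -1 (discomfort) as they deviate from it. The names `balance_valence`, `battery`, and `posture` are hypothetical.

```python
def balance_valence(current: dict[str, float], ideal: dict[str, float]) -> float:
    """Return a score in [-1, 1]: higher near the ideal, lower far from it.

    All quantities are assumed to be normalized to the range 0..1.
    """
    deviations = [abs(current[name] - ideal[name]) for name in ideal]
    mean_deviation = sum(deviations) / len(deviations)
    return 1.0 - 2.0 * mean_deviation


ideal = {"battery": 1.0, "posture": 0.5}
near = balance_valence({"battery": 0.9, "posture": 0.5}, ideal)
far = balance_valence({"battery": 0.1, "posture": 0.0}, ideal)
print(near, far)
```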
[0902] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0903] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
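The property that emotions located close together on the map have similar emotion values can be illustrated as follows. This is not the pre-trained neural network of the disclosure; it substitutes a Gaussian kernel over hypothetical map coordinates so that the effect of proximity is visible. The coordinates, the `width` parameter, and the function names are all assumptions.

```python
import math

# Hypothetical (x, y) positions on the map; x > 0 is the "situation" side,
# y > 0 is the "pleasure" side. These coordinates are illustrative only.
EMOTION_POSITIONS = {
    "reassured": (0.60, 0.10),
    "calm": (0.65, 0.15),
    "confident": (0.70, 0.20),
    "anxious": (0.60, -0.40),
    "desire": (-0.60, 0.30),
}


def emotion_values(point: tuple[float, float], width: float = 0.3) -> dict[str, float]:
    """Score each emotion by a Gaussian of its distance from `point`."""
    px, py = point
    return {
        name: math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * width ** 2))
        for name, (x, y) in EMOTION_POSITIONS.items()
    }


def determine_emotion(values: dict[str, float]) -> str:
    """Pick the emotion with the highest value."""
    return max(values, key=values.get)


values = emotion_values((0.65, 0.15))
print(determine_emotion(values))
```

In this sketch, "reassured," "calm," and "confident" receive similar values because their assumed positions are close together, while distant emotions such as "desire" score near zero.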
[0904] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0905] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0906] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.
[0907] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0908] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0909] The following types of processors can be used as hardware resources to perform specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource for performing specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0910] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0911] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0912] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0913] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0914] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0915] The following is further disclosed regarding the embodiments described above.
[0916] (Claim 1)
[0917] Means for collecting customer reservation information and response information;
[0918] means for generating optimal call content based on the collected information;
[0919] means for making an automated call using the generated call content;
[0920] means for analyzing the call results and summarizing the important points; and
[0921] means for proposing customer service based on the summarized information.
[0922] A system comprising the above means.
[0923] (Claim 2)
[0924] The system according to claim 1, further comprising means for notifying staff of customer service suggestions.
[0925] (Claim 3)
[0926] The system according to claim 1, further comprising means for providing the generated call content to the customer through an audio output device.
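The five means recited in claim 1 above can be read as stages of a single data flow: collect, generate, call, analyze and summarize, propose. The following is a minimal sketch of that flow under stated assumptions, not the implementation of this disclosure. All names (`CustomerRecord`, `generate_call_content`, `summarize_call`, `propose_service`, `run_pipeline`) are hypothetical, the generative AI and the telephony layer are passed in as plain callables, and keyword filtering stands in for AI-based summarization.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CustomerRecord:
    """Reservation and response information collected for one customer."""
    name: str
    reservation: str
    responses: List[str] = field(default_factory=list)


def generate_call_content(record: CustomerRecord,
                          generate: Callable[[str], str]) -> str:
    """Build a prompt from the collected information and hand it to a text generator."""
    past = "; ".join(record.responses) or "none"
    prompt = (
        f"Write a short confirmation call script for {record.name}. "
        f"Reservation: {record.reservation}. "
        f"Past responses: {past}."
    )
    return generate(prompt)


def summarize_call(transcript: str, keywords: List[str]) -> List[str]:
    """Keep transcript sentences that mention a keyword (stand-in for AI summarization)."""
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    return [s for s in sentences
            if any(k.lower() in s.lower() for k in keywords)]


def propose_service(summary: List[str]) -> str:
    """Turn the summarized points into a note for staff."""
    if not summary:
        return "No special handling noted."
    return "Staff note: " + " / ".join(summary)


def run_pipeline(record: CustomerRecord,
                 generate: Callable[[str], str],
                 place_call: Callable[[str], str],
                 keywords: List[str]) -> str:
    """Chain the stages: generate -> call -> summarize -> propose."""
    script = generate_call_content(record, generate)
    transcript = place_call(script)  # hypothetical: returns the call transcript
    return propose_service(summarize_call(transcript, keywords))
```

Passing the generator and the call function as parameters keeps the sketch independent of any particular AI model or telephony service; a notification step for staff, as in claim 2, could consume the string returned by `run_pipeline`.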
[0927] "Example 1"
[0928] (Claim 1)
[0929] Means for collecting customer reservation information;
[0930] means for creating personalized call content using an AI model based on the accumulated information;
[0931] means for executing an automated call using the generated call content;
[0932] means for analyzing the results of the call and summarizing the key points; and
[0933] means for proposing customer service based on the compiled information.
[0934] A system comprising the above means.
[0935] (Claim 2)
[0936] The system according to claim 1, further comprising means for communicating customer service proposals to the person in charge.
[0937] (Claim 3)
[0938] The system according to claim 1, further comprising means for providing the generated call content to the customer through an acoustic output device.
[0939] "Application Example 1"
[0940] (Claim 1)
[0941] Means for collecting customer reservation information and response information;
[0942] means for generating optimal call content based on the collected information;
[0943] means for making an automated call using the generated call content;
[0944] means for analyzing the call results and summarizing the important points;
[0945] means for proposing customer service based on the summarized information;
[0946] means for generating customized guidance based on customer characteristics; and
[0947] means for providing staff with information based on the guidance.
[0948] A system comprising the above means.
[0949] (Claim 2)
[0950] The system according to claim 1, further comprising means for notifying staff of customer service suggestions and means for providing information on special plans related to advance reservations.
[0951] (Claim 3)
[0952] The system according to claim 1, further comprising means for providing the generated call content to the customer through an audio output device, and means for adjusting the call content based on the customer's interests.
[0953] "Example 2 of combining an emotion engine"
[0954] (Claim 1)
[0955] Means for acquiring and saving user reservation data;
[0956] means for generating optimal call content with a generative AI model based on the acquired data and past sentiment analysis data;
[0957] means for inputting the generated prompt text into the generative AI model and customizing the call content in conjunction with an emotion engine;
[0958] means for executing an automated call using a voice output mechanism based on the generated call content;
[0959] means for analyzing the call results, extracting specific parts, and generating a summary; and
[0960] means for proposing user support based on the summary information.
[0961] A system comprising the above means.
[0962] (Claim 2)
[0963] The system according to claim 1, further comprising means for notifying staff of suggestions for user support.
[0964] (Claim 3)
[0965] The system according to claim 1, further comprising means for providing to a user the generated call content through an audio output mechanism that plays back the call content.
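Example 2 above feeds the acquired reservation data and past sentiment analysis data into a generative AI model as prompt text. The following is a minimal sketch of how such a prompt might be assembled, offered only as an illustration. The function name `build_prompt`, the dictionary layout of the reservation data, and the sentiment labels are all assumptions that do not appear in this disclosure.

```python
from typing import Dict, List


def build_prompt(reservation: Dict[str, str],
                 past_sentiments: List[str]) -> str:
    """Assemble prompt text from reservation fields and past sentiment labels."""
    lines = ["You are preparing a pre-visit confirmation call."]
    # List reservation fields in a stable (alphabetical) order.
    for key in sorted(reservation):
        lines.append(f"- {key}: {reservation[key]}")
    if past_sentiments:
        # Use the most frequent past label as a tone hint; ties are broken
        # alphabetically because the candidates are sorted first.
        dominant = max(sorted(set(past_sentiments)), key=past_sentiments.count)
        lines.append(
            f"Past calls were mostly '{dominant}'; adapt the tone accordingly."
        )
    else:
        lines.append("No past sentiment data; use a neutral, polite tone.")
    return "\n".join(lines)
```

The returned string would then be passed to whatever generative AI model the system uses; that call is deliberately left out because the disclosure does not name a specific model or API.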
[0966] "Application example 2 when combining with an emotion engine"
[0967] (Claim 1)
[0968] Means for collecting customer reservation information and response information;
[0969] means for generating optimal call content based on the collected information;
[0970] means for making an automated call using the generated call content;
[0971] means for analyzing the call results and summarizing the important points;
[0972] means for proposing customer service based on the summarized information;
[0973] means for estimating the emotional state of the customer;
[0974] means for adjusting the content of the call based on the estimated emotional state; and
[0975] means for suggesting recommended products and services in the store.
[0976] A system comprising the above means.
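The emotion-related means in claim 1 above (estimating the customer's emotional state, adjusting the call content, and suggesting products or services) can be illustrated with a deliberately simple sketch. Keyword matching stands in for the emotion engine here. The word lists, the opener phrases, and the names `estimate_emotion`, `adjust_call_content`, and `recommend` are hypothetical and are not taken from this disclosure.

```python
from typing import Dict, List

# Hypothetical cue words; a real emotion engine would use a trained model.
NEGATIVE_WORDS = {"worried", "upset", "annoyed", "late"}
POSITIVE_WORDS = {"excited", "great", "happy", "looking forward"}

OPENERS = {
    "negative": "We understand your concern and are here to help.",
    "positive": "We are delighted to welcome you.",
    "neutral": "Thank you for your reservation.",
}


def estimate_emotion(utterance: str) -> str:
    """Classify an utterance by counting cue words (substring match)."""
    text = utterance.lower()
    neg = sum(word in text for word in NEGATIVE_WORDS)
    pos = sum(word in text for word in POSITIVE_WORDS)
    if neg > pos:
        return "negative"
    if pos > neg:
        return "positive"
    return "neutral"


def adjust_call_content(script: str, emotion: str) -> str:
    """Prepend an opener chosen by the estimated emotional state."""
    return f"{OPENERS[emotion]} {script}"


def recommend(emotion: str, catalog: Dict[str, List[str]]) -> List[str]:
    """Look up in-store suggestions for the estimated emotion; empty if none."""
    return catalog.get(emotion, [])
```

Substring matching is crude (for example, "late" also matches "plate"), so the sketch shows only the control flow from estimation to adjustment and recommendation, not a usable classifier.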
[0977] (Claim 2)
[0978] The system according to claim 1, further comprising means for notifying staff of customer service suggestions.
[0979] (Claim 3)
[0980] The system according to claim 1, further comprising means for providing the generated call content to the customer through an audio output device.
[Explanation of Symbols]
[0981] 10, 210, 310, 410 Data Processing Systems
12 Data Processing Devices
14 Smart Devices
214 Smart Glasses
314 Headset-type terminal
414 Robots
Claims
1. A system comprising:
means for collecting customer reservation information and response information;
means for generating optimal call content based on the collected information;
means for making an automated call using the generated call content;
means for analyzing the call results and summarizing the important points;
means for proposing customer service based on the summarized information;
means for generating customized guidance based on customer characteristics; and
means for providing staff with information based on the guidance.
2. The system according to claim 1, further comprising means for notifying staff of customer service suggestions and means for providing information on special plans related to advance reservations.
3. The system according to claim 1, further comprising means for providing the generated call content to the customer through an audio output device, and means for adjusting the call content based on the customer's interests.