system

A system using machine learning and real-time emotional analysis optimizes communication response operations by predicting demand and adjusting staffing, addressing inefficiencies in conventional methods and enhancing customer satisfaction.

JP2026105403A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Conventional communication response operations rely on manual prediction of future communication volume and personnel allocation, leading to inefficiencies such as over-staffing or response delays due to inaccurate predictions and the inability to adapt to real-time changes in demand.

Method used

A system utilizing machine learning models to predict future communication volumes based on historical data and real-time monitoring, optimizing staffing to match demand dynamically, and incorporating real-time emotional analysis to enhance response quality.

Benefits of technology

Improves prediction accuracy and responsiveness, ensuring optimal staffing and enhanced customer satisfaction by adapting to fluctuating demand and emotional states in communication facilities.

✦ Generated by Eureka AI based on patent content.

[Figure 2026105403000001_ABST: abstract drawing]

Abstract

[Problem] To provide a system. [Solution] A system including: means for obtaining past communication records and activity information; means for constructing a machine learning model using the obtained information and predicting future traffic volume; means for monitoring the communication status of facilities in real time; means for optimizing resource allocation at each facility based on the prediction result and the monitoring result; means for notifying each facility of the optimized allocation result; means for an administrator to check the allocation plan via a user interface and modify it as necessary; and means for collecting feedback and updating the machine learning model to improve subsequent prediction accuracy.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance as a response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] In conventional communication response operations, predicting future communication volume and optimizing personnel allocation depend on the person in charge, so prediction accuracy and efficiency are insufficient. As a result, overstaffing, or response delays caused by understaffing, may occur and impede business operations. In addition, because it is difficult to optimize personnel allocation in response to real-time changes in the situation, a flexible response that follows demand is required. The purpose of this invention is to solve these problems and improve the efficiency and accuracy of communication response operations.

Means for Solving the Problems

[0005] The system according to this invention includes means for constructing a machine learning model using past communication records and business activity information to accurately predict future communication volumes. It also includes means for monitoring the communication status of communication facilities nationwide in real time and optimizing staffing based on predictions and monitoring results. As a result, prediction accuracy is improved, enabling rapid and appropriate staffing in response to fluctuating demand. Furthermore, by notifying each facility of the optimization results, the system facilitates smoother business operations.

[0006] "Communication records" refer to historical data of past calls and messages, including information such as the caller, recipient, content of the communication, and duration of the communication.

[0007] "Business activity information" refers to data related to the sales activities and event schedules conducted by companies and organizations, including information on campaigns and new product launch dates.

[0008] A "machine learning model" refers to an algorithmic framework that learns patterns from data and uses them to predict and classify future data.

[0009] "Means for predicting communication volume" refers to a method or apparatus for predicting future communication demand using past communication records and business activity information as input.

[0010] A "communication response facility" refers to a physical location such as a center or office that handles communication operations, primarily a facility for processing incoming calls and inquiries from customers.

[0011] "Means of real-time monitoring" refers to methods or devices that enable immediate understanding of ongoing communication conditions and rapid response to changes.

[0012] "Means for optimizing personnel allocation" refers to methods or devices for allocating and deploying the appropriate number of personnel and skills to meet communication demands.

[0013] "Means for notifying the results of optimization of staffing arrangements" refers to a method or device for transmitting optimized staffing information to relevant communication facilities.

Brief Description of the Drawings

[0014]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map on which multiple emotions are mapped.
[Figure 10] An emotion map on which multiple emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2, which incorporates an emotion engine.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2, which incorporates an emotion engine.

Embodiments for Carrying Out the Invention

[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0016] First, the terms used in the following description will be explained.

[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).

[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which the processor uses as work memory.

[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0020] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0021] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when three or more items are linked by "and/or."

[0022] [First Embodiment]

[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0031] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0035] This invention relates to a system that optimizes staffing by utilizing historical data and real-time conditions in order to improve the efficiency of communication response operations.

[0036] Data collection and prediction

[0037] The server collects communication records and business activity information from databases and external APIs. This data is based on past usage and future business events and contains information useful for predicting traffic volume. The server uses this data to train machine learning models to predict future traffic volume with high accuracy. This predictive information is used by the server to adjust optimal staffing.

[0038] Real-time monitoring and deployment optimization

[0039] The server acquires real-time communication data from communication response facilities and analyzes the communication load at each facility. This allows for an immediate understanding of how many incoming calls each facility is handling. Based on this information, the server optimizes staffing in real time according to the predicted communication volume. Each communication response facility is notified of the optimization results immediately, enabling rapid reassignment of personnel.

[0040] User interface and feedback loop

[0041] Users can use their devices to view the predictions and staffing plans calculated by the server. The devices display the optimization results sent from the server, and users can approve them as needed or make modifications as necessary. Furthermore, the server operates a process to collect the results and user feedback and update the machine learning model to improve the accuracy of future predictions.

[0042] As a concrete example, if a special weekend campaign is scheduled, the server predicts the number of incoming calls before and after the campaign and instructs each customer service facility on the necessary staffing arrangements based on that prediction. The introduction of this system makes it possible to improve the efficiency of communication support operations and ensure high response quality.

[0043] The following describes the processing flow.

[0044] Step 1:

[0045] The server extracts data from a database of past communication records for a certain period, compares it with business activity information, and generates necessary predictive variables. If necessary, it retrieves the latest sales event information via external APIs.

[0046] Step 2:

[0047] The server performs machine learning preprocessing on the collected data. This preprocessing includes data cleansing, imputation of missing values, and removal of outliers. It also normalizes the data as a time series and extracts the features necessary for training.
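The preprocessing in Step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the disclosure: the function name `preprocess`, the forward-fill imputation, the median-based clipping rule (used here in place of outright removal so the series keeps its hourly spacing), and the min-max scaling are all choices made for this example.

```python
from statistics import median

def preprocess(hourly_calls):
    """Impute gaps, clip outliers, and min-max scale an hourly call series.

    hourly_calls: list of numbers, with None marking a missing hour
    (hypothetical input format chosen for this sketch).
    """
    known = [v for v in hourly_calls if v is not None]
    if not known:
        raise ValueError("series has no observed values")

    # Imputation: forward-fill gaps; a leading gap uses the median of known values.
    filled, last = [], median(known)
    for v in hourly_calls:
        last = v if v is not None else last
        filled.append(float(last))

    # Outlier handling: clip values more than 5 median absolute deviations away.
    med = median(filled)
    mad = median(abs(v - med) for v in filled) or 1.0
    lo, hi = med - 5 * mad, med + 5 * mad
    clipped = [min(max(v, lo), hi) for v in filled]

    # Normalization: scale to the range [0, 1].
    cmin, cmax = min(clipped), max(clipped)
    span = (cmax - cmin) or 1.0
    return [(v - cmin) / span for v in clipped]
```

Clipping rather than deleting keeps every hour present, which matters if the downstream model expects an evenly spaced time series.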

[0048] Step 3:

[0049] The server builds a machine learning model and trains it using preprocessed data. Specifically, it uses a time series forecasting model (e.g., an LSTM network) to predict future traffic volume based on historical data. The parameters of this model are optimized based on the training data.

[0050] Step 4:

[0051] The server inputs new data into the trained model and predicts future traffic volume. The prediction results are output as expected traffic volume per hour and stored in a management database.
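As a rough illustration of Step 4's per-hour output, the sketch below swaps the LSTM network named in Step 3 for a much simpler stand-in: the historical mean for each hour of the week, optionally scaled by a campaign factor. The function names, the `(hour_of_week, call_count)` input format, and the `campaign_uplift` parameter are assumptions made for this example only.

```python
from collections import defaultdict

def fit_hour_of_week_profile(history):
    """Average call count per hour-of-week slot (0..167).

    history: iterable of (hour_of_week, call_count) pairs.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for slot, calls in history:
        sums[slot] += calls
        counts[slot] += 1
    return {slot: sums[slot] / counts[slot] for slot in sums}

def forecast(profile, slots, campaign_uplift=1.0):
    """Expected calls for each requested slot.

    Slots never seen in the history fall back to the overall mean.
    campaign_uplift is a hypothetical multiplier for scheduled events.
    """
    overall = sum(profile.values()) / len(profile)
    return [profile.get(s, overall) * campaign_uplift for s in slots]
```

A baseline of this kind is also useful as a reference point when judging whether a trained sequence model actually improves on simple seasonality.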

[0052] Step 5:

[0053] The server acquires real-time incoming call data from each communication response facility and calculates the current communication load. This makes it possible to understand the response capacity and the number of unanswered calls at each facility in real time.
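One possible shape for the load calculation in Step 5 is sketched below. The `FacilitySnapshot` fields and the occupancy definition (calls in progress divided by agents on duty) are assumptions for illustration; the disclosure does not specify how load is computed.

```python
from dataclasses import dataclass

@dataclass
class FacilitySnapshot:
    """Hypothetical real-time reading from one communication response facility."""
    facility_id: str
    agents_on_duty: int
    calls_in_progress: int
    calls_waiting: int

def load_summary(snapshot):
    """Summarize current load: occupancy, spare capacity, and unanswered calls."""
    if snapshot.agents_on_duty > 0:
        occupancy = snapshot.calls_in_progress / snapshot.agents_on_duty
    else:
        occupancy = 1.0  # no agents on duty: treat the facility as fully loaded
    return {
        "facility_id": snapshot.facility_id,
        "occupancy": occupancy,
        "free_agents": max(snapshot.agents_on_duty - snapshot.calls_in_progress, 0),
        "unanswered": snapshot.calls_waiting,
    }
```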

[0054] Step 6:

[0055] The server integrates the traffic volume forecast data with the real-time load data and runs an optimization algorithm to determine the staffing allocation. This makes it possible to allocate to each facility the number of personnel appropriate to its demand.
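The disclosure does not name the algorithm used in Step 6. As one commonly used option for call-handling operations, the sketch below applies the Erlang C formula to find the smallest number of agents that meets a service-level target for a forecast hourly volume. The function names, the 80 percent within 20 seconds default target, and the average handle time input are assumptions for this example.

```python
import math

def erlang_c_wait_probability(agents, load):
    """Probability that an arriving call must wait (Erlang C).

    load is the offered traffic in erlangs: arrival rate times handle time.
    """
    if agents <= load:
        return 1.0  # unstable queue: every call waits
    term, below = 1.0, 0.0
    for k in range(agents):
        below += term            # adds load**k / k!
        term *= load / (k + 1)   # becomes load**(k+1) / (k+1)!
    top = term * agents / (agents - load)  # term is now load**agents / agents!
    return top / (below + top)

def agents_needed(calls_per_hour, avg_handle_sec,
                  target_service_level=0.8, answer_within_sec=20):
    """Smallest agent count whose service level meets the target."""
    load = calls_per_hour * avg_handle_sec / 3600.0
    n = max(1, math.floor(load) + 1)
    while True:
        p_wait = erlang_c_wait_probability(n, load)
        service_level = 1.0 - p_wait * math.exp(
            -(n - load) * answer_within_sec / avg_handle_sec)
        if service_level >= target_service_level:
            return n
        n += 1
```

For a forecast of 100 calls per hour with a 180-second average handle time, `agents_needed(100, 180)` evaluates to 8 under these defaults.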

[0056] Step 7:

[0057] The terminal displays the optimized staffing plan received from the server and prompts the user for status confirmation. The user can then approve the presented plan or make modifications as needed.

[0058] Step 8:

[0059] The server collects the executed deployment plan and its results and stores them as evaluation data. This data is used to retrain the model in subsequent iterations, forming a feedback loop aimed at improving prediction accuracy.
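One way the evaluation data from Step 8 could drive the feedback loop is to compare forecast and observed volumes and trigger retraining when the error grows. The sketch below uses mean absolute percentage error; the metric, the function names, and the 15 percent threshold are assumptions for illustration and do not come from the disclosure.

```python
def mean_absolute_percentage_error(predicted, actual):
    """MAPE over hours with non-zero actual volume."""
    pairs = [(p, a) for p, a in zip(predicted, actual) if a > 0]
    if not pairs:
        raise ValueError("no hours with non-zero actual volume")
    return sum(abs(p - a) / a for p, a in pairs) / len(pairs)

def needs_retraining(predicted, actual, threshold=0.15):
    """True when the forecast error exceeds the (hypothetical) threshold."""
    return mean_absolute_percentage_error(predicted, actual) > threshold
```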

[0060] (Example 1)

[0061] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0062] In communication support operations, efficiently allocating personnel while considering historical data and real-time conditions is difficult to optimize and can lead to wasted resources and a decline in response quality. To solve this problem, there is a need for a system that can optimize personnel allocation with high precision by utilizing historical data and real-time communication conditions.

[0063] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0064] In this invention, the server includes means for acquiring communication records and business information, means for constructing a data analysis model using the acquired information and predicting future communication volume, and means for continuously monitoring the communication status of communication facilities. This enables effective optimization of workforce allocation by utilizing historical and real-time data.

[0065] "Communication records" refer to information that details past communication activities, and may include the date and time of communication, sender and receiver, and content of the communication.

[0066] "Business information" refers to data related to business activities, including information such as sales events, customer service status, and working hours.

[0067] A "data analysis model" is a model that uses mathematical or statistical methods to predict future trends based on collected data.

[0068] "Continuous monitoring" involves continuously acquiring and analyzing data in real time, with the aim of always understanding the latest situation.

[0069] "Labor allocation" is the process of determining the number of people and roles required for a specific task or operation, and then assigning personnel based on that.

[0070] "Optimization results" refer to the outcome of deriving the most efficient solution or arrangement for a given purpose, based on prediction and analysis.

[0071] "Notifying" refers to the act of conveying certain information in a way that is understandable to a third party, and specifically means providing information output from a system to each facility.

[0072] This invention aims to build a system in which servers, terminals, and users cooperate to achieve efficient communication response operations. The main elements are data collection, utilization of machine learning models, real-time monitoring, deployment optimization, and a feedback loop.

[0073] Data collection

[0074] The server is equipped with a data collection module for acquiring communication records and business information. The server uses a database system (e.g., MySQL®, PostgreSQL) to efficiently import past communication activity records and sales event data. It also acquires external data through specific business APIs to enhance the information on business activities.

[0075] Utilization of machine learning models

[0076] The server builds machine learning models based on the collected data. Specifically, it uses open-source libraries such as TensorFlow® and scikit-learn to analyze past traffic and predict future demand. The server then uses the resulting predicted data as foundational information to optimize staffing.

[0077] Real-time monitoring

[0078] The server monitors the communication status from communication facilities in real time. The server periodically acquires data from sensors and communication systems located at each facility and processes it to understand the current workload.

[0079] Optimization of placement

[0080] The server optimizes staffing by combining predictions from machine learning models with real-time data. The server utilizes algorithms to quickly calculate the required workforce for each facility and notifies the facility of the results.

[0081] User Interface

[0082] Users can use a terminal to view forecast information and staffing plans generated by the server. The terminal has an intuitive interface, and users can modify the staffing plan as needed and provide feedback to the server.

[0083] As a concrete example, the server instructs each communication facility in real time to allocate the necessary personnel to handle the expected increase in traffic during special weekend campaigns. Furthermore, an example of a prompt message to the generative AI model would be: "We want to make highly accurate predictions of traffic volume in preparation for the special weekend campaign. Please make predictions based on past campaign data and propose a corresponding personnel allocation plan."

[0084] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0085] Step 1:

[0086] The server retrieves communication records and business information from databases and external APIs. As input, the server interacts with these data sources to collect data on past communication activity and business events. This allows the server to obtain information that forms the basis for predicting future communication volumes. Specifically, it periodically calls APIs and adds new information to the database.
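The periodic "call the API, add new information to the database" operation in Step 1 might look like the sketch below. It uses Python's built-in `sqlite3` module purely for a self-contained example (the disclosure mentions MySQL and PostgreSQL), and the table layout, column names, and the `fetch_events` callable standing in for the external API are all hypothetical.

```python
import sqlite3

def init_db(conn):
    """Create the (hypothetical) business event table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS business_events (
               event_id  TEXT PRIMARY KEY,
               name      TEXT NOT NULL,
               starts_on TEXT NOT NULL
           )"""
    )

def count_events(conn):
    return conn.execute("SELECT COUNT(*) FROM business_events").fetchone()[0]

def ingest_events(conn, fetch_events):
    """Store events returned by fetch_events(); return how many were new.

    fetch_events stands in for an external API client and must return a list
    of dicts with keys event_id, name, starts_on. Known event_ids are skipped,
    so the function is safe to call on a schedule.
    """
    before = count_events(conn)
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO business_events (event_id, name, starts_on) "
            "VALUES (:event_id, :name, :starts_on)",
            fetch_events(),
        )
    return count_events(conn) - before
```

Keying on `event_id` makes repeated polling idempotent: a second call with the same API response adds nothing.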

[0087] Step 2:

[0088] The server trains a machine learning model based on the collected data. Historical communication data is used as input, and the server utilizes libraries such as TensorFlow for training. The output is a model that predicts future communication volume. Specifically, it prepares a training dataset and repeats calculations to optimize parameters.

[0089] Step 3:

[0090] The server acquires real-time communication data from communication facilities and analyzes the communication load of each facility. It receives real-time data on the number of communications and response times as input, and analyzes this data to understand the current situation. As output, it generates information on the current load status of each facility. Specifically, it periodically collects sensor data and analyzes it in real time.

[0091] Step 4:

[0092] The server optimizes staffing based on the results of a predictive model and real-time data. Predictive data and real-time load data are used as input, and the server executes an optimization algorithm. The output is an optimal staffing plan, which is then notified to each facility. The specific operation involves executing the algorithm, performing calculations, and transmitting the results over the network.
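When a fixed headcount has to be divided among facilities, one simple reading of Step 4 is a proportional split by expected demand. The sketch below uses the largest-remainder method so the whole-number allocations always sum to the available staff. The function name and the demand dictionary are assumptions for this example; the disclosure does not specify the optimization algorithm.

```python
import math

def allocate_staff(total_staff, demand_by_facility):
    """Split total_staff across facilities in proportion to demand.

    Uses the largest-remainder method: give each facility the floor of its
    proportional share, then hand leftover staff to the largest fractions.
    """
    total_demand = sum(demand_by_facility.values())
    if total_demand <= 0:
        raise ValueError("total demand must be positive")

    quotas = {f: total_staff * d / total_demand
              for f, d in demand_by_facility.items()}
    plan = {f: math.floor(q) for f, q in quotas.items()}

    leftover = total_staff - sum(plan.values())
    by_fraction = sorted(quotas, key=lambda f: quotas[f] - plan[f], reverse=True)
    for f in by_fraction[:leftover]:
        plan[f] += 1
    return plan
```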

[0093] Step 5:

[0094] Users review the predictive information and optimized staffing plans generated by the server through their terminals. As input, users receive information provided on their terminals and make decisions based on it. As output, they return feedback to the server regarding the staffing plan as needed. Specifically, this includes reviewing data using a visual interface and making corrections.

[0095] Step 6:

[0096] The server collects execution results and user feedback to update the machine learning model. It takes feedback data and execution results as input and uses them to retrain the model to improve its accuracy. The output is an updated model that enables more accurate predictions. Specific operations include analyzing feedback and adjusting parameters based on the retrained model.

[0097] (Application Example 1)

[0098] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0099] In modern society, situations where the communication load is unclear, such as in communication support and security services, make appropriate resource allocation difficult. This can lead to a decline in service quality and response speed, resulting in reduced customer satisfaction. Furthermore, facilities require rapid personnel redeployment in response to changing circumstances, but there is a lack of systems to effectively achieve this. To solve these problems, dynamic resource management based on real-time and predictive data is necessary.

[0100] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0101] In this invention, the server includes: means for acquiring past communication records and activity information; means for building a machine learning model using the acquired information to predict future communication volume; means for monitoring the facility's communication status in real time; means for administrators to check the deployment plan and make necessary modifications via a user interface; means for collecting feedback and updating the machine learning model to improve prediction accuracy for subsequent operations; means for analyzing notifications and warnings in specific areas based on real-time data and optimizing resource allocation accordingly; and means for strengthening the generative AI model based on feedback on resource allocation optimization and providing prompt messages with which users generate specific plans. This enables dynamic and optimal resource allocation and improved response quality.

[0102] "Communication records" refer to historical information such as past communication content, time, and connection status.

[0103] "Activity information" refers to data related to the operation of a company or service, such as events, campaigns, and daily work schedules.

[0104] A "machine learning model" is an algorithm or mathematical model built to predict future situations using past data.

[0105] "Communication status" refers to the current load and connection status of each facility where communication is taking place.

[0106] "Resource allocation" refers to actions and plans that appropriately distribute limited resources such as personnel and equipment.

[0107] "User interface" refers to the screens or control panels that system users can directly operate or check.

[0108] "Feedback" refers to information collected from system usage results and user opinions and evaluations, which is used to improve and optimize the system.

[0109] "Real-time data" refers to data about current situations and events that are generated and updated almost instantly.

[0110] A "generative AI model" is a system that uses artificial intelligence technology to analyze and understand data and generate solutions or predictions for specific problems.

[0111] A "prompt message" is a type of text used by users to give instructions or questions to a system.

[0112] In this embodiment, the server is operated using a cloud service (e.g., AWS® or Google® Cloud). The server acquires communication records and activity information from each facility and builds a machine learning model based on this data. This utilizes software such as Python's Scikit-learn and TensorFlow. Using the machine learning model, it is possible to predict future communication volume with high accuracy. To acquire real-time data, the server collaborates with the communication network of each facility and analyzes the current communication status.

[0113] Smartphones and tablet devices, acting as terminals, provide a user interface for administrators, displaying optimized resource allocation plans. Using frameworks like React Native, users can review the allocation plan and make modifications as needed. The terminals feed the results back to the server in real time, and the server continuously updates its machine learning model based on this information.

[0114] For example, if an increase in traffic volume is predicted for a specific region over the weekend, the server will adjust the personnel allocation in that region in advance to ensure appropriate staffing levels. A possible prompt message used by the user in this case might be, "Please provide an optimal staffing plan to handle the increased traffic volume over the weekend." This enables efficient operation of communication support and security services, and ensures high response quality.

[0115] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0116] Step 1:

[0117] The server retrieves historical communication records and activity information through databases and external APIs. This input data includes call volume, duration, and event information. The server cleans and preprocesses this data before arranging it in the appropriate format. At this stage, data processing such as noise removal and format standardization is performed.

[0118] Step 2:

[0119] The server builds a machine learning model using the processed data. First, it generates a training dataset using frameworks such as Python's Scikit-learn or TensorFlow. Based on this dataset, the server trains a generative AI model to predict future traffic volume. In this process, it extracts data features, selects an appropriate algorithm, and trains the model. The output is the model used for prediction.

[0120] Step 3:

[0121] The server collects real-time data from each communication facility. This real-time data includes current communication load and the number of calls in progress. The server analyzes this data and performs data processing to understand the current load situation. The output of this processing is real-time load information for each facility.

[0122] Step 4:

[0123] The terminal optimizes resource allocation based on predictions and real-time data received from the server. Specifically, it executes an optimization algorithm that allocates additional resources to areas with high load. At this stage, it performs data calculations to improve the allocation of personnel and equipment based on the set conditions and presents the optimal allocation plan as output.
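
The specification does not name the optimization algorithm. One simple possibility, shown here purely as a sketch, is a greedy rule that hands each spare operator to the facility with the highest load per operator. The facility names, the input layout, and the function `allocate_extra_staff` are hypothetical.

```python
def allocate_extra_staff(loads, extra_staff):
    """Greedy sketch: `loads` maps facility -> (calls_per_hour, operators)."""
    staff = {name: ops for name, (_calls, ops) in loads.items()}

    def load_per_operator(name):
        calls = loads[name][0]
        # A facility with no operators is treated as maximally overloaded.
        return calls / staff[name] if staff[name] else float("inf")

    for _ in range(extra_staff):
        busiest = max(staff, key=load_per_operator)
        staff[busiest] += 1  # assign one spare operator, then re-evaluate
    return staff

plan = allocate_extra_staff(
    {"facility_a": (120, 4), "facility_b": (60, 4)}, extra_staff=2
)
```

A production system would more likely use a constrained optimizer that accounts for skills, shifts, and equipment; the greedy rule only illustrates the idea of directing resources to areas with high load.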

[0124] Step 5:

[0125] The user reviews the optimized resource allocation plan displayed on the terminal and can modify or approve it through the user interface. The input to this step is the displayed plan, and the output is the plan as modified and applied by the user.

[0126] Step 6:

[0127] The server collects the plans that users have modified and applied as feedback. It integrates this feedback and updates the machine learning model to improve the accuracy of subsequent predictions. The server retrains the generative AI model to improve its performance. The input is the feedback, and the output is an updated prediction model with improved accuracy.

[0128] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0129] This invention relates to a system that recognizes the emotional state of users in communication support operations and utilizes this information to optimize responses and staffing. This system combines communication volume prediction based on past communication records and business activity information with an emotion engine for analyzing the emotional state of users.

[0130] Data analysis and emotion recognition

[0131] The server activates an emotion engine to analyze voice and text data acquired in real time from each communication facility. The emotion engine analyzes the words used by the user, their tone of voice, and the flow of the conversation to recognize the user's emotional state (e.g., joy, anger, anxiety). This information is stored in a database along with other communication status data and further analyzed.

[0132] Optimizing customer service and staffing

[0133] The server adjusts its response in real time based on the user's emotional state, as recognized by the emotion engine. For example, it dynamically allocates the most suitable resources, such as assigning a highly specialized operator to a user exhibiting a specific emotional state. Such adjustments are crucial for providing a better user experience and improving customer satisfaction.

[0134] User interface and feedback loop

[0135] The terminal receives sentiment analysis results and optimized response instructions from the server and presents them to the user. The administrator, acting as the user, can also contribute to improving the overall system by evaluating the quality of responses based on the sentiment analysis results and providing feedback. The server uses this feedback to continuously improve the emotion engine's algorithm, enabling more accurate predictions and responses.

[0136] For example, if a customer calls in with a complaint and the emotion engine analyzes the call as "angry," the server can immediately assign the most suitable customer support representative based on that profile. This implementation improves the quality of service and facilitates smoother problem resolution.

[0137] The following describes the processing flow.

[0138] Step 1:

[0139] The server collects user voice and text data in real time from the communication reception facility. This data is acquired at appropriate times during the call and preprocessed for sentiment analysis.

[0140] Step 2:

[0141] The server uses an emotion engine to analyze the collected audio and text data. This analysis considers the user's tone of voice, word choice, and sentence flow to detect emotional states such as joy, anger, and sadness.

[0142] Step 3:

[0143] The server feeds back the recognized emotional state to the communication response facility and uses it as data to determine the appropriate response policy. It also prioritizes assigning calls to operators with specific expertise and capabilities based on the emotional state.
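
A minimal sketch of the emotion-based assignment described in this step is given below. The mapping `REQUIRED_SKILL`, the operator fields, and the function `pick_operator` are assumptions introduced for illustration and do not appear in the specification.

```python
# Hypothetical mapping from a detected emotion to the skill it calls for.
REQUIRED_SKILL = {"anger": "complaint_handling", "anxiety": "reassurance"}

def pick_operator(emotion, operators):
    """Return the name of the free operator best suited to the emotion."""
    needed = REQUIRED_SKILL.get(emotion, "general")
    free = [op for op in operators if op["free"]]
    if not free:
        return None  # nobody available; the call stays queued
    # Prefer operators with the needed skill, then more experienced ones.
    best = max(free, key=lambda op: (needed in op["skills"], op["years"]))
    return best["name"]

operators = [
    {"name": "A", "skills": {"general"}, "years": 10, "free": True},
    {"name": "B", "skills": {"complaint_handling"}, "years": 2, "free": True},
    {"name": "C", "skills": {"complaint_handling"}, "years": 8, "free": False},
]
assigned = pick_operator("anger", operators)
```

In this sketch, skill match takes priority over experience, so a less experienced complaint specialist is chosen over a more experienced generalist; other priority rules are equally possible.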

[0144] Step 4:

[0145] The terminal displays the sentiment analysis results and suggested response plans sent from the server. The administrator, as the user, can review this information and send instructions to the response staff as needed.

[0146] Step 5:

[0147] Users can monitor the progress of actual interactions and immediately modify their responses in anticipation of changes in customer emotions or reactions. This operation is performed via a terminal, and the system is prepared to incorporate that information and make further optimizations.

[0148] Step 6:

[0149] The server analyzes the accumulated emotion recognition data and the final response results to update the emotion engine's algorithm. This feedback loop is essential for improving the accuracy of future analyses and the ability to formulate response strategies.

[0150] (Example 2)

[0151] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0152] In customer service operations, it is difficult to optimize staffing based on communication volume and customer emotional state, leading to decreased customer satisfaction and increased operator workload. Conventional systems do not adequately predict communication volume or dynamically allocate staff based on emotional state, leaving challenges in achieving effective customer service.

[0153] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0154] In this invention, the server includes means for acquiring past communication records and business activity information, means for constructing an inference model using the acquired information and predicting future communication volume, means for monitoring the communication status of multiple communication response facilities in real time, means for analyzing user speech content and voice data to recognize emotional states, means for optimizing staffing at each response facility based on prediction results, monitoring results, and emotional state analysis results, and means for notifying each communication response facility of the optimized staffing results. This enables optimal staffing based on dynamic prediction of communication volume and recognition of emotional states, thereby improving customer satisfaction and reducing the burden on operators.

[0155] "Communication records" is a general term for data that includes the content, duration, and participant information of past phone calls and messages.

[0156] "Business activity information" refers to information about the daily business processes carried out by a company or organization, and includes data such as transaction history and interactions with customers.

[0157] An "inference model" is a computational model, either statistical or machine learning-based, that uses past data to predict future events or states.

[0158] A "communication support facility" is a dedicated facility for handling inquiries, orders, complaints, and other matters from customers and users.

[0159] "Emotional state" refers to the state in which a person expresses their emotions, and includes psychological states such as joy, anger, and anxiety.

[0160] "Personnel allocation" refers to placing the appropriate staff or operators in the appropriate locations according to the needs of the work.

[0161] "Optimization methods" refer to the process of making adjustments using calculations and algorithms to maximize the efficiency and effectiveness of the subject.

[0162] "Means of notification" refers to a method or system for transmitting information or instructions to a target person.

[0163] This invention relates to a system that recognizes the emotional state of users in communication support operations and optimizes response methods and staffing. The server acts as the central hub of this system, operating to implement various functions.

[0164] The server first collects communication records and business activity information, and then builds an inference model based on this information. The inference model is implemented using a programming language such as Python or R, and its associated machine learning libraries (e.g., TensorFlow, scikit-learn). This gives the system the ability to predict future communication volume.

[0165] Next, the server receives voice and text data in real time from multiple communication reception facilities. This data is obtained through VoIP services and messaging APIs. The received voice data is converted to text using speech recognition software such as the Google Speech-to-Text API. Then, the emotion engine is activated to analyze the user's speech and tone of voice, and recognize their emotional state in real time. The emotion engine is implemented using NLP (Natural Language Processing) technology and sentiment analysis algorithms.

[0166] The analyzed emotional states are stored in a database and analyzed in parallel in combination with predicted traffic volume data and monitoring data on communication status. This allows the server to determine the optimal response method based on the user's emotional state at each communication response facility and dynamically adjust staffing accordingly.

[0167] The terminal receives instructions from the server and visually presents optimized response methods to operators and administrators. Administrators, as users, can evaluate the quality of responses based on the presented information and provide feedback to further improve the system.

[0168] As a concrete example, if the emotion engine detects "anger" in a customer complaint, the server immediately assigns the task to the most suitable customer support representative based on that information. This improves the quality of service and facilitates smoother problem resolution.

[0169] Examples of prompt statements to input into a generative AI model include the following:

[0170] "Explain how to analyze a customer's emotional state in real time during a phone call and dynamically assign the most suitable operator based on the results."

[0171] This invention makes it possible to improve the user experience in communication support operations and to increase the overall efficiency of communication operations.

[0172] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0173] Step 1:

[0174] The server retrieves communication records and business activity information from the database. The retrieved data includes information such as past communication patterns and business trends. The input is past call history and customer interaction records, and the output is a dataset that organizes and aggregates these. The server then performs preprocessing based on this data to build an inference model.

[0175] Step 2:

[0176] The server uses preprocessed data and leverages machine learning libraries (e.g., TensorFlow) to build an inference model. Here, it receives a dataset as input, performs statistical and regression analysis, and generates a model to predict future traffic volume as output. Specifically, the server splits the data into a training set and a test set, evaluates and improves the model, and outputs an optimized model.
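
The train/test workflow in this step can be illustrated with a dependency-free sketch. It fits a straight line by ordinary least squares instead of using TensorFlow, so it is a stand-in for the regression analysis and not the model the specification has in mind. The data and the names `fit_line` and `mean_abs_error` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    """Average absolute prediction error of `model` on held-out data."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: day index -> number of calls received that day.
days = list(range(10))
calls = [100 + 5 * d for d in days]

# Chronological split: train on the first 8 days, test on the last 2.
split = 8
model = fit_line(days[:split], calls[:split])
test_error = mean_abs_error(model, days[split:], calls[split:])
```

The split is chronological because shuffling time-ordered traffic data would let the model see the future during training.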

[0177] Step 3:

[0178] The server receives voice and text data in real time from the communication reception facility. The input data is sent via VoIP services and chat APIs, and the output is obtained in the form of text-converted voice data. The server uses speech recognition software to perform the conversion to text format.

[0179] Step 4:

[0180] The server analyzes the text data using an emotion engine. The input is the text data output from step 3, and the emotion recognition algorithm analyzes the content and context of the statements in the data, obtaining the user's emotional state (e.g., joy, anger, anxiety) as output. Specifically, the server uses NLP techniques to perform emotion scoring.
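
Emotion scoring can take many forms. The sketch below uses a tiny keyword lexicon, which is an assumption made for illustration; an actual emotion engine would more plausibly rely on a trained NLP model. The lexicon contents and the function names are hypothetical.

```python
import re

# Toy lexicon (assumption): stands in for a trained emotion model.
LEXICON = {
    "anger": {"angry", "unacceptable", "terrible", "furious"},
    "anxiety": {"worried", "afraid", "unsure", "urgent"},
    "joy": {"thanks", "great", "happy", "perfect"},
}

def score_emotions(text):
    """Count lexicon hits per emotion in the transcribed text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        emotion: sum(1 for w in words if w in vocab)
        for emotion, vocab in LEXICON.items()
    }

def top_emotion(text):
    """Return the highest-scoring emotion, or "neutral" if nothing matched."""
    scores = score_emotions(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"

label = top_emotion("This is unacceptable, I am furious")
```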

[0181] Step 5:

[0182] The server analyzes the results of emotional state analysis in combination with predicted traffic volume data to determine the optimal response method and staffing. The inputs are emotional state data and predicted traffic volume data, and the output provides operator placement instructions and guidelines for response methods. The server uses an algorithm to perform placement simulations and generate an optimized placement strategy.

[0183] Step 6:

[0184] The terminal displays the response instructions received from the server to the operator. The input is deployment instruction data from the server, and the output is operation guidelines displayed on the user interface. The terminal visualizes the instructions so that the operator can easily confirm them.

[0185] Step 7:

[0186] The user, acting as the administrator, evaluates the quality of interactions and provides necessary feedback to the server. The input consists of the interaction results observed by the administrator and the administrator's evaluation comments. The output is the feedback information sent to the server, which is used to improve the emotion recognition algorithm and the placement optimization model. Specific actions include filling out a feedback form.

[0187] (Application Example 2)

[0188] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0189] Modern customer service requires flexible responses that adapt to the diverse emotional states of users. However, conventional systems have struggled to recognize user emotions in real time and adjust their responses accordingly. Efficient staffing based on this information has also been difficult, so that service quality has not improved sufficiently and operational efficiency has not been maximized.

[0190] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0191] In this invention, the server includes means for acquiring past communication records and business activity information, means for analyzing voice and text data acquired in real time to recognize the user's emotional state, and means for adjusting the response method in real time based on the recognized emotional state. This makes it possible to improve the quality of service and increase customer satisfaction by providing responses that are in line with the user's emotions.

[0192] "Communication records" refer to information that includes past interactions with the user, such as text data and audio data.

[0193] "Business activity information" refers to data related to a company's operations, including business-related information such as sales and customer information.

[0194] A "machine learning model" is a mathematical algorithm that analyzes large amounts of data and uses those patterns to make predictions and classifications.

[0195] A "communication support facility" is a place where communication with users takes place directly via telephone, online chat, etc.

[0196] "Personnel allocation" means placing the necessary personnel in the appropriate locations to perform tasks efficiently.

[0197] An "emotion engine" is an algorithm used to identify a user's emotional state from voice and text data.

[0198] "Response method" refers to the specific means and protocols used when interacting with users, including the personnel and information involved.

[0199] The system for realizing this invention consists of an integrated set of multiple components. The server first acquires historical communication records and business activity information, and uses this to build a machine learning model. This model is used to predict future communication volumes and serves as the basis for efficiently processing data.

[0200] In environments requiring real-time communication responses, the server acquires voice and text data from multiple communication response facilities. This data is input to analyze the user's emotional state using an emotion engine. The emotion engine utilizes an analysis service provided on the cloud (for example, the Emotion Recognition API of Azure Cognitive Services).

[0201] The analyzed emotion data is stored in the server's database and forms the basis for adjusting response methods in real time. This adjustment also contributes to dynamically assigning the appropriate operator, providing the user with the best possible response.

[0202] Furthermore, the terminal, based on instructions from the server, presents the user with response methods through its interface and collects feedback from the user. This feedback is used to improve the emotion engine's algorithm.

[0203] For example, if the server analyzes the user's voice and determines that they are "happy," it can notify the device to provide relaxing content. This allows the user to have a more relaxing time at home.

[0204] An example of a prompt message could be, "If the user's emotion is analyzed as 'fatigue,' what is the best course of action to take?" This could be used to provide feedback to the generative AI model.

[0205] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0206] Step 1:

[0207] The server retrieves historical communication records and business activity information from communication facilities. This data is passed to the server as input, and the information is stored as a dataset for use in machine learning algorithms as output. Specifically, it executes queries from the database, organizes the retrieved data, and prepares it for subsequent analysis processes.

[0208] Step 2:

[0209] The server uses the acquired data to build a machine learning model and predict future traffic volume. A well-organized dataset is fed to the machine learning algorithm as input, and a traffic volume prediction model is generated as output. Specifically, the server uses a data analysis library with Python to train the model. During this process, it analyzes past trends and patterns.

[0210] Step 3:

[0211] The server acquires voice and text data in real time from communication reception facilities nationwide. Voice streams and text messages are received by the server as input, and this data is transferred to the emotion engine as output. Specifically, the data is streamed via an API and prepared for immediate processing.

[0212] Step 4:

[0213] The server uses an emotion engine to analyze voice and text data and recognize the user's emotional state. The emotion engine receives voice and text data as input, and the analyzed emotional state is recorded in a database as output. Specifically, it uses speech recognition software to convert speech to text and an emotion analysis algorithm to classify the emotional state.

[0214] Step 5:

[0215] The server adjusts its response method in real time based on the recognized emotional state. Analysis results are used as input, and the adjusted response procedure and assigned personnel selection are sent to the response facility as output. Specifically, the system selects an appropriate response script and sends instructions to the assigned operator. It also suggests recommended actions for adjusting the response content.

[0216] Step 6:

[0217] The server presents real-time response instructions to the user via the terminal and collects feedback. Response instructions are displayed on the terminal as input, and feedback data is sent to the server as output. In terms of specific operation, the interface visually displays instructions, and user feedback is received through an interactive form.

[0218] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0219] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0220] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0221] [Second Embodiment]

[0222] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0223] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0224] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0225] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0226] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0227] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0228] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0229] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0230] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0231] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0232] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0233] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0234] This invention relates to a system that optimizes staffing by utilizing historical data and real-time conditions in order to improve the efficiency of communication response operations.

[0235] Data collection and prediction

[0236] The server collects communication records and business activity information from databases and external APIs. This data is based on past usage and future business events and contains information useful for predicting traffic volume. The server uses this data to train machine learning models to predict future traffic volume with high accuracy. This predictive information is used by the server to adjust optimal staffing.

[0237] Real-time monitoring and deployment optimization

[0238] The server acquires real-time communication data from communication response facilities and analyzes the communication load at each facility. This allows for an immediate understanding of how many incoming calls each facility is handling. Based on this information, the server optimizes staffing in real time according to the predicted communication volume. Each communication response facility is notified of the optimization results immediately, enabling rapid reassignment of personnel.

[0239] User interface and feedback loop

[0240] Users can use their devices to view the predictions and staffing plans calculated by the server. The devices display the optimization results sent from the server, and users can approve or modify them as needed. Furthermore, the server collects the results and user feedback and updates the machine learning model to improve the accuracy of future predictions.

[0241] As a concrete example, if a special weekend campaign is scheduled, the server predicts the number of incoming calls before and after the campaign and instructs each customer service facility on the necessary staffing arrangements based on that prediction. The introduction of this system makes it possible to improve the efficiency of communication support operations and ensure high response quality.

[0242] The following describes the processing flow.

[0243] Step 1:

[0244] The server extracts data from a database of past communication records for a certain period, compares it with business activity information, and generates necessary predictive variables. If necessary, it retrieves the latest sales event information via external APIs.

[0245] Step 2:

[0246] The server performs machine learning preprocessing on the collected data. This preprocessing includes data cleansing, imputation of missing values, and removal of outliers. It also normalizes the data as a time series and extracts the features necessary for training.
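
The preprocessing in this step could be sketched as follows using only the Python standard library. The choices made here are assumptions for illustration: missing values are imputed with the median, outliers are detected with the median absolute deviation (MAD) and replaced by the median so the time series keeps its spacing, and normalization is min-max scaling. The function name `preprocess` and the threshold `k` are hypothetical.

```python
from statistics import median

def preprocess(series, k=5.0):
    """Impute gaps, neutralize outliers, and min-max normalize a series."""
    known = [v for v in series if v is not None]
    med = median(known)
    # Imputation: fill missing observations with the median.
    filled = [med if v is None else v for v in series]
    # Outlier handling: replace points further than k * MAD from the median.
    mad = median(abs(v - med) for v in filled) or 1.0
    cleaned = [med if abs(v - med) > k * mad else v for v in filled]
    # Normalization: scale to the range [0, 1].
    lo, hi = min(cleaned), max(cleaned)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in cleaned]

# Hypothetical hourly call counts with one gap and one spike.
normalized = preprocess([10, 12, None, 11, 500, 13])
```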

[0247] Step 3:

[0248] The server builds a machine learning model and trains it using preprocessed data. Specifically, it uses a time series forecasting model (e.g., an LSTM network) to predict future traffic volume based on historical data. The parameters of this model are optimized based on the training data.
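
Sequence models such as LSTM networks are typically trained on fixed-length windows of past values paired with the value to be predicted. The sketch below shows only that windowing step; the network itself is omitted, and the function name `make_windows` and its parameters are assumptions for illustration.

```python
def make_windows(series, lookback, horizon=1):
    """Turn a series into (input window, target) pairs for a sequence model.

    lookback: number of past points fed to the model.
    horizon:  how many steps ahead the target lies.
    """
    pairs = []
    for start in range(len(series) - lookback - horizon + 1):
        window = series[start:start + lookback]
        target = series[start + lookback + horizon - 1]
        pairs.append((window, target))
    return pairs

# Hypothetical hourly traffic values.
pairs = make_windows([1, 2, 3, 4, 5], lookback=3)
```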

[0249] Step 4:

[0250] The server inputs new data into the trained model and predicts future traffic volume. The prediction results are output as expected traffic volume per hour and stored in a management database.

[0251] Step 5:

[0252] The server acquires real-time incoming call data from each communication response facility and calculates the current communication load. This makes it possible to understand the response capacity and the number of unanswered calls at each facility in real time.

[0253] Step 6:

[0254] The server integrates the traffic volume forecast data and the real-time load data, and uses an algorithm to calculate the optimal staffing allocation. This makes it possible to assign each facility a number of personnel that matches its demand.
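
The specification leaves the staffing calculation open. As one hedged illustration, a workload-based rule converts the forecast plus the current backlog into a headcount. The average handling time, the target occupancy of 0.8, and the names `required_operators` and `staffing_plan` are assumptions for this sketch.

```python
import math

def required_operators(calls_per_hour, avg_handle_sec, target_occupancy=0.8):
    """Operators needed so that expected busy time stays under the target."""
    workload_hours = calls_per_hour * avg_handle_sec / 3600  # busy hours per hour
    return max(1, math.ceil(workload_hours / target_occupancy))

def staffing_plan(forecast, live_queue, avg_handle_sec=300):
    """Combine forecast calls per hour with calls currently waiting."""
    plan = {}
    for facility, calls in forecast.items():
        demand = calls + live_queue.get(facility, 0)
        plan[facility] = required_operators(demand, avg_handle_sec)
    return plan

plan = staffing_plan({"facility_a": 96, "facility_b": 12}, {"facility_a": 24})
```

This rule ignores waiting-time targets; queueing models such as Erlang C are a common refinement when service-level goals must be met.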

[0255] Step 7:

[0256] The terminal displays the optimized staffing plan received from the server and prompts the user to confirm it. The user can then approve the presented plan or modify it as needed.

[0257] Step 8:

[0258] The server collects the executed deployment plan and its results, and stores them as evaluation data. This data is used for model training in subsequent iterations, forming a feedback loop aimed at improving prediction accuracy.
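
One way to make this feedback loop concrete is to compare stored forecasts with the observed volumes and retrain when the error grows too large. The metric (mean absolute percentage error), the 15% threshold, and the function names below are assumptions for illustration only.

```python
def mape(forecast, actual):
    """Mean absolute percentage error, skipping periods with zero actuals."""
    pairs = [(f, a) for f, a in zip(forecast, actual) if a != 0]
    if not pairs:
        return 0.0
    return sum(abs(f - a) / a for f, a in pairs) / len(pairs)

def should_retrain(forecast, actual, threshold=0.15):
    """Flag the model for retraining when recent error exceeds the threshold."""
    return mape(forecast, actual) > threshold

# Hypothetical evaluation data: predicted vs. observed calls per hour.
needs_update = should_retrain([110, 90], [100, 100])
```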

[0259] (Example 1)

[0260] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0261] In communication support operations, it is difficult to allocate personnel efficiently while taking both historical data and real-time conditions into account, which can lead to wasted resources and a decline in response quality. To solve this problem, there is a need for a system that can optimize personnel allocation with high precision by utilizing historical data and real-time communication conditions.

[0262] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0263] In this invention, the server includes means for acquiring communication records and business information, means for constructing a data analysis model using the acquired information and predicting future communication volume, and means for continuously monitoring the communication status of communication facilities. This enables effective optimization of workforce allocation by utilizing historical and real-time data.

[0264] "Communication records" refer to information that details past communication activities, and may include the date and time of communication, sender and receiver, and content of the communication.

[0265] "Business information" refers to data related to business activities, including information such as sales events, customer service status, and working hours.

[0266] A "data analysis model" is a model that uses mathematical or statistical methods to predict future trends based on collected data.

[0267] "Continuous monitoring" involves continuously acquiring and analyzing data in real time, with the aim of always understanding the latest situation.

[0268] "Workforce allocation" is the process of determining the number of people and roles required for a specific task or operation, and then assigning personnel on that basis.

[0269] "Optimization results" refer to the outcome of deriving the most efficient solution or arrangement for a given purpose, based on prediction and analysis.

[0270] "Notifying" refers to the act of conveying certain information in a way that is understandable to a third party, and specifically means providing information output from a system to each facility.

[0271] This invention aims to build a system in which servers, terminals, and users cooperate to achieve efficient communication response operations. The main elements are data collection, utilization of machine learning models, real-time monitoring, deployment optimization, and a feedback loop.

[0272] Data collection

[0273] The server includes a data collection module for acquiring communication records and business information. The server uses a database system (e.g., MySQL, PostgreSQL) to efficiently import historical communication activity records and sales event data. It also acquires external data through specific business APIs to enhance the information on business activities.

[0274] Utilization of machine learning models

[0275] The server builds machine learning models based on the collected data. Specifically, it uses open-source libraries such as TensorFlow and scikit-learn to analyze past traffic and predict future demand. The server then uses this predicted data as foundational information to optimize staffing.

[0276] Real-time monitoring

[0277] The server monitors the communication status from communication facilities in real time. The server periodically acquires data from sensors and communication systems located at each facility and processes it to understand the current workload.

[0278] Optimization of placement

[0279] The server optimizes staffing by combining predictions from machine learning models with real-time data. The server utilizes algorithms to quickly calculate the required workforce for each facility and notifies the facility of the results.

[0280] User Interface

[0281] Users can use a terminal to view forecast information and staffing plans generated by the server. The terminal has an intuitive interface, and users can modify the staffing plan as needed and provide feedback to the server.

[0282] As a specific example, to handle the increase in traffic expected during a weekend special campaign, the server instructs each communication facility in real time on the necessary staffing arrangements. An example of a prompt sentence for the generative AI model is: "In preparation for the weekend special campaign, I want to accurately predict the traffic volume. Please make a prediction based on past campaign data and propose a staffing plan accordingly."
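
As an illustration only, the prompt sentence above could be assembled together with past campaign data before being passed to the generative AI model. The function name `build_prompt`, the data layout, and the sample figures are assumptions made for this sketch and do not appear in the source.

```python
def build_prompt(campaign, past_volumes):
    """Combine the example prompt sentence with past campaign data (hypothetical layout)."""
    lines = [f"- {day}: {volume} contacts" for day, volume in past_volumes]
    return (
        f"In preparation for the {campaign}, I want to accurately predict the "
        "traffic volume. Please make a prediction based on past campaign data "
        "and propose a staffing plan accordingly.\n"
        "Past campaign data:\n" + "\n".join(lines)
    )

# Made-up figures for the example
prompt = build_prompt(
    "weekend special campaign",
    [("2024-11-23", 180), ("2024-11-24", 210)],
)
```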

[0283] The flow of the specific process in Example 1 will be described using Figure 11.

[0284] Step 1:

[0285] The server obtains communication records and business information from the database and external APIs. The server works with these data sources to collect, as input, data on past communication activities and business events. This gives the server the information on which predictions of future communication volume are based. Specifically, the server periodically calls the API and adds new information to the database.
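
A minimal sketch of this periodic ingest is shown below. It assumes a hypothetical `fetch_events()` function standing in for the external business API, and it uses an in-memory SQLite table in place of the production database (the source names MySQL and PostgreSQL only as examples). The table and column names are likewise illustrative.

```python
import sqlite3

def fetch_events():
    """Hypothetical stand-in for a call to an external business API."""
    return [
        {"event_id": "ev-1", "name": "weekend campaign", "start": "2024-12-21"},
        {"event_id": "ev-2", "name": "year-end sale", "start": "2024-12-28"},
    ]

def ingest(conn, events):
    """Store only events that are not yet in the table; return the number of new rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS business_events ("
        "event_id TEXT PRIMARY KEY, name TEXT, start TEXT)"
    )
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO business_events VALUES (:event_id, :name, :start)",
        events,
    )
    conn.commit()
    return conn.total_changes - before

conn = sqlite3.connect(":memory:")
first_run = ingest(conn, fetch_events())   # both events are new
second_run = ingest(conn, fetch_events())  # the next poll finds nothing new
```

Keying the table on the event identifier lets the same poll be repeated safely, since rows that already exist are skipped rather than duplicated.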

[0286] Step 2:

[0287] The server trains a machine learning model based on the collected data. At this time, past communication data is used as input, and the server utilizes libraries such as TensorFlow for learning. As output, a model for predicting future traffic volumes is generated. As a specific operation, a training dataset is prepared and calculations are repeated to optimize the parameters.
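
The repeated calculation that optimizes the parameters can be sketched in plain Python as batch gradient descent on a one-feature linear model. This is a simplified stand-in for training with a library such as TensorFlow, not the method of the source. The single feature (a campaign flag), the learning rate, and the sample values are assumptions chosen for illustration.

```python
def train(samples, lr=0.1, epochs=2000):
    """Fit volume = bias + weight * campaign_flag by batch gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for flag, volume in samples:
            error = (bias + weight * flag) - volume
            grad_w += 2 * error * flag / n   # derivative of mean squared error w.r.t. weight
            grad_b += 2 * error / n          # derivative of mean squared error w.r.t. bias
        weight -= lr * grad_w
        bias -= lr * grad_b
    return weight, bias

# (campaign flag, observed communication volume); made-up values
history = [(0, 100.0), (0, 110.0), (1, 150.0), (1, 160.0)]
weight, bias = train(history)
predicted_campaign_day = bias + weight * 1
```

With this data the fit converges to the least-squares solution, in which the bias is the mean volume on ordinary days and the weight is the additional volume on campaign days.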

[0288] Step 3:

[0289] The server acquires real-time communication data from communication facilities and analyzes the communication load of each facility. It receives real-time data on the number of communications and response times as input, and analyzes this data to understand the current situation. As output, it generates information on the current load status of each facility. Specifically, it periodically collects sensor data and analyzes it in real time.
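
One possible way to turn periodically collected samples into a current load figure is a sliding window per facility, sketched below. The class name `LoadMonitor`, the window size, and the two measured quantities (active calls and response time in seconds) are assumptions for this example.

```python
from collections import deque

class LoadMonitor:
    """Keeps the most recent samples per facility and reports their averages."""

    def __init__(self, window=3):
        self.window = window
        self.samples = {}

    def record(self, facility, active_calls, response_seconds):
        # deque(maxlen=...) silently drops the oldest sample once the window is full
        buf = self.samples.setdefault(facility, deque(maxlen=self.window))
        buf.append((active_calls, response_seconds))

    def load(self, facility):
        buf = self.samples.get(facility)
        if not buf:
            return None
        return {
            "avg_active_calls": sum(c for c, _ in buf) / len(buf),
            "avg_response_seconds": sum(r for _, r in buf) / len(buf),
        }

monitor = LoadMonitor(window=3)
for calls, seconds in [(10, 20.0), (12, 25.0), (20, 40.0), (28, 55.0)]:
    monitor.record("facility_a", calls, seconds)
current = monitor.load("facility_a")  # averages over the last three samples only
```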

[0290] Step 4:

[0291] The server optimizes staffing based on the results of a predictive model and real-time data. Predictive data and real-time load data are used as input, and the server executes an optimization algorithm. The output is an optimal staffing plan, which is then notified to each facility. The specific operation involves executing the algorithm, performing calculations, and transmitting the results over the network.
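
The source does not specify the optimization algorithm. As a minimal sketch under stated assumptions, the predicted and observed loads can be blended and divided by an assumed per-agent capacity. The parameters `calls_per_agent` and `realtime_weight` are hypothetical tuning values.

```python
import math

def staffing_plan(forecast, realtime, calls_per_agent=5, realtime_weight=0.5):
    """Blend predicted and observed load per facility and convert it to a headcount."""
    plan = {}
    for facility, predicted in forecast.items():
        # Fall back to the forecast when no real-time reading is available
        observed = realtime.get(facility, predicted)
        expected = realtime_weight * observed + (1 - realtime_weight) * predicted
        plan[facility] = max(1, math.ceil(expected / calls_per_agent))
    return plan

forecast = {"facility_a": 40, "facility_b": 12}   # predicted concurrent calls
realtime = {"facility_a": 60}                     # observed concurrent calls
plan = staffing_plan(forecast, realtime)
```

Rounding up keeps the plan on the side of sufficient staffing, and the resulting dictionary is what would be sent to each facility as a notification.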

[0292] Step 5:

[0293] Users review the predictive information and optimized staffing plans generated by the server through their terminals. As input, users receive information provided on their terminals and make decisions based on it. As output, they return feedback to the server regarding the staffing plan as needed. Specifically, this includes reviewing data using a visual interface and making corrections.

[0294] Step 6:

[0295] The server collects execution results and user feedback to update the machine learning model. It takes feedback data and execution results as input and uses them to retrain the model to improve its accuracy. The output is an updated model that enables more accurate predictions. Specific operations include analyzing the feedback and adjusting the model parameters through retraining.
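
The source describes retraining the model. As a lighter illustration of the same feedback idea, the sketch below keeps a multiplicative correction factor and moves it toward the observed ratio of actual to predicted volume. This is a substitute technique chosen for brevity, and the function name and smoothing rate are assumptions.

```python
def update_correction(correction, predicted, actual, rate=0.5):
    """Move the correction factor toward the observed actual/predicted ratio."""
    if predicted <= 0:
        return correction  # no usable signal from this observation
    return (1 - rate) * correction + rate * (actual / predicted)

correction = 1.0
# Two periods in which actual volume exceeded the prediction by 20 %
for predicted, actual in [(100, 120), (100, 120)]:
    correction = update_correction(correction, predicted, actual)

adjusted_forecast = 100 * correction  # later predictions are scaled by the factor
```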

[0296] (Application Example 1)

[0297] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0298] In modern society, situations where the communication load is unclear, such as in communication support and security services, make appropriate resource allocation difficult. This can lead to a decline in service quality and response speed, resulting in reduced customer satisfaction. Furthermore, facilities require rapid personnel redeployment in response to changing circumstances, but there is a lack of systems to effectively achieve this. To solve these problems, dynamic resource management based on real-time and predictive data is necessary.

[0299] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0300] In this invention, the server includes means for acquiring past communication records and activity information, means for building a machine learning model using the acquired information to predict future communication volume, means for monitoring the facility's communication status in real time, means for administrators to check the deployment plan and make necessary modifications via a user interface, means for collecting feedback and updating the machine learning model to improve prediction accuracy for subsequent operations, means for analyzing notifications and warnings in specific areas based on real-time data and optimizing resource allocation accordingly, and means for strengthening the generative AI model based on feedback on resource allocation optimization and providing prompt statements for users to generate specific plans. This enables dynamic and optimal resource allocation and improved response quality.

[0301] "Communication records" refer to historical information such as past communication content, time, and connection status.

[0302] "Activity information" refers to data related to the operation of a company or service, such as events, campaigns, and daily work schedules.

[0303] A "machine learning model" is an algorithm or mathematical model constructed using past data to predict future situations.

[0304] "Communication status" refers to the current load and connection status of each facility where communication is taking place.

[0305] "Resource allocation" refers to the actions or plans of appropriately distributing limited resources such as personnel and equipment.

[0306] "User interface" refers to the screen or operation panel through which the users of the system can directly perform operations and confirmations.

[0307] "Feedback" refers to the information collected on the usage results of the system and the opinions / evaluations of the users, which is used for improvement and optimization.

[0308] "Real-time data" refers to data regarding current situations and events that are generated and updated almost immediately.

[0309] A "generative AI model" is a mechanism for analyzing and understanding data using artificial intelligence technology and generating solutions or predictions for specific problems.

[0310] "Prompt text" is the text format used by users to give instructions or ask questions to the system.

[0311] In this embodiment, the server is operated using cloud services (e.g., AWS or Google Cloud). The server obtains communication records and activity information from each facility and constructs a machine learning model based on them, using Python libraries such as scikit-learn and TensorFlow. The machine learning model enables accurate prediction of future traffic volumes. To obtain real-time data, the server works with the communication network of each facility and analyzes the current communication status.

[0312] Smartphones and tablet devices, acting as terminals, provide a user interface for administrators, displaying optimized resource allocation plans. Using frameworks like React Native, users can review the allocation plan and make modifications as needed. The terminals feed the results back to the server in real time, and the server continuously updates its machine learning model based on this information.

[0313] For example, if an increase in traffic volume is predicted for a specific region over the weekend, the server will adjust the personnel allocation in that region in advance to ensure appropriate staffing levels. A possible prompt message used by the user in this case might be, "Please provide an optimal staffing plan to handle the increased traffic volume over the weekend." This enables efficient operation of communication support and security services, and ensures high response quality.

[0314] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0315] Step 1:

[0316] The server retrieves historical communication records and activity information through databases and external APIs. This input data includes call volume, duration, and event information. The server cleans and preprocesses this data before arranging it in the appropriate format. At this stage, data processing such as noise removal and format standardization is performed.
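
Noise removal and format standardization could look like the following sketch. The two accepted timestamp formats, the field names `time` and `duration_s`, and the rule that unparseable or negative-duration records count as noise are all assumptions made for the example.

```python
from datetime import datetime

# Timestamp formats assumed to occur in the raw records
FORMATS = ("%Y-%m-%d %H:%M", "%Y/%m/%d %H:%M")

def parse_time(text):
    """Return a datetime for the first matching format, or None if none match."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None

def clean(rows):
    """Drop noisy records and standardize the remaining ones."""
    cleaned = []
    for row in rows:
        timestamp = parse_time(row["time"])
        duration = row.get("duration_s")
        if timestamp is None or duration is None or duration < 0:
            continue  # treat unparseable or impossible records as noise
        cleaned.append({"time": timestamp.isoformat(), "duration_s": int(duration)})
    return cleaned

raw = [
    {"time": "2024-12-16 09:05", "duration_s": 180},
    {"time": "2024/12/16 09:10 ", "duration_s": 95.0},
    {"time": "not a date", "duration_s": 30},
    {"time": "2024-12-16 09:20", "duration_s": -5},
]
records = clean(raw)
```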

[0317] Step 2:

[0318] The server builds a machine learning model using the processed data. First, it generates a training dataset using frameworks such as Python's Scikit-learn or TensorFlow. Based on this dataset, the server trains a generative AI model to predict future traffic volume. In this process, it extracts data features, selects an appropriate algorithm, and trains the model. The output is the model used for prediction.

[0319] Step 3:

[0320] The server collects real-time data from each communication facility. This real-time data includes current communication load and the number of calls in progress. The server analyzes this data and performs data processing to understand the current load situation. The output of this processing is real-time load information for each facility.

[0321] Step 4:

[0322] The terminal optimizes resource allocation based on predictions and real-time data received from the server. Specifically, it executes an optimization algorithm that allocates additional resources to areas with high load. At this stage, it performs data calculations to improve the allocation of personnel and equipment based on the set conditions and presents the optimal allocation plan as output.
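
An optimization algorithm that allocates additional resources to areas with high load can be sketched as a greedy loop that always serves the facility with the highest load per staff member. The source does not name a specific algorithm, so this greedy rule is an assumption, as are the facility names and figures. The sketch also assumes every facility starts with at least one staff member.

```python
import heapq

def allocate_extra(load, staff, extra):
    """Hand out `extra` staff one at a time to the facility with the highest load per person."""
    staff = dict(staff)  # work on a copy so the caller's plan is left untouched
    # heapq is a min-heap, so ratios are negated to pop the largest first
    heap = [(-load[f] / staff[f], f) for f in staff]
    heapq.heapify(heap)
    for _ in range(extra):
        _, facility = heapq.heappop(heap)
        staff[facility] += 1
        heapq.heappush(heap, (-load[facility] / staff[facility], facility))
    return staff

load = {"north": 90, "south": 80, "east": 30}   # current concurrent calls
staff = {"north": 3, "south": 4, "east": 3}     # staff currently on duty
plan = allocate_extra(load, staff, extra=3)
```

In this example two of the three additional staff go to the north facility and one to the south, after which the per-person load is more even across facilities.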

[0323] Step 5:

[0324] The user reviews the optimized resource allocation plan displayed on the terminal. Through the user interface, they can modify or approve the proposed plan. The input in this step is the displayed plan, and the output is the plan as modified and applied by the user.

[0325] Step 6:

[0326] The server collects user-modified and applied plans as feedback. To improve prediction accuracy in subsequent attempts, it integrates this feedback and updates the machine learning model. The server retrains the generative AI model, improving its performance. The feedback serves as input, and the improved accuracy of the new prediction model is output.
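
Before feedback can be integrated, the difference between the proposed plan and the plan the user actually applied has to be extracted. A minimal sketch follows. The function name and plan layout are hypothetical, and how the adjustments are then used for retraining is left open, as in the source.

```python
def plan_feedback(proposed, applied):
    """Return the per-facility adjustments the user made to the proposed plan."""
    return {
        facility: applied[facility] - proposed[facility]
        for facility in proposed
        if facility in applied and applied[facility] != proposed[facility]
    }

proposed = {"north": 5, "south": 5, "east": 3}
applied = {"north": 6, "south": 5, "east": 2}   # plan after user modification
adjustments = plan_feedback(proposed, applied)
```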

[0327] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0328] This invention relates to a system that recognizes the emotional state of users in communication support operations and utilizes this information to optimize responses and staffing. This system combines communication volume prediction based on past communication records and business activity information with an emotion engine for analyzing the emotional state of users.

[0329] Data analysis and emotion recognition

[0330] The server activates an emotion engine to analyze voice and text data acquired in real time from each communication facility. The emotion engine analyzes the words used by the user, their tone of voice, and the flow of the conversation to recognize the user's emotional state (e.g., joy, anger, anxiety). This information is stored in a database along with other communication status data and further analyzed.

[0331] Optimizing customer service and staffing

[0332] The server adjusts its response in real time based on the user's emotional state, as recognized by the emotion engine. For example, it dynamically allocates the most suitable resources, such as assigning a highly specialized operator to a user exhibiting a specific emotional state. Such adjustments are crucial for providing a better user experience and improving customer satisfaction.

[0333] User interface and feedback loop

[0334] The terminal receives sentiment analysis results and optimized response instructions from the server and presents them to the user. The administrator, acting as the user, can also contribute to the overall system improvement by evaluating the quality of responses based on the sentiment analysis results and providing feedback. The server utilizes this feedback to continuously improve the emotion engine's algorithm, enabling even more accurate predictions and responses.

[0335] For example, if a customer calls in with a complaint and the emotion engine analyzes the call as "angry," the server can immediately assign the most suitable customer support representative based on that profile. This implementation improves the quality of service and facilitates smoother problem resolution.

[0336] The following describes the processing flow.

[0337] Step 1:

[0338] The server collects user voice and text data in real time from the communication reception facility. This data is acquired at appropriate times during the call and preprocessed for sentiment analysis.

[0339] Step 2:

[0340] The server uses an emotion engine to analyze the collected audio and text data. This analysis considers the user's tone of voice, word choice, and sentence flow to detect emotional states such as joy, anger, and sadness.
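
The word-choice part of this analysis can be illustrated with a small keyword-based classifier. This is a deliberately simple stand-in for the emotion engine, covering text only (tone of voice is not modeled). The keyword lists and the "neutral" fallback label are assumptions for the example.

```python
import re

# Hypothetical keyword lists, for illustration only
LEXICON = {
    "anger": {"unacceptable", "angry", "ridiculous", "terrible"},
    "sadness": {"sad", "disappointed", "unhappy", "sorry"},
    "joy": {"thanks", "great", "happy", "wonderful"},
}

def score_emotions(text):
    """Count how many words of the text fall into each emotion's keyword list."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        emotion: sum(word in keywords for word in words)
        for emotion, keywords in LEXICON.items()
    }

def classify(text):
    """Return the highest-scoring emotion, or 'neutral' when no keyword matches."""
    scores = score_emotions(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"

label = classify("This is unacceptable, the service was terrible and I am angry.")
```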

[0341] Step 3:

[0342] The server feeds back the recognized emotional state to the communication response facility and uses it as data to determine the appropriate response policy. It also prioritizes assigning calls to operators with specific expertise and capabilities based on the emotional state.
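
Priority assignment based on the emotional state can be sketched as a lookup over available operators. The operator records, the per-emotion skill ratings, and the rule of choosing the highest rating among free operators are assumptions, since the source does not define how expertise is represented.

```python
def assign_operator(emotion, operators):
    """Pick the free operator with the highest skill rating for the detected emotion."""
    free = [op for op in operators if op["free"]]
    if not free:
        return None  # nobody available; the call would have to be queued
    return max(free, key=lambda op: op["skills"].get(emotion, 0))["name"]

operators = [
    {"name": "op_1", "free": True, "skills": {"anger": 2, "anxiety": 5}},
    {"name": "op_2", "free": True, "skills": {"anger": 5}},
    {"name": "op_3", "free": False, "skills": {"anger": 9}},  # busy, so skipped
]
chosen = assign_operator("anger", operators)
```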

[0343] Step 4:

[0344] The terminal displays the sentiment analysis results and suggested response plans sent from the server. The administrator, as the user, can review this information and send instructions to the response staff as needed.

[0345] Step 5:

[0346] Users can monitor the progress of actual interactions and immediately modify their responses in anticipation of changes in customer emotions or reactions. This operation is performed via a terminal, and the system is prepared to incorporate that information and make further optimizations.

[0347] Step 6:

[0348] The server analyzes the accumulated emotion recognition data and the final response results to update the emotion engine's algorithm. This feedback loop is essential for improving the accuracy of future analyses and the ability to formulate response strategies.

[0349] (Example 2)

[0350] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0351] In customer service operations, it is difficult to optimize staffing based on communication volume and customer emotional state, leading to decreased customer satisfaction and increased operator workload. Conventional systems do not adequately predict communication volume or dynamically allocate staff based on emotional state, leaving challenges in achieving effective customer service.

[0352] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0353] In this invention, the server includes means for acquiring past communication records and business activity information, means for constructing an inference model using the acquired information and predicting future communication volume, means for monitoring the communication status of multiple communication response facilities in real time, means for analyzing user speech content and voice data to recognize emotional states, means for optimizing staffing at each response facility based on prediction results, monitoring results, and emotional state analysis results, and means for notifying each communication response facility of the optimized staffing results. This enables optimal staffing based on dynamic prediction of communication volume and recognition of emotional states, thereby improving customer satisfaction and reducing the burden on operators.

[0354] "Communication records" is a general term for data that includes the content, duration, and participant information of past phone calls and messages.

[0355] "Business activity information" refers to information about the daily business processes carried out by a company or organization, and includes data such as transaction history and interactions with customers.

[0356] An "inference model" is a computational model, either statistical or machine learning-based, that uses past data to predict future events or states.

[0357] A "communication response facility" is a dedicated facility for handling inquiries, orders, complaints, and other matters from customers and users.

[0358] "Emotional state" refers to the state in which a person expresses their emotions, and includes psychological states such as joy, anger, and anxiety.

[0359] "Personnel allocation" refers to placing the appropriate staff or operators in the appropriate locations according to the needs of the work.

[0360] "Optimization methods" refer to the process of making adjustments using calculations and algorithms to maximize the efficiency and effectiveness of the subject.

[0361] "Means of notification" refers to a method or system for transmitting information or instructions to a target person.

[0362] This invention relates to a system that recognizes the emotional state of users in communication support operations and optimizes response methods and staffing. The server acts as the central hub of this system, operating to implement various functions.

[0363] The server first collects communication records and business activity information, and then builds an inference model based on this information. The inference model is implemented using a programming language such as Python or R, and its associated machine learning libraries (e.g., TensorFlow, scikit-learn). This gives the system the ability to predict future communication volume.

[0364] Next, the server receives voice and text data in real time from multiple communication reception facilities. This data is obtained through VoIP services and messaging APIs. The received voice data is converted to text using speech recognition software such as the Google Speech-to-Text API. Then, the emotion engine is activated to analyze the user's speech and tone of voice, and recognize their emotional state in real time. The emotion engine is implemented using NLP (Natural Language Processing) technology and sentiment analysis algorithms.

[0365] The analyzed emotional states are stored in a database and analyzed in parallel in combination with predicted traffic volume data and monitoring data on communication status. This allows the server to determine the optimal response method based on the user's emotional state at each communication response facility and dynamically adjust staffing accordingly.

[0366] The terminal receives instructions from the server and visually presents optimized response methods to operators and administrators. Administrators, as users, can evaluate the quality of responses based on the presented information and provide feedback to further improve the system.

[0367] As a concrete example, if the emotion engine detects "anger" in a customer complaint, the server immediately assigns the task to the most suitable customer support representative based on that information. This improves the quality of service and facilitates smoother problem resolution.

[0368] Examples of prompt statements to input into a generative AI model include the following:

[0369] "Explain how to analyze a customer's emotional state in real time during a phone call and dynamically assign the most suitable operator based on the results."

[0370] This invention makes it possible to improve the user experience in communication support operations and to increase the overall efficiency of communication operations.

[0371] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0372] Step 1:

[0373] The server retrieves communication records and business activity information from the database. The retrieved data includes information such as past communication patterns and business trends. The input is past call history and customer interaction records, and the output is a dataset that organizes and aggregates these. The server then performs preprocessing based on this data to build an inference model.

[0374] Step 2:

[0375] The server uses preprocessed data and leverages machine learning libraries (e.g., TensorFlow) to build an inference model. Here, it receives a dataset as input, performs statistical and regression analysis, and generates a model to predict future traffic volume as output. Specifically, the server splits the data into a training set and a test set, evaluates and improves the model, and outputs an optimized model.
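
The split into a training set and a test set, followed by evaluation, can be sketched without external libraries as below. Ordinary least squares on a single feature stands in for the regression analysis. The chronological 75/25 split, the choice of mean absolute error as the metric, and the sample values are assumptions for illustration.

```python
def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_absolute_error(model, points):
    """Average absolute gap between predicted and observed volume."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)

# (day index, communication volume); made-up values
data = [(1, 102), (2, 108), (3, 121), (4, 130), (5, 139), (6, 152), (7, 158), (8, 171)]
split = int(len(data) * 0.75)          # earliest 75 % of days are used for training
train, test = data[:split], data[split:]
model = fit_line(train)
test_mae = mean_absolute_error(model, test)
```

Splitting by time rather than at random keeps the evaluation honest for forecasting, because the model is only ever tested on days later than those it was trained on.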

[0376] Step 3:

[0377] The server receives voice and text data in real time from the communication reception facility. The input data is sent via VoIP services and chat APIs, and the output is obtained in the form of text-converted voice data. The server uses speech recognition software to perform the conversion to text format.

[0378] Step 4:

[0379] The server analyzes the text data using an emotion engine. The input is the text data output from step 3, and the emotion recognition algorithm analyzes the content and context of the statements in the data, obtaining the user's emotional state (e.g., joy, anger, anxiety) as output. Specifically, the server uses NLP techniques to perform emotion scoring.

[0380] Step 5:

[0381] The server analyzes the results of emotional state analysis in combination with predicted traffic volume data to determine the optimal response method and staffing. The inputs are emotional state data and predicted traffic volume data, and the output provides operator placement instructions and guidelines for response methods. The server uses an algorithm to perform placement simulations and generate an optimized placement strategy.
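
One hedged way to combine the two inputs is to weight the predicted traffic volume by the handling effort assumed for each emotional state. The weights in `EMOTION_WEIGHT`, the per-operator capacity, and the function name are hypothetical. The source states only that an algorithm performs placement simulations.

```python
import math

# Assumed relative handling effort per emotion; the values are illustrative
EMOTION_WEIGHT = {"anger": 1.5, "anxiety": 1.2, "joy": 0.9, "neutral": 1.0}

def operators_needed(predicted_calls, emotion_share, calls_per_operator=9):
    """Scale predicted calls by emotion-dependent effort and convert to a headcount."""
    weighted_calls = sum(
        predicted_calls * share * EMOTION_WEIGHT.get(emotion, 1.0)
        for emotion, share in emotion_share.items()
    )
    return math.ceil(weighted_calls / calls_per_operator)

baseline = operators_needed(100, {"neutral": 1.0})
with_anger = operators_needed(100, {"anger": 0.4, "neutral": 0.6})
```

Running the function for several emotion mixes gives a simple form of placement simulation, showing how the required headcount rises as the share of demanding calls grows.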

[0382] Step 6:

[0383] The terminal displays the response instructions received from the server to the operator. The input is deployment instruction data from the server, and the output is operation guidelines displayed on the user interface. The terminal visualizes the instructions so that the operator can easily confirm them.

[0384] Step 7:

[0385] The user, acting as the administrator, evaluates the quality of interactions and provides necessary feedback to the server. Input consists of the administrator's observed interaction results and evaluation comments. Output is the feedback information sent to the server, which is used to improve sentiment recognition algorithms and placement optimization models. Specific actions include filling out a feedback form.

[0386] (Application Example 2)

[0387] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0388] Modern customer service requires flexible responses that adapt to the diverse emotional states of users. However, conventional systems have struggled to recognize user emotions in real time and adjust their responses accordingly. Furthermore, efficient staffing based on this information has been difficult, so neither quality of service nor operational efficiency has improved sufficiently.

[0389] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0390] In this invention, the server includes means for acquiring past communication records and business activity information, means for analyzing voice and text data acquired in real time to recognize the user's emotional state, and means for adjusting the response method in real time based on the recognized emotional state. This makes it possible to improve the quality of service and increase customer satisfaction by providing responses that are in line with the user's emotions.

[0391] "Communication records" refer to information that includes past interactions with the user, such as text data and audio data.

[0392] "Business activity information" refers to data related to a company's operations, including business-related information such as sales and customer information.

[0393] A "machine learning model" is a mathematical algorithm that analyzes large amounts of data and uses those patterns to make predictions and classifications.

[0394] A "communication response facility" is a place where communication with users takes place directly via telephone, online chat, etc.

[0395] "Personnel allocation" means placing the necessary personnel in the appropriate locations to perform tasks efficiently.

[0396] An "emotion engine" is an algorithm used to identify a user's emotional state from voice and text data.

[0397] "Response method" refers to the specific means and protocols used when interacting with users, including the personnel and information involved.

[0398] The system for realizing this invention consists of an integrated set of multiple components. The server first acquires historical communication records and business activity information, and uses this to build a machine learning model. This model is used to predict future communication volumes and serves as the basis for efficiently processing data.

[0399] In environments requiring real-time communication responses, the server acquires voice and text data from multiple communication response facilities. This data is input to analyze the user's emotional state using an emotion engine. The emotion engine utilizes an analysis service provided on the cloud (for example, the Emotion Recognition API in Azure Cognitive Services).

[0400] The analyzed emotion data is stored in the server's database and forms the basis for adjusting response methods in real time. This adjustment also contributes to dynamically assigning the appropriate operator, providing the user with the best possible response.

[0401] Furthermore, the terminal, based on instructions from the server, presents the user with response methods through its interface and collects feedback from the user. This feedback is used to improve the emotion engine's algorithm.

[0402] For example, if the server analyzes the user's voice and determines that they are "happy," it can notify the device to provide relaxing content. This allows the user to have a more relaxing time at home.

[0403] An example of a prompt message could be, "If the user's emotion is analyzed as 'fatigue,' what is the best course of action to take?" This could be used to provide feedback to the generative AI model.

[0404] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0405] Step 1:

[0406] The server retrieves historical communication records and business activity information from communication facilities. This data is passed to the server as input, and the output is a dataset stored for use in machine learning algorithms. Specifically, the server runs queries against the database, organizes the retrieved data, and prepares it for subsequent analysis processes.

[0407] Step 2:

[0408] The server uses the acquired data to build a machine learning model and predict future traffic volume. A well-organized dataset is fed to the machine learning algorithm as input, and a traffic volume prediction model is generated as output. Specifically, the server uses a Python data analysis library to train the model. During this process, it analyzes past trends and patterns.

[0409] Step 3:

[0410] The server acquires voice and text data in real time from communication reception facilities nationwide. Voice streams and text messages are received by the server as input, and this data is transferred to the emotion engine as output. Specifically, the data is streamed via an API and prepared for immediate processing.

[0411] Step 4:

[0412] The server uses an emotion engine to analyze voice and text data and recognize the user's emotional state. The emotion engine receives voice and text data as input, and the analyzed emotional state is recorded in a database as output. Specifically, it uses speech recognition software to convert speech to text and an emotion analysis algorithm to classify the emotional state.

[0413] Step 5:

[0414] The server adjusts its response method in real time based on the recognized emotional state. Analysis results are used as input, and the adjusted response procedure and assigned personnel selection are sent to the response facility as output. Specifically, the system selects an appropriate response script and sends instructions to the assigned operator. It also suggests recommended actions for adjusting the response content.
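
The selection of a response script and the instruction sent to the assigned operator might be represented as in the sketch below. The script identifiers, the recommended actions, and the message layout are hypothetical and serve only to make the step concrete.

```python
# Hypothetical script identifiers and recommended actions per emotional state
SCRIPTS = {
    "anger": ("apologize_and_escalate", ["acknowledge the issue", "offer a callback"]),
    "anxiety": ("reassure_step_by_step", ["confirm each step with the user"]),
    "joy": ("standard_friendly", []),
}

def response_instruction(call_id, emotion, operator):
    """Build the instruction message sent to the response facility for one call."""
    script, actions = SCRIPTS.get(emotion, ("standard_neutral", []))
    return {
        "call_id": call_id,
        "operator": operator,
        "script": script,
        "recommended_actions": list(actions),
    }

instruction = response_instruction("call-001", "anger", "op_2")
fallback = response_instruction("call-002", "surprise", "op_1")  # unlisted emotion
```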

[0415] Step 6:

[0416] The server presents real-time response instructions to the user via the terminal and collects feedback. Response instructions are displayed on the terminal as input, and feedback data is sent to the server as output. In terms of specific operation, the interface visually displays instructions, and user feedback is received through an interactive form.

[0417] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0418] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0419] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0420] [Third Embodiment]

[0421] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0422] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0423] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0424] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0425] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0426] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0427] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0428] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0429] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0430] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0431] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0432] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0433] This invention relates to a system that optimizes staffing by utilizing historical data and real-time conditions in order to improve the efficiency of communication response operations.

[0434] Data collection and prediction

[0435] The server collects communication records and business activity information from databases and external APIs. This data is based on past usage and future business events and contains information useful for predicting traffic volume. The server uses this data to train machine learning models to predict future traffic volume with high accuracy. This predictive information is used by the server to adjust optimal staffing.

[0436] Real-time monitoring and deployment optimization

[0437] The server acquires real-time communication data from communication response facilities and analyzes the communication load at each facility. This allows for an immediate understanding of how many incoming calls each facility is handling. Based on this information, the server optimizes staffing in real time according to the predicted communication volume. Each communication response facility is immediately notified of the optimization results, enabling rapid reassignment of personnel.

[0438] User interface and feedback loop

[0439] Users can use their devices to view the predictions and staffing plans calculated by the server. The devices display the optimization results sent from the server, and users can approve them or modify them as necessary. Furthermore, the server collects the results and user feedback and updates the machine learning model to improve the accuracy of future predictions.

[0440] As a concrete example, if a special weekend campaign is scheduled, the server predicts the number of incoming calls before and after the campaign and instructs each customer service facility on the necessary staffing arrangements based on that prediction. The introduction of this system makes it possible to improve the efficiency of communication support operations and ensure high response quality.

[0441] The following describes the processing flow.

[0442] Step 1:

[0443] The server extracts data from a database of past communication records for a certain period, compares it with business activity information, and generates necessary predictive variables. If necessary, it retrieves the latest sales event information via external APIs.
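
One possible way to extract records for a period and combine them with business activity information is sketched below using Python's built-in sqlite3 module. The table names call_records and activities, their columns, and the function name build_predictors are hypothetical; the disclosure does not define a schema, and the campaign flag is only one example of a predictive variable.

```python
import sqlite3


def build_predictors(conn, start_hour, end_hour):
    """Return (hour, call_count, campaign_flag) rows for the given period."""
    query = """
        SELECT c.hour,
               c.call_count,
               EXISTS (SELECT 1 FROM activities AS a WHERE a.hour = c.hour)
        FROM call_records AS c
        WHERE c.hour BETWEEN ? AND ?
        ORDER BY c.hour
    """
    return [tuple(row) for row in conn.execute(query, (start_hour, end_hour))]


# Small in-memory database standing in for the record store (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE call_records (hour TEXT PRIMARY KEY, call_count INTEGER)")
conn.execute("CREATE TABLE activities (hour TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO call_records VALUES (?, ?)",
    [("2024-12-14T09", 120), ("2024-12-14T10", 180), ("2024-12-15T09", 90)],
)
conn.execute("INSERT INTO activities VALUES (?, ?)",
             ("2024-12-14T10", "weekend campaign"))

rows = build_predictors(conn, "2024-12-14T00", "2024-12-14T23")
```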

[0444] Step 2:

[0445] The server performs machine learning preprocessing on the collected data. This preprocessing includes data cleansing, imputation of missing values, and removal of outliers. It also normalizes the data as a time series and extracts the features necessary for training.
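
The preprocessing named in this step could be sketched as follows. The specific choices here (median imputation, an outlier rule based on the median absolute deviation, and min-max scaling) are assumptions made for illustration; the disclosure names the operations but not the methods, and the function name preprocess and the parameter mad_factor are hypothetical.

```python
from statistics import median


def preprocess(series, mad_factor=5.0):
    """Return (cleaned, normalized) versions of an hourly call-count series."""
    known = [v for v in series if v is not None]
    if not known:
        raise ValueError("series contains no observed values")
    center = median(known)
    # Imputation: replace missing values with the median of the observed values.
    filled = [center if v is None else v for v in series]
    # Outlier removal: replace points far from the median (measured by MAD).
    mad = median(abs(v - center) for v in filled)
    if mad > 0:
        cleaned = [center if abs(v - center) > mad_factor * mad else v
                   for v in filled]
    else:
        cleaned = filled
    # Normalization: min-max scaling to the range [0, 1].
    low, high = min(cleaned), max(cleaned)
    span = high - low
    normalized = [(v - low) / span if span else 0.0 for v in cleaned]
    return cleaned, normalized


cleaned, normalized = preprocess([100, None, 110, 105, 900, 95])
```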

[0446] Step 3:

[0447] The server builds a machine learning model and trains it using preprocessed data. Specifically, it uses a time series forecasting model (e.g., an LSTM network) to predict future traffic volume based on historical data. The parameters of this model are optimized based on the training data.
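
For readers who want a minimal point of comparison, the fit-then-predict pattern can be illustrated with a much simpler technique than the LSTM network mentioned above: an hour-of-day average profile. This is explicitly a substitute baseline and not the model of the disclosure; the function names and the campaign_uplift parameter are hypothetical.

```python
from collections import defaultdict


def fit_hourly_profile(history):
    """history: iterable of (hour_of_day, call_count). Returns {hour: mean}."""
    totals = defaultdict(lambda: [0, 0])
    for hour, count in history:
        totals[hour][0] += count
        totals[hour][1] += 1
    return {hour: total / n for hour, (total, n) in totals.items()}


def predict(profile, hours, campaign_uplift=1.0):
    """Predict call volume per hour; unseen hours fall back to the overall mean."""
    overall = sum(profile.values()) / len(profile)
    return [profile.get(h, overall) * campaign_uplift for h in hours]


profile = fit_hourly_profile([(9, 100), (9, 120), (10, 200)])
forecast = predict(profile, [9, 10, 11])
```

A baseline of this kind is mainly useful for checking whether a more complex time series model actually improves on simple historical averages.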

[0448] Step 4:

[0449] The server inputs new data into the trained model and predicts future traffic volume. The prediction results are output as expected traffic volume per hour and stored in a management database.

[0450] Step 5:

[0451] The server acquires real-time incoming call data from each communication response facility and calculates the current communication load. This makes it possible to understand the response capacity and the number of unanswered calls at each facility in real time.
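
A minimal sketch of such a load calculation for one facility is shown below. The input fields and the function name facility_load are assumptions introduced for illustration; the disclosure does not define how load is measured.

```python
def facility_load(calls_in_progress, calls_waiting, operators_on_duty):
    """Summarize the current load of one facility."""
    if operators_on_duty <= 0:
        raise ValueError("at least one operator must be on duty")
    return {
        # Share of operators currently occupied with a call.
        "utilization": calls_in_progress / operators_on_duty,
        # Operators still free to take a new call.
        "spare_capacity": max(operators_on_duty - calls_in_progress, 0),
        # Calls that have arrived but have not yet been answered.
        "unanswered_calls": calls_waiting,
    }


status = facility_load(calls_in_progress=18, calls_waiting=4, operators_on_duty=20)
```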

[0452] Step 6:

[0453] The server integrates the traffic volume forecast data with the real-time load data and applies an optimization algorithm to determine the staffing allocation. This makes it possible to assign each facility a number of personnel that matches its demand.
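
The disclosure does not name the allocation algorithm. As one simple possibility, a fixed pool of personnel could be divided in proportion to each facility's demand (for example, forecast volume plus current backlog), using the largest-remainder method so that the total is preserved. The facility names and the function name allocate_staff are hypothetical.

```python
def allocate_staff(demand, total_staff):
    """Split total_staff across facilities in proportion to demand."""
    total_demand = sum(demand.values())
    if total_demand <= 0:
        raise ValueError("total demand must be positive")
    quotas = {f: total_staff * d / total_demand for f, d in demand.items()}
    plan = {f: int(q) for f, q in quotas.items()}
    # Hand out the seats lost to rounding, largest fractional part first.
    leftover = total_staff - sum(plan.values())
    by_remainder = sorted(quotas, key=lambda f: quotas[f] - plan[f], reverse=True)
    for facility in by_remainder[:leftover]:
        plan[facility] += 1
    return plan


plan = allocate_staff({"Tokyo": 300, "Osaka": 150, "Fukuoka": 50}, total_staff=20)
```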

[0454] Step 7:

[0455] The terminal displays the optimized staffing plan received from the server and prompts the user for status confirmation. The user can then approve the presented plan or make modifications as needed.

[0456] Step 8:

[0457] The server collects the executed deployment plan and its results and stores them as evaluation data. This data is used to retrain the model in subsequent iterations, forming a feedback loop aimed at improving prediction accuracy.
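
One way to turn the stored evaluation data into a retraining signal is to compare predicted and observed volumes with an error metric. The sketch below uses mean absolute error; the choice of metric, the threshold, and the function names are assumptions made for illustration and are not specified in the disclosure.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and observed volumes."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def needs_retraining(predicted, actual, threshold):
    """Flag the model for retraining when the error exceeds the threshold."""
    return mean_absolute_error(predicted, actual) > threshold


error = mean_absolute_error([100, 200], [110, 180])
```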

[0458] (Example 1)

[0459] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0460] In communication support operations, it is difficult to allocate personnel efficiently while taking both historical data and real-time conditions into account, which can lead to wasted resources and a decline in response quality. To solve this problem, there is a need for a system that can optimize personnel allocation with high precision by utilizing historical data and real-time communication conditions.

[0461] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0462] In this invention, the server includes means for acquiring communication records and business information, means for constructing a data analysis model using the acquired information and predicting future communication volume, and means for continuously monitoring the communication status of communication facilities. This enables effective optimization of workforce allocation by utilizing historical and real-time data.

[0463] "Communication records" refer to information that details past communication activities, and may include the date and time of communication, sender and receiver, and content of the communication.

[0464] "Business information" refers to data related to business activities, including information such as sales events, customer service status, and working hours.

[0465] A "data analysis model" is a model that uses mathematical or statistical methods to predict future trends based on collected data.

[0466] "Continuous monitoring" involves continuously acquiring and analyzing data in real time, with the aim of always understanding the latest situation.

[0467] "Labor allocation" is the process of determining the number of people and roles required for a specific task or operation, and then assigning personnel based on that.

[0468] "Optimization results" refer to the outcome of deriving the most efficient solution or arrangement for a given purpose, based on prediction and analysis.

[0469] "Notifying" refers to the act of conveying certain information in a way that is understandable to a third party, and specifically means providing information output from a system to each facility.

[0470] This invention aims to build a system in which servers, terminals, and users cooperate to achieve efficient communication response operations. The main elements are data collection, utilization of machine learning models, real-time monitoring, deployment optimization, and a feedback loop.

[0471] Data collection

[0472] The server includes a data collection module for acquiring communication records and business information. The server uses a database system (e.g., MySQL, PostgreSQL) to efficiently import historical communication activity records and sales event data. It also acquires external data through specific business APIs to enhance the information on business activities.

[0473] Utilization of machine learning models

[0474] The server builds machine learning models based on the collected data. Specifically, it uses open-source libraries such as TensorFlow and scikit-learn to analyze past traffic and predict future demand. The server then uses this predicted data as foundational information to optimize staffing.

[0475] Real-time monitoring

[0476] The server monitors the communication status from communication facilities in real time. The server periodically acquires data from sensors and communication systems located at each facility and processes it to understand the current workload.

[0477] Optimization of placement

[0478] The server optimizes staffing by combining predictions from machine learning models with real-time data. The server utilizes algorithms to quickly calculate the required workforce for each facility and notifies the facility of the results.
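
The disclosure leaves the workforce calculation unspecified. One formula commonly used in call-center planning for this purpose is Erlang C, sketched below as an assumption and not as the method of the invention. The parameters (calls per hour, average handling time in seconds, target service level, and target answer time in seconds) and all function names are hypothetical.

```python
import math


def erlang_c(agents, offered_load):
    """Probability that a call has to wait (Erlang C); load is in erlangs."""
    if agents <= offered_load:
        return 1.0
    blocking = 1.0  # Erlang B, computed with the standard recursion
    for k in range(1, agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return agents * blocking / (agents - offered_load * (1.0 - blocking))


def service_level(agents, offered_load, aht, target_wait):
    """Fraction of calls answered within target_wait seconds."""
    wait_prob = erlang_c(agents, offered_load)
    return 1.0 - wait_prob * math.exp(-(agents - offered_load) * target_wait / aht)


def required_agents(calls_per_hour, aht, target_level, target_wait):
    """Smallest number of agents that meets the target service level."""
    if not 0.0 < target_level < 1.0:
        raise ValueError("target_level must be between 0 and 1 (exclusive)")
    offered_load = calls_per_hour * aht / 3600.0
    agents = math.floor(offered_load) + 1
    while service_level(agents, offered_load, aht, target_wait) < target_level:
        agents += 1
    return agents


agents_needed = required_agents(calls_per_hour=200, aht=180,
                                target_level=0.8, target_wait=20)
```

Here the predicted hourly volume would supply calls_per_hour, so the forecast feeds directly into the headcount reported to each facility.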

[0479] User Interface

[0480] Users can use a terminal to view forecast information and staffing plans generated by the server. The terminal has an intuitive interface, and users can modify the staffing plan as needed and provide feedback to the server.

[0481] As a concrete example, the server instructs each communication facility in real time to allocate the necessary personnel to handle the expected increase in traffic during special weekend campaigns. Furthermore, an example of a prompt message to the generative AI model would be: "We want to make highly accurate predictions of traffic volume in preparation for the special weekend campaign. Please make predictions based on past campaign data and propose a corresponding personnel allocation plan."

[0482] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0483] Step 1:

[0484] The server retrieves communication records and business information from databases and external APIs. As input, the server interacts with these data sources to collect data on past communication activity and business events. This allows the server to obtain information that forms the basis for predicting future communication volumes. Specifically, it periodically calls APIs and adds new information to the database.

[0485] Step 2:

[0486] The server trains a machine learning model based on the collected data. Historical communication data is used as input, and the server utilizes libraries such as TensorFlow for training. The output is a model that predicts future communication volume. Specifically, it prepares a training dataset and repeats calculations to optimize parameters.

[0487] Step 3:

[0488] The server acquires real-time communication data from communication facilities and analyzes the communication load of each facility. It receives real-time data on the number of communications and response times as input, and analyzes this data to understand the current situation. As output, it generates information on the current load status of each facility. Specifically, it periodically collects sensor data and analyzes it in real time.
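
A sliding time window is one straightforward way to keep such figures current. The sketch below counts communications and averages response times over the most recent window; the class name LoadMonitor, the window length, and the assumption that events arrive in time order are all introduced for illustration.

```python
from collections import deque


class LoadMonitor:
    """Track communications and response times within a sliding time window."""

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, response_time_seconds), oldest first

    def record(self, timestamp, response_time):
        # Events are assumed to be recorded in chronological order.
        self.events.append((timestamp, response_time))

    def snapshot(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()
        count = len(self.events)
        mean_rt = sum(rt for _, rt in self.events) / count if count else 0.0
        return {"communications": count, "mean_response_time": mean_rt}


monitor = LoadMonitor(window_seconds=60)
monitor.record(0, 30)
monitor.record(30, 50)
monitor.record(70, 40)
current = monitor.snapshot(now=80)
```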

[0489] Step 4:

[0490] The server optimizes staffing based on the results of a predictive model and real-time data. Predictive data and real-time load data are used as input, and the server executes an optimization algorithm. The output is an optimal staffing plan, which is then notified to each facility. The specific operation involves executing the algorithm, performing calculations, and transmitting the results over the network.

[0491] Step 5:

[0492] Users review the predictive information and optimized staffing plans generated by the server through their terminals. As input, users receive information provided on their terminals and make decisions based on it. As output, they return feedback to the server regarding the staffing plan as needed. Specifically, this includes reviewing data using a visual interface and making corrections.

[0493] Step 6:

[0494] The server collects execution results and user feedback to update the machine learning model. It takes feedback data and execution results as input and uses them to retrain the model to improve its accuracy. The output is an updated model that enables more accurate predictions. Specific operations include analyzing feedback and adjusting parameters based on the retrained model.

[0495] (Application Example 1)

[0496] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0497] In modern society, situations where the communication load is unclear, such as in communication support and security services, make appropriate resource allocation difficult. This can lead to a decline in service quality and response speed, resulting in reduced customer satisfaction. Furthermore, facilities require rapid personnel redeployment in response to changing circumstances, but there is a lack of systems to effectively achieve this. To solve these problems, dynamic resource management based on real-time and predictive data is necessary.

[0498] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0499] In this invention, the server includes means for acquiring past communication records and activity information, means for building a machine learning model using the acquired information to predict future communication volume, means for monitoring the facility's communication status in real time, means for administrators to check the deployment plan and make necessary modifications via a user interface, means for collecting feedback and updating the machine learning model to improve prediction accuracy for subsequent operations, means for analyzing notifications and warnings in specific areas based on real-time data and optimizing resource allocation accordingly, and means for strengthening the generative AI model based on feedback on resource allocation optimization and providing prompt statements for users to generate specific plans. This enables dynamic and optimal resource allocation and improved response quality.

[0500] "Communication records" refer to historical information such as past communication content, time, and connection status.

[0501] "Activity information" refers to data related to the operation of a company or service, such as events, campaigns, and daily work schedules.

[0502] A "machine learning model" is an algorithm or mathematical model built to predict future situations using past data.

[0503] "Communication status" refers to the current load and connection status of each facility where communication is taking place.

[0504] "Resource allocation" refers to actions and plans that appropriately distribute limited resources such as personnel and equipment.

[0505] "User interface" refers to the screens or control panels that system users can directly operate or check.

[0506] "Feedback" refers to information collected from system usage results and user opinions and evaluations, which is used to improve and optimize the system.

[0507] "Real-time data" refers to data about current situations and events that are generated and updated almost instantly.

[0508] A "generative AI model" is a system that uses artificial intelligence technology to analyze and understand data and generate solutions or predictions for specific problems.

[0509] A "prompt message" is a type of text used by users to give instructions or questions to a system.

[0510] In this embodiment, the server is operated using a cloud service (e.g., AWS or Google Cloud). The server acquires communication records and activity information from each facility and builds a machine learning model based on this data. This utilizes software such as Python's Scikit-learn and TensorFlow. Using the machine learning model, it is possible to predict future communication volume with high accuracy. To acquire real-time data, it collaborates with the communication network of each facility and analyzes the current communication status.

[0511] Smartphones and tablet devices, acting as terminals, provide a user interface for administrators, displaying optimized resource allocation plans. Using frameworks like React Native, users can review the allocation plan and make modifications as needed. The terminals feed the results back to the server in real time, and the server continuously updates its machine learning model based on this information.

[0512] For example, if an increase in traffic volume is predicted for a specific region over the weekend, the server will adjust the personnel allocation in that region in advance to ensure appropriate staffing levels. A possible prompt message used by the user in this case might be, "Please provide an optimal staffing plan to handle the increased traffic volume over the weekend." This enables efficient operation of communication support and security services, and ensures high response quality.

[0513] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0514] Step 1:

[0515] The server retrieves historical communication records and activity information through databases and external APIs. This input data includes call volume, duration, and event information. The server cleans and preprocesses this data before arranging it in the appropriate format. At this stage, data processing such as noise removal and format standardization is performed.

[0516] Step 2:

[0517] The server builds a machine learning model using the processed data. First, it generates a training dataset using frameworks such as Python's Scikit-learn or TensorFlow. Based on this dataset, the server trains a generative AI model to predict future traffic volume. In this process, it extracts data features, selects an appropriate algorithm, and trains the model. The output is the model used for prediction.

[0518] Step 3:

[0519] The server collects real-time data from each communication facility. This real-time data includes current communication load and the number of calls in progress. The server analyzes this data and performs data processing to understand the current load situation. The output of this processing is real-time load information for each facility.

[0520] Step 4:

[0521] The terminal optimizes resource allocation based on predictions and real-time data received from the server. Specifically, it executes an optimization algorithm that allocates additional resources to areas with high load. At this stage, it performs data calculations to improve the allocation of personnel and equipment based on the set conditions and presents the optimal allocation plan as output.
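
An algorithm that "allocates additional resources to areas with high load" could, for instance, be a greedy procedure that repeatedly assigns one reserve operator to whichever facility currently has the most active calls per operator. This is a sketch under that assumption; the function name assign_reserve and the facility names are hypothetical.

```python
import heapq


def assign_reserve(load, staff, reserve):
    """Give each of `reserve` extra operators to the most loaded facility.

    load:  {facility: active calls}, staff: {facility: operators on duty}
    """
    plan = dict(staff)

    def calls_per_operator(facility):
        return load[facility] / plan[facility] if plan[facility] else float("inf")

    # Max-heap on calls per operator (negated, since heapq is a min-heap).
    heap = [(-calls_per_operator(f), f) for f in plan]
    heapq.heapify(heap)
    for _ in range(reserve):
        _, facility = heapq.heappop(heap)
        plan[facility] += 1
        heapq.heappush(heap, (-calls_per_operator(facility), facility))
    return plan


new_plan = assign_reserve({"A": 30, "B": 10}, {"A": 5, "B": 5}, reserve=3)
```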

[0522] Step 5:

[0523] The user reviews the optimized resource allocation plan displayed on the terminal. Through the user interface, they can modify or approve the proposed plan. In this step, the displayed plan is the input, and the plan as modified and applied by the user is the output.

[0524] Step 6:

[0525] The server collects user-modified and applied plans as feedback. To improve prediction accuracy in subsequent iterations, it integrates this feedback and updates the machine learning model. The server retrains the generative AI model, improving its performance. The feedback serves as input, and the output is an updated prediction model with improved accuracy.
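
Full retraining is outside the scope of a short example, but the idea of folding user modifications back into later plans can be illustrated with a lighter technique: an exponentially smoothed correction factor per facility. This is a stand-in chosen for illustration, not the retraining procedure of the disclosure; the function name update_correction and the rate parameter are hypothetical.

```python
def update_correction(correction, proposed, applied, rate=0.2):
    """Move per-facility correction factors toward applied / proposed.

    correction: {facility: factor}, defaulting to 1.0 when absent
    proposed:   {facility: headcount proposed by the system}
    applied:    {facility: headcount after the user's modification}
    """
    updated = dict(correction)
    for facility, proposed_count in proposed.items():
        if proposed_count <= 0:
            continue  # no meaningful ratio for an empty proposal
        ratio = applied.get(facility, proposed_count) / proposed_count
        previous = correction.get(facility, 1.0)
        updated[facility] = (1.0 - rate) * previous + rate * ratio
    return updated


factors = update_correction({}, proposed={"A": 10, "B": 8},
                            applied={"A": 12, "B": 8}, rate=0.5)
```

A facility whose plans are repeatedly increased by administrators would thus accumulate a factor above 1.0, which could be applied to later proposals.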

[0526] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0527] This invention relates to a system that recognizes the emotional state of users in communication support operations and utilizes this information to optimize responses and staffing. This system combines communication volume prediction based on past communication records and business activity information with an emotion engine for analyzing the emotional state of users.

[0528] Data analysis and emotion recognition

[0529] The server activates an emotion engine to analyze voice and text data acquired in real time from each communication facility. The emotion engine analyzes the words used by the user, their tone of voice, and the flow of the conversation to recognize the user's emotional state (e.g., joy, anger, anxiety). This information is stored in a database along with other communication status data and further analyzed.

[0530] Optimizing customer service and staffing

[0531] The server adjusts its response in real time based on the user's emotional state, as recognized by the emotion engine. For example, it dynamically allocates the most suitable resources, such as assigning a highly specialized operator to a user exhibiting a specific emotional state. Such adjustments are crucial for providing a better user experience and improving customer satisfaction.

[0532] User interface and feedback loop

[0533] The terminal receives emotion analysis results and optimized response instructions from the server and presents them to the user. The administrator, acting as the user, can also contribute to the overall system improvement by evaluating the quality of responses based on the emotion analysis results and providing feedback. The server utilizes this feedback to continuously improve the emotion engine's algorithm, enabling even more accurate predictions and responses.

[0534] For example, if a customer calls in with a complaint and the emotion engine classifies the call as "angry," the server can immediately assign the most suitable customer support representative based on that result. This implementation improves the quality of service and facilitates smoother problem resolution.

[0535] The following describes the processing flow.

[0536] Step 1:

[0537] The server collects user voice and text data in real time from the communication reception facility. This data is acquired at appropriate times during the call and preprocessed for sentiment analysis.

[0538] Step 2:

[0539] The server uses an emotion engine to analyze the collected audio and text data. This analysis considers the user's tone of voice, word choice, and sentence flow to detect emotional states such as joy, anger, and sadness.

[0540] Step 3:

[0541] The server feeds back the recognized emotional state to the communication response facility and uses it as data to determine the appropriate response policy. It also prioritizes assigning calls to operators with specific expertise and capabilities based on the emotional state.
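
The priority assignment described here could be sketched as a skill-based routing rule: prefer a free operator whose skills match the recognized emotional state, and otherwise fall back to any free operator. The skill labels, the mapping PREFERRED_SKILL, and the names Operator and route_call are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    skills: frozenset
    busy: bool = False


# Hypothetical mapping from emotional state to the preferred operator skill.
PREFERRED_SKILL = {"anger": "complaint_handling", "anxiety": "reassurance"}


def route_call(emotion, operators):
    """Assign the call to a free operator, preferring a matching skill."""
    free = [op for op in operators if not op.busy]
    if not free:
        return None  # no operator available; the call stays in the queue
    skill = PREFERRED_SKILL.get(emotion)
    chosen = next((op for op in free if skill in op.skills), free[0])
    chosen.busy = True
    return chosen


operators = [
    Operator("op1", frozenset({"general"})),
    Operator("op2", frozenset({"general", "complaint_handling"})),
]
first = route_call("anger", operators)
```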

[0542] Step 4:

[0543] The terminal displays the sentiment analysis results and suggested response plans sent from the server. The administrator, as the user, can review this information and send instructions to the response staff as needed.

[0544] Step 5:

[0545] Users can monitor the progress of actual interactions and immediately modify their responses in anticipation of changes in customer emotions or reactions. This operation is performed via a terminal, and the system is prepared to incorporate that information and make further optimizations.

[0546] Step 6:

[0547] The server analyzes the accumulated emotion recognition data and the final response results to update the emotion engine's algorithm. This feedback loop is essential for improving the accuracy of future analyses and the ability to formulate response strategies.

[0548] (Example 2)

[0549] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0550] In customer service operations, it is difficult to optimize staffing based on communication volume and customer emotional state, leading to decreased customer satisfaction and increased operator workload. Conventional systems do not adequately predict communication volume or dynamically allocate staff based on emotional state, leaving challenges in achieving effective customer service.

[0551] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0552] In this invention, the server includes means for acquiring past communication records and business activity information, means for constructing an inference model using the acquired information and predicting future communication volume, means for monitoring the communication status of multiple communication response facilities in real time, means for analyzing user speech content and voice data to recognize emotional states, means for optimizing staffing at each response facility based on prediction results, monitoring results, and emotional state analysis results, and means for notifying each communication response facility of the optimized staffing results. This enables optimal staffing based on dynamic prediction of communication volume and recognition of emotional states, thereby improving customer satisfaction and reducing the burden on operators.

[0553] "Communication records" is a general term for data that includes the content, duration, and participant information of past phone calls and messages.

[0554] "Business activity information" refers to information about the daily business processes carried out by a company or organization, and includes data such as transaction history and interactions with customers.

[0555] An "inference model" is a computational model, either statistical or machine learning-based, that uses past data to predict future events or states.

[0556] A "communication response facility" is a dedicated facility for handling inquiries, orders, complaints, and other matters from customers and users.

[0557] "Emotional state" refers to the state in which a person expresses their emotions, and includes psychological states such as joy, anger, and anxiety.

[0558] "Personnel allocation" refers to placing the appropriate staff or operators in the appropriate locations according to the needs of the work.

[0559] "Optimization methods" refer to the process of making adjustments using calculations and algorithms to maximize the efficiency and effectiveness of the subject.

[0560] "Means of notification" refers to a method or system for transmitting information or instructions to a target person.

[0561] This invention relates to a system that recognizes the emotional state of users in communication support operations and optimizes response methods and staffing. The server acts as the central hub of this system, operating to implement various functions.

[0562] The server first collects communication records and business activity information, and then builds an inference model based on this information. The inference model is implemented using a programming language such as Python or R, and its associated machine learning libraries (e.g., TensorFlow, scikit-learn). This gives the system the ability to predict future communication volume.

[0563] Next, the server receives voice and text data in real time from multiple communication reception facilities. This data is obtained through VoIP services and messaging APIs. The received voice data is converted to text using speech recognition software such as the Google Speech-to-Text API. Then, the emotion engine is activated to analyze the user's speech and tone of voice, and recognize their emotional state in real time. The emotion engine is implemented using NLP (Natural Language Processing) technology and sentiment analysis algorithms.

[0564] The analyzed emotional states are stored in a database and analyzed together with the predicted traffic volume data and the monitoring data on communication status. This allows the server to determine the optimal response method based on the user's emotional state at each communication response facility and dynamically adjust staffing accordingly.

[0565] The terminal receives instructions from the server and visually presents optimized response methods to operators and administrators. Administrators, as users, can evaluate the quality of responses based on the presented information and provide feedback to further improve the system.

[0566] As a concrete example, if the emotion engine detects "anger" in a customer complaint, the server immediately assigns the task to the most suitable customer support representative based on that information. This improves the quality of service and facilitates smoother problem resolution.
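
The assignment in this example can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of this disclosure: the `Operator` record, its per-emotion skill scores, and the `assign_operator` function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    # Hypothetical operator record; the field names are assumptions.
    name: str
    available: bool
    # Skill score per emotion label, e.g. {"anger": 0.9}; higher is better.
    skills: dict = field(default_factory=dict)


def assign_operator(emotion, operators):
    """Return the available operator with the highest skill for `emotion`.

    Returns None when no operator is available.
    """
    candidates = [op for op in operators if op.available]
    if not candidates:
        return None
    return max(candidates, key=lambda op: op.skills.get(emotion, 0.0))


operators = [
    Operator("A", True, {"anger": 0.4, "anxiety": 0.8}),
    Operator("B", True, {"anger": 0.9}),
    Operator("C", False, {"anger": 1.0}),  # best score, but currently busy
]
chosen = assign_operator("anger", operators)
print(chosen.name)  # B
```

Operators who are busy are filtered out before scoring, so the call is never routed to someone who cannot take it.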

[0567] Examples of prompt statements to input into a generative AI model include the following:

[0568] "Explain how to analyze a customer's emotional state in real time during a phone call and dynamically assign the most suitable operator based on the results."

[0569] This invention makes it possible to improve the user experience in communication support operations and to increase the overall efficiency of communication operations.

[0570] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0571] Step 1:

[0572] The server retrieves communication records and business activity information from the database. The retrieved data includes information such as past communication patterns and business trends. The input is past call history and customer interaction records, and the output is a dataset that organizes and aggregates these. The server then performs preprocessing based on this data to build an inference model.

[0573] Step 2:

[0574] The server uses preprocessed data and leverages machine learning libraries (e.g., TensorFlow) to build an inference model. Here, it receives a dataset as input, performs statistical and regression analysis, and generates a model to predict future traffic volume as output. Specifically, the server splits the data into a training set and a test set, evaluates and improves the model, and outputs an optimized model.
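
As a rough sketch of the train/test split and regression analysis in this step, the following fits a straight line by ordinary least squares in plain Python instead of using TensorFlow. The daily call counts, the function names, and the 8/2 chronological split are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mean_absolute_error(a, b, xs, ys):
    """Average absolute gap between the fitted line and the observations."""
    return sum(abs((a + b * x) - y) for x, y in zip(xs, ys)) / len(xs)


# Illustrative data only: day index -> number of calls on that day.
days = list(range(10))
calls = [100, 104, 111, 115, 118, 126, 130, 133, 141, 145]

# Chronological split: train on the first 8 days, test on the last 2.
split = 8
a, b = fit_line(days[:split], calls[:split])
mae = mean_absolute_error(a, b, days[split:], calls[split:])
forecast_day_10 = a + b * 10
print(f"slope={b:.2f} test_mae={mae:.2f} forecast={forecast_day_10:.1f}")
```

The split is chronological rather than random because the test set should represent the future relative to the training data.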

[0575] Step 3:

[0576] The server receives voice and text data in real time from the communication reception facility. The input data is sent via VoIP services and chat APIs, and the output is obtained in the form of text-converted voice data. The server uses speech recognition software to perform the conversion to text format.

[0577] Step 4:

[0578] The server analyzes the text data using an emotion engine. The input is the text data output from step 3, and the emotion recognition algorithm analyzes the content and context of the statements in the data, obtaining the user's emotional state (e.g., joy, anger, anxiety) as output. Specifically, the server uses NLP techniques to perform emotion scoring.
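
Emotion scoring of this kind can be illustrated with a keyword lexicon. This is a deliberately simple stand-in for the emotion engine, which the text describes only as using NLP techniques; the lexicon entries and the function names are assumptions made for this sketch.

```python
import re
from collections import Counter

# Toy lexicon; the words and labels are illustrative assumptions.
LEXICON = {
    "anger": {"angry", "unacceptable", "furious", "terrible"},
    "anxiety": {"worried", "afraid", "nervous", "urgent"},
    "joy": {"thanks", "great", "happy", "perfect"},
}


def score_emotions(text):
    """Count lexicon hits per emotion label in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = Counter()
    for label, words in LEXICON.items():
        scores[label] = sum(1 for tok in tokens if tok in words)
    return scores


def classify_emotion(text):
    """Return the top-scoring label, or "neutral" when nothing matches."""
    scores = score_emotions(text)
    label, hits = max(scores.items(), key=lambda kv: kv[1])
    return label if hits > 0 else "neutral"


print(classify_emotion("This is unacceptable, I am furious."))  # anger
print(classify_emotion("Please send the invoice."))  # neutral
```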

[0579] Step 5:

[0580] The server analyzes the results of emotional state analysis in combination with predicted traffic volume data to determine the optimal response method and staffing. The inputs are emotional state data and predicted traffic volume data, and the output provides operator placement instructions and guidelines for response methods. The server uses an algorithm to perform placement simulations and generate an optimized placement strategy.

[0581] Step 6:

[0582] The terminal displays the response instructions received from the server to the operator. The input is deployment instruction data from the server, and the output is operation guidelines displayed on the user interface. The terminal visualizes the instructions so that the operator can easily confirm them.

[0583] Step 7:

[0584] The user, acting as the administrator, evaluates the quality of interactions and provides necessary feedback to the server. Input consists of the administrator's observed interaction results and evaluation comments. Output is the feedback information sent to the server, which is used to improve sentiment recognition algorithms and placement optimization models. Specific actions include filling out a feedback form.

[0585] (Application Example 2)

[0586] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0587] Modern customer service requires flexible responses that adapt to the diverse emotional states of users. However, conventional systems have struggled to recognize user emotions in real time and adjust their responses accordingly. Furthermore, efficient staffing based on this information has been difficult, so neither the quality of service nor operational efficiency has improved sufficiently.

[0588] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0589] In this invention, the server includes means for acquiring past communication records and business activity information, means for analyzing voice and text data acquired in real time to recognize the user's emotional state, and means for adjusting the response method in real time based on the recognized emotional state. This makes it possible to improve the quality of service and increase customer satisfaction by providing responses that are in line with the user's emotions.

[0590] "Communication records" refer to information that includes past interactions with the user, such as text data and audio data.

[0591] "Business activity information" refers to data related to a company's operations, including business-related information such as sales and customer information.

[0592] A "machine learning model" is a mathematical algorithm that analyzes large amounts of data and uses those patterns to make predictions and classifications.

[0593] A "communication support facility" is a place where communication with users takes place directly via telephone, online chat, etc.

[0594] "Personnel allocation" means placing the necessary personnel in the appropriate locations to perform tasks efficiently.

[0595] An "emotion engine" is an algorithm used to identify a user's emotional state from voice and text data.

[0596] "Response method" refers to the specific means and protocols used when interacting with users, including the personnel and information involved.

[0597] The system for realizing this invention consists of an integrated set of multiple components. The server first acquires historical communication records and business activity information, and uses this to build a machine learning model. This model is used to predict future communication volumes and serves as the basis for efficiently processing data.

[0598] In environments requiring real-time communication responses, the server acquires voice and text data from multiple communication response facilities. This data is input to analyze the user's emotional state using an emotion engine. The emotion engine utilizes an analysis service provided on the cloud (for example, the Emotion Recognition API in Azure Cognitive Services).

[0599] The analyzed emotion data is stored in the server's database and forms the basis for adjusting response methods in real time. This adjustment also contributes to dynamically assigning the appropriate operator, providing the user with the best possible response.

[0600] Furthermore, the terminal, based on instructions from the server, presents the user with response methods through its interface and collects feedback from the user. This feedback is used to improve the emotion engine's algorithm.

[0601] For example, if the server analyzes the user's voice and determines that they are "happy," it can notify the device to provide relaxing content. This allows the user to have a more relaxing time at home.

[0602] An example of a prompt message could be, "If the user's emotion is analyzed as 'fatigue,' what is the best course of action to take?" This could be used to provide feedback to the generative AI model.

[0603] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0604] Step 1:

[0605] The server retrieves historical communication records and business activity information from communication facilities. This data is passed to the server as input, and the information is stored as a dataset for use in machine learning algorithms as output. Specifically, it executes queries from the database, organizes the retrieved data, and prepares it for subsequent analysis processes.

[0606] Step 2:

[0607] The server uses the acquired data to build a machine learning model and predict future traffic volume. A well-organized dataset is fed to the machine learning algorithm as input, and a traffic volume prediction model is generated as output. Specifically, the server uses a data analysis library with Python to train the model. During this process, it analyzes past trends and patterns.

[0608] Step 3:

[0609] The server acquires voice and text data in real time from communication reception facilities nationwide. Voice streams and text messages are received by the server as input, and this data is transferred to the emotion engine as output. Specifically, the data is streamed via an API and prepared for immediate processing.

[0610] Step 4:

[0611] The server uses an emotion engine to analyze voice and text data and recognize the user's emotional state. The emotion engine receives voice and text data as input, and the analyzed emotional state is recorded in a database as output. Specifically, it uses speech recognition software to convert speech to text and an emotion analysis algorithm to classify the emotional state.

[0612] Step 5:

[0613] The server adjusts its response method in real time based on the recognized emotional state. Analysis results are used as input, and the adjusted response procedure and assigned personnel selection are sent to the response facility as output. Specifically, the system selects an appropriate response script and sends instructions to the assigned operator. It also suggests recommended actions for adjusting the response content.

[0614] Step 6:

[0615] The server presents real-time response instructions to the user via the terminal and collects feedback. Response instructions are displayed on the terminal as input, and feedback data is sent to the server as output. In terms of specific operation, the interface visually displays instructions, and user feedback is received through an interactive form.

[0616] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0617] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0618] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0619] [Fourth Embodiment]

[0620] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0621] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0622] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0623] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0624] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0625] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0626] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0627] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0628] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0629] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0630] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0631] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0632] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0633] This invention relates to a system that optimizes staffing by utilizing historical data and real-time conditions in order to improve the efficiency of communication response operations.

[0634] Data collection and prediction

[0635] The server collects communication records and business activity information from databases and external APIs. This data is based on past usage and future business events and contains information useful for predicting traffic volume. The server uses this data to train machine learning models to predict future traffic volume with high accuracy. This predictive information is used by the server to adjust optimal staffing.

[0636] Real-time monitoring and deployment optimization

[0637] The server acquires real-time communication data from communication response facilities and analyzes the communication load at each facility. This allows for an immediate understanding of how many incoming calls each facility is handling. Based on this information, the server optimizes staffing in real time according to the predicted communication volume. Each communication response facility is notified of the optimization results immediately, enabling rapid reassignment of personnel.

[0638] User interface and feedback loop

[0639] Users can use their devices to view the predictions and staffing plans calculated by the server. The devices display the optimization results sent from the server, and users can approve them or modify them as necessary. Furthermore, the server runs a process that collects the results and user feedback and updates the machine learning model to improve the accuracy of future predictions.

[0640] As a concrete example, if a special weekend campaign is scheduled, the server predicts the number of incoming calls before and after the campaign and instructs each customer service facility on the necessary staffing arrangements based on that prediction. The introduction of this system makes it possible to improve the efficiency of communication support operations and ensure high response quality.

[0641] The following describes the processing flow.

[0642] Step 1:

[0643] The server extracts data from a database of past communication records for a certain period, compares it with business activity information, and generates necessary predictive variables. If necessary, it retrieves the latest sales event information via external APIs.

[0644] Step 2:

[0645] The server performs machine learning preprocessing on the collected data. This preprocessing includes data cleansing, imputation of missing values, and removal of outliers. It also normalizes the data as a time series and extracts the features necessary for training.
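
The preprocessing in this step can be sketched in plain Python as follows. The specific choices here are assumptions rather than details from the text: missing values are filled with the median, outliers are clipped to Tukey's fences instead of being dropped (so the hourly series keeps its length), and normalization is min-max scaling.

```python
import statistics


def impute_missing(series):
    """Replace None with the median of the observed values."""
    observed = [v for v in series if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in series]


def clip_outliers(series, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = statistics.quantiles(series, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in series]


def min_max_normalize(series):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


# Illustrative hourly call counts with one gap and one spike.
raw = [120, 130, None, 125, 900, 135, 128, 122]
clean = min_max_normalize(clip_outliers(impute_missing(raw)))
print([round(v, 2) for v in clean])
```

Imputation runs first so that the quartiles used for clipping are computed on a complete series.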

[0646] Step 3:

[0647] The server builds a machine learning model and trains it using preprocessed data. Specifically, it uses a time series forecasting model (e.g., an LSTM network) to predict future traffic volume based on historical data. The parameters of this model are optimized based on the training data.

[0648] Step 4:

[0649] The server inputs new data into the trained model and predicts future traffic volume. The prediction results are output as expected traffic volume per hour and stored in a management database.

[0650] Step 5:

[0651] The server acquires real-time incoming call data from each communication response facility and calculates the current communication load. This makes it possible to understand the response capacity and the number of unanswered calls at each facility in real time.
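
One way to express the load calculation is shown below. The `FacilitySnapshot` fields and the definition of load (occupancy plus the number of waiting calls) are assumptions made for this sketch; the text does not define how the load is computed.

```python
from dataclasses import dataclass


@dataclass
class FacilitySnapshot:
    # Hypothetical real-time snapshot; the field names are assumptions.
    facility: str
    operators_on_duty: int
    calls_in_progress: int
    calls_waiting: int


def communication_load(snap):
    """Return (occupancy, unanswered_calls) for one facility."""
    if snap.operators_on_duty > 0:
        occupancy = snap.calls_in_progress / snap.operators_on_duty
    else:
        occupancy = 0.0
    return occupancy, snap.calls_waiting


def rank_by_pressure(snapshots):
    """Order facilities from most to least pressured."""
    def pressure(snap):
        occupancy, unanswered = communication_load(snap)
        return (unanswered, occupancy)
    return sorted(snapshots, key=pressure, reverse=True)


snapshots = [
    FacilitySnapshot("facility_a", 10, 7, 0),
    FacilitySnapshot("facility_b", 8, 8, 5),
    FacilitySnapshot("facility_c", 12, 12, 2),
]
for snap in rank_by_pressure(snapshots):
    occupancy, unanswered = communication_load(snap)
    print(f"{snap.facility}: occupancy={occupancy:.0%} unanswered={unanswered}")
```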

[0652] Step 6:

[0653] The server integrates the traffic volume forecast data with the real-time load data and uses an algorithm to calculate the optimal staffing allocation. This makes it possible to allocate to each facility a number of personnel that matches its demand.
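
The text does not name the allocation algorithm. As one possible illustration, the sketch below splits a fixed pool of operators across facilities in proportion to demand, using the largest-remainder method to keep the shares integral. The facility names, the demand figures, and the rule that demand is the sum of forecast calls and waiting calls are all assumptions.

```python
def allocate_operators(total_operators, demand):
    """Split `total_operators` across facilities in proportion to demand.

    Uses the largest-remainder method so the integer shares sum exactly
    to `total_operators`.
    """
    total_demand = sum(demand.values())
    if total_demand == 0:
        return {name: 0 for name in demand}
    quotas = {n: total_operators * d / total_demand for n, d in demand.items()}
    plan = {n: int(q) for n, q in quotas.items()}  # floor of each quota
    leftover = total_operators - sum(plan.values())
    # Hand the remaining operators to the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - plan[n], reverse=True)
    for name in by_remainder[:leftover]:
        plan[name] += 1
    return plan


# Demand per facility = forecast calls for the next hour + calls waiting now.
demand = {"facility_a": 150 + 20, "facility_b": 90 + 5, "facility_c": 40 + 0}
plan = allocate_operators(30, demand)
print(plan)  # {'facility_a': 17, 'facility_b': 9, 'facility_c': 4}
```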

[0654] Step 7:

[0655] The terminal displays the optimized staffing plan received from the server and prompts the user for status confirmation. The user can then approve the presented plan or make modifications as needed.

[0656] Step 8:

[0657] The server collects the executed deployment plan and its results, and stores them as evaluation data. This data is used to retrain the model in subsequent iterations, forming a feedback loop aimed at improving prediction accuracy.
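
The feedback loop can be illustrated with a much lighter mechanism than full retraining. The sketch below keeps a multiplicative correction factor that is updated by exponential smoothing each time an actual volume is observed. This is a swapped-in technique for illustration only; the class name and the `alpha` value are assumptions.

```python
class ForecastCorrector:
    """Keeps a multiplicative correction learned from past forecast errors."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha   # weight given to the newest observation
        self.factor = 1.0    # 1.0 means the model's forecast is used as is

    def record(self, forecast, actual):
        """Blend the newest actual/forecast ratio into the running factor."""
        if forecast <= 0:
            return  # a ratio is undefined for a non-positive forecast
        ratio = actual / forecast
        self.factor = (1 - self.alpha) * self.factor + self.alpha * ratio

    def adjust(self, forecast):
        """Apply the learned correction to a new forecast."""
        return forecast * self.factor


corrector = ForecastCorrector(alpha=0.5)
corrector.record(forecast=100, actual=120)  # the model under-predicted by 20%
print(round(corrector.adjust(200), 1))  # 220.0
```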

[0658] (Example 1)

[0659] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0660] In communication support operations, it is difficult to allocate personnel efficiently while taking both historical data and real-time conditions into account, and poor allocation can lead to wasted resources and a decline in response quality. To solve this problem, there is a need for a system that can optimize personnel allocation with high precision by utilizing historical data and real-time communication conditions.

[0661] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0662] In this invention, the server includes means for acquiring communication records and business information, means for constructing a data analysis model using the acquired information and predicting future communication volume, and means for continuously monitoring the communication status of communication facilities. This enables effective optimization of workforce allocation by utilizing historical and real-time data.

[0663] "Communication records" refer to information that details past communication activities, and may include the date and time of communication, sender and receiver, and content of the communication.

[0664] "Business information" refers to data related to business activities, including information such as sales events, customer service status, and working hours.

[0665] A "data analysis model" is a model that uses mathematical or statistical methods to predict future trends based on collected data.

[0666] "Continuous monitoring" involves continuously acquiring and analyzing data in real time, with the aim of always understanding the latest situation.

[0667] "Labor allocation" is the process of determining the number of people and roles required for a specific task or operation, and then assigning personnel based on that.

[0668] "Optimization results" refer to the outcome of deriving the most efficient solution or arrangement for a given purpose, based on prediction and analysis.

[0669] "Notifying" refers to the act of conveying certain information in a way that is understandable to a third party, and specifically means providing information output from a system to each facility.

[0670] This invention aims to build a system in which servers, terminals, and users cooperate to achieve efficient communication response operations. The main elements are data collection, utilization of machine learning models, real-time monitoring, deployment optimization, and a feedback loop.

[0671] Data collection

[0672] The server includes a data collection module for acquiring communication records and business information. The server uses a database system (e.g., MySQL, PostgreSQL) to efficiently import historical communication activity records and sales event data. It also acquires external data through specific business APIs to enhance the information on business activities.

[0673] Utilization of machine learning models

[0674] The server builds machine learning models based on the collected data. Specifically, it uses open-source libraries such as TensorFlow and scikit-learn to analyze past traffic and predict future demand. The server then uses this predicted data as foundational information to optimize staffing.

[0675] Real-time monitoring

[0676] The server monitors the communication status from communication facilities in real time. The server periodically acquires data from sensors and communication systems located at each facility and processes it to understand the current workload.

[0677] Optimization of placement

[0678] The server optimizes staffing by combining predictions from machine learning models with real-time data. The server utilizes algorithms to quickly calculate the required workforce for each facility and notifies the facility of the results.
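
The text leaves the workforce calculation unspecified. A standard way to size staff in call centers is the Erlang C formula, used below purely as an illustration. The 80 percent within 20 seconds service target, the handling time, and the function names are assumptions and do not come from this disclosure.

```python
import math


def erlang_c_wait_probability(agents, offered_load):
    """Probability that a call has to wait (Erlang C formula).

    `offered_load` is in erlangs: arrival rate times mean handling time.
    """
    if agents <= offered_load:
        return 1.0  # the queue is unstable; every call effectively waits
    # Build A^k / k! iteratively to avoid large factorials.
    term = 1.0
    total = 1.0  # k = 0 term of the sum
    for k in range(1, agents):
        term *= offered_load / k
        total += term
    term *= offered_load / agents  # now equals A^N / N!
    top = term * agents / (agents - offered_load)
    return top / (total + top)


def required_agents(calls_per_hour, mean_handle_seconds,
                    target_seconds=20, service_level=0.8):
    """Smallest number of agents that meets the service-level target."""
    offered_load = calls_per_hour * mean_handle_seconds / 3600
    agents = max(1, math.floor(offered_load) + 1)
    while True:
        p_wait = erlang_c_wait_probability(agents, offered_load)
        answered_in_time = 1 - p_wait * math.exp(
            -(agents - offered_load) * target_seconds / mean_handle_seconds)
        if answered_in_time >= service_level:
            return agents
        agents += 1


# Predicted 100 calls in the next hour, 3 minutes average handling time.
print(required_agents(calls_per_hour=100, mean_handle_seconds=180))
```

In a setup like the one described, the predicted call volume for each facility would be fed into `required_agents`, and the result compared with the number of operators currently on duty.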

[0679] User Interface

[0680] Users can use a terminal to view forecast information and staffing plans generated by the server. The terminal has an intuitive interface, and users can modify the staffing plan as needed and provide feedback to the server.

[0681] As a concrete example, the server instructs each communication facility in real time to allocate the necessary personnel to handle the expected increase in traffic during special weekend campaigns. Furthermore, an example of a prompt message to the generative AI model would be: "We want to make highly accurate predictions of traffic volume in preparation for the special weekend campaign. Please make predictions based on past campaign data and propose a corresponding personnel allocation plan."

[0682] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0683] Step 1:

[0684] The server retrieves communication records and business information from databases and external APIs. As input, the server interacts with these data sources to collect data on past communication activity and business events. This allows the server to obtain information that forms the basis for predicting future communication volumes. Specifically, it periodically calls APIs and adds new information to the database.

[0685] Step 2:

[0686] The server trains a machine learning model based on the collected data. Historical communication data is used as input, and the server utilizes libraries such as TensorFlow for training. The output is a model that predicts future communication volume. Specifically, it prepares a training dataset and repeats calculations to optimize parameters.

[0687] Step 3:

[0688] The server acquires real-time communication data from communication facilities and analyzes the communication load of each facility. It receives real-time data on the number of communications and response times as input, and analyzes this data to understand the current situation. As output, it generates information on the current load status of each facility. Specifically, it periodically collects sensor data and analyzes it in real time.

[0689] Step 4:

[0690] The server optimizes staffing based on the results of a predictive model and real-time data. Predictive data and real-time load data are used as input, and the server executes an optimization algorithm. The output is an optimal staffing plan, which is then notified to each facility. The specific operation involves executing the algorithm, performing calculations, and transmitting the results over the network.

[0691] Step 5:

[0692] Users review the predictive information and optimized staffing plans generated by the server through their terminals. As input, users receive information provided on their terminals and make decisions based on it. As output, they return feedback to the server regarding the staffing plan as needed. Specifically, this includes reviewing data using a visual interface and making corrections.

[0693] Step 6:

[0694] The server collects execution results and user feedback to update the machine learning model. It takes feedback data and execution results as input and uses them to retrain the model to improve its accuracy. The output is an updated model that enables more accurate predictions. Specific operations include analyzing feedback and adjusting parameters based on the retrained model.

[0695] (Application Example 1)

[0696] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0697] In fields such as communication support and security services, the communication load is often unclear, which makes appropriate resource allocation difficult. This can lead to a decline in service quality and response speed, resulting in reduced customer satisfaction. Furthermore, facilities require rapid personnel redeployment in response to changing circumstances, but there is a lack of systems to effectively achieve this. To solve these problems, dynamic resource management based on real-time and predictive data is necessary.

[0698] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0699] In this invention, the server includes means for acquiring past communication records and activity information, means for building a machine learning model using the acquired information to predict future communication volume, means for monitoring the facility's communication status in real time, means for administrators to check the deployment plan and make necessary modifications via a user interface, means for collecting feedback and updating the machine learning model to improve prediction accuracy for subsequent operations, means for analyzing notifications and warnings in specific areas based on real-time data and optimizing resource allocation accordingly, and means for strengthening the generative AI model based on feedback on resource allocation optimization and providing prompt statements for users to generate specific plans. This enables dynamic and optimal resource allocation and improved response quality.

[0700] "Communication records" refer to historical information such as past communication content, time, and connection status.

[0701] "Activity information" refers to data related to the operation of a company or service, such as events, campaigns, and daily work schedules.

[0702] A "machine learning model" is an algorithm or mathematical model built to predict future situations using past data.

[0703] "Communication status" refers to the current load and connection status of each facility where communication is taking place.

[0704] "Resource allocation" refers to actions and plans that appropriately distribute limited resources such as personnel and equipment.

[0705] "User interface" refers to the screens or control panels that system users can directly operate or check.

[0706] "Feedback" refers to information collected from system usage results and user opinions and evaluations, which is used to improve and optimize the system.

[0707] "Real-time data" refers to data about current situations and events that are generated and updated almost instantly.

[0708] A "generative AI model" is a system that uses artificial intelligence technology to analyze and understand data and generate solutions or predictions for specific problems.

[0709] A "prompt message" is a type of text used by users to give instructions or questions to a system.

[0710] In this embodiment, the server runs on a cloud service (e.g., AWS or Google Cloud). The server acquires communication records and activity information from each facility and builds a machine learning model from this data, using software such as the Python libraries Scikit-learn and TensorFlow. The machine learning model makes it possible to predict future communication volume with high accuracy. To acquire real-time data, the server connects to the communication network of each facility and analyzes the current communication status.

[0711] Smartphones and tablet devices, acting as terminals, provide a user interface for administrators and display the optimized resource allocation plan. The interface is built with a framework such as React Native and lets administrators review the allocation plan and modify it as needed. The terminals feed the results back to the server in real time, and the server continuously updates its machine learning model based on this information.

[0712] For example, if an increase in traffic volume is predicted for a specific region over the weekend, the server will adjust the personnel allocation in that region in advance to ensure appropriate staffing levels. A possible prompt message used by the user in this case might be, "Please provide an optimal staffing plan to handle the increased traffic volume over the weekend." This enables efficient operation of communication support and security services, and ensures high response quality.

[0713] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0714] Step 1:

[0715] The server retrieves historical communication records and activity information through databases and external APIs. This input data includes call volume, duration, and event information. The server cleans and preprocesses this data before arranging it in the appropriate format. At this stage, data processing such as noise removal and format standardization is performed.
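The cleaning and standardization in Step 1 can be sketched as follows. This is a minimal illustration rather than the processing actually used: the field names (`started_at`, `duration_sec`, `facility`, `event`) and the rule that non-positive durations count as noise are assumptions introduced for the example.

```python
from datetime import datetime

def clean_records(raw_records):
    """Drop malformed rows and standardize the fields of raw call records."""
    cleaned = []
    for row in raw_records:
        try:
            # Format standardization: parse ISO 8601 timestamps, coerce numbers.
            started = datetime.fromisoformat(row["started_at"].strip())
            duration = float(row["duration_sec"])
        except (KeyError, ValueError, TypeError, AttributeError):
            continue  # noise removal: skip rows that cannot be parsed
        if duration <= 0:
            continue  # noise removal: zero or negative durations (assumed rule)
        cleaned.append({
            "facility": str(row.get("facility", "")).strip().lower(),
            "started_at": started,
            "duration_sec": duration,
            "event": str(row.get("event") or "none").strip().lower(),
        })
    return cleaned

def hourly_call_volume(records):
    """Aggregate cleaned records into call counts per (facility, hour)."""
    counts = {}
    for r in records:
        hour = r["started_at"].replace(minute=0, second=0, microsecond=0)
        key = (r["facility"], hour)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

The hourly counts produced by `hourly_call_volume` are one plausible shape for the training data used in the next step.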

[0716] Step 2:

[0717] The server builds a machine learning model using the processed data. First, it generates a training dataset using frameworks such as Python's Scikit-learn or TensorFlow. Based on this dataset, the server trains a generative AI model to predict future traffic volume. In this process, it extracts data features, selects an appropriate algorithm, and trains the model. The output is the model used for prediction.
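To make the sequence of feature extraction, training, and prediction in Step 2 concrete, the sketch below deliberately replaces the Scikit-learn or TensorFlow model named above with a dependency-free seasonal-average baseline. The class name `SeasonalMeanModel`, the chosen features (day of week, hour, event flag), and the fallback to the overall mean are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

def extract_features(ts, has_event):
    """Feature extraction: day of week, hour of day, and an event flag."""
    return (ts.weekday(), ts.hour, bool(has_event))

class SeasonalMeanModel:
    """Predicts volume as the historical mean for the same feature tuple."""

    def __init__(self):
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)
        self._total = 0.0
        self._n = 0

    def fit(self, samples):
        # samples: iterable of (timestamp, has_event, observed_volume)
        for ts, has_event, volume in samples:
            key = extract_features(ts, has_event)
            self._sums[key] += volume
            self._counts[key] += 1
            self._total += volume
            self._n += 1
        return self

    def predict(self, ts, has_event=False):
        key = extract_features(ts, has_event)
        if self._counts.get(key):
            return self._sums[key] / self._counts[key]
        # Unseen feature combination: fall back to the overall mean.
        return self._total / self._n if self._n else 0.0
```

A baseline of this kind is also useful as a yardstick: a learned model that cannot beat the seasonal mean on held-out data is probably not worth deploying.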

[0718] Step 3:

[0719] The server collects real-time data from each communication facility. This real-time data includes current communication load and the number of calls in progress. The server analyzes this data and performs data processing to understand the current load situation. The output of this processing is real-time load information for each facility.
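One simple way to turn the real-time data of Step 3 into per-facility load information is a ratio of calls in progress to operators on duty. The sketch below assumes hypothetical snapshot fields (`facility`, `calls_in_progress`, `operators_on_duty`) and an arbitrary overload threshold of 0.85; neither comes from the source.

```python
def facility_load(snapshots):
    """Return the load ratio (calls in progress / operators on duty) per facility."""
    load = {}
    for s in snapshots:
        operators = s["operators_on_duty"]
        if operators <= 0:
            # An unstaffed facility with active calls is treated as saturated.
            ratio = float("inf") if s["calls_in_progress"] > 0 else 0.0
        else:
            ratio = s["calls_in_progress"] / operators
        load[s["facility"]] = ratio
    return load

def overloaded(load, threshold=0.85):
    """List facilities whose load ratio meets or exceeds the threshold."""
    return sorted(name for name, ratio in load.items() if ratio >= threshold)
```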

[0720] Step 4:

[0721] The terminal optimizes resource allocation based on predictions and real-time data received from the server. Specifically, it executes an optimization algorithm that allocates additional resources to areas with high load. At this stage, it performs data calculations to improve the allocation of personnel and equipment based on the set conditions and presents the optimal allocation plan as output.
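The source does not name the optimization algorithm used in Step 4. As one possible reading of "allocates additional resources to areas with high load", the following sketch greedily hands out a pool of reserve staff, one person at a time, to whichever area currently has the most expected calls per staff member. The function names and the inputs `demand`, `staff`, and `reserve` are hypothetical.

```python
import heapq

def _pressure(calls, headcount):
    """Expected concurrent calls per staff member (infinite if unstaffed)."""
    if headcount <= 0:
        return float("inf") if calls > 0 else 0.0
    return calls / headcount

def allocate_reserve(demand, staff, reserve):
    """Greedily assign `reserve` extra staff to the most loaded areas.

    demand: expected concurrent calls per area
    staff:  current headcount per area (missing areas count as 0)
    """
    plan = {area: staff.get(area, 0) for area in demand}
    # heapq is a min-heap, so pressures are negated to pop the largest first.
    heap = [(-_pressure(demand[a], plan[a]), a) for a in plan]
    heapq.heapify(heap)
    for _ in range(reserve):
        if not heap:
            break
        _, area = heapq.heappop(heap)
        plan[area] += 1
        heapq.heappush(heap, (-_pressure(demand[area], plan[area]), area))
    return plan
```

The greedy rule keeps the worst-off area from falling further behind, but it ignores constraints such as skills, shifts, and travel time that a production optimizer would need to model.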

[0722] Step 5:

[0723] The user reviews the optimized resource allocation plan displayed on the terminal. Through the user interface, the user can modify or approve the proposed plan. The input to this step is the displayed plan together with the user's edits, and the output is the plan as finally modified and applied by the user.

[0724] Step 6:

[0725] The server collects the plans modified and applied by users as feedback. To improve prediction accuracy in subsequent rounds, it integrates this feedback and updates the machine learning model. The server also retrains the generative AI model to improve its performance. The input is the feedback, and the output is a new prediction model with improved accuracy.
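Full retraining as described in Step 6 depends on the model in use. As a lightweight stand-in, the sketch below shows one way administrator edits could be folded back into future plans: a per-area correction factor updated by exponential smoothing. The function `update_correction`, the smoothing rate, and the idea of a correction factor are assumptions, not the method of the source.

```python
def update_correction(correction, proposed, applied, rate=0.3):
    """Blend the administrator's edits into per-area correction factors.

    correction: current factor per area (1.0 means "no adjustment")
    proposed:   headcount the system suggested per area
    applied:    headcount the administrator actually applied per area
    """
    updated = dict(correction)
    for area, suggested in proposed.items():
        if suggested <= 0 or area not in applied:
            continue
        observed = applied[area] / suggested  # > 1 means the admin added staff
        previous = updated.get(area, 1.0)
        updated[area] = (1 - rate) * previous + rate * observed
    return updated
```

Multiplying the next proposed headcount by the stored factor nudges future plans toward what administrators have been applying in practice.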

[0726] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0727] This invention relates to a system that recognizes the emotional state of users in communication support operations and utilizes this information to optimize responses and staffing. This system combines communication volume prediction based on past communication records and business activity information with an emotion engine for analyzing the emotional state of users.

[0728] Data analysis and emotion recognition

[0729] The server activates an emotion engine to analyze voice and text data acquired in real time from each communication facility. The emotion engine analyzes the words used by the user, their tone of voice, and the flow of the conversation to recognize the user's emotional state (e.g., joy, anger, anxiety). This information is stored in a database along with other communication status data and further analyzed.

[0730] Optimizing customer service and staffing

[0731] The server adjusts its response in real time based on the user's emotional state, as recognized by the emotion engine. For example, it dynamically allocates the most suitable resources, such as assigning a highly specialized operator to a user exhibiting a specific emotional state. Such adjustments are crucial for providing a better user experience and improving customer satisfaction.

[0732] User interface and feedback loop

[0733] The terminal receives the emotion analysis results and optimized response instructions from the server and presents them to the user. The administrator, acting as the user, can also contribute to improving the overall system by evaluating the quality of responses based on the emotion analysis results and providing feedback. The server uses this feedback to continuously improve the emotion engine's algorithm, enabling more accurate predictions and responses.

[0734] For example, if a customer calls in with a complaint and the emotion engine analyzes the call as "angry," the server can immediately assign the most suitable customer support representative based on that profile. This implementation improves the quality of service and facilitates smoother problem resolution.

[0735] The following describes the processing flow.

[0736] Step 1:

[0737] The server collects user voice and text data in real time from the communication reception facility. This data is acquired at appropriate times during the call and preprocessed for sentiment analysis.

[0738] Step 2:

[0739] The server uses an emotion engine to analyze the collected audio and text data. This analysis considers the user's tone of voice, word choice, and sentence flow to detect emotional states such as joy, anger, and sadness.

[0740] Step 3:

[0741] The server feeds back the recognized emotional state to the communication response facility and uses it as data to determine the appropriate response policy. It also prioritizes assigning calls to operators with specific expertise and capabilities based on the emotional state.
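The priority assignment in Step 3 can be illustrated with a small routing function. This is a sketch under assumptions: operator records with the hypothetical fields `name`, `available`, `skills`, and `active_calls`, and a rule that prefers an operator skilled in the detected emotion and breaks ties by current workload.

```python
def pick_operator(emotion, operators):
    """Choose an available operator, preferring one skilled in `emotion`."""
    available = [op for op in operators if op["available"]]
    if not available:
        return None
    # Sort key: skilled operators first (False < True), then fewest active calls.
    best = min(
        available,
        key=lambda op: (emotion not in op["skills"], op["active_calls"]),
    )
    return best["name"]
```

With this rule, a caller classified as "anger" goes to an operator listed as skilled in handling anger even if a less specialized operator is idle, which matches the priority described above.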

[0742] Step 4:

[0743] The terminal displays the sentiment analysis results and suggested response plans sent from the server. The administrator, as the user, can review this information and send instructions to the response staff as needed.

[0744] Step 5:

[0745] Users can monitor the progress of actual interactions and immediately modify their responses in anticipation of changes in customer emotions or reactions. This operation is performed via a terminal, and the system is prepared to incorporate that information and make further optimizations.

[0746] Step 6:

[0747] The server analyzes the accumulated emotion recognition data and the final response results to update the emotion engine's algorithm. This feedback loop is essential for improving the accuracy of future analyses and the ability to formulate response strategies.

[0748] (Example 2)

[0749] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0750] In customer service operations, it is difficult to optimize staffing based on communication volume and customer emotional state, leading to decreased customer satisfaction and increased operator workload. Conventional systems do not adequately predict communication volume or dynamically allocate staff based on emotional state, leaving challenges in achieving effective customer service.

[0751] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0752] In this invention, the server includes means for acquiring past communication records and business activity information, means for constructing an inference model using the acquired information and predicting future communication volume, means for monitoring the communication status of multiple communication response facilities in real time, means for analyzing user speech content and voice data to recognize emotional states, means for optimizing staffing at each response facility based on prediction results, monitoring results, and emotional state analysis results, and means for notifying each communication response facility of the optimized staffing results. This enables optimal staffing based on dynamic prediction of communication volume and recognition of emotional states, thereby improving customer satisfaction and reducing the burden on operators.

[0753] "Communication records" is a general term for data that includes the content, duration, and participant information of past phone calls and messages.

[0754] "Business activity information" refers to information about the daily business processes carried out by a company or organization, and includes data such as transaction history and interactions with customers.

[0755] An "inference model" is a computational model, either statistical or machine learning-based, that uses past data to predict future events or states.

[0756] A "communication response facility" is a dedicated facility for handling inquiries, orders, complaints, and other matters from customers and users.

[0757] "Emotional state" refers to the state in which a person expresses their emotions, and includes psychological states such as joy, anger, and anxiety.

[0758] "Personnel allocation" refers to placing the appropriate staff or operators in the appropriate locations according to the needs of the work.

[0759] "Optimization methods" refer to the process of making adjustments using calculations and algorithms to maximize the efficiency and effectiveness of the subject.

[0760] "Means of notification" refers to a method or system for transmitting information or instructions to a target person.

[0761] This invention relates to a system that recognizes the emotional state of users in communication support operations and optimizes response methods and staffing. The server acts as the central hub of this system, operating to implement various functions.

[0762] The server first collects communication records and business activity information, and then builds an inference model based on this information. The inference model is implemented using a programming language such as Python or R, and its associated machine learning libraries (e.g., TensorFlow, scikit-learn). This gives the system the ability to predict future communication volume.

[0763] Next, the server receives voice and text data in real time from multiple communication reception facilities. This data is obtained through VoIP services and messaging APIs. The received voice data is converted to text using speech recognition software such as the Google Speech-to-Text API. Then, the emotion engine is activated to analyze the user's speech and tone of voice, and recognize their emotional state in real time. The emotion engine is implemented using NLP (Natural Language Processing) technology and sentiment analysis algorithms.

[0764] The analyzed emotional states are stored in a database and analyzed in parallel in combination with predicted traffic volume data and monitoring data on communication status. This allows the server to determine the optimal response method based on the user's emotional state at each communication response facility and dynamically adjust staffing accordingly.

[0765] The terminal receives instructions from the server and visually presents optimized response methods to operators and administrators. Administrators, as users, can evaluate the quality of responses based on the presented information and provide feedback to further improve the system.

[0766] As a concrete example, if the emotion engine detects "anger" in a customer complaint, the server immediately assigns the task to the most suitable customer support representative based on that information. This improves the quality of service and facilitates smoother problem resolution.

[0767] Examples of prompt statements to input into a generative AI model include the following:

[0768] "Explain how to analyze a customer's emotional state in real time during a phone call and dynamically assign the most suitable operator based on the results."

[0769] This invention makes it possible to improve the user experience in communication support operations and to increase the overall efficiency of communication operations.

[0770] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0771] Step 1:

[0772] The server retrieves communication records and business activity information from the database. The retrieved data includes information such as past communication patterns and business trends. The input is past call history and customer interaction records, and the output is a dataset that organizes and aggregates these. The server then performs preprocessing based on this data to build an inference model.

[0773] Step 2:

[0774] The server uses preprocessed data and leverages machine learning libraries (e.g., TensorFlow) to build an inference model. Here, it receives a dataset as input, performs statistical and regression analysis, and generates a model to predict future traffic volume as output. Specifically, the server splits the data into a training set and a test set, evaluates and improves the model, and outputs an optimized model.
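The regression analysis and train/test evaluation in Step 2 can be sketched without external libraries. The example below fits a one-variable least-squares line and evaluates it on the most recent samples. It stands in for the TensorFlow-based model mentioned above; the function names and the time-ordered 75/25 split are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and observations."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)

def train_and_evaluate(xs, ys, test_fraction=0.25):
    """Train on the earlier samples and hold out the most recent ones."""
    split = max(1, int(len(xs) * (1 - test_fraction)))
    model = fit_line(xs[:split], ys[:split])
    test_x, test_y = xs[split:], ys[split:]
    error = mean_abs_error(model, test_x, test_y) if test_x else None
    return model, error
```

Holding out the latest samples rather than a random subset avoids evaluating a forecasting model on data that precedes its training data.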

[0775] Step 3:

[0776] The server receives voice and text data in real time from the communication reception facility. The input data is sent via VoIP services and chat APIs, and the output is obtained in the form of text-converted voice data. The server uses speech recognition software to perform the conversion to text format.

[0777] Step 4:

[0778] The server analyzes the text data using an emotion engine. The input is the text data output from step 3, and the emotion recognition algorithm analyzes the content and context of the statements in the data, obtaining the user's emotional state (e.g., joy, anger, anxiety) as output. Specifically, the server uses NLP techniques to perform emotion scoring.
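Emotion scoring in Step 4 would normally rely on a trained NLP model. The sketch below substitutes a tiny keyword lexicon purely to show the shape of the computation, in which text goes in and a score per emotion and a final label come out. The word lists, the English sample vocabulary, and the "neutral" fallback are hypothetical.

```python
import re

# Hypothetical keyword lists; a real deployment would use a trained model.
LEXICON = {
    "anger": {"unacceptable", "furious", "terrible", "angry"},
    "anxiety": {"worried", "afraid", "unsure", "nervous"},
    "joy": {"thanks", "great", "happy", "perfect"},
}

def score_emotions(text):
    """Count lexicon hits per emotion in the transcribed text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        emotion: sum(token in words for token in tokens)
        for emotion, words in LEXICON.items()
    }

def classify_emotion(text):
    """Return the highest-scoring emotion, or "neutral" if nothing matched."""
    scores = score_emotions(text)
    emotion, best = max(scores.items(), key=lambda kv: kv[1])
    return emotion if best > 0 else "neutral"
```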

[0779] Step 5:

[0780] The server analyzes the results of emotional state analysis in combination with predicted traffic volume data to determine the optimal response method and staffing. The inputs are emotional state data and predicted traffic volume data, and the output provides operator placement instructions and guidelines for response methods. The server uses an algorithm to perform placement simulations and generate an optimized placement strategy.
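How predicted volume and emotional state might be combined in Step 5 is not specified in detail. One simple possibility is to size the team from the predicted workload and then reserve a share of senior operators in proportion to the fraction of calls classified as angry. The workload formula, the target occupancy of 0.8, and all parameter names below are assumptions, and the calculation ignores queueing effects that a fuller model would include.

```python
import math

def staffing_plan(predicted_calls_per_hour, avg_handle_sec, anger_ratio,
                  target_occupancy=0.8):
    """Estimate headcount from workload and reserve seniors for angry callers."""
    # Workload in operator-hours per hour of operation.
    workload_hours = predicted_calls_per_hour * avg_handle_sec / 3600
    operators = math.ceil(workload_hours / target_occupancy)
    seniors = min(operators, math.ceil(operators * anger_ratio))
    return {"operators": operators, "senior_operators": seniors}
```

For example, 100 predicted calls per hour at 300 seconds each is about 8.3 operator-hours of work, which at 80% occupancy rounds up to 11 operators.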

[0781] Step 6:

[0782] The terminal displays the response instructions received from the server to the operator. The input is deployment instruction data from the server, and the output is operation guidelines displayed on the user interface. The terminal visualizes the instructions so that the operator can easily confirm them.

[0783] Step 7:

[0784] The user, acting as the administrator, evaluates the quality of interactions and provides the necessary feedback to the server. The input consists of the interaction results observed by the administrator and the administrator's evaluation comments. The output is the feedback information sent to the server, which is used to improve the emotion recognition algorithm and the placement optimization model. A specific action is, for example, filling out a feedback form.

[0785] (Application Example 2)

[0786] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0787] Modern customer service requires flexible responses that adapt to the diverse emotional states of users. However, conventional systems have struggled to recognize user emotions in real time and adjust their responses accordingly. Furthermore, efficient staffing based on this information has been difficult, resulting in insufficient improvement in the quality of service and maximization of operational efficiency.

[0788] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0789] In this invention, the server includes means for acquiring past communication records and business activity information, means for analyzing voice and text data acquired in real time to recognize the user's emotional state, and means for adjusting the response method in real time based on the recognized emotional state. This makes it possible to improve the quality of service and increase customer satisfaction by providing responses that are in line with the user's emotions.

[0790] "Communication records" refer to information that includes past interactions with the user, such as text data and audio data.

[0791] "Business activity information" refers to data related to a company's operations, including business-related information such as sales and customer information.

[0792] A "machine learning model" is a mathematical algorithm that analyzes large amounts of data and uses those patterns to make predictions and classifications.

[0793] A "communication response facility" is a place where communication with users takes place directly via telephone, online chat, etc.

[0794] "Personnel allocation" means placing the necessary personnel in the appropriate locations to perform tasks efficiently.

[0795] An "emotion engine" is an algorithm used to identify a user's emotional state from voice and text data.

[0796] "Response method" refers to the specific means and protocols used when interacting with users, including the personnel and information involved.

[0797] The system for realizing this invention consists of an integrated set of multiple components. The server first acquires historical communication records and business activity information, and uses this to build a machine learning model. This model is used to predict future communication volumes and serves as the basis for efficiently processing data.

[0798] In environments requiring real-time communication responses, the server acquires voice and text data from multiple communication response facilities. This data is input to analyze the user's emotional state using an emotion engine. The emotion engine utilizes an analysis service provided on the cloud (for example, the Emotion Recognition API in Azure Cognitive Services).

[0799] The analyzed emotion data is stored in the server's database and forms the basis for adjusting response methods in real time. This adjustment also contributes to dynamically assigning the appropriate operator, providing the user with the best possible response.

[0800] Furthermore, the terminal, based on instructions from the server, presents the user with response methods through its interface and collects feedback from the user. This feedback is used to improve the emotion engine's algorithm.

[0801] For example, if the server analyzes the user's voice and determines that the user is "happy," it can notify the terminal to provide relaxing content. This allows the user to have a more relaxing time at home.

[0802] An example of a prompt message could be, "If the user's emotion is analyzed as 'fatigue,' what is the best course of action to take?" This could be used to provide feedback to the generative AI model.

[0803] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0804] Step 1:

[0805] The server retrieves historical communication records and business activity information from communication facilities. This data is passed to the server as input, and the information is stored as a dataset for use in machine learning algorithms as output. Specifically, it executes queries from the database, organizes the retrieved data, and prepares it for subsequent analysis processes.

[0806] Step 2:

[0807] The server uses the acquired data to build a machine learning model and predict future traffic volume. A well-organized dataset is fed to the machine learning algorithm as input, and a traffic volume prediction model is generated as output. Specifically, the server uses a data analysis library with Python to train the model. During this process, it analyzes past trends and patterns.

[0808] Step 3:

[0809] The server acquires voice and text data in real time from communication reception facilities nationwide. Voice streams and text messages are received by the server as input, and this data is transferred to the emotion engine as output. Specifically, the data is streamed via an API and prepared for immediate processing.

[0810] Step 4:

[0811] The server uses an emotion engine to analyze voice and text data and recognize the user's emotional state. The emotion engine receives voice and text data as input, and the analyzed emotional state is recorded in a database as output. Specifically, it uses speech recognition software to convert speech to text and an emotion analysis algorithm to classify the emotional state.

[0812] Step 5:

[0813] The server adjusts its response method in real time based on the recognized emotional state. Analysis results are used as input, and the adjusted response procedure and assigned personnel selection are sent to the response facility as output. Specifically, the system selects an appropriate response script and sends instructions to the assigned operator. It also suggests recommended actions for adjusting the response content.
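The selection of a response script and recommended action in Step 5 can be sketched as a lookup keyed by the recognized emotion. The script names, the suggestions, and the confidence threshold below are hypothetical; the source only states that a script is selected and instructions are sent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instruction:
    script: str          # name of the response script to follow
    needs_senior: bool   # whether to route to a senior operator
    suggestion: str      # recommended action shown to the operator

# Hypothetical mapping from emotional state to response instruction.
RESPONSE_TABLE = {
    "anger": Instruction("deescalation", True,
                         "Acknowledge the problem before proposing a fix."),
    "anxiety": Instruction("reassurance", False,
                           "Explain each next step and confirm understanding."),
}
DEFAULT = Instruction("standard", False, "Follow the normal procedure.")

def adjust_response(emotion, confidence, min_confidence=0.6):
    """Pick an instruction; fall back to the default on weak evidence."""
    if confidence < min_confidence:
        return DEFAULT
    return RESPONSE_TABLE.get(emotion, DEFAULT)
```

Falling back to the standard script when the classifier is unsure avoids switching an operator into a de-escalation script on weak evidence.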

[0814] Step 6:

[0815] The server presents real-time response instructions to the user via the terminal and collects feedback. Response instructions are displayed on the terminal as input, and feedback data is sent to the server as output. In terms of specific operation, the interface visually displays instructions, and user feedback is received through an interactive form.

[0816] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0817] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0818] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0819] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping. Specifically, the emotion identification model 59 may determine the user's emotion according to an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0820] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0821] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0822] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion is located, the more visible (expressed in actions) it becomes.

[0823] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, the result is discomfort, and when they approach the ideal, the result is pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "reaction," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0824] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0825] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
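The source does not state how the user's emotion is derived from the emotion values output by the neural network. A minimal sketch, assuming the emotion with the highest value is selected and that the next-highest values are reported as neighboring candidates, could look like this. The function name and the sample values are hypothetical.

```python
def determine_emotion(emotion_values, top_k=3):
    """Return the highest-valued emotion and the top_k candidates.

    emotion_values: mapping from emotion name to the value output by the
    network (assumed here to be comparable scores, higher meaning stronger).
    """
    ranked = sorted(emotion_values.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0][0], [name for name, _ in ranked[:top_k]]

# Hypothetical network output in which nearby emotions have similar values.
example_values = {"reassured": 0.81, "calm": 0.78, "confident": 0.74, "anxious": 0.05}
```

Because emotions that sit close together on the map are trained to have similar values, the runners-up returned here will tend to be neighbors of the selected emotion.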

[0826] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0827] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0828] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.

[0829] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0830] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0831] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0832] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0833] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0834] More specifically, the hardware structure of these various processors can be an electrical circuit that combines circuit elements such as semiconductor devices. The specific processing described above is merely an example. Unnecessary steps may therefore be deleted, new steps added, or the processing order rearranged, as long as the result does not deviate from the main purpose.

[0835] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of those aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Unnecessary parts may therefore be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as the result does not deviate from the essence of the technical aspects of this disclosure. Furthermore, to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge that is not needed to enable implementation of those aspects have been omitted from the descriptions and illustrations presented above.

[0836] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted as being incorporated by reference.

[0837] The following is further disclosed regarding the embodiments described above.

[0838] (Claim 1)

[0839] Means for obtaining past communication records and business activity information,

[0840] A means for constructing a machine learning model using the acquired information and predicting future communication volume,

[0841] A means of monitoring the communication status of communication facilities nationwide in real time,

[0842] A means for optimizing staffing at each reception facility based on the aforementioned prediction and monitoring results,

[0843] A means of notifying each communication-responding facility of the optimized staffing results,

[0844] A system comprising the above means.

[0845] (Claim 2)

[0846] The system according to claim 1, further comprising means for updating the machine learning model based on monitoring results of communication facilities and past deployment instruction records.

[0847] (Claim 3)

[0848] The system according to claim 1, including an interface that allows an administrator to input modifications to the optimization of personnel allocation.
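The means listed in claim 1 above (prediction, real-time monitoring, staffing optimization, and notification) could be sketched as follows. This is an illustrative sketch only. A moving average stands in for the machine learning model, and the `Facility` fields, the facility names, the per-agent capacity, and all function names are assumptions that do not appear in this disclosure.

```python
import math
from dataclasses import dataclass


@dataclass
class Facility:
    name: str
    current_queue: int    # contacts waiting now (the real-time monitoring result)
    calls_per_agent: int  # contacts one agent handles per hour (assumed)


def forecast_volume(history, window=3):
    """Predict next-hour volume as the mean of the last `window` observations.

    A moving average is used here as a simple stand-in for the trained model.
    """
    recent = history[-window:]
    return sum(recent) / len(recent)


def required_agents(predicted, facility):
    """Agents needed to cover the predicted demand plus the live backlog."""
    demand = predicted + facility.current_queue
    return math.ceil(demand / facility.calls_per_agent)


def plan_staffing(histories, facilities):
    """Return {facility name: agent count}, the result to notify to each site."""
    return {
        f.name: required_agents(forecast_volume(histories[f.name]), f)
        for f in facilities
    }


facilities = [Facility("tokyo", 12, 10), Facility("osaka", 0, 10)]
histories = {"tokyo": [80, 100, 120], "osaka": [40, 50, 60]}
plan = plan_staffing(histories, facilities)
```

In this sketch the monitoring result enters the plan as a backlog added to the forecast, so a facility with a growing queue is assigned more agents even when its predicted volume is unchanged.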

[0849] "Example 1"

[0850] (Claim 1)

[0851] Means for acquiring communication records and business information,

[0852] A means to build a data analysis model using the acquired information and predict future communication volume,

[0853] A means of continuously monitoring the communication status of communication facilities,

[0854] A means for optimizing the workforce allocation at each facility based on prediction and monitoring results,

[0855] A means of notifying each communication facility of the optimization results,

[0856] A system comprising the above means.

[0857] (Claim 2)

[0858] The system according to claim 1, further comprising means for updating the data analysis model based on the monitoring results of communication facilities and past deployment instruction records.

[0859] (Claim 3)

[0860] The system according to claim 1, which includes a display device that allows an administrator to input modifications to the optimization of the labor force allocation.

[0861] "Application Example 1"

[0862] (Claim 1)

[0863] Means for obtaining past communication records and activity information,

[0864] A means for constructing a machine learning model using the acquired information and predicting future communication volume,

[0865] A means of monitoring the facility's communication status in real time,

[0866] A means for optimizing resource allocation at each facility based on the aforementioned prediction and monitoring results,

[0867] A means of notifying each facility of the optimized allocation results,

[0868] A means for administrators to review the deployment plan via a user interface and modify it as needed,

[0869] A means of collecting feedback and updating the machine learning model to improve the accuracy of future predictions,

[0870] A system comprising the above means.

[0871] (Claim 2)

[0872] The system according to claim 1, comprising means for analyzing notifications and warnings in a specific area based on real-time data and optimizing resource allocation accordingly.

[0873] (Claim 3)

[0874] The system according to claim 1, comprising means for enhancing a generative AI model based on feedback for optimizing resource allocation and providing prompt statements for the user to generate a specific plan.
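The administrator review and feedback-based update described in claim 1 of this application example could be sketched as follows. This is a minimal illustration under stated assumptions. Exponential smoothing stands in for the machine learning model, and the class `SmoothedForecaster`, the function `apply_admin_override`, the `alpha` value, and the site names are hypothetical.

```python
class SmoothedForecaster:
    """Exponential smoothing used as a stand-in for the machine learning model."""

    def __init__(self, initial, alpha=0.5):
        self.estimate = float(initial)
        self.alpha = alpha  # weight given to each new observation (assumed value)

    def predict(self):
        """Return the current forecast of communication volume."""
        return self.estimate

    def update(self, observed):
        """Fold the observed volume (the feedback) into the next forecast."""
        self.estimate += self.alpha * (observed - self.estimate)
        return self.estimate


def apply_admin_override(plan, overrides):
    """Return a copy of the plan in which administrator corrections take priority."""
    merged = dict(plan)
    merged.update(overrides)
    return merged


model = SmoothedForecaster(initial=100)
model.update(120)  # feedback: the actual volume exceeded the forecast
final_plan = apply_admin_override({"site_a": 10, "site_b": 6}, {"site_b": 8})
```

Keeping the administrator's corrections separate from the model update means a manual override changes the notified plan immediately, while the forecast itself only moves when observed volumes arrive as feedback.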

[0875] "Example 2 of combining an emotion engine"

[0876] (Claim 1)

[0877] Means for obtaining past communication records and business activity information,

[0878] A means for constructing an inference model using the acquired information and predicting future communication volume,

[0879] A means of monitoring the communication status of multiple communication facilities in real time,

[0880] A means of recognizing the emotional state by analyzing the user's speech content and voice data,

[0881] A means for optimizing staffing at each customer service facility based on the aforementioned prediction results, monitoring results, and emotional state analysis results,

[0882] A means for notifying each communication-responding facility of the optimized placement results,

[0883] A system comprising the above means.

[0884] (Claim 2)

[0885] The system according to claim 1, comprising means for updating the inference model based on the monitoring results of communication response facilities, past deployment instruction records, and the results of recognizing emotional states.

[0886] (Claim 3)

[0887] The system according to claim 1, including a user interface that allows an administrator to input modifications to the optimization of personnel allocation.
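Claim 1 of this example adds the recognized emotional state as a third input to staffing optimization. One simple way to combine it with the predicted volume is sketched below. The linear scaling rule, the 25% cap, and the names `emotion_adjusted_staffing`, `negative_ratio`, and `max_uplift` are assumptions for illustration and are not taken from this disclosure.

```python
import math


def emotion_adjusted_staffing(predicted_calls, calls_per_agent, negative_ratio,
                              max_uplift=0.25):
    """Scale the agent count by the share of users showing negative emotion.

    negative_ratio is the fraction (0.0 to 1.0) of current interactions that an
    emotion recognizer flagged as negative. max_uplift caps the extra staffing.
    Both the linear rule and the cap are assumptions for this sketch.
    """
    if not 0.0 <= negative_ratio <= 1.0:
        raise ValueError("negative_ratio must be between 0 and 1")
    base = predicted_calls / calls_per_agent
    return math.ceil(base * (1.0 + max_uplift * negative_ratio))


calm_day = emotion_adjusted_staffing(105, 10, negative_ratio=0.0)
tense_day = emotion_adjusted_staffing(105, 10, negative_ratio=0.8)
```

With the same predicted volume, the sketch assigns more agents when a larger share of interactions is flagged as negative, which reflects the idea that emotionally difficult contacts take longer to handle.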

[0888] "Application Example 2 of combining an emotion engine"

[0889] (Claim 1)

[0890] Means for obtaining past communication records and business activity information,

[0891] A means for constructing a machine learning model using the acquired information and predicting future communication volume,

[0892] A means of monitoring the communication status of communication facilities nationwide in real time,

[0893] A means for optimizing staffing at each reception facility based on the aforementioned prediction and monitoring results,

[0894] A means of notifying each communication-responding facility of the optimized staffing results,

[0895] A means of analyzing voice and text data acquired in real time to recognize the user's emotional state,

[0896] A means of adjusting the response method in real time based on the recognized emotional state,

[0897] A system comprising the above means.

[0898] (Claim 2)

[0899] The system according to claim 1, further comprising means for updating the machine learning model based on monitoring results of communication facilities and past deployment instruction records.

[0900] (Claim 3)

[0901] The system according to claim 1, including an interface that allows an administrator to input modifications to the optimization of personnel allocation.

[Explanation of Symbols]

[0902] 10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. A system comprising: means for obtaining past communication records and activity information; means for constructing a machine learning model using the acquired information and predicting future communication volume; means for monitoring the communication status of each facility in real time; means for optimizing resource allocation at each facility based on the prediction result and the monitoring result; means for notifying each facility of the optimized allocation result; means for an administrator to review the allocation plan via a user interface and modify it as needed; and means for collecting feedback and updating the machine learning model to improve the accuracy of subsequent predictions.

2. The system according to claim 1, comprising means for analyzing notifications and warnings in a specific area based on real-time data and optimizing resource allocation accordingly.

3. The system according to claim 1, comprising means for enhancing a generative AI model based on feedback for optimizing resource allocation and providing prompt statements for the user to generate a specific plan.