system

The system addresses the challenge of real-time customer psychological state recognition by analyzing past data, generating tailored content, and adjusting presentations dynamically, enhancing customer satisfaction and business efficiency.

JP2026100583A · Pending · Publication Date: 2026-06-19 · SOFTBANK GROUP CORP


Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: SOFTBANK GROUP CORP
Filing Date: 2024-12-09
Publication Date: 2026-06-19


Abstract

[Problem] To provide a system. [Solution] A system including: analysis means that analyzes past business presentation data and psychological data to identify customer interests and reaction patterns; generation means that automatically generates optimal presentation text and materials based on the analysis results; emotion recognition means that captures the customer's facial expressions and voice in real time and analyzes the customer's psychological state; presentation means that provides suggestions for improving the presentation and information to present based on the customer's psychological state; and follow-up means that re-analyzes data after the presentation and generates improvement suggestions for the next presentation.


Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] In a modern business environment, salespersons are required to make quick and individualized presentations to a variety of customers. However, with conventional methods, it is difficult to grasp the psychological state and reactions of customers in real time and immediately optimize the content of the presentation. Also, in the follow-up after a presentation, specific and effective improvement suggestions cannot be obtained automatically. As a result, the improvement of customer satisfaction and business efficiency is hindered.

Means for Solving the Problems

[0005] This invention includes an analysis means for analyzing past business presentation data and psychological data to identify customer interests and reaction patterns, and a generation means for automatically generating optimal presentation text and materials based on the analysis results. Furthermore, it includes an emotion recognition means for capturing customer facial expressions and voice in real time to analyze their psychological state, and a presentation means for providing presentation improvement suggestions and information tailored to the customer's psychological state. The problem is further solved by a follow-up means that re-analyzes the data after the presentation and generates improvement suggestions for the next presentation. This enables effective presentations and follow-up for customers, improving both business efficiency and customer satisfaction.

[0006] The "analysis means" refers to a function that identifies customer interests and reaction patterns based on past business presentation data and psychological data.

[0007] The "generation means" is a function that automatically creates optimal presentation text and materials based on the analysis results obtained by the analysis means.

[0008] "Emotion recognition means" refers to a function that captures the customer's facial expressions and voice in real time, and analyzes and judges their psychological state.

[0009] "Presentation means" refers to a function that provides real-time suggestions for improving presentations and information tailored to the customer's psychological state.

[0010] The "follow-up means" is a function that re-analyzes the data collected after the presentation and generates specific improvement suggestions for the next presentation.

Brief Description of the Drawings

[0011]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] This is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] This shows an emotion map where multiple emotions are mapped.
[Figure 10] This shows an emotion map where multiple emotions are mapped.
[Figure 11] This is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] This is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] This is a sequence diagram showing the processing flow of the data processing system in Example 2, which combines an emotion engine.
[Figure 14] This is a sequence diagram showing the processing flow of the data processing system in Application Example 2, which combines an emotion engine.

Modes for Carrying Out the Invention

[0012] Hereinafter, an example of an embodiment of the system relating to the technology of this disclosure will be described with reference to the attached drawings.

[0013] First, the terminology used in the following description is explained.

[0014] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).

[0015] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.

[0016] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0017] In the following embodiments, a communication I/F (Interface) denoted by a reference numeral is an interface including a communication processor, an antenna, and the like. The communication I/F controls communication between multiple computers. Examples of communication standards applied to the communication I/F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0018] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when three or more items are linked by "and/or."

[0019] [First Embodiment]

[0020] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0021] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0022] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0023] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0024] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0025] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0026] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0027] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0028] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0029] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0030] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with the specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0031] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0032] The presentation optimization system proposed in this invention operates through the cooperation of a server, a terminal, and a user. The server first collects past business presentation data and psychological data, and then analyzes them using the analysis means to identify customer interests and reaction patterns. Based on these analysis results, it automatically creates optimal presentation text and materials using the generation means. This allows the user to prepare presentations efficiently and effectively.

[0033] The user begins the presentation based on the generated presentation text and materials. The terminal captures the customer's facial expressions and voice in real time. The emotion recognition means analyzes this information to determine the customer's psychological state. Then, using the presentation means, the terminal provides the user with suggestions for improving the presentation or new information in real time. This functionality allows the user to appropriately and instantly adjust the presentation content in response to customer reactions.

[0034] Furthermore, after the presentation ends, the server re-analyzes the collected data and generates specific improvement suggestions for the next presentation using the follow-up means. These suggestions are provided to the user as an automatically generated report, contributing to improved sales activities and communication.

[0035] As a concrete example, when proposing a new plan for a communication product, the user approaches the customer with optimal materials generated by the server. During the meeting, the terminal presents information that will capture the customer's interest, and if it detects a lack of interest from the customer's facial expressions, it suggests an alternative approach. After the meeting, the server provides specific suggestions for improvement for the next proposal. This entire process enables presentations tailored to customer needs, thereby increasing the closing rate. This system enables communication optimized for each individual customer, enhancing customer satisfaction and contributing to improved sales efficiency.

[0036] The following describes the processing flow.

[0037] Step 1:

[0038] The server collects past business presentation data and psychological data. This includes presentation history, customer comments, and customer reaction data.

[0039] Step 2:

[0040] The server processes the collected data using the analysis means and identifies customer interests and reaction patterns using NLP models. This analysis reveals customer preferences and needs.

[0041] Step 3:

[0042] Based on the analysis results, the server automatically generates the optimal presentation text and materials using the generation means and presents them to the user. These materials are tailored to the customer's areas of interest and needs.

[0043] Step 4:

[0044] The user prepares to deliver the presentation based on the presentation materials provided by the server. This includes reviewing the materials and conducting a rehearsal.

[0045] Step 5:

[0046] At the start of the presentation, the terminal collects the customer's facial expressions and voice. The terminal uses its camera and microphone to capture this data in real time.

[0047] Step 6:

[0048] The terminal uses the emotion recognition means to analyze the customer's psychological state from the collected data, determining, for example, whether the customer is interested or bored.

[0049] Step 7:

[0050] Based on the customer's psychological state, the terminal provides the user with improvement suggestions and additional information in real time through the presentation means. This allows the user to dynamically adjust the presentation.

[0051] Step 8:

[0052] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using the follow-up means.

[0053] Step 9:

[0054] The server provides users with an automatically generated report that includes improvement suggestions. This report will be used in future sales activities.

[0055] (Example 1)

[0056] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0057] In modern business activities, maximizing the effectiveness of information presentations to customers is crucial. However, data-driven solutions for appropriately tailoring information to individual customer responses and using that feedback to improve future information presentations are limited. Therefore, there is a need for a system that optimizes information based on dynamic customer responses and continuously improves it.

[0058] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0059] In this invention, the server includes information analysis means for analyzing past business information and social science data to identify customer interests and reactions, data generation means for automatically generating optimized information representations based on the analysis results, and reaction analysis means for acquiring customer facial expressions and voices in real time and analyzing their psychological state. This maximizes the effectiveness of information presentations based on the dynamic reactions of each customer and enables the provision of concrete improvement suggestions for future information presentations.

[0060] "Information analysis means" refers to a function that collects past business information and social science data, and analyzes them to identify customer interests and reactions.

[0061] "Data generation means" refers to a function that automatically creates an optimized information representation based on the analysis results obtained by the information analysis means.

[0062] "Reaction analysis means" refers to a function that acquires customer facial expressions and voices in real time, analyzes them, and determines the customer's psychological state.

[0063] "Information correction means" refers to a function that provides suggestions for improving an information presentation, or new information to offer, based on the customer's psychological state.

[0064] "Improvement proposal means" refers to a function that re-analyzes data collected after an information presentation and generates specific improvement proposals for the next information presentation.

[0065] This invention is a system in which a server, a terminal, and a user collaborate to optimize presentation information. The server first acquires business information and social science data. The data is collected from a database management system using a query language such as SQL. Subsequently, Python analysis libraries such as Pandas and scikit-learn are used for data analysis, and machine learning techniques are applied to clarify customer behavior patterns and interests.

[0066] Based on these analysis results, the server automatically generates information representations using a generative AI model. This process uses an AI model for natural language processing; for example, a prompt such as "Create presentation text for a specific customer segment" is input. The generated text can be used in presentation software such as Google Slides and Microsoft PowerPoint.

[0067] The user receives materials generated by the server and presents the information to the customer. During the meeting, the terminal uses a camera and microphone to capture the customer's facial expressions and voice in real time. This data is evaluated by emotion analysis software (for example, an API provided by a cloud provider) to assess the customer's psychological state. The terminal includes information correction means that modifies the information based on the customer's reactions, and displays specific advice such as "The customer's interest is waning, let's provide additional information in the next section" during the presentation.

[0068] Once the presentation is complete, the server re-analyzes the data and generates improvement suggestions that can be used for future information presentations. These suggestions are then provided to the user as an automatically generated report. This entire process enables customized information presentations for customers, resulting in higher customer satisfaction and greater efficiency in commercial activities.

[0069] For example, when proposing a new communication service plan, the server can input a prompt such as "Create text introducing a recommended communication plan for business professionals in their 30s" into the generative AI model, which generates information optimized for the customer's attributes. This allows users to efficiently develop presentations that capture the customer's interest.

[0070] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0071] Step 1:

[0072] The server retrieves historical business information and social science data. This retrieval is performed by collecting data from a database using SQL queries. A query related to specific customers or presentation topics is used as input, and the output is the historical presentation data and customer feedback that match those criteria. This makes context-specific and useful data available as the basis for processing.
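The retrieval in this step could be sketched as follows. This is a minimal illustration only: it assumes an SQLite database and a hypothetical `presentations` table whose columns (`customer_id`, `topic`, `presented_on`, `feedback`) are not specified in the disclosure, and the helper names `fetch_history` and `demo_connection` are likewise invented for the example.

```python
import sqlite3

# Hypothetical schema: the disclosure only states that SQL queries retrieve
# past presentation data and customer feedback matching a customer or topic.
SCHEMA = """
CREATE TABLE presentations (
    id INTEGER PRIMARY KEY,
    customer_id TEXT NOT NULL,
    topic TEXT NOT NULL,
    presented_on TEXT NOT NULL,
    feedback TEXT
);
"""


def fetch_history(conn, customer_id=None, topic=None):
    """Return past presentation rows matching the given customer and/or topic."""
    clauses, params = [], []
    if customer_id is not None:
        clauses.append("customer_id = ?")
        params.append(customer_id)
    if topic is not None:
        clauses.append("topic = ?")
        params.append(topic)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = (
        "SELECT customer_id, topic, presented_on, feedback "
        "FROM presentations" + where + " ORDER BY presented_on"
    )
    # Values are passed as bound parameters, never interpolated into the SQL.
    return conn.execute(sql, params).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO presentations (customer_id, topic, presented_on, feedback) "
        "VALUES (?, ?, ?, ?)",
        [
            ("C001", "data plan", "2024-01-10", "Asked about pricing"),
            ("C001", "handset", "2024-03-02", "Liked the camera"),
            ("C002", "data plan", "2024-02-15", "Not interested"),
        ],
    )
    return conn
```

Calling `fetch_history(demo_connection(), customer_id="C001")` returns the two rows recorded for that customer, ordered by date. Bound parameters are used so that customer or topic values supplied at run time cannot alter the query.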

[0073] Step 2:

[0074] The server analyzes the acquired data to identify customer interests and response patterns. The analysis uses the Python Pandas library and scikit-learn for data cleaning and classification. The input is the raw data obtained in step 1, and the output is specific indicators and patterns that show customer interests and past responses. This step allows us to understand what kind of information customers responded favorably to.
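One possible form of the "specific indicators" mentioned in this step is a per-topic favorable-response rate. The sketch below is a simplified stand-in written with the Python standard library only, not with Pandas or scikit-learn as named in the text; the record layout and the reaction labels ("positive", "neutral", "negative") are assumptions made for illustration.

```python
from collections import defaultdict


def interest_indicators(records):
    """Compute, per topic, the share of past reactions that were favorable.

    `records` is an iterable of (topic, reaction) pairs, where reaction is
    "positive", "neutral", or "negative" (a hypothetical labelling).
    Returns (topic, rate) pairs sorted from most to least favorable.
    """
    counts = defaultdict(lambda: {"positive": 0, "total": 0})
    for topic, reaction in records:
        counts[topic]["total"] += 1
        if reaction == "positive":
            counts[topic]["positive"] += 1
    rates = {t: c["positive"] / c["total"] for t, c in counts.items()}
    # Highest favorable rate first; ties are broken alphabetically.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))
```

The ranked list can then serve as the input to the generation step, for example by taking the first few topics as the points to emphasize.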

[0075] Step 3:

[0076] The server automatically generates information representations using a generative AI model based on the analysis results. For example, a prompt such as "Create presentation text for a specific customer segment" is input, and the generative AI model responds by generating the optimal presentation text. The output consists of text and slides tailored to customer attributes. This generated content is directly used for information dissemination to customers.
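How the analysis results might be turned into a prompt can be sketched as follows. The function names and the prompt wording beyond the quoted example are assumptions; the generative AI model is represented by an arbitrary callable because the disclosure does not name a specific model or API.

```python
def build_prompt(segment, top_interests, max_interests=3):
    """Assemble a prompt for a generative model from analysis results."""
    if not top_interests:
        return f"Create presentation text for {segment}."
    focus = ", ".join(top_interests[:max_interests])
    return (
        f"Create presentation text for {segment}. "
        f"Emphasize the following topics, in order of priority: {focus}."
    )


def generate_presentation(segment, top_interests, model):
    """Call `model`, any callable that maps a prompt string to generated text."""
    return model(build_prompt(segment, top_interests))
```

Keeping prompt construction separate from the model call makes the prompt text easy to inspect and test without access to a model.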

[0077] Step 4:

[0078] Users present information based on presentation materials provided by the server. Upon receiving the materials, users use presentation software to organize and present them to customers. In this step, the input is the materials from the server, and the output is the actual presentation given to the customer.

[0079] Step 5:

[0080] The terminal captures the customer's facial expressions and voice in real time during the presentation. Using a camera and microphone, it collects the customer's facial expressions and voice as digital data. This data is then analyzed using emotion analysis software to evaluate the customer's psychological state. The results of this evaluation are output and used as an indicator of the customer's interest and understanding.

[0081] Step 6:

[0082] The terminal provides users with real-time suggestions for improving information presentations based on the emotion analysis results. The input is the emotion analysis results from step 5, and the output is specific advice and correction instructions. For example, it might suggest, "The customer's attention may be drifting, so let's move on to the next topic."
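A simple rule-based mapping from analysis results to advice might look like the sketch below. The two scores, their 0-to-1 scale, the thresholds, and the advice wording are all assumptions for illustration; the disclosure does not specify how the emotion analysis results are represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionReading:
    interest: float   # 0.0 (none) to 1.0 (high); hypothetical scale
    confusion: float  # 0.0 (none) to 1.0 (high); hypothetical scale


def suggest(reading, low_interest=0.4, high_confusion=0.6):
    """Map one reading to a piece of advice, or None if no change is needed."""
    # Confusion is checked first so that a confused customer gets a
    # clarification rather than a change of topic.
    if reading.confusion >= high_confusion:
        return "The customer may be confused; restate the last point with a simpler example."
    if reading.interest < low_interest:
        return "The customer's interest is waning; move on to the next topic."
    return None
```

Returning `None` when nothing needs to change lets the terminal stay silent during stretches where the presentation is going well.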

[0083] Step 7:

[0084] After the presentation ends, the server analyzes the collected data again and generates improvement suggestions for the next information presentation. The inputs are the reaction data obtained in step 5 and the presentation results, and the output is a report containing automatically generated improvement suggestions. This report is provided to the user and used to review future activities and information presentation strategies.
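The report generation in this step could, under simple assumptions, be reduced to averaging the recorded interest per section and flagging the weak sections. In the sketch below the `(section, interest)` record format, the threshold, and the report wording are hypothetical.

```python
from statistics import mean


def build_report(timeline, low_interest=0.4):
    """Summarize per-section interest scores into improvement suggestions.

    `timeline` is a list of (section, interest) pairs with interest in [0, 1].
    """
    if not timeline:
        return "No reaction data was recorded."
    by_section = {}
    for section, interest in timeline:
        by_section.setdefault(section, []).append(interest)
    averages = {s: mean(v) for s, v in by_section.items()}
    weak = sorted(s for s, avg in averages.items() if avg < low_interest)
    lines = [f"Average interest: {mean(averages.values()):.2f}"]
    for s in weak:
        lines.append(f"Revise section '{s}' (average interest {averages[s]:.2f}).")
    if not weak:
        lines.append("No section fell below the interest threshold.")
    return "\n".join(lines)
```

In a fuller implementation the flagged sections could be passed back to the generative model as part of a prompt asking for revised material.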

[0085] (Application Example 1)

[0086] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0087] In today's commercial environment, understanding customer interests and emotions in real time and responding quickly and accurately is key to providing effective and personalized service. However, traditional methods struggle to instantly judge human emotions and interests and appropriately adjust presentations and negotiation content. This results in missed opportunities to increase customer satisfaction and hinders commercial success.

[0088] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0089] In this invention, the server includes an analysis means for analyzing business record data and psychological analysis data to identify the subject's interests and reaction tendencies; a generation means for automatically creating optimal presentation content and information based on the analysis results; and an emotion recognition means for acquiring the subject's facial expressions and voice in real time and evaluating their psychological state. This makes it possible to instantly grasp the subject's interests and emotions during presentations and business negotiations and provide optimal information on the spot.

[0090] "Business record data" refers to a collection of data containing information about past business operations, which serves as the basis for analyzing customer interests and responses.

[0091] "Psychological analysis data" refers to a collection of data gathered to understand an individual's psychological state and reactions.

[0092] "Analysis means" refers to a device or method that extracts business record data and psychological analysis data and performs a process to identify the interests and reaction tendencies of the subject.

[0093] "Generation means" refers to a device or method that automatically creates optimal presentation content and information based on the analysis results.

[0094] "Emotion recognition means" refers to a device or technology for acquiring a subject's facial expressions and voice in real time and evaluating their psychological state based on that information.

[0095] "Display means" refers to a device or technology that visually displays acquired information and clearly indicates the subject's condition or analysis results.

[0096] "Follow-up means" refers to a device or method for re-analyzing data collected after a presentation and automatically generating improvement suggestions for the next presentation.

[0097] The system of this invention operates through the cooperation of a server, a terminal, and a user. First, the server collects business record data and psychological analysis data, and uses dedicated data analysis software to identify the subject's interests and reaction tendencies. Specifically, it analyzes the data using natural language processing technology. Based on the analysis results, a generative AI model automatically creates the optimal presentation content and information. This process enables the most effective information delivery in presentations and business negotiations.

[0098] Next, the terminal, for example smart glasses or another portable information terminal, captures the subject's facial expressions and voice in real time. This information is analyzed using emotion recognition technology to immediately evaluate the subject's psychological state. The emotion recognition technology may use image and face analysis libraries such as OpenCV or DeepFace.

[0099] The user uses the display means to visually check the analyzed information and adjusts the presentation content in real time. This process can enhance the effectiveness of business negotiations and demonstrations.

[0100] A concrete example would be a scenario introducing a new smartphone. The server analyzes data collected from the target audience's past purchase history and product reviews to identify their interest in camera features and battery life. Based on this, the generated presentation text emphasizes these features. If the audience's facial expressions captured by the terminal indicate a lack of interest, the user quickly switches to presenting information about other appealing features.

[0101] An example of a prompt might be, "If the customer is not interested in the dual-camera feature, generate presentation text to highlight other features." This prompt is then fed into the generative AI model to ensure that appropriate alternative information is provided.
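A prompt of this kind could be assembled from a ranked feature list, as in the sketch below. The function name, the `limit` parameter, and the exact wording are assumptions; only the idea of steering the model away from a feature the customer has shown no interest in comes from the text.

```python
def fallback_prompt(rejected_feature, ranked_features, limit=2):
    """Build a prompt that steers the model toward alternative features."""
    alternatives = [f for f in ranked_features if f != rejected_feature][:limit]
    if not alternatives:
        raise ValueError("no alternative features available")
    return (
        f"The customer is not interested in {rejected_feature}. "
        f"Generate presentation text that highlights: {', '.join(alternatives)}."
    )
```

Naming the alternatives explicitly, rather than asking for "other features", keeps the generated text tied to features the analysis step has already ranked.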

[0102] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0103] Step 1:

[0104] The server collects business record data and psychological analysis data. It uses digital data related to past business activities and psychological survey data as input. Natural language processing techniques are applied to analyze this data to identify the subject's areas of interest and reaction tendencies. This process provides the server with the foundational data needed to create optimal presentations and information.

[0105] Step 2:

[0106] The server uses a generative AI model based on the analysis results to automatically create optimal presentation content and information. The analysis results obtained in Step 1 are used as input. The generative AI model generates presentation text based on the prompt text, which is then output. This generated information is used in subsequent presentations.

[0107] Step 3:

[0108] The terminal captures the subject's facial expressions and voice in real time. It uses a camera and microphone to acquire video and audio of the subject as input. This data is processed using emotion recognition technology to evaluate the subject's psychological state in real time. Through this process, the terminal obtains immediate data on the subject's behavior and reactions.
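Because per-frame emotion estimates tend to fluctuate, a real-time evaluation would plausibly smooth them before raising an alert. The following sketch uses a moving average over a fixed window; the class name, the window size, the threshold, and the assumption that an upstream recognizer yields one interest score in [0, 1] per frame are all illustrative and not taken from the disclosure.

```python
from collections import deque


class InterestMonitor:
    """Smooth per-frame interest scores with a moving average."""

    def __init__(self, window=5, threshold=0.4):
        self.scores = deque(maxlen=window)  # oldest score drops out automatically
        self.threshold = threshold

    def update(self, score):
        """Add one score in [0, 1]; return True when smoothed interest is low."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        average = sum(self.scores) / len(self.scores)
        # No alert is raised until a full window of frames has been observed.
        return window_full and average < self.threshold
```

With `window=3` and `threshold=0.5`, feeding the scores 0.25, 0.25, 0.25 raises an alert only on the third frame, and a subsequent score of 1.0 clears it.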

[0109] Step 4:

[0110] The user uses the display means to visually check the presentation content against the analyzed information and adjusts it as needed. The inputs used are the presentation content generated in step 2 and the real-time emotion recognition results obtained in step 3. Based on this information, the user appropriately modifies the presentation or sales negotiation content to effectively convey the information to the target audience.

[0111] Step 5:

[0112] After the presentation ends, the server re-analyzes the data and automatically generates improvement suggestions for the next time. The input is the data recorded during the presentation. The same natural language processing techniques used in the previous procedure are employed for this analysis. The output presents improvement suggestions, providing valuable feedback to prepare for future business negotiations and presentations.

[0113] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0114] The presentation optimization system of the present invention involves a server, a terminal, a user, and an emotion engine. First, the server collects past business presentation data and psychological data, and uses analytical means to identify customer interests and response patterns. This analysis utilizes a natural language processing model to clarify the customer's specific needs and interests.

[0115] Based on the analysis results, the server automatically generates optimal presentation text and materials using a generation mechanism. The user prepares based on these materials and delivers the presentation in real time via the terminal. During the presentation, the terminal uses emotion recognition to analyze the customer's facial expressions, voice, and even gaze in real time to determine the customer's psychological state.

[0116] Furthermore, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's own emotional state. This information is fed back to the user through the presentation system, helping them with self-awareness and providing guidance for carefully adjusting the presentation's progress.

[0117] Furthermore, this system uses follow-up methods to re-analyze the data collected after the presentation. Based on the analysis results, the server generates and provides specific improvement suggestions for the next presentation, utilizing user and customer sentiment data.

[0118] As a concrete example, when giving a presentation introducing a new product, the server analyzes data obtained from past success stories and provides the user with generated materials. While the user conducts the presentation and interacts with the customer, the emotion engine gives the user feedback on the user's own emotional state, and the terminal monitors the customer's emotions. Based on this information, the user adjusts the content as needed to communicate more effectively. After the presentation, the improvement suggestions obtained through follow-up serve as an important source of information for raising the success rate of future presentations. Because the system captures the emotions of both the customer and the user, it helps maximize the effectiveness of the presentation.

[0119] The following describes the processing flow.

[0120] Step 1:

[0121] The server collects past business presentation data and psychological data. Specifically, it retrieves customer purchase history and past presentation feedback from the database.

[0122] Step 2:

[0123] The server processes the collected data using analytical tools, and a natural language processing model identifies customer areas of interest and response patterns. This reveals customer characteristics and needs.

[0124] Step 3:

[0125] Based on the analysis results, the server automatically generates presentation text and materials using a generation mechanism. During this process, content and visual materials that are likely to attract the customer's interest are highlighted.

[0126] Step 4:

[0127] Users prepare their presentations based on automatically generated materials. They verify the consistency of the materials and practice the flow of their presentations.

[0128] Step 5:

[0129] When the presentation starts, the device captures the customer's facial expressions, gaze, and tone of voice in real time using a camera and microphone.

[0130] Step 6:

[0131] The system analyzes data captured by the emotion recognition mechanism on the device to determine the customer's psychological state. Specifically, it identifies emotions such as interest, suspicion, and boredom.

[0132] Step 7:

[0133] The emotion engine analyzes the user's facial expressions and tone of voice in real time to recognize the user's emotional state. This allows the user to understand their own situation.

[0134] Step 8:

[0135] The device provides feedback to the user through presentation methods based on customer and user sentiment data. This allows the user to adjust the presentation accordingly.

[0136] Step 9:

[0137] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0138] Step 10:

[0139] The server provides the user with a report that includes suggestions for improvement. The user can use this report to prepare for their next presentation more effectively.
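
Steps 6 to 8 above can be illustrated with a minimal Python sketch that turns one customer reading and one user reading into hints for the presenter. The `EmotionReading` type, the `build_feedback` function, the confidence threshold, the "nervous" label for the user, and the hint wording are all assumptions made for illustration. Only the customer labels interest, suspicion, and boredom come from the description, which does not specify how the feedback is derived.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EmotionReading:
    label: str         # e.g. "interest", "suspicion", "boredom"
    confidence: float  # 0.0 to 1.0, as reported by a hypothetical recognizer


def build_feedback(
    customer: EmotionReading,
    user: EmotionReading,
    min_confidence: float = 0.5,
) -> List[str]:
    """Return short hints for the presenter (illustrative rules only)."""
    hints: List[str] = []
    # Ignore low-confidence readings so the presenter is not distracted.
    if customer.confidence >= min_confidence:
        if customer.label == "boredom":
            hints.append("Customer seems bored: shorten this section or ask a question.")
        elif customer.label == "suspicion":
            hints.append("Customer seems doubtful: back the claim with concrete evidence.")
        elif customer.label == "interest":
            hints.append("Customer seems interested: expand on the current topic.")
    # "nervous" is a hypothetical label for the user's own state.
    if user.confidence >= min_confidence and user.label == "nervous":
        hints.append("You sound tense: slow down and pause briefly.")
    return hints
```

Calling `build_feedback(EmotionReading("boredom", 0.9), EmotionReading("nervous", 0.8))` yields one hint about the customer and one about the user, which the terminal could then display.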

[0140] (Example 2)

[0141] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0142] In information presentation activities, there is a challenge in effectively capturing the recipient's interest and reactions, resulting in decreased efficiency of information transmission. Furthermore, it is difficult for the presenter to understand their own emotional state and appropriately adjust their presentation accordingly. To address these challenges, it is necessary to analyze the emotional states of both the presenter and the recipient in real time and achieve optimized information presentation.

[0143] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0144] In this invention, the server includes an analysis means for analyzing past information presentation data and psychological information to identify the recipient's interests and reaction patterns; a generation means for automatically generating optimal information presentation text and materials based on the analysis results; an emotion recognition means for capturing the recipient's facial expressions, voice, and gaze in real time and analyzing their psychological state; a self-recognition feedback means for recognizing the user's facial expressions and tone of voice and presenting their own emotional state; and a follow-up means for re-analyzing the data after information presentation and generating suggestions for improvement for the next time. This maximizes the effectiveness of information presentation and enables smooth communication.

[0145] "Past information presentation data" refers to all data related to presentations and explanations given in the past, including the content of statements and materials used.

[0146] "Psychological information" refers to data on recipients' emotions and behavioral patterns, based on psychological findings and research results.

[0147] "Analysis means" refers to methods and devices for processing target data and extracting important patterns and information from it.

[0148] "Generation means" refers to a method or process for creating new information presentation text or materials based on the analysis results.

[0149] "Emotion recognition means" refers to methods and technologies that analyze and judge the emotional state of the recipient from facial expressions, voice, gaze, etc.

[0150] "Self-recognition feedback means" refers to technologies and systems that present the user's emotional state to the user and promote self-awareness.

[0151] "Follow-up means" refers to methods or devices for re-analyzing data after information has been presented and generating suggestions for future improvements.

[0152] This system optimizes presentations through the coordinated operation of a server, terminal, user, and emotion engine. The following describes a specific implementation of this system.

[0153] First, the server collects past information presentation data and psychological information and analyzes this data. The server uses a natural language processing model to identify the recipient's interests and response patterns, which makes it possible to determine what information is effective for the recipient. Based on the analysis results, a generative AI model then automatically generates optimal information presentation text and materials. Specifically, information about newly introduced products and services is constructed based on successful past presentation examples.

[0154] Users prepare their presentations based on the generated informational text and materials. The device plays a crucial role in the presentation itself. The device captures the audience's facial expressions, voice, and gaze in real time, and analyzes their psychological state using emotion recognition technology. This analysis allows for an immediate determination of how the audience is receiving the presentation.

[0155] In addition, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's emotional state. This information is fed back to the user to help them with self-awareness and to guide them in adjusting the flow of their presentation.

[0156] After the presentation ends, the server re-analyzes the data for follow-up and generates suggestions for improvement for the next presentation. This can improve the success rate of future information presentations.

[0157] As a concrete example, when giving a presentation to introduce a new product, the following prompt can be used in the AI model: "Generate new product introduction materials based on past success stories. Focus on elements that will capture the audience's attention, and include suggestions for conducting the presentation while monitoring emotions in real time."
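
As a hedged illustration of how the analysis results might be combined with the prompt quoted above, the following Python sketch appends a recipient profile to that prompt. The function name `build_prompt` and the profile fields are hypothetical, and the call to the generative AI model itself is deliberately omitted because the description does not name a specific interface.

```python
from typing import Sequence

# Prompt text quoted in the description.
BASE_PROMPT = (
    "Generate new product introduction materials based on past success stories. "
    "Focus on elements that will capture the audience's attention, and include "
    "suggestions for conducting the presentation while monitoring emotions in real time."
)


def build_prompt(interests: Sequence[str], response_patterns: Sequence[str]) -> str:
    """Append a (hypothetical) recipient profile to the base prompt."""
    lines = [BASE_PROMPT]
    if interests:
        lines.append("Recipient interests: " + ", ".join(interests) + ".")
    if response_patterns:
        lines.append("Observed response patterns: " + "; ".join(response_patterns) + ".")
    return "\n".join(lines)
```

Keeping the profile in separate, labeled lines makes it easy to see which parts of the prompt come from the analysis step and which are fixed instructions.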

[0158] This system enables effective information transmission by understanding the emotions of both the user and the recipient, maximizing the impact of presentations.

[0159] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0160] Step 1:

[0161] The server collects historical information presentation data and psychological information. It retrieves relevant documents and datasets from databases and the internet as input. To analyze this information, it uses a natural language processing model to identify the recipient's areas of interest. As output, it generates a profile of the recipient's interests and response patterns.

[0162] Step 2:

[0163] Based on the analysis results obtained in Step 1, the server automatically generates optimal information presentation text and materials using a generative AI model. This process involves using prompts to instruct the generative AI model and design the flow and structure of the information. The input is the analysis result profile, and the output is specific presentation text and visual materials.

[0164] Step 3:

[0165] The user prepares their presentation based on the presentation text and materials provided by the server. The user reviews the materials, customizes them as needed, and prepares for the presentation. During the preparation process, they organize their presentation method and key points.

[0166] Step 4:

[0167] The user begins a presentation and presents information through the device. The device uses emotion recognition to capture the recipient's facial expressions, voice, and gaze in real time and analyzes their psychological state. The input is real-time data from the recipient, and this analysis allows the device to understand the recipient's shifts in interest and attention. As output, a report on the recipient's psychological state is generated.

[0168] Step 5:

[0169] The emotion engine captures the user's facial expressions and tone of voice to recognize the user's own emotional state. This information is fed back to the user through the device, providing guidance for adjusting the presentation. The input is real-time data from the user, and the output is feedback that aids self-awareness.

[0170] Step 6:

[0171] After the presentation ends, the server uses follow-up mechanisms to re-analyze the collected data. The input is all the data acquired during the presentation, and the server performs data processing to derive optimal improvement measures. As output, specific improvement suggestions for the next information presentation are generated and provided to the user.
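
The follow-up in Step 6 could, for example, aggregate the emotion labels recorded during the presentation by section and flag sections where negative reactions dominate. The Python sketch below assumes a simple (section, label) log format, a fixed set of negative labels, and a 50% threshold. None of these are specified in the description, which leaves the re-analysis method open.

```python
from collections import Counter, defaultdict

# Labels treated as negative reactions (an assumption for this sketch).
NEGATIVE_LABELS = ("boredom", "suspicion")


def suggest_improvements(samples, threshold=0.5):
    """Flag sections where the share of negative labels reaches the threshold.

    samples: iterable of (section_name, emotion_label) pairs logged during
    the presentation.
    """
    per_section = defaultdict(Counter)
    for section, label in samples:
        per_section[section][label] += 1

    suggestions = []
    for section, counts in per_section.items():
        total = sum(counts.values())
        negative = sum(counts[label] for label in NEGATIVE_LABELS)
        share = negative / total
        if share >= threshold:
            # Report the most frequent negative label for this section.
            dominant = max(NEGATIVE_LABELS, key=lambda label: counts[label])
            suggestions.append(
                f"Revise '{section}': {share:.0%} of samples were negative (mainly {dominant})."
            )
    return suggestions
```

Sections are reported in the order they first appear in the log, so the resulting list follows the flow of the presentation.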

[0172] (Application Example 2)

[0173] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0174] In modern advertising, there is a need to understand the emotions of individual viewers in real time and present the most suitable advertisements that match their psychological state. However, conventional advertising systems have the challenge of not being able to effectively capture viewers' emotions and reactions and instantly generate corresponding advertisement content. Therefore, new methods are needed to maximize the effectiveness of advertising.

[0175] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0176] In this invention, the server includes an analysis means for analyzing past data and psychological data to identify an individual's interests and reaction patterns, a generation means for automatically generating optimal content based on the analysis results, and an emotion recognition means for capturing an individual's facial expressions and voice in real time and analyzing their psychological state. This makes it possible to present dynamic advertising content that responds to the viewer's emotional state.

[0177] "Analysis means" refers to a technique that analyzes past data and psychological data to identify an individual's interests and reaction patterns.

[0178] "Generation means" refers to a technique that automatically creates optimal content based on the results obtained by the analysis means.

[0179] "Emotion recognition means" refers to a technique that captures an individual's facial expressions and voice in real time and analyzes their psychological state from that data.

[0180] "Presentation means" refers to a technique that dynamically displays different information according to an individual's psychological state.

[0181] "Follow-up means" refers to a technique that re-analyzes the collected data after viewing and generates improvement suggestions for the next time.

[0182] The system for implementing the present invention aims to optimize ad delivery and mainly includes analysis means, generation means, emotion recognition means, presentation means, and follow-up means. These means are specifically processed on the server and the user's terminal.

[0183] The server processes historical and psychological data using the analysis means to identify viewer interests and reaction patterns. This analysis uses natural language processing technology, OpenCV as facial recognition software, and Google's speech recognition API as voice analysis software. Based on the analysis results, the server uses a generative AI model (e.g., one available through Hugging Face's Transformers library) to automatically generate advertising content optimized for the viewer.

[0184] On the user's terminal, the viewer's facial expression and voice data, collected through the smartphone's camera and microphone, are analyzed in real time using emotion recognition technology to determine the viewer's psychological state. To maintain viewer interest, the presentation means displays different information on the fly and adjusts the advertising content accordingly. This information is dynamically updated based on the viewer's facial expressions and tone of voice while watching the video.

[0185] After the presentation ends, the server uses the follow-up means to re-analyze the viewing data and generate improvement suggestions for the next advertising campaign. As an example of the dynamic display during viewing, if the viewer smiles while watching the video, content such as "Using this product will make your everyday life more enjoyable. Click here for more details!" is displayed.
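
The smile example can be sketched as a lookup from a recognized expression to an advertisement variant. In this Python sketch only the message for "smile" is taken from the description. The expression labels, the "frown" variant, and the fallback text are hypothetical, and the expression classifier itself is out of scope.

```python
# Only the "smile" message comes from the description; the rest is hypothetical.
AD_VARIANTS = {
    "smile": (
        "Using this product will make your everyday life more enjoyable. "
        "Click here for more details!"
    ),
    "frown": "Not convinced yet? See how other customers use this product.",
}
DEFAULT_AD = "Learn more about this product."


def select_ad(expression: str) -> str:
    """Return the advertisement text for a recognized expression label."""
    return AD_VARIANTS.get(expression, DEFAULT_AD)
```

A table-driven mapping like this keeps the emotion recognition output and the advertising copy decoupled, so either side could be replaced without touching the other.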

[0186] This system enables ad delivery that resonates with viewers' emotions, maximizing advertising effectiveness.

[0187] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0188] Step 1:

[0189] The server collects historical and psychological data. The input consists of business presentation data and datasets based on psychological research. These are analyzed using natural language processing techniques to identify audience interest and response patterns. The analysis identifies which topics attract interest, and this information is used as input for the next step.

[0190] Step 2:

[0191] The server automatically generates appropriate advertising content using a generative AI model based on insights obtained from the analysis. The input is the interest and response patterns obtained in step 1, which the generative AI model processes to output broadcastable advertising text and visuals. The generated content is optimized to easily attract the viewer's interest.

[0192] Step 3:

[0193] The user's device uses a camera and microphone to capture the viewer's facial expressions and voice in real time. The input consists of audio and image data, which are analyzed by emotion recognition systems to determine the viewer's psychological state. The output is real-time emotion data from the viewer, which is used to appropriately adjust the displayed advertisements.

[0194] Step 4:

[0195] The device uses presentation tools to display the most suitable advertisement content to the viewer based on the emotion recognition results. The input is the emotion data obtained in step 3, and based on this data, it selects and dynamically displays different advertisement content. The output is the customized advertisement presented to the viewer.

[0196] Step 5:

[0197] After the presentation, the server re-analyzes the viewing data and uses follow-up methods to generate suggestions for improving the next advertising campaign. The input is the viewing data collected in steps 3 and 4, which is analyzed to evaluate its effectiveness and output suggestions for the next campaign.

[0198] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0199] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions in the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0200] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0201] [Second Embodiment]

[0202] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0203] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0204] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0205] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0206] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.

[0207] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0208] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0209] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0210] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0211] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0212] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0213] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0214] The presentation optimization system proposed in this invention consists of a server, a terminal, and a user working together. The server first collects past business presentation data and psychological data, and then analyzes them using the analysis means to identify customer interests and reaction patterns. Based on these analysis results, it automatically creates optimal presentation text and materials using the generation means. This allows the user to prepare presentations efficiently and effectively.

[0215] The user begins the presentation based on the generated presentation text and materials. The terminal captures the customer's facial expressions and voice in real time, and the emotion recognition means analyzes this information to determine the customer's psychological state. Then, using the presentation means, the terminal provides the user in real time with suggestions for improving the presentation or with new information. This allows the user to adjust the presentation content immediately and appropriately in response to customer reactions.

[0216] Furthermore, after the presentation ends, the server re-analyzes the collected data and generates specific improvement suggestions for the next presentation based on follow-up measures. These suggestions are provided to the user as an automatically generated report, contributing to improved sales activities and communication.

[0217] As a concrete example, when proposing a new plan for a communication product, the user approaches the customer with optimal materials generated by the server. During the meeting, the terminal presents information that will capture the customer's interest, and if it detects a lack of interest from the customer's facial expressions, it suggests an alternative approach. After the meeting, the server provides specific suggestions for improvement for the next proposal. This entire process enables presentations tailored to customer needs, thereby increasing the closing rate. This system enables communication optimized for each individual customer, enhancing customer satisfaction and contributing to improved sales efficiency.

[0218] The following describes the processing flow.

[0219] Step 1:

[0220] The server collects past business presentation data and psychological data. This includes presentation history, customer comments, and customer reaction data.

[0221] Step 2:

[0222] The server processes the collected data using analytical tools and identifies customer interests and response patterns using NLP models. This analysis reveals customer preferences and needs.

[0223] Step 3:

[0224] Based on the analysis results, the server automatically generates the optimal presentation text and materials using a generation method and presents them to the user. These materials are tailored to the customer's areas of interest and needs.

[0225] Step 4:

[0226] The user prepares to deliver the presentation based on the presentation materials provided by the server. This includes reviewing the materials and conducting a rehearsal.

[0227] Step 5:

[0228] At the start of the presentation, the device collects the customer's facial expressions and voice. The device uses its camera and microphone to capture this data in real time.

[0229] Step 6:

[0230] The device uses emotion recognition to analyze the customer's psychological state from the collected data, determining things like whether they are interested or bored.

[0231] Step 7:

[0232] The device provides users with improvement suggestions and additional information in real time through presentation methods, based on the customer's psychological state. This allows users to dynamically adjust the presentation.

[0233] Step 8:

[0234] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0235] Step 9:

[0236] The server provides users with an automatically generated report that includes improvement suggestions. This report will be used in future sales activities.
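
Frame-by-frame emotion estimates can fluctuate, so Steps 6 and 7 might smooth them before prompting the user. The following Python sketch takes a majority vote over a short sliding window and maps the smoothed state to a suggestion. The `StateSmoother` class, the window size, the labels "interested" and "bored", and the suggestion text are assumptions for illustration and are not described in the disclosure.

```python
from collections import Counter, deque
from typing import Optional


class StateSmoother:
    """Majority vote over the most recent emotion labels."""

    def __init__(self, window: int = 5) -> None:
        # deque with maxlen drops the oldest label automatically.
        self._recent = deque(maxlen=window)

    def update(self, label: str) -> str:
        """Add a new label and return the most frequent label in the window."""
        self._recent.append(label)
        return Counter(self._recent).most_common(1)[0][0]


def suggestion_for(state: str) -> Optional[str]:
    """Map a smoothed state to a hint for the user (hypothetical wording)."""
    if state == "bored":
        return "Interest appears to be dropping: try an alternative approach."
    return None
```

With a window of three, a single "bored" frame among "interested" frames does not trigger a suggestion, while two "bored" frames out of three do.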

[0237] (Example 1)

[0238] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0239] In modern business activities, maximizing the effectiveness of information dissemination to customers is crucial. However, there are few data-driven solutions for tailoring information to individual customer responses and using that feedback to improve future information dissemination. Therefore, there is a need for a system that optimizes information based on dynamic customer responses and continuously improves it.

[0240] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0241] In this invention, the server includes information analysis means for analyzing past business information and social science data to identify customer interests and reactions, data generation means for automatically generating optimized information representations based on the analysis results, and reaction analysis means for acquiring customer facial expressions and voices in real time and analyzing their psychological state. This maximizes the effectiveness of information presentation based on each customer's dynamic reactions and enables concrete improvement suggestions to be provided for future information presentations.

[0242] "Information analysis means" refers to a function that collects past business information and social science data and analyzes them to identify customer interests and reactions.

[0243] "Data generation means" refers to a function that automatically creates an optimized information representation based on the analysis results obtained by the information analysis means.

[0244] "Reaction analysis means" refers to a function that acquires customer facial expressions and voices in real time, analyzes them, and determines the customer's psychological state.

[0245] "Information correction means" refers to a function that suggests improvements to the information being presented, or new information to offer, based on the customer's psychological state.

[0246] "Improvement proposal means" refers to a function that re-analyzes the collected data after information has been presented and generates specific improvement proposals for the next presentation of information.

[0247] This invention is a system in which a server, a terminal, and a user collaborate to optimize presentation information. The server first acquires business information and social science data through a database management system, retrieving the data with a query language such as SQL. It then analyzes the data with Python libraries such as pandas and scikit-learn, applying machine learning techniques to clarify customer behavior patterns and interests.

[0248] Based on these analysis results, the server automatically generates information representations using a generative AI model. This process utilizes an AI model known for natural language processing, specifically taking the prompt "Create presentation text for a specific customer segment" as input. The generated text can be used in presentation software such as Google Slides or Microsoft PowerPoint.

[0249] The user receives the materials generated by the server and presents the information to the customer. During the meeting, the terminal uses a camera and microphone to capture the customer's facial expressions and voice in real time. Sentiment analysis software (for example, an API provided by a cloud provider) uses this data to evaluate the customer's psychological state. The terminal is equipped with means to modify the information based on the customer's reactions, and specific advice such as "The customer's interest is waning, let's provide additional information in the next section" is displayed during the presentation.

[0250] Once the presentation is complete, the server re-analyzes the data and generates improvement suggestions that can be used for future information presentations. These suggestions are then provided to the user as an automatically generated report. This entire process enables customized information presentations for customers, resulting in higher customer satisfaction and greater efficiency in commercial activities.

[0251] For example, when proposing a new communication service plan, a prompt such as "Create text introducing a recommended communication plan for business professionals in their 30s" can be input into the generative AI model on the server to generate information optimized for the customer's attributes. This allows users to efficiently develop presentations that capture the customer's interest.

[0252] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0253] Step 1:

[0254] The server retrieves historical business information and social science data by collecting it from a database with SQL queries. The input is a query related to a specific customer or presentation topic, and the output is the historical presentation data and customer feedback that match those criteria. This makes context-specific data available as the basis for the subsequent processing.
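The retrieval in Step 1 could be sketched as follows, using Python's built-in sqlite3 module as a stand-in for the database management system. The table names (presentations, feedback), their columns, the sample rows, and the function fetch_history are assumptions made for illustration and do not appear in the original description.

```python
import sqlite3


def fetch_history(conn, customer_id, topic):
    """Return past presentations and feedback matching a customer and topic."""
    sql = (
        "SELECT p.presented_on, p.topic, f.comment, f.rating "
        "FROM presentations AS p "
        "JOIN feedback AS f ON f.presentation_id = p.id "
        "WHERE p.customer_id = ? AND p.topic = ? "
        "ORDER BY p.presented_on"
    )
    # Parameter binding keeps the search criteria out of the SQL string itself.
    return conn.execute(sql, (customer_id, topic)).fetchall()


# Tiny in-memory database (hypothetical schema) so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE presentations (
        id INTEGER PRIMARY KEY, customer_id TEXT, topic TEXT, presented_on TEXT);
    CREATE TABLE feedback (presentation_id INTEGER, comment TEXT, rating INTEGER);
    INSERT INTO presentations VALUES (1, 'C001', 'mobile plan', '2024-01-10');
    INSERT INTO presentations VALUES (2, 'C001', 'fiber plan', '2024-02-05');
    INSERT INTO presentations VALUES (3, 'C002', 'mobile plan', '2024-03-01');
    INSERT INTO feedback VALUES (1, 'Liked the pricing table', 5);
    INSERT INTO feedback VALUES (2, 'Too much detail', 2);
    INSERT INTO feedback VALUES (3, 'Wanted a demo', 3);
    """
)
rows = fetch_history(conn, "C001", "mobile plan")
```

Here the customer identifier and topic play the role of the query input, and rows holds the matching presentation history and feedback.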

[0255] Step 2:

[0256] The server analyzes the acquired data to identify customer interests and response patterns. The analysis uses the Python Pandas library and scikit-learn for data cleaning and classification. The input is the raw data obtained in step 1, and the output is specific indicators and patterns that show customer interests and past responses. This step allows us to understand what kind of information customers responded favorably to.
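As a rough illustration of the kind of indicator Step 2 might output, the sketch below cleans incomplete records and computes the share of favorable responses per topic. To stay dependency-free it uses plain Python rather than the pandas and scikit-learn libraries named above, so it is a simplified substitute, not the analysis the description refers to. The record fields (topic, rating), the rating threshold, and the function name are assumptions.

```python
from collections import defaultdict


def favorable_rate_by_topic(records, threshold=4):
    """Share of responses per topic whose rating is at or above the threshold."""
    counts = defaultdict(lambda: [0, 0])  # topic -> [favorable, total]
    for rec in records:
        topic, rating = rec.get("topic"), rec.get("rating")
        if not topic or rating is None:
            continue  # data cleaning: skip incomplete rows
        counts[topic][1] += 1
        if rating >= threshold:
            counts[topic][0] += 1
    return {topic: fav / total for topic, (fav, total) in counts.items()}


# Hypothetical raw data of the kind obtained in step 1.
records = [
    {"topic": "pricing", "rating": 5},
    {"topic": "pricing", "rating": 4},
    {"topic": "coverage", "rating": 2},
    {"topic": "coverage", "rating": 5},
    {"topic": None, "rating": 3},
    {"topic": "pricing", "rating": None},
]
rates = favorable_rate_by_topic(records)
# Topics ordered from most to least favorably received.
ranked = sorted(rates, key=rates.get, reverse=True)
```

A ranking of this kind is one possible form of the "indicators and patterns" passed on to the generation step.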

[0257] Step 3:

[0258] The server automatically generates information representations using a generative AI model based on the analysis results. For example, a prompt such as "Create presentation text for a specific customer segment" is input, and the generative AI model responds by generating the optimal presentation text. The output consists of text and slides tailored to customer attributes. This generated content is directly used for information dissemination to customers.
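One way the analysis results might be folded into the prompt of Step 3 is sketched below. The function build_prompt, its parameters, and the exact prompt wording are hypothetical; the call to the generative AI model itself is omitted because the description does not specify an interface for it.

```python
def build_prompt(segment, interests, max_interests=3):
    """Compose a generation prompt from a segment label and ranked interests."""
    # Drop empty entries and keep only the highest-ranked interests.
    top = [i.strip() for i in interests if i and i.strip()][:max_interests]
    prompt = f"Create presentation text for the following customer segment: {segment}."
    if top:
        prompt += " Emphasize these topics, in order of priority: " + ", ".join(top) + "."
    return prompt


prompt = build_prompt(
    "business professionals in their 30s",
    ["pricing", "coverage", "", "support", "roaming"],
)
```

Capping the number of interests keeps the prompt focused on the topics the analysis ranked highest.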

[0259] Step 4:

[0260] Users present information based on presentation materials provided by the server. Upon receiving the materials, users use presentation software to organize and present them to customers. In this step, the input is the materials from the server, and the output is the actual presentation given to the customer.

[0261] Step 5:

[0262] The device captures the customer's facial expressions and voice in real time during the presentation. Using a camera and microphone, it collects the customer's facial expressions and voice as digital data. This data is then analyzed using emotion analysis software to evaluate the customer's psychological state. The results of this evaluation are output and used as an indicator to measure the customer's interest and understanding.

[0263] Step 6:

[0264] The device provides users with real-time suggestions for improving the information presentation based on the sentiment analysis results. The input is the sentiment analysis results from step 5, and the output is specific advice and correction instructions. For example, it might suggest, "The customer's attention may be drifting, so let's move on to the next topic."
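A minimal rule-based sketch of Step 6 follows, assuming the sentiment analysis of step 5 yields scores between 0.0 and 1.0 for interest and understanding. The score keys, the threshold, the advice strings, and the function suggest are illustrative assumptions; the description does not state how advice is derived from the analysis results.

```python
def suggest(scores, low=0.4):
    """Map emotion-analysis scores (0.0 to 1.0) to an advice string, or None."""
    # Missing scores default to 1.0 so that no advice is triggered by absent data.
    interest = scores.get("interest", 1.0)
    understanding = scores.get("understanding", 1.0)
    if understanding < low:
        return "The customer may not be following; restate the key point with an example."
    if interest < low:
        return "The customer's interest is waning; move on to the next topic."
    return None  # no correction needed


advice = suggest({"interest": 0.2, "understanding": 0.9})
```

Checking understanding before interest is a design choice in this sketch: a customer who is lost is assumed to need clarification before a change of topic.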

[0265] Step 7:

[0266] After the presentation ends, the server analyzes the collected data again and generates improvement suggestions for the next information presentation. The inputs are the reaction data obtained in step 5 and the presentation results, and the output is a report containing automatically generated improvement suggestions. This report is provided to the user and used to review future activities and information presentation strategies.
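The post-presentation report of Step 7 might be assembled along the following lines. The sketch assumes the reaction data is available as (section, interest score) samples; that data shape, the threshold, the suggestion wording, and the function build_report are assumptions for illustration.

```python
from statistics import mean


def build_report(samples, threshold=0.5):
    """Summarize per-section interest and flag sections that scored low."""
    by_section = {}
    for section, interest in samples:
        by_section.setdefault(section, []).append(interest)
    averages = {s: round(mean(v), 2) for s, v in by_section.items()}
    weak = [s for s, avg in averages.items() if avg < threshold]
    lines = [f"{s}: average interest {avg:.2f}" for s, avg in averages.items()]
    lines += [f"Suggestion: revise or shorten '{s}' next time." for s in weak]
    return {"averages": averages, "weak_sections": weak, "text": "\n".join(lines)}


# Hypothetical reaction samples recorded during the presentation.
samples = [
    ("intro", 0.8), ("intro", 0.6),
    ("pricing", 0.3), ("pricing", 0.4),
    ("demo", 0.9),
]
report = build_report(samples)
```

The text field is what would be shown to the user, while averages and weak_sections remain available for later analysis.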

[0267] (Application Example 1)

[0268] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0269] In today's commercial environment, understanding customer interests and emotions in real time and responding quickly and accurately is key to providing effective and personalized service. However, traditional methods struggle to instantly judge human emotions and interests and appropriately adjust presentations and negotiation content. This results in missed opportunities to increase customer satisfaction and hinders commercial success.

[0270] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0271] In this invention, the server includes an analysis means for analyzing business record data and psychological analysis data to identify the subject's interests and reaction tendencies; a generation means for automatically creating optimal presentation content and information based on the analysis results; and an emotion recognition means for acquiring the subject's facial expressions and voice in real time and evaluating their psychological state. This makes it possible to instantly grasp the subject's interests and emotions during presentations and business negotiations and provide optimal information on the spot.

[0272] "Business record data" refers to a collection of data containing information about past business operations, which serves as the basis for analyzing customer interests and responses.

[0273] "Psychological analysis data" refers to a collection of data gathered to understand an individual's psychological state and reactions.

[0274] "Analysis means" refers to a device or method that extracts business record data and psychological analysis data and processes them to identify the interests and reaction tendencies of the subject.

[0275] "Generation means" refers to a device or method that automatically creates optimal presentation content and information based on the analysis results.

[0276] "Emotion recognition means" refers to a device or technology for acquiring a subject's facial expressions and voice in real time and evaluating their psychological state based on that information.

[0277] "Display means" refers to a device or technology that visually displays acquired information and clearly indicates the subject's condition or analysis results.

[0278] "Follow-up means" refers to a device or method for reanalyzing data collected after a presentation and automatically generating improvement suggestions for the next presentation.

[0279] The system of this invention consists of a server, a terminal, and a user working together. First, the server collects business record data and psychological analysis data and analyzes them with data analysis software, specifically using natural language processing technology, to identify the subject's interests and reaction tendencies. Based on the analysis results, a generative AI model automatically creates the optimal presentation content and information. This process enables effective information delivery in presentations and business negotiations.

[0280] Next, the terminal, for example smart glasses or another portable information terminal, captures the subject's facial expressions and voice in real time. This information is analyzed using emotion recognition technology to immediately evaluate the subject's psychological state. The emotion recognition may use, for example, a computer vision or facial analysis library such as OpenCV or DeepFace.

[0281] The user utilizes the presentation device to visually evaluate the analyzed information and adjust the presentation content in real time. This process can enhance the effectiveness of business negotiations and demonstrations.

[0282] As a specific example, consider a scenario in which a new smartphone is introduced. The server analyzes data collected from the target person's past purchase history and product reviews and identifies an interest in the camera function and battery life. Based on this, these functions are emphasized in the generated presentation text. If the terminal indicates from the target person's expression that they are not interested, the user quickly switches to presenting another attractive feature.

[0283] An example prompt is "If the customer shows no interest in the dual camera function, please generate presentation text emphasizing other features." This prompt is input into the generative AI model to obtain appropriate alternative information.
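A prompt of this kind could be produced programmatically once the terminal reports which features failed to hold the customer's interest. The following is only a sketch; the function fallback_prompt, the feature list, and the prompt wording are assumptions.

```python
def fallback_prompt(features, uninterested):
    """Build a prompt that shifts emphasis away from features the customer ignored."""
    remaining = [f for f in features if f not in uninterested]
    if not remaining:
        return None  # nothing left to emphasize
    ignored = ", ".join(uninterested)
    return (
        f"The customer shows no interest in: {ignored}. "
        f"Generate presentation text emphasizing: {', '.join(remaining)}."
    )


features = ["dual camera", "battery life", "display"]
prompt = fallback_prompt(features, ["dual camera"])
```

Returning None when every feature has been exhausted lets the caller decide how to proceed, for example by ending the section.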

[0284] The flow of the specific processing in Application Example 1 will be explained using Figure 12.

[0285] Step 1:

[0286] The server collects business record data and psychological analysis data. The input is digital data related to past business and psychological survey data. Natural language processing technology is applied to these data to identify the target person's areas of interest and reaction tendencies. Through this process, the server obtains the basic data for creating the optimal presentation content and information.

[0287] Step 2:

[0288] The server uses a generative AI model based on the analysis results to automatically create optimal presentation content and information. The analysis results obtained in Step 1 are used as input. The generative AI model generates presentation text based on the prompt text, which is then output. This generated information is used in subsequent presentations.

[0289] Step 3:

[0290] The device captures the subject's facial expressions and voice in real time. It uses a camera and microphone to acquire video and audio of the subject as input. This data is processed using emotion recognition technology to evaluate their psychological state in real time. Through this process, the device obtains immediate data on the subject's behavior and reactions.

[0291] Step 4:

[0292] The user uses a presentation device to visually evaluate the presentation content based on the analyzed information and adjust it as needed. The inputs used are the presentation content generated in step 2 and the real-time sentiment analysis results obtained in step 3. Based on this information, the user appropriately modifies the presentation or sales negotiation content to effectively convey the information to the target audience.

[0293] Step 5:

[0294] After the presentation ends, the server re-analyzes the data and automatically generates improvement suggestions for the next time. The input is the data recorded during the presentation. The same natural language processing techniques used in the previous procedure are employed for this analysis. The output presents improvement suggestions, providing valuable feedback to prepare for future business negotiations and presentations.

[0295] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0296] The presentation optimization system of the present invention involves a server, a terminal, a user, and an emotion engine. First, the server collects past business presentation data and psychological data, and uses analytical means to identify customer interests and response patterns. This analysis utilizes a natural language processing model to clarify the customer's specific needs and interests.

[0297] Based on the analysis results, the server automatically generates optimal presentation text and materials using a generation mechanism. The user prepares based on these materials and delivers the presentation in real time via the terminal. During the presentation, the terminal uses emotion recognition to analyze the customer's facial expressions, voice, and even gaze in real time to determine the customer's psychological state.

[0298] Furthermore, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's own emotional state. This information is fed back to the user through the presentation system, helping them with self-awareness and providing guidance for carefully adjusting the presentation's progress.

[0299] Furthermore, this system uses follow-up methods to re-analyze the data collected after the presentation. Based on the analysis results, the server generates and provides specific improvement suggestions for the next presentation, utilizing user and customer sentiment data.

[0300] As a specific example, when presenting a new product, the server analyzes data obtained from past successful cases and provides the generated materials to the user. While presenting to the customer, the user receives feedback on their own emotional state from the emotion engine, and the terminal monitors the customer's emotions. Based on this information, the user adjusts the content as appropriate to communicate more effectively. After the presentation, the improvement suggestions obtained through follow-up serve as input for raising the success rate of the next presentation. This system enables communication that takes into account the emotions of both the customer and the user, increasing the effectiveness of the presentation.

[0301] The following describes the process flow.

[0302] Step 1:

[0303] The server collects past business presentation data and psychological data. Specifically, it obtains the customer's purchase history and past presentation feedback from the database.

[0304] Step 2:

[0305] The server processes the collected data using analysis means and identifies the customer's area of interest and reaction pattern through a natural language processing model. This reveals the customer's characteristics and needs.

[0306] Step 3:

[0307] Based on the analysis results, the server uses generation means to automatically generate presentation text and materials. At this time, content and visual materials that attract the customer's interest are emphasized.

[0308] Step 4:

[0309] The user prepares for the presentation based on the automatically generated materials. The user checks the consistency of the materials and practices the presentation flow.

[0310] Step 5:

[0311] Once the presentation starts, the device captures the audience's facial expressions, gaze, and tone of voice in real time, using a camera and microphone.

[0312] Step 6:

[0313] The system analyzes data captured by the emotion recognition mechanism on the device to determine the customer's psychological state. Specifically, it identifies emotions such as interest, suspicion, and boredom.
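The description does not say how facial and voice signals are combined into a single emotion. One simple possibility, shown here purely as an assumption, is a weighted average of per-emotion scores from each modality followed by selection of the highest-scoring label. The score dictionaries, the weight, and the function fuse_emotions are hypothetical.

```python
def fuse_emotions(face, voice, face_weight=0.6):
    """Weighted average of two per-emotion score dicts; returns (label, scores)."""
    labels = set(face) | set(voice)
    fused = {
        label: face_weight * face.get(label, 0.0)
        + (1.0 - face_weight) * voice.get(label, 0.0)
        for label in labels
    }
    # Iterating in sorted order makes the result deterministic when scores tie.
    top = max(sorted(fused), key=fused.get)
    return top, fused


# Hypothetical per-modality scores for the emotions named in this step.
face = {"interest": 0.2, "suspicion": 0.5, "boredom": 0.3}
voice = {"interest": 0.5, "suspicion": 0.25, "boredom": 0.25}
label, fused = fuse_emotions(face, voice)
```

With these sample scores the facial signal dominates, so the fused label follows the face rather than the voice.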

[0314] Step 7:

[0315] The emotion engine analyzes the user's facial expressions and tone of voice in real time to recognize the user's emotional state. This allows the user to understand their own situation.

[0316] Step 8:

[0317] The device provides feedback to the user through presentation methods based on customer and user sentiment data. This allows the user to adjust the presentation accordingly.
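Step 8 draws on two sources at once: the customer's emotion and the user's own emotion. A minimal sketch of merging them into feedback messages is given below. The emotion labels, the advice text, and the names CUSTOMER_ADVICE, USER_ADVICE, and feedback_messages are illustrative assumptions.

```python
# Hypothetical advice tables keyed by recognized emotion label.
CUSTOMER_ADVICE = {
    "boredom": "Customer seems bored: shorten this section or switch topics.",
    "suspicion": "Customer seems doubtful: show evidence or invite questions.",
}
USER_ADVICE = {
    "tense": "You sound tense: slow down and pause briefly.",
}


def feedback_messages(customer_emotion, user_emotion):
    """Collect advice for the presenter from both emotion estimates."""
    messages = []
    if customer_emotion in CUSTOMER_ADVICE:
        messages.append(CUSTOMER_ADVICE[customer_emotion])
    if user_emotion in USER_ADVICE:
        messages.append(USER_ADVICE[user_emotion])
    return messages


messages = feedback_messages("suspicion", "tense")
```

Keeping the two tables separate means customer-facing and self-awareness advice can be tuned independently.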

[0318] Step 9:

[0319] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0320] Step 10:

[0321] The server provides the user with a report that includes suggestions for improvement. The user can use this report to prepare for their next presentation more effectively.

[0322] (Example 2)

[0323] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0324] In information presentation activities, there is a challenge in effectively capturing the recipient's interest and reactions, resulting in decreased efficiency of information transmission. Furthermore, it is difficult for the presenter to understand their own emotional state and appropriately adjust their presentation accordingly. To address these challenges, it is necessary to analyze the emotional states of both the presenter and the recipient in real time and achieve optimized information presentation.

[0325] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0326] In this invention, the server includes an analysis means for analyzing past information presentation data and psychological information to identify the recipient's interests and reaction patterns; a generation means for automatically generating optimal information presentation text and materials based on the analysis results; an emotion recognition means for capturing the recipient's facial expressions, voice, and gaze in real time and analyzing their psychological state; a self-recognition feedback means for recognizing the user's facial expressions and tone of voice and presenting their own emotional state; and a follow-up means for re-analyzing the data after information presentation and generating suggestions for improvement for the next time. This maximizes the effectiveness of information presentation and enables smooth communication.

[0327] "Past information presentation data" refers to all data related to presentations and explanations given in the past, including the content of statements and materials used.

[0328] "Psychological information" refers to data on recipients' emotions and behavioral patterns, based on psychological findings and research results.

[0329] "Analysis means" refers to methods and devices for processing target data and extracting important patterns and information from it.

[0330] "Generation means" refers to a method or process for creating new informational texts or materials based on the analysis results.

[0331] "Emotion recognition means" refers to methods and technologies that analyze and judge the emotional state of the recipient from facial expressions, voice, gaze, etc.

[0332] "Self-recognition feedback means" refers to technologies and systems that present the sender's own emotional state back to the sender and promote self-awareness.

[0333] "Follow-up means" refers to methods or devices for re-analyzing data after information has been presented and generating suggestions for future improvements.

[0334] This system optimizes presentations through the coordinated operation of a server, terminal, user, and emotion engine. The following describes a specific implementation of this system.

[0335] First, the server collects past information presentation data and psychological information, and analyzes this data. The server utilizes a natural language processing model to identify the recipient's interests and response patterns. This analysis makes it possible to determine what information is effective for the recipient. Based on the analysis results, a generative AI model then automatically generates optimal information presentation texts and materials. Specifically, information about newly introduced products and services is constructed based on successful past presentation examples.

[0336] Users prepare their presentations based on the generated informational text and materials. The device plays a crucial role in the presentation itself. The device captures the audience's facial expressions, voice, and gaze in real time, and analyzes their psychological state using emotion recognition technology. This analysis allows for an immediate determination of how the audience is receiving the presentation.

[0337] In addition, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's emotional state. This information is fed back to the user to help them with self-awareness and to guide them in adjusting the flow of their presentation.

[0338] After the presentation ends, the server re-analyzes the data for follow-up and generates suggestions for improvement for the next presentation. This can improve the success rate of future information presentations.

[0339] As a concrete example, when giving a presentation to introduce a new product, the following prompt can be used in the AI model: "Generate new product introduction materials based on past success stories. Focus on elements that will capture the audience's attention, and include suggestions for conducting the presentation while monitoring emotions in real time."

[0340] This system enables effective information transmission by understanding the emotions of both the user and the recipient, maximizing the impact of presentations.

[0341] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0342] Step 1:

[0343] The server collects historical information presentation data and psychological information. It retrieves relevant documents and datasets from databases and the internet as input. To analyze this information, it uses a natural language processing model to identify the recipient's areas of interest. As output, it generates a profile of the recipient's interests and response patterns.

[0344] Step 2:

[0345] Based on the analysis results obtained in Step 1, the server automatically generates optimal information presentation text and materials using a generative AI model. This process involves using prompts to instruct the generative AI model and design the flow and structure of the information. The input is the analysis result profile, and the output is specific presentation text and visual materials.

[0346] Step 3:

[0347] The user prepares their presentation based on the presentation text and materials provided by the server. The user reviews the materials, customizes them as needed, and prepares for the presentation. During the preparation process, they organize their presentation method and key points.

[0348] Step 4:

[0349] The user begins a presentation and presents information through the device. The device uses emotion recognition to capture the recipient's facial expressions, voice, and gaze in real time and analyzes their psychological state. The input is real-time data from the recipient, and this analysis allows the device to understand the recipient's shifts in interest and attention. As output, a report on the recipient's psychological state is generated.

[0350] Step 5:

[0351] The emotion engine captures the user's facial expressions and tone of voice to recognize the user's own emotional state. This information is fed back to the user through the device, providing guidance for adjusting the presentation. The input is real-time data from the user, and the output is feedback that aids self-awareness.

[0352] Step 6:

[0353] After the presentation ends, the server uses follow-up mechanisms to re-analyze the collected data. The input is all the data acquired during the presentation, and the server performs data processing to derive optimal improvement measures. As output, specific improvement suggestions for the next information presentation are generated and provided to the user.

[0354] (Application Example 2)

[0355] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0356] In modern advertising, there is a need to understand the emotions of individual viewers in real time and present the most suitable advertisements that match their psychological state. However, conventional advertising systems have the challenge of not being able to effectively capture viewers' emotions and reactions and instantly generate corresponding advertisement content. Therefore, new methods are needed to maximize the effectiveness of advertising.

[0357] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0358] In this invention, the server includes an analysis means for analyzing past data and psychological data to identify an individual's interests and reaction patterns, a generation means for automatically generating optimal content based on the analysis results, and an emotion recognition means for capturing an individual's facial expressions and voice in real time and analyzing their psychological state. This makes it possible to present dynamic advertising content that responds to the viewer's emotional state.

[0359] "Analysis means" refers to a technique that analyzes past data and psychological data to identify an individual's interests and reaction patterns.

[0360] "Generation means" refers to a technology that automatically creates optimal content based on the results obtained by the analysis means.

[0361] "Emotion recognition means" refers to a technique that collects an individual's facial expressions and voice in real time and analyzes their psychological state from that data.

[0362] "Presentation means" refers to a technology that dynamically displays different information according to an individual's psychological state.

[0363] "Follow-up means" refers to a technology that reanalyzes data collected after viewing and generates suggestions for improvement for the next time.

[0364] The system for implementing the present invention aims to optimize ad delivery and mainly includes analysis means, generation means, emotion recognition means, presentation means, and follow-up means. These means are implemented on the server and the user's terminal.

[0365] The server processes historical and psychological data using the analysis means to identify viewer interests and reaction patterns. This analysis utilizes natural language processing technology, together with, for example, OpenCV as facial recognition software and Google's speech recognition API as voice analysis software. Based on the analysis results, the server uses a generative AI model (e.g., a model run through Hugging Face's Transformers library) to automatically generate advertising content optimized for the viewer.

[0366] On the user's device, the individual's facial expression and voice data collected from the smartphone's camera and microphone are analyzed in real time using emotion recognition technology to determine their psychological state. To maintain viewer interest, the presentation means switches the displayed information on the fly and adjusts the advertising content accordingly. This information is dynamically updated based on the viewer's facial expressions and voice tone while watching the video.

[0367] After the presentation ends, the server uses follow-up methods to re-analyze the viewing data and generate improvement suggestions to help with the next advertising campaign. For example, if a user smiles while watching the video, content such as "Using this product will make your everyday life more enjoyable. Click here for more details!" will be displayed.

[0368] This system enables ad delivery that resonates with viewers' emotions, maximizing advertising effectiveness.

[0369] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0370] Step 1:

[0371] The server collects historical and psychological data. The input consists of business presentation data and datasets based on psychological research. These are analyzed using natural language processing techniques to identify audience interest and response patterns. The analysis identifies which topics attract interest, and this information is used as input for the next step.

[0372] Step 2:

[0373] The server automatically generates appropriate advertising content using a generative AI model based on insights obtained from the analysis. The input is the interest and response patterns obtained in step 1, which the generative AI model processes to output broadcastable advertising text and visuals. The generated content is optimized to easily attract the viewer's interest.

[0374] Step 3:

[0375] The user's device uses a camera and microphone to capture the viewer's facial expressions and voice in real time. The input consists of audio and image data, which are analyzed by emotion recognition systems to determine the viewer's psychological state. The output is real-time emotion data from the viewer, which is used to appropriately adjust the displayed advertisements.

[0376] Step 4:

[0377] The device uses presentation tools to display the most suitable advertisement content to the viewer based on the emotion recognition results. The input is the emotion data obtained in step 3, and based on this data, it selects and dynamically displays different advertisement content. The output is the customized advertisement presented to the viewer.
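The selection in Step 4 can be pictured as a lookup from the recognized emotion to an advertisement variant. In the sketch below, only the "happy" text comes from the description's smiling-viewer example; the other variants, the emotion labels, and the names AD_VARIANTS, select_ad, and next_ad are assumptions.

```python
# Hypothetical mapping from recognized emotion to ad text.
AD_VARIANTS = {
    "happy": (
        "Using this product will make your everyday life more enjoyable. "
        "Click here for more details!"
    ),
    "bored": "Short on time? See the three key benefits in 10 seconds.",
    "neutral": "Learn more about this product.",
}


def select_ad(emotion, variants=AD_VARIANTS, default="neutral"):
    """Pick the ad variant for an emotion, falling back to the default variant."""
    return variants.get(emotion, variants[default])


def next_ad(current_ad, emotion):
    """Return the ad to switch to, or None when the current ad already fits."""
    chosen = select_ad(emotion)
    return None if chosen == current_ad else chosen


shown = select_ad("neutral")
update = next_ad(shown, "happy")
```

Returning None when nothing changes avoids redrawing the same advertisement on every emotion update.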

[0378] Step 5:

[0379] After the presentation, the server re-analyzes the viewing data and uses follow-up methods to generate suggestions for improving the next advertising campaign. The input is the viewing data collected in steps 3 and 4, which is analyzed to evaluate its effectiveness and output suggestions for the next campaign.

[0380] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0381] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. A prompt containing instructions, and inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
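The input to the data generation model 58 described above, a prompt plus inference data of several kinds, could be represented by a small container such as the following. This is a sketch only: the class InferenceRequest and its fields are hypothetical, and restricting the kinds to exactly the audio, text, and image examples named in the description is an assumption, since the description presents them as examples rather than a closed list.

```python
from dataclasses import dataclass, field

# Assumed closed sets, taken from the examples given in the description.
INPUT_KINDS = {"audio", "text", "image"}
OUTPUT_FORMATS = {"audio", "text"}


@dataclass
class InferenceRequest:
    """A prompt with attached inference data and a requested output format."""

    prompt: str
    output_format: str = "text"
    inputs: list = field(default_factory=list)  # (kind, payload) pairs

    def __post_init__(self):
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unsupported output format: {self.output_format}")

    def add_input(self, kind, payload):
        """Attach one piece of inference data; returns self to allow chaining."""
        if kind not in INPUT_KINDS:
            raise ValueError(f"unsupported input kind: {kind}")
        self.inputs.append((kind, payload))
        return self


request = InferenceRequest(prompt="Summarize the customer's reaction.")
request.add_input("text", "The customer asked about pricing twice.")
```

Validating the kinds up front means a malformed request fails before any model is invoked.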

[0382] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0383] [Third Embodiment]

[0384] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0385] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0386] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0387] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0388] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0389] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0390] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0391] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0392] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0393] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0394] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0395] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0396] The presentation optimization system proposed in this invention comprises a server, a terminal, and a user working together. The server first collects past business presentation data and psychological data, and then analyzes them using the analysis means to identify customer interests and reaction patterns. Based on these analysis results, it automatically creates optimal presentation text and materials using the generation means. This allows the user to prepare presentations efficiently and effectively.

[0397] The user begins the presentation based on the generated presentation text and materials. The terminal captures the customer's facial expressions and voice in real time, and the emotion recognition means analyzes this information to determine the customer's psychological state. Then, through the presentation means, the terminal provides the user with suggestions for improving the presentation or new information in real time. This functionality allows the user to appropriately and instantly adjust the presentation content in response to customer reactions.

[0398] Furthermore, after the presentation ends, the server re-analyzes the collected data and, using the follow-up means, generates specific improvement suggestions for the next presentation. These suggestions are provided to the user as an automatically generated report, contributing to improved sales activities and communication.

[0399] As a concrete example, when proposing a new plan for a communication product, the user approaches the customer with optimal materials generated by the server. During the meeting, the terminal presents information that will capture the customer's interest, and if it detects a lack of interest from the customer's facial expressions, it suggests an alternative approach. After the meeting, the server provides specific suggestions for improvement for the next proposal. This entire process enables presentations tailored to customer needs, thereby increasing the closing rate. This system enables communication optimized for each individual customer, enhancing customer satisfaction and contributing to improved sales efficiency.

[0400] The following describes the processing flow.

[0401] Step 1:

[0402] The server collects past business presentation data and psychological data. This includes presentation history, customer comments, and customer reaction data.

[0403] Step 2:

[0404] The server processes the collected data using analytical tools and identifies customer interests and response patterns using NLP models. This analysis reveals customer preferences and needs.

[0405] Step 3:

[0406] Based on the analysis results, the server automatically generates the optimal presentation text and materials using a generation method and presents them to the user. These materials are tailored to the customer's areas of interest and needs.

[0407] Step 4:

[0408] The user prepares to deliver the presentation based on the presentation materials provided by the server. This includes reviewing the materials and conducting a rehearsal.

[0409] Step 5:

[0410] At the start of the presentation, the terminal begins collecting the customer's facial expressions and voice. The terminal uses its camera and microphone to capture this data in real time.

[0411] Step 6:

[0412] The terminal uses emotion recognition to analyze the customer's psychological state from the collected data, determining, for example, whether the customer is interested or bored.

[0413] Step 7:

[0414] Based on the customer's psychological state, the terminal provides the user with improvement suggestions and additional information in real time through the presentation means. This allows the user to dynamically adjust the presentation.

[0415] Step 8:

[0416] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0417] Step 9:

[0418] The server provides users with an automatically generated report that includes improvement suggestions. This report will be used in future sales activities.
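
As a non-limiting illustration, the nine steps above can be sketched as a single orchestration function in Python. This is only a sketch under stated assumptions: the names `run_presentation_cycle` and `CycleResult`, and the idea of passing each means in as a callable, are hypothetical and do not appear in this disclosure. The analysis, generation, emotion recognition, presentation, and follow-up means are represented by the injected callables, so any concrete model could be substituted.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class CycleResult:
    """Outputs of one presentation cycle (hypothetical container)."""
    materials: str
    advice_log: list = field(default_factory=list)
    report: str = ""


def run_presentation_cycle(
    history: list,
    analyze: Callable[[list], dict],
    generate: Callable[[dict], str],
    observations: Iterable[dict],
    recognize: Callable[[dict], str],
    advise: Callable[[str], Optional[str]],
    follow_up: Callable[[list], str],
) -> CycleResult:
    # Steps 1-3 (server): analyze the collected history, then generate materials.
    profile = analyze(history)
    result = CycleResult(materials=generate(profile))

    # Steps 5-7 (terminal): recognize each observation and collect any advice.
    states = []
    for obs in observations:
        state = recognize(obs)
        states.append(state)
        tip = advise(state)
        if tip is not None:
            result.advice_log.append(tip)

    # Steps 8-9 (server): re-analyze the recorded states into a report.
    result.report = follow_up(states)
    return result
```

Step 4 (the user's own preparation) is a human activity and is therefore not modeled in the sketch.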

[0419] (Example 1)

[0420] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0421] In modern business activities, maximizing the effectiveness of information dissemination to customers is crucial. However, data-driven solutions for appropriately tailoring information to individual customer responses and using that feedback to improve future information disseminations are limited. Therefore, there is a need for a system that optimizes information based on dynamic customer responses and continuously improves it.

[0422] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0423] In this invention, the server includes information analysis means for analyzing past business information and social science data to identify customer interests and reactions, data generation means for automatically generating optimized information representations based on the analysis results, and reaction analysis means for acquiring customer facial expressions and voices in real time and analyzing their psychological state. This maximizes the effectiveness of information announcements based on dynamic reactions of each customer and enables the provision of concrete improvement suggestions for future information announcements.

[0424] "Information analysis means" refers to a function that collects past business information and social science data and analyzes them to identify customer interests and reactions.

[0425] "Data generation means" refers to a function that automatically creates an optimized information representation based on the analysis results obtained by the information analysis means.

[0426] "Reaction analysis means" refers to a function that acquires customer facial expressions and voices in real time, analyzes them, and determines the customer's psychological state.

[0427] "Information correction means" refers to a function that provides suggestions for improving information dissemination, or offers new information, based on the customer's psychological state.

[0428] "Improvement proposal means" refers to a function that reanalyzes data collected after information dissemination and generates specific improvement proposals for the next information dissemination.

[0429] This invention is a system in which a server, a terminal, and a user collaborate to optimize presentation information. The server first acquires business information and social science data. This is done using a database management system and collecting data using query languages such as SQL. Subsequently, analysis software such as Python's Pandas and scikit-learn is used for data analysis, and machine learning techniques are applied to clarify customer behavior patterns and interests.
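
As a non-limiting illustration, the retrieval and analysis described above might look like the following Python sketch. The table name `presentation_feedback`, its columns, and the sample rows are hypothetical and are not specified in this disclosure. To keep the sketch self-contained, the standard-library `sqlite3` module stands in for the database management system, and a plain-Python aggregation stands in for the Pandas and scikit-learn analysis.

```python
import sqlite3
from collections import defaultdict


def load_feedback(conn: sqlite3.Connection, customer_id: str) -> list:
    # Parameterized SQL query: returns (topic, favorable) rows for one customer.
    cur = conn.execute(
        "SELECT topic, favorable FROM presentation_feedback "
        "WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return cur.fetchall()


def favorable_rate_by_topic(rows: list) -> dict:
    # Share of favorable reactions per topic, as a simple interest indicator.
    totals = defaultdict(lambda: [0, 0])  # topic -> [favorable count, total count]
    for topic, favorable in rows:
        totals[topic][0] += favorable
        totals[topic][1] += 1
    return {topic: fav / n for topic, (fav, n) in totals.items()}


def demo_connection() -> sqlite3.Connection:
    # In-memory database with made-up sample rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE presentation_feedback ("
        "id INTEGER PRIMARY KEY, customer_id TEXT, topic TEXT, favorable INTEGER)"
    )
    conn.executemany(
        "INSERT INTO presentation_feedback (customer_id, topic, favorable) "
        "VALUES (?, ?, ?)",
        [("c1", "pricing", 1), ("c1", "pricing", 0),
         ("c1", "coverage", 1), ("c2", "pricing", 0)],
    )
    return conn
```

For example, `favorable_rate_by_topic(load_feedback(demo_connection(), "c1"))` yields a per-topic rate that a later generation step could use to decide which topics to emphasize.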

[0430] Based on these analysis results, the server automatically generates information representations using a generative AI model. This process utilizes an AI model known for natural language processing, specifically taking the prompt "Create presentation text for a specific customer segment" as input. The generated text can be used in presentation software such as Google Slides or Microsoft PowerPoint.

[0431] The user receives materials generated by the server and presents the information to the customer. During the meeting, the terminal uses a camera and microphone to capture the customer's facial expressions and voice in real time. This data is used to evaluate the customer's psychological state by sentiment analysis software (for example, an API provided by a cloud provider). The terminal is equipped with means to modify the information based on the customer's reactions, and specific advice such as "The customer's interest is waning, let's provide additional information in the next section" will be displayed during the presentation.
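
As a non-limiting illustration, the mapping from an evaluated psychological state to displayed advice could be sketched in Python as follows. The emotion labels, the 0.6 thresholds, and the second advice message are hypothetical; only the first message is taken from the description above. The scores are assumed to be values between 0 and 1 returned by whatever sentiment analysis software is used.

```python
from typing import Optional

# Hypothetical rules: (emotion label, threshold, advice message).
ADVICE_RULES = [
    ("boredom", 0.6,
     "The customer's interest is waning, let's provide additional "
     "information in the next section."),
    ("confusion", 0.6,
     "The customer may be confused; consider restating the key point "
     "more simply."),
]


def suggest_advice(scores: dict) -> Optional[str]:
    # Return the advice for the strongest emotion that crosses its threshold,
    # or None when no rule applies.
    best = None  # (score, message) of the strongest matching rule so far
    for emotion, threshold, message in ADVICE_RULES:
        value = scores.get(emotion, 0.0)
        if value >= threshold and (best is None or value > best[0]):
            best = (value, message)
    return best[1] if best else None
```

Returning `None` when no threshold is crossed lets the terminal stay silent while the presentation is going well.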

[0432] Once the presentation is complete, the server re-analyzes the data and generates improvement suggestions that can be used for future information presentations. These suggestions are then provided to the user as an automatically generated report. This entire process enables customized information presentations for customers, resulting in higher customer satisfaction and greater efficiency in commercial activities.

[0433] For example, when proposing a new communication service plan, a generative AI model can be used to input a prompt into the server such as, "Create text introducing a recommended communication plan for business professionals in their 30s," which will generate information optimized for the customer's attributes. This allows users to efficiently develop presentations that capture the customer's interest.
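
As a non-limiting illustration, a prompt of this kind could be assembled from customer attributes and analysis results as in the following Python sketch. The function name `build_prompt`, its parameters, and the wording of the appended sentence are hypothetical; the base sentence follows the example prompt above.

```python
def build_prompt(product: str, audience: str, interests: list) -> str:
    # Base instruction follows the wording of the example prompt in the text.
    prompt = f"Create text introducing a recommended {product} for {audience}."
    if interests:
        # Append analysis results so the model can tailor the content.
        prompt += (
            " Emphasize the following points of interest: "
            + ", ".join(interests)
            + "."
        )
    return prompt
```

Calling `build_prompt("communication plan", "business professionals in their 30s", [])` reproduces the example prompt, and passing a non-empty list of interests appends the points identified by the analysis.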

[0434] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0435] Step 1:

[0436] The server retrieves historical business information and social science data. This retrieval is performed by collecting data from a database using SQL queries. Queries related to specific customers or presentation topics are used as input, and the output is the historical presentation data and customer feedback that match those criteria. This makes context-specific, useful data available as the basis for subsequent processing.

[0437] Step 2:

[0438] The server analyzes the acquired data to identify customer interests and response patterns. The analysis uses the Python Pandas library and scikit-learn for data cleaning and classification. The input is the raw data obtained in step 1, and the output is specific indicators and patterns that show customer interests and past responses. This step allows us to understand what kind of information customers responded favorably to.

[0439] Step 3:

[0440] The server automatically generates information representations using a generative AI model based on the analysis results. For example, a prompt such as "Create presentation text for a specific customer segment" is input, and the generative AI model responds by generating the optimal presentation text. The output consists of text and slides tailored to customer attributes. This generated content is directly used for information dissemination to customers.

[0441] Step 4:

[0442] Users present information based on presentation materials provided by the server. Upon receiving the materials, users use presentation software to organize and present them to customers. In this step, the input is the materials from the server, and the output is the actual presentation given to the customer.

[0443] Step 5:

[0444] The terminal captures the customer's facial expressions and voice in real time during the presentation. Using a camera and microphone, it collects the customer's facial expressions and voice as digital data. This data is then analyzed using emotion analysis software to evaluate the customer's psychological state. The evaluation results are output and used as an indicator of the customer's interest and understanding.

[0445] Step 6:

[0446] The terminal provides the user with real-time suggestions for improving the information presentation based on the sentiment analysis results. The input is the sentiment analysis results from step 5, and the output is specific advice and correction instructions. For example, it might suggest, "The customer's attention may be drifting, so let's move on to the next topic."

[0447] Step 7:

[0448] After the presentation ends, the server analyzes the collected data again and generates improvement suggestions for the next information presentation. The inputs are the reaction data obtained in step 5 and the presentation results, and the output is a report containing automatically generated improvement suggestions. This report is provided to the user and used to review future activities and information presentation strategies.
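
As a non-limiting illustration, the report generation in step 7 could be sketched in Python as follows. The input format (a list of section name and interest score pairs), the 0.5 threshold, and the report wording are hypothetical and are not specified in this disclosure.

```python
from collections import defaultdict
from statistics import mean


def build_report(samples: list, threshold: float = 0.5) -> str:
    # samples: (section name, interest score in [0, 1]) recorded during the talk.
    by_section = defaultdict(list)
    for section, score in samples:
        by_section[section].append(score)

    lines = ["Presentation follow-up report"]
    for section, scores in by_section.items():
        avg = mean(scores)
        # Flag sections whose average interest fell below the threshold.
        note = "review this section" if avg < threshold else "keep as is"
        lines.append(f"- {section}: average interest {avg:.2f} ({note})")
    return "\n".join(lines)
```

Sections are listed in the order in which they first appear in the recorded data, so the report follows the flow of the presentation.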

[0449] (Application Example 1)

[0450] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0451] In today's commercial environment, understanding customer interests and emotions in real time and responding quickly and accurately is key to providing effective and personalized service. However, traditional methods struggle to instantly judge human emotions and interests and appropriately adjust presentations and negotiation content. This results in missed opportunities to increase customer satisfaction and hinders commercial success.

[0452] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0453] In this invention, the server includes an analysis means for analyzing business record data and psychological analysis data to identify the subject's interests and reaction tendencies; a generation means for automatically creating optimal presentation content and information based on the analysis results; and an emotion recognition means for acquiring the subject's facial expressions and voice in real time and evaluating their psychological state. This makes it possible to instantly grasp the subject's interests and emotions during presentations and business negotiations and provide optimal information on the spot.

[0454] "Business record data" refers to a collection of data containing information about past business operations, which serves as the basis for analyzing customer interests and responses.

[0455] "Psychological analysis data" refers to a collection of data gathered to understand an individual's psychological state and reactions.

[0456] "Analysis means" refers to a device or method that extracts business record data and psychological analysis data and performs processing to identify the interests and reaction tendencies of the subject.

[0457] "Generation means" refers to a device or method that automatically creates optimal presentation content and information based on the analysis results.

[0458] "Emotion recognition means" refers to a device or technology for acquiring a subject's facial expressions and voice in real time and evaluating their psychological state based on that information.

[0459] "Display means" refers to a device or technology that visually displays acquired information and clearly indicates the subject's condition or analysis results.

[0460] "Follow-up means" refers to a device or method for reanalyzing data collected after a presentation and automatically generating improvement suggestions for the next presentation.

[0461] The system of this invention consists of a server, a terminal, and a user working together. First, the server collects business record data and psychological analysis data, and uses dedicated data analysis software to identify the subject's interests and reaction tendencies. Specifically, it analyzes the data using natural language processing technology. Based on the analysis results, a generative AI model automatically creates the optimal presentation content and information. This process enables the most effective information delivery in presentations and business negotiations.

[0462] Next, the terminal, for example in the form of smart glasses or another portable information terminal, captures the subject's facial expressions and voice in real time. This information is analyzed using emotion recognition technology to immediately evaluate the subject's psychological state. The emotion recognition technology may use, for example, a computer vision or facial analysis library such as OpenCV or DeepFace.

[0463] Users utilize the presentation device to visually evaluate the analyzed information and adjust the presentation content in real time. This process can enhance the effectiveness of business negotiations and demonstrations.

[0464] A concrete example would be a scenario introducing a new smartphone. The server analyzes data collected from the target audience's past purchase history and product reviews to identify their interest in camera features and battery life. Based on this, the generated presentation text emphasizes these features. If the audience's facial expressions captured by the terminal indicate a lack of interest, the user quickly switches to presenting information about other appealing features.

[0465] An example of a prompt might be, "If the customer is not interested in the dual-camera feature, generate presentation text to highlight other features." This prompt is then fed into the generative AI model to ensure that appropriate alternative information is provided.
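
As a non-limiting illustration, the decision to issue such a prompt could be sketched in Python as follows. The function name `alternative_prompt`, the interest score, and the 0.4 threshold are hypothetical; the prompt wording mirrors the example above.

```python
from typing import Optional


def alternative_prompt(
    feature: str, interest_score: float, threshold: float = 0.4
) -> Optional[str]:
    # No prompt is needed while the customer still appears interested.
    if interest_score >= threshold:
        return None
    # Wording mirrors the example prompt given in the text.
    return (
        f"If the customer is not interested in the {feature} feature, "
        "generate presentation text to highlight other features."
    )
```

The returned string would then be passed to the generative AI model; how that model is invoked is outside the scope of this sketch.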

[0466] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0467] Step 1:

[0468] The server collects business record data and psychological analysis data. It uses digital data related to past business operations and psychological survey data as input. Natural language processing techniques are applied to analyze this data and identify the subject's areas of interest and reaction tendencies. This process provides the server with the foundational data needed to create optimal presentation content and information.

[0469] Step 2:

[0470] The server uses a generative AI model based on the analysis results to automatically create optimal presentation content and information. The analysis results obtained in Step 1 are used as input. The generative AI model generates presentation text based on the prompt text, which is then output. This generated information is used in subsequent presentations.

[0471] Step 3:

[0472] The terminal captures the subject's facial expressions and voice in real time. It uses a camera and microphone to acquire video and audio of the subject as input. This data is processed using emotion recognition technology to evaluate the subject's psychological state in real time. Through this process, the terminal obtains immediate data on the subject's behavior and reactions.

[0473] Step 4:

[0474] The user uses a presentation device to visually evaluate the presentation content based on the analyzed information and adjust it as needed. The inputs used are the presentation content generated in step 2 and the real-time sentiment analysis results obtained in step 3. Based on this information, the user appropriately modifies the presentation or sales negotiation content to effectively convey the information to the target audience.

[0475] Step 5:

[0476] After the presentation ends, the server re-analyzes the data and automatically generates improvement suggestions for the next time. The input is the data recorded during the presentation. The same natural language processing techniques used in the previous procedure are employed for this analysis. The output presents improvement suggestions, providing valuable feedback to prepare for future business negotiations and presentations.

[0477] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform specific processing using the user's emotions.

[0478] The presentation optimization system of the present invention involves a server, a terminal, a user, and an emotion engine. First, the server collects past business presentation data and psychological data, and uses analytical means to identify customer interests and response patterns. This analysis utilizes a natural language processing model to clarify the customer's specific needs and interests.

[0479] Based on the analysis results, the server automatically generates optimal presentation text and materials using a generation mechanism. The user prepares based on these materials and delivers the presentation in real time via the terminal. During the presentation, the terminal uses emotion recognition to analyze the customer's facial expressions, voice, and even gaze in real time to determine the customer's psychological state.

[0480] Furthermore, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's own emotional state. This information is fed back to the user through the presentation system, helping them with self-awareness and providing guidance for carefully adjusting the presentation's progress.

[0481] Furthermore, this system uses follow-up methods to re-analyze the data collected after the presentation. Based on the analysis results, the server generates and provides specific improvement suggestions for the next presentation, utilizing user and customer sentiment data.

[0482] As a concrete example, when giving a presentation introducing a new product, the server analyzes data obtained from past success stories and provides the user with generated materials. As the user conducts the presentation while interacting with the customer, the user receives feedback on their own emotional state from the emotion engine, while the terminal monitors the customer's emotions. Based on this information, the user adjusts the content as needed to achieve more effective communication. After the presentation, the improvement suggestions obtained through follow-up become an important source of information for raising the success rate of future presentations. This system enables communication that captures the emotions of both the customer and the user, maximizing the effectiveness of the presentation.

[0483] The following describes the processing flow.

[0484] Step 1:

[0485] The server collects past business presentation data and psychological data. Specifically, it retrieves customer purchase history and past presentation feedback from the database.

[0486] Step 2:

[0487] The server processes the collected data using analytical tools, and a natural language processing model identifies customer areas of interest and response patterns. This reveals customer characteristics and needs.

[0488] Step 3:

[0489] Based on the analysis results, the server automatically generates presentation text and materials using a generation mechanism. During this process, content and visual materials that are likely to attract the customer's interest are highlighted.

[0490] Step 4:

[0491] Users prepare their presentations based on automatically generated materials. They verify the consistency of the materials and practice the flow of their presentations.

[0492] Step 5:

[0493] From the start of the presentation, the terminal captures the customer's facial expressions, gaze, and tone of voice in real time using its camera and microphone.

[0494] Step 6:

[0495] The emotion recognition means analyzes the data captured by the terminal to determine the customer's psychological state. Specifically, it identifies emotions such as interest, suspicion, and boredom.

[0496] Step 7:

[0497] The emotion engine analyzes the user's facial expressions and tone of voice in real time to recognize the user's emotional state. This allows the user to understand their own situation.

[0498] Step 8:

[0499] The terminal provides feedback to the user through the presentation means based on the customer and user sentiment data. This allows the user to adjust the presentation accordingly.

[0500] Step 9:

[0501] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0502] Step 10:

[0503] The server provides the user with a report that includes suggestions for improvement. The user can use this report to prepare for their next presentation more effectively.
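
As a non-limiting illustration, the feedback in step 8 of the flow above, which combines the customer's state with the user's own state, could be sketched in Python as follows. The customer labels "bored" and "suspicious" correspond to the boredom and suspicion named in step 6; the user labels "nervous" and "rushed" and all message wording are hypothetical.

```python
def combined_feedback(customer_state: str, user_state: str) -> list:
    # Hypothetical state labels and messages, for illustration only.
    messages = []
    if customer_state in {"bored", "suspicious"}:
        # Customer-side feedback from the emotion recognition result.
        messages.append(
            f"Customer appears {customer_state}: consider changing the "
            "topic or adding evidence."
        )
    if user_state in {"nervous", "rushed"}:
        # Presenter-side feedback from the emotion engine.
        messages.append(
            f"You appear {user_state}: slow down and pause before the "
            "next point."
        )
    return messages
```

An empty list means that neither the customer's state nor the user's state calls for an adjustment at that moment.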

[0504] (Example 2)

[0505] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0506] In information presentation activities, there is a challenge in effectively capturing the recipient's interest and reactions, resulting in decreased efficiency of information transmission. Furthermore, it is difficult for the presenter to understand their own emotional state and appropriately adjust their presentation accordingly. To address these challenges, it is necessary to analyze the emotional states of both the presenter and the recipient in real time and achieve optimized information presentation.

[0507] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0508] In this invention, the server includes an analysis means for analyzing past information presentation data and psychological information to identify the recipient's interests and reaction patterns; a generation means for automatically generating optimal information presentation text and materials based on the analysis results; an emotion recognition means for capturing the recipient's facial expressions, voice, and gaze in real time and analyzing their psychological state; a self-recognition feedback means for recognizing the user's facial expressions and tone of voice and presenting their own emotional state; and a follow-up means for re-analyzing the data after information presentation and generating suggestions for improvement for the next time. This maximizes the effectiveness of information presentation and enables smooth communication.

[0509] "Past information presentation data" refers to all data related to presentations and explanations given in the past, including the content of statements and materials used.

[0510] "Psychological information" refers to data on recipients' emotions and behavioral patterns, based on psychological findings and research results.

[0511] "Analysis means" refers to methods and devices for processing target data and extracting important patterns and information from it.

[0512] "Generation means" refers to a method or process for creating new information presentation text or materials based on the analysis results.

[0513] "Emotion recognition means" refers to methods and technologies that analyze and judge the emotional state of the recipient from facial expressions, voice, gaze, etc.

[0514] "Self-recognition feedback means" refers to technologies and systems that present the presenter's own emotional state to the presenter and promote self-awareness.

[0515] "Follow-up means" refers to methods or devices for re-analyzing data after information has been presented and generating suggestions for future improvements.

[0516] This system optimizes presentations through the coordinated operation of a server, terminal, user, and emotion engine. The following describes a specific implementation of this system.

[0517] First, the server collects past information presentation data and psychological information, and analyzes this data. The server utilizes a natural language processing model to identify the recipient's interests and reaction patterns. This analysis makes it possible to determine what information is effective for the recipient. Based on the analysis results, a generative AI model then automatically generates optimal information presentation text and materials. Specifically, information about newly introduced products and services is constructed based on successful past presentation examples.

[0518] The user prepares the presentation based on the generated information presentation text and materials. The terminal plays a crucial role in the presentation itself. The terminal captures the recipient's facial expressions, voice, and gaze in real time, and analyzes their psychological state using emotion recognition technology. This analysis allows for an immediate determination of how the recipient is receiving the presentation.

[0519] In addition, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's emotional state. This information is fed back to the user to help them with self-awareness and to guide them in adjusting the flow of their presentation.

[0520] After the presentation ends, the server re-analyzes the data for follow-up and generates suggestions for improvement for the next presentation. This can improve the success rate of future information presentations.

[0521] As a concrete example, when giving a presentation to introduce a new product, the following prompt can be used in the AI model: "Generate new product introduction materials based on past success stories. Focus on elements that will capture the audience's attention, and include suggestions for conducting the presentation while monitoring emotions in real time."

[0522] This system enables effective information transmission by understanding the emotions of both the user and the recipient, maximizing the impact of presentations.

[0523] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0524] Step 1:

[0525] The server collects historical information presentation data and psychological information. It retrieves relevant documents and datasets from databases and the internet as input. To analyze this information, it uses a natural language processing model to identify the recipient's areas of interest. As output, it generates a profile of the recipient's interests and response patterns.

[0526] Step 2:

[0527] Based on the analysis results obtained in Step 1, the server automatically generates optimal information presentation text and materials using a generative AI model. This process involves using prompts to instruct the generative AI model and design the flow and structure of the information. The input is the analysis result profile, and the output is specific presentation text and visual materials.

[0528] Step 3:

[0529] The user prepares their presentation based on the presentation text and materials provided by the server. The user reviews the materials, customizes them as needed, and prepares for the presentation. During the preparation process, they organize their presentation method and key points.

[0530] Step 4:

[0531] The user begins a presentation and presents information through the device. The device uses emotion recognition to capture the recipient's facial expressions, voice, and gaze in real time and analyzes their psychological state. The input is real-time data from the recipient, and this analysis allows the device to understand the recipient's shifts in interest and attention. As output, a report on the recipient's psychological state is generated.

[0532] Step 5:

[0533] The emotion engine captures the user's facial expressions and tone of voice to recognize the user's own emotional state. This information is fed back to the user through the device, providing guidance for adjusting the presentation. The input is real-time data from the user, and the output is feedback that aids self-awareness.

[0534] Step 6:

[0535] After the presentation ends, the server uses follow-up mechanisms to re-analyze the collected data. The input is all the data acquired during the presentation, and the server performs data processing to derive optimal improvement measures. As output, specific improvement suggestions for the next information presentation are generated and provided to the user.

[0536] (Application Example 2)

[0537] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0538] In modern advertising, there is a need to understand the emotions of individual viewers in real time and present the most suitable advertisements that match their psychological state. However, conventional advertising systems have the challenge of not being able to effectively capture viewers' emotions and reactions and instantly generate corresponding advertisement content. Therefore, new methods are needed to maximize the effectiveness of advertising.

[0539] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0540] In this invention, the server includes an analysis means for analyzing past data and psychological data to identify an individual's interests and reaction patterns, a generation means for automatically generating optimal content based on the analysis results, and an emotion recognition means for capturing an individual's facial expressions and voice in real time and analyzing their psychological state. This makes it possible to present dynamic advertising content that responds to the viewer's emotional state.

[0541] "Analysis means" refers to a technique that analyzes past data and psychological data to identify an individual's interests and response patterns.

[0542] "Generation means" refers to a technology that automatically creates optimal content based on the results obtained by the analysis means.

[0543] "Emotion recognition means" refers to a technique that collects an individual's facial expressions and voice in real time and analyzes their psychological state from that data.

[0544] "Presentation means" refers to a technology that dynamically displays different information according to an individual's psychological state.

[0545] "Follow-up means" refers to a technology that reanalyzes data collected after viewing and generates improvement suggestions for the next time.

[0546] The system for implementing the present invention aims to optimize ad delivery and mainly includes analysis means, generation means, emotion recognition means, presentation means, and follow-up means. These means are implemented on the server and the user's terminal.

[0547] The server processes historical and psychological data using the analysis means to identify viewer interests and reaction patterns. This analysis uses natural language processing technology, with OpenCV as the facial recognition software and Google's speech recognition API as the voice analysis software. Based on the analysis results, the server uses a generative AI model (e.g., one available through Hugging Face's Transformers library) to automatically generate advertising content optimized for the viewer.

[0548] On the user's device, the viewer's facial expression and voice data collected from the smartphone's camera and microphone are analyzed in real time using the emotion recognition means to determine the viewer's psychological state. To maintain viewer interest, the presentation means switches the displayed information on the fly and adjusts the advertising content accordingly. This information is dynamically updated based on the viewer's facial expressions and voice tone while watching the video.

[0549] After the presentation ends, the server uses follow-up methods to re-analyze the viewing data and generate improvement suggestions to help with the next advertising campaign. For example, if a user smiles while watching the video, content such as "Using this product will make your everyday life more enjoyable. Click here for more details!" will be displayed.
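The smile example above can be read as a lookup from a recognized emotion label to advertising copy. The following is a minimal sketch under that assumption. Only the "smile" entry comes from the text; the other labels, their copy, and the names `AD_COPY` and `select_ad_copy` are hypothetical.

```python
# Hypothetical mapping from a recognized emotion label to ad copy.
# Only the "smile" entry is taken from the text; the others are assumptions.
AD_COPY = {
    "smile": "Using this product will make your everyday life more enjoyable. "
             "Click here for more details!",
    "neutral": "See what this product can do in 30 seconds.",
    "bored": "Short on time? Here are the three key points.",
}
DEFAULT_EMOTION = "neutral"


def select_ad_copy(emotion: str) -> str:
    """Return the ad copy for a detected emotion, falling back to a default."""
    return AD_COPY.get(emotion, AD_COPY[DEFAULT_EMOTION])


print(select_ad_copy("smile"))
```

Falling back to a default entry keeps the display stable when the emotion recognition step returns a label the table does not cover.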

[0550] This system enables ad delivery that resonates with viewers' emotions, maximizing advertising effectiveness.

[0551] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0552] Step 1:

[0553] The server collects historical and psychological data. The input consists of business presentation data and datasets based on psychological research. These are analyzed using natural language processing techniques to identify audience interest and response patterns. The analysis identifies which topics attract interest, and this information is used as input for the next step.

[0554] Step 2:

[0555] The server automatically generates appropriate advertising content using a generative AI model based on insights obtained from the analysis. The input is the interest and response patterns obtained in step 1, which the generative AI model processes to output broadcastable advertising text and visuals. The generated content is optimized to easily attract the viewer's interest.

[0556] Step 3:

[0557] The user's device uses a camera and microphone to capture the viewer's facial expressions and voice in real time. The input consists of audio and image data, which are analyzed by emotion recognition systems to determine the viewer's psychological state. The output is real-time emotion data from the viewer, which is used to appropriately adjust the displayed advertisements.

[0558] Step 4:

[0559] The device uses presentation tools to display the most suitable advertisement content to the viewer based on the emotion recognition results. The input is the emotion data obtained in step 3, and based on this data, it selects and dynamically displays different advertisement content. The output is the customized advertisement presented to the viewer.

[0560] Step 5:

[0561] After the presentation, the server re-analyzes the viewing data and uses follow-up methods to generate suggestions for improving the next advertising campaign. The input is the viewing data collected in steps 3 and 4, which is analyzed to evaluate its effectiveness and output suggestions for the next campaign.
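As an illustration of the follow-up in step 5, the sketch below aggregates viewing data into a per-advertisement positive-reaction rate and flags advertisements that fall below a threshold. The log format, the set of labels treated as positive, the threshold, and all function names are assumptions made for this example, not part of the disclosure.

```python
from collections import Counter, defaultdict


def summarize_viewing_log(log: list[tuple[str, str]]) -> dict[str, float]:
    """Return, per ad, the share of samples where the viewer's emotion was positive.

    `log` holds (ad_id, emotion_label) pairs recorded in steps 3 and 4.
    """
    positive = {"smile", "interested"}  # assumption: labels treated as positive
    counts: dict[str, Counter] = defaultdict(Counter)
    for ad_id, emotion in log:
        counts[ad_id]["total"] += 1
        if emotion in positive:
            counts[ad_id]["positive"] += 1
    return {ad: c["positive"] / c["total"] for ad, c in counts.items()}


def suggest_improvements(rates: dict[str, float], threshold: float = 0.5) -> list[str]:
    """List the ads whose positive-reaction rate fell below the threshold."""
    return [f"Revise ad '{ad}' (positive rate {rate:.0%})"
            for ad, rate in sorted(rates.items()) if rate < threshold]


log = [("A", "smile"), ("A", "bored"), ("A", "smile"), ("B", "bored"), ("B", "neutral")]
rates = summarize_viewing_log(log)
print(suggest_improvements(rates))
```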

[0562] The specific processing unit 290 transmits the result of the specific processing to the headset-type terminal 314. In the headset-type terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0563] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
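The input and output contract described above (a prompt plus inference data in, an inference result out) can be sketched as a small interface. All names here (`InferenceInput`, `DataGenerationModel`, `EchoModel`, `run_inference`) are hypothetical, and `EchoModel` is a stand-in that only reports what it received; it is not a generative model and does not represent any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class InferenceInput:
    """Prompt plus optional inference data, mirroring the description in the text."""
    prompt: str
    text: Optional[str] = None
    audio: Optional[bytes] = None
    image: Optional[bytes] = None


class DataGenerationModel(Protocol):
    """Hypothetical interface: any object with a matching infer() qualifies."""

    def infer(self, request: InferenceInput) -> str: ...


class EchoModel:
    """Stand-in that only reports what it received; not a real generative model."""

    def infer(self, request: InferenceInput) -> str:
        kinds = [name for name in ("text", "audio", "image")
                 if getattr(request, name) is not None]
        return f"instruction={request.prompt!r}; inputs={kinds}"


def run_inference(model: DataGenerationModel, request: InferenceInput) -> str:
    """Pass the prompt and inference data to the model and return its result."""
    return model.infer(request)


result = run_inference(EchoModel(), InferenceInput(prompt="Summarize", text="hello"))
print(result)
```

Expressing the model as a protocol lets the server-side and terminal-side variants of the specific processing share the same calling code.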

[0564] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset-type terminal 314.

[0565] [Fourth Embodiment]

[0566] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0567] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0568] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0569] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0570] The microphone 238 receives instructions from the user 20 in the form of speech. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0571] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0572] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0573] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
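One way to picture the control of the controlled object 443 is a table that maps an emotion to an eye-LED state and an arm motor target. The sketch below assumes such a table; the emotion labels, LED states, angles, and the names `ActuatorCommand`, `EXPRESSIONS`, and `express` are hypothetical, since the disclosure does not specify concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActuatorCommand:
    led_state: str       # illumination state of the eye LEDs
    arm_angle_deg: int   # target angle for the arm motors


# Hypothetical expression table; the disclosure does not specify concrete values.
EXPRESSIONS = {
    "joy": ActuatorCommand(led_state="bright", arm_angle_deg=60),
    "sadness": ActuatorCommand(led_state="dim", arm_angle_deg=-20),
}
NEUTRAL = ActuatorCommand(led_state="steady", arm_angle_deg=0)


def express(emotion: str) -> ActuatorCommand:
    """Look up the actuator command for an emotion, defaulting to neutral."""
    return EXPRESSIONS.get(emotion, NEUTRAL)


print(express("joy"))
```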

[0574] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0575] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0576] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0577] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0578] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0579] The presentation optimization system proposed in this invention consists of a server, a terminal, and a user working together. The server first collects past business presentation data and psychological data, and then analyzes them with the analysis means to identify customer interests and reaction patterns. Based on these analysis results, it automatically creates optimal presentation text and materials using the generation means. This allows the user to prepare presentations efficiently and effectively.

[0580] The user begins the presentation based on the generated presentation text and materials. The device captures the customer's facial expressions and voice in real time. Emotion recognition analyzes this information to determine the customer's psychological state. Then, using a presentation tool, it provides the user with suggestions for improving the presentation or new information in real time. This functionality allows the user to appropriately and instantly adjust the presentation content in response to customer reactions.

[0581] Furthermore, after the presentation ends, the server re-analyzes the collected data and generates specific improvement suggestions for the next presentation based on follow-up measures. These suggestions are provided to the user as an automatically generated report, contributing to improved sales activities and communication.

[0582] As a concrete example, when proposing a new plan for a communication product, the user approaches the customer with optimal materials generated by the server. During the meeting, the terminal presents information that will capture the customer's interest, and if it detects a lack of interest from the customer's facial expressions, it suggests an alternative approach. After the meeting, the server provides specific suggestions for improvement for the next proposal. This entire process enables presentations tailored to customer needs, thereby increasing the closing rate. This system enables communication optimized for each individual customer, enhancing customer satisfaction and contributing to improved sales efficiency.

[0583] The following describes the processing flow.

[0584] Step 1:

[0585] The server collects past business presentation data and psychological data. This includes presentation history, customer comments, and customer reaction data.

[0586] Step 2:

[0587] The server processes the collected data using analytical tools and identifies customer interests and response patterns using NLP models. This analysis reveals customer preferences and needs.

[0588] Step 3:

[0589] Based on the analysis results, the server automatically generates the optimal presentation text and materials using a generation method and presents them to the user. These materials are tailored to the customer's areas of interest and needs.

[0590] Step 4:

[0591] The user prepares to deliver the presentation based on the presentation materials provided by the server. This includes reviewing the materials and conducting a rehearsal.

[0592] Step 5:

[0593] At the start of the presentation, the device collects the customer's facial expressions and voice. The device uses its camera and microphone to capture this data in real time.

[0594] Step 6:

[0595] The device uses emotion recognition to analyze the customer's psychological state from the collected data, determining things like whether they are interested or bored.

[0596] Step 7:

[0597] The device provides users with improvement suggestions and additional information in real time through presentation methods, based on the customer's psychological state. This allows users to dynamically adjust the presentation.

[0598] Step 8:

[0599] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0600] Step 9:

[0601] The server provides users with an automatically generated report that includes improvement suggestions. This report will be used in future sales activities.
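As an illustration of Steps 8 and 9, the sketch below scans a sequence of interest scores recorded during the presentation, finds the stretches where interest stayed low, and turns them into report lines. It assumes the emotion recognition step yields one numeric interest score per sample in the range 0 to 1; that representation, the threshold, and the function names are assumptions for this example.

```python
def low_interest_segments(scores: list[float], threshold: float = 0.4) -> list[tuple[int, int]]:
    """Return (start, end) index ranges, end exclusive, where scores stay below threshold."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold and start is None:
            start = i                      # a low-interest stretch begins
        elif score >= threshold and start is not None:
            segments.append((start, i))    # the stretch ended at the previous sample
            start = None
    if start is not None:                  # stretch runs to the end of the data
        segments.append((start, len(scores)))
    return segments


def build_report(scores: list[float], threshold: float = 0.4) -> str:
    """Format one improvement suggestion per low-interest stretch."""
    lines = [f"Samples {s}-{e - 1}: interest below {threshold}; consider revising this part."
             for s, e in low_interest_segments(scores, threshold)]
    return "\n".join(lines) or "No low-interest segments detected."


scores = [0.8, 0.3, 0.2, 0.7, 0.9, 0.1]
print(build_report(scores))
```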

[0602] (Example 1)

[0603] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0604] In modern business activities, maximizing the effectiveness of information dissemination to customers is crucial. However, data-driven solutions for appropriately tailoring information to individual customer responses and using that feedback to improve future information disseminations are limited. Therefore, there is a need for a system that optimizes information based on dynamic customer responses and continuously improves it.

[0605] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0606] In this invention, the server includes information analysis means for analyzing past business information and social science data to identify customer interests and reactions, data generation means for automatically generating optimized information representations based on the analysis results, and reaction analysis means for acquiring customer facial expressions and voices in real time and analyzing their psychological state. This maximizes the effectiveness of information announcements based on dynamic reactions of each customer and enables the provision of concrete improvement suggestions for future information announcements.

[0607] "Information analysis means" refers to a function that collects past business information and social science data, and analyzes them to identify customer interests and reactions.

[0608] "Data generation means" refers to a function that automatically creates an optimized information representation based on the analysis results obtained by the information analysis means.

[0609] "Reaction analysis means" refers to a function that acquires customer facial expressions and voices in real time, analyzes them, and determines the customer's psychological state.

[0610] "Information correction means" refers to a function that, based on the customer's psychological state, provides suggestions for improving the information presentation or new information to offer.

[0611] "Improvement proposal means" refers to a function that reanalyzes data collected after an information presentation and generates specific improvement proposals for the next information presentation.

[0612] This invention is a system in which a server, a terminal, and a user collaborate to optimize presentation information. The server first acquires business information and social science data. This is done by collecting data from a database management system using a query language such as SQL. Subsequently, analysis software such as Python's Pandas and scikit-learn is used for data analysis, and machine learning techniques are applied to clarify customer behavior patterns and interests.
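A minimal sketch of the SQL-based data acquisition, using Python's built-in `sqlite3` module with an in-memory database so that it runs without external services. The table name `presentations`, its columns, and the sample rows are hypothetical. The Pandas and scikit-learn analysis mentioned in the text is omitted here to keep the example dependency-free.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE presentations (customer_id TEXT, topic TEXT, feedback_score INTEGER)"
)
conn.executemany(
    "INSERT INTO presentations VALUES (?, ?, ?)",
    [("c1", "pricing", 4), ("c1", "coverage", 5), ("c2", "pricing", 2)],
)


def fetch_history(conn: sqlite3.Connection, customer_id: str) -> list[tuple[str, int]]:
    """Return (topic, feedback_score) rows for one customer, best-rated first."""
    cur = conn.execute(
        "SELECT topic, feedback_score FROM presentations "
        "WHERE customer_id = ? ORDER BY feedback_score DESC",
        (customer_id,),  # parameter binding avoids building SQL from raw strings
    )
    return cur.fetchall()


history = fetch_history(conn, "c1")
print(history)
```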

[0613] Based on these analysis results, the server automatically generates information representations using a generative AI model. This process uses an AI model designed for natural language processing, which takes a prompt such as "Create presentation text for a specific customer segment" as input. The generated text can be used in presentation software such as Google Slides or Microsoft PowerPoint.

[0614] The user receives materials generated by the server and presents the information to the customer. During the meeting, the terminal uses a camera and microphone to capture the customer's facial expressions and voice in real time. Sentiment analysis software (for example, an API provided by a cloud provider) uses this data to evaluate the customer's psychological state. The terminal is equipped with means to modify the information based on the customer's reactions, and specific advice such as "The customer's interest is waning, let's provide additional information in the next section" is displayed during the presentation.
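The real-time advice described above could be triggered, for example, by a rolling average of interest scores. The sketch below assumes the sentiment analysis software returns a numeric interest score between 0 and 1 for each sample; the class `InterestMonitor`, the window size, and the threshold are hypothetical choices for illustration.

```python
from collections import deque
from typing import Optional

ADVICE = ("The customer's interest is waning, "
          "let's provide additional information in the next section")


class InterestMonitor:
    """Track recent interest scores and emit advice when their mean drops."""

    def __init__(self, window: int = 3, threshold: float = 0.4):
        self.scores = deque(maxlen=window)  # keeps only the latest `window` scores
        self.threshold = threshold

    def update(self, score: float) -> Optional[str]:
        """Add a score; return advice text if the rolling mean is below threshold."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return None  # not enough samples yet
        mean = sum(self.scores) / len(self.scores)
        return ADVICE if mean < self.threshold else None


monitor = InterestMonitor()
advice = [monitor.update(s) for s in (0.9, 0.5, 0.3, 0.2, 0.1)]
print(advice)
```

Averaging over a short window, rather than reacting to a single sample, is one way to avoid displaying advice in response to a momentary change in expression.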

[0615] Once the presentation is complete, the server re-analyzes the data and generates improvement suggestions that can be used for future information presentations. These suggestions are then provided to the user as an automatically generated report. This entire process enables customized information presentations for customers, resulting in higher customer satisfaction and greater efficiency in commercial activities.

[0616] For example, when proposing a new communication service plan, a prompt such as "Create text introducing a recommended communication plan for business professionals in their 30s" can be input into the generative AI model on the server, which then generates information optimized for the customer's attributes. This allows users to efficiently develop presentations that capture the customer's interest.

[0617] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0618] Step 1:

[0619] The server retrieves historical business information and social science data. This retrieval is performed by collecting data from a database using SQL queries. Queries related to specific customers or presentation topics are used as input, and the output provides historical presentation data and customer feedback that match those criteria. This makes context-specific and useful data available as the basis for processing.

[0620] Step 2:

[0621] The server analyzes the acquired data to identify customer interests and response patterns. The analysis uses the Python Pandas library and scikit-learn for data cleaning and classification. The input is the raw data obtained in step 1, and the output is specific indicators and patterns that show customer interests and past responses. This step allows us to understand what kind of information customers responded favorably to.

[0622] Step 3:

[0623] The server automatically generates information representations using a generative AI model based on the analysis results. For example, a prompt such as "Create presentation text for a specific customer segment" is input, and the generative AI model responds by generating the optimal presentation text. The output consists of text and slides tailored to customer attributes. This generated content is directly used for information dissemination to customers.

[0624] Step 4:

[0625] Users present information based on presentation materials provided by the server. Upon receiving the materials, users use presentation software to organize and present them to customers. In this step, the input is the materials from the server, and the output is the actual presentation given to the customer.

[0626] Step 5:

[0627] The device captures the customer's facial expressions and voice in real time during the presentation. Using a camera and microphone, it collects the customer's facial expressions and voice as digital data. This data is then analyzed using emotion analysis software to evaluate the customer's psychological state. The results of this evaluation are output and used as an indicator to measure the customer's interest and understanding.

[0628] Step 6:

[0629] The device provides users with real-time suggestions for improving information presentations based on sentiment analysis results. The input is the sentiment analysis results from step 5, and the output is specific advice and correction instructions. For example, it might suggest, "The customer's attention may be drifting, so let's move on to the next topic."

[0630] Step 7:

[0631] After the presentation ends, the server analyzes the collected data again and generates improvement suggestions for the next information presentation. The inputs are the reaction data obtained in step 5 and the presentation results, and the output is a report containing automatically generated improvement suggestions. This report is provided to the user and used to review future activities and information presentation strategies.

[0632] (Application Example 1)

[0633] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0634] In today's commercial environment, understanding customer interests and emotions in real time and responding quickly and accurately is key to providing effective and personalized service. However, traditional methods struggle to instantly judge human emotions and interests and appropriately adjust presentations and negotiation content. This results in missed opportunities to increase customer satisfaction and hinders commercial success.

[0635] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0636] In this invention, the server includes an analysis means for analyzing business record data and psychological analysis data to identify the subject's interests and reaction tendencies; a generation means for automatically creating optimal presentation content and information based on the analysis results; and an emotion recognition means for acquiring the subject's facial expressions and voice in real time and evaluating their psychological state. This makes it possible to instantly grasp the subject's interests and emotions during presentations and business negotiations and provide optimal information on the spot.

[0637] "Business record data" refers to a collection of data containing information about past business operations, which serves as the basis for analyzing customer interests and responses.

[0638] "Psychological analysis data" refers to a collection of data gathered to understand an individual's psychological state and reactions.

[0639] "Analysis means" refers to a device or method that extracts business record data and psychological analysis data and processes them to identify the interests and reaction tendencies of the subject.

[0640] "Generation means" refers to a device or method that automatically creates optimal presentation content and information based on the analysis results.

[0641] "Emotion recognition means" refers to a device or technology for acquiring a subject's facial expressions and voice in real time and evaluating their psychological state based on that information.

[0642] "Display means" refers to a device or technology that visually displays acquired information and clearly indicates the subject's condition or analysis results.

[0643] "Follow-up means" refers to a device or method for reanalyzing data collected after a presentation and automatically generating improvement suggestions for the next presentation.

[0644] The system of this invention consists of a server, a terminal, and a user working together. First, the server collects business record data and psychological analysis data, and uses dedicated data analysis software to identify the subject's interests and reaction tendencies. Specifically, it analyzes the data using natural language processing technology. Based on the analysis results, a generative AI model automatically creates the optimal presentation content and information. This process enables the most effective information delivery in presentations and business negotiations.

[0645] Next, the device, for example in the form of smart glasses or other portable information terminals, captures the subject's facial expressions and voice in real time. This information is analyzed using emotion recognition technology to immediately evaluate the subject's psychological state. Emotion recognition technology uses facial recognition libraries such as OpenCV or DeepFace.

[0646] Users utilize the presentation device to visually evaluate the analyzed information and adjust the presentation content in real time. This process can enhance the effectiveness of business negotiations and demonstrations.

[0647] A concrete example is a scenario introducing a new smartphone. The server analyzes the target audience's past purchase history and product reviews to identify their interest in camera features and battery life. Based on this, the generated presentation text emphasizes these features. If the audience's facial expressions, as captured by the device, indicate a lack of interest, the user quickly switches to presenting other appealing features.

[0648] An example of a prompt might be, "If the customer is not interested in the dual-camera feature, generate presentation text to highlight other features." This prompt is then fed into the generative AI model to ensure that appropriate alternative information is provided.
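A minimal sketch of how such a prompt could be produced programmatically once a lack of interest in a feature is detected. The function name `alternative_feature_prompt`, the feature list, and the exact wording of the generated prompt are assumptions for illustration; sending the prompt to the generative AI model is not shown.

```python
from typing import Optional


def alternative_feature_prompt(features: list[str], uninterested: set[str]) -> Optional[str]:
    """Build a prompt that steers generation toward features not yet rejected.

    Returns None when no alternative features remain.
    """
    remaining = [f for f in features if f not in uninterested]
    if not remaining:
        return None
    return (
        f"The customer is not interested in {', '.join(sorted(uninterested))}. "
        f"Generate presentation text to highlight other features: {', '.join(remaining)}."
    )


prompt = alternative_feature_prompt(
    ["dual camera", "battery life", "display"], {"dual camera"}
)
print(prompt)
```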

[0649] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0650] Step 1:

[0651] The server collects business record data and psychological analysis data. It uses digital data related to past business operations and psychological survey data as input. Natural language processing techniques are applied to analyze this data to identify the subjects' areas of interest and response tendencies. This process provides the server with the foundational data needed to create optimal presentations and information.

[0652] Step 2:

[0653] The server uses a generative AI model based on the analysis results to automatically create optimal presentation content and information. The analysis results obtained in Step 1 are used as input. The generative AI model generates presentation text based on the prompt text, which is then output. This generated information is used in subsequent presentations.

[0654] Step 3:

[0655] The device captures the subject's facial expressions and voice in real time. It uses a camera and microphone to acquire video and audio of the subject as input. This data is processed using emotion recognition technology to evaluate their psychological state in real time. Through this process, the device obtains immediate data on the subject's behavior and reactions.

[0656] Step 4:

[0657] The user uses a presentation device to visually evaluate the presentation content based on the analyzed information and adjust it as needed. The inputs used are the presentation content generated in step 2 and the real-time sentiment analysis results obtained in step 3. Based on this information, the user appropriately modifies the presentation or sales negotiation content to effectively convey the information to the target audience.

[0658] Step 5:

[0659] After the presentation ends, the server re-analyzes the data and automatically generates improvement suggestions for the next time. The input is the data recorded during the presentation. The same natural language processing techniques used in Step 1 are employed for this analysis. The output is a set of improvement suggestions, which serve as feedback for preparing future business negotiations and presentations.
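Step 1 of the flow above (identifying areas of interest from text data) might be approximated as follows. This is only an illustrative sketch: simple keyword counting stands in for the natural language processing technology, which the source does not specify, and the topic vocabulary `TOPIC_KEYWORDS` and the sample reviews are hypothetical.

```python
import re
from collections import Counter

# Hypothetical topic vocabulary; a real system would derive topics with an NLP model.
TOPIC_KEYWORDS = {
    "camera": {"camera", "photo", "lens"},
    "battery": {"battery", "charge", "charging"},
    "price": {"price", "cost", "cheap"},
}


def identify_interests(texts, top_n=2):
    """Return the top_n topics ranked by keyword hits across all texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            for topic, keywords in TOPIC_KEYWORDS.items():
                if word in keywords:
                    counts[topic] += 1
    return [topic for topic, _ in counts.most_common(top_n)]


reviews = [
    "The camera is great and the photo quality is sharp.",
    "Battery lasts all day, but the camera lens scratches.",
    "Fast charging is nice.",
]
print(identify_interests(reviews))  # ['camera', 'battery']
```

The ranked topics would then serve as the analysis results passed to the generative AI model in Step 2.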

[0660] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0661] The presentation optimization system of the present invention involves a server, a terminal, a user, and an emotion engine. First, the server collects past business presentation data and psychological data, and uses analytical means to identify customer interests and response patterns. This analysis utilizes a natural language processing model to clarify the customer's specific needs and interests.

[0662] Based on the analysis results, the server automatically generates optimal presentation text and materials using a generation mechanism. The user prepares based on these materials and delivers the presentation in real time via the terminal. During the presentation, the terminal uses emotion recognition to analyze the customer's facial expressions, voice, and even gaze in real time to determine the customer's psychological state.

[0663] Furthermore, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's own emotional state. This information is fed back to the user through the presentation system, helping them with self-awareness and providing guidance for carefully adjusting the presentation's progress.

[0664] Furthermore, this system uses follow-up methods to re-analyze the data collected after the presentation. Based on the analysis results, the server generates and provides specific improvement suggestions for the next presentation, utilizing user and customer sentiment data.

[0665] As a concrete example, when giving a presentation introducing a new product, the server analyzes data obtained from past success stories and provides the user with generated materials. As the user conducts the presentation while interacting with the customer, the user receives feedback on their own emotional state from the emotion engine, while the terminal monitors the customer's emotions. Based on this information, the user adjusts the content as needed to achieve more effective communication. After the presentation, the improvement suggestions obtained through follow-up become an important source of information for raising the success rate of future presentations. This system enables communication that captures the emotions of both the customer and the user, maximizing the effectiveness of the presentation.

[0666] The following describes the processing flow.

[0667] Step 1:

[0668] The server collects past business presentation data and psychological data. Specifically, it retrieves customer purchase history and past presentation feedback from the database.

[0669] Step 2:

[0670] The server processes the collected data using analytical tools, and a natural language processing model identifies customer areas of interest and response patterns. This reveals customer characteristics and needs.

[0671] Step 3:

[0672] Based on the analysis results, the server automatically generates presentation text and materials using a generation mechanism. During this process, content and visual materials that are likely to attract the customer's interest are highlighted.

[0673] Step 4:

[0674] Users prepare their presentations based on automatically generated materials. They verify the consistency of the materials and practice the flow of their presentations.

[0675] Step 5:

[0676] When the presentation starts, the device captures the audience's facial expressions, gaze, and tone of voice in real time using a camera and microphone.

[0677] Step 6:

[0678] The system analyzes data captured by the emotion recognition mechanism on the device to determine the customer's psychological state. Specifically, it identifies emotions such as interest, suspicion, and boredom.

[0679] Step 7:

[0680] The emotion engine analyzes the user's facial expressions and tone of voice in real time to recognize the user's emotional state. This allows the user to understand their own situation.

[0681] Step 8:

[0682] The device provides feedback to the user through presentation methods based on customer and user sentiment data. This allows the user to adjust the presentation accordingly.

[0683] Step 9:

[0684] After the presentation ends, the server re-analyzes the data collected during the presentation and generates improvement suggestions for the next presentation using follow-up methods.

[0685] Step 10:

[0686] The server provides the user with a report that includes suggestions for improvement. The user can use this report to prepare for their next presentation more effectively.
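Step 8 of the flow above combines the customer's and the user's emotion data into feedback for the user. One possible rule-based sketch is shown below. The customer labels "interest", "suspicion", and "boredom" come from Step 6; the user label "nervous", the function name `feedback_hint`, and all hint texts are assumptions made for illustration only.

```python
def feedback_hint(customer_emotion: str, user_emotion: str) -> str:
    """Map a (customer, user) emotion pair to a short hint for the presenter.

    The rules and wording are illustrative assumptions, not taken from the source.
    """
    if customer_emotion == "boredom":
        hint = "Switch to a different topic or feature."
    elif customer_emotion == "suspicion":
        hint = "Back up the current claim with evidence."
    elif customer_emotion == "interest":
        hint = "Stay on the current topic and add detail."
    else:
        hint = "Continue and keep observing."
    # Self-awareness feedback: append advice when the presenter appears tense.
    if user_emotion == "nervous":
        hint += " Slow down and take a breath."
    return hint


print(feedback_hint("boredom", "nervous"))
```

A lookup of this kind is easy to audit; a deployed system might instead pass both emotion states to the generative AI model as part of a prompt.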

[0687] (Example 2)

[0688] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0689] In information presentation activities, there is a challenge in effectively capturing the recipient's interest and reactions, resulting in decreased efficiency of information transmission. Furthermore, it is difficult for the presenter to understand their own emotional state and appropriately adjust their presentation accordingly. To address these challenges, it is necessary to analyze the emotional states of both the presenter and the recipient in real time and achieve optimized information presentation.

[0690] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0691] In this invention, the server includes an analysis means for analyzing past information presentation data and psychological information to identify the recipient's interests and reaction patterns; a generation means for automatically generating optimal information presentation text and materials based on the analysis results; an emotion recognition means for capturing the recipient's facial expressions, voice, and gaze in real time and analyzing their psychological state; a self-recognition feedback means for recognizing the user's facial expressions and tone of voice and presenting their own emotional state; and a follow-up means for re-analyzing the data after information presentation and generating suggestions for improvement for the next time. This maximizes the effectiveness of information presentation and enables smooth communication.

[0692] "Past information presentation data" refers to all data related to presentations and explanations given in the past, including the content of statements and materials used.

[0693] "Psychological information" refers to data on recipients' emotions and behavioral patterns, based on psychological findings and research results.

[0694] "Analysis means" refers to methods and devices for processing target data and extracting important patterns and information from it.

[0695] "Generation means" refers to a method or process for creating new information presentation text or materials based on the analysis results.

[0696] "Emotion recognition means" refers to methods and technologies that analyze and judge the emotional state of the recipient from facial expressions, voice, gaze, etc.

[0697] "Self-recognition feedback means" refers to technologies and systems that present the sender's own emotional state back to the sender and promote self-awareness.

[0698] "Follow-up means" refers to methods or devices for re-analyzing data after information has been presented and generating suggestions for future improvements.

[0699] This system optimizes presentations through the coordinated operation of a server, terminal, user, and emotion engine. The following describes a specific implementation of this system.

[0700] First, the server collects past information presentation data and psychological information, and analyzes this data. The server utilizes a natural language processing model to identify the recipient's interests and response patterns. This analysis makes it possible to determine what information is effective for the recipient. Based on the analysis results, a generative AI model then automatically generates optimal information presentation text and materials. Specifically, information about newly introduced products and services is constructed based on successful past presentation examples.

[0701] Users prepare their presentations based on the generated informational text and materials. The device plays a crucial role in the presentation itself. The device captures the audience's facial expressions, voice, and gaze in real time, and analyzes their psychological state using emotion recognition technology. This analysis allows for an immediate determination of how the audience is receiving the presentation.

[0702] In addition, the emotion engine captures the user's facial expressions and tone of voice in real time, recognizing the user's emotional state. This information is fed back to the user to help them with self-awareness and to guide them in adjusting the flow of their presentation.

[0703] After the presentation ends, the server re-analyzes the data for follow-up and generates suggestions for improvement for the next presentation. This can improve the success rate of future information presentations.

[0704] As a concrete example, when giving a presentation to introduce a new product, the following prompt can be used in the AI model: "Generate new product introduction materials based on past success stories. Focus on elements that will capture the audience's attention, and include suggestions for conducting the presentation while monitoring emotions in real time."

[0705] This system enables effective information transmission by understanding the emotions of both the user and the recipient, maximizing the impact of presentations.

[0706] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0707] Step 1:

[0708] The server collects historical information presentation data and psychological information. It retrieves relevant documents and datasets from databases and the internet as input. To analyze this information, it uses a natural language processing model to identify the recipient's areas of interest. As output, it generates a profile of the recipient's interests and response patterns.

[0709] Step 2:

[0710] Based on the analysis results obtained in Step 1, the server automatically generates optimal information presentation text and materials using a generative AI model. This process involves using prompts to instruct the generative AI model and design the flow and structure of the information. The input is the analysis result profile, and the output is specific presentation text and visual materials.

[0711] Step 3:

[0712] The user prepares their presentation based on the presentation text and materials provided by the server. The user reviews the materials, customizes them as needed, and prepares for the presentation. During the preparation process, they organize their presentation method and key points.

[0713] Step 4:

[0714] The user begins a presentation and presents information through the device. The device uses emotion recognition to capture the recipient's facial expressions, voice, and gaze in real time and analyzes their psychological state. The input is real-time data from the recipient, and this analysis allows the device to understand the recipient's shifts in interest and attention. As output, a report on the recipient's psychological state is generated.

[0715] Step 5:

[0716] The emotion engine captures the user's facial expressions and tone of voice to recognize the user's own emotional state. This information is fed back to the user through the device, providing guidance for adjusting the presentation. The input is real-time data from the user, and the output generates feedback that aids self-awareness.

[0717] Step 6:

[0718] After the presentation ends, the server uses follow-up mechanisms to re-analyze the collected data. The input is all the data acquired during the presentation, and the server performs data processing to derive optimal improvement measures. As output, specific improvement suggestions for the next information presentation are generated and provided to the user.
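The follow-up re-analysis in Step 6 can be pictured as aggregating the emotion samples recorded during the presentation and flagging weak sections. The sketch below assumes a hypothetical log format of `(section, emotion)` pairs and a hypothetical 50% boredom threshold; neither is specified in the source.

```python
from collections import Counter, defaultdict


def improvement_suggestions(log, threshold=0.5):
    """Flag sections where the share of 'boredom' samples reaches the threshold.

    log is a list of (section, emotion) samples recorded during the presentation.
    """
    per_section = defaultdict(Counter)
    for section, emotion in log:
        per_section[section][emotion] += 1
    suggestions = []
    for section, counts in per_section.items():
        share = counts["boredom"] / sum(counts.values())
        if share >= threshold:
            suggestions.append(f"Shorten or rework '{section}' ({share:.0%} boredom).")
    return suggestions


log = [
    ("intro", "interest"), ("intro", "interest"),
    ("specs", "boredom"), ("specs", "boredom"), ("specs", "interest"),
    ("pricing", "suspicion"),
]
print(improvement_suggestions(log))  # ["Shorten or rework 'specs' (67% boredom)."]
```

In the described system the aggregated figures would more likely be handed to the generative AI model to phrase the suggestions; the fixed template here only keeps the example self-contained.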

[0719] (Application Example 2)

[0720] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0721] In modern advertising, there is a need to understand the emotions of individual viewers in real time and present the most suitable advertisements that match their psychological state. However, conventional advertising systems have the challenge of not being able to effectively capture viewers' emotions and reactions and instantly generate corresponding advertisement content. Therefore, new methods are needed to maximize the effectiveness of advertising.

[0722] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0723] In this invention, the server includes an analysis means for analyzing past data and psychological data to identify an individual's interests and reaction patterns, a generation means for automatically generating optimal content based on the analysis results, and an emotion recognition means for capturing an individual's facial expressions and voice in real time and analyzing their psychological state. This makes it possible to present dynamic advertising content that responds to the viewer's emotional state.

[0724] "Analysis means" refers to a technique that analyzes past data and psychological data to identify an individual's interests and reaction patterns.

[0725] "Generation means" refers to a technology that automatically creates optimal content based on the results obtained by the analysis means.

[0726] "Emotion recognition means" refers to a technique that collects an individual's facial expressions and voice in real time and analyzes their psychological state from that data.

[0727] "Presentation means" refers to a technology that dynamically displays different information according to an individual's psychological state.

[0728] "Follow-up means" refers to a technology that re-analyzes data collected after viewing and generates suggestions for improvement for the next time.

[0729] The system for implementing the present invention aims to optimize ad delivery and mainly includes analysis means, generation means, emotion recognition means, presentation means, and follow-up means. These means are specifically processed on the server and the user's terminal.

[0730] The server processes historical and psychological data using the analysis means to identify viewer interests and reaction patterns. This analysis utilizes natural language processing technology, with OpenCV as an example of facial recognition software and Google's speech recognition API as an example of voice analysis software. Based on the analysis results, the server uses a generative AI model (e.g., Hugging Face's Transformers) to automatically generate advertising content optimized for the viewer.

[0731] On the user's device, personal facial expressions and voice data collected from the smartphone's camera and microphone are analyzed in real time using emotion recognition technology to determine their psychological state. To maintain viewer interest, the presentation system displays different information on the fly and adjusts the advertising content accordingly. This information is dynamically updated based on the viewer's facial expressions and voice tone while watching the video.

[0732] After the presentation ends, the server uses follow-up methods to re-analyze the viewing data and generate improvement suggestions to help with the next advertising campaign. For example, if a user smiles while watching the video, content such as "Using this product will make your everyday life more enjoyable. Click here for more details!" will be displayed.

[0733] This system enables ad delivery that resonates with viewers' emotions, maximizing advertising effectiveness.

[0734] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0735] Step 1:

[0736] The server collects historical and psychological data. The input consists of business presentation data and datasets based on psychological research. These are analyzed using natural language processing techniques to identify audience interest and response patterns. The analysis identifies which topics attract interest, and this information is used as input for the next step.

[0737] Step 2:

[0738] The server automatically generates appropriate advertising content using a generative AI model based on insights obtained from the analysis. The input is the interest and response patterns obtained in step 1, which the generative AI model processes to output broadcastable advertising text and visuals. The generated content is optimized to easily attract the viewer's interest.

[0739] Step 3:

[0740] The user's device uses a camera and microphone to capture the viewer's facial expressions and voice in real time. The input consists of audio and image data, which are analyzed by emotion recognition systems to determine the viewer's psychological state. The output is real-time emotion data from the viewer, which is used to appropriately adjust the displayed advertisements.

[0741] Step 4:

[0742] The device uses presentation tools to display the most suitable advertisement content to the viewer based on the emotion recognition results. The input is the emotion data obtained in step 3, and based on this data, it selects and dynamically displays different advertisement content. The output is the customized advertisement presented to the viewer.

[0743] Step 5:

[0744] After the presentation, the server re-analyzes the viewing data and uses follow-up methods to generate suggestions for improving the next advertising campaign. The input is the viewing data collected in steps 3 and 4, which is analyzed to evaluate its effectiveness and output suggestions for the next campaign.
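Step 4 of the flow above selects advertisement content according to the recognized emotion. A minimal sketch follows, assuming a hypothetical catalogue keyed by emotion label. Only the "smile" text is taken from the example given earlier in this application example; the other labels and texts are invented for illustration.

```python
# Hypothetical catalogue: emotion label -> advertisement text.
# Only the "smile" entry comes from the example in the description.
AD_VARIANTS = {
    "smile": "Using this product will make your everyday life more enjoyable. "
             "Click here for more details!",
    "neutral": "See what this product can do in 30 seconds.",
    "bored": "Skip ahead: here is the one feature people like most.",
}


def select_ad(emotion: str, default: str = "neutral") -> str:
    """Return the ad text for the recognized emotion, falling back to the default variant."""
    return AD_VARIANTS.get(emotion, AD_VARIANTS[default])


print(select_ad("smile"))
```

The fallback matters in practice because an emotion recognizer may emit labels that the catalogue does not cover.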

[0745] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing unit 12. In the data processing unit 12, the specific processing unit 290 acquires the audio data.

[0746] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0747] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0748] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0749] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0750] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0751] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion lies, the more visible it becomes (that is, the more it is expressed in actions).
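The concentric layout described for the emotion map can be modeled with polar coordinates, where the radius expresses how far an emotion lies from the center and the angle expresses its direction (3 o'clock at 0 degrees, the "pleasure" side at 90 degrees). The sketch below is illustrative only: every coordinate in `EMOTION_MAP` is a hypothetical placement, not a value taken from Figure 9.

```python
import math

# Hypothetical placements (radius in 0..1, angle in degrees; 0 = right, 90 = up).
EMOTION_MAP = {
    "reassured": (0.4, 10.0),
    "calm": (0.4, 0.0),
    "anxious": (0.4, -15.0),
    "joy": (0.6, 90.0),
    "displeasure": (0.6, 270.0),
}


def to_xy(radius, angle_deg):
    """Convert a polar placement on the map to Cartesian coordinates."""
    angle = math.radians(angle_deg)
    return radius * math.cos(angle), radius * math.sin(angle)


def map_distance(a, b):
    """Euclidean distance between two emotions on the map."""
    (x1, y1), (x2, y2) = to_xy(*EMOTION_MAP[a]), to_xy(*EMOTION_MAP[b])
    return math.hypot(x2 - x1, y2 - y1)


def nearest(emotion):
    """Return the emotion placed closest to the given one."""
    others = (name for name in EMOTION_MAP if name != emotion)
    return min(others, key=lambda name: map_distance(emotion, name))


print(nearest("calm"))  # reassured
```

With placements of this kind, the statement that emotions likely to occur together are mapped close to each other becomes a simple distance comparison.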

[0752] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map ("Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion", Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
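The idea that pleasure and discomfort follow from how far a balance deviates from its ideal can be expressed as a simple score. The following sketch is an assumption-laden illustration: the linear form, the function name `comfort_score`, and the battery figures are hypothetical and do not appear in the source.

```python
def comfort_score(value: float, ideal: float, tolerance: float) -> float:
    """Score a balance in [-1, 1]: positive means pleasure, negative means discomfort.

    The score is 1 at the ideal value, 0 at a deviation of `tolerance`,
    and is clamped at -1 for deviations of 2 * tolerance or more.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    deviation = abs(value - ideal)
    return max(-1.0, 1.0 - deviation / tolerance)


# Hypothetical robot battery level in percent, with an assumed ideal of 80%.
print(comfort_score(65, ideal=80, tolerance=30))  # 0.5
print(comfort_score(20, ideal=80, tolerance=30))  # -1.0
```

Several such scores (posture, battery level, and so on) could be combined, for example by averaging, to place the robot's state on the pleasure and displeasure axis of the emotion map.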

[0753] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0754] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
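Once the neural network has produced an emotion value for each emotion on the map, the user's emotion must be determined from those values. The source does not state the selection rule; the sketch below assumes the simplest one (take the emotion with the highest value). The emotion values shown are made-up numbers that mimic the Figure 10 example, in which "reassured," "calm," and "confident" have similar values.

```python
def determine_emotion(emotion_values: dict[str, float]) -> str:
    """Return the emotion with the highest value (assumed selection rule)."""
    if not emotion_values:
        raise ValueError("emotion_values must not be empty")
    return max(emotion_values, key=emotion_values.get)


# Hypothetical network output: neighbouring emotions receive similar values.
values = {"reassured": 0.81, "calm": 0.78, "confident": 0.75, "anxious": 0.05}
print(determine_emotion(values))  # reassured
```

Because neighbouring emotions receive similar values, a downstream consumer might also look at the top few emotions rather than only the single maximum.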

[0755] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0756] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0757] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.

[0758] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0759] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0760] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0761] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0762] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0763] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0764] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0765] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0766] The following is further disclosed regarding the embodiments described above.

[0767] (Claim 1)

[0768] An analysis means that analyzes past business presentation data and psychological data to identify customer interests and reaction patterns,

[0769] A generation means that automatically generates optimal presentation text and materials based on the analysis results,

[0770] An emotion recognition means that captures the customer's facial expressions and voice in real time and analyzes the customer's psychological state,

[0771] A presentation means that provides suggestions for improving the presentation and for presenting information based on the customer's psychological state,

[0772] A follow-up means that reanalyzes data after the presentation and generates improvement suggestions for the next time,

[0773] A system that includes the above means.
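The relationship between the five means recited above can be sketched in code. The following is a minimal illustration only, not the implementation of this disclosure: the class `PresentationSystem`, its method names, and all `toy_*` functions are hypothetical stand-ins introduced here, and the toy logic (word counts, a smile score threshold) is an assumption made purely so the sketch runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PresentationSystem:
    # Each callable stands in for one "means" of claim 1 (hypothetical names).
    analyze: Callable[[List[str], List[str]], Dict[str, int]]   # analysis means
    generate: Callable[[Dict[str, int]], str]                   # generation means
    recognize: Callable[[Dict[str, float]], str]                # emotion recognition means
    suggest: Callable[[str], str]                               # presentation means
    follow_up: Callable[[List[str]], List[str]]                 # follow-up means
    observed_states: List[str] = field(default_factory=list)

    def prepare(self, past_presentations: List[str], psychological_data: List[str]) -> str:
        """Before the presentation: analyze past data, then generate material."""
        interests = self.analyze(past_presentations, psychological_data)
        return self.generate(interests)

    def on_observation(self, observation: Dict[str, float]) -> str:
        """During the presentation: estimate the state and return a suggestion."""
        state = self.recognize(observation)
        self.observed_states.append(state)
        return self.suggest(state)

    def finish(self) -> List[str]:
        """After the presentation: reanalyze the recorded states."""
        return self.follow_up(self.observed_states)


# Toy stand-ins (assumptions for illustration only).
def toy_analyze(presentations: List[str], psychological_data: List[str]) -> Dict[str, int]:
    # Counts words in past presentations; psychological_data is ignored here.
    counts: Dict[str, int] = {}
    for text in presentations:
        for word in text.lower().split():
            counts[word] = counts.get(word, 0) + 1
    return counts


def toy_generate(interests: Dict[str, int]) -> str:
    top = max(interests, key=interests.get)
    return f"Proposed opening topic: {top}"


def toy_recognize(observation: Dict[str, float]) -> str:
    return "interested" if observation.get("smile", 0.0) >= 0.5 else "uncertain"


def toy_suggest(state: str) -> str:
    return "continue" if state == "interested" else "add a concrete example"


def toy_follow_up(states: List[str]) -> List[str]:
    n = states.count("uncertain")
    return [f"Revisit {n} segment(s) where the customer seemed uncertain"] if n else []


system = PresentationSystem(toy_analyze, toy_generate, toy_recognize, toy_suggest, toy_follow_up)
```

In this sketch the three methods correspond to the phases before, during, and after a presentation, so each means can be replaced independently, for example by swapping `toy_recognize` for a real recognizer.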

[0774] (Claim 2)

[0775] The system according to claim 1, wherein the analysis means uses a natural language processing model to analyze the customer's areas of interest.
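As a rough illustration of analyzing a customer's areas of interest from text, the sketch below ranks terms by frequency across past utterances. This is a deliberately simple stand-in and not the natural language processing model referred to in the claim, which the text does not specify; the function name `areas_of_interest`, the stopword list, and the sample utterances are all assumptions.

```python
import re
from collections import Counter
from typing import Iterable, List

# Small illustrative stopword list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "is", "we", "our",
             "about", "for", "in", "on", "what", "how"}


def areas_of_interest(utterances: Iterable[str], top_n: int = 3) -> List[str]:
    """Return the top_n most frequent non-stopword terms in the utterances."""
    counts: Counter = Counter()
    for text in utterances:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]


past_utterances = [
    "What is the pricing for support?",
    "Pricing and security",
    "How is security handled in the pricing?",
]
top_terms = areas_of_interest(past_utterances, top_n=2)
```

A production system would more plausibly use a trained language model for this step; the frequency count only shows where such a model would sit in the data flow.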

[0776] (Claim 3)

[0777] The system according to claim 1, wherein the emotion recognition means analyzes the customer's facial expressions, gaze, and tone of voice to determine their psychological state.
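One way to picture combining facial expression, gaze, and tone of voice into a single psychological state is a weighted score with thresholds. The sketch below assumes that upstream recognizers have already reduced each modality to a score between 0.0 and 1.0; the `Observation` type, the weights, the thresholds, and the state labels are hypothetical and are not taken from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    expression: float  # 0.0 (negative) to 1.0 (positive) facial expression score
    gaze: float        # fraction of time the customer looks at the material
    tone: float        # 0.0 (negative) to 1.0 (positive) tone-of-voice score


# Illustrative weights (assumed); they sum to 1.0 so the score stays in [0, 1].
WEIGHTS = {"expression": 0.5, "gaze": 0.2, "tone": 0.3}


def psychological_state(obs: Observation) -> str:
    """Fuse the three modality scores and map the result to a coarse label."""
    score = (WEIGHTS["expression"] * obs.expression
             + WEIGHTS["gaze"] * obs.gaze
             + WEIGHTS["tone"] * obs.tone)
    if score >= 0.66:
        return "engaged"
    if score >= 0.33:
        return "neutral"
    return "disengaged"
```

For example, `psychological_state(Observation(0.9, 0.8, 0.9))` falls in the top band, while uniformly low scores fall in the bottom band. A fixed weighted sum is only one possible fusion rule; a learned classifier over the same inputs would fit the claim language equally well.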

[0778] "Example 1"

[0779] (Claim 1)

[0780] An information analysis means that analyzes past business information and social science data to identify customer interests and reactions,

[0781] A data generation means that automatically generates an optimized information representation based on the analysis results,

[0782] A reaction analysis means that acquires the customer's facial expressions and voice in real time and analyzes the customer's psychological state,

[0783] An information modification means that provides suggestions for improving information presentation and information provision according to the customer's psychological state,

[0784] An improvement suggestion means that reanalyzes data after the information is released and generates improvement suggestions for the next time,

[0785] A system that includes the above means.

[0786] (Claim 2)

[0787] The system according to claim 1, wherein the information analysis means uses natural language processing technology to analyze the customer's areas of interest.

[0788] (Claim 3)

[0789] The system according to claim 1, wherein the reaction analysis means analyzes the customer's facial features, gaze, and tone of voice to determine their psychological state.

[0790] "Application Example 1"

[0791] (Claim 1)

[0792] An analysis means that analyzes work record data and psychological analysis data to identify the interests and reaction tendencies of the subject,

[0793] A generation means that automatically creates optimal presentation content and information based on the analysis results,

[0794] An emotion recognition means that acquires the subject's facial expressions and voice in real time and evaluates the subject's psychological state,

[0795] A presentation means that provides suggestions for improving the content or updating the information based on the psychological state of the subject,

[0796] A follow-up means that re-evaluates the data after the presentation and generates improvement suggestions for the next time,

[0797] A display means for visually evaluating the condition of the subject using a display device,

[0798] A system that includes the above means.

[0799] (Claim 2)

[0800] The system according to claim 1, wherein the analysis means uses natural language processing technology to analyze the subject's interests.

[0801] (Claim 3)

[0802] The system according to claim 1, wherein the emotion recognition means analyzes the subject's facial expression, gaze, and tone of voice to determine their psychological state.

[0803] "Example 2 of combining an emotion engine"

[0804] (Claim 1)

[0805] An analysis means that analyzes past information presentation data and psychological information to identify the recipient's interests and response patterns,

[0806] A generation means that automatically generates optimal information presentation text and materials based on the analysis results,

[0807] An emotion recognition means that captures the recipient's facial expressions, voice, and gaze in real time and analyzes the recipient's psychological state,

[0808] A self-awareness feedback means that recognizes the user's facial expressions and tone of voice and presents the user's own emotional state,

[0809] A follow-up means that reanalyzes the data after the information is presented and generates improvement suggestions for the next time,

[0810] A system that includes the above means.

[0811] (Claim 2)

[0812] The system according to claim 1, wherein the analysis means uses a natural language processing model to analyze the recipient's domain of interest.

[0813] (Claim 3)

[0814] The system according to claim 1, wherein the emotion recognition means analyzes the recipient's facial expression, gaze, and tone of voice to determine their psychological state.

[0815] "Application Example 2 of combining an emotion engine"

[0816] (Claim 1)

[0817] An analysis means that analyzes past data and psychological data to identify an individual's interests and response patterns,

[0818] A generation means that automatically generates optimal content based on the analysis results,

[0819] An emotion recognition means that captures the individual's facial expressions and voice in real time and analyzes the individual's psychological state,

[0820] A presentation means that presents different information on the spot,

[0821] A follow-up means that reanalyzes viewing data and generates improvement suggestions for the next time,

[0822] A system that includes the above means.

[0823] (Claim 2)

[0824] The system according to claim 1, wherein the analysis means uses natural language processing technology to analyze an individual's areas of interest.

[0825] (Claim 3)

[0826] The system according to claim 1, wherein the emotion recognition means analyzes an individual's facial expressions, gaze, and tone of voice to determine their psychological state.

[Explanation of symbols]

[0827] 10, 210, 310, 410 Data Processing Systems
12 Data Processing Devices
14 Smart Devices
214 Smart Glasses
314 Headset-type terminal
414 Robots

Claims

1. A system including:
an analysis means that analyzes past business presentation data and psychological data to identify customer interests and reaction patterns,
a generation means that automatically generates optimal presentation text and materials based on the analysis results,
an emotion recognition means that captures the customer's facial expressions and voice in real time and analyzes the customer's psychological state,
a presentation means that provides suggestions for improving the presentation and for presenting information based on the customer's psychological state, and
a follow-up means that reanalyzes data after the presentation and generates improvement suggestions for the next time.

2. The system according to claim 1, wherein the analysis means uses a natural language processing model to analyze the customer's areas of interest.

3. The system according to claim 1, wherein the emotion recognition means analyzes the customer's facial expressions, gaze, and tone of voice to determine their psychological state.