system

The system addresses inefficiencies in creative production by using a generative model that automatically generates and refines content based on feedback, ensuring high-quality output and adaptability across various devices.

JP2026096624A · Pending · Publication Date: 2026-06-15 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: SOFTBANK GROUP CORP
Filing Date: 2024-12-03
Publication Date: 2026-06-15

Application Information

IPC: G06Q50/10; G06Q30/0241; G06Q30/0251

AI Tagging

Application Domain

Commerce

AI Technical Summary

Technical Problem

Conventional creative production processes are inefficient, requiring significant time and effort, and struggle to incorporate feedback effectively, leading to difficulties in maintaining quality and consistency, especially in resource-constrained situations.

Method used

A system utilizing a generative model that automatically generates content, collects evaluation information, and adjusts its parameters based on feedback, operating in a cloud environment to enable efficient and remote access for multiple devices.

Benefits of technology

Enables high-quality content generation with continuous improvement, adapting to user feedback and emotional states, and supports flexible, efficient creative production across diverse devices.

✦ Generated by Eureka AI based on patent content.

Abstract

[Problem] To provide a system. [Solution] A system including: means for automatically generating content using a generative model; means for collecting evaluation information on the generated content; and means for adjusting the generative model based on the evaluation information.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, and includes steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance as a response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Conventional creative production processes have required a great deal of time and effort, making it difficult to create content efficiently. The process of quickly collecting feedback on created content and reflecting it in the next production has also been complicated. As a result, it has been difficult to continuously improve the quality and consistency of the output, and there has been concern about a decline in competitiveness, especially in situations where resources are limited.

Means for Solving the Problems

[0005] This invention provides a system that automatically generates content using a generative model. This system includes means for collecting evaluation information on the generated content and allows for adjustment of the generative model based on this evaluation information. This enables effective incorporation of feedback obtained during the content creation process, allowing for the provision of higher-quality content in subsequent productions. Furthermore, by targeting both visual design and text and operating in a cloud environment, it enables access from multiple information processing devices, realizing an efficient remote work environment.

[0006] A "generative model" is an artificial intelligence algorithm that learns from data to create new content.

[0007] "Content" refers to a collection of information and materials, including visual design and text.

[0008] "Evaluation information" refers to data regarding the quality and areas for improvement of the generated content.

[0009] "Means of collection" refers to methods or devices used to gather specific information.

[0010] "Means of adjustment" refers to methods or devices that optimize performance by changing the parameters or settings of a generative model.

[0011] A "system" is a complex of different elements working together to perform a function.

[0012] "Visual design" refers to the layout and composition used to convey information using visual elements.

[0013] "Text" refers to a written or printed expression in language.

[0014] A "cloud environment" is an infrastructure that provides computing resources via the internet.

[0015] An "information processing device" is a device for inputting, processing, and outputting data.

Brief Description of Drawings

[0016]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map to which a plurality of emotions are mapped.
[Figure 10] An emotion map to which a plurality of emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.

Best Mode for Carrying Out the Invention

[0017] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0018] First, the terms used in the following description will be explained.

[0019] In the following embodiments, a processor with a reference numeral (hereinafter simply referred to as "processor") may be one arithmetic unit or a combination of a plurality of arithmetic units. Also, the processor may be one type of arithmetic unit or a combination of a plurality of types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0020] In the following embodiments, a RAM (Random Access Memory) with a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.

[0021] In the following embodiments, a storage with a reference numeral is one or more non-volatile storage devices that store various programs and various parameters, etc. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), or magnetic tapes, and the like.

[0022] In the following embodiments, a communication interface (I / F) with a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), or Bluetooth (registered trademark).

[0023] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."

[0024] [First Embodiment]

[0025] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0026] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0027] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0028] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0029] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0030] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0031] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0032] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0033] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0034] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0035] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0036] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0037] This invention relates to an automated content generation system using a generative model, and mainly consists of a server, a terminal, and a user. The server automatically generates content using the generative model based on requirements provided by the user through the terminal. The generated content includes visual design and text content. The server collects user evaluation information on the generated content via the terminal.

[0038] Users specify themes, objectives, and target audiences that suit their needs to the server and request content generation. For example, if advertising materials are needed, the user would specify "promotional advertising for a new product," and the server would generate relevant images and text based on the specified requirements.

[0039] The terminal presents the generated content to the user and collects feedback. This feedback includes suggestions for improvements regarding the design's color scheme, text style, and content structure.

[0040] The server adjusts the generative model based on the collected evaluation information and reflects the adjustment in the next content generation. This enables continuous quality improvement. Furthermore, because this system operates in a cloud environment, it can be accessed from different information processing devices and can be used effectively even in remote work environments.

[0041] For example, when a user requests the creation of a "summer campaign poster," the server automatically generates a poster that reflects seasonal colors and graphics, and also suggests a catchy slogan. Furthermore, it can be further improved based on user feedback. This process makes it possible to achieve both increased efficiency and improved quality in creative production.

[0042] The following describes the processing flow.

[0043] Step 1:

[0044] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0045] Step 2:

[0046] The server analyzes the requirements received from the terminal and uses this to determine the parameters of the generative model. During this process, it selects appropriate design templates and writing styles.

[0047] Step 3:

[0048] The server operates a generative model based on the analysis results to generate content. For visual designs, it selects relevant images from a material library and applies them to a template. For text, it creates sentences in the specified tone.

[0049] Step 4:

[0050] The server sends the generated content to the terminal and prompts the user for review. The user reviews the content on the terminal and provides feedback with any requested revisions.

[0051] Step 5:

[0052] The terminal collects user feedback and sends it to the server. The collected feedback includes suggestions for improvement and specific points to note.

[0053] Step 6:

[0054] The server analyzes the feedback information and uses it to adjust the generative model. This analysis clarifies areas for improvement to enhance the quality of content generated in the future.

[0055] Step 7:

[0056] The server hosts the adjusted generative model in the cloud environment and prepares for the next content generation request, at which point the process starts again from Step 1.
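
The generate, evaluate, and adjust cycle in Steps 1 to 7 can be illustrated with a minimal Python sketch. The disclosure does not specify how the generative model is parameterized or adjusted, so everything here is an assumption made for illustration: the `Requirements` and `GenerativeModel` classes, the two style weights, and the fixed adjustment rate are hypothetical stand-ins, not the method of the application.

```python
from dataclasses import dataclass, field


@dataclass
class Requirements:
    """Project requirements entered at the terminal (Step 1)."""
    theme: str
    purpose: str
    audience: str
    tone: str


@dataclass
class GenerativeModel:
    """Hypothetical stand-in: the 'parameters' are two style weights in [0, 1]."""
    style_weights: dict = field(
        default_factory=lambda: {"bright_colors": 0.5, "formal_tone": 0.5}
    )

    def generate(self, req: Requirements) -> dict:
        # Steps 2-3: derive style choices from the parameters and produce content.
        palette = "bright" if self.style_weights["bright_colors"] >= 0.5 else "muted"
        register = "formal" if self.style_weights["formal_tone"] >= 0.5 else "casual"
        return {
            "visual": f"{palette} poster layout for {req.theme}",
            "text": f"{register} {req.tone} copy aimed at {req.audience}",
        }

    def adjust(self, feedback: dict, rate: float = 0.2) -> None:
        # Step 6: feedback maps a weight name to +1 (more) or -1 (less).
        for name, direction in feedback.items():
            if name in self.style_weights:
                updated = self.style_weights[name] + rate * direction
                # Keep each weight inside [0, 1].
                self.style_weights[name] = min(1.0, max(0.0, updated))


if __name__ == "__main__":
    model = GenerativeModel()
    req = Requirements("summer campaign", "promotion", "young people", "playful")
    print(model.generate(req))            # first draft (Steps 3-4)
    model.adjust({"bright_colors": -1})   # user asked for less bright colors (Steps 5-6)
    print(model.generate(req))            # next request uses the adjusted model (Step 7)
```

In this sketch the adjusted weights persist on the model object, which loosely mirrors the idea that feedback from one production carries over into the next.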

[0057] (Example 1)

[0058] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0059] Automated content generation systems are required to efficiently generate high-quality visual and textual information based on specific user requests. However, conventional systems have struggled to improve the quality of generated content and flexibly respond to diverse user requests. Furthermore, they lacked sufficient flexibility for remote access.

[0060] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0061] In this invention, the server includes means for automatically generating content based on requests received from users using a generative model, means for presenting the generated content on an information display device and collecting evaluation information from users, means for modifying the generative model based on the collected evaluation information and reflecting the changes in subsequent generation, and means for operating in a cloud environment and enabling access from multiple information display devices. This enables the efficient provision of high-quality content that meets the diverse needs of users, as well as flexible use through remote access.

[0062] A "generative model" is a general term for algorithms and technologies used to automatically generate content based on user requests.

[0063] "Content" refers to creative works that combine visual art and textual information, thereby providing information that meets the user's needs.

[0064] An "information display device" refers to a terminal or device that displays content generated from a server, allowing users to recognize it and input their evaluations.

[0065] "Evaluation information" refers to feedback that users provide on generated content, which is used to improve content quality and modify the generation model.

[0066] A "cloud environment" is a distributed information processing infrastructure that provides computing resources and storage via the internet, enabling flexible access.

[0067] "Information processing equipment" refers to electronic devices such as computers, terminals, and servers that process, store, and communicate data.

[0068] A "prompt statement" is text input into a generative model, and its role is to provide instructions for generating appropriate content in response to a request.

[0069] This invention relates to an automated content generation system using a generative AI model. The system mainly consists of a server, terminals, and users.

[0070] The server generates content based on user requests using a generative model. This generation process utilizes software for natural language processing (e.g., the GPT model). On the server, an algorithm, corresponding to the requested prompt, processes the data and automatically generates content that combines visual art and textual information. For hardware, high-performance computing resources running on a cloud infrastructure (e.g., virtual servers from a cloud provider) are used.

[0071] The terminal plays the role of presenting the generated content to the user. Specifically, it displays the content in a viewable format through the user's browser or dedicated application. The terminal also provides an interface for collecting user feedback and evaluation information.

[0072] Users input requests to the server via their device to generate content tailored to their needs. An example of a prompt might be, "Create a visually appealing poster with bright colors for a summer campaign. The target audience is young people, and the theme is the beach." This provides specific instructions to the generative model, enabling the creation of content suitable for the user's objectives.

[0073] The server also has the functionality to improve the generative model based on the collected evaluation information and reflect that improvement in subsequent content generation. This enables continuous quality improvement and adjustments to meet user requests.

[0074] Because this invention operates in a cloud environment, it can be accessed from multiple information processing devices and can be effectively used in remote work. Therefore, this system provides users with a convenient and flexible content generation platform.

[0075] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0076] Step 1:

[0077] The user enters a request for the content they want to generate via their device. Specifically, they enter prompt text into a form in their browser or a dedicated application. This input is in text format, for example, "Please create a poster for a summer campaign. The target audience is young people, and the theme is the beach." The entered data is sent to the server by the device.

[0078] Step 2:

[0079] The server receives a user request. Based on this, the server constructs internal data to be passed to the generative model. Specifically, it parses the prompt text and extracts the necessary keywords and concepts. This prepares appropriate instructions for the generative AI model. The input is the received request, and the output is prompt data for the generative model.

[0080] Step 3:

[0081] The server generates content using a generative AI model. Prompt data is input to the generative model, and the algorithm processes the data to generate visual designs and text structures. As a result of the processing, image files and text data are generated. The input is prompt data, and the output is content data.

[0082] Step 4:

[0083] The terminal presents the generated content to the user. Specifically, it displays the generated images and text data on the screen for the user to review. The input is the generated content received from the server, and the output is what is displayed to the user.

[0084] Step 5:

[0085] Users evaluate the presented content and input feedback into their devices. Specifically, they write their opinions on the appearance, style, and content in text and send it to the server via their devices. The input is evaluation data from the user, and the output is sent to the server.

[0086] Step 6:

[0087] The server collects user feedback and uses it to improve the generative model. This feedback is analyzed to adjust the model's parameters, which are then used to improve future generation. This involves an optimization process for the generative model. The input is the feedback, and the output is the adjusted model parameters.
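
Step 2 above describes parsing the prompt text and extracting keywords to build prompt data for the generative model. The disclosure does not say how this parsing is done, so the following is only a sketch under stated assumptions: the `PromptData` type, the `build_prompt_data` function, and the small stop-word list are all hypothetical, and a real system would likely use a proper language-processing component instead of this simple filter.

```python
import re
from dataclasses import dataclass

# Assumed, illustrative stop-word list; not taken from the disclosure.
STOP_WORDS = {
    "a", "an", "the", "for", "is", "are", "and",
    "please", "create", "with", "of", "to",
}


@dataclass
class PromptData:
    """Internal data passed to the generative model (output of Step 2)."""
    raw_text: str
    keywords: list


def build_prompt_data(request_text: str) -> PromptData:
    """Lower-case the request, drop stop words, and keep unique keywords in order."""
    words = re.findall(r"[a-z]+", request_text.lower())
    keywords = []
    for word in words:
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    return PromptData(raw_text=request_text, keywords=keywords)


if __name__ == "__main__":
    request = (
        "Please create a poster for a summer campaign. "
        "The target audience is young people, and the theme is the beach."
    )
    print(build_prompt_data(request).keywords)
```

Keeping the raw text alongside the extracted keywords means the original request is still available to the model even if the keyword filter drops something relevant.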

[0088] (Application Example 1)

[0089] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0090] In modern advertising production, it's necessary to create high-quality advertising materials quickly and effectively to appeal to the target audience, and to improve their effectiveness in real time. However, traditional methods are time-consuming and require manual trial and error, making improvements in production efficiency and effectiveness measurement a challenge.

[0091] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0092] In this invention, the server includes means for automatically generating advertising materials using a generative model, means for specifying the content of an advertising campaign based on requirements entered by the user through a terminal, and means for collecting feedback information on the generated advertising materials. This enables users to automatically generate high-quality advertising materials in a short time and to improve the advertising content based on feedback.

[0093] A "generative model" is an artificial intelligence algorithm that automatically generates advertising materials based on requirements provided by the user.

[0094] "Advertising materials" refer to content, including visual designs and written expressions, used for promotional purposes.

[0095] "Requirements" refer to information that indicates the target content and design guidelines for an advertising campaign.

[0096] "Feedback information" refers to information that indicates users' evaluations of the generated advertising materials and their requests for improvement.

[0097] "Cloud infrastructure" refers to a virtual computing resource and storage environment where servers are provided over the internet.

[0098] The system implementing this invention is centered around a server running on a cloud infrastructure. Users can input advertising campaign requirements into the server using a device such as a smartphone. The server automatically generates advertising materials based on the specified requirements using a generative AI model. Visual design and text are included as components in this process. The generated advertising materials are displayed on the user's device, and after reviewing them, the user can send feedback with suggestions for improvement or requests for changes.

[0099] The server adjusts the generative model based on this feedback information and applies the adjustment to subsequent ad generation. This improves the overall ad production capabilities of the system. Specifically, based on prompts such as "Create a summer campaign ad for young people, use a lot of blue in the color palette, and use modern fonts," effective ads for the target audience are generated.

[0100] The hardware used in this process includes the user's device (such as a smartphone or tablet). On the software side, a generative AI model runs on a server, along with a program to control and manage it. This allows users to quickly create advertising materials and improve their quality through real-time feedback.

[0101] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0102] Step 1:

[0103] The user enters the requirements for the advertising campaign using a device such as a smartphone. As a prompt, they send specific instructions to the server, such as, "Create a summer campaign ad targeting young people, use a color palette heavily featuring blue, and use a modern font." The input in this step is the prompt provided by the user, and the output is the requirements data sent to the server.

[0104] Step 2:

[0105] The server executes a generative AI model based on the received requirements data. This model automatically generates advertising materials based on input prompts. Here, it incorporates the specified color palette and font style to construct the visual design and catchphrase. The output obtained through the generation process is advertising material tailored to the user's requirements.

[0106] Step 3:

[0107] The generated advertising material is displayed on the user's device. The user reviews the advertising material and provides feedback on the visual design and wording. Through the input of feedback, specific revision requests, such as color adjustments or font changes, are sent to the server. The output of Step 3 is the feedback information.

[0108] Step 4:

[0109] The server analyzes the collected feedback information and adjusts the generative AI model. Specifically, it tunes model parameters and improves the algorithm to reflect user feedback. This adjustment aims to produce higher quality results in subsequent ad generation. The resulting output represents the new state of the adjusted model.
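
The requirements data (Step 1) and feedback information (Step 3) exchanged between the terminal and the server could be represented as simple structured messages. The sketch below is an assumption for illustration only: the field names, the JSON envelope, and the `apply_revisions` helper are hypothetical and are not defined in the disclosure. It also stands in for model adjustment by carrying revision requests into the next set of requirements, which is a much simpler technique than tuning model parameters.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AdRequirements:
    """Requirements data sent from the terminal to the server (Step 1)."""
    campaign: str
    audience: str
    palette: str
    font_style: str


@dataclass
class FeedbackInfo:
    """Feedback information sent back after review (Step 3)."""
    material_id: str
    revisions: dict  # e.g. {"palette": "lighter blue"}
    comment: str = ""


def to_message(kind: str, payload) -> str:
    """Serialize a payload into a JSON message with a 'kind' tag."""
    return json.dumps({"kind": kind, "payload": asdict(payload)}, sort_keys=True)


def apply_revisions(req: AdRequirements, fb: FeedbackInfo) -> AdRequirements:
    """Return new requirements with the requested revisions applied.

    Revision keys that do not match a requirements field are ignored.
    """
    current = asdict(req)
    for key, value in fb.revisions.items():
        if key in current:
            current[key] = value
    return AdRequirements(**current)


if __name__ == "__main__":
    req = AdRequirements("summer campaign", "young people", "blue", "modern")
    fb = FeedbackInfo("ad-001", {"font_style": "rounded modern"}, "Softer font, please.")
    print(to_message("requirements", req))
    print(to_message("feedback", fb))
    print(apply_revisions(req, fb))
```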

[0110] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0111] This invention combines an automated content generation system using a generative model with an emotion engine that recognizes user emotions. It primarily consists of a server, a terminal, and a user. The emotion engine understands the user's emotional state and generates and adjusts content based on that information.

[0112] Users access the system via a terminal and enter project requirements. These requirements include the theme, purpose, target audience, and desired design and tone style. This information is sent to the server and analyzed.

[0113] The server generates optimal content based on user requirements and emotional information provided by the emotion engine. The emotion engine recognizes emotions based on the user's facial expressions, voice, or text input, and feeds this data back to the server. By considering this emotional information, content is generated in a way that adapts to the user's psychological state.

[0114] For example, if a user wants to create a "promotional video," the emotion engine recognizes that the user's current emotions are positive based on facial expressions and voice tone collected through the device's camera and microphone. Based on this information, the server generates the video by selecting music with a cheerful atmosphere and brightening the colors.
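
The promotional video example can be sketched as a mapping from an estimated emotion to style choices. The disclosure only describes the positive case (cheerful music and brighter colors); the numeric valence scale, the thresholds, the brightness factors, and the handling of neutral and negative emotions below are all assumptions added for illustration.

```python
def style_for_emotion(valence: float) -> dict:
    """Map an emotion valence in [-1, 1] to hypothetical video style settings.

    Only the positive branch follows the example in the text; the other
    branches are assumed defaults.
    """
    if valence > 0.2:
        # Positive emotion: cheerful music and brightened colors.
        return {"music": "cheerful", "color_brightness": 1.2}
    if valence < -0.2:
        # Assumed behavior for negative emotion.
        return {"music": "calm", "color_brightness": 0.9}
    # Assumed behavior when no clear emotion is detected.
    return {"music": "neutral", "color_brightness": 1.0}


if __name__ == "__main__":
    print(style_for_emotion(0.7))
```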

[0115] The generated content is presented to the user via their device, and their evaluation of that content is collected as feedback. The feedback and emotion information are then fed back into the server's generative model and used to improve content quality. The integration of the emotion engine enables the rapid delivery of more personalized, high-quality content.

[0116] Thus, the present invention enables the creation of adaptive content that responds to the user's emotional state, maximizing the efficiency of creative production resources. Implementing it in a cloud environment allows access from multiple information processing devices and supports remote work.

[0117] The following describes the processing flow.

[0118] Step 1:

[0119] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0120] Step 2:

[0121] The emotion engine uses the device's camera and microphone to recognize the user's emotional state from their facial expressions and voice. This emotional information is analyzed in real time and sent to the server.

[0122] Step 3:

[0123] The server analyzes the requirements and emotion information received from the terminal and determines the appropriate generative model parameters. In this process, it selects specific design templates and content tones that correspond to the user's emotions.

[0124] Step 4:

[0125] The server runs the generative model based on the analysis results to generate content. For visual design generation, it selects colors and adjusts layouts in consideration of the user's emotions. For text content generation, it adopts styles and language appropriate to those emotions.

[0126] Step 5:

[0127] The generated content is sent to the terminal and displayed to the user. The user reviews the content and provides feedback, including comments and suggestions for revisions.

[0128] Step 6:

[0129] The terminal collects user feedback and sends it to the server. This information includes specific suggestions for further improving the content.

[0130] Step 7:

[0131] The server analyzes emotional information and user feedback, using it to refine the generative model. This analysis allows for further customization of content generation to better reflect user emotions.

[0132] Step 8:

[0133] The server hosts the optimized generative model in the cloud environment and prepares for the next content generation request. The process then repeats from Step 1.
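
As a rough illustration of Steps 3 and 4 above, the following Python sketch maps a recognized emotion to generation parameters. The names `GenerationParams` and `select_params`, the emotion labels, and the parameter values are all assumptions made for illustration; this disclosure does not specify how the parameters are represented.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class GenerationParams:
    """Hypothetical parameters handed to the generative model."""
    template: str
    tone: str
    palette: str


# Hypothetical mapping from a recognized emotion to design choices.
_EMOTION_TABLE: Dict[str, GenerationParams] = {
    "positive": GenerationParams("dynamic", "cheerful", "bright"),
    "negative": GenerationParams("minimal", "calm", "soft"),
}
_DEFAULT = GenerationParams("standard", "neutral", "balanced")


def select_params(requirements: dict, emotion: str) -> GenerationParams:
    """Choose parameters from the emotion; an explicit tone requirement wins."""
    base = _EMOTION_TABLE.get(emotion, _DEFAULT)
    tone = requirements.get("tone", base.tone)
    return GenerationParams(base.template, tone, base.palette)
```

In this sketch, a tone entered by the user in Step 1 takes precedence over the tone inferred from the emotion, which is one possible way of reconciling the two inputs.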

[0134] (Example 2)

[0135] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0136] Conventional content generation systems face the challenge of generating personalized information that effectively reflects user emotions and feedback. Furthermore, there is a need for efficient methods to quickly respond to the diverse needs of users. Additionally, improving usability in environments accessible from multiple data processing devices is another challenge.

[0137] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0138] In this invention, the server includes means for automatically generating information using a generative model, means for collecting the user's emotional information from a terminal, and means for adjusting the automatically generated information based on the collected emotional information. This enables the user's emotions and feedback to be reflected quickly, and personalized, high-quality information to be generated.

[0139] A "generative model" refers to an algorithm that automatically creates new information based on data.

[0140] "Information" refers to the elements that make up content, such as visual representations and textual information.

[0141] A "terminal" is a device used by users to access and operate an information processing system.

[0142] "User" refers to an individual or organization that generates or receives information using an information processing system.

[0143] "Emotional information" refers to data that indicates the psychological state of a user, analyzed based on factors such as their facial expressions and tone of voice.

[0144] "Evaluation information" refers to feedback data that reflects users' opinions and impressions about the generated information.

[0145] A "virtual environment" refers to a network infrastructure, often provided using cloud technology, that allows a large number of data processing devices to connect.

[0146] A "data processing device" refers to a computer device used for generating, processing, and transmitting information.

[0147] This invention is a system that automatically generates information using a generative AI model. Its main components are a server, a terminal, and a user.

[0148] Server

[0149] The server is equipped with software to run a generative AI model and plays the role of automatically generating information. The server receives prompt messages sent from the user through the terminal and analyzes the text information. Then, based on the analysis results and the emotional information provided by the emotion engine, it generates information containing visual representations and textual information. As a result, the generated information becomes content that reflects the user's emotions and feedback.

[0150] Terminal

[0151] A terminal is a hardware device used by users to access the system and input prompt messages. The terminal is equipped with a camera and microphone, which are used to collect emotional information such as the user's facial expressions and tone of voice. This data is transmitted to the server in real time and used for emotion analysis.

[0152] User

[0153] The user enters project requirements using a terminal. The entered requirements include theme, purpose, target audience, and design and tone style, and this information is sent to the server as a prompt. For example, by entering a prompt such as, "Please create a promotional video. The theme is 'Summer Vacation,' and I would like a positive and energetic tone. The target audience is young people aged 18-25," it is possible to communicate specific generation requirements to the system.
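
To make the structure of such a prompt concrete, the following Python sketch assembles a prompt of this form from separately entered requirement fields. The class `ProjectRequirements`, its field names, and the function `build_prompt` are hypothetical; this disclosure does not prescribe a particular data format for the requirements.

```python
from dataclasses import dataclass


@dataclass
class ProjectRequirements:
    """Hypothetical container for the requirements entered at the terminal."""
    content_type: str
    theme: str
    tone: str
    target_audience: str


def build_prompt(req: ProjectRequirements) -> str:
    """Render the requirement fields as a single prompt message."""
    return (
        f"Please create a {req.content_type}. "
        f"The theme is '{req.theme}', and I would like a {req.tone} tone. "
        f"The target audience is {req.target_audience}."
    )
```

For example, `build_prompt(ProjectRequirements("promotional video", "Summer Vacation", "positive and energetic", "young people aged 18-25"))` yields a message equivalent to the prompt quoted above.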

[0154] Thus, this invention enables the efficient and rapid delivery of personalized, high-quality content based on the individual emotions and feedback of users. Because this system operates in a virtual environment, it allows for flexible access from various data processing devices and achieves high scalability in its operation.

[0155] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0156] Step 1:

[0157] The user inputs project requirements through a terminal. Specifically, they use the terminal's input interface to create prompt statements, which include the theme, purpose, target audience, design, and tone style. This input information is sent to the server. The entered prompt statements reach the server and are ready for analysis.

[0158] Step 2:

[0159] The terminal collects the user's emotional information. Using its built-in camera and microphone, it collects facial expression and voice tone data in real time. This data is immediately sent to the server and analyzed by the emotion engine. The input is the facial expression and voice tone data, and the output is the emotional information.

[0160] Step 3:

[0161] The server uses natural language processing techniques to analyze the received prompt message and understand its content. It combines the analyzed information with the emotion information recognized by the emotion engine to prepare a dataset for input to the generative AI model. The input is the prompt message and the emotion information, and the output is the dataset.

[0162] Step 4:

[0163] The server uses a generative AI model to construct automatically generated information. The model generates visual representations and textual information based on the input dataset. For example, if positive emotional information is included, bright and cheerful content will be generated. The input is the dataset, and the output is the automatically generated information.

[0164] Step 5:

[0165] The terminal presents the generated information to the user. In this presentation, the generated visuals and text are displayed on the terminal's screen for the user to review. The input is the generated information, and the output is its display to the user.

[0166] Step 6:

[0167] The user provides feedback on the presented information. Using the terminal's feedback input interface, they input an evaluation, such as how well the information matches their expectations. The input is the user's feedback, and the output is the feedback data sent to the server.

[0168] Step 7:

[0169] The server analyzes the collected feedback and uses it, along with emotional information, to adjust the AI model. This adjustment improves the model's output accuracy and adaptability for future information generation. The input is the feedback and emotional information, and the output is the adjusted model.
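
Step 3 above, in which the prompt message and the emotion information are combined into a dataset, might be sketched in Python as follows. The types `EmotionInfo` and `ModelInput`, the confidence field, and the threshold value are assumptions introduced for illustration and do not appear in this disclosure.

```python
from dataclasses import dataclass


@dataclass
class EmotionInfo:
    """Hypothetical output of the emotion engine."""
    label: str
    confidence: float  # assumed to lie in the range 0.0 to 1.0


@dataclass
class ModelInput:
    """Hypothetical dataset passed to the generative AI model."""
    prompt: str
    emotion_hint: str


def build_model_input(prompt: str, emotion: EmotionInfo,
                      threshold: float = 0.6) -> ModelInput:
    """Normalize the prompt and attach the emotion only if it is reliable."""
    cleaned = " ".join(prompt.split())  # collapse stray whitespace
    hint = emotion.label if emotion.confidence >= threshold else "neutral"
    return ModelInput(cleaned, hint)
```

Falling back to a neutral hint when the recognition confidence is low is a design choice of this sketch, intended to keep an uncertain emotion estimate from steering the generated content.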

[0170] (Application Example 2)

[0171] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0172] Current content generation systems have a problem in that they struggle to create personalized content that adapts to users' emotions. In particular, conventional technologies are insufficient to dynamically reflect a user's psychological state and provide content optimized for that moment. Therefore, new technologies are needed to improve the user experience and provide high-quality content.

[0173] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0174] In this invention, the server includes means for automatically generating content using a generative model, means for adjusting the generative model based on evaluation information and the user's psychological state, and means for recognizing the user's emotions and generating recommendation information based on that information. This makes it possible to generate and present optimal content that is in line with the user's emotions.

[0175] A "generative model" is an algorithm for automatically generating new content based on data.

[0176] "Evaluation information" refers to information based on feedback and comments received from users regarding the generated content.

[0177] "Psychological state" refers to the internal state that represents the user's emotions and mood.

[0178] "Means of recognizing emotions" refers to technologies that analyze a user's emotions from their facial expressions, tone of voice, and other factors.

[0179] "Recommendation information" refers to information used to present the most suitable content based on the user's emotional state.

[0180] The specific system for implementing this invention generates and recommends content that takes into account the user's psychological state. The user connects to the system using a terminal and provides their emotional state through a camera and microphone. This information is transmitted from the terminal to the server.

[0181] The server utilizes emotion analysis APIs (e.g., Affectiva or Microsoft® Azure® Face API) as a means of recognizing emotions. The resulting emotion data is processed to analyze the user's psychological state. Next, the server uses a generative model to generate appropriate recommendations based on the emotion information. At this stage, content best suited to the user's emotions is selected.

[0182] The generated content information is fed back to the user's device via the network and presented as part of the user's experience. This allows users to enjoy more personalized content in real time.

[0183] For example, if a user is seeking relaxation after work, the system will recognize the user's tired expression and recommend videos of relaxing nature scenes or music. An example of a prompt for the generative AI model would be, "Consider the most suitable video content for a user seeking relaxation and create a recommendation list."

[0184] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0185] Step 1:

[0186] The user accesses the system through a terminal. Here, the user uses a camera and microphone to record their facial expressions and voice, and sends this data to the server. The input is the user's facial expression data and voice data, and the output is the transmission of this data to the server. The data is encrypted and transmitted in a secure manner.

[0187] Step 2:

[0188] The server analyzes the received facial expression and voice data. It uses an emotion analysis API to analyze the user's emotional state. The input is facial expression and voice data, and the emotional information resulting from the analysis is output. In this step, the emotion recognition engine is used to determine, for example, whether the user is feeling "relaxed" or "stressed."

[0189] Step 3:

[0190] The server uses a generative model to generate appropriate content based on the analyzed emotional information. The input is the emotional information, and the output is a content recommendation list for the user. The generative AI model creates a video list based on the prompt "Create a video suitable for when the user is seeking relaxation."

[0191] Step 4:

[0192] The server sends the generated content recommendation list to the user's device. The input is the list of recommended content, and the output is the provision of the list to the user's device. The data is transmitted in a streaming or downloadable format.

[0193] Step 5:

[0194] Users view the content provided on their devices and enjoy the experience. They play the output content and provide feedback. The user's feedback information is sent back to the server for use in subsequent analysis and adjustment of the generative model. The input is the viewer's feedback, and the output is the provision of feedback data to the server.
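
The mapping from emotional information to a recommendation list in Step 3 can be illustrated with the Python sketch below. Note that it substitutes simple tag matching for the generative model described above, purely so that the example is self-contained; the catalog entries, tags, emotion labels, and the function `recommend` are all hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical catalog; titles and tags are invented for illustration.
CATALOG: List[Dict] = [
    {"title": "Forest stream", "tags": {"nature", "calm"}},
    {"title": "Ocean waves at dusk", "tags": {"nature", "calm", "music"}},
    {"title": "High-energy workout mix", "tags": {"music", "upbeat"}},
]

# Hypothetical mapping from an emotion label to the kind of content sought.
DESIRED_TAGS: Dict[str, Set[str]] = {
    "stressed": {"calm", "nature"},
    "relaxed": {"upbeat", "music"},
}


def recommend(emotion: str, catalog: List[Dict], limit: int = 2) -> List[str]:
    """Rank catalog items by how many desired tags they share with the emotion."""
    desired = DESIRED_TAGS.get(emotion, set())
    scored = [(len(item["tags"] & desired), item["title"]) for item in catalog]
    # Keep only matching items; sort by score (descending), then title.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [title for _, title in ranked[:limit]]
```

In a system following the description above, the ranking would instead be produced by the generative model from a prompt, and the feedback collected in Step 5 would be used to adjust that model.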

[0195] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0196] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0197] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0198] [Second Embodiment]

[0199] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0200] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0201] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0202] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0203] The microphone 238 receives instructions and the like from the user 20 by capturing the user's voice. It converts the captured voice into audio data and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.

[0204] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0205] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0206] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0207] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0208] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0209] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0210] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0211] This invention relates to an automated content generation system using a generative model, and mainly consists of a server, a terminal, and a user. The server automatically generates content using the generative model based on requirements provided by the user through the terminal. The generated content includes visual design and text content. The server collects user evaluation information on the generated content via the terminal.

[0212] Users specify themes, objectives, and target audiences that suit their needs to the server and request content generation. For example, if advertising materials are needed, the user would specify "promotional advertising for a new product," and the server would generate relevant images and text based on the specified requirements.

[0213] The terminal presents the generated content to the user and collects feedback. This feedback includes suggestions for improvements regarding the design's color scheme, text style, and content structure.

[0214] The server adjusts the generation model based on the collected evaluation information and reflects it in the next content generation. This enables continuous quality improvement. Furthermore, because this system operates in a cloud environment, it can be accessed from different information processing devices and can be used effectively even in remote work environments.

[0215] For example, when a user requests the creation of a "summer campaign poster," the server automatically generates a poster that reflects seasonal colors and graphics, and also suggests a catchy slogan. Furthermore, it can be further improved based on user feedback. This process makes it possible to achieve both increased efficiency and improved quality in creative production.

[0216] The following describes the processing flow.

[0217] Step 1:

[0218] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0219] Step 2:

[0220] The server analyzes the requirements received from the terminal and uses this to determine the parameters of the generative model. During this process, it selects appropriate design templates and writing styles.

[0221] Step 3:

[0222] The server operates a generative model based on the analysis results to generate content. For visual designs, it selects relevant images from a material library and applies them to a template. For text, it creates sentences in the specified tone.

[0223] Step 4:

[0224] The server sends the generated content to the terminal and prompts the user for review. The user reviews the content on the terminal and provides feedback with any requested revisions.

[0225] Step 5:

[0226] The terminal collects user feedback and sends it to the server. The collected feedback includes suggestions for improvement and specific points to note.

[0227] Step 6:

[0228] The server analyzes the feedback information and uses it to adjust the generative model. This analysis clarifies areas for improvement to enhance the quality of content generated in the future.

[0229] Step 7:

[0230] The server restarts the process by hosting the optimized generative model in the cloud environment and preparing for the next content generation request.
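
The feedback-driven adjustment in Steps 5 to 7 above could take many forms. As one minimal sketch, the Python code below keeps a preference weight per design style and moves it toward each user rating with an exponential moving average. The functions `update_weights` and `pick_style`, the rating scale, and the update rule are assumptions chosen for illustration; this disclosure does not state how the generative model is adjusted.

```python
from typing import Dict


def update_weights(weights: Dict[str, float], style: str,
                   rating: float, rate: float = 0.5) -> Dict[str, float]:
    """Move the weight of `style` toward `rating` (assumed range 0.0 to 1.0)."""
    old = weights.get(style, 0.5)  # unseen styles start at a neutral weight
    updated = dict(weights)        # leave the caller's dictionary untouched
    updated[style] = (1 - rate) * old + rate * rating
    return updated


def pick_style(weights: Dict[str, float]) -> str:
    """Return the highest-weighted style; ties resolve alphabetically."""
    return max(sorted(weights), key=weights.get)
```

Starting from equal weights for two hypothetical styles, a single favorable rating for one of them is enough to make `pick_style` select it for the next generation request.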

[0231] (Example 1)

[0232] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0233] Automated content generation systems are required to efficiently generate high-quality visual and textual information based on specific user requests. However, conventional systems have struggled to improve the quality of generated content and flexibly respond to diverse user requests. Furthermore, they lacked sufficient flexibility for remote access.

[0234] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0235] In this invention, the server includes means for automatically generating content based on requests received from users using a generation model, means for presenting the generated content on an information display device and collecting evaluation information from users, means for modifying the generation model based on the collected evaluation information and reflecting the changes in subsequent generation, and means for operating in a cloud environment and enabling access from multiple information display devices. This enables the efficient provision of high-quality content that meets the diverse needs of users, as well as flexible use through remote access.

[0236] A "generative model" is a general term for algorithms and technologies used to automatically generate content based on user requests.

[0237] "Content" refers to creative works that combine visual art and textual information, thereby providing information that meets the user's needs.

[0238] An "information display device" refers to a terminal or device that displays content generated from a server, allowing users to recognize it and input their evaluations.

[0239] "Evaluation information" refers to feedback that users provide on generated content, which is used to improve content quality and modify the generation model.

[0240] A "cloud environment" is a distributed information processing infrastructure that provides computing resources and storage via the internet, enabling flexible access.

[0241] "Information processing equipment" refers to electronic devices such as computers, terminals, and servers that process, store, and communicate data.

[0242] A "prompt statement" is text input into a generative model, and its role is to provide instructions for generating appropriate content in response to a request.

[0243] This invention relates to an automated content generation system using a generative AI model. The system mainly consists of a server, terminals, and users.

[0244] The server generates content based on user requests using a generative model. This generation process utilizes software for natural language processing (e.g., the GPT model). On the server, an algorithm, corresponding to the requested prompt, processes the data and automatically generates content that combines visual art and textual information. For hardware, high-performance computing resources running on a cloud infrastructure (e.g., virtual servers from a cloud provider) are used.

[0245] The terminal plays the role of presenting the generated content to the user. Specifically, it displays the content in a viewable format through the user's browser or dedicated application. The terminal also provides an interface for collecting user feedback and evaluation information.

[0246] Users input requests to the server via their device to generate content tailored to their needs. An example of a prompt might be, "Create a visually appealing poster with bright colors for a summer campaign. The target audience is young people, and the theme is the beach." This provides specific instructions to the generation model, enabling the creation of content suitable for the user's objectives.

[0247] The server also has the functionality to improve the generation model based on the collected evaluation information and reflect that improvement in subsequent content generation. This enables continuous quality improvement and adjustments to meet user requests.

[0248] Because this invention operates in a cloud environment, it can be accessed from multiple information processing devices and can be effectively used in remote work. Therefore, this system provides users with a convenient and flexible content generation platform.

[0249] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0250] Step 1:

[0251] The user enters a request for the content they want to generate via their device. Specifically, they enter prompt text into a form in their browser or a dedicated application. This input is in text format, for example, "Please create a poster for a summer campaign. The target audience is young people, and the theme is the beach." The entered data is sent to the server by the device.

[0252] Step 2:

[0253] The server receives a user request. Based on this, the server constructs internal data to be passed to the generative model. Specifically, it parses the prompt text and extracts the necessary keywords and concepts. This prepares appropriate instructions for the generative AI model. The input is the received request, and the output is prompt data for the generative model.

[0254] Step 3:

[0255] The server generates content using a generative AI model. Prompt data is input to the generative model, and the algorithm processes the data to generate visual designs and text structures. As a result of the processing, image files and text data are generated. The input is prompt data, and the output is content data.

[0256] Step 4:

[0257] The terminal presents the generated content to the user. Specifically, it displays the generated images and text data on the screen for the user to review. The input is the generated content received from the server, and the output is what is displayed to the user.

[0258] Step 5:

[0259] Users evaluate the presented content and input feedback into their devices. Specifically, they write their opinions on the appearance, style, and content in text and send it to the server via their devices. The input is evaluation data from the user, and the output is sent to the server.

[0260] Step 6:

[0261] The server collects user feedback and uses it to improve the generative model. This feedback is analyzed to adjust the model's parameters, which are then used to improve future generation. This involves an optimization process for the generative model. The input is the feedback, and the output is the adjusted model parameters.
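
Step 2 above states that the server parses the prompt text and extracts keywords. A deliberately simple Python sketch of such an extraction is shown below; the stop-word list and the function `extract_keywords` are hypothetical, and a system as described would more likely rely on a natural language processing model than on word filtering.

```python
import re
from typing import List

# Hypothetical stop-word list; a real system would use a fuller resource.
_STOPWORDS = {"a", "an", "the", "is", "are", "and", "for",
              "please", "create", "with", "of", "to"}


def extract_keywords(prompt: str) -> List[str]:
    """Return the distinct non-stop-words of the prompt in order of appearance."""
    words = re.findall(r"[a-z]+", prompt.lower())
    keywords: List[str] = []
    for word in words:
        if word not in _STOPWORDS and word not in keywords:
            keywords.append(word)
    return keywords
```

The resulting keyword list corresponds to the "prompt data" that Step 2 names as its output, in a much reduced form.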

[0262] (Application Example 1)

[0263] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0264] In modern advertising production, it is necessary to create high-quality advertising materials quickly and effectively to appeal to the target audience, and to improve their effectiveness in real time. However, traditional methods are time-consuming and require manual trial and error, making improvements in production efficiency and effectiveness measurement a challenge.

[0265] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0266] In this invention, the server includes means for automatically generating advertising materials using a generative model, means for specifying the content of an advertising campaign based on requirements entered by the user through a terminal, and means for collecting feedback information on the generated advertising materials. This enables users to automatically generate high-quality advertising materials in a short time and to improve the advertising content based on feedback.

[0267] A "generative model" is an artificial intelligence algorithm that automatically generates advertising materials based on requirements provided by the user.

[0268] "Advertising materials" refer to content, including visual designs and written expressions, used for promotional purposes.

[0269] "Requirements" refer to information that indicates the target content and design guidelines for an advertising campaign.

[0270] "Feedback information" refers to information that indicates users' evaluations of the generated advertising materials and their requests for improvement.

[0271] "Cloud infrastructure" refers to a virtual computing resource and storage environment where servers are provided over the internet.

[0272] The system implementing this invention is centered around a server running on a cloud infrastructure. Users can input advertising campaign requirements into the server using a device such as a smartphone. The server automatically generates advertising materials based on the specified requirements using a generative AI model. Visual design and text are included as components in this process. The generated advertising materials are displayed on the user's device, and after reviewing them, the user can send feedback with suggestions for improvement or requests for changes.

[0273] The server adjusts the generation model based on this feedback information and incorporates it into subsequent ad generation. This improves the overall ad production capabilities of the system. Specifically, based on prompts such as "Create a summer campaign ad for young people, use a lot of blue in the color palette, and use modern fonts," effective ads for the target audience are generated.

[0274] The hardware used in this process includes the user's device (such as a smartphone or tablet). On the software side, a generative AI model runs on a server, along with a program to control and manage it. This allows users to quickly create advertising materials and improve their quality through real-time feedback.

[0275] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0276] Step 1:

[0277] The user enters the requirements for the advertising campaign using a device such as a smartphone. As a prompt, they send specific instructions to the server, such as, "Create a summer campaign ad targeting young people, use a color palette heavily featuring blue, and use a modern font." The input in this step is the prompt provided by the user, and the output is the requirements data sent to the server.

[0278] Step 2:

[0279] The server runs the generative AI model on the received requirements data. The model automatically generates advertising materials from the input prompt, incorporating the specified color palette and font style into the visual design and catchphrase. The output of this step is advertising material tailored to the user's requirements.

[0280] Step 3:

[0281] The generated advertising material is displayed on the user's device. The user reviews the advertising material and provides feedback on the visual design and wording. Through the input of feedback, specific revision requests, such as color adjustments or font changes, are sent to the server. The output of Step 3 is the feedback information.

[0282] Step 4:

[0283] The server analyzes the collected feedback information and adjusts the generative AI model. Specifically, it tunes model parameters and improves the algorithm to reflect user feedback. This adjustment aims to produce higher quality results in subsequent ad generation. The output of this step is the adjusted model.
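The generate, review, and adjust cycle of Steps 1 to 4 can be illustrated with a minimal sketch. The names `AdStyle`, `generate_ad`, and `apply_feedback` are hypothetical and do not appear in the specification, and the "model" here is only a stand-in that stores style preferences; an actual implementation would tune the parameters of a generative model instead.

```python
from dataclasses import dataclass, field


@dataclass
class AdStyle:
    """Hypothetical stand-in for the adjustable state of the generative model."""
    palette: str = "blue"
    font: str = "modern"
    history: list = field(default_factory=list)  # feedback applied so far


def generate_ad(requirements: str, style: AdStyle) -> dict:
    """Step 2: build a placeholder ad from the requirements and current style."""
    return {
        "copy": f"[{style.font} font] {requirements}",
        "palette": style.palette,
    }


def apply_feedback(style: AdStyle, feedback: dict) -> AdStyle:
    """Step 4: return a new style that reflects the requested revisions."""
    return AdStyle(
        palette=feedback.get("palette", style.palette),
        font=feedback.get("font", style.font),
        history=style.history + [feedback],
    )


# Steps 1-2: initial generation from the user's requirements.
style = AdStyle()
first = generate_ad("Summer campaign ad for young people", style)

# Steps 3-4: the user requests a color change; the next ad reflects it.
style = apply_feedback(style, {"palette": "teal"})
second = generate_ad("Summer campaign ad for young people", style)
```

Returning a new `AdStyle` rather than mutating the old one keeps each adjusted state separate, which matches the description of the output of Step 4 as the state of the adjusted model.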

[0284] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0285] The present invention combines an emotion engine that recognizes the emotions of a user with an automatic content generation system using a generative model. It mainly consists of a server, a terminal, and a user. The emotion engine grasps the emotional state of the user and generates and adjusts content based on that information.

[0286] The user accesses the system via the terminal and inputs the requirements of the project. The requirements include the theme, purpose, target audience, and the desired design and tone style. This information is sent to the server and analyzed.

[0287] The server generates optimal content based on the requirements from the user and the emotion information provided by the emotion engine. The emotion engine recognizes emotions based on the user's expression, voice, or text input and feeds the data back to the server. By considering this emotion information, the content is generated in a form that adapts to the user's psychological state.

[0288] For example, when the user wants to create a "promotional video", the emotion engine recognizes that the current emotion is positive from the expression and voice tone collected through the terminal's camera and microphone. Based on this information, the server selects music with a fun atmosphere or brightens the color tone to generate the video.

[0289] The generated content is presented to the user via the terminal, and an evaluation of the content is collected as feedback. Furthermore, the feedback and emotion information are returned to the server's generative model and used to improve the quality of the content. By integrating the emotion engine, more personalized and high-quality content can be provided quickly.

[0290] Thus, the present invention enables the creation of adaptive content that responds to the user's emotional state, maximizing the efficiency of creative production resources. Implementing it in a cloud environment allows access from multiple information processing devices and supports remote work.

[0291] The following describes the processing flow.

[0292] Step 1:

[0293] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0294] Step 2:

[0295] The emotion engine uses the device's camera and microphone to recognize the user's emotional state from their facial expressions and voice. This emotional information is analyzed in real time and sent to the server.

[0296] Step 3:

[0297] The server analyzes the requirements and emotion information received from the terminal and determines the appropriate generative model parameters. In this step, it selects specific design templates and content tones that correspond to the user's emotions.

[0298] Step 4:

[0299] The server runs a generative model based on the analysis results to generate content. For visual design generation, it selects colors and adjusts layouts to consider user emotions. For text content generation, it adopts styles and language appropriate to those emotions.

[0300] Step 5:

[0301] The generated content is sent to the terminal and displayed to the user. The user checks the content and inputs opinions and requests for correction as feedback.

[0302] Step 6:

[0303] The terminal collects the feedback from the user and sends it to the server. This information includes specific points regarding further improvement of the content.

[0304] Step 7:

[0305] The server analyzes the emotion information and the user's feedback and uses them to adjust the generative model. This analysis enables further customization adapted to the user's emotions in subsequent content generation.

[0306] Step 8:

[0307] The server hosts the adjusted generation model in the cloud environment to prepare for the next content generation request. As a result, the process starts again and is repeated.
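As an illustration of Step 3 of the flow above, the following sketch maps an estimated emotion label to generation parameters. The labels, the preset values, and the names `GenerationParams`, `EMOTION_PRESETS`, and `select_params` are assumptions made for this example; the specification only states that templates and tones corresponding to the user's emotions are selected, and gives bright color tones for a positive emotion as an example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationParams:
    template: str    # design template to use
    color_tone: str  # overall color tone of the visual design
    text_tone: str   # writing style of the text content


# Hypothetical lookup table; only the "positive -> bright" pairing
# is taken from the description, the rest is illustrative.
EMOTION_PRESETS = {
    "positive": GenerationParams("vivid", "bright", "energetic"),
    "negative": GenerationParams("minimal", "soft", "reassuring"),
}
DEFAULT_PARAMS = GenerationParams("standard", "neutral", "neutral")


def select_params(emotion_label: str) -> GenerationParams:
    """Return the preset for a label, falling back to neutral defaults."""
    return EMOTION_PRESETS.get(emotion_label.strip().lower(), DEFAULT_PARAMS)


params = select_params("Positive")
```

A fixed table is the simplest possible choice; a system that learns from the feedback in Step 7 would presumably update such a mapping over time.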

[0308] (Example 2)

[0309] Next, Example 2 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".

[0310] In a conventional content generation system, there is a problem that it is difficult to generate personalized information that effectively reflects the emotions and feedback of users. Also, an efficient method for quickly responding to the diverse demands of users is required. Furthermore, improving the operability in an environment accessible from multiple data processing devices is also an issue.

[0311] The specific processing by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following respective means.

[0312] In this invention, the server includes means for automatically generating information using a generative model, means for collecting user sentiment information from a terminal, and means for adjusting the automatically generated information based on the collected sentiment information. This enables the rapid reflection of user sentiment and feedback, and the generation of personalized and high-quality information.

[0313] A "generative model" refers to an algorithm that automatically creates new information based on data.

[0314] "Information" refers to the elements that make up content, such as visual representations and textual information.

[0315] A "terminal" is a device used by users to access and operate an information processing system.

[0316] "User" refers to an individual or organization that generates or receives information using an information processing system.

[0317] "Emotional information" refers to data that indicates the psychological state of a user, analyzed based on factors such as their facial expressions and tone of voice.

[0318] "Evaluation information" refers to feedback data that reflects users' opinions and impressions about the generated information.

[0319] A "virtual environment" refers to a network infrastructure, often provided using cloud technology, that allows a large number of data processing devices to connect.

[0320] A "data processing device" refers to a computer device used for generating, processing, and transmitting information.

[0321] This invention is a system that automatically generates information using a generative AI model. Its main components are a server, a terminal, and a user.

[0322] Server

[0323] The server is equipped with software to run a generative AI model and plays the role of automatically generating information. The server receives prompt messages sent from the user through the terminal and analyzes the text. Then, based on the analysis results and the emotion information provided by the emotion engine, it generates information comprising visual representations and textual information. As a result, the generated information becomes content that reflects the user's emotions and feedback.

[0324] Terminal

[0325] A terminal is a hardware device used by users to access the system and input prompt messages. The terminal is equipped with a camera and microphone, which are used to collect emotional information such as the user's facial expressions and tone of voice. This data is transmitted to the server in real time and used for emotion analysis.

[0326] User

[0327] The user enters project requirements using a terminal. The entered requirements include theme, purpose, target audience, and design and tone style, and this information is sent to the server as a prompt. For example, by entering a prompt such as, "Please create a promotional video. The theme is 'Summer Vacation,' and I would like a positive and energetic tone. The target audience is young people aged 18-25," it is possible to communicate specific generation requirements to the system.

[0328] Thus, this invention enables the efficient and rapid delivery of personalized, high-quality content based on the individual emotions and feedback of users. Because this system operates in a virtual environment, it allows for flexible access from various data processing devices and achieves high scalability in its operation.
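The way project requirements are turned into a prompt, as in the example prompt given in the description of the user, can be sketched as follows. The `ProjectRequirements` class, its field names, and `build_prompt` are hypothetical; the specification lists the kinds of requirement (theme, purpose, target audience, design and tone style) but does not define a data structure.

```python
from dataclasses import dataclass


@dataclass
class ProjectRequirements:
    content_type: str     # e.g. "promotional video"
    theme: str
    tone: str
    target_audience: str


def build_prompt(req: ProjectRequirements) -> str:
    """Assemble the requirement fields into a single prompt string."""
    return (
        f"Please create a {req.content_type}. "
        f"The theme is '{req.theme}', and I would like a {req.tone} tone. "
        f"The target audience is {req.target_audience}."
    )


prompt = build_prompt(
    ProjectRequirements(
        content_type="promotional video",
        theme="Summer Vacation",
        tone="positive and energetic",
        target_audience="young people aged 18-25",
    )
)
```

Keeping the requirements as structured fields, rather than free text only, would let the server validate that every required item has been entered before the prompt is sent.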

[0329] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0330] Step 1:

[0331] The user inputs project requirements through a terminal. Specifically, they use the terminal's input interface to create prompt statements, which include the theme, purpose, target audience, design, and tone style. This input information is sent to the server. The entered prompt statements reach the server and are ready for analysis.

[0332] Step 2:

[0333] The terminal collects the user's emotional information. Using its built-in camera and microphone, it captures facial expression and voice tone data in real time. This data is immediately sent to the server and analyzed by the emotion engine. The input is the facial expression and voice tone data, and the output is the emotional information.

[0334] Step 3:

[0335] The server uses natural language processing techniques to analyze the received prompt message and understand its content. It combines the analyzed information with the emotion information recognized by the emotion engine to prepare a dataset for input to the generative AI model. The output dataset is formed from the input prompt message and emotion information.

[0336] Step 4:

[0337] The server uses a generative AI model to construct automatically generated information. The model generates visual representations and textual information based on the input dataset. For example, if positive emotional information is included, bright and cheerful content will be generated. The output obtained from the input dataset is the automatically generated information.

[0338] Step 5:

[0339] The terminal presents the generated information to the user. In this presentation, the generated visuals and text are displayed on the terminal's screen for the user to review. The generated information is then provided to the user as output.

[0340] Step 6:

[0341] The user provides feedback on the presented information. Using the terminal's feedback input interface, they input an evaluation, such as how well the information matches their expectations. The input is the user's evaluation, and the output is the feedback data sent to the server.

[0342] Step 7:

[0343] The server analyzes the collected feedback and uses it, together with the emotional information, to adjust the generative AI model. The adjustment is intended to improve the accuracy and adaptability of the model's output in future information generation. The input is the feedback and emotional information, and the output is the adjusted model.
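Step 3 of the flow above combines the analyzed prompt with the emotion information into a single dataset for the generative AI model. One possible shape for that dataset is sketched below as a JSON payload. The function name `build_model_input`, the payload keys, and the use of per-emotion scores are assumptions for illustration and are not specified in the description.

```python
import json


def build_model_input(prompt: str, emotion_scores: dict) -> str:
    """Merge the prompt and emotion scores into one JSON payload (Step 3)."""
    # Pick the highest-scoring label; fall back to "neutral" when no
    # emotion information is available.
    if emotion_scores:
        dominant = max(emotion_scores, key=emotion_scores.get)
    else:
        dominant = "neutral"
    payload = {
        "prompt": prompt.strip(),
        "emotion": {"scores": emotion_scores, "dominant": dominant},
    }
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


raw = build_model_input(
    " Create a promotional video. ",
    {"positive": 0.8, "negative": 0.2},
)
payload = json.loads(raw)
```

Serializing to JSON is only one option; it is used here because the payload has to travel between the terminal, the emotion engine, and the server.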

[0344] (Application Example 2)

[0345] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0346] Current content generation systems have a problem in that they struggle to create personalized content that adapts to users' emotions. In particular, conventional technologies are insufficient to dynamically reflect a user's psychological state and provide content optimized for that moment. Therefore, new technologies are needed to improve the user experience and provide high-quality content.

[0347] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0348] In this invention, the server includes means for automatically generating content using a generative model, means for adjusting the generative model based on evaluation information and the user's psychological state, and means for recognizing the user's emotions and generating recommendation information based on that information. This makes it possible to generate and present optimal content that is in line with the user's emotions.

[0349] A "generative model" is an algorithm for automatically generating new content based on data.

[0350] "Evaluation information" refers to information based on feedback and comments received from users regarding the generated content.

[0351] "Psychological state" refers to the internal state that represents the user's emotions and mood.

[0352] "Means of recognizing emotions" refers to technologies that analyze a user's emotions from their facial expressions, tone of voice, and other factors.

[0353] "Recommendation information" refers to information used to present the most suitable content based on the user's emotional state.

[0354] The specific system for implementing this invention generates and recommends content that takes into account the user's psychological state. The user connects to the system using a terminal and provides their emotional state through a camera and microphone. This information is transmitted from the terminal to the server.

[0355] The server utilizes emotion analysis APIs (e.g., Affectiva or Microsoft Azure Face API) as a means of recognizing emotions. The resulting emotion data is processed to analyze the user's psychological state. Next, the server uses a generative model to generate appropriate recommendations based on the emotion information. At this stage, content best suited to the user's emotions is selected.

[0356] The generated content information is fed back to the user's device via the network and presented as part of the user's experience. This allows users to enjoy more personalized content in real time.

[0357] For example, if a user is seeking relaxation after work, the system will recognize the user's tired expression and recommend videos of relaxing nature scenes or music. An example of a prompt for the generative AI model would be, "Consider the most suitable video content for a user seeking relaxation and create a recommendation list."

[0358] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0359] Step 1:

[0360] The user accesses the system through a terminal. Here, the user uses a camera and microphone to record their facial expressions and voice, and sends this data to the server. The input is the user's facial expression data and voice data, and the output is the transmission of this data to the server. The data is encrypted and transmitted in a secure manner.

[0361] Step 2:

[0362] The server analyzes the received facial expression and voice data. It uses an emotion analysis API to analyze the user's emotional state. The input is facial expression and voice data, and the emotional information resulting from the analysis is output. In this step, the emotion recognition engine is used to determine, for example, whether the user is feeling "relaxed" or "stressed."

[0363] Step 3:

[0364] The server uses a generative model to generate appropriate content based on the analyzed emotion information. The input is the emotion information, and the output is a content recommendation list for the user. The generative AI model creates a video list based on the prompt "Create a video suitable for when the user is seeking relaxation."

[0365] Step 4:

[0366] The server sends the generated content recommendation list to the user's device. The input is the list of recommended content, and the output is the provision of the list to the user's device. The data is transmitted in a streaming or downloadable format.

[0367] Step 5:

[0368] Users view the content provided on their devices and enjoy the experience. They play the output content and provide feedback. The user's feedback information is sent back to the server for use in subsequent analysis and adjustment of the generative model. The input is the viewer's feedback, and the output is the provision of feedback data to the server.
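Steps 2 to 4 of this flow, from an emotion label to a recommendation list, can be sketched as follows. In the description the list is produced by a generative model; the tag lookup below is a deliberately simple substitute used only to show the data flow. The catalog entries, the label-to-tag mapping, and the name `recommend` are all hypothetical.

```python
# Hypothetical content catalog; titles and tags are illustrative only.
CATALOG = [
    {"title": "Forest stream ambience", "tags": {"relaxing", "nature"}},
    {"title": "High-energy workout mix", "tags": {"energetic", "music"}},
    {"title": "Calm piano playlist", "tags": {"relaxing", "music"}},
]

# Assumed mapping from an estimated emotion label to a content tag.
EMOTION_TO_TAG = {"stressed": "relaxing", "tired": "relaxing"}


def recommend(emotion_label: str, catalog=CATALOG, limit: int = 5) -> list:
    """Return up to `limit` titles whose tags match the emotion label."""
    tag = EMOTION_TO_TAG.get(emotion_label)
    if tag is None:
        return []  # unknown label: no recommendation in this sketch
    return [item["title"] for item in catalog if tag in item["tags"]][:limit]


titles = recommend("tired")
```

The feedback collected in Step 5 could then be used to adjust such a mapping, or the generative model that replaces it.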

[0369] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0370] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference result in a data format such as audio data or text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
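The input and output of the data generation model 58 described above (a prompt plus inference data in, an inference result out) can be expressed as a small typed interface. This is a sketch under assumptions: the class names, field names, and the `infer` method are hypothetical, and `EchoModel` is a stub that runs no real inference.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class InferenceInput:
    prompt: str     # instruction for the model
    kind: str       # "audio", "text", or "image"
    payload: bytes  # raw inference data


@dataclass(frozen=True)
class InferenceOutput:
    kind: str       # "audio" or "text"
    payload: bytes


class DataGenerationModel(Protocol):
    """Hypothetical interface for a model that follows the prompt."""

    def infer(self, request: InferenceInput) -> InferenceOutput: ...


class EchoModel:
    """Stub that describes its input instead of running a real model."""

    def infer(self, request: InferenceInput) -> InferenceOutput:
        summary = f"{request.prompt} ({request.kind}, {len(request.payload)} bytes)"
        return InferenceOutput(kind="text", payload=summary.encode("utf-8"))


result = EchoModel().infer(InferenceInput("Summarize", "text", b"hello"))
```

Defining the interface as a `Protocol` means any object with a matching `infer` method can be substituted, whether it runs on the server or on the terminal.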

[0371] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0372] [Third Embodiment]

[0373] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0374] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0375] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0376] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0377] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0378] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0379] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0380] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0381] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0382] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0383] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0384] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0385] This invention relates to an automated content generation system using a generative model, and mainly consists of a server, a terminal, and a user. The server automatically generates content using the generative model based on requirements provided by the user through the terminal. The generated content includes visual design and text content. The server collects user evaluation information on the generated content via the terminal.

[0386] Users specify themes, objectives, and target audiences that suit their needs to the server and request content generation. For example, if advertising materials are needed, the user would specify "promotional advertising for a new product," and the server would generate relevant images and text based on the specified requirements.

[0387] The device presents the generated content to the user and collects feedback. This feedback includes suggestions for improvements regarding the design's color scheme, text style, and content structure.

[0388] The server adjusts the generation model based on the collected evaluation information and reflects it in the next content generation. This enables continuous quality improvement. Furthermore, because this system operates in a cloud environment, it can be accessed from different information processing devices and can be used effectively even in remote work environments.

[0389] For example, when a user requests the creation of a "summer campaign poster," the server automatically generates a poster that reflects seasonal colors and graphics, and also suggests a catchy slogan. The poster can then be refined based on user feedback. This process makes it possible to achieve both increased efficiency and improved quality in creative production.

[0390] The following describes the processing flow.

[0391] Step 1:

[0392] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0393] Step 2:

[0394] The server analyzes the requirements received from the terminal and uses this to determine the parameters of the generative model. During this process, it selects appropriate design templates and writing styles.

[0395] Step 3:

[0396] The server operates a generative model based on the analysis results to generate content. For visual designs, it selects relevant images from a material library and applies them to a template. For text, it creates sentences in the specified tone.

[0397] Step 4:

[0398] The server sends the generated content to the device and prompts the user for review. The user reviews the content on the device and provides feedback with any requested revisions.

[0399] Step 5:

[0400] The device collects user feedback and sends it to the server. The collected feedback includes suggestions for improvement and specific points to note.

[0401] Step 6:

[0402] The server analyzes the feedback information and uses it to adjust the generative model. This analysis clarifies areas for improvement to enhance the quality of content generated in the future.

[0403] Step 7:

[0404] The server restarts the process by hosting the optimized generative model in the cloud environment and preparing for the next content generation request.
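Step 3 of the flow above states that relevant images are selected from a material library and applied to a template. A minimal sketch of that selection is given below, assuming the library is tagged with keywords and that relevance is measured by keyword overlap. The library contents, the scoring rule, and the names `select_images` and `fill_template` are assumptions for illustration.

```python
# Hypothetical material library: file name -> descriptive tags.
MATERIAL_LIBRARY = {
    "beach_sunset.png": {"summer", "beach", "sunset"},
    "snowy_cabin.png": {"winter", "snow"},
    "surfboard.png": {"summer", "beach", "sports"},
}


def select_images(keywords: set, library=MATERIAL_LIBRARY, top_k: int = 2) -> list:
    """Rank images by the number of tags shared with the requirement keywords."""
    scored = [(len(keywords & tags), name) for name, tags in library.items()]
    # Drop images with no overlap; sort by score (descending), then by name.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked[:top_k]]


def fill_template(template: str, images: list, slogan: str) -> str:
    """Apply the selected images and a slogan to a text template."""
    return template.format(images=", ".join(images), slogan=slogan)


images = select_images({"summer", "beach"})
poster = fill_template("Poster[{images}] '{slogan}'", images, "Dive into summer")
```

Sorting ties by file name makes the selection deterministic, which simplifies comparing outputs before and after the model is adjusted in Step 6.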

[0405] (Example 1)

[0406] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0407] Automated content generation systems are required to efficiently generate high-quality visual and textual information based on specific user requests. However, conventional systems have struggled to improve the quality of generated content and flexibly respond to diverse user requests. Furthermore, they lacked sufficient flexibility for remote access.

[0408] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0409] In this invention, the server includes means for automatically generating content based on requests received from users using a generation model, means for presenting the generated content on an information display device and collecting evaluation information from users, means for modifying the generation model based on the collected evaluation information and reflecting the changes in subsequent generation, and means for operating in a cloud environment and enabling access from multiple information display devices. This enables the efficient provision of high-quality content that meets the diverse needs of users, as well as flexible use through remote access.

[0410] A "generative model" is a general term for algorithms and technologies used to automatically generate content based on user requests.

[0411] "Content" refers to creative works that combine visual art and textual information, thereby providing information that meets the user's needs.

[0412] An "information display device" refers to a terminal or device that displays content generated from a server, allowing users to recognize it and input their evaluations.

[0413] "Evaluation information" refers to feedback that users provide on generated content, which is used to improve content quality and modify the generation model.

[0414] A "cloud environment" is a distributed information processing infrastructure that provides computing resources and storage via the internet, enabling flexible access.

[0415] "Information processing equipment" refers to electronic devices such as computers, terminals, and servers that process, store, and communicate data.

[0416] A "prompt statement" is text input into a generative model, and its role is to provide instructions for generating appropriate content in response to a request.

[0417] This invention relates to an automated content generation system using a generative AI model. The system mainly consists of a server, terminals, and users.

[0418] The server generates content based on user requests using a generative model. This generation process utilizes software for natural language processing (e.g., the GPT model). On the server, an algorithm, corresponding to the requested prompt, processes the data and automatically generates content that combines visual art and textual information. For hardware, high-performance computing resources running on a cloud infrastructure (e.g., virtual servers from a cloud provider) are used.

[0419] The terminal plays the role of presenting the generated content to the user. Specifically, it displays the content in a viewable format through the user's browser or dedicated application. The terminal also provides an interface for collecting user feedback and evaluation information.

[0420] Users input requests to the server via their device to generate content tailored to their needs. An example of a prompt might be, "Create a visually appealing poster with bright colors for a summer campaign. The target audience is young people, and the theme is the beach." This provides specific instructions to the generation model, enabling the creation of content suitable for the user's objectives.

[0421] The server also has the functionality to improve the generation model based on the collected evaluation information and reflect that improvement in subsequent content generation. This enables continuous quality improvement and adjustments to meet user requests.

[0422] Because this invention operates in a cloud environment, it can be accessed from multiple information processing devices and can be effectively used in remote work. Therefore, this system provides users with a convenient and flexible content generation platform.

[0423] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0424] Step 1:

[0425] The user enters a request for the content they want to generate via their device. Specifically, they enter prompt text into a form in their browser or a dedicated application. This input is in text format, for example, "Please create a poster for a summer campaign. The target audience is young people, and the theme is the beach." The entered data is sent to the server by the device.

[0426] Step 2:

[0427] The server receives a user request. Based on this, the server constructs internal data to be passed to the generative model. Specifically, it parses the prompt text and extracts the necessary keywords and concepts. This prepares appropriate instructions for the generative AI model. The input is the received request, and the output is prompt data for the generative model.

[0428] Step 3:

[0429] The server generates content using a generative AI model. Prompt data is input to the generative model, and the algorithm processes the data to generate visual designs and text structures. As a result of the processing, image files and text data are generated. The input is prompt data, and the output is content data.

[0430] Step 4:

[0431] The terminal presents the generated content to the user. Specifically, it displays the generated images and text data on the screen for the user to review. The input is the generated content received from the server, and the output is what is displayed to the user.

[0432] Step 5:

[0433] Users evaluate the presented content and input feedback into their devices. Specifically, they write their opinions on the appearance, style, and content as text and send it to the server via their devices. The input is the user's evaluation, and the output is the feedback data sent to the server.

[0434] Step 6:

[0435] The server collects user feedback and uses it to improve the generative model. This feedback is analyzed to adjust the model's parameters, which are then used to improve future generation. This involves an optimization process for the generative model. The input is the feedback, and the output is the adjusted model parameters.
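The six steps above can be illustrated with a minimal sketch. It makes several assumptions that do not come from the specification: the stop-word list, the `ModelParams` class, the stub `generate_content` function (which stands in for a real generative model), and the simple additive update in `adjust_params` are all hypothetical and only show how the inputs and outputs of each step connect.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stop-word list; the source does not say how keywords are extracted.
STOP_WORDS = {"a", "an", "the", "is", "are", "for", "and", "please", "create", "with"}


@dataclass
class ModelParams:
    # Illustrative stand-in for "model parameters": one weight per keyword.
    style_weights: dict = field(default_factory=dict)


def build_prompt_data(request: str) -> dict:
    """Step 2: parse the prompt text and extract keywords (order preserved)."""
    words = re.findall(r"[a-z]+", request.lower())
    keywords = [w for w in dict.fromkeys(words) if w not in STOP_WORDS]
    return {"text": request, "keywords": keywords}


def generate_content(prompt_data: dict, params: ModelParams) -> dict:
    """Step 3: stub generator; a real system would call a generative model here."""
    ranked = sorted(prompt_data["keywords"],
                    key=lambda k: -params.style_weights.get(k, 0.0))
    return {
        "image": f"poster[{', '.join(ranked[:3])}]",
        "text": f"Headline about {ranked[0]}" if ranked else "",
    }


def adjust_params(params: ModelParams, feedback: dict, lr: float = 0.5) -> ModelParams:
    """Step 6: raise or lower keyword weights from feedback scores in [-1, 1]."""
    new_weights = dict(params.style_weights)
    for keyword, score in feedback.items():
        new_weights[keyword] = new_weights.get(keyword, 0.0) + lr * score
    return ModelParams(new_weights)
```

In this sketch, positive feedback on "beach" moves that keyword to the front of the next generated result, which mirrors the idea that adjusted parameters influence subsequent generation.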

[0436] (Application Example 1)

[0437] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0438] In modern advertising production, it is necessary to create high-quality advertising materials quickly and effectively to appeal to the target audience, and to improve their effectiveness in real time. However, traditional methods are time-consuming and rely on manual trial and error, which makes it difficult to improve production efficiency and to measure effectiveness.

[0439] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0440] In this invention, the server includes means for automatically generating advertising materials using a generative model, means for specifying the content of an advertising campaign based on requirements entered by the user through a terminal, and means for collecting feedback information on the generated advertising materials. This enables users to automatically generate high-quality advertising materials in a short time and to improve the advertising content based on feedback.

[0441] A "generative model" is an artificial intelligence algorithm that automatically generates advertising materials based on requirements provided by the user.

[0442] "Advertising materials" refer to content, including visual designs and written expressions, used for promotional purposes.

[0443] "Requirements" refer to information that indicates the target content and design guidelines for an advertising campaign.

[0444] "Feedback information" refers to information that indicates users' evaluations of the generated advertising materials and their requests for improvement.

[0445] "Cloud infrastructure" refers to a virtual computing resource and storage environment where servers are provided over the internet.

[0446] The system implementing this invention is centered around a server running on a cloud infrastructure. Users can input advertising campaign requirements into the server using a device such as a smartphone. The server automatically generates advertising materials based on the specified requirements using a generative AI model. Visual design and text are included as components in this process. The generated advertising materials are displayed on the user's device, and after reviewing them, the user can send feedback with suggestions for improvement or requests for changes.

[0447] The server adjusts the generative model based on this feedback information and incorporates it into subsequent ad generation. This improves the overall ad production capabilities of the system. Specifically, based on prompts such as "Create a summer campaign ad for young people, use a lot of blue in the color palette, and use modern fonts," effective ads for the target audience are generated.

[0448] The hardware used in this process includes the user's device (such as a smartphone or tablet). On the software side, a generative AI model runs on a server, along with a program to control and manage it. This allows users to quickly create advertising materials and improve their quality through real-time feedback.

[0449] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0450] Step 1:

[0451] The user enters the requirements for the advertising campaign using a device such as a smartphone. As a prompt, they send specific instructions to the server, such as, "Create a summer campaign ad targeting young people, use a color palette heavily featuring blue, and use a modern font." The input in this step is the prompt provided by the user, and the output is the requirements data sent to the server.

[0452] Step 2:

[0453] The server executes a generative AI model based on the received requirements data. This model automatically generates advertising materials based on input prompts. Here, it incorporates the specified color palette and font style to construct the visual design and catchphrase. The output obtained through the generation process is advertising material tailored to the user's requirements.

[0454] Step 3:

[0455] The generated advertising material is displayed on the user's device. The user reviews the advertising material and provides feedback on the visual design and wording. Through the input of feedback, specific revision requests, such as color adjustments or font changes, are sent to the server. The output of Step 3 is the feedback information.

[0456] Step 4:

[0457] The server analyzes the collected feedback information and adjusts the generative AI model. Specifically, it tunes model parameters and improves the algorithm to reflect user feedback. This adjustment aims to produce higher quality results in subsequent ad generation. The resulting output is the new state of the adjusted model.
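As a rough illustration of Steps 1 and 3 above, the requirements data and a revision request could be represented as follows. The `AdRequirements` class and its field names are hypothetical. Note also that this sketch applies the requested revision to the next request rather than tuning model parameters; the specification describes parameter tuning but does not say how it is performed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AdRequirements:
    # Hypothetical fields taken from the example prompt in Step 1.
    campaign: str
    audience: str
    palette: str
    font: str


def to_prompt(req: AdRequirements) -> str:
    """Turn structured requirements into the prompt text sent to the server."""
    return (f"Create a {req.campaign} ad targeting {req.audience}, "
            f"use a color palette heavily featuring {req.palette}, "
            f"and use a {req.font} font.")


def apply_feedback(req: AdRequirements, feedback: dict) -> AdRequirements:
    """Apply revision requests; only 'palette' and 'font' are accepted here."""
    allowed = {k: v for k, v in feedback.items() if k in ("palette", "font")}
    return replace(req, **allowed)
```

For example, feedback of `{"palette": "teal"}` yields a new requirements object whose prompt asks for teal instead of blue, while the original object stays unchanged.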

[0458] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0459] This invention combines an automated content generation system using a generative model with an emotion engine that recognizes user emotions. It primarily consists of a server, a terminal, and a user. The emotion engine understands the user's emotional state and generates and adjusts content based on that information.

[0460] Users access the system via a terminal and enter project requirements. These requirements include the theme, purpose, target audience, and desired design and tone style. This information is sent to the server and analyzed.

[0461] The server generates optimal content based on user requirements and emotional information provided by the emotion engine. The emotion engine recognizes emotions based on the user's facial expressions, voice, or text input, and feeds this data back to the server. By considering this emotional information, content is generated in a way that adapts to the user's psychological state.

[0462] For example, if a user wants to create a "promotional video," the emotion engine recognizes that the user's current emotions are positive based on facial expressions and voice tone collected through the device's camera and microphone. Based on this information, the server generates the video by selecting music with a cheerful atmosphere and brightening the colors.

[0463] The generated content is presented to the user via their device, and their evaluation of that content is collected as feedback. Furthermore, the feedback and emotional information are fed back into the server's generative model and used to improve content quality. The integration of the emotion engine enables the rapid delivery of more personalized, high-quality content.

[0464] Thus, the present invention enables the creation of adaptive content that responds to the user's emotional state, maximizing the efficiency of creative production resources. Implementing it in a cloud environment allows access from multiple information processing devices and supports remote work.

[0465] The following describes the processing flow.

[0466] Step 1:

[0467] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0468] Step 2:

[0469] The emotion engine uses the device's camera and microphone to recognize the user's emotional state from their facial expressions and voice. This emotional information is analyzed in real time and sent to the server.

[0470] Step 3:

[0471] The server analyzes the requirements and emotional information received from the terminal and determines the appropriate generative model parameters. This process selects specific design templates and content tones that correspond to the user's emotions.

[0472] Step 4:

[0473] The server runs a generative model based on the analysis results to generate content. For visual design generation, it selects colors and adjusts layouts to consider user emotions. For text content generation, it adopts styles and language appropriate to those emotions.

[0474] Step 5:

[0475] The generated content is sent to the device and displayed to the user. The user reviews the content and provides feedback, including comments and suggestions for revisions.

[0476] Step 6:

[0477] The device collects user feedback and sends it to the server. This information includes specific suggestions for further improving the content.

[0478] Step 7:

[0479] The server analyzes emotional information and user feedback, using it to refine the generative model. This analysis allows for further customization of content generation to better reflect user emotions.

[0480] Step 8:

[0481] The server hosts the optimized generative model in the cloud environment and prepares for the next content generation request. The process then repeats from Step 1.
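The parameter selection in Step 3 of this flow can be sketched as a lookup from an estimated emotion to generation settings. Only the "positive" case (bright colors, cheerful atmosphere) follows the example given in the description; the "negative" preset, the neutral fallback, the confidence threshold, and all names below are assumptions made for illustration.

```python
from typing import NamedTuple


class GenerationParams(NamedTuple):
    template: str
    palette: str
    tone: str


# Hypothetical presets; labels and values are illustrative only.
EMOTION_PRESETS = {
    "positive": GenerationParams("dynamic", "bright", "cheerful"),
    "negative": GenerationParams("minimal", "soft", "reassuring"),
}
NEUTRAL = GenerationParams("standard", "balanced", "neutral")


def select_params(emotion: str, confidence: float,
                  threshold: float = 0.6) -> GenerationParams:
    """Use neutral settings when the estimate is uncertain or the label is unknown."""
    if confidence < threshold:
        return NEUTRAL
    return EMOTION_PRESETS.get(emotion, NEUTRAL)
```

Falling back to neutral settings is one possible design choice: it avoids strongly styling the content on the basis of a weak or unrecognized emotion estimate.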

[0482] (Example 2)

[0483] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0484] Conventional content generation systems face the challenge of generating personalized information that effectively reflects user emotions and feedback. Furthermore, there is a need for efficient methods to quickly respond to the diverse needs of users. Additionally, improving usability in environments accessible from multiple data processing devices is another challenge.

[0485] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0486] In this invention, the server includes means for automatically generating information using a generative model, means for collecting the user's emotional information from a terminal, and means for adjusting the automatically generated information based on the collected emotional information. This enables the rapid reflection of user emotions and feedback, and the generation of personalized and high-quality information.

[0487] A "generative model" refers to an algorithm that automatically creates new information based on data.

[0488] "Information" refers to the elements that make up content, such as visual representations and textual information.

[0489] A "terminal" is a device used by users to access and operate an information processing system.

[0490] "User" refers to an individual or organization that generates or receives information using an information processing system.

[0491] "Emotional information" refers to data that indicates the psychological state of a user, analyzed based on factors such as their facial expressions and tone of voice.

[0492] "Evaluation information" refers to feedback data that reflects users' opinions and impressions about the generated information.

[0493] A "virtual environment" refers to a network infrastructure, often provided using cloud technology, that allows a large number of data processing devices to connect.

[0494] A "data processing device" refers to a computer device used for generating, processing, and transmitting information.

[0495] This invention is a system that automatically generates information using a generative AI model. Its main components are a server, a terminal, and a user.

[0496] Server

[0497] The server is equipped with software to run a generative AI model and automatically generates information. The server receives prompt messages sent from the user through the terminal and analyzes the text information. Then, based on the analysis results and the emotional information provided by the emotion engine, it generates information with visual representations and textual information. As a result, the generated information becomes content that reflects the user's emotions and feedback.

[0498] Terminal

[0499] A terminal is a hardware device used by users to access the system and input prompt messages. The terminal is equipped with a camera and microphone, which are used to collect emotional information such as the user's facial expressions and tone of voice. This data is transmitted to the server in real time and used for emotion analysis.

[0500] User

[0501] The user enters project requirements using a terminal. The entered requirements include theme, purpose, target audience, and design and tone style, and this information is sent to the server as a prompt. For example, by entering a prompt such as, "Please create a promotional video. The theme is 'Summer Vacation,' and I would like a positive and energetic tone. The target audience is young people aged 18-25," it is possible to communicate specific generation requirements to the system.

[0502] Thus, this invention enables the efficient and rapid delivery of personalized, high-quality content based on the individual emotions and feedback of users. Because this system operates in a virtual environment, it allows for flexible access from various data processing devices and achieves high scalability in its operation.

[0503] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0504] Step 1:

[0505] The user inputs project requirements through a terminal. Specifically, they use the terminal's input interface to create prompt statements, which include the theme, purpose, target audience, design, and tone style. This input information is sent to the server. The entered prompt statements reach the server and are ready for analysis.

[0506] Step 2:

[0507] The terminal collects the user's emotional information. Using its built-in camera and microphone, it collects facial expression and voice tone data in real time. This data is immediately sent to the server and analyzed by the emotion engine. The input is the facial expression and voice tone data sent to the server, and the output is the emotional information.

[0508] Step 3:

[0509] The server uses natural language processing techniques to analyze the received prompt message and understand its content. It combines the analyzed information with the emotional information recognized by the emotion engine to prepare a dataset for input to the generative AI model. The input is the prompt message and the emotional information, and the output is the dataset.

[0510] Step 4:

[0511] The server uses a generative AI model to construct automatically generated information. The model generates visual representations and textual information based on the input dataset. For example, if positive emotional information is included, bright and cheerful content will be generated. The output obtained from the input dataset is the automatically generated information.

[0512] Step 5:

[0513] The terminal presents the generated information to the user. In this presentation, the generated visuals and text are displayed on the terminal's screen for the user to review. The generated information is then provided to the user as output.

[0514] Step 6:

[0515] The user provides feedback on the presented information. Using the terminal's feedback input interface, they input an evaluation, such as how well the information matches their expectations. The input is the user's feedback, and the output is the feedback data sent to the server.

[0516] Step 7:

[0517] The server analyzes the collected feedback and uses it, along with emotional information, to adjust the AI model. The feedback improves the model's output accuracy and adaptability, which in turn improves future information generation. The input is the feedback and emotional information, and the output is the adjusted model.
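Steps 2 and 3 of this flow (collecting emotional information and combining it with the prompt message) can be sketched as follows. The representation of emotion readings as `(label, score)` pairs, the score-summing rule, and the JSON payload layout are all assumptions; the specification does not define the dataset format.

```python
import json
from collections import defaultdict


def dominant_emotion(readings) -> str:
    """Sum scores per label over a window of (label, score) readings."""
    totals = defaultdict(float)
    for label, score in readings:
        totals[label] += score
    if not totals:
        return "unknown"
    return max(totals, key=totals.get)


def build_model_input(prompt: str, readings) -> str:
    """Step 3: combine the prompt message and emotion into one dataset (JSON)."""
    payload = {"prompt": prompt, "emotion": dominant_emotion(readings)}
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```

Aggregating over a short window rather than using a single frame is one way to keep a momentary expression from dominating the emotion passed to the generative AI model.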

[0518] (Application Example 2)

[0519] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0520] Current content generation systems have a problem in that they struggle to create personalized content that adapts to users' emotions. In particular, conventional technologies are insufficient to dynamically reflect a user's psychological state and provide content optimized for that moment. Therefore, new technologies are needed to improve the user experience and provide high-quality content.

[0521] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0522] In this invention, the server includes means for automatically generating content using a generative model, means for adjusting the generative model based on evaluation information and the user's psychological state, and means for recognizing the user's emotions and generating recommendation information based on that information. This makes it possible to generate and present optimal content that is in line with the user's emotions.

[0523] A "generative model" is an algorithm for automatically generating new content based on data.

[0524] "Evaluation information" refers to information based on feedback and comments received from users regarding the generated content.

[0525] "Psychological state" refers to the internal state that represents the user's emotions and mood.

[0526] "Means of recognizing emotions" refers to technologies that analyze a user's emotions from their facial expressions, tone of voice, and other factors.

[0527] "Recommendation information" refers to information used to present the most suitable content based on the user's emotional state.

[0528] The specific system for implementing this invention generates and recommends content that takes into account the user's psychological state. The user connects to the system using a terminal and provides their emotional state through a camera and microphone. This information is transmitted from the terminal to the server.

[0529] The server utilizes emotion analysis APIs (e.g., Affectiva or Microsoft Azure Face API) as a means of recognizing emotions. The resulting emotion data is processed to analyze the user's psychological state. Next, the server uses a generative model to generate appropriate recommendations based on the emotion information. At this stage, content best suited to the user's emotions is selected.

[0530] The generated content information is sent to the user's terminal via the network and presented as part of the user's experience. This allows users to enjoy more personalized content in real time.

[0531] For example, if a user is seeking relaxation after work, the system will recognize the user's tired expression and recommend videos of relaxing nature scenes or music. An example of a prompt for the generative AI model would be, "Consider the most suitable video content for a user seeking relaxation and create a recommendation list."

[0532] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0533] Step 1:

[0534] The user accesses the system through a terminal. Here, the user uses a camera and microphone to record their facial expressions and voice, and sends this data to the server. The input is the user's facial expression data and voice data, and the output is the transmission of this data to the server. The data is encrypted and transmitted in a secure manner.

[0535] Step 2:

[0536] The server analyzes the received facial expression and voice data. It uses an emotion analysis API to analyze the user's emotional state. The input is facial expression and voice data, and the emotional information resulting from the analysis is output. In this step, the emotion recognition engine is used to determine, for example, whether the user is feeling "relaxed" or "stressed."

[0537] Step 3:

[0538] The server uses a generative model to generate appropriate content based on the analyzed emotional information. The input is the emotional information, and the output is a content recommendation list for the user. The generative AI model creates a video list based on a prompt such as "Create a video suitable for when the user is seeking relaxation."

[0539] Step 4:

[0540] The server sends the generated content recommendation list to the user's device. The input is the list of recommended content, and the output is the provision of the list to the user's device. The data is transmitted in a streaming or downloadable format.

[0541] Step 5:

[0542] Users view the content provided on their devices and enjoy the experience. They play the output content and provide feedback. The user's feedback information is sent back to the server for use in subsequent analysis and adjustment of the generative model. The input is the viewer's feedback, and the output is the provision of feedback data to the server.
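Steps 2 through 4 above can be illustrated with a small sketch that turns a recognized emotion into a prompt and a recommendation list. The `NEEDS` mapping, the catalog structure, and the rule-based `recommend` function are hypothetical stand-ins; in the described system the list is produced by the generative model, and the emotion label comes from an emotion analysis API that is not called here.

```python
# Hypothetical mapping from a recognized emotion to the need it suggests.
NEEDS = {"tired": "relaxation", "stressed": "relaxation", "bored": "excitement"}

# Illustrative catalog; titles and tags are invented for this example.
CATALOG = [
    {"title": "Forest stream", "suits": {"relaxation"}},
    {"title": "Street race", "suits": {"excitement"}},
    {"title": "Ocean waves", "suits": {"relaxation"}},
]


def build_prompt(emotion: str) -> str:
    """Build the prompt for the generative model from the recognized emotion."""
    need = NEEDS.get(emotion, "entertainment")
    return (f"Consider the most suitable video content for a user seeking {need} "
            "and create a recommendation list.")


def recommend(emotion: str, catalog=CATALOG, limit: int = 3) -> list:
    """Rule-based stand-in for the model: keep items tagged with the user's need."""
    need = NEEDS.get(emotion, "entertainment")
    return [item["title"] for item in catalog if need in item["suits"]][:limit]
```

With the sample catalog, a "tired" user receives the two items tagged for relaxation, in catalog order.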

[0543] The specific processing unit 290 transmits the result of the specific processing to the headset-type terminal 314. In the headset-type terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0544] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0545] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0546] [Fourth Embodiment]

[0547] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0548] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0549] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0550] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0551] The microphone 238 receives instructions from the user 20 by capturing the voice of the user 20, converting the captured voice into audio data, and outputting the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.

[0552] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0553] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0554] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
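The relationship between an emotion and the controlled object 443 can be sketched as a lookup table. The emotion labels, LED states, and arm angles below are hypothetical values chosen for illustration; the specification states only that motors and eye LEDs are controlled to express emotions and facial expressions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expression:
    eye_led: str        # illumination state of the eye LEDs
    arm_angle_deg: int  # target angle for the arm motors


# Hypothetical expression table; values are illustrative only.
EXPRESSIONS = {
    "happy": Expression(eye_led="bright", arm_angle_deg=60),
    "sad": Expression(eye_led="dim", arm_angle_deg=-20),
}
DEFAULT = Expression(eye_led="steady", arm_angle_deg=0)


def express(emotion: str) -> Expression:
    """Return the LED and motor targets for an emotion, or a neutral pose."""
    return EXPRESSIONS.get(emotion, DEFAULT)
```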

[0555] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0556] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0557] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0558] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0559] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0560] This invention relates to an automated content generation system using a generative model, and mainly consists of a server, a terminal, and a user. The server automatically generates content using the generative model based on requirements provided by the user through the terminal. The generated content includes visual design and text content. The server collects user evaluation information on the generated content via the terminal.

[0561] Users specify themes, objectives, and target audiences that suit their needs to the server and request content generation. For example, if advertising materials are needed, the user would specify "promotional advertising for a new product," and the server would generate relevant images and text based on the specified requirements.

[0562] The device presents the generated content to the user and collects feedback. This feedback includes suggestions for improvements regarding the design's color scheme, text style, and content structure.

[0563] The server adjusts the generative model based on the collected evaluation information and reflects the adjustment in the next content generation. This enables continuous quality improvement. Furthermore, because this system operates in a cloud environment, it can be accessed from different information processing devices and can be used effectively even in remote work environments.

[0564] For example, when a user requests the creation of a "summer campaign poster," the server automatically generates a poster that reflects seasonal colors and graphics, and also suggests a catchy slogan. Furthermore, it can be further improved based on user feedback. This process makes it possible to achieve both increased efficiency and improved quality in creative production.

[0565] The following describes the processing flow.

[0566] Step 1:

[0567] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0568] Step 2:

[0569] The server analyzes the requirements received from the terminal and uses this to determine the parameters of the generative model. During this process, it selects appropriate design templates and writing styles.

[0570] Step 3:

[0571] The server operates a generative model based on the analysis results to generate content. For visual designs, it selects relevant images from a material library and applies them to a template. For text, it creates sentences in the specified tone.

[0572] Step 4:

[0573] The server sends the generated content to the device and prompts the user for review. The user reviews the content on the device and provides feedback with any requested revisions.

[0574] Step 5:

[0575] The device collects user feedback and sends it to the server. The collected feedback includes suggestions for improvement and specific points to note.

[0576] Step 6:

[0577] The server analyzes the feedback information and uses it to adjust the generative model. This analysis clarifies areas for improvement to enhance the quality of content generated in the future.

[0578] Step 7:

[0579] The server restarts the process by hosting the optimized generative model in the cloud environment and preparing for the next content generation request.
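Step 3 of this flow, in which images are selected from a material library and applied to a template, can be sketched as follows. The library entries, the tag-overlap selection rule, and the text template are assumptions; a real system would place image data into a layout rather than format a string.

```python
from string import Template

# Hypothetical material library; file names and tags are invented.
MATERIAL_LIBRARY = [
    {"file": "beach.png", "tags": {"summer", "beach"}},
    {"file": "snow.png", "tags": {"winter"}},
    {"file": "sun.png", "tags": {"summer"}},
]

# Illustrative text template standing in for a design template.
POSTER_TEMPLATE = Template("[$image] $headline ($tone tone)")


def pick_image(keywords: set) -> str:
    """Return the library image sharing the most tags with the requirements."""
    best = max(MATERIAL_LIBRARY, key=lambda m: len(m["tags"] & keywords))
    return best["file"]


def render_poster(keywords: set, headline: str, tone: str) -> str:
    """Apply the selected image and text to the template."""
    return POSTER_TEMPLATE.substitute(
        image=pick_image(keywords), headline=headline, tone=tone
    )
```

When several images share the same number of matching tags, `max` returns the first one in library order, so the order of the library acts as a tie-breaker in this sketch.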

[0580] (Example 1)

[0581] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0582] Automated content generation systems are required to efficiently generate high-quality visual and textual information based on specific user requests. However, conventional systems have struggled to improve the quality of generated content and flexibly respond to diverse user requests. Furthermore, they lacked sufficient flexibility for remote access.

[0583] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0584] In this invention, the server includes means for using a generative model to automatically generate content based on requests received from users, means for presenting the generated content on an information display device and collecting evaluation information from users, means for modifying the generative model based on the collected evaluation information and reflecting the changes in subsequent generation, and means for operating in a cloud environment and enabling access from multiple information display devices. This enables the efficient provision of high-quality content that meets the diverse needs of users, as well as flexible use through remote access.

[0585] A "generative model" is a general term for algorithms and technologies used to automatically generate content based on user requests.

[0586] "Content" refers to creative works that combine visual art and textual information, thereby providing information that meets the user's needs.

[0587] An "information display device" refers to a terminal or device that displays content generated by the server, allowing users to view it and input their evaluations.

[0588] "Evaluation information" refers to feedback that users provide on generated content, which is used to improve content quality and modify the generative model.

[0589] A "cloud environment" is a distributed information processing infrastructure that provides computing resources and storage via the internet, enabling flexible access.

[0590] An "information processing device" refers to an electronic device such as a computer, terminal, or server that processes, stores, and communicates data.

[0591] A "prompt statement" is text input into a generative model, and its role is to provide instructions for generating appropriate content in response to a request.

[0592] This invention relates to an automated content generation system using a generative AI model. The system mainly consists of a server, terminals, and users.

[0593] The server generates content based on user requests using a generative model. This generation process utilizes software for natural language processing (e.g., the GPT model). On the server, an algorithm, corresponding to the requested prompt, processes the data and automatically generates content that combines visual art and textual information. For hardware, high-performance computing resources running on a cloud infrastructure (e.g., virtual servers from a cloud provider) are used.

[0594] The terminal plays the role of presenting the generated content to the user. Specifically, it displays the content in a viewable format through the user's browser or dedicated application. The terminal also provides an interface for collecting user feedback and evaluation information.

[0595] Users input requests to the server via their device to generate content tailored to their needs. An example of a prompt might be, "Create a visually appealing poster with bright colors for a summer campaign. The target audience is young people, and the theme is the beach." This provides specific instructions to the generative model, enabling the creation of content suitable for the user's objectives.
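A prompt such as the one above has to be reduced to keywords and concepts before it is handed to the generative model. The sketch below shows one very simple way to do that. The stop-word list, the function name `build_prompt_data`, and the returned dictionary layout are assumptions for illustration; the source does not specify how parsing is performed, and a production system would more likely use a proper natural language processing library.

```python
import re

# Hypothetical stop-word list, kept tiny for illustration.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "for", "please", "create", "with", "of", "to"}


def build_prompt_data(request_text: str) -> dict:
    """Extract unique keywords, in order of first appearance, from a request."""
    words = re.findall(r"[a-z]+", request_text.lower())
    keywords = []
    for word in words:
        if word not in STOP_WORDS and word not in keywords:
            keywords.append(word)
    # Keep the original request alongside the keywords so nothing is lost.
    return {"request": request_text, "keywords": keywords}
```

Keeping the original text in the result means the model can still see the full request, while the keyword list is available for template or material lookup.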

[0596] The server also has the functionality to improve the generative model based on the collected evaluation information and reflect that improvement in subsequent content generation. This enables continuous quality improvement and adjustments to meet user requests.

[0597] Because this invention operates in a cloud environment, it can be accessed from multiple information processing devices and can be effectively used in remote work. Therefore, this system provides users with a convenient and flexible content generation platform.

[0598] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0599] Step 1:

[0600] The user enters a request for the content they want to generate via their device. Specifically, they enter prompt text into a form in their browser or a dedicated application. This input is in text format, for example, "Please create a poster for a summer campaign. The target audience is young people, and the theme is the beach." The entered data is sent to the server by the device.

[0601] Step 2:

[0602] The server receives a user request. Based on this, the server constructs internal data to be passed to the generative model. Specifically, it parses the prompt text and extracts the necessary keywords and concepts. This prepares appropriate instructions for the generative AI model. The input is the received request, and the output is prompt data for the generative model.

[0603] Step 3:

[0604] The server generates content using a generative AI model. Prompt data is input to the generative model, and the algorithm processes the data to generate visual designs and text structures. As a result of the processing, image files and text data are generated. The input is prompt data, and the output is content data.

[0605] Step 4:

[0606] The terminal presents the generated content to the user. Specifically, it displays the generated images and text data on the screen for the user to review. The input is the generated content received from the server, and the output is what is displayed to the user.

[0607] Step 5:

[0608] Users evaluate the presented content and input feedback into their devices. Specifically, they write their opinions on the appearance, style, and content in text and send it to the server via their devices. The input is evaluation data from the user, and the output is sent to the server.

[0609] Step 6:

[0610] The server collects user feedback and uses it to improve the generative model. This feedback is analyzed to adjust the model's parameters, which are then used to improve future generation. This involves an optimization process for the generative model. The input is the feedback, and the output is the adjusted model parameters.
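The source does not say how feedback is converted into adjusted model parameters in Step 6. As one hedged illustration, the sketch below treats the "parameters" as a small set of numeric style preferences and nudges each one toward the value the feedback asks for. The parameter names, the `rate` value, and the update rule itself are all assumptions; real fine-tuning of a generative model would be considerably more involved.

```python
def adjust_parameters(params: dict[str, float], feedback: dict[str, float], rate: float = 0.2) -> dict[str, float]:
    """Move each parameter a fraction `rate` toward the value requested in feedback."""
    adjusted = dict(params)  # Copy so the caller's parameters are not mutated.
    for name, target in feedback.items():
        current = adjusted.get(name, 0.0)
        adjusted[name] = current + rate * (target - current)
    return adjusted
```

Using a small `rate` keeps a single piece of feedback from overwriting what earlier feedback established, which matches the idea of continuous, incremental improvement described above.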

[0611] (Application Example 1)

[0612] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0613] In modern advertising production, it's necessary to create high-quality advertising materials quickly and effectively to appeal to the target audience, and to improve their effectiveness in real time. However, traditional methods are time-consuming and require manual trial and error, making improvements in production efficiency and effectiveness measurement a challenge.

[0614] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0615] In this invention, the server includes means for automatically generating advertising materials using a generative model, means for specifying the content of an advertising campaign based on requirements entered by the user through a terminal, and means for collecting feedback information on the generated advertising materials. This enables users to automatically generate high-quality advertising materials in a short time and to improve the advertising content based on feedback.

[0616] A "generative model" is an artificial intelligence algorithm that automatically generates advertising materials based on requirements provided by the user.

[0617] "Advertising materials" refer to content, including visual designs and written expressions, used for promotional purposes.

[0618] "Requirements" refer to information that indicates the target content and design guidelines for an advertising campaign.

[0619] "Feedback information" refers to information that indicates users' evaluations of the generated advertising materials and their requests for improvement.

[0620] "Cloud infrastructure" refers to a virtual computing resource and storage environment where servers are provided over the internet.

[0621] The system implementing this invention is centered around a server running on a cloud infrastructure. Users can input advertising campaign requirements into the server using a device such as a smartphone. The server automatically generates advertising materials based on the specified requirements using a generative AI model. Visual design and text are included as components in this process. The generated advertising materials are displayed on the user's device, and after reviewing them, the user can send feedback with suggestions for improvement or requests for changes.
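The requirements that the terminal sends to the server could be packaged as structured data. The following sketch assumes a JSON request body; the `AdRequirements` field names, the `"ad_requirements"` type tag, and the use of JSON at all are illustrative assumptions, since the source does not define a wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AdRequirements:
    campaign: str
    audience: str
    palette: str
    font_style: str


def to_request_body(req: AdRequirements) -> str:
    """Serialize the campaign requirements into a JSON string for the server."""
    payload = {"type": "ad_requirements", "requirements": asdict(req)}
    # sort_keys gives a stable field order, which simplifies logging and testing.
    return json.dumps(payload, sort_keys=True)
```

For example, `to_request_body(AdRequirements("summer campaign", "young people", "blue", "modern"))` produces a body that the server can decode with `json.loads` to recover each requirement by name.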

[0622] The server adjusts the generative model based on this feedback information and applies the adjustment to subsequent ad generation. This improves the overall ad production capabilities of the system. Specifically, based on prompts such as "Create a summer campaign ad for young people, use a lot of blue in the color palette, and use modern fonts," effective ads for the target audience are generated.

[0623] The hardware used in this process includes the user's device (such as a smartphone or tablet). On the software side, a generative AI model runs on a server, along with a program to control and manage it. This allows users to quickly create advertising materials and improve their quality through real-time feedback.

[0624] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0625] Step 1:

[0626] The user enters the requirements for the advertising campaign using a device such as a smartphone. As a prompt, they send specific instructions to the server, such as, "Create a summer campaign ad targeting young people, use a color palette heavily featuring blue, and use a modern font." The input in this step is the prompt provided by the user, and the output is the requirements data sent to the server.

[0627] Step 2:

[0628] The server executes a generative AI model based on the received requirements data. This model automatically generates advertising materials based on input prompts. Here, it incorporates the specified color palette and font style to construct the visual design and catchphrase. The output obtained through the generation process is advertising material tailored to the user's requirements.

[0629] Step 3:

[0630] The generated advertising material is displayed on the user's device. The user reviews the advertising material and provides feedback on the visual design and wording. Through the input of feedback, specific revision requests, such as color adjustments or font changes, are sent to the server. The output of Step 3 is the feedback information.

[0631] Step 4:

[0632] The server analyzes the collected feedback information and adjusts the generative AI model. Specifically, it tunes model parameters and improves the algorithm to reflect user feedback. This adjustment aims to produce higher quality results in subsequent ad generation. The resulting output represents the new state of the adjusted model.
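Before the model can be adjusted in Step 4, the free-text revision requests collected in Step 3 have to be turned into something structured. The sketch below sorts feedback into revision categories by keyword matching. The category names and keyword lists are assumptions chosen to match the examples in the text (color adjustments, font changes); the source does not describe the analysis method.

```python
# Hypothetical keyword table; categories follow the revision examples in Step 3.
REVISION_KEYWORDS = {
    "color": ("color", "palette", "blue", "brighter", "darker"),
    "font": ("font", "typeface", "lettering"),
    "wording": ("wording", "slogan", "catchphrase", "text"),
}


def classify_feedback(text: str) -> list[str]:
    """Return the sorted revision categories mentioned in a feedback message."""
    lowered = text.lower()
    return sorted(
        category
        for category, words in REVISION_KEYWORDS.items()
        if any(word in lowered for word in words)
    )
```

Substring matching is deliberately crude here; it only shows where a classifier would sit between feedback collection and model adjustment.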

[0633] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0634] This invention combines an automated content generation system using a generative model with an emotion engine that recognizes user emotions. It primarily consists of a server, a terminal, and a user. The emotion engine understands the user's emotional state and generates and adjusts content based on that information.

[0635] Users access the system via a terminal and enter project requirements. These requirements include the theme, purpose, target audience, and desired design and tone style. This information is sent to the server and analyzed.

[0636] The server generates optimal content based on user requirements and emotional information provided by the emotion engine. The emotion engine recognizes emotions based on the user's facial expressions, voice, or text input, and feeds this data back to the server. By considering this emotional information, content is generated in a way that adapts to the user's psychological state.

[0637] For example, if a user wants to create a "promotional video," the emotion engine recognizes that the user's current emotions are positive based on facial expressions and voice tone collected through the device's camera and microphone. Based on this information, the server generates the video by selecting music with a cheerful atmosphere and brightening the colors.
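The mapping from a recognized emotion to generation choices can be sketched as a simple lookup. Only the "positive" entry follows the example above (cheerful music, bright colors); the "negative" entry, the neutral default, and all key names are assumptions added for illustration.

```python
STYLE_BY_EMOTION = {
    # Taken from the example in the text: cheerful music and brighter colors.
    "positive": {"colors": "bright", "music": "cheerful", "tone": "energetic"},
    # Assumed counterpart; the source gives no example for negative emotions.
    "negative": {"colors": "soft", "music": "calm", "tone": "gentle"},
}
DEFAULT_STYLE = {"colors": "balanced", "music": "ambient", "tone": "neutral"}


def style_for(emotion: str) -> dict:
    """Return a fresh copy of the style settings for the recognized emotion."""
    return dict(STYLE_BY_EMOTION.get(emotion, DEFAULT_STYLE))
```

Returning a copy lets the caller tweak the style for one request without changing the shared table.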

[0638] The generated content is presented to the user via their device, and their evaluation of that content is collected as feedback. Furthermore, the feedback and emotion information are fed back into the server's generative model and used to improve content quality. The integration of the emotion engine enables the rapid delivery of more personalized, high-quality content.

[0639] Thus, the present invention enables the creation of adaptive content that responds to the user's emotional state, maximizing the efficiency of creative production resources. Implementing it in a cloud environment allows access from multiple information processing devices and supports remote work.

[0640] The following describes the processing flow.

[0641] Step 1:

[0642] Users access the system via a terminal and enter project requirements, including the theme, purpose, target audience, and desired design and tone style.

[0643] Step 2:

[0644] The emotion engine uses the device's camera and microphone to recognize the user's emotional state from their facial expressions and voice. This emotional information is analyzed in real time and sent to the server.

[0645] Step 3:

[0646] The server analyzes the requirements and emotion information received from the terminal and determines the appropriate generative model parameters. This process selects specific design templates and content tones that correspond to the user's emotions.

[0647] Step 4:

[0648] The server runs a generative model based on the analysis results to generate content. For visual design generation, it selects colors and adjusts layouts to consider user emotions. For text content generation, it adopts styles and language appropriate to those emotions.

[0649] Step 5:

[0650] The generated content is sent to the device and displayed to the user. The user reviews the content and provides feedback, including comments and suggestions for revisions.

[0651] Step 6:

[0652] The device collects user feedback and sends it to the server. This information includes specific suggestions for further improving the content.

[0653] Step 7:

[0654] The server analyzes emotional information and user feedback, using it to refine the generative model. This analysis allows for further customization of content generation to better reflect user emotions.

[0655] Step 8:

[0656] The server hosts the optimized generative model in the cloud environment, preparing for the next content generation request. This restarts the process, and it repeats.

[0657] (Example 2)

[0658] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0659] Conventional content generation systems face the challenge of generating personalized information that effectively reflects user emotions and feedback. Furthermore, there is a need for efficient methods to quickly respond to the diverse needs of users. Additionally, improving usability in environments accessible from multiple data processing devices is another challenge.

[0660] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0661] In this invention, the server includes means for automatically generating information using a generative model, means for collecting user emotion information from a terminal, and means for adjusting the automatically generated information based on the collected emotion information. This enables the rapid reflection of user emotions and feedback, and the generation of personalized and high-quality information.

[0662] A "generative model" refers to an algorithm that automatically creates new information based on data.

[0663] "Information" refers to the elements that make up content, such as visual representations and textual information.

[0664] A "terminal" is a device used by users to access and operate an information processing system.

[0665] "User" refers to an individual or organization that generates or receives information using an information processing system.

[0666] "Emotional information" refers to data that indicates the psychological state of a user, analyzed based on factors such as their facial expressions and tone of voice.

[0667] "Evaluation information" refers to feedback data that reflects users' opinions and impressions about the generated information.

[0668] A "virtual environment" refers to a network infrastructure, often provided using cloud technology, that allows a large number of data processing devices to connect.

[0669] A "data processing device" refers to a computer device used for generating, processing, and transmitting information.

[0670] This invention is a system that automatically generates information using a generative AI model. Its main components are a server, a terminal, and a user.

[0671] Server

[0672] The server is equipped with software to run a generative AI model and plays the role of automatically generating information. The server receives prompt statements sent from the user through the terminal and analyzes the text information. Then, based on the analysis results and the emotion information provided by the emotion engine, it generates information with visual representations and textual information. As a result, the generated information becomes content that reflects the user's emotions and feedback.

[0673] Terminal

[0674] A terminal is a hardware device used by users to access the system and input prompt statements. The terminal is equipped with a camera and microphone, which are used to collect emotional information such as the user's facial expressions and tone of voice. This data is transmitted to the server in real time and used for emotion analysis.

[0675] User

[0676] The user enters project requirements using a terminal. The entered requirements include theme, purpose, target audience, and design and tone style, and this information is sent to the server as a prompt. For example, by entering a prompt such as, "Please create a promotional video. The theme is 'Summer Vacation,' and I would like a positive and energetic tone. The target audience is young people aged 18-25," it is possible to communicate specific generation requirements to the system.

[0677] Thus, this invention enables the efficient and rapid delivery of personalized, high-quality content based on the individual emotions and feedback of users. Because this system operates in a virtual environment, it allows for flexible access from various data processing devices and achieves high scalability in its operation.

[0678] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0679] Step 1:

[0680] The user inputs project requirements through a terminal. Specifically, they use the terminal's input interface to create prompt statements, which include the theme, purpose, target audience, design, and tone style. This input information is sent to the server. The entered prompt statements reach the server and are ready for analysis.

[0681] Step 2:

[0682] The terminal collects the user's emotional information. Using its built-in camera and microphone, it collects facial expression and voice tone data in real time. This data is immediately sent to the server and analyzed by the emotion engine. The input is the facial expression and voice tone data, and the output is the emotional information.

[0683] Step 3:

[0684] The server uses natural language processing techniques to analyze the received prompt message and understand its content. It combines the analyzed information with the emotion information recognized by the emotion engine to prepare a dataset for input to the generative AI model. The output dataset is formed from the input prompt message and emotion information.

[0685] Step 4:

[0686] The server uses a generative AI model to construct automatically generated information. The model generates visual representations and textual information based on the input dataset. For example, if positive emotional information is included, bright and cheerful content will be generated. The output obtained from the input dataset is the automatically generated information.

[0687] Step 5:

[0688] The terminal presents the generated information to the user. In this presentation, the generated visuals and text are displayed on the terminal's screen for the user to review. The generated information is then provided to the user as output.

[0689] Step 6:

[0690] The user provides feedback on the presented information. Using the terminal's feedback input interface, they input an evaluation, such as how well the information matches their expectations. The input is the feedback entered by the user, and the output is the feedback data sent to the server.

[0691] Step 7:

[0692] The server analyzes the collected feedback and uses it, along with emotional information, to adjust the AI model. The adjustment improves the model's output accuracy and adaptability for future information generation. The input is the feedback and emotional information, and the output is the adjusted model.
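Step 3 of this flow combines the analyzed prompt with the emotion information to form the dataset passed to the generative AI model. A minimal sketch of that combination is shown below. The dictionary layout, the wording appended to the instruction, and the 0.5 confidence threshold are assumptions for illustration; the source states only that the two inputs are combined.

```python
def build_model_input(prompt_text: str, emotion: str, confidence: float) -> dict:
    """Merge the user's prompt with recognized emotion into one model input."""
    instruction = prompt_text.strip()
    # Only steer the output by emotion when recognition is reasonably confident
    # (the 0.5 threshold is an assumed value).
    if confidence >= 0.5:
        instruction += f" The user currently appears {emotion}; match that mood."
    return {
        "instruction": instruction,
        "emotion": {"label": emotion, "confidence": confidence},
    }
```

Keeping the raw emotion label and confidence in the result, in addition to the rewritten instruction, lets the later model adjustment step see what the engine reported.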

[0693] (Application Example 2)

[0694] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0695] Current content generation systems have a problem in that they struggle to create personalized content that adapts to users' emotions. In particular, conventional technologies are insufficient to dynamically reflect a user's psychological state and provide content optimized for that moment. Therefore, new technologies are needed to improve the user experience and provide high-quality content.

[0696] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0697] In this invention, the server includes means for automatically generating content using a generative model, means for adjusting the generative model based on evaluation information and the user's psychological state, and means for recognizing the user's emotions and generating recommendation information based on that information. This makes it possible to generate and present optimal content that is in line with the user's emotions.

[0698] A "generative model" is an algorithm for automatically generating new content based on data.

[0699] "Evaluation information" refers to information based on feedback and comments received from users regarding the generated content.

[0700] "Psychological state" refers to the internal state that represents the user's emotions and mood.

[0701] "Means of recognizing emotions" refers to technologies that analyze a user's emotions from their facial expressions, tone of voice, and other factors.

[0702] "Recommendation information" refers to information used to present the most suitable content based on the user's emotional state.

[0703] The specific system for implementing this invention generates and recommends content that takes into account the user's psychological state. The user connects to the system using a terminal and provides their emotional state through a camera and microphone. This information is transmitted from the terminal to the server.

[0704] The server utilizes emotion analysis APIs (e.g., Affectiva or Microsoft Azure Face API) as a means of recognizing emotions. The resulting emotion data is processed to analyze the user's psychological state. Next, the server uses a generative model to generate appropriate recommendations based on the emotion information. At this stage, content best suited to the user's emotions is selected.
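Because facial expression and voice are analyzed separately, their results need to be combined into a single emotional state. The sketch below assumes that each analysis yields a dictionary of per-emotion scores and simply averages them. It does not use or imitate the interface of any of the services named above; the score format, the function name, and the averaging rule are all assumptions.

```python
def fuse_emotions(face: dict[str, float], voice: dict[str, float]) -> str:
    """Average face and voice scores per label and return the strongest label."""
    labels = set(face) | set(voice)
    if not labels:
        raise ValueError("no emotion scores were provided")
    # A label missing from one source contributes 0.0 from that source.
    combined = {label: (face.get(label, 0.0) + voice.get(label, 0.0)) / 2 for label in labels}
    # Iterating in sorted order makes ties resolve alphabetically.
    return max(sorted(combined), key=combined.get)
```

A weighted average, or a learned fusion model, could replace the plain mean if one modality proves more reliable than the other.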

[0705] The generated content information is fed back to the user's device via the network and presented as part of the user's experience. This allows users to enjoy more personalized content in real time.

[0706] For example, if a user is seeking relaxation after work, the system will recognize the user's tired expression and recommend videos of relaxing nature scenes or music. An example of a prompt for the generative AI model would be, "Consider the most suitable video content for a user seeking relaxation and create a recommendation list."
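The prompt in this example can be assembled from the recognized emotion. In the sketch below, the sentence template is taken from the example prompt above, while the `NEEDS` table that maps an emotion to what the user is seeking, and the "enjoyment" fallback, are assumptions for illustration.

```python
# Hypothetical mapping from a recognized emotion to what the user may be seeking.
NEEDS = {"tired": "relaxation", "stressed": "relaxation", "bored": "excitement"}


def recommendation_prompt(emotion: str) -> str:
    """Build the recommendation prompt for the generative model."""
    need = NEEDS.get(emotion, "enjoyment")
    return (
        "Consider the most suitable video content for a user seeking "
        f"{need} and create a recommendation list."
    )
```

For a user recognized as "tired", this reproduces the example prompt from the text word for word.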

[0707] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0708] Step 1:

[0709] The user accesses the system through a terminal. Here, the user uses a camera and microphone to record their facial expressions and voice, and sends this data to the server. The input is the user's facial expression data and voice data, and the output is the transmission of this data to the server. The data is encrypted and transmitted in a secure manner.

[0710] Step 2:

[0711] The server analyzes the received facial expression and voice data. It uses an emotion analysis API to analyze the user's emotional state. The input is facial expression and voice data, and the emotional information resulting from the analysis is output. In this step, the emotion recognition engine is used to determine, for example, whether the user is feeling "relaxed" or "stressed."

[0712] Step 3:

[0713] The server uses a generative model to generate appropriate content based on the analyzed emotion information. The input is emotion information, and the output is a content recommendation list for the user. The generative AI model creates a video list based on the prompt "Create a video suitable for when the user is seeking relaxation."

[0714] Step 4:

[0715] The server sends the generated content recommendation list to the user's device. The input is the list of recommended content, and the output is the provision of the list to the user's device. The data is transmitted in a streaming or downloadable format.

[0716] Step 5:

[0717] Users view the content provided on their devices and enjoy the experience. They play the output content and provide feedback. The user's feedback information is sent back to the server for use in subsequent analysis and adjustment of the generative model. The input is the viewer's feedback, and the output is the provision of feedback data to the server.

[0718] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0719] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0720] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0721] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0722] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0723] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0724] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion lies, the more visible (expressed in actions) it becomes.
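The layout of the emotion map can be illustrated with a small coordinate model. The sketch below assumes a Cartesian coordinate system centered on the map, with left and right as the reaction and situation sides, up and down as pleasure and displeasure, and distance from the center indicating how outwardly expressed an emotion is. The class name, coordinates, and labels are illustrative assumptions; Figure 9 itself is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MapPoint:
    x: float  # Negative = left (reaction side), positive = right (situation side).
    y: float  # Positive = upper (pleasure), negative = lower (displeasure).

    @property
    def radius(self) -> float:
        """Distance from the center; larger means more outwardly expressed."""
        return math.hypot(self.x, self.y)


def describe(point: MapPoint) -> tuple[str, str]:
    """Classify a point by horizontal side and vertical valence."""
    side = "reaction" if point.x < 0 else "situation" if point.x > 0 else "both"
    valence = "pleasure" if point.y > 0 else "displeasure" if point.y < 0 else "neutral"
    return side, valence


def distance(a: MapPoint, b: MapPoint) -> float:
    """Emotions likely to occur together are mapped close, so this is small."""
    return math.hypot(a.x - b.x, a.y - b.y)
```

Under this model, a point at the 3 o'clock position, such as `MapPoint(1.0, 0.0)`, lies on the situation side with neutral valence.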

[0725] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "reaction," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
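The balance-based idea above, in which moving away from the ideal produces discomfort and moving toward it produces pleasure, can be sketched for a robot as follows. The balance names, the use of summed absolute deviation, and the three output labels are assumptions for illustration; the source does not give a formula.

```python
def deviation(values: dict[str, float], ideals: dict[str, float]) -> float:
    """Total absolute distance of each balance from its ideal value."""
    return sum(abs(values[name] - ideal) for name, ideal in ideals.items())


def feeling(previous: dict[str, float], current: dict[str, float], ideals: dict[str, float]) -> str:
    """Compare two readings: approaching the ideal is pleasure, leaving it is discomfort."""
    before = deviation(previous, ideals)
    after = deviation(current, ideals)
    if after < before:
        return "pleasure"
    if after > before:
        return "discomfort"
    return "neutral"
```

For instance, with an ideal battery level of 1.0, a reading that rises from 0.4 to 0.9 yields "pleasure", while the reverse change yields "discomfort".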

[0726] The emotion map defines two emotions that promote learning. One is the negative emotion located around the middle of "repentance" and "reflection" on the situation side. In other words, it arises when the robot experiences negative feelings such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the positive emotion located around "desire" on the reaction side. In other words, it arises when the robot has positive feelings such as "I want more" or "I want to know more."

[0727] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
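One way to picture the requirement that emotions located close together receive similar values is a proximity-weighted penalty on the network's output. The Python sketch below is an assumption for illustration only: the disclosure does not state how the training enforces this property, and the coordinates in `EMOTION_POSITIONS`, the Gaussian weighting, and the `sigma` parameter are all hypothetical.

```python
import math
from typing import Dict, Tuple

# Hypothetical map positions: (radius, clock angle in degrees, 0 = 12 o'clock,
# measured clockwise). Only the rough layout follows the description: "pleasure"
# at the top, "displeasure" at the bottom, calm emotions near 3 o'clock.
EMOTION_POSITIONS: Dict[str, Tuple[float, float]] = {
    "pleasure": (1.0, 0.0),
    "reassured": (1.0, 90.0),
    "calm": (1.0, 100.0),
    "confident": (1.5, 85.0),
    "displeasure": (1.0, 180.0),
}


def to_xy(radius: float, clock_deg: float) -> Tuple[float, float]:
    """Convert a clock-style polar position to Cartesian coordinates."""
    theta = math.radians(clock_deg)
    return radius * math.sin(theta), radius * math.cos(theta)


def proximity_penalty(values: Dict[str, float], sigma: float = 0.5) -> float:
    """Sum of squared value differences, weighted by closeness on the map.

    Nearby emotions with very different values are penalized heavily;
    distant emotions contribute almost nothing.
    """
    names = sorted(values)
    penalty = 0.0
    for i, a in enumerate(names):
        ax, ay = to_xy(*EMOTION_POSITIONS[a])
        for b in names[i + 1:]:
            bx, by = to_xy(*EMOTION_POSITIONS[b])
            dist_sq = (ax - bx) ** 2 + (ay - by) ** 2
            weight = math.exp(-dist_sq / (2.0 * sigma ** 2))
            penalty += weight * (values[a] - values[b]) ** 2
    return penalty
```

Under these assumptions, a term like this could be added to an ordinary training loss so that outputs such as "reassured," "calm," and "confident" are pulled toward similar values, while "displeasure" remains free to differ.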

[0728] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0729] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0730] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0731] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0732] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0733] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0734] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0735] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0736] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0737] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0738] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted as being incorporated by reference.

[0739] The following is further disclosed regarding the embodiments described above.

[0740] (Claim 1)

[0741] A means of automatically generating content using a generative model,

[0742] A means for collecting evaluation information for the generated content,

[0743] Means for adjusting the generative model based on the aforementioned evaluation information,

[0744] A system that includes this.

[0745] (Claim 2)

[0746] The system according to claim 1, wherein the content consists of visual design and text.

[0747] (Claim 3)

[0748] The system according to claim 1, wherein the system operates in a cloud environment and allows access from multiple information processing devices.
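The three means recited above (generate content, collect evaluation information, adjust the generative model) can be sketched as a simple loop. The Python below is a toy illustration under stated assumptions, not the disclosed implementation: `ToyGenerativeModel`, its per-style weights, the multiplicative update, and the `evaluate` callback are all hypothetical stand-ins for a real generative model and a real feedback channel.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class ToyGenerativeModel:
    """Hypothetical stand-in for a generative model: one weight per style."""
    weights: Dict[str, float] = field(
        default_factory=lambda: {"minimal": 1.0, "bold": 1.0, "playful": 1.0}
    )
    learning_rate: float = 0.5

    def generate(self, request: str, rng: random.Random) -> Dict[str, str]:
        # Sample a style in proportion to its current weight.
        styles = list(self.weights)
        style = rng.choices(styles, weights=[self.weights[s] for s in styles])[0]
        return {"style": style, "text": f"{request} ({style} draft)"}

    def adjust(self, content: Dict[str, str], score: float) -> None:
        # score is assumed to lie in [-1, 1]; positive feedback raises the
        # weight of the style that was used, negative feedback lowers it.
        updated = self.weights[content["style"]] * (1.0 + self.learning_rate * score)
        self.weights[content["style"]] = max(updated, 0.05)


def run_feedback_loop(
    model: ToyGenerativeModel,
    request: str,
    evaluate: Callable[[Dict[str, str]], float],
    rounds: int,
    seed: int = 0,
) -> ToyGenerativeModel:
    """Generate content, collect an evaluation, and adjust the model, repeatedly."""
    rng = random.Random(seed)
    for _ in range(rounds):
        content = model.generate(request, rng)  # means of generating content
        score = evaluate(content)               # means for collecting evaluation
        model.adjust(content, score)            # means for adjusting the model
    return model
```

In this sketch, an evaluator that consistently rewards one style causes that style's weight to dominate over successive rounds, which is the kind of feedback-driven adjustment the claim describes at a high level.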

[0749] "Example 1"

[0750] (Claim 1)

[0751] A means of automatically generating content based on requests received from users using a generative model,

[0752] A means for displaying the generated content on an information display device and collecting evaluation information from users,

[0753] A means for modifying the generation model based on the aforementioned evaluation information and reflecting the changes in subsequent generation,

[0754] A means of operating in a cloud environment and enabling access from multiple information processing devices,

[0755] A system that includes this.

[0756] (Claim 2)

[0757] The system according to claim 1, wherein the content consists of visual arts and textual information.

[0758] (Claim 3)

[0759] The system according to claim 1, wherein in the content generation process, a prompt statement is created for the generation AI model, and content is generated based on the prompt statement.

[0760] "Application Example 1"

[0761] (Claim 1)

[0762] A means for automatically generating advertising materials using a generative model,

[0763] A means of specifying the content of an advertising campaign based on requirements entered by the user through their device,

[0764] A means for collecting feedback information on the generated advertising material,

[0765] A means for improving the generative model based on the aforementioned feedback information,

[0766] A system that includes this.

[0767] (Claim 2)

[0768] The system according to claim 1, wherein the advertising material consists of a visual design and a written expression.

[0769] (Claim 3)

[0770] The system according to claim 1, wherein the system operates on a cloud infrastructure and allows access from multiple information processing terminals.

[0771] "Example 2 of combining an emotion engine"

[0772] (Claim 1)

[0773] A means of automatically generating information using a generative model,

[0774] A means of collecting user sentiment information from a device,

[0775] A means of adjusting information automatically generated based on collected emotional information,

[0776] A means for collecting evaluation information on the generated information,

[0777] Means for adjusting the generative model based on the aforementioned evaluation information and sentiment information,

[0778] A system that includes this.

[0779] (Claim 2)

[0780] The system according to claim 1, wherein the aforementioned information consists of visual representations and textual information.

[0781] (Claim 3)

[0782] The system according to claim 1, wherein the system operates in a virtual environment and allows access from multiple data processing devices.

[0783] "Application example 2 when combining with an emotional engine"

[0784] (Claim 1)

[0785] A means of automatically generating content using a generative model,

[0786] A means for collecting evaluation information for the generated content,

[0787] Means for adjusting the generation model based on the aforementioned evaluation information and the user's psychological state,

[0788] A means for recognizing user emotions and generating recommendation information based on that information,

[0789] A system that includes this.

[0790] (Claim 2)

[0791] The system according to claim 1, wherein the content includes video and audio.

[0792] (Claim 3)

[0793] The system according to claim 1, wherein the system operates in a network environment and allows access from multiple data processing devices.

[Explanation of symbols]

[0794]
10, 210, 310, 410 Data Processing Systems
12 Data Processing Devices
14 Smart Devices
214 Smart Glasses
314 Headset-type terminal
414 Robots

Claims

1. A system comprising: a means of automatically generating content using a generative model; a means for collecting evaluation information for the generated content; and a means for adjusting the generative model based on the evaluation information.

2. The system according to claim 1, wherein the content consists of visual design and text.

3. The system according to claim 1, wherein the system operates in a cloud environment and allows access from multiple information processing devices.