system

The system addresses legal risks in generative AI content by automatically detecting copyright infringement and suggesting corrections, enhancing user confidence in the safety and originality of generated content.

JP2026105449A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office: JP · JP
Patent Type: Applications
Current Assignee / Owner: SOFTBANK GROUP CORP
Filing Date: 2024-12-16
Publication Date: 2026-06-26

AI Technical Summary

Technical Problem

Content generated by generative AI may contain hallucinated information and infringe copyright, posing legal risks for users who lack confidence in its safe use.

Method used

A system that automatically determines copyright infringement and novelty of information by extracting images and text from content, comparing them with an existing database, and providing user notifications for corrections, ensuring legal compliance and originality.

Benefits of technology

Reduces legal risks and enhances user confidence by automatically identifying and correcting potential copyright infringements and ensuring the originality of generated content.

✦ Generated by Eureka AI based on patent content.

Figure 2026105449000001_ABST

Abstract

A system is provided. [Solution] A system connected to a content generation device includes: an extraction means for extracting visual and textual information from content; a matching means for comparing the extracted visual and textual information with existing information aggregates; a determination means for determining intellectual property infringement and originality of information based on the matching results; and a suggestion means for making improvement suggestions based on identified similarities.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] Content created by generative AI may contain non-existent information called hallucination, and furthermore, there may be cases where materials that may infringe copyright are used. Using such content as it is involves legal risks, and there is a problem that it is difficult for users to use the products with confidence.

Means for Solving the Problems

[0005] This invention provides a system that automatically determines copyright infringement and novelty of information by extracting images and text from content generated by a generative AI and comparing them with an existing database. The system notifies the user of the determination result and, if necessary, re-checks the content after the user has modified it, thereby reducing legal risks and enabling the safe use of the generated content.

[0006] A "generator" is a hardware or software component used to automatically generate content.

[0007] A "data processing device" is a computer system used to analyze and process content.

[0008] An "extraction means" is a process or module that has the function of separating and extracting specific elements, such as images and text, from content.

[0009] A "matching means" is a process or module that has the function of comparing extracted data with an existing database to confirm a match or similarity.

[0010] "Determination means" refers to a process or module that has the function of evaluating whether the information contained in the content infringes copyright and whether the information is novel, based on the matching results.

[0011] A "notification means" is a process or module that has the function of informing the user of the judgment result.

[0012] A "re-evaluation mechanism" is a process or module that has the function of re-analyzing the modified content and re-evaluating copyright infringement and the novelty of the information.

Brief Description of the Drawings

[0013]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 10] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when an emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when an emotion engine is combined.

Mode for Carrying Out the Invention

[0014] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0015] First, the terms used in the following description will be explained.

[0016] In the following embodiments, a processor with a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. Also, the processor may be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0017] In the following embodiments, a RAM (Random Access Memory) with a reference numeral is a memory in which information is temporarily stored and is used as a work memory by the processor.

[0018] In the following embodiments, a storage with a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0019] In the following embodiments, a communication I/F (Interface) with a reference numeral is an interface including a communication processor, an antenna, and the like. The communication I/F controls communication between multiple computers. Examples of communication standards applied to the communication I/F include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0020] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when expressing three or more things linked by "and/or."

[0021] [First Embodiment]

[0022] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0023] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0024] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0025] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0026] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0027] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0028] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0029] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0030] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0031] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0032] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0033] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0034] This invention provides a system for automatically verifying content created by generative AI and reducing legal risks. This system runs on a server and ensures the legality and accuracy of the generated content by analyzing and matching images and text.

[0035] The server first receives the generated content from the user's device and extracts image and text data from it. The extracted data is sent, as a matching key, to a database connected to the server. This database contains existing copyright data and published guidelines, against which the extracted data is compared and analyzed.

[0036] During the matching process, the server uses image recognition algorithms to extract visual features and natural language processing algorithms to analyze the semantic content of the text. This allows the server to evaluate the similarity of the content with high accuracy.
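As a minimal sketch of the image side of this matching, assume the image recognition step has already reduced each image to a fixed-length feature vector (the specification does not state the feature type). Under that assumption, similarity could be scored with cosine similarity. The function name and the example vectors are hypothetical and are not taken from the specification.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors for a generated image and a registered image.
generated = [0.9, 0.1, 0.4]
registered = [0.8, 0.2, 0.5]
score = cosine_similarity(generated, registered)
```

A score close to 1.0 would indicate that the two images have nearly the same feature direction; where to draw the line between "similar" and "not similar" is a policy choice the specification leaves open.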

[0037] After the initial assessment, the server performs a detailed evaluation of the content deemed problematic based on the matching results. For example, if a particular image matches existing copyrighted material, the server detects this and sends a notification to the user's device. The notification includes recommendations on which parts are problematic and how to correct them.

[0038] Afterward, the user can modify the content on the device that received the notification. Once completed, the user resubmits the modified content to the server. The server then verifies it again to determine if the problem has been resolved. If all issues are resolved, the server notifies the user and approves the use of the content.

[0039] For example, consider a scenario where a user creates an image for online posting using a generative AI, and that image partially resembles an existing copyrighted work. In this case, the server identifies the relevant part and sends the user instructions for correcting it. After the user appropriately corrects the image, the server re-examines it to confirm that the problem has been resolved, allowing the user to publish the image with confidence. Through this process, users can use the generated content without worrying about legal hurdles.

[0040] The following describes the processing flow.

[0041] Step 1:

[0042] The user uses a device to instruct the generative AI to create content, and uploads the generated images and text to the system. The server receives them.

[0043] Step 2:

[0044] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology obtains the text as digital character information.

[0045] Step 3:

[0046] The server uses the extracted data to begin matching it against its internally maintained database. The database contains copyright information and publicly available content data; images are compared using feature vectors, and text is compared using document matching techniques.

[0047] Step 4:

[0048] The server determines, based on the matching results, whether there is a potential copyright infringement and whether the information is inaccurate. This determination includes numerical analysis of the degree of agreement and confidence of the results.

[0049] Step 5:

[0050] The server notifies the user's device of the detection results. The notification includes the specific part of the content in which the problem was found, as well as advice on how to address and correct it.

[0051] Step 6:

[0052] The user uses their device to receive notifications from the server and correct the content that has been flagged as problematic. Once the necessary corrections are complete, they resubmit the content to the server.

[0053] Step 7:

[0054] The server receives the corrected content, performs another check, and verifies that the problem has been resolved. If all check results are clear, it is determined that there is no problem.

[0055] Step 8:

[0056] The server will notify the user again, confirming that the corrected content is free of problems. This allows the user to use the corrected content with confidence.
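The check, notify, correct, and re-check cycle of Steps 4 to 8 above can be sketched as a small loop. This is an illustration under stated assumptions, not the specification's implementation: `Finding`, `CheckResult`, `review_loop`, the `max_rounds` limit, and the toy check that flags a single phrase are all hypothetical stand-ins for the database matching and the user's editing.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Finding:
    part: str    # which part of the content is problematic
    advice: str  # how the user might correct it


@dataclass
class CheckResult:
    findings: List[Finding] = field(default_factory=list)

    @property
    def approved(self) -> bool:
        return not self.findings


def review_loop(
    content: str,
    check: Callable[[str], CheckResult],
    revise: Callable[[str, CheckResult], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Check content, let the user revise it, and re-check it."""
    for _ in range(max_rounds):
        result = check(content)            # Steps 3-4 (Step 7 on later rounds)
        if result.approved:
            return content, True           # Step 8: no problem remains
        content = revise(content, result)  # Steps 5-6: notify, user corrects
    return content, check(content).approved


def toy_check(content: str) -> CheckResult:
    # Stand-in for database matching: flag one hypothetical protected phrase.
    if "registered slogan" in content:
        return CheckResult([Finding("registered slogan", "replace with original wording")])
    return CheckResult()


def toy_revise(content: str, result: CheckResult) -> str:
    # Stand-in for the user's edit: replace every flagged part.
    for finding in result.findings:
        content = content.replace(finding.part, "our own tagline")
    return content


final, ok = review_loop("Poster with registered slogan", toy_check, toy_revise)
```

Bounding the loop with `max_rounds` is a design choice of this sketch; the specification itself does not say how many resubmissions are allowed.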

[0057] (Example 1)

[0058] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0059] There is a need to efficiently verify copyright infringement and originality of content created using generative AI models, thereby reducing legal risks and ensuring users can use the content with confidence. Current methods require specialized knowledge and a significant amount of time, which is burdensome for users.

[0060] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0061] In this invention, the server includes separation means for separating image data and text data from content, comparison means for comparing the separated image data and text data with existing information aggregation devices, and means for notifying the user of instructions based on the evaluation results. This enables automatic content analysis and notification of potential legal infringements.

[0062] "Generative devices" refer to devices or software used to artificially generate content, and primarily refer to data generation using generative AI models.

[0063] An "information processing device" refers to an electronic device or system that processes, analyzes, and compares data, and performs various operations on content received from a user.

[0064] "Separation means" refers to a process or mechanism for individually extracting image data and text data from received content.

[0065] An "information aggregation device" refers to a database or storage system in which existing data and information are accumulated, and is used for referencing, matching, and comparison.

[0066] "Comparison means" refers to a process or algorithm for determining similarity or novelty by comparing extracted or separated data with existing information.

[0067] "Evaluation methods" refer to methods and functions for determining the legal compliance and uniqueness of data based on comparative results.

[0068] "Visual features" refer to features used to analyze and extract elements such as shape, color, and pattern from image data.

[0069] "Content interpretation means" refers to algorithms or processes for analyzing the structure and meaning of linguistic data and interpreting the significance and intent of the information.

[0070] "Instruction and notification means" refers to a means of communication or presentation used to convey specific corrections or warnings to the user based on the evaluation results.

[0071] This invention is a system for mitigating legal risks associated with content created by generative AI models. This system is primarily server-based and processes content through the user's terminal. Specifically, it includes the following components:

[0072] First, the user generates content such as images and text using a generative AI model and sends it to the server via their device. This content arrives at the server in the form of image data and text data. The server then prepares these for individual analysis using a data separation mechanism.

[0073] Next, the server uses an information aggregation device, i.e., a database, for verification. This database stores information on known copyrighted works and legal guidelines. Using the comparison means, the server compares this existing information with the content data. In this process, image recognition algorithms (e.g., implemented with OpenCV) are used to analyze the visual features of the image data, and natural language processing (e.g., with spaCy) is used to interpret the content of the text data. This allows the server to evaluate the originality and legal compliance of the generated content.
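For the text side of this comparison, the following sketch uses word-shingle Jaccard similarity from the standard library as a simple stand-in for the natural language processing step; it is not spaCy and does not capture meaning, only overlapping word sequences. The function names, the shingle size, and the example database entries are assumptions made for illustration.

```python
import re
from typing import Dict, Set, Tuple


def shingles(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Set of overlapping n-word sequences from lower-cased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(text: str, database: Dict[str, str]) -> Tuple[str, float]:
    """Return the ID and score of the most similar registered work."""
    query = shingles(text)
    scored = (
        (work_id, jaccard(query, shingles(body)))
        for work_id, body in database.items()
    )
    # An empty database yields ("", 0.0) instead of raising.
    return max(scored, key=lambda pair: pair[1], default=("", 0.0))


# Hypothetical database of registered works.
db = {
    "work-1": "the quick brown fox jumps over the lazy dog",
    "work-2": "an entirely different sentence about compliance",
}
work_id, score = best_match("The quick brown fox jumps over a sleeping cat", db)
```

Here the query shares four of ten distinct three-word shingles with `work-1` and none with `work-2`, so `work-1` is returned as the closest match.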

[0074] Once the evaluation is complete, the user will be notified of any issues and instructions for correction as needed. For example, if an image generated by the user using an AI model for online posting is similar to an existing copyrighted work, a partial revision will be suggested. The user will receive this notification and make the necessary corrections on their device. After the revisions, the content will be resubmitted to the server for re-evaluation and final approval.

[0075] A concrete example of a prompt message is: "We are developing an AI that generates original images for social media posts. Please explain how to automate compliance checks to ensure these images are not similar to other copyrighted works and to avoid legal risks." In this way, the system provides users with the ability to use generated content with peace of mind, without legal concerns.

[0076] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0077] Step 1:

[0078] The user's device uses a generative AI model to generate content. The input at this stage is the user's prompt text, and the generated content (images and text) is output. The user then prepares to send this output content to the server.

[0079] Step 2:

[0080] The server receives content generated by the user. The server's input is the content file received from the user's terminal. The server receives the content and uses a separation mechanism to decompose it into image data and text data. The output is image data and text data that can be processed individually. This separation prepares the server for data analysis.

[0081] Step 3:

[0082] The server compares the separated image and text data with the information aggregation device. The input is the image and text data sent to the server, and the output is the comparison result. The server uses a comparison means to extract visual features with an image recognition algorithm and analyze the semantic content of the text with a natural language processing algorithm. Similarity is evaluated based on the output comparison result.

[0083] Step 4:

[0084] The server performs an evaluation based on the matching results. It uses evaluation tools to determine legal compliance and originality. The input is the matching results obtained in the previous step. The output is the evaluation results. If infringement is suspected, specific problem areas and necessary modifications are identified.

[0085] Step 5:

[0086] The server sends a notification to the user's terminal based on the evaluation results. The input is information about the evaluation results, and the output is a message to the user with specific correction instructions or warnings. Through the notification system, the user is informed in detail about the problem areas and which parts need to be corrected and how.

[0087] Step 6:

[0088] The user modifies the content on their device based on the notification. The input is the notification from the server. The user makes the necessary corrections and generates the corrected content as output. This corrected version is then sent back to the server for re-evaluation.

[0089] Step 7:

[0090] The server re-evaluates the revised content. The input is the revised content submitted by the user, and the output is the final evaluation result. The server performs another analysis to check for any problems. If the problems are resolved, the server notifies the user that they are approved to use the content.
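The separation in Step 2 above, which takes a content file as input and outputs image data and text data that can be processed individually, could be sketched as follows. The sketch assumes the content arrives as a list of parts labeled with MIME types; that packaging, the `Part` type, and the `separate` function are hypothetical, since the specification does not describe the content file format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Part:
    mime_type: str  # e.g. "image/png" or "text/plain"
    payload: bytes


def separate(parts: List[Part]) -> Tuple[List[bytes], List[str]]:
    """Split a content bundle into image payloads and decoded text."""
    images: List[bytes] = []
    texts: List[str] = []
    for part in parts:
        if part.mime_type.startswith("image/"):
            images.append(part.payload)
        elif part.mime_type.startswith("text/"):
            texts.append(part.payload.decode("utf-8"))
        # Other part types are ignored in this sketch.
    return images, texts


bundle = [
    Part("image/png", b"\x89PNG"),
    Part("text/plain", "A caption for the post".encode("utf-8")),
    Part("application/pdf", b"%PDF"),
]
images, texts = separate(bundle)
```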

[0091] (Application Example 1)

[0092] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0093] In recent years, content creation using generative AI has rapidly become widespread. However, the potential for the generated content to infringe on intellectual property rights such as copyrights poses a legal risk. Furthermore, there is a need for effective systems that can automatically detect similar content and provide appropriate improvement suggestions.

[0094] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0095] In this invention, the server includes an extraction means for extracting visual and textual information from content, a matching means for comparing the extracted visual and textual information with an existing information aggregate, a determination means for determining intellectual property infringement and originality of the information based on the matching results, and a suggestion means for making improvement suggestions based on the identified similarities. This reduces the legal risks of content generated by AI, while enabling users to confidently publish their own content.

[0096] An "information processing device" is a computer system used to process digital data and perform specific tasks.

[0097] "Visual information" refers to data that is visually recognized, such as images and shapes.

[0098] "Textual information" refers to a series of characters, symbols, and other data that make up text data.

[0099] "Extraction means" refers to a process or apparatus for obtaining specific elements, such as visual or textual information, from content.

[0100] "Information aggregate" refers to a collection of existing information stored in a database, including data and knowledge bases collected in the past.

[0101] "Matching means" refers to a method or function for comparing newly obtained data with an existing information aggregate.

[0102] A "determination means" is a mechanism or method for evaluating whether a particular criterion is met based on the results of a data comparison.

[0103] "Intellectual property infringement" refers to the act of illegally using or infringing on another person's intellectual property rights.

[0104] "Originality" refers to possessing new and distinctive characteristics that clearly distinguish the content from other products or content.

[0105] A "suggestion means" is a function or process for presenting the optimal solution or improvement method under specific conditions.

[0106] In this invention, first, the user creates images and text information through a generative AI model using a content generation terminal. The user then transmits the generated content to an information processing device. The information processing device extracts visual and textual information from the content using the extraction means. This extraction is performed through close coordination among the components of the system. The extracted information is sent to the matching means and compared with an existing information aggregate. The information aggregate used here includes historical data and publicly available guidelines.

[0107] Based on the matching results, the server uses the determination means to assess the possibility of intellectual property infringement and the originality of the information. During the determination process, software libraries such as TensorFlow (registered trademark) and spaCy are used to perform image recognition and natural language processing. If a problem is identified through this process, the suggestion means notifies the user of improvement suggestions.

[0108] Users receive improvement suggestions on their devices and revise the content. The revised content is then sent back to the information processing device for re-evaluation. This ensures that the improved content is legally compliant, allowing for confident use and publication.

[0109] As a concrete example, consider a scenario where a user creates digital art and it is determined to be similar to an existing painting. In this case, the server clarifies the similarities and suggests how to enhance the originality by modifying certain parts of the visual information. These suggestions might include changing the color tone or adding unique patterns.
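One way to picture how such suggestions might be selected is sketched below. It assumes the matching step reports a similarity score per visual aspect (for example color and pattern); the aspect names, the threshold, and the wording of the suggestions are hypothetical and do not come from the specification.

```python
from typing import Dict, List

# Hypothetical mapping from a similar visual aspect to an improvement suggestion.
SUGGESTIONS: Dict[str, str] = {
    "color": "Change the color tone of the similar region.",
    "pattern": "Add a unique pattern to the similar region.",
}


def suggest(aspect_scores: Dict[str, float], threshold: float = 0.7) -> List[str]:
    """List suggestions for aspects at or above the threshold, most similar first."""
    flagged = sorted(
        (a for a, s in aspect_scores.items() if s >= threshold and a in SUGGESTIONS),
        key=lambda a: aspect_scores[a],
        reverse=True,
    )
    return [SUGGESTIONS[a] for a in flagged]


# Only the color aspect is similar enough to trigger a suggestion here.
advice = suggest({"color": 0.9, "pattern": 0.3})
```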

[0110] As an example of a prompt, the AI model is given the prompt, "Determine whether this digital art is similar to other works, and tell me which parts need to be corrected and how to make those corrections." The AI then suggests the most suitable correction method.

[0111] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0112] Step 1:

[0113] The user creates content from their device using a generative AI model. The input is the text and image data supplied by the user. The generative AI model generates the content from this input, and the generated content is stored on the device.

[0114] Step 2:

[0115] The terminal sends content data to the information processing device. The input includes the text and image data generated in step 1. The output is the data being passed to the information processing device. This action prepares the data for processing on the server.

[0116] Step 3:

[0117] The server uses extraction methods to extract visual and textual information from the content. Data sent from the terminal is taken into the server as input. The output is the extracted visual and textual information. The server uses image recognition algorithms to extract visual features and natural language processing to analyze the text content.

[0118] Step 4:

[0119] The server uses matching mechanisms to compare the extracted data with existing data sets. The input is the extraction results from step 3. The output is the similarity score and corresponding data points as the matching results. In this process, the data is compared with the information in the database and a similarity analysis is performed.

[0120] Step 5:

[0121] The server evaluates the possibility of intellectual property infringement based on the matching results through a determination mechanism. The input is the matching results from step 4. The output is determination information regarding the possibility of infringement and originality. Here, the evaluation is performed according to intellectual property rules.

[0122] Step 6:

[0123] The server uses a suggestion mechanism to notify the user of improvement suggestions based on the judgment result. The judgment information from step 5 is used as input. The output is a notification of what needs to be improved and specific correction suggestions. The user receives this information on their terminal and is encouraged to make appropriate corrections.

[0124] Step 7:

[0125] The user modifies the content on their device and sends it back to the server. The input is the modified content data. The output is the modified data being sent back to the server. The revision is completed by the user's editing action.

[0126] Step 8:

[0127] The server re-evaluates the corrected content using the evaluation method. The input is the corrected data submitted in step 7. The output is the final evaluation result of the improved content. The server can then confirm that there are no legal issues and notify the user.
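The hand-off from matching (Step 4) to determination (Step 5) above can be illustrated with a small data model: matching yields similarity scores with the corresponding data points, and determination turns them into a judgment on possible infringement and originality. The rule used here (a single similarity threshold, with originality taken as one minus the highest similarity) is an assumption of this sketch, as are the type and function names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Match:
    work_id: str       # identifier of the existing work in the database
    similarity: float  # score in [0, 1] from the matching step


@dataclass(frozen=True)
class Determination:
    infringement_possible: bool
    originality: float     # 1 minus the highest similarity found
    evidence: List[Match]  # matches at or above the threshold


def determine(matches: List[Match], threshold: float = 0.8) -> Determination:
    """Turn matching results into a judgment (hypothetical rule and threshold)."""
    evidence = sorted(
        (m for m in matches if m.similarity >= threshold),
        key=lambda m: m.similarity,
        reverse=True,
    )
    top = max((m.similarity for m in matches), default=0.0)
    return Determination(bool(evidence), 1.0 - top, evidence)


result = determine([Match("painting-17", 0.91), Match("photo-03", 0.40)])
```

Keeping the matched works in `evidence` lets the suggestion step in Step 6 point the user to the specific existing work that triggered the judgment.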

[0128] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0129] This invention provides a system that evaluates content generated by a generative AI and notifies the user appropriately. Combining the system with an emotion engine enables interactions that take the user's emotions into consideration. In this system, the server plays the central role and performs both emotion analysis and content matching.

[0130] The server receives content uploaded from the user's device and first extracts images and text. The server then compares this data with its internal database to determine potential copyright infringement and the novelty of the information. If the comparison reveals any issues, the server notifies the user.

[0131] Furthermore, this system is equipped with an emotion engine that can analyze the user's emotions. The emotion engine evaluates the user's emotions in real time based on data acquired from the camera, microphone, etc., on the user's device. This emotion data is sent to the server, and the notification content is adjusted according to the user's emotional state. For example, if the user is feeling stressed, the system will take an approach such as making the tone of the notification softer.

[0132] As a concrete example, consider a situation where a user creates a work using a generative AI and encounters copyright issues. In this case, the server uses an emotion engine to analyze the user's emotional state. If the user is feeling anxious, the server provides a gentle notification, offering not only corrections but also resources and support options to create a more user-friendly experience. Furthermore, the server performs a similar process when the user resubmits the content after making corrections, supporting the entire process.

[0133] Incorporating the emotion engine in this way yields a system that improves the user experience and ensures both the legal safety of the content and the psychological comfort of the user.
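
The tone adjustment described above can be sketched as a small function. This is only an illustration under stated assumptions: the emotion engine is assumed to return a plain label string, the label set is hypothetical apart from "stressed" and "anxious" (which appear in the description), and the wording of the messages is invented for the example.

```python
def adjust_notification(issue: str, suggestion: str, emotion: str) -> str:
    """Return a notification whose tone depends on the estimated emotion."""
    # Labels that trigger the softer tone; an assumption for this sketch.
    gentle_emotions = {"stressed", "anxious"}
    if emotion in gentle_emotions:
        # Softer framing plus a pointer to support, as in the description.
        return (
            f"No need to worry, this is easy to fix. {issue} "
            f"One option: {suggestion} Support is available if you need it."
        )
    # Default: state the problem and the correction directly.
    return f"{issue} Suggested correction: {suggestion}"
```

Keeping the issue and the suggestion as separate arguments means the legal finding itself is unchanged; only its presentation varies with the user's state.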

[0134] The following describes the processing flow.

[0135] Step 1:

[0136] The user uses their device to request content creation from the AI generator and uploads the generated images and text to the system. The server receives this content.

[0137] Step 2:

[0138] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology digitizes and retrieves the text.

[0139] Step 3:

[0140] The server compares the extracted images and text against an internal database. This database contains copyright information and known data; images are searched based on visual similarity, and text is compared using natural language processing techniques.

[0141] Step 4:

[0142] Based on the matching results, the server determines the potential for copyright infringement of the content and the novelty of the information. The results include a numerical analysis of discrepancies and similarities.

[0143] Step 5:

[0144] The server reviews the assessment results and, if a problem is detected, activates the emotion engine. Based on camera and microphone data acquired from the user's device, it evaluates the user's emotional state.

[0145] Step 6:

[0146] The server adjusts the notification content according to the user's emotional state. For example, if the user is feeling stressed, it will notify them of the problem and suggest improvements in a gentle tone.

[0147] Step 7:

[0148] The user receives a notification on their device and corrects the indicated content. Once the user has completed the correction, the updated content is resent to the server.

[0149] Step 8:

[0150] The server receives the corrected content, performs another check, and verifies that all issues have been resolved.

[0151] Step 9:

[0152] The server notifies the user that the modified content has been cleared, providing final reassurance. This allows the user to confirm that the content is safe to use.
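
The check, notify, correct, and re-check cycle in steps 4 to 9 can be sketched as a loop. The sketch makes several assumptions: `check` and `revise` are hypothetical callbacks standing in for the server-side determination and for the user's editing, the emotion engine of steps 5 and 6 is omitted, and the limit on revisions is an addition for safety that the source does not mention.

```python
from typing import Callable


def review_until_cleared(
    content: str,
    check: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], str],
    max_revisions: int = 3,
) -> tuple[str, bool, int]:
    """Run check -> notify -> revise -> re-check until no issues remain.

    check  : returns the list of detected issues (empty list means cleared).
    revise : stands in for the user editing the content on their device.
    Returns (final_content, cleared, number_of_checks_performed).
    """
    checks = 0
    while True:
        issues = check(content)
        checks += 1
        if not issues:
            return content, True, checks   # step 9: content cleared
        if checks > max_revisions:
            return content, False, checks  # give up; issues remain
        content = revise(content, issues)  # steps 6-7: notify and correct
```

Every revision is followed by another check, so content is never reported as cleared without having been verified in its final form.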

[0153] (Example 2)

[0154] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0155] In recent years, the creation of content using generative AI has increased. However, this has led to problems such as the risk of copyright infringement and a lack of originality in the created content. Furthermore, in order for users to respond appropriately to these risks, interactive notifications that take into account the user's emotional state are required. Therefore, a system is needed that guarantees the rights and originality of content while taking user emotions into consideration.

[0156] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0157] In this invention, the server includes extraction means for extracting images and text from content; matching means for comparing the extracted images and text with an existing database; determination means for determining copyright infringement of the content and novelty of the information based on the matching results; sentiment analysis means for analyzing the user's emotions; notification adjustment means for adjusting notification content based on the analyzed sentiment data; and notification means for providing notifications according to the user's emotional state. This makes it possible to provide users with information about the content in an appropriate and emotionally sensitive manner, thereby reducing legal risks and realizing a comfortable user experience.

[0158] A "generating device" refers to the entire system used to create content using generative AI.

[0159] A "data processing device" refers to the entire system of equipment that performs data processing, such as evaluating generated content and analyzing user sentiment.

[0160] "Extraction means" refers to a function that extracts image and text data from received content.

[0161] "Matching means" refers to a function that compares extracted image and text data with data in an existing database.

[0162] "Determination means" refers to a function that evaluates the possibility of copyright infringement and the novelty of content based on the results of the matching means.

[0163] "Sentiment analysis means" refers to a function that integrates software and hardware for analyzing a user's emotions.

[0164] "Notification adjustment means" refers to a function that adjusts the content and tone of notifications sent to the user based on analyzed sentiment data.

[0165] "Notification means" refers to the entire set of communication technologies and interfaces used to provide information to users.
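
How the means defined above might be wired together on the server can be sketched as follows. The class `Server`, its field names, and the callable signatures are all assumptions made for this illustration; the source defines the means only functionally and does not prescribe an interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Server:
    """Hypothetical composition of the means defined for Example 2."""
    extract: Callable[[bytes], tuple[list[bytes], str]]  # extraction means
    match: Callable[[list[bytes], str], float]           # matching means -> similarity
    determine: Callable[[float], bool]                   # determination means -> issue found?
    analyze_sentiment: Callable[[bytes], str]            # sentiment analysis means
    adjust: Callable[[bool, str], str]                   # notification adjustment means
    notify: Callable[[str], None]                        # notification means

    def handle(self, content: bytes, sensor_data: bytes) -> str:
        """Process one upload and return the message sent to the user."""
        images, text = self.extract(content)
        similarity = self.match(images, text)
        issue_found = self.determine(similarity)
        emotion = self.analyze_sentiment(sensor_data)
        message = self.adjust(issue_found, emotion)
        self.notify(message)
        return message
```

Passing each means in as a callable keeps the sketch independent of any particular library, so an OCR tool or an emotion analysis service could be substituted without changing `handle`.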

[0166] In this invention, a content generation device and a data processing device operate in conjunction. The server receives content generated using a generative AI model from the user's terminal. The content is transferred using the HTTP protocol. Image and text data are extracted from the received content by the extraction means. In this process, software such as "OpenCV" is used as an image analysis library, and "Tesseract OCR" is used for text extraction.

[0167] Next, the server uses matching means to compare the extracted data with an existing database. The database contains copyright information and data on similar existing content in the market, and this information is used to determine the potential for copyright infringement and the novelty of the content.

[0168] The server also performs sentiment analysis using camera and microphone data acquired from the user's device. This sentiment analysis utilizes technologies such as the "Emotion API" to evaluate the user's emotional state. The results of this evaluation are then used by a notification adjustment mechanism to flexibly modify the content and tone of notifications delivered to the user. These notifications are delivered in various forms, such as push notifications or email, with the receiving method tailored to the user's preference.

[0169] As a concrete example, suppose a user uses AI to create a new piece of art and uploads it. The server checks for similarity to existing works, and if a similar work exists, it analyzes the user's emotional state and, if it determines that the user is feeling anxious, sends a message in a gentle tone such as, "This work may be similar to an existing one. Please refer to this guide to try revising it. If you need assistance, please contact customer support."

[0170] A concrete example of a prompt message for a generative AI model would be: "When using generative AI to create a new piece of artwork, please help me identify how my work differs from existing works."

[0171] In this way, this system can provide an assessment of content copyright while taking into consideration the user's psychological comfort.

[0172] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0173] Step 1:

[0174] Users generate content using a generative AI model and upload it to the server. Input includes images and text data generated by the user. The server receives this data using the HTTP protocol. The output is a digital file of the content stored on the server.

[0175] Step 2:

[0176] The server extracts images and text from the received content. It receives image and text data contained in a digital file as input. The server processes the image data using an image analysis library (e.g., OpenCV) and extracts the text using a text analysis tool (e.g., Tesseract OCR). The output consists of the extracted image and text information.

[0177] Step 3:

[0178] The server uses matching mechanisms to compare extracted images and text information with existing databases. The input consists of extracted images and text information, as well as existing data stored in the database. The server performs a similarity calculation to evaluate how similar the content is to existing material. The output provides a determination of the content's potential for copyright infringement and its novelty.

[0179] Step 4:

[0180] The server analyzes the user's emotions using emotion analysis tools. It uses camera video and audio data acquired from the user's device as input. The server uses an emotion analysis API to evaluate the user's emotional state in real time from this data. The output is information about the user's emotional state.

[0181] Step 5:

[0182] The server uses a notification adjustment mechanism to adjust the content and tone of notifications according to the user's emotional state. The server takes the judgment result and emotional state information as input and generates the notification content. The adjusted notification message is output.

[0183] Step 6:

[0184] The server sends notifications to the user using various notification methods. The inputs are the adjusted notification message and information about how the user prefers to receive notifications. The server provides information to the user via push notifications or email. The output is the notification message received by the user, which conveys the feedback on the content.
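
Step 6 above, in which the delivery channel follows the user's preference, can be sketched as a simple dispatch. `Outbox` is a hypothetical in-memory stand-in for real push and email gateways, and `deliver` and its parameters are names assumed for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Outbox:
    """In-memory stand-in for push and email gateways (hypothetical)."""
    sent: list[tuple[str, str, str]] = field(default_factory=list)

    def push(self, user_id: str, message: str) -> None:
        self.sent.append(("push", user_id, message))

    def email(self, user_id: str, message: str) -> None:
        self.sent.append(("email", user_id, message))


def deliver(outbox: Outbox, user_id: str, message: str, preference: str = "push") -> str:
    """Send the adjusted message over the channel the user prefers."""
    channels = {"push": outbox.push, "email": outbox.email}
    if preference not in channels:
        raise ValueError(f"unsupported notification channel: {preference}")
    channels[preference](user_id, message)
    return preference
```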

[0185] (Application Example 2)

[0186] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".

[0187] Legal issues and the user experience of content created by generative AI need to be addressed together. However, conventional systems have offered only superficial legal assessment and have failed to provide feedback that takes user emotions into consideration. As a result, there has been no means of alleviating the anxiety and stress that users experience.

[0188] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0189] In this invention, the server includes extraction means for extracting images and text from content, matching means for comparing the extracted data with an existing database, and adjustment means for adjusting the tone of feedback based on the analyzed user sentiment. This makes it possible to provide feedback from a legal perspective while taking user sentiment into consideration, thereby improving the user experience.

[0190] A "data processing device" is a device connected to a content generation device for analyzing and processing content data.

[0191] An "extraction means" is a means that has the function of identifying and extracting images and text from content.

[0192] "Matching means" refers to a method of comparing extracted images and text with an existing database to confirm matches or similarities.

[0193] "Determination means" refers to a means of determining whether content infringes copyright or whether the information is novel, based on the results of the matching.

[0194] "Emotion analysis means" refers to a means of analyzing a user's emotions based on data acquired from cameras and voice input devices.

[0195] "Adjustment means" refers to a means of changing the tone and content of the feedback provided in accordance with the analyzed emotions of the user.

[0196] The system for implementing this invention consists of multiple elements and realizes comprehensive data processing and user interaction. First, the server receives content uploaded by the user. The devices used here include internet-connected terminals such as smartphones and smart glasses.

[0197] The server first uses data analysis software to extract images and text from the content, identifying and isolating the distinct portions of the content. This process may utilize tools such as Google® Cloud Vision API for image analysis.

[0198] Next, the server compares the extracted data with an existing internal database. This comparison process uses generative AI models such as OpenAI's GPT to verify copyright and determine the novelty of the information.

[0199] Furthermore, the server processes camera and microphone information sent from the user's device in real time to analyze the user's emotions. IBM Watson's (registered trademark) emotion analysis API may be used for this analysis. Based on the analyzed emotion data, the server adjusts the content of notifications and feedback provided to ensure appropriate communication tailored to the user's psychological state.

[0200] As a concrete example, consider a scenario where a user creates content using a generative AI and encounters copyright issues. In this case, the server provides tailored feedback such as, "This section is similar to existing work. Here are some tips to enhance its originality. Relax and enjoy coming up with great ideas!"

[0201] An example of a prompt for a generative AI model might be: "Analyze user-generated content and generate copyright feedback in a gentle tone using sentiment data."
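
A prompt of this kind could be assembled from the matching result and the sentiment data before being passed to the generative AI model. The sketch below only builds the prompt string; it does not call any model. The function name, its parameters, and the rule that maps emotions to a tone are assumptions for illustration.

```python
def build_feedback_prompt(similarity: float, matched_title: str, emotion: str) -> str:
    """Compose a prompt asking a generative model for copyright feedback."""
    # Hypothetical rule: negative emotions select the gentle tone.
    tone = "gentle" if emotion in {"anxious", "stressed"} else "neutral"
    return (
        "Analyze user-generated content and generate copyright feedback "
        f"in a {tone} tone using sentiment data.\n"
        f"Matched work: {matched_title}\n"
        f"Similarity score: {similarity:.2f}\n"
        f"Estimated user emotion: {emotion}"
    )
```

Placing the similarity score and the matched work in the prompt gives the model the concrete finding to explain, while the tone instruction carries the emotional adjustment.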

[0202] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0203] Step 1:

[0204] Users upload generated content to the system via devices such as smartphones or smart glasses. The input is content data from the user's device, which is sent to the server. The output is the content data received by the server.

[0205] Step 2:

[0206] The server extracts images and text from uploaded content. The input is the received content data, and the server uses image analysis software and text analysis algorithms to extract data based on this data. The output is the extracted image and text data.

[0207] Step 3:

[0208] The server compares the extracted images and text with an existing database. The input is the extracted data, which is compared with the information in the database using OpenAI's GPT model and image matching algorithms. The output is an evaluation of the degree of match and similarity as a result of the matching.

[0209] Step 4:

[0210] The server receives data in real time from the terminal's camera and microphone to analyze the user's emotions. The input is the camera and microphone data reflecting the user's emotional state, and the server performs the analysis using IBM Watson's emotion analysis API. The output is an evaluation of the user's current emotional state.

[0211] Step 5:

[0212] The server generates feedback for the user based on the matching results and emotional state. The inputs are the legal evaluation of the judged content and the user's emotional state. Accordingly, the server adjusts the tone of the feedback and, if necessary, uses a generative AI model to generate specific advice and suggestions. The output is the adjusted feedback message sent to the user.
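
Step 3 above refers to image matching algorithms without naming one. As one possible stand-in, the sketch below uses an average hash compared by Hamming distance, a common perceptual hashing technique; it is not stated in the source. Images are represented as small grayscale grids of integers so that the example needs no image library, which is also an assumption.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Average hash of a small grayscale grid (pixel values 0-255).

    Each pixel contributes one bit: 1 if brighter than the mean, else 0.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_similarity(hash_a: int, hash_b: int, n_bits: int) -> float:
    """Fraction of matching bits between two hashes of n_bits each."""
    differing = bin(hash_a ^ hash_b).count("1")
    return 1.0 - differing / n_bits
```

In practice the grid would come from downscaling a real image (for example to 8x8 pixels); two images with the same coarse light and dark layout then receive a similarity close to 1.0 even if individual pixel values differ.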

[0213] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0214] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0215] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0216] [Second Embodiment]

[0217] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0218] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0219] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0220] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0221] The microphone 238 accepts instructions from the user 20 by receiving the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0222] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0223] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0224] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0225] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0226] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0227] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0228] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0229] This invention provides a system for automatically verifying content created by generative AI and reducing legal risks. This system is operated by a server and ensures the legality and accuracy of the generated content by analyzing and matching images and text.

[0230] The server first receives content generated from the user's device and extracts image and text data from it. The extracted data is sent to a database connected to the server as a matching key. This database contains existing copyright data and published guidelines, which are then used to compare and analyze the data.

[0231] During the matching process, the server uses image recognition algorithms to extract visual features and natural language processing algorithms to analyze the semantic content of the text. This allows the server to evaluate the similarity of the content with high accuracy.

[0232] After the initial assessment, the server performs a detailed evaluation of the content deemed problematic based on the matching results. For example, if a particular image matches existing copyrighted material, the server detects this and sends a notification to the user's device. The notification includes recommendations on which parts are problematic and how to correct them.

[0233] Afterward, the user can modify the content on the device that received the notification. Once completed, the user resubmits the modified content to the server. The server then verifies it again to determine if the problem has been resolved. If all issues are resolved, the server notifies the user and approves the use of the content.

[0234] For example, consider a scenario where a user creates an image for online posting using a generative AI, and that image partially resembles an existing copyrighted work. In this case, the server identifies the relevant part and sends instructions to the user to make corrections. After the user appropriately corrects the image, the server re-examines it to confirm that the problem has been resolved, allowing the user to publish the image with confidence. Through this process, users can use the generated content without worrying about legal hurdles.

[0235] The following describes the processing flow.

[0236] Step 1:

[0237] The user uses a device to instruct the generative AI to create content, and uploads the generated images and text to the system. The server receives them.

[0238] Step 2:

[0239] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology obtains the text as digital character information.

[0240] Step 3:

[0241] The server uses the extracted data to begin matching it against its internally maintained database. The database contains copyright information and publicly available content data; images are compared using feature vectors, and text is compared using document matching techniques.

[0242] Step 4:

[0243] The server determines, based on the matching results, whether there is a potential copyright infringement and whether the information is inaccurate. This determination includes numerical analysis of the degree of agreement and confidence of the results.

[0244] Step 5:

[0245] The server notifies the user's device of the detection results. The notification includes the specific part of the content in which the problem was found, as well as advice on how to address and correct it.

[0246] Step 6:

[0247] The user uses their device to receive notifications from the server and correct the content that has been flagged as problematic. Once the necessary corrections are complete, they resubmit the content to the server.

[0248] Step 7:

[0249] The server receives the corrected content, performs another check, and verifies that the problem has been resolved. If all check results are clear, it is determined that there is no problem.

[0250] Step 8:

[0251] The server will notify the user again, confirming that the corrected content is free of problems. This allows the user to use the corrected content with confidence.
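
Step 3 above states that images are compared using feature vectors. One common way to compare such vectors is cosine similarity, sketched below; the source does not say which measure is used, so this choice, along with the function names and the toy two-dimensional vectors in the usage note, is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as matching nothing.
    return dot / norm if norm else 0.0


def best_match(query: list[float], references: dict[str, list[float]]) -> tuple[str, float]:
    """Return the reference entry closest to the query and its similarity."""
    return max(
        ((ref_id, cosine_similarity(query, vec)) for ref_id, vec in references.items()),
        key=lambda pair: pair[1],
    )
```

For instance, `best_match([1.0, 0.1], {"a": [1.0, 0.0], "b": [0.0, 1.0]})` selects entry `"a"`; the returned similarity could then serve as the degree of agreement used in step 4.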

[0252] (Example 1)

[0253] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0254] There is a need to efficiently verify copyright infringement and originality of content created using generative AI models, thereby reducing legal risks and ensuring users can use the content with confidence. Current methods require specialized knowledge and a significant amount of time, which is burdensome for users.

[0255] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0256] In this invention, the server includes separation means for separating image data and text data from content, comparison means for comparing the separated image data and text data with existing information aggregation devices, and means for notifying the user of instructions based on the evaluation results. This enables automatic content analysis and notification of potential legal infringements.

[0257] "Generative devices" refer to devices or software used to artificially generate content, and primarily refer to data generation using generative AI models.

[0258] An "information processing device" refers to an electronic device or system that processes, analyzes, and compares data, and performs various operations on content received from a user.

[0259] "Separation means" refers to a process or mechanism for individually extracting image data and text data from received content.

[0260] An "information aggregation device" refers to a database or storage system in which existing data and information are accumulated, and is used for referencing, matching, and comparison.

[0261] "Comparison means" refers to a process or algorithm for determining similarity or novelty by comparing extracted or separated data with existing information.

[0262] "Evaluation means" refers to methods and functions for determining the legal compliance and uniqueness of data based on the comparison results.

[0263] "Visual features" refer to features used to analyze and extract elements such as shape, color, and pattern from image data.

[0264] "Content interpretation means" refers to algorithms or processes for analyzing the structure and meaning of linguistic data and interpreting the significance and intent of the information.

[0265] "Instruction and notification means" refers to a means of communication or presentation used to convey specific corrections or warnings to the user based on the evaluation results.

[0266] This invention is a system for mitigating legal risks associated with content created by generative AI models. This system is primarily server-based and processes content through the user's terminal. Specifically, it includes the following components:

[0267] First, the user generates content such as images and text using a generative AI model and sends it to the server via their device. This content arrives at the server in the form of image data and text data. The server then prepares these for individual analysis using a data separation mechanism.

[0268] Next, the server uses an information aggregation device, i.e., a database, for verification. This database stores information on known copyrighted works and legal guidelines. Using the comparison means, the server compares this existing information with the content data. In this process, image recognition algorithms (e.g., OpenCV) are used to analyze the visual features of image data, and natural language processing (e.g., spaCy) is used to interpret the content of text data. This allows the server to evaluate the originality and legal compliance of the generated content.

[0269] Once the evaluation is complete, the user will be notified of any issues and instructions for correction as needed. For example, if an image generated by the user using an AI model for online posting is similar to an existing copyrighted work, a partial revision will be suggested. The user will receive this notification and make the necessary corrections on their device. After the revisions, the content will be resubmitted to the server for re-evaluation and final approval.

[0270] A concrete example of a prompt message is: "We are developing an AI that generates original images for social media posts. Please explain how to automate compliance checks to ensure these images are not similar to other copyrighted works and to avoid legal risks." In this way, the system provides users with the ability to use generated content with peace of mind, without legal concerns.

[0271] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0272] Step 1:

[0273] The user's device uses a generative AI model to generate content. The input at this stage is the user's prompt text, and the generated content (images and text) is output. The user then prepares to send this output content to the server.

[0274] Step 2:

[0275] The server receives content generated by the user. The server's input is the content file received from the user's terminal. The server receives the content and uses a separation mechanism to decompose it into image data and text data. The output is image data and text data that can be processed individually. This separation prepares the server for data analysis.

[0276] Step 3:

[0277] The server compares the separated image and text data with the information aggregation device. The input is the image and text data sent to the server, and the output is the comparison result. The server uses a comparison means to extract visual features with an image recognition algorithm and analyze the semantic content of the text with a natural language processing algorithm. Similarity is evaluated based on the output comparison result.

[0278] Step 4:

[0279] The server performs an evaluation based on the matching results. It uses evaluation tools to determine legal compliance and originality. The input is the matching results obtained in the previous step. The output is the evaluation results. If infringement is suspected, specific problem areas and necessary modifications are identified.

[0280] Step 5:

[0281] The server sends a notification to the user's terminal based on the evaluation results. The input is information about the evaluation results, and the output is a message to the user with specific correction instructions or warnings. Through the notification system, the user is informed in detail about the problem areas and which parts need to be corrected and how.

[0282] Step 6:

[0283] The user modifies the content on the terminal based on the notification. The input is the content of the notification from the server. The user makes the necessary modifications and generates the modified content as the output. This modified content is sent back to the server for re-evaluation.

[0284] Step 7:

[0285] The server re-evaluates the modified content. The input is the modified content sent from the user, and the output is the final evaluation result. The server performs analysis again to check for any problems. If the problem is solved, the server notifies the user of the approval to use the content.
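
The evaluate, notify, revise, and re-evaluate cycle of Steps 2 to 7 can be sketched as a simple loop. This is an illustration under stated assumptions: `evaluate` and `revise` are hypothetical callables standing in for the server-side evaluation and the user's correction, and the `max_rounds` limit is an assumption added so the loop terminates; the disclosure does not specify a limit.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Evaluation:
    approved: bool                                  # True when no problems remain
    problems: list = field(default_factory=list)    # problem areas to report


def review_loop(
    content: str,
    evaluate: Callable[[str], Evaluation],
    revise: Callable[[str, list], str],
    max_rounds: int = 3,
) -> tuple:
    """Evaluate content, then revise and re-evaluate until approved.

    Returns (final_content, final_evaluation, revision_rounds_used).
    """
    result = evaluate(content)                      # Steps 2-4: server evaluation
    rounds = 0
    while not result.approved and rounds < max_rounds:
        content = revise(content, result.problems)  # Steps 5-6: notify and correct
        result = evaluate(content)                  # Step 7: re-evaluation
        rounds += 1
    return content, result, rounds
```

If the content is still not approved after `max_rounds` revisions, the caller receives the last evaluation and can decide how to proceed.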

[0286] (Application Example 1)

[0287] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server", and the smart glasses 214 are referred to as the "terminal".

[0288] In recent years, content creation using generative AI has been rapidly spreading. However, there is a possibility that the generated content may infringe on intellectual property rights such as copyright, and its legal risks are an issue. Also, an effective system for automatically detecting similar content and making appropriate improvement proposals is required.

[0289] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0290] In this invention, the server includes an extraction means for extracting visual information and character information from the content, a matching means for comparing the extracted visual information and character information with the existing information aggregate, a determination means for determining infringement of intellectual property rights by the content and the uniqueness of the information based on the matching result, and a proposal means for making improvement proposals based on the identified similarities. As a result, the legal risks of content produced by generative AI are reduced, and users can safely publish their own content.

[0291] An "information processing device" is a computer system used to process digital data and perform specific tasks.

[0292] "Visual information" refers to data that is visually recognized, such as images and shapes.

[0293] "Character information" refers to a series of characters, symbols, and other data that make up text data.

[0294] "Extraction means" refers to a process or apparatus for obtaining specific elements, such as visual or textual information, from content.

[0295] "Information aggregation" refers to a collection of existing information stored in a database, including data and knowledge bases collected in the past.

[0296] "Matching means" refers to a method or function for comparing newly obtained data with existing information aggregates.

[0297] A "determination means" is a mechanism or method for evaluating whether a particular criterion is met based on the results of a data comparison.

[0298] "Intellectual property infringement" refers to the act of illegally using or infringing on another person's intellectual property rights.

[0299] "Uniqueness" refers to possessing new and distinctive characteristics that clearly distinguish it from other products or content.

[0300] A "proposal means" is a function or process for presenting the optimal solution or improvement method under specific conditions.

[0301] In this invention, first, the user creates images and text information through a generative AI model using a content generation terminal. The user then transmits the generated content to an information processing device. The information processing device extracts visual and text information from the content using an extraction means. This extraction is performed by the components of the system operating in coordination. The extracted information is sent to a matching means and compared with an existing information aggregate. The information aggregate used here includes historical data and publicly available guidelines.

[0302] Based on the matching results, the server uses a determination tool to assess the potential for intellectual property infringement and the originality of the information. During the determination process, software libraries such as TensorFlow and spaCy are used to perform image recognition and natural language processing. If a problem is identified through this process, the user is notified of improvement suggestions using a suggestion tool.

[0303] Users receive improvement suggestions on their devices and revise the content. The revised content is then sent back to the information processing device for re-evaluation. This ensures that the improved content is legally compliant, allowing for confident use and publication.

[0304] As a concrete example, consider a scenario where a user creates digital art and it is determined to be similar to an existing painting. In this case, the server clarifies the similarities and suggests how to enhance the originality by modifying certain parts of the visual information. These suggestions might include changing the color tone or adding unique patterns.

[0305] As an example of a prompt, the AI model is given the prompt, "Determine whether this digital art is similar to other works, and tell me which parts need to be corrected and how to make those corrections." The AI then suggests the most suitable correction method.

[0306] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0307] Step 1:

[0308] The user creates content from a terminal using a generative AI model. As input, data such as text and images created by the user is obtained. This data is generated by the generative AI model and saved on the terminal.

[0309] Step 2:

[0310] The terminal sends the content data to the information processing device. As input, the text and image data generated in Step 1 are included. The output is that the data is passed to the information processing device. By this operation, the data is ready to be processed by the server.

[0311] Step 3:

[0312] The server uses extraction means to extract visual information and character information from the content. As input, the data transmitted from the terminal is taken in by the server. The output is the extracted visual information and character information. The server uses an image recognition algorithm to extract visual features and uses natural language processing to analyze the content of the text.

[0313] Step 4:

[0314] The server uses matching means to match the extracted data with existing information aggregations. This input is the extraction result of Step 3. The output is the similarity score as the matching result and the corresponding data points. In this process, a comparison is made with the information in the database and a similarity analysis is performed.

[0315] Step 5:

[0316] The server evaluates the possibility of intellectual property infringement based on the matching result through judgment means. The input is the matching result of Step 4. As output, judgment information regarding the possibility of infringement and uniqueness is generated. Here, the evaluation is performed according to the rules regarding intellectual property.

[0317] Step 6:

[0318] The server uses a suggestion mechanism to notify the user of improvement suggestions based on the judgment result. The judgment information from step 5 is used as input. The output is a notification of what needs to be improved and specific correction suggestions. The user receives this information on their terminal and is encouraged to make appropriate corrections.

[0319] Step 7:

[0320] The user modifies the content on their device and sends it back to the server. The input is the modified content data. The output is the modified data being sent back to the server. The revision is completed by the user's editing action.

[0321] Step 8:

[0322] The server re-evaluates the corrected content using the evaluation method. The input is the corrected data submitted in step 7. The output is the final evaluation result of the improved content. The server can then confirm that there are no legal issues and notify the user.
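
Steps 4 to 6 above (similarity scores, judgment, and improvement suggestions) can be sketched as follows. The `Match` record, the 0.8 risk threshold, and the definition of uniqueness as one minus the highest similarity are all assumptions made for illustration; the disclosure does not fix these values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    work_id: str       # identifier of the stored work that was matched
    region: str        # which part of the submitted content matched
    similarity: float  # 0.0 (unrelated) to 1.0 (identical)


def determine(matches: list, risk_threshold: float = 0.8) -> dict:
    """Step 5: turn the matching results into judgment information."""
    flagged = [m for m in matches if m.similarity >= risk_threshold]
    top = max((m.similarity for m in matches), default=0.0)
    return {
        "infringement_suspected": bool(flagged),
        "uniqueness": round(1.0 - top, 3),  # assumed definition of uniqueness
        "flagged": flagged,
    }


def suggest(judgment: dict) -> list:
    """Step 6: produce one correction suggestion per flagged region."""
    return [
        f"Revise '{m.region}': {m.similarity:.0%} similar to stored work {m.work_id}."
        for m in judgment["flagged"]
    ]
```

Keeping the judgment separate from the suggestion text means the same judgment can be reused for the re-evaluation in Step 8.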

[0323] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0324] This invention provides a system that evaluates content generated by a generative AI and gives appropriate notifications to the user, by combining it with an emotion engine to provide interactions that take the user's emotions into consideration. In this system, the server is central and performs both emotion analysis and content matching.

[0325] The server receives content uploaded from the user's device and first extracts images and text. The server then compares this data with its internal database to determine potential copyright infringement and the novelty of the information. If the comparison reveals any issues, the server notifies the user.

[0326] Furthermore, this system is equipped with an emotion engine that can analyze the user's emotions. The emotion engine evaluates the user's emotions in real time based on data acquired from the camera, microphone, etc., on the user's device. This emotion data is sent to the server, and the notification content is adjusted according to the user's emotional state. For example, if the user is feeling stressed, the system will take an approach such as making the tone of the notification softer.

[0327] As a concrete example, consider a situation where a user creates a work using a generative AI and encounters copyright issues. In this case, the server uses an emotion engine to analyze the user's emotional state. If the user is feeling anxious, the server provides a gentle notification, offering not only corrections but also resources and support options to create a more user-friendly experience. Furthermore, the server performs a similar process when the user resubmits the content after making corrections, supporting the entire process.

[0328] Incorporating the emotion engine in this way yields a system that improves the user experience while ensuring both the legal safety of the content and the psychological comfort of the user.

[0329] The following describes the processing flow.

[0330] Step 1:

[0331] The user uses their device to request content creation from the AI generator and uploads the generated images and text to the system. The server receives this content.

[0332] Step 2:

[0333] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology digitizes and retrieves the text.

[0334] Step 3:

[0335] The server compares the extracted images and text against an internal database. This database contains copyright information and known data; images are searched based on visual similarity, and text is compared using natural language processing techniques.

[0336] Step 4:

[0337] Based on the matching results, the server determines the potential for copyright infringement of the content and the novelty of the information. The results include a numerical analysis of discrepancies and similarities.

[0338] Step 5:

[0339] The server reviews the assessment results and, if a problem is detected, activates the emotion engine. Based on camera and microphone data acquired from the user's device, it evaluates the user's emotional state.

[0340] Step 6:

[0341] The server adjusts the notification content according to the user's emotional state. For example, if the user is feeling stressed, it will notify them of the problem and suggest improvements in a gentle tone.

[0342] Step 7:

[0343] The user receives a notification on their device and corrects the indicated content. Once the user has completed the correction, the updated content is resent to the server.

[0344] Step 8:

[0345] The server receives the corrected content, performs another check, and verifies that all issues have been resolved.

[0346] Step 9:

[0347] The server notifies the user that the modified content has been cleared. This final confirmation allows the user to verify that the content is safe to use.
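
The tone adjustment in Steps 5 and 6 can be sketched as below. The emotion labels (`"stressed"`, `"anxious"`) and the message wording are assumptions; the disclosure does not specify what the emotion engine outputs or the exact notification text.

```python
# Assumed labels for which the softer wording is selected.
GENTLE_EMOTIONS = {"stressed", "anxious"}


def compose_notification(problems: list, emotion: str) -> str:
    """Word the same findings differently depending on the estimated emotion."""
    if not problems:
        return "No issues were found. Your content is cleared for use."
    findings = "\n".join(f"- {p}" for p in problems)
    if emotion in GENTLE_EMOTIONS:
        return (
            "Thanks for sharing your work. A few parts may resemble existing "
            "material, and they are straightforward to adjust:\n"
            f"{findings}\n"
            "Take your time, and contact support if you would like help."
        )
    return f"The following issues need correction before publishing:\n{findings}"
```

The list of problems is identical in both branches; only the framing changes, so the legal content of the notification does not depend on the emotion estimate.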

[0348] (Example 2)

[0349] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0350] In recent years, the creation of content using generative AI has increased. However, this has led to problems such as the risk of copyright infringement and a lack of originality in the created content. Furthermore, in order for users to respond appropriately to these risks, interactive notifications that take into account the user's emotional state are required. Therefore, a system is needed that guarantees the rights and originality of content while taking user emotions into consideration.

[0351] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0352] In this invention, the server includes extraction means for extracting images and text from content; matching means for comparing the extracted images and text with an existing database; determination means for determining copyright infringement of the content and novelty of the information based on the matching results; sentiment analysis means for analyzing the user's emotions; notification adjustment means for adjusting notification content based on the analyzed sentiment data; and notification means for providing notifications according to the user's emotional state. This makes it possible to provide users with information about the content in an appropriate and emotionally sensitive manner, thereby reducing legal risks and realizing a comfortable user experience.

[0353] A "generating device" refers to the entire system used to create content using generative AI.

[0354] A "data processing device" refers to the entire system of equipment that performs data processing, such as evaluating generated content and analyzing user sentiment.

[0355] "Extraction method" refers to a function that extracts image and text data from received content.

[0356] "Matching means" refers to a function that compares extracted image and text data with data in an existing database.

[0357] "Determination means" refers to a function that evaluates the possibility of copyright infringement and the novelty of content based on the results of the matching means.

[0358] "Emotional analysis means" refers to a function that integrates software and hardware for analyzing a user's emotions.

[0359] "Notification adjustment mechanism" refers to a function that adjusts the content and tone of notifications sent to the user based on analyzed sentiment data.

[0360] "Notification means" refers to the entire set of communication technologies and interfaces used to provide information to users.

[0361] In this invention, a content generation device and a data processing device operate in conjunction. The server receives content generated using a generative AI model from the user's terminal. The content is transferred over the HTTP protocol. Image and text data are extracted from the received content by an extraction means. In this process, a library such as "OpenCV" is used for image analysis, and "Tesseract OCR" is used for text extraction.
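
Before image analysis or OCR can run, the received content has to be split into image data and text data. The following sketch shows one possible way to do that split by file type using only the Python standard library. It does not perform the OpenCV or Tesseract OCR processing itself, and the `files` mapping of file name to payload is a hypothetical representation of an upload.

```python
import mimetypes


def separate_content(files: dict) -> tuple:
    """Split an upload into image payloads and decoded text payloads.

    `files` maps a file name to its raw bytes. The MIME type is guessed
    from the file extension; anything that is neither image nor text is
    ignored in this sketch.
    """
    images = {}
    texts = {}
    for name, payload in files.items():
        mime, _ = mimetypes.guess_type(name)
        if mime and mime.startswith("image/"):
            images[name] = payload
        elif mime and mime.startswith("text/"):
            texts[name] = payload.decode("utf-8", errors="replace")
    return images, texts
```

Guessing from the extension is a simplification; a real server would more likely trust the content type declared in the HTTP request or inspect the file header.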

[0362] Next, the server uses matching means to compare the extracted data with an existing database. The database contains copyright information and data on similar existing content in the market, and this information is used to determine the potential for copyright infringement and the novelty of the content.

[0363] The server also performs sentiment analysis using camera and microphone data acquired from the user's device. This sentiment analysis utilizes technologies such as the "Emotion API" to evaluate the user's emotional state. The results of this evaluation are then used by a notification adjustment mechanism to flexibly modify the content and tone of notifications delivered to the user. These notifications are delivered in various forms, such as push notifications or email, with the receiving method tailored to the user's preference.

[0364] As a concrete example, suppose a user uses AI to create a new piece of art and uploads it. The server checks for similarity to existing works, and if a similar work exists, it analyzes the user's emotional state and, if it determines that the user is feeling anxious, sends a message in a gentle tone such as, "This work may be similar to an existing one. Please refer to this guide to try revising it. If you need assistance, please contact customer support."

[0365] A concrete example of a prompt message for a generative AI model would be: "When using generative AI to create a new piece of artwork, please help me identify how my work differs from existing works."

[0366] In this way, this system can provide an assessment of content copyright while taking into consideration the user's psychological comfort.

[0367] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0368] Step 1:

[0369] Users generate content using a generative AI model and upload it to the server. Input includes images and text data generated by the user. The server receives this data using the HTTP protocol. The output is a digital file of the content stored on the server.

[0370] Step 2:

[0371] The server extracts images and text from the received content. It receives image and text data contained in a digital file as input. The server processes the image data using an image analysis library (e.g., OpenCV) and extracts the text using a text analysis tool (e.g., Tesseract OCR). The output consists of the extracted image and text information.

[0372] Step 3:

[0373] The server uses matching mechanisms to compare extracted images and text information with existing databases. The input consists of extracted images and text information, as well as existing data stored in the database. The server performs a similarity calculation to evaluate how similar the content is to existing material. The output provides a determination of the content's potential for copyright infringement and its novelty.

[0374] Step 4:

[0375] The server analyzes the user's emotions using emotion analysis tools. It uses camera video and audio data acquired from the user's device as input. The server uses an emotion analysis API to evaluate the user's emotional state in real time from this data. The output is information about the user's emotional state.

[0376] Step 5:

[0377] The server uses a notification adjustment mechanism to adjust the content and tone of notifications according to the user's emotional state. The server takes the judgment result and emotional state information as input and generates the notification content. The adjusted notification message is output.

[0378] Step 6:

[0379] The server sends notifications to the user using various notification methods. Inputs include a tailored notification message and information about how the user receives notifications. The server provides information to the user via push notifications or email. Outputs include feedback on the content from the notification messages received by the user.
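
Step 6 can be illustrated by a small routing function that packages the adjusted message for the user's preferred channel. The `Notification` record, the fallback to email, the subject line, and the 120-character push limit are assumptions for illustration only.

```python
from dataclasses import dataclass

PUSH_LIMIT = 120  # assumed maximum length of a push notification body


@dataclass
class Notification:
    channel: str   # "push" or "email"
    subject: str
    body: str


def build_notification(message: str, preferred_channel: str) -> Notification:
    """Package an already tone-adjusted message for the chosen channel."""
    # Unknown preferences fall back to email in this sketch.
    channel = preferred_channel if preferred_channel in ("push", "email") else "email"
    if channel == "push" and len(message) > PUSH_LIMIT:
        # Push bodies are truncated; email carries the full text.
        body = message[: PUSH_LIMIT - 3] + "..."
    else:
        body = message
    return Notification(channel, "Content review result", body)
```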

[0380] (Application Example 2)

[0381] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0382] It is necessary to simultaneously address legal issues and improve the user experience of content created by generative AI. However, conventional systems have offered only superficial legal review and have failed to provide feedback that takes user emotions into consideration. As a result, there has been no means of alleviating the anxiety and stress that users experience.

[0383] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0384] In this invention, the server includes extraction means for extracting images and text from content, matching means for comparing the extracted data with an existing database, and adjustment means for adjusting the tone of feedback based on the analyzed user sentiment. This makes it possible to provide feedback from a legal perspective while taking user sentiment into consideration, thereby improving the user experience.

[0385] A "data processing device" is a device connected to a content generation device for analyzing and processing content data.

[0386] An "extraction means" is a means that has the function of identifying and extracting images and text from content.

[0387] "Matching means" refers to a method of comparing extracted images and text with an existing database to confirm matches or similarities.

[0388] "Determination method" refers to a means of determining whether content infringes copyright or whether the information is novel, based on the results of the comparison.

[0389] "Emotional analysis methods" refer to methods for analyzing a user's emotions based on data acquired from cameras and voice input devices.

[0390] "Adjustment methods" refer to means of changing the tone and content of the feedback provided in accordance with the analyzed emotions of the user.

[0391] The system for implementing this invention consists of multiple elements and realizes comprehensive data processing and user interaction. First, the server receives content uploaded by the user. The devices used here include internet-connected terminals such as smartphones and smart glasses.

[0392] The server first uses data analysis software to extract images and text from the content, identifying and isolating the relevant parts of the content. This process may utilize tools such as the Google Cloud Vision API for image analysis.

[0393] Next, the server compares the extracted data with an existing internal database. This comparison process uses generative AI models such as OpenAI's GPT to verify copyright and determine the novelty of the information.

[0394] Furthermore, the server processes camera and microphone information sent from the user's device in real time to analyze the user's emotions. IBM Watson's emotion analysis API may be used for this analysis. Based on the analyzed emotion data, the server adjusts the content of notifications and feedback provided to ensure appropriate communication tailored to the user's psychological state.

[0395] As a concrete example, consider a scenario where a user creates content using a generation AI and encounters copyright issues. In this case, the server provides tailored feedback such as, "This section is similar to existing work. Here are some tips to enhance its originality. Relax and enjoy coming up with great ideas!"

[0396] An example of a prompt for a generative AI model might be: "Analyze user-generated content and generate copyright feedback in a gentle tone using sentiment data."

[0397] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0398] Step 1:

[0399] Users upload generated content to the system via devices such as smartphones or smart glasses. The input is content data from the user's device, which is sent to the server. The output is the content data received by the server.

[0400] Step 2:

[0401] The server extracts images and text from uploaded content. The input is the received content data, and the server uses image analysis software and text analysis algorithms to extract data based on this data. The output is the extracted image and text data.

[0402] Step 3:

[0403] The server compares the extracted images and text with an existing database. The input is the extracted data, which is compared with the information in the database using OpenAI's GPT model and image matching algorithms. The output is an evaluation of the degree of match and similarity as a result of the matching.

[0404] Step 4:

[0405] The server receives data in real time from the terminal's camera and microphone to analyze the user's emotional data. The input is the user's emotional information, and the server performs analysis using IBM Watson's emotional analysis API. The output is an evaluation of the user's current emotional state.

[0406] Step 5:

[0407] The server generates feedback for the user based on the matching results and emotional state. The inputs are the legal evaluation of the judged content and the user's emotional state. Accordingly, the server adjusts the tone of the feedback and, if necessary, uses a generative AI model to generate specific advice and suggestions. The output is the adjusted feedback message sent to the user.
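
Step 5 mentions using a generative AI model to produce specific advice. One way to assemble the prompt from the matching results and the emotional state is sketched below. The function only builds the prompt string and does not call any model; the emotion labels and the prompt wording are assumptions modeled on the example prompt given earlier for this application example.

```python
def build_feedback_prompt(similar_parts: list, emotion: str) -> str:
    """Build a prompt asking a generative model for tone-adjusted feedback."""
    gentle = emotion in {"anxious", "stressed"}  # assumed emotion labels
    tone = "gentle and encouraging" if gentle else "neutral and concise"
    parts = "\n".join(f"- {p}" for p in similar_parts) or "- (none found)"
    return (
        "Analyze the user-generated content and generate copyright feedback.\n"
        f"Tone: {tone}.\n"
        f"Estimated user emotion: {emotion}.\n"
        "Parts similar to existing works:\n"
        f"{parts}\n"
        "For each part, suggest a concrete way to increase originality."
    )
```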

[0408] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0409] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0410] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0411] [Third Embodiment]

[0412] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0413] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0414] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0415] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0416] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0417] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0418] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0419] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0420] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0421] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0422] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0423] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0424] This invention provides a system for automatically verifying content created by generative AI and reducing legal risks. This system is operated by a server and ensures the legality and accuracy of the generated content by analyzing and matching images and text.

[0425] The server first receives the content generated on the user's device and extracts image and text data from it. The extracted data is sent, as a matching key, to a database connected to the server. This database contains existing copyright data and published guidelines, against which the extracted data is compared and analyzed.

[0426] During the matching process, the server uses image recognition algorithms to extract visual features and natural language processing algorithms to analyze the semantic content of the text. This allows the server to evaluate the similarity of the content with high accuracy.

[0427] After the initial assessment, the server performs a detailed evaluation of the content deemed problematic based on the matching results. For example, if a particular image matches existing copyrighted material, the server detects this and sends a notification to the user's device. The notification includes recommendations on which parts are problematic and how to correct them.

[0428] Afterward, the user can modify the content on the device that received the notification. Once completed, the user resubmits the modified content to the server. The server then verifies it again to determine if the problem has been resolved. If all issues are resolved, the server notifies the user and approves the use of the content.

[0429] For example, consider a scenario where a user creates an image for online posting using a generative AI, and that image partially resembles an existing copyrighted work. In this case, the server identifies the relevant part and sends instructions to the user to make corrections. After the user appropriately corrects the image, the server re-examines it to confirm that the problem has been resolved, allowing the user to confidently publish the image. Through this process, users can use the generated content without worrying about legal hurdles.

[0430] The following describes the processing flow.

[0431] Step 1:

[0432] The user uses a device to instruct the generative AI to create content, and uploads the generated images and text to the system. The server receives them.

[0433] Step 2:

[0434] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology obtains the text as digital character information.

[0435] Step 3:

[0436] The server uses the extracted data to begin matching it against its internally maintained database. The database contains copyright information and publicly available content data; images are compared using feature vectors, and text is compared using document matching techniques.

[0437] Step 4:

[0438] The server determines, based on the matching results, whether there is a potential copyright infringement and whether the information is inaccurate. This determination includes numerical analysis of the degree of agreement and confidence of the results.

[0439] Step 5:

[0440] The server notifies the user's device of the detection results. The notification includes the specific part of the content in which the problem was found, as well as advice on how to address and correct it.

[0441] Step 6:

[0442] The user uses their device to receive notifications from the server and correct the content that has been flagged as problematic. Once the necessary corrections are complete, they resubmit the content to the server.

[0443] Step 7:

[0444] The server receives the corrected content, performs another check, and verifies that the problem has been resolved. If all check results are clear, it is determined that there is no problem.

[0445] Step 8:

[0446] The server will notify the user again, confirming that the corrected content is free of problems. This allows the user to use the corrected content with confidence.
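As a non-limiting illustration, the check, notify, correct, and re-check cycle of Steps 3 to 8 can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the in-memory `KNOWN_TEXTS` table, the `THRESHOLD` value of 0.8, and the names `check` and `verify_until_cleared` are all hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever document matching technique the server actually uses. Only text is handled here; image matching is omitted.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional

# Hypothetical stand-in for the server's database of known works.
KNOWN_TEXTS = {
    "work-001": "the quick brown fox jumps over the lazy dog",
}
THRESHOLD = 0.8  # assumed cut-off; the source does not give a value


@dataclass
class CheckResult:
    cleared: bool
    matched_id: Optional[str]
    score: float
    advice: str


def check(text: str) -> CheckResult:
    """Steps 3-5: match against known works, decide, and build the notice."""
    best_id, best = None, 0.0
    for work_id, known in KNOWN_TEXTS.items():
        score = SequenceMatcher(None, text.lower(), known).ratio()
        if score > best:
            best_id, best = work_id, score
    if best >= THRESHOLD:
        advice = f"Revise the passage resembling {best_id}."
        return CheckResult(False, best_id, best, advice)
    return CheckResult(True, None, best, "No problems found.")


def verify_until_cleared(submissions: List[str]) -> List[CheckResult]:
    """Steps 6-8: re-check each resubmission and stop once one is cleared."""
    history = []
    for text in submissions:
        result = check(text)
        history.append(result)
        if result.cleared:
            break
    return history
```

In this sketch each element of `submissions` represents one upload from the user's device, so the returned history records every notification the server would have sent before the content was cleared.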

[0447] (Example 1)

[0448] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0449] There is a need to efficiently verify copyright infringement and originality of content created using generative AI models, thereby reducing legal risks and ensuring users can use the content with confidence. Current methods require specialized knowledge and a significant amount of time, which is burdensome for users.

[0450] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0451] In this invention, the server includes separation means for separating image data and text data from content, comparison means for comparing the separated image data and text data with existing information aggregation devices, and means for notifying the user of instructions based on the evaluation results. This enables automatic content analysis and notification of potential legal infringements.

[0452] "Generative devices" refer to devices or software used to artificially generate content, and primarily refer to data generation using generative AI models.

[0453] An "information processing device" refers to an electronic device or system that processes, analyzes, and compares data, and performs various operations on content received from a user.

[0454] "Separation means" refers to a process or mechanism for individually extracting image data and text data from received content.

[0455] An "information aggregation device" refers to a database or storage system in which existing data and information are accumulated, and is used for referencing, matching, and comparison.

[0456] "Comparison means" refers to a process or algorithm for determining similarity or novelty by comparing extracted or separated data with existing information.

[0457] "Evaluation methods" refer to methods and functions for determining the legal compliance and uniqueness of data based on comparative results.

[0458] "Visual features" refer to features used to analyze and extract elements such as shape, color, and pattern from image data.

[0459] "Content interpretation means" refers to algorithms or processes for analyzing the structure and meaning of linguistic data and interpreting the significance and intent of the information.

[0460] "Instruction and notification means" refers to a means of communication or presentation used to convey specific corrections or warnings to the user based on the evaluation results.

[0461] This invention is a system for mitigating legal risks associated with content created by generative AI models. This system is primarily server-based and processes content through the user's terminal. Specifically, it includes the following components:

[0462] First, the user generates content such as images and text using a generative AI model and sends it to the server via their device. This content arrives at the server in the form of image data and text data. The server then prepares these for individual analysis using a data separation mechanism.

[0463] Next, the server uses an information aggregation device, i.e., a database, for verification. This database stores information on known copyrighted works and legal guidelines. Using the comparison means, the server compares this existing information with the content data. In this process, image recognition algorithms (e.g., those provided by OpenCV) are used for analyzing the visual features of image data, and natural language processing (e.g., spaCy) is used for interpreting the content of text data. This allows the server to evaluate the originality and legal compliance of the generated content.
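As a non-limiting illustration of comparing visual features, the sketch below reduces each image to a coarse color histogram and compares two histograms by cosine similarity. It is written in plain Python and is only a stand-in for the feature extraction that an image recognition library would perform. The function names, the choice of a color histogram as the feature, and the bin count are assumptions made for illustration and do not appear in the source.

```python
import math


def color_histogram(pixels, bins=4):
    """Quantize each RGB channel into `bins` buckets and count the pixels."""
    hist = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        idx = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[idx] += 1
    return hist


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy "images" given as lists of (R, G, B) pixels.
red = [(250, 10, 10)] * 4
near_red = [(240, 20, 5)] * 4
blue = [(10, 10, 250)] * 4

same = cosine_similarity(color_histogram(red), color_histogram(near_red))
diff = cosine_similarity(color_histogram(red), color_histogram(blue))
```

A color histogram ignores shape and pattern, so a practical comparison means would combine it with other features. The point of the sketch is only that similarity is computed between feature vectors rather than between raw pixels.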

[0464] Once the evaluation is complete, the user will be notified of any issues and instructions for correction as needed. For example, if an image generated by the user using an AI model for online posting is similar to an existing copyrighted work, a partial revision will be suggested. The user will receive this notification and make the necessary corrections on their device. After the revisions, the content will be resubmitted to the server for re-evaluation and final approval.

[0465] A concrete example of a prompt message is: "We are developing an AI that generates original images for social media posts. Please explain how to automate compliance checks to ensure these images are not similar to other copyrighted works and to avoid legal risks." In this way, the system provides users with the ability to use generated content with peace of mind, without legal concerns.

[0466] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0467] Step 1:

[0468] The user's device uses a generative AI model to generate content. The input at this stage is the user's prompt text, and the generated content (images and text) is output. The user then prepares to send this output content to the server.

[0469] Step 2:

[0470] The server receives the content generated by the user. The input is the content file received from the user's terminal. The server uses a separation mechanism to decompose the content into image data and text data. The output is image data and text data that can be processed individually. This separation prepares the data for analysis on the server.

[0471] Step 3:

[0472] The server compares the separated image and text data with the information aggregation device. The input is the image and text data sent to the server, and the output is the comparison result. The server uses a comparison means to extract visual features with an image recognition algorithm and analyze the semantic content of the text with a natural language processing algorithm. Similarity is evaluated based on the output comparison result.

[0473] Step 4:

[0474] The server performs an evaluation based on the matching results. It uses evaluation tools to determine legal compliance and originality. The input is the matching results obtained in the previous step. The output is the evaluation results. If infringement is suspected, specific problem areas and necessary modifications are identified.

[0475] Step 5:

[0476] The server sends a notification to the user's terminal based on the evaluation results. The input is information about the evaluation results, and the output is a message to the user with specific correction instructions or warnings. Through the notification system, the user is informed in detail about the problem areas and which parts need to be corrected and how.

[0477] Step 6:

[0478] The user modifies the content on their device based on the notification. The input is the notification from the server. The user makes the necessary corrections and generates the corrected content as output. This corrected version is then sent back to the server for re-evaluation.

[0479] Step 7:

[0480] The server re-evaluates the revised content. The input is the revised content submitted by the user, and the output is the final evaluation result. The server performs another analysis to check for any problems. If the problems are resolved, the server notifies the user that they are approved to use the content.
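As a non-limiting illustration of the separation in Step 2, the sketch below splits received content into image data and text data. It assumes, purely for illustration, that the content arrives as a list of `(mime_type, payload)` parts; the source does not specify the content file format, and the names `Separated` and `separate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Separated:
    images: List[bytes] = field(default_factory=list)
    texts: List[str] = field(default_factory=list)


def separate(parts: List[Tuple[str, bytes]]) -> Separated:
    """Split (mime_type, payload) parts into image data and text data."""
    out = Separated()
    for mime_type, payload in parts:
        if mime_type.startswith("image/"):
            out.images.append(payload)
        elif mime_type.startswith("text/"):
            out.texts.append(payload.decode("utf-8"))
        # Any other part type is ignored in this sketch.
    return out


parts = [
    ("image/png", b"\x89PNG"),
    ("text/plain", "hello".encode("utf-8")),
    ("application/json", b"{}"),
]
separated = separate(parts)
```

The two lists in `separated` correspond to the individually processable image data and text data that Step 3 takes as input.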

[0481] (Application Example 1)

[0482] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0483] In recent years, content creation using generative AI has rapidly become widespread. However, the potential for the generated content to infringe on intellectual property rights such as copyrights poses a legal risk. Furthermore, there is a need for effective systems that can automatically detect similar content and provide appropriate improvement suggestions.

[0484] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0485] In this invention, the server includes an extraction means for extracting visual and textual information from content, a matching means for comparing the extracted visual and textual information with an existing information aggregate, a determination means for determining intellectual property infringement and originality of the information based on the matching results, and a suggestion means for making improvement suggestions based on the identified similarities. This reduces the legal risks of content generated by AI, while enabling users to confidently publish their own content.

[0486] An "information processing device" is a computer system used to process digital data and perform specific tasks.

[0487] "Visual information" refers to data that is visually recognized, such as images and shapes.

[0488] "Textual information" refers to a series of characters, symbols, and other data that make up text data.

[0489] "Extraction means" refers to a process or apparatus for obtaining specific elements, such as visual or textual information, from content.

[0490] "Information aggregate" refers to a collection of existing information stored in a database, including data and knowledge bases collected in the past.

[0491] "Matching means" refers to a method or function for comparing newly obtained data with an existing information aggregate.

[0492] A "determination means" is a mechanism or method for evaluating whether a particular criterion is met based on the results of a data comparison.

[0493] "Intellectual property infringement" refers to the act of illegally using or infringing on another person's intellectual property rights.

[0494] "Originality" refers to possessing new and distinctive characteristics that clearly distinguish the content from other products or content.

[0495] A "suggestion means" is a function or process for presenting the optimal solution or improvement method under specific conditions.

[0496] In this invention, first, the user creates images and text information through a generative AI model using a content generation terminal. The user then transmits the generated content to an information processing device. The information processing device extracts visual and textual information from the content using an extraction means. This extraction is performed by the components of the system working together. The extracted information is sent to a matching means and compared with an existing information aggregate. The information aggregate used here includes historical data and publicly available guidelines.

[0497] Based on the matching results, the server uses a determination means to assess the potential for intellectual property infringement and the originality of the information. During the determination process, software libraries such as TensorFlow and spaCy are used to perform image recognition and natural language processing. If a problem is identified through this process, the user is notified of improvement suggestions by a suggestion means.

[0498] Users receive improvement suggestions on their devices and revise the content. The revised content is then sent back to the information processing device for re-evaluation. This ensures that the improved content is legally compliant, allowing for confident use and publication.

[0499] As a concrete example, consider a scenario where a user creates digital art and it is determined to be similar to an existing painting. In this case, the server clarifies the similarities and suggests how to enhance the originality by modifying certain parts of the visual information. These suggestions might include changing the color tone or adding unique patterns.

[0500] As an example of a prompt, the AI model is given the prompt, "Determine whether this digital art is similar to other works, and tell me which parts need to be corrected and how to make those corrections." The AI then suggests the most suitable correction method.

[0501] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0502] Step 1:

[0503] The user creates content on their device using a generative AI model. The input is the text and image data supplied by the user. From this input, the generative AI model generates the content, which is stored on the device.

[0504] Step 2:

[0505] The terminal sends content data to the information processing device. The input includes the text and image data generated in step 1. The output is the data being passed to the information processing device. This action prepares the data for processing on the server.

[0506] Step 3:

[0507] The server uses extraction methods to extract visual and textual information from the content. Data sent from the terminal is taken into the server as input. The output is the extracted visual and textual information. The server uses image recognition algorithms to extract visual features and natural language processing to analyze the text content.

[0508] Step 4:

[0509] The server uses matching mechanisms to compare the extracted data with existing data sets. The input is the extraction results from step 3. The output is the similarity score and corresponding data points as the matching results. In this process, the data is compared with the information in the database and a similarity analysis is performed.

[0510] Step 5:

[0511] The server evaluates the possibility of intellectual property infringement based on the matching results through a determination mechanism. The input is the matching results from step 4. The output is determination information regarding the possibility of infringement and originality. Here, the evaluation is performed according to intellectual property rules.

[0512] Step 6:

[0513] The server uses a suggestion mechanism to notify the user of improvement suggestions based on the judgment result. The judgment information from step 5 is used as input. The output is a notification of what needs to be improved and specific correction suggestions. The user receives this information on their terminal and is encouraged to make appropriate corrections.

[0514] Step 7:

[0515] The user modifies the content on their device and sends it back to the server. The input is the modified content data. The output is the modified data being sent back to the server. The revision is completed by the user's editing action.

[0516] Step 8:

[0517] The server re-evaluates the corrected content using the evaluation method. The input is the corrected data submitted in step 7. The output is the final evaluation result of the improved content. The server can then confirm that there are no legal issues and notify the user.
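As a non-limiting illustration, Steps 4 to 6 (matching, determination, and suggestion) can be sketched for text as follows. The sketch assumes word-trigram shingles compared by Jaccard similarity, a threshold of 0.5, and an in-memory archive keyed by work identifier. None of these choices comes from the source, and the function names are hypothetical. The matching output pairs each similarity score with the overlapping shingles, mirroring the "similarity score and corresponding data points" described in Step 4.

```python
def shingles(text, n=3):
    """Return the set of n-word shingles in `text` (empty set if no words)."""
    words = text.lower().split()
    if not words:
        return set()
    return {tuple(words[i:i + n]) for i in range(max(len(words) - n + 1, 1))}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match(text, archive):
    """Step 4: (work_id, score, shared shingles), highest score first."""
    query = shingles(text)
    results = []
    for work_id, known in archive.items():
        known_sh = shingles(known)
        results.append((work_id, jaccard(query, known_sh), sorted(query & known_sh)))
    return sorted(results, key=lambda r: r[1], reverse=True)


def determine(results, threshold=0.5):
    """Step 5: keep the works whose score meets the assumed threshold."""
    return [r for r in results if r[1] >= threshold]


def suggest(flagged):
    """Step 6: one suggestion per flagged work, quoting an overlapping phrase."""
    suggestions = []
    for work_id, score, overlap in flagged:
        phrase = " ".join(overlap[0]) if overlap else "the flagged passage"
        suggestions.append(
            f'Rephrase "{phrase}" and nearby wording '
            f"(resembles {work_id}, score {score:.2f})."
        )
    return suggestions


archive = {
    "w1": "the cat sat on the mat today",
    "w2": "completely different words appear here now",
}
results = match("the cat sat on the mat", archive)
flagged = determine(results)
suggestions = suggest(flagged)
```

Returning the overlapping shingles alongside the score lets the suggestion step point at a concrete passage instead of reporting only a number.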

[0518] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0519] This invention provides a system that evaluates content generated by a generative AI and gives appropriate notifications to the user; by combining this with an emotion engine, it provides interactions that take the user's emotions into consideration. In this system, the server plays the central role and performs both emotion analysis and content matching.

[0520] The server receives content uploaded from the user's device and first extracts images and text. The server then compares this data with its internal database to determine potential copyright infringement and the novelty of the information. If the comparison reveals any issues, the server notifies the user.

[0521] Furthermore, this system is equipped with an emotion engine that can analyze the user's emotions. The emotion engine evaluates the user's emotions in real time based on data acquired from the camera, microphone, etc., on the user's device. This emotion data is sent to the server, and the notification content is adjusted according to the user's emotional state. For example, if the user is feeling stressed, the system will take an approach such as making the tone of the notification softer.

[0522] As a concrete example, consider a situation where a user creates a work using a generative AI and encounters copyright issues. In this case, the server uses an emotion engine to analyze the user's emotional state. If the user is feeling anxious, the server provides a gentle notification, offering not only corrections but also resources and support options to create a more user-friendly experience. Furthermore, the server performs a similar process when the user resubmits the content after making corrections, supporting the entire process.

[0523] Combining the emotion engine in this way yields a system that improves the user experience and ensures both the legal safety of the content and the psychological comfort of the user.

[0524] The following describes the processing flow.

[0525] Step 1:

[0526] The user uses their device to request content creation from the generative AI and uploads the generated images and text to the system. The server receives this content.

[0527] Step 2:

[0528] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology digitizes and retrieves the text.

[0529] Step 3:

[0530] The server compares the extracted images and text against an internal database. This database contains copyright information and known data; images are searched based on visual similarity, and text is compared using natural language processing techniques.

[0531] Step 4:

[0532] Based on the matching results, the server determines the potential for copyright infringement of the content and the novelty of the information. The results include a numerical analysis of discrepancies and similarities.

[0533] Step 5:

[0534] The server reviews the assessment results and, if a problem is detected, activates the emotion engine. Based on camera and microphone data acquired from the user's device, it evaluates the user's emotional state.

[0535] Step 6:

[0536] The server adjusts the notification content according to the user's emotional state. For example, if the user is feeling stressed, it will notify them of the problem and suggest improvements in a gentle tone.

[0537] Step 7:

[0538] The user receives a notification on their device and corrects the indicated content. Once the user has completed the correction, the updated content is resent to the server.

[0539] Step 8:

[0540] The server receives the corrected content, performs another check, and verifies that all issues have been resolved.

[0541] Step 9:

[0542] The server notifies the user that the modified content has been cleared. This final confirmation lets the user know that the content is safe to use.
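As a non-limiting illustration of Step 6, the sketch below selects a notification template according to an emotion label. It assumes that the emotion engine has already reduced the camera and microphone data to a single label such as "stressed" or "calm". The label set, the template wording, and the name `adjust_notification` are hypothetical and are not taken from the source.

```python
# Hypothetical message templates; the wording is illustrative only.
TEMPLATES = {
    "neutral": "Issue found in {part}: {advice}",
    "gentle": (
        "No need to worry. We noticed something in {part} "
        "that is easy to fix: {advice}"
    ),
}

# Assumed emotion labels for which the softer tone is used.
SOFTEN_FOR = {"stressed", "anxious"}


def adjust_notification(part, advice, emotion):
    """Return (tone, message) with the tone chosen from the emotion label."""
    tone = "gentle" if emotion in SOFTEN_FOR else "neutral"
    return tone, TEMPLATES[tone].format(part=part, advice=advice)


tone, message = adjust_notification(
    "the background", "change the color tone", "stressed"
)
```

Only the wording changes between tones in this sketch: the flagged part and the advice are passed through unchanged, so the user receives the same correction information in either case.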

[0543] (Example 2)

[0544] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0545] In recent years, the creation of content using generative AI has increased. However, this has led to problems such as the risk of copyright infringement and a lack of originality in the created content. Furthermore, in order for users to respond appropriately to these risks, interactive notifications that take into account the user's emotional state are required. Therefore, a system is needed that guarantees the rights and originality of content while taking user emotions into consideration.

[0546] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0547] In this invention, the server includes extraction means for extracting images and text from content; matching means for comparing the extracted images and text with an existing database; determination means for determining copyright infringement of the content and novelty of the information based on the matching results; sentiment analysis means for analyzing the user's emotions; notification adjustment means for adjusting notification content based on the analyzed sentiment data; and notification means for providing notifications according to the user's emotional state. This makes it possible to provide users with information about the content in an appropriate and emotionally sensitive manner, thereby reducing legal risks and realizing a comfortable user experience.

[0548] A "generating device" refers to the entire system used to create content using generative AI.

[0549] A "data processing device" refers to the entire system of equipment that performs data processing, such as evaluating generated content and analyzing user sentiment.

[0550] "Extraction means" refers to a function that extracts image and text data from received content.

[0551] "Matching means" refers to a function that compares extracted image and text data with data in an existing database.

[0552] "Determination means" refers to a function that evaluates the possibility of copyright infringement and the novelty of content based on the results of the matching means.

[0553] "Sentiment analysis means" refers to a function that integrates software and hardware for analyzing a user's emotions.

[0554] "Notification adjustment means" refers to a function that adjusts the content and tone of notifications sent to the user based on analyzed sentiment data.

[0555] "Notification means" refers to the entire set of communication technologies and interfaces used to provide information to users.

[0556] In this invention, a content generation device and a data processing device operate in conjunction. The server receives content generated using a generative AI model from the user's terminal. This transfer uses the HTTP protocol. Image and text data are extracted from the received content by an extraction means. In this process, software such as "OpenCV" is used as an image analysis library, and "Tesseract OCR" is used for text extraction.

[0557] Next, the server uses matching means to compare the extracted data with an existing database. The database contains copyright information and data on similar existing content in the market, and this information is used to determine the potential for copyright infringement and the novelty of the content.

[0558] The server also performs sentiment analysis using camera and microphone data acquired from the user's device. This sentiment analysis utilizes technologies such as the "Emotion API" to evaluate the user's emotional state. The results of this evaluation are then used by a notification adjustment mechanism to flexibly modify the content and tone of notifications delivered to the user. These notifications are delivered in various forms, such as push notifications or email, with the receiving method tailored to the user's preference.

[0559] As a concrete example, suppose a user uses AI to create a new piece of art and uploads it. The server checks for similarity to existing works, and if a similar work exists, it analyzes the user's emotional state and, if it determines that the user is feeling anxious, sends a message in a gentle tone such as, "This work may be similar to an existing one. Please refer to this guide to try revising it. If you need assistance, please contact customer support."

[0560] A concrete example of a prompt message for a generative AI model would be: "When using generative AI to create a new piece of artwork, please help me identify how my work differs from existing works."

[0561] In this way, this system can provide an assessment of content copyright while taking into consideration the user's psychological comfort.

[0562] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0563] Step 1:

[0564] Users generate content using a generative AI model and upload it to the server. Input includes images and text data generated by the user. The server receives this data using the HTTP protocol. The output is a digital file of the content stored on the server.

[0565] Step 2:

[0566] The server extracts images and text from the received content. It receives image and text data contained in a digital file as input. The server processes the image data using an image analysis library (e.g., OpenCV) and extracts the text using a text analysis tool (e.g., Tesseract OCR). The output consists of the extracted image and text information.

[0567] Step 3:

[0568] The server uses matching mechanisms to compare extracted images and text information with existing databases. The input consists of extracted images and text information, as well as existing data stored in the database. The server performs a similarity calculation to evaluate how similar the content is to existing material. The output provides a determination of the content's potential for copyright infringement and its novelty.

[0569] Step 4:

[0570] The server analyzes the user's emotions using emotion analysis tools. It uses camera video and audio data acquired from the user's device as input. The server uses an emotion analysis API to evaluate the user's emotional state in real time from this data. The output is information about the user's emotional state.

[0571] Step 5:

[0572] The server uses a notification adjustment mechanism to adjust the content and tone of notifications according to the user's emotional state. The server takes the judgment result and emotional state information as input and generates the notification content. The adjusted notification message is output.

[0573] Step 6:

[0574] The server sends notifications to the user using various notification methods. The inputs are the adjusted notification message and information about how the user prefers to receive notifications. The server provides information to the user via push notifications or email. The output is the notification message, carrying feedback on the content, as received by the user.
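As a non-limiting illustration of Step 6, the sketch below routes a notification to the channel the user prefers and falls back to a default when the preference is unknown. Actual push or email delivery is replaced by an in-memory `sent` list so that the example runs on its own. The channel names, the fallback rule, and the function names are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# In-memory outbox standing in for real push and email delivery.
sent: List[Tuple[str, str, str]] = []


def send_push(user_id: str, message: str) -> None:
    sent.append(("push", user_id, message))


def send_email(user_id: str, message: str) -> None:
    sent.append(("email", user_id, message))


CHANNELS: Dict[str, Callable[[str, str], None]] = {
    "push": send_push,
    "email": send_email,
}


def deliver(user_id: str, message: str, preference: str, default: str = "push") -> str:
    """Send via the preferred channel, or via `default` if it is unsupported."""
    channel = preference if preference in CHANNELS else default
    CHANNELS[channel](user_id, message)
    return channel


used = deliver("user-1", "This work may be similar to an existing one.", "email")
fallback = deliver("user-1", "Please see the revision guide.", "sms")
```

Keeping the channels in a lookup table means that a further delivery method can be added by registering one more function, without changing `deliver`.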

[0575] (Application Example 2)

[0576] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0577] It is necessary to simultaneously address legal issues and improve the user experience of content created by generative AI. However, conventional systems have only superficially flagged legal issues and have failed to provide feedback that takes user emotions into consideration. As a result, there has been no means of alleviating the anxiety and stress that users experience.

[0578] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0579] In this invention, the server includes extraction means for extracting images and text from content, matching means for comparing the extracted data with an existing database, and adjustment means for adjusting the tone of feedback based on the analyzed user sentiment. This makes it possible to provide feedback from a legal perspective while taking user sentiment into consideration, thereby improving the user experience.

[0580] A "data processing device" is a device connected to a content generation device for analyzing and processing content data.

[0581] An "extraction means" is a means that has the function of identifying and extracting images and text from content.

[0582] "Matching means" refers to a method of comparing extracted images and text with an existing database to confirm matches or similarities.

[0583] "Determination means" refers to a means of determining whether content infringes copyright or whether the information is novel, based on the results of the comparison.

[0584] "Emotion analysis means" refers to a means of analyzing a user's emotions based on data acquired from cameras and voice input devices.

[0585] "Adjustment means" refers to a means of changing the tone and content of the feedback provided in accordance with the analyzed emotions of the user.

[0586] The system for implementing this invention consists of multiple elements and realizes comprehensive data processing and user interaction. First, the server receives content uploaded by the user. The devices used here include internet-connected terminals such as smartphones and smart glasses.

[0587] The server first uses data analysis software to extract images and text from the content, identifying and separating out the relevant parts of the content. Tools such as the Google Cloud Vision API may be used for image analysis in this process.

[0588] Next, the server compares the extracted data with an existing internal database. This comparison process uses generative AI models such as OpenAI's GPT to verify copyright and determine the novelty of the information.

[0589] Furthermore, the server processes camera and microphone information sent from the user's device in real time to analyze the user's emotions. IBM Watson's emotion analysis API may be used for this analysis. Based on the analyzed emotion data, the server adjusts the content of notifications and feedback provided to ensure appropriate communication tailored to the user's psychological state.

[0590] As a concrete example, consider a scenario where a user creates content using generative AI and encounters copyright issues. In this case, the server provides tailored feedback such as, "This section is similar to existing work. Here are some tips to enhance its originality. Relax and enjoy coming up with great ideas!"

[0591] An example of a prompt for a generative AI model might be: "Analyze user-generated content and generate copyright feedback in a gentle tone using sentiment data."
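
As a non-limiting illustration, a prompt of this kind may be assembled from the matching result and the analyzed emotion as sketched below. The emotion labels, the tone wording, and the names TONE_BY_EMOTION and build_feedback_prompt are hypothetical and are introduced only for this sketch. The sketch builds the prompt text only and does not call any generative AI model.

```python
# Hypothetical mapping from an analyzed emotion label to a requested tone.
TONE_BY_EMOTION = {
    "anxious": "gentle and reassuring",
    "stressed": "gentle and reassuring",
    "neutral": "plain and factual",
}

DEFAULT_TONE = "plain and factual"


def build_feedback_prompt(matching_summary: str, emotion: str) -> str:
    """Return prompt text asking a generative AI model for copyright feedback.

    matching_summary: a short description of the matching result.
    emotion: a label produced by the emotion analysis (assumed format).
    """
    tone = TONE_BY_EMOTION.get(emotion, DEFAULT_TONE)
    return (
        "Analyze the user-generated content and generate copyright feedback.\n"
        f"Tone: {tone}.\n"
        f"Detected user emotion: {emotion}.\n"
        f"Matching result: {matching_summary}"
    )
```

For example, build_feedback_prompt("Section 2 is similar to an existing work", "anxious") requests a gentle and reassuring tone, while an emotion label that is not in the table falls back to the plain tone.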

[0592] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0593] Step 1:

[0594] Users upload generated content to the system via devices such as smartphones or smart glasses. The input is content data from the user's device, which is sent to the server. The output is the content data received by the server.

[0595] Step 2:

[0596] The server extracts images and text from uploaded content. The input is the received content data, and the server uses image analysis software and text analysis algorithms to extract data based on this data. The output is the extracted image and text data.

[0597] Step 3:

[0598] The server compares the extracted images and text with an existing database. The input is the extracted data, which is compared with the information in the database using OpenAI's GPT model and image matching algorithms. The output is an evaluation of the degree of match and similarity as a result of the matching.

[0599] Step 4:

[0600] The server receives data in real time from the terminal's camera and microphone and analyzes the user's emotions. The input is the camera and microphone data that carries the user's emotional information, and the server performs the analysis using IBM Watson's emotion analysis API. The output is an evaluation of the user's current emotional state.

[0601] Step 5:

[0602] The server generates feedback for the user based on the matching results and emotional state. The inputs are the legal evaluation of the judged content and the user's emotional state. Accordingly, the server adjusts the tone of the feedback and, if necessary, uses a generative AI model to generate specific advice and suggestions. The output is the adjusted feedback message sent to the user.

[0603] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0604] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0605] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0606] [Fourth Embodiment]

[0607] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0608] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0609] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0610] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0611] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio in accordance with instructions from the processor 46.

[0612] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0613] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0614] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0615] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0616] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0617] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0618] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0619] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0620] This invention provides a system for automatically verifying content created by generative AI and reducing legal risks. This system is operated by a server and ensures the legality and accuracy of the generated content by analyzing and matching images and text.

[0621] The server first receives content generated from the user's device and extracts image and text data from it. The extracted data is sent to a database connected to the server as a matching key. This database contains existing copyright data and published guidelines, which are then used to compare and analyze the data.

[0622] During the matching process, the server uses image recognition algorithms to extract visual features and natural language processing algorithms to analyze the semantic content of the text. This allows the server to evaluate the similarity of the content with high accuracy.
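
As a non-limiting illustration, the similarity evaluation may be sketched with two simple measures: cosine similarity between image feature vectors, and Jaccard similarity between sets of word sequences (shingles) taken from the text. Both measures are stand-ins chosen for this sketch and are not specified in this disclosure. The Jaccard measure captures only lexical overlap, so it is a simplification of the semantic analysis described above. The function names are hypothetical, and the feature vectors are assumed to have been produced beforehand by an image recognition algorithm.

```python
import math
import re
from typing import Sequence, Set, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def word_shingles(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Set of n-word sequences in the text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        # Short texts are treated as a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard_similarity(text_a: str, text_b: str, n: int = 3) -> float:
    """Shared shingles divided by all shingles; 0.0 if both texts are empty."""
    a, b = word_shingles(text_a, n), word_shingles(text_b, n)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Both functions return values between 0 and 1 for non-negative feature vectors, so image and text results can be compared against the same kind of threshold in a later determination step.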

[0623] After the initial assessment, the server performs a detailed evaluation of the content deemed problematic based on the matching results. For example, if a particular image matches existing copyrighted material, the server detects this and sends a notification to the user's device. The notification includes recommendations on which parts are problematic and how to correct them.

[0624] Afterward, the user can modify the content on the device that received the notification. Once completed, the user resubmits the modified content to the server. The server then verifies it again to determine if the problem has been resolved. If all issues are resolved, the server notifies the user and approves the use of the content.

[0625] For example, consider a scenario where a user creates an image for online posting using generative AI, and that image partially resembles an existing copyrighted work. In this case, the server identifies the relevant part and sends instructions to the user to make corrections. After the user appropriately corrects the image, the server re-examines it to confirm that the problem has been resolved, allowing the user to confidently publish the image. Through this process, users can use the generated content without worrying about legal hurdles.

[0626] The following describes the processing flow.

[0627] Step 1:

[0628] The user uses a device to instruct the generative AI to create content, and uploads the generated images and text to the system. The server receives them.

[0629] Step 2:

[0630] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology obtains the text as digital character information.

[0631] Step 3:

[0632] The server uses the extracted data to begin matching it against its internally maintained database. The database contains copyright information and publicly available content data; images are compared using feature vectors, and text is compared using document matching techniques.

[0633] Step 4:

[0634] The server determines, based on the matching results, whether there is a potential copyright infringement and whether the information is inaccurate. This determination includes numerical analysis of the degree of agreement and confidence of the results.

[0635] Step 5:

[0636] The server notifies the user's device of the detection results. The notification includes the specific part of the content in which the problem was found, as well as advice on how to address and correct it.

[0637] Step 6:

[0638] The user uses their device to receive notifications from the server and correct the content that has been flagged as problematic. Once the necessary corrections are complete, they resubmit the content to the server.

[0639] Step 7:

[0640] The server receives the corrected content, performs another check, and verifies that the problem has been resolved. If all check results are clear, it is determined that there is no problem.

[0641] Step 8:

[0642] The server notifies the user again, confirming that the corrected content is free of problems. This allows the user to use the corrected content with confidence.
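
As a non-limiting illustration, the check, notification, correction, and re-check cycle of Steps 4 to 8 may be sketched as a loop. The names CheckResult and review_until_cleared are hypothetical. The check and revise callables stand in for the server-side determination and the user's correction, respectively. The upper limit on the number of rounds is an assumption of this sketch and is not stated in this disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    # Each entry describes one problem found in the content.
    issues: List[str] = field(default_factory=list)

    @property
    def cleared(self) -> bool:
        return not self.issues


def review_until_cleared(
    content: str,
    check: Callable[[str], CheckResult],
    revise: Callable[[str, CheckResult], str],
    max_rounds: int = 3,
) -> Tuple[bool, str, int]:
    """Check the content, request a revision if needed, and check again.

    Returns (approved, final content, number of checks performed).
    """
    for round_no in range(1, max_rounds + 1):
        result = check(content)
        if result.cleared:
            return True, content, round_no
        # Corresponds to notifying the user and receiving the resubmission.
        content = revise(content, result)
    return False, content, max_rounds
```

Returning the number of checks makes it possible to report how many resubmissions were needed before the content was approved.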

[0643] (Example 1)

[0644] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0645] There is a need to efficiently verify copyright infringement and originality of content created using generative AI models, thereby reducing legal risks and ensuring users can use the content with confidence. Current methods require specialized knowledge and a significant amount of time, which is burdensome for users.

[0646] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0647] In this invention, the server includes separation means for separating image data and text data from content, comparison means for comparing the separated image data and text data with existing information aggregation devices, and means for notifying the user of instructions based on the evaluation results. This enables automatic content analysis and notification of potential legal infringements.

[0648] "Generative devices" refer to devices or software used to artificially generate content, and primarily refer to data generation using generative AI models.

[0649] An "information processing device" refers to an electronic device or system that processes, analyzes, and compares data, and performs various operations on content received from a user.

[0650] "Separation means" refers to a process or mechanism for individually extracting image data and text data from received content.

[0651] An "information aggregation device" refers to a database or storage system in which existing data and information are accumulated, and is used for referencing, matching, and comparison.

[0652] "Comparison means" refers to a process or algorithm for determining similarity or novelty by comparing extracted or separated data with existing information.

[0653] "Evaluation methods" refer to methods and functions for determining the legal compliance and uniqueness of data based on comparative results.

[0654] "Visual features" refer to features used to analyze and extract elements such as shape, color, and pattern from image data.

[0655] "Content interpretation means" refers to algorithms or processes for analyzing the structure and meaning of linguistic data and interpreting the significance and intent of the information.

[0656] "Instruction and notification means" refers to a means of communication or presentation used to convey specific corrections or warnings to the user based on the evaluation results.

[0657] This invention is a system for mitigating legal risks associated with content created by generative AI models. This system is primarily server-based and processes content through the user's terminal. Specifically, it includes the following components:

[0658] First, the user generates content such as images and text using a generative AI model and sends it to the server via their device. This content arrives at the server in the form of image data and text data. The server then prepares these for individual analysis using a data separation mechanism.
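
As a non-limiting illustration, the data separation mechanism may be sketched as follows, assuming that the content arrives as a list of parts each labeled with a media type. This packaging format, the names ContentPart and separate, and the handling of parts that are neither image nor text are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ContentPart:
    media_type: str  # for example "image/png" or "text/plain" (assumed labels)
    payload: bytes


def separate(parts: List[ContentPart]) -> Tuple[List[bytes], List[str]]:
    """Split received content into image data and text data.

    Image payloads are kept as raw bytes for later image analysis.
    Text payloads are decoded as UTF-8, replacing undecodable bytes.
    Parts of any other media type are ignored in this sketch.
    """
    images: List[bytes] = []
    texts: List[str] = []
    for part in parts:
        if part.media_type.startswith("image/"):
            images.append(part.payload)
        elif part.media_type.startswith("text/"):
            texts.append(part.payload.decode("utf-8", errors="replace"))
    return images, texts
```

The two returned lists correspond to the image data and text data that are analyzed individually in the subsequent comparison.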

[0659] Next, the server uses an information aggregation device, i.e., a database, for verification. This database stores information on known copyrighted works and legal guidelines. Using comparison tools, the server compares this existing information with the content data. In this process, image recognition algorithms (e.g., OpenCV) are used for analyzing the visual features of image data, and natural language processing (e.g., SpaCy) is used for interpreting the content of text data. This allows the server to evaluate the originality and legal compliance of the generated content.

[0660] Once the evaluation is complete, the user will be notified of any issues and instructions for correction as needed. For example, if an image generated by the user using an AI model for online posting is similar to an existing copyrighted work, a partial revision will be suggested. The user will receive this notification and make the necessary corrections on their device. After the revisions, the content will be resubmitted to the server for re-evaluation and final approval.

[0661] A concrete example of a prompt message is: "We are developing an AI that generates original images for social media posts. Please explain how to automate compliance checks to ensure these images are not similar to other copyrighted works and to avoid legal risks." In this way, the system provides users with the ability to use generated content with peace of mind, without legal concerns.

[0662] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0663] Step 1:

[0664] The user's device uses a generative AI model to generate content. The input at this stage is the user's prompt text, and the generated content (images and text) is output. The user then prepares to send this output content to the server.

[0665] Step 2:

[0666] The server receives content generated by the user. The server's input is the content file received from the user's terminal. The server receives the content and uses a separation mechanism to decompose it into image data and text data. The output is image data and text data that can be processed individually. This separation prepares the server for data analysis.

[0667] Step 3:

[0668] The server compares the separated image and text data with the information aggregation device. The input is the image and text data sent to the server, and the output is the comparison result. The server uses a comparison means to extract visual features with an image recognition algorithm and analyze the semantic content of the text with a natural language processing algorithm. Similarity is evaluated based on the output comparison result.

[0669] Step 4:

[0670] The server performs an evaluation based on the matching results. It uses evaluation tools to determine legal compliance and originality. The input is the matching results obtained in the previous step. The output is the evaluation results. If infringement is suspected, specific problem areas and necessary modifications are identified.

[0671] Step 5:

[0672] The server sends a notification to the user's terminal based on the evaluation results. The input is information about the evaluation results, and the output is a message to the user with specific correction instructions or warnings. Through the notification system, the user is informed in detail about the problem areas and which parts need to be corrected and how.

[0673] Step 6:

[0674] The user modifies the content on their device based on the notification. The input is the notification from the server. The user makes the necessary corrections and generates the corrected content as output. This corrected version is then sent back to the server for re-evaluation.

[0675] Step 7:

[0676] The server re-evaluates the revised content. The input is the revised content submitted by the user, and the output is the final evaluation result. The server performs another analysis to check for any problems. If the problems are resolved, the server notifies the user that they are approved to use the content.

[0677] (Application Example 1)

[0678] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0679] In recent years, content creation using generative AI has rapidly become widespread. However, the potential for the generated content to infringe on intellectual property rights such as copyrights poses a legal risk. Furthermore, there is a need for effective systems that can automatically detect similar content and provide appropriate improvement suggestions.

[0680] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0681] In this invention, the server includes an extraction means for extracting visual and textual information from content, a matching means for comparing the extracted visual and textual information with an existing information archive, a determination means for determining intellectual property infringement and originality of the information based on the matching results, and a suggestion means for making improvement suggestions based on the identified similarities. This reduces the legal risks of content generated by AI, while enabling users to confidently publish their own content.

[0682] An "information processing device" is a computer system used to process digital data and perform specific tasks.

[0683] "Visual information" refers to data that is visually recognized, such as images and shapes.

[0684] "Textual information" refers to a series of characters, symbols, and other data that make up text data.

[0685] "Extraction means" refers to a process or apparatus for obtaining specific elements, such as visual or textual information, from content.

[0686] An "information aggregate" refers to a collection of existing information stored in a database, including data and knowledge bases collected in the past.

[0687] "Matching means" refers to a method or function for comparing newly obtained data with existing information aggregates.

[0688] A "determination means" is a mechanism or method for evaluating whether a particular criterion is met based on the results of a data comparison.

[0689] "Intellectual property infringement" refers to the act of illegally using or infringing on another person's intellectual property rights.

[0690] "Originality" refers to possessing new and distinctive characteristics that clearly distinguish the content from other products or content.

[0691] A "suggestion means" is a function or process for presenting the optimal solution or improvement method under specific conditions.

[0692] In this invention, first, the user creates images and text information with a generative AI model using a content generation terminal. The user then transmits the generated content to an information processing device. The information processing device extracts visual and text information from the content using an extraction means. This extraction is performed by the components of the system working in cooperation. The extracted information is sent to a matching means and compared with an existing information aggregate. The information aggregate used here includes historical data and publicly available guidelines.

[0693] Based on the matching results, the server uses a determination tool to assess the potential for intellectual property infringement and the originality of the information. During the determination process, software libraries such as TensorFlow and SpaCy are used to effectively perform image recognition and natural language processing. If a problem is identified through this process, the user is notified of improvement suggestions using a suggestion tool.

[0694] Users receive improvement suggestions on their devices and revise the content. The revised content is then sent back to the information processing device for re-evaluation. This ensures that the improved content is legally compliant, allowing for confident use and publication.

[0695] As a concrete example, consider a scenario where a user creates digital art and it is determined to be similar to an existing painting. In this case, the server clarifies the similarities and suggests how to enhance the originality by modifying certain parts of the visual information. These suggestions might include changing the color tone or adding unique patterns.

[0696] As an example of a prompt, the AI model is given the prompt, "Determine whether this digital art is similar to other works, and tell me which parts need to be corrected and how to make those corrections." The AI then suggests the most suitable correction method.

[0697] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0698] Step 1:

[0699] The user creates content on their device using a generative AI model. The input is the text and image data supplied by the user. The generative AI model generates the content from this input, and the generated content is stored on the device.

[0700] Step 2:

[0701] The terminal sends content data to the information processing device. The input includes the text and image data generated in step 1. The output is the data being passed to the information processing device. This action prepares the data for processing on the server.

[0702] Step 3:

[0703] The server uses extraction methods to extract visual and textual information from the content. Data sent from the terminal is taken into the server as input. The output is the extracted visual and textual information. The server uses image recognition algorithms to extract visual features and natural language processing to analyze the text content.

[0704] Step 4:

[0705] The server uses matching mechanisms to compare the extracted data with existing data sets. The input is the extraction results from step 3. The output is the similarity score and corresponding data points as the matching results. In this process, the data is compared with the information in the database and a similarity analysis is performed.

[0706] Step 5:

[0707] The server evaluates the possibility of intellectual property infringement based on the matching results through a determination mechanism. The input is the matching results from step 4. The output is determination information regarding the possibility of infringement and originality. Here, the evaluation is performed according to intellectual property rules.

[0708] Step 6:

[0709] The server uses a suggestion mechanism to notify the user of improvement suggestions based on the judgment result. The judgment information from step 5 is used as input. The output is a notification of what needs to be improved and specific correction suggestions. The user receives this information on their terminal and is encouraged to make appropriate corrections.

[0710] Step 7:

[0711] The user modifies the content on their device and sends it back to the server. The input is the modified content data. The output is the modified data being sent back to the server. The revision is completed by the user's editing action.

[0712] Step 8:

[0713] The server re-evaluates the corrected content using the evaluation method. The input is the corrected data submitted in step 7. The output is the final evaluation result of the improved content. The server can then confirm that there are no legal issues and notify the user.
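
As a non-limiting illustration, the determination of Step 5 and the suggestion of Step 6 may be sketched as follows, taking the similarity scores from Step 4 as input. The names Match, Determination, and determine, the two threshold values, and the wording of the suggestions are hypothetical and are not specified in this disclosure. Appropriate thresholds would depend on the matching method actually used.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    region: str        # the part of the content that was matched
    reference: str     # identifier of the existing work in the database
    similarity: float  # similarity score, assumed to lie between 0 and 1


@dataclass
class Determination:
    possible_infringement: bool
    original: bool
    suggestions: List[str]


def determine(
    matches: List[Match],
    infringe_at: float = 0.8,     # hypothetical threshold for flagging a match
    original_below: float = 0.5,  # hypothetical threshold for originality
) -> Determination:
    """Turn matching results into determination information and suggestions."""
    flagged = [m for m in matches if m.similarity >= infringe_at]
    top = max((m.similarity for m in matches), default=0.0)
    suggestions = [
        f"Revise '{m.region}': similarity {m.similarity:.2f} to {m.reference}"
        for m in flagged
    ]
    return Determination(bool(flagged), top < original_below, suggestions)
```

Using two separate thresholds leaves a middle band in which content is neither flagged as a possible infringement nor judged to be original.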

[0714] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.

[0715] This invention provides a system that evaluates content generated by generative AI and gives appropriate notifications to the user. By combining this evaluation with an emotion engine, the system provides interactions that take the user's emotions into consideration. In this system, the server plays the central role and performs both emotion analysis and content matching.

[0716] The server receives content uploaded from the user's device and first extracts images and text. The server then compares this data with its internal database to determine potential copyright infringement and the novelty of the information. If the comparison reveals any issues, the server notifies the user.

[0717] Furthermore, this system is equipped with an emotion engine that can analyze the user's emotions. The emotion engine evaluates the user's emotions in real time based on data acquired from the camera, microphone, etc., on the user's device. This emotion data is sent to the server, and the notification content is adjusted according to the user's emotional state. For example, if the user is feeling stressed, the system will take an approach such as making the tone of the notification softer.

[0718] As a concrete example, consider a situation where a user creates a work using a generative AI and encounters copyright issues. In this case, the server uses an emotion engine to analyze the user's emotional state. If the user is feeling anxious, the server provides a gentle notification, offering not only corrections but also resources and support options to create a more user-friendly experience. Furthermore, the server performs a similar process when the user resubmits the content after making corrections, supporting the entire process.

[0719] By incorporating the emotion engine in this way, the system improves the user experience and ensures both the legal safety of the content and the psychological comfort of the user.

[0720] The following describes the processing flow.

[0721] Step 1:

[0722] The user uses their device to request content creation from the generative AI and uploads the generated images and text to the system. The server receives this content.

[0723] Step 2:

[0724] The server extracts image data and text data from the received content. Image recognition algorithms analyze the features of the images, and OCR technology digitizes and retrieves the text.

[0725] Step 3:

[0726] The server compares the extracted images and text against an internal database. This database contains copyright information and known data; images are searched based on visual similarity, and text is compared using natural language processing techniques.

[0727] Step 4:

[0728] Based on the matching results, the server determines the potential for copyright infringement of the content and the novelty of the information. The results include a numerical analysis of discrepancies and similarities.

[0729] Step 5:

[0730] The server reviews the assessment results and, if a problem is detected, activates the emotion engine. Based on camera and microphone data acquired from the user's device, it evaluates the user's emotional state.

[0731] Step 6:

[0732] The server adjusts the notification content according to the user's emotional state. For example, if the user is feeling stressed, it will notify them of the problem and suggest improvements in a gentle tone.

[0733] Step 7:

[0734] The user receives a notification on their device and corrects the indicated content. Once the user has completed the correction, the updated content is resent to the server.

[0735] Step 8:

[0736] The server receives the corrected content, performs another check, and verifies that all issues have been resolved.

[0737] Step 9:

[0738] The server notifies the user that the modified content has been cleared, providing final reassurance. This allows the user to confirm that the content is safe to use.
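Steps 7 to 9 amount to a check, revise, re-check loop. The following is a minimal Python sketch of that loop. The names `CheckResult` and `review_until_cleared`, the `max_rounds` limit, and the toy check and revision functions are illustrative assumptions and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    issues: List[str]  # human-readable descriptions of detected problems

    @property
    def cleared(self) -> bool:
        return not self.issues


def review_until_cleared(
    content: str,
    check: Callable[[str], CheckResult],
    revise: Callable[[str, CheckResult], str],
    max_rounds: int = 3,  # assumed limit; the source does not specify one
) -> Tuple[str, bool, int]:
    """Check the content, let the user revise it, and re-check until cleared."""
    for round_no in range(1, max_rounds + 1):
        result = check(content)
        if result.cleared:
            return content, True, round_no
        # Stand-in for Steps 6 and 7: notify the user and receive the revision.
        content = revise(content, result)
    return content, check(content).cleared, max_rounds


# Toy stand-ins for the server-side check and the user's correction.
def toy_check(text: str) -> CheckResult:
    found = "famous slogan" in text
    return CheckResult(["matches a known phrase"] if found else [])


def toy_revise(text: str, result: CheckResult) -> str:
    return text.replace("famous slogan", "original tagline")


final_text, cleared, rounds = review_until_cleared(
    "Buy now: famous slogan", toy_check, toy_revise
)
```

In this toy run the first check finds a problem, the revision removes it, and the second check clears the content, which corresponds to the notification in Step 9.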

[0739] (Example 2)

[0740] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0741] In recent years, the creation of content using generative AI has increased. However, this has led to problems such as the risk of copyright infringement and a lack of originality in the created content. Furthermore, in order for users to respond appropriately to these risks, interactive notifications that take into account the user's emotional state are required. Therefore, a system is needed that guarantees the rights and originality of content while taking user emotions into consideration.

[0742] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0743] In this invention, the server includes extraction means for extracting images and text from content; matching means for comparing the extracted images and text with an existing database; determination means for determining copyright infringement of the content and novelty of the information based on the matching results; sentiment analysis means for analyzing the user's emotions; notification adjustment means for adjusting notification content based on the analyzed sentiment data; and notification means for providing notifications according to the user's emotional state. This makes it possible to provide users with information about the content in an appropriate and emotionally sensitive manner, thereby reducing legal risks and realizing a comfortable user experience.

[0744] A "generating device" refers to the entire system used to create content using generative AI.

[0745] A "data processing device" refers to the entire system of equipment that performs data processing, such as evaluating generated content and analyzing user sentiment.

[0746] "Extraction means" refers to a function that extracts image and text data from received content.

[0747] "Matching means" refers to a function that compares extracted image and text data with data in an existing database.

[0748] "Determination means" refers to a function that evaluates the possibility of copyright infringement and the novelty of content based on the results of the matching means.

[0749] "Sentiment analysis means" refers to a function that integrates software and hardware for analyzing a user's emotions.

[0750] "Notification adjustment means" refers to a function that adjusts the content and tone of notifications sent to the user based on analyzed sentiment data.

[0751] "Notification means" refers to the entire set of communication technologies and interfaces used to provide information to users.

[0752] In this invention, a content generation device and a data processing device operate in conjunction. The server receives content generated using a generative AI model from the user's terminal. The content is transferred over the HTTP protocol. The extraction means then extracts image and text data from the received content. In this process, software such as "OpenCV" is used as an image analysis library, and "Tesseract OCR" is used for text extraction.

[0753] Next, the server uses matching means to compare the extracted data with an existing database. The database contains copyright information and data on similar existing content in the market, and this information is used to determine the potential for copyright infringement and the novelty of the content.
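The disclosure does not name a particular text-comparison technique. As one possible illustration, the following Python sketch scores a candidate text against database entries using Jaccard similarity over word trigrams and derives an infringement flag and a novelty value from the best match. The function names, the trigram size, and the `risk_threshold` of 0.5 are assumptions made for this sketch.

```python
import re
from typing import Dict, Optional, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(text: str, n: int = 3) -> Set[Shingle]:
    """Return the set of overlapping n-word sequences in the text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < n:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def assess_text(
    candidate: str,
    database: Dict[str, str],
    risk_threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> dict:
    """Compare the candidate with every database entry and keep the best match."""
    cand = shingles(candidate)
    best_id: Optional[str] = None
    best_score = 0.0
    for work_id, text in database.items():
        score = jaccard(cand, shingles(text))
        if score > best_score:
            best_id, best_score = work_id, score
    return {
        "closest_work": best_id,
        "similarity": best_score,
        "novelty": 1.0 - best_score,
        "possible_infringement": best_score >= risk_threshold,
    }


# Hypothetical one-entry database used only for this example.
db = {"work-1": "the quick brown fox jumps over the lazy dog"}
copied = assess_text("The quick brown fox jumps over the lazy dog", db)
fresh = assess_text("an entirely different sentence about clouds", db)
```

A production system would more likely use embeddings or a dedicated plagiarism index; the set-based score is chosen here only because it is easy to inspect.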

[0754] The server also performs sentiment analysis using camera and microphone data acquired from the user's device. This sentiment analysis utilizes technologies such as the "Emotion API" to evaluate the user's emotional state. The results of this evaluation are then used by a notification adjustment mechanism to flexibly modify the content and tone of notifications delivered to the user. These notifications are delivered in various forms, such as push notifications or email, with the receiving method tailored to the user's preference.

[0755] As a concrete example, suppose a user uses AI to create a new piece of art and uploads it. The server checks for similarity to existing works, and if a similar work exists, it analyzes the user's emotional state and, if it determines that the user is feeling anxious, sends a message in a gentle tone such as, "This work may be similar to an existing one. Please refer to this guide to try revising it. If you need assistance, please contact customer support."
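The tone adjustment in this example can be pictured as a selection between message templates. The sketch below is one hedged way to express it in Python; the `Determination` type, the set of emotional states that trigger the gentle tone, and the wording of the neutral and cleared messages are assumptions, while the gentle message reuses the wording quoted above.

```python
from dataclasses import dataclass


@dataclass
class Determination:
    possible_infringement: bool
    similarity: float  # 0.0 to 1.0, as produced by the matching step


# Assumed labels; the source mentions anxious and stressed users.
GENTLE_STATES = {"anxious", "stressed"}


def build_notification(det: Determination, emotion: str) -> str:
    """Pick the notification wording from the determination and the emotion."""
    if not det.possible_infringement:
        return "No similar existing work was found. Your content is cleared for use."
    core = f"This work may be similar to an existing one (similarity {det.similarity:.0%})."
    if emotion in GENTLE_STATES:
        return (
            core
            + " Please refer to this guide to try revising it."
            + " If you need assistance, please contact customer support."
        )
    return core + " Please revise the flagged parts and resubmit."


flagged = Determination(possible_infringement=True, similarity=0.72)
anxious_msg = build_notification(flagged, "anxious")
calm_msg = build_notification(flagged, "calm")
cleared_msg = build_notification(Determination(False, 0.1), "anxious")
```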

[0756] A concrete example of a prompt message for a generative AI model would be: "When using generative AI to create a new piece of artwork, please help me identify how my work differs from existing works."

[0757] In this way, this system can provide an assessment of content copyright while taking into consideration the user's psychological comfort.

[0758] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0759] Step 1:

[0760] Users generate content using a generative AI model and upload it to the server. Input includes images and text data generated by the user. The server receives this data using the HTTP protocol. The output is a digital file of the content stored on the server.

[0761] Step 2:

[0762] The server extracts images and text from the received content. It receives image and text data contained in a digital file as input. The server processes the image data using an image analysis library (e.g., OpenCV) and extracts the text using a text analysis tool (e.g., Tesseract OCR). The output consists of the extracted image and text information.

[0763] Step 3:

[0764] The server uses matching mechanisms to compare extracted images and text information with existing databases. The input consists of extracted images and text information, as well as existing data stored in the database. The server performs a similarity calculation to evaluate how similar the content is to existing material. The output provides a determination of the content's potential for copyright infringement and its novelty.

[0765] Step 4:

[0766] The server analyzes the user's emotions using emotion analysis tools. It uses camera video and audio data acquired from the user's device as input. The server uses an emotion analysis API to evaluate the user's emotional state in real time from this data. The output is information about the user's emotional state.

[0767] Step 5:

[0768] The server uses a notification adjustment mechanism to adjust the content and tone of notifications according to the user's emotional state. The server takes the judgment result and emotional state information as input and generates the notification content. The adjusted notification message is output.

[0769] Step 6:

[0770] The server sends notifications to the user using various notification methods. Inputs include a tailored notification message and information about how the user receives notifications. The server provides information to the user via push notifications or email. Outputs include feedback on the content from the notification messages received by the user.
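The similarity calculation in Step 3 is not specified for images. One common option, assumed here purely for illustration, is an average hash: each pixel of a small grayscale thumbnail becomes one bit depending on whether it is brighter than the mean, and two images are compared by the fraction of matching bits. The sketch takes thumbnails as plain nested lists; resizing and grayscale conversion, which a library such as OpenCV would handle, are outside its scope.

```python
from typing import List, Sequence


def average_hash(pixels: Sequence[Sequence[int]]) -> List[int]:
    """Return one bit per pixel: 1 if the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    if not flat:
        raise ValueError("pixels must not be empty")
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hash_similarity(h1: Sequence[int], h2: Sequence[int]) -> float:
    """Fraction of matching bits (1.0 means identical hashes)."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    same = sum(1 for a, b in zip(h1, h2) if a == b)
    return same / len(h1)


# Hypothetical 8x8 grayscale thumbnails (values 0 to 255).
gradient = [[x * 32 for x in range(8)] for _ in range(8)]
brighter = [[min(255, v + 10) for v in row] for row in gradient]
inverted = [[255 - v for v in row] for row in gradient]

same_structure = hash_similarity(average_hash(gradient), average_hash(brighter))
opposite = hash_similarity(average_hash(gradient), average_hash(inverted))
```

Because the hash depends only on each pixel's relation to the mean, a uniform brightness shift leaves it unchanged, while the inverted image flips every bit.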

[0771] (Application Example 2)

[0772] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0773] For content created by generative AI, legal issues must be addressed and the user experience improved at the same time. However, conventional systems have offered only superficial legal feedback and have failed to take user emotions into consideration. As a result, there has been no means of alleviating the anxiety and stress that users experience.

[0774] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0775] In this invention, the server includes extraction means for extracting images and text from content, matching means for comparing the extracted data with an existing database, and adjustment means for adjusting the tone of feedback based on the analyzed user sentiment. This makes it possible to provide feedback from a legal perspective while taking user sentiment into consideration, thereby improving the user experience.

[0776] A "data processing device" is a device connected to a content generation device for analyzing and processing content data.

[0777] An "extraction means" is a means that has the function of identifying and extracting images and text from content.

[0778] "Matching means" refers to a method of comparing extracted images and text with an existing database to confirm matches or similarities.

[0779] "Determination means" refers to a means of determining whether content infringes copyright and whether the information is novel, based on the results of the matching.

[0780] "Emotion analysis means" refers to a means of analyzing a user's emotions based on data acquired from cameras and voice input devices.

[0781] "Adjustment means" refers to a means of changing the tone and content of the feedback provided in accordance with the analyzed emotions of the user.

[0782] The system for implementing this invention consists of multiple elements and realizes comprehensive data processing and user interaction. First, the server receives content uploaded by the user. The devices used here include internet-connected terminals such as smartphones and smart glasses.

[0783] The server first uses data analysis software to extract images and text from the content, identifying and isolating the relevant parts of the content. This process may utilize tools such as the Google Cloud Vision API for image analysis.

[0784] Next, the server compares the extracted data with an existing internal database. This comparison process uses generative AI models such as OpenAI's GPT to verify copyright and determine the novelty of the information.

[0785] Furthermore, the server processes camera and microphone information sent from the user's device in real time to analyze the user's emotions. IBM Watson's emotion analysis API may be used for this analysis. Based on the analyzed emotion data, the server adjusts the content of notifications and feedback provided to ensure appropriate communication tailored to the user's psychological state.

[0786] As a concrete example, consider a scenario where a user creates content using a generative AI and encounters copyright issues. In this case, the server provides tailored feedback such as, "This section is similar to existing work. Here are some tips to enhance its originality. Relax and enjoy coming up with great ideas!"

[0787] An example of a prompt for a generative AI model might be: "Analyze user-generated content and generate copyright feedback in a gentle tone using sentiment data."
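A prompt of this kind would typically be assembled from the matching result and the sentiment data before being sent to the model. The Python sketch below only builds the prompt string and makes no API call; the function name, the field layout, and the mapping from emotion to tone are assumptions for illustration.

```python
def build_feedback_prompt(similarity: float, matched_title: str, emotion: str) -> str:
    """Assemble a prompt that carries the matching result and the desired tone."""
    # Assumed rule: anxious or stressed users get a gentle tone.
    gentle = emotion in {"anxious", "stressed"}
    tone = "gentle and encouraging" if gentle else "neutral and concise"
    return (
        "Analyze user-generated content and generate copyright feedback.\n"
        f"Closest existing work: {matched_title}\n"
        f"Similarity score: {similarity:.2f}\n"
        f"User emotional state: {emotion}\n"
        f"Tone: {tone}\n"
        "Include concrete tips to enhance originality."
    )


# "Sunset Over Harbor" is a made-up title used only for this example.
prompt = build_feedback_prompt(0.81, "Sunset Over Harbor", "anxious")
```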

[0788] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0789] Step 1:

[0790] Users upload generated content to the system via devices such as smartphones or smart glasses. The input is content data from the user's device, which is sent to the server. The output is the content data received by the server.

[0791] Step 2:

[0792] The server extracts images and text from uploaded content. The input is the received content data, and the server uses image analysis software and text analysis algorithms to extract data based on this data. The output is the extracted image and text data.

[0793] Step 3:

[0794] The server compares the extracted images and text with an existing database. The input is the extracted data, which is compared with the information in the database using OpenAI's GPT model and image matching algorithms. The output is an evaluation of the degree of match and similarity as a result of the matching.

[0795] Step 4:

[0796] The server receives data in real time from the terminal's camera and microphone to analyze the user's emotional data. The input is the user's emotional information, and the server performs analysis using IBM Watson's emotional analysis API. The output is an evaluation of the user's current emotional state.

[0797] Step 5:

[0798] The server generates feedback for the user based on the matching results and emotional state. The inputs are the legal evaluation of the judged content and the user's emotional state. Accordingly, the server adjusts the tone of the feedback and, if necessary, uses a generative AI model to generate specific advice and suggestions. The output is the adjusted feedback message sent to the user.

[0799] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input in response to the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0800] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions are input to the data generation model 58, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0801] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0802] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0803] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0804] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0805] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion lies, the more visible (expressed in actions) it becomes.
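The layout rules of the emotion map (left for reaction, right for situation, upper for pleasure, lower for displeasure, inner for thoughts, outer for actions) can be expressed as a small coordinate classifier. The Python sketch below is only an illustration: the Cartesian coordinates, the `action_radius` boundary of 0.5, and the label strings are assumptions, since the disclosure gives no numeric layout.

```python
import math
from dataclasses import dataclass


@dataclass
class MapPosition:
    x: float  # negative = left (reaction side), positive = right (situation side)
    y: float  # positive = upper (pleasure), negative = lower (displeasure)


def describe(pos: MapPosition, action_radius: float = 0.5) -> dict:
    """Classify a position on the map by side, valence, and layer."""
    radius = math.hypot(pos.x, pos.y)
    if pos.x > 0:
        origin = "situation"
    elif pos.x < 0:
        origin = "reaction"
    else:
        origin = "both"  # the vertical axis mixes reaction and situation
    if pos.y > 0:
        valence = "pleasure"
    elif pos.y < 0:
        valence = "displeasure"
    else:
        valence = "neutral"
    # Assumed boundary between inner thoughts and outwardly expressed emotions.
    layer = "expressed in action" if radius >= action_radius else "inner"
    return {"origin": origin, "valence": valence, "layer": layer, "radius": radius}


three_oclock = describe(MapPosition(0.3, 0.0))  # toward the 3 o'clock direction
upper_left = describe(MapPosition(-0.6, 0.6))
```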

[0806] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0807] The emotion map defines two emotions that promote learning. One is a negative emotion located around the midpoint between "repentance" and "reflection" on the situation side. In other words, it arises when the robot experiences negative feelings such as "I never want to feel this way again" or "I don't want to be scolded again." The other is a positive emotion located around "desire" on the reaction side. In other words, it arises when the robot has positive feelings such as "I want more" or "I want to know more."

[0808] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
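The disclosure does not state how the final emotion is chosen from the emotion values. A minimal Python sketch, assuming the emotion with the highest value is selected, is shown below; the dictionary stands in for the neural network output, and the numbers, the `tolerance` parameter, and the function names are made up for illustration.

```python
from typing import Dict, List


def determine_emotion(emotion_values: Dict[str, float]) -> str:
    """Return the emotion with the highest value (assumed selection rule)."""
    if not emotion_values:
        raise ValueError("emotion_values must not be empty")
    return max(emotion_values, key=lambda name: emotion_values[name])


def close_emotions(emotion_values: Dict[str, float], tolerance: float = 0.05) -> List[str]:
    """List emotions whose value lies within `tolerance` of the highest value."""
    top = emotion_values[determine_emotion(emotion_values)]
    return [name for name, value in emotion_values.items() if top - value <= tolerance]


# Hypothetical network output: neighbouring emotions receive similar values.
values = {"reassured": 0.82, "calm": 0.80, "confident": 0.78, "anxious": 0.05}
dominant = determine_emotion(values)
neighbours = close_emotions(values)
```

The second helper mirrors the property described for Figure 10, where emotions placed close together on the map end up with similar values.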

[0809] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0810] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0811] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0812] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0813] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0814] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0815] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0816] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0817] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0818] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0819] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0820] The following is further disclosed regarding the embodiments described above.

[0821] (Claim 1)

[0822] In a data processing device connected to a content generation device,

[0823] An extraction means for extracting images and text from content,

[0824] A matching means for comparing extracted images and text with an existing database,

[0825] A system including a determination means for determining copyright infringement of content and novelty of information based on the matching results.

[0826] (Claim 2)

[0827] The system according to claim 1, further comprising a notification means for notifying a user if there is a possibility of copyright infringement.

[0828] (Claim 3)

[0829] The system according to claim 1, further comprising a means for re-evaluating content after it has been modified by the user.

[0830] "Example 1"

[0831] (Claim 1)

[0832] In a system connected to a content generation device,

[0833] A means for separating image data and text data from content,

[0834] A comparison means for comparing separated image data and text data with existing information integrators,

[0835] An evaluation means for assessing the legal infringement of content and the originality of data based on the comparison results,

[0836] A means for analyzing visual features using an image recognition algorithm,

[0837] A means of interpreting the content of text using a language analysis algorithm,

[0838] A system that includes means of notifying users of instructions based on evaluation results.

[0839] (Claim 2)

[0840] The system according to claim 1, characterized in that, if the evaluation means detects a potential infringement of the law, it presents the user with visual and verbal correction instructions.

[0841] (Claim 3)

[0842] The system according to claim 1, further comprising a re-evaluation means for re-evaluating user-modified content and providing final approval or additional modification instructions.

[0843] "Application Example 1"

[0844] (Claim 1)

[0845] In a system connected to a content generation device,

[0846] An extraction means for extracting visual and textual information from content,

[0847] A matching means for comparing extracted visual and textual information with existing information aggregates,

[0848] A determination means for determining intellectual property infringement and originality of information based on the matching results,

[0849] A system that includes a suggestion mechanism for making improvement suggestions based on identified similarities.

[0850] (Claim 2)

[0851] The system according to claim 1, further comprising a notification means that notifies the user if there is a possibility of intellectual property infringement and provides suggestions for improvement.

[0852] (Claim 3)

[0853] The system according to claim 1, further comprising a means for re-evaluating the content after it has been modified by the user and for confirming the effectiveness of applying the proposed content.

[0854] "Example 2 of combining an emotion engine"

[0855] (Claim 1)

[0856] In a data processing device connected to a content generation device,

[0857] An extraction means for extracting images and text from content,

[0858] A matching means for comparing extracted images and text with an existing database,

[0859] A determination means for determining copyright infringement of content and novelty of information based on the matching results,

[0860] A means of analyzing user emotions,

[0861] A notification adjustment means that adjusts the content of notifications based on analyzed emotion data,

[0862] A notification means that provides notifications according to the user's emotional state,

[0863] A system that includes this.

[0864] (Claim 2)

[0865] The system according to claim 1, characterized in that the determination means notifies the user if there is a possibility of copyright infringement.

[0866] (Claim 3)

[0867] The system according to claim 1, further comprising a means for re-evaluating content after it has been modified by the user.

[0868] "Application example 2 when combining with an emotional engine"

[0869] (Claim 1)

[0870] In a data processing device connected to a content generation device,

[0871] An extraction means for extracting images and text from content,

[0872] A matching means for comparing extracted images and text with an existing database,

[0873] A determination means for determining copyright infringement of content and novelty of information based on the matching results,

[0874] An emotion analysis means that acquires data from a camera or voice input device in order to analyze the user's emotions,

[0875] An adjustment means for adjusting the tone of feedback based on the analyzed user's emotions,

[0876] A system that includes this.

[0877] (Claim 2)

[0878] The system according to claim 1, further comprising a notification means for notifying a user if there is a possibility of copyright infringement.

[0879] (Claim 3)

[0880] The system according to claim 1, further comprising a means for re-checking user-modified content and providing feedback appropriate to the user's sentiment.

[Explanation of Symbols]

[0881]
10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. In a system connected to a content generation device, An extraction means for extracting visual and textual information from content, A matching means for comparing extracted visual and textual information with existing information aggregates, A determination means for determining intellectual property infringement and originality of information based on the matching results, A system that includes a suggestion mechanism for making improvement suggestions based on identified similarities.

2. The system according to claim 1, further comprising a notification means that notifies the user if there is a possibility of intellectual property infringement and provides suggestions for improvement.

3. The system according to claim 1, further comprising a means for re-evaluating the content after it has been modified by the user and for confirming the effectiveness of applying the proposed content.