system
A consultative body measures AI-generated image biases and provides ethical education to ensure fair and ethical use, improving credibility and bias mitigation in AI-generated content.
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- SOFTBANK GROUP CORP
- Filing Date
- 2024-12-13
- Publication Date
- 2026-06-25
AI Technical Summary
Artificial intelligence-generated images can contain cultural or social biases that may impair the evaluation and reliability of organizations, and there is a lack of effective mechanisms to address these biases ethically and fairly.
A consultative body involving various industries measures the discomfort index of AI-generated images, provides an evaluation system with authentication information, and offers ethical education through an AI agent to promote fair and ethical use.
The system enables organizations to evaluate and mitigate cultural and social biases in AI-generated images, enhancing credibility and ethical awareness, and provides certification for appropriate use.
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, the method including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a character of the chatbot, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] Images generated by artificial intelligence may carry latent cultural or social biases that have inappropriate effects in a specific region or cultural sphere. As a result, the reputation and reliability of an organization may be impaired. In addition, because the criteria and mechanisms for taking appropriate countermeasures against biases are insufficient, it is difficult to use artificial intelligence fairly and ethically. Industries therefore need to cooperate to measure the discomfort index and to build evaluation criteria and a certification system for reducing the influence of biases.
Means for Solving the Problems
[0005] This invention solves this problem by establishing a consultative body in which entities from various industries participate, measuring the discomfort index of artificial intelligence-generated images, and constructing a system that provides an indicator for appropriate use. Specifically, it integrates a feedback system having an algorithm that analyzes generated images and calculates the discomfort index, and provides authentication information to entities that have implemented appropriate bias countermeasures. Furthermore, by using an artificial intelligence agent to provide relevant ethical education content to entities, it promotes the improvement of ethical awareness. This makes it possible to appropriately evaluate the cultural and social impact of the use of generated images and improve the credibility of organizations.
[0006] A "consultative body" is an organization or platform in which entities from different industries participate for a common purpose and share information and technology.
[0007] "Images generated by artificial intelligence" refers to visual content digitally created using AI technology, often automatically generated by machine learning algorithms.
[0008] The "discomfort index" is a numerical indicator that quantifies the degree to which a generated image is perceived as inappropriate or offensive within a particular culture or society.
[0009] "Indicators for determining usability" refer to criteria and guidelines for evaluating whether the generated images are appropriate to use.
[0010] A "bias prevention system" refers to internal organizational mechanisms and measures designed to mitigate or eliminate potential biases and unfairness in generated images.
[0011] "Certification information" refers to a certificate or mark officially given to an entity that meets specific standards, indicating that those standards have been achieved.
[0012] An "artificial intelligence agent" is a program or system that is designed for a specific purpose and performs tasks using its own judgment and learning capabilities.
[0013] "Ethical education content" refers to educational content that includes the knowledge and ways of thinking necessary to promote the appropriate use of artificial intelligence based on a specific culture or value system.
Brief Description of the Drawings
[0014]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of the data processing device and smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map on which multiple emotions are mapped.
[Figure 10] An emotion map on which multiple emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2, in which an emotion engine is combined.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2, in which an emotion engine is combined.
Embodiments for Carrying Out the Invention
[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0016] First, the terms used in the following description will be explained.
[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.
[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.
[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0020] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0021] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."
[0022] [First Embodiment]
[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0031] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0035] This invention is a system for mitigating cultural and social biases associated with artificial intelligence-generated images and promoting fair and ethical use. This system is realized through the participation of diverse industries within the consultative body and the integration of knowledge and technology through a shared platform.
[0036] Specifically, a mechanism that evaluates the potential bias in AI-generated images and calculates a discomfort index plays a crucial role. First, the user uploads the AI-generated image from their device to the server. Next, the server sends the image to an Asia-specific feedback system to calculate the discomfort index. There, an image analysis algorithm runs and generates numerical data on bias while taking the regional culture and social context into account.
[0037] After the calculation is complete, the server creates an index indicating whether the generated image is usable based on the discomfort index. By sending this index to the terminal, companies and organizations can obtain information to determine whether the generated content is ethically appropriate. For example, when a Japanese manufacturer creates advertisements for overseas markets, they can use this system to preemptively eliminate inappropriate expressions in order to take into account the local cultural context.
[0038] Furthermore, this invention provides users with relevant ethical education content through the intervention of an artificial intelligence agent. The server collects knowledge gathered from experts in various countries, and the AI agent distributes this information to terminals as educational content, thereby promoting a sustained improvement in ethical awareness. As a result, companies can develop market strategies while taking measures to counter bias, thereby increasing social acceptance and credibility.
[0039] Finally, the consultative body establishes a process for granting certification information to companies that have implemented appropriate bias prevention measures. The server verifies that the standards have been met and issues a certification mark as a form of recognition. This certification serves as an important benchmark that lets consumers choose products and services with confidence.
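The certification check described above could be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not specify numeric standards, so the threshold `MAX_AVERAGE_DISCOMFORT`, the `Entity` fields, and the format of the returned mark are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical standard on a 0.0-1.0 scale; not taken from the disclosure.
MAX_AVERAGE_DISCOMFORT = 0.3


@dataclass
class Entity:
    name: str
    has_bias_prevention_system: bool
    discomfort_indices: list[float]  # indices of images the entity has submitted


def issue_certification(entity: Entity) -> Optional[str]:
    """Return a certification mark if the assumed standards are met, else None."""
    if not entity.has_bias_prevention_system or not entity.discomfort_indices:
        return None
    average = sum(entity.discomfort_indices) / len(entity.discomfort_indices)
    if average > MAX_AVERAGE_DISCOMFORT:
        return None
    return f"CERTIFIED:{entity.name}"
```

An entity with no bias prevention system, or with no evaluated images, receives no mark in this sketch regardless of its scores.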
[0040] The following describes the processing flow.
[0041] Step 1:
[0042] The user uploads the AI-generated image from their device to the server. The device also performs initial checks to ensure that the image file format and size are appropriate.
[0043] Step 2:
[0044] The server transfers the received image data to the analysis system. Here, an Asia-specific feedback algorithm analyzes the content of the image and calculates a discomfort index.
[0045] Step 3:
[0046] The server creates an index that determines whether to permit or prohibit the use of the generated image based on the calculated discomfort index. This index is quantified as an evaluation that takes into account the cultural or social elements associated with the image.
[0047] Step 4:
[0048] The server sends the generated index to the terminal as an evaluation report. The report includes the reasons for the decision to permit or prohibit use, as well as the specific numerical value of the discomfort index.
[0049] Step 5:
[0050] Users review the evaluation report received on their device. If necessary, they can edit the image and re-upload it to the server. They may also be prompted to receive further educational content.
[0051] Step 6:
[0052] The server automatically generates ethical education content via an artificial intelligence agent and delivers it to the terminal. This educational content is designed to deepen understanding of relevant cultural backgrounds and biases.
[0053] Step 7:
[0054] Based on the consultative body's standards, the server evaluates the discomfort index and the appropriateness of the countermeasures in place, and issues authentication information to user companies that meet the criteria. This authentication information can be accessed via terminals and used in marketing strategies.
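Steps 3 and 4 of the flow above, in which a discomfort index is turned into a usability decision and an evaluation report, might look like the following minimal sketch. The 0.0 to 1.0 scale, the two cut-off values, and the intermediate "review" outcome (corresponding to the edit and re-upload path in Step 5) are assumptions for illustration; the disclosure only states that use is permitted or prohibited based on the index.

```python
from dataclasses import dataclass

# Hypothetical cut-off values on a 0.0-1.0 scale; not taken from the disclosure.
PERMIT_BELOW = 0.3
PROHIBIT_FROM = 0.7


@dataclass
class EvaluationReport:
    discomfort_index: float
    decision: str  # "permit", "review", or "prohibit"
    reason: str


def build_report(discomfort_index: float) -> EvaluationReport:
    """Map a discomfort index to a usability decision and a stated reason."""
    if not 0.0 <= discomfort_index <= 1.0:
        raise ValueError("discomfort index must be within [0.0, 1.0]")
    if discomfort_index < PERMIT_BELOW:
        decision, reason = "permit", "index is below the permit threshold"
    elif discomfort_index < PROHIBIT_FROM:
        decision, reason = "review", "index is between the thresholds; edit and re-upload"
    else:
        decision, reason = "prohibit", "index is at or above the prohibit threshold"
    return EvaluationReport(discomfort_index, decision, reason)
```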
[0055] (Example 1)
[0056] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0057] Visual information generated using artificial intelligence may contain diverse cultural and social biases, requiring careful consideration before use. However, current systems do not adequately automate the evaluation of such biases or the removal of inappropriate elements. Furthermore, ethical consideration of the generated visual information is lacking, and a means of providing appropriate education is needed. The challenge therefore lies in establishing a system that ensures stakeholders take appropriate bias prevention measures and can verify that the visual information they handle is appropriate.
[0058] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0059] This invention includes a server that transmits visual information to a feedback system that evaluates bias in the visual information and calculates a discomfort index based on that evaluation; a server that provides criteria for determining whether or not to allow the use of the generated visual information based on the measurement results; and a server that grants credentials to relevant parties who have established appropriate bias prevention systems. This enables relevant parties to evaluate the cultural and social appropriateness of the generated visual information and use it as ethically safe content.
[0060] "Stakeholders from diverse fields" refers to individuals and organizations from different industries and areas of expertise who cooperate to achieve a common goal.
[0061] A "collaborative entity" is an organizational framework that brings together knowledge and technology to work towards a specific goal.
[0062] "Artificial intelligence" refers to systems and technologies that process large amounts of data and mimic human intelligence.
[0063] "Visual information" refers to digital data that can be seen with the eyes, such as images and videos.
[0064] A "discomfort index" is a numerical representation of how culturally or socially unpleasant the content of visual information is.
[0065] "Bias assessment" is the process of judging and measuring the degree of cultural or social bias contained in visual information.
[0066] "Ethical education" refers to activities that aim to raise awareness by providing information and knowledge about cultural and social ethics related to visual information.
[0067] This invention provides a system for evaluating the cultural and social appropriateness of visual information generated using artificial intelligence and for promoting ethically appropriate content use. This system consists of multiple components, including a user's terminal, a server, and an artificial intelligence agent.
[0068] Users upload visual information generated on their own devices to a server. The device uses either a standard internet browser or a dedicated application. Upon receiving the visual information from the user, the server sends the data to an Asia-specific feedback system. This system uses an image analysis algorithm to evaluate the presence or absence of bias and calculate a discomfort index. The algorithm extracts features related to the color, composition, and content of the visual information and quantifies the degree of bias by comparing them with existing cultural databases.
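The comparison of extracted features against a cultural database could be illustrated as below. This is a sketch under assumptions, not the disclosed algorithm: the feature tags, the regions, the sensitivity values, and the strength-weighted mean used as the scoring formula are all hypothetical placeholders.

```python
# Hypothetical cultural database: region -> feature tag -> sensitivity (0.0-1.0).
CULTURAL_DB = {
    "region_x": {"color_scheme_a": 0.9, "composition_b": 0.2},
    "region_y": {"color_scheme_a": 0.1},
}


def discomfort_index(features: dict[str, float], region: str) -> float:
    """Strength-weighted mean sensitivity of the extracted features.

    `features` maps a feature tag to its strength in the image (0.0-1.0).
    Tags missing from the regional database are treated as neutral (0.0).
    """
    sensitivities = CULTURAL_DB.get(region, {})
    total_strength = sum(features.values())
    if total_strength == 0:
        return 0.0
    weighted = sum(s * sensitivities.get(tag, 0.0) for tag, s in features.items())
    return weighted / total_strength
```

The same features can therefore yield different indices for different regions, which is the point of keeping the database region-specific.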
[0069] The server generates criteria for granting permission to use visual information based on the calculated discomfort index and sends them to the terminal. This allows users to verify whether the visual information is within ethically acceptable limits. The server also includes an ethics education function, providing users with appropriate educational content through an artificial intelligence agent. This content is a compilation of the knowledge of ethics experts from various countries and helps users raise their ethical awareness when using visual information.
[0070] As a concrete example, when a Japanese advertising production company creates advertisements for overseas markets, this system enables it to create appropriate content that takes the local cultural background into account. An example of a prompt message is, "Analyze what cultural biases are contained in the AI-generated image and calculate the discomfort index." Entering this prompt into the server initiates the specified analysis process.
[0071] This system allows stakeholders to quickly and effectively determine whether the generated visual information is appropriate, thereby increasing the credibility and safety of the content for the public.
[0072] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0073] Step 1:
[0074] Users upload visual information generated using their own devices to the server. The device reads the visual information file selected by the user and sends it to the server's designated upload portal via the internet. The input is a visual information file, and the output is data transferred to the server.
[0075] Step 2:
[0076] The server receives visual information sent by the user. After receiving the information, the server checks the image format and size and performs a security scan. The input here is visual information data from the user, and the output is clean data ready to be passed to the feedback system.
[0077] Step 3:
[0078] The server sends the clean visual information data to an Asia-specific feedback system. There, an image analysis algorithm evaluates the bias of the visual information. The input is the received image data, and the output is a discomfort index indicating the degree of bias. In this process, the system analyzes the cultural and social elements contained in the visual information and calculates a numerical value by comparing them with relevant databases.
[0079] Step 4:
[0080] The server generates criteria for granting permission to use the visual information based on the discomfort index. The generated criteria are structured as specific indicators of usability. The input is the discomfort index, and the output is a criteria document available to the user.
[0081] Step 5:
[0082] The server sends the usage permission criteria to the user's device. The user reviews these criteria on their device and obtains information to determine whether the visual information will be used appropriately. The input is the usage permission criteria, and the output is the evaluation result displayed on the user's device.
[0083] Step 6:
[0084] The server delivers ethical education content through an artificial intelligence agent and displays it on the user's terminal. This educational content instructs users on how to use visual information in an ethically responsible manner. The input is educational data based on expert knowledge, and the output is educational content provided to the user.
[0085] This series of steps allows users to verify the ethical applicability of the generated visual information and use it in a socially desirable manner.
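The server-side format and size check in Step 2 could be sketched as follows. The accepted formats and the size limit `MAX_BYTES` are assumptions, since the disclosure names neither; the PNG and JPEG signatures are the standard leading bytes of those formats. The security scan mentioned in Step 2 is outside the scope of this sketch.

```python
from typing import Optional

MAX_BYTES = 10 * 1024 * 1024  # hypothetical upload limit (10 MiB)

# Standard leading bytes (file signatures) of the formats assumed to be accepted.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def detect_format(data: bytes) -> Optional[str]:
    """Return the format name whose signature the data starts with, if any."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def precheck(data: bytes) -> str:
    """Return the detected format, or raise ValueError if the upload is rejected."""
    if len(data) == 0 or len(data) > MAX_BYTES:
        raise ValueError("file size out of range")
    fmt = detect_format(data)
    if fmt is None:
        raise ValueError("unsupported image format")
    return fmt
```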
[0086] (Application Example 1)
[0087] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0088] Images generated by artificial intelligence can contain cultural and social biases, potentially resulting in inappropriate or misleading content. Such situations can lead to a decline in social credibility and ethical problems in the market. Therefore, there is a need for systems that can evaluate the ethical appropriateness of generated images and help mitigate bias.
[0089] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0090] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; means for providing an index for determining whether the generated image is usable based on the measurement results; means for assigning authentication information to entities that have established appropriate bias countermeasures; means for providing guidance information to support the reduction of cultural or social bias; and means equipped with a computing device for verifying the ethical appropriateness of the image in real time. This makes it possible to evaluate the ethical appropriateness of the generated content and develop appropriate market strategies.
[0091] A "consultative body" is an organizational structure in which entities from diverse industries participate to share knowledge and technology.
[0092] "Artificial intelligence" refers to a computer program or system that imitates human perception and reasoning.
[0093] The "image discomfort index" is a numerical indicator that quantifies the cultural and social biases and inappropriateness inherent in a generated image.
[0094] "Authentication information" refers to information that serves as proof of reliability, granted to entities that possess appropriate bias prevention systems.
[0095] "Cultural bias" refers to a state that includes elements that may be inappropriate or misleading in a particular culture or society.
[0096] "Social prejudice" refers to a state that reflects biases and preconceptions based on social background and values.
[0097] "Ethical compliance" refers to the state in which the generated content meets generally accepted ethical standards.
[0098] A "calculating device" is a device used to perform specific calculations quickly and accurately.
[0099] "Guidance information" refers to information that provides guidance or directions for a specific purpose.
[0100] This invention relates to a system for evaluating the ethical appropriateness of images generated by artificial intelligence and for implementing appropriate bias countermeasures. The following procedures and configurations are necessary to realize this system.
[0101] First, users upload AI-generated images created on their own devices to the server. Upon receiving these images, the server first evaluates the potential for cultural and social bias through image analysis. This is done using software libraries such as OpenCV and TensorFlow (registered trademark). As a result of the analysis, a discomfort index is calculated, quantifying bias and inappropriate elements. Based on this discomfort index, the server creates an index for determining whether the image is usable and provides it to the user. This allows users to confirm whether the image is ethically appropriate.
[0102] The server also provides guidance information to help account for culture-specific biases. This information is generated from a knowledge database compiled by the consultative body. This allows users to verify whether the generated images are appropriate for a particular socio-cultural context.
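One way to picture the guidance lookup is a table keyed by target market and flagged feature. The sketch below assumes such a structure; the keys, the guidance texts, and the function name `guidance_for` are hypothetical and do not come from the disclosure.

```python
# Hypothetical knowledge database: (target market, feature tag) -> guidance text.
KNOWLEDGE_DB = {
    ("market_x", "color_scheme_a"): "Consider an alternative palette for market_x.",
    ("market_x", "composition_b"): "Review the arrangement of subjects for market_x.",
}


def guidance_for(market: str, flagged_tags: list[str]) -> list[str]:
    """Return guidance entries for the flagged feature tags, in input order."""
    return [
        KNOWLEDGE_DB[(market, tag)]
        for tag in flagged_tags
        if (market, tag) in KNOWLEDGE_DB
    ]
```

Tags with no entry for the given market are skipped rather than raising an error, so an incomplete database still yields partial guidance.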
[0103] Furthermore, the server uses artificial intelligence agents to provide ethics-related educational content. This content is based on insights gathered from experts in various countries and helps users continuously enhance their ethical awareness. This allows users to increase their social credibility in the market.
[0104] For example, when a Japanese advertising production company promotes a new product to the Asian market, it can use this system to evaluate the discomfort index and eliminate cultural biases, thereby developing an effective advertising strategy.
[0105] Examples of prompt messages are as follows:
[0106] "The target market for this advertisement is India. The goal is to respect India's cultural context as a commercial advertisement. Please analyze this AI-generated image and provide a discomfort index. Please also provide guidelines for any necessary revisions."
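A prompt of this shape can be produced from a template with the target market as the only variable. The sketch below assumes that parameterization; the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical.

```python
# Template modeled on the example prompt; only the target market varies.
PROMPT_TEMPLATE = (
    "The target market for this advertisement is {market}. "
    "The goal is to respect {market}'s cultural context as a commercial advertisement. "
    "Please analyze this AI-generated image and provide a discomfort index. "
    "Please also provide guidelines for any necessary revisions."
)


def build_prompt(market: str) -> str:
    """Fill the template with a non-empty target market name."""
    market = market.strip()
    if not market:
        raise ValueError("market must not be empty")
    return PROMPT_TEMPLATE.format(market=market)
```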
[0107] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0108] Step 1:
[0109] The user uploads an AI-generated image from their device to the server. The AI image is provided as input, and the server receives the image file. The server holds this image and prepares it for subsequent processing.
[0110] Step 2:
[0111] The server analyzes the received AI images. At this stage, it processes the images using OpenCV or TensorFlow to extract features for detecting potential cultural and social biases. The input is the image data received in step 1, and the output is the feature data resulting from the analysis. The server then prepares to calculate the discomfort index based on this data.
[0112] Step 3:
[0113] The server calculates a discomfort index based on the analysis results from step 2. Using a numerical algorithm, it quantifies the extent to which specific cultural or social elements are present and generates the discomfort index. The input is the feature data, and the output is numerical data representing the discomfort index. The server uses this discomfort index to determine whether the image is usable.
[0114] Step 4:
[0115] The server uses the discomfort index to create an index indicating whether an image is usable. The input is a numerical discomfort index, and the output is visualized data as an index. The server notifies the user of this index to help them decide whether the image is ethically appropriate.
[0116] Step 5:
[0117] The server provides guidance information to mitigate cultural and social biases. This information is drawn from a knowledge database compiled by the consultative body and presented to the user. The input is the discomfort index and related databases, and the output is text data serving as guidance information. The user uses this information to modify the generated images.
[0118] Step 6:
[0119] The server delivers ethical education content through an artificial intelligence agent. This aims to continuously improve ethical awareness and supports education by providing users with relevant content. Input consists of data collected from experts in various countries, and output is presentation materials as educational content.
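The feature extraction in Step 2 is attributed to OpenCV or TensorFlow. As a dependency-free stand-in, and not the disclosed method, the sketch below computes a coarse RGB histogram from a list of pixels, which is one simple kind of color feature. The function name, the pixel representation, and the default of 4 bins per channel are assumptions, and the binning assumes that `bins_per_channel` divides 256.

```python
from collections import Counter

Pixel = tuple[int, int, int]  # (R, G, B), each component 0-255


def color_histogram(
    pixels: list[Pixel], bins_per_channel: int = 4
) -> dict[tuple[int, int, int], float]:
    """Return the share of pixels falling into each coarse RGB bin."""
    if not pixels:
        return {}
    step = 256 // bins_per_channel  # width of one bin per channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bin_id: n / total for bin_id, n in counts.items()}
```

The resulting shares sum to 1.0 and could serve as the "feature data" passed on to the index calculation in Step 3.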
[0120] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0121] This invention is a system for mitigating bias and promoting ethical use of images generated by artificial intelligence, and further incorporates an emotion engine that recognizes and utilizes user emotions. Through a consultative body in which entities from various industries participate, this system measures the discomfort index of generated images, provides an index for determining permission to use based on the results, and grants authentication information to entities that have implemented appropriate bias countermeasures.
[0122] First, the user uploads an AI-generated image to the server using their device. The device incorporates an emotion engine that collects emotional data in real time from the user's facial expressions and voice. This emotional data is used to obtain subjective feedback on how the user perceives the image.
[0123] Upon receiving an uploaded image, the server first sends the image to a feedback system and calculates the discomfort index using an Asia-specific algorithm. Simultaneously, it integrates emotional data obtained from the emotion engine and, taking into account the user's emotional response, performs a more refined discomfort index evaluation. The influence of emotional data on the discomfort index is adjusted based on pre-set weighting parameters.
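As one non-limiting illustration, the influence of emotional data on the discomfort index could be controlled by a single pre-set weighting parameter. The linear blend below, the assumption that the emotion engine yields a discomfort score in [0, 1], and the default weight are all hypothetical; the disclosure does not state how the weighting is applied.

```python
# Hypothetical sketch: a linear blend is one possible weighting scheme.
def integrated_discomfort_index(
    image_index: float, emotion_discomfort: float, emotion_weight: float = 0.3
) -> float:
    """Blend the image-based index with the user's emotional response.

    emotion_weight = 0 ignores the emotional data entirely;
    emotion_weight = 1 uses only the emotional data.
    """
    for value in (image_index, emotion_discomfort, emotion_weight):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all arguments must be in [0, 1]")
    return (1.0 - emotion_weight) * image_index + emotion_weight * emotion_discomfort
```

Because the result is a convex combination of two values in [0, 1], it stays in [0, 1] for any permitted weight.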
[0124] Based on the calculated discomfort index and the emotion analysis results, the server creates reasonable criteria for determining whether to permit or prohibit the use of the image, and sends these to the terminal as a detailed evaluation report. The evaluation report includes feedback derived from the user's emotional state, along with the applicability of the image.
[0125] This system also provides customized ethical education content based on user emotional data collected by the server using an emotion engine, for educational purposes. This content is delivered via an AI agent, providing users with learning opportunities to improve their ethical awareness.
[0126] As a concrete example, a Japanese advertising company could use this invention to pre-evaluate generated images when launching a campaign targeting the entire Asian region. Users can express their individual emotions towards the images through the emotion engine, and by incorporating these into the final evaluation, it becomes possible to select more culturally appropriate representations. This allows companies to make decisions that take diversity and acceptability into greater consideration in their market approach strategies.
[0127] The following describes the processing flow.
[0128] Step 1:
[0129] The user uploads image data generated by artificial intelligence from their device to the server. At this point, the device also activates an emotion engine, collecting emotional data in real time from the user's facial expressions and voice.
[0130] Step 2:
[0131] The server receives the uploaded image and sends it to a feedback system to calculate a discomfort index. This system analyzes the image content using a region-specific algorithm.
[0132] Step 3:
[0133] The server simultaneously receives user emotion data transmitted from the emotion engine and incorporates this data into the calculation of the discomfort index. This process quantifies the user's subjective evaluation, allowing for a more accurate assessment of the image's cultural appropriateness.
[0134] Step 4:
[0135] The server creates an index that determines whether an image is usable based on an integrated discomfort index. This index reflects the potential impact of the image and the user's emotional response.
[0136] Step 5:
[0137] The server delivers the generated metrics and analysis results to the terminal as an evaluation report. The terminal receives this report, and the user makes corrections or re-evaluates the images based on the report.
[0138] Step 6:
[0139] The server utilizes an emotion engine to generate ethical education content tailored to the user's emotional data and delivers it to the device via an AI agent. This content aims to promote cultural understanding and improve ethical awareness.
[0140] Step 7:
[0141] Users utilize the provided educational content to re-evaluate generated images and engage in creative production from an ethical perspective. This enables companies to choose culturally and cognitively appropriate approaches.
[0142] (Example 2)
[0143] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0144] As artificial intelligence technology for generating visual data advances, ethical issues inherent in the generated data, such as cultural bias and offensive elements, are increasing. Therefore, it is necessary to ethically audit the generated visual data, implement appropriate bias countermeasures, and promote the use of data that considers diversity and cultural sensitivity. Furthermore, there is a challenge in providing a healthier information environment through feedback that considers the individual feelings of users and through ethical education.
[0145] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0146] In this invention, the server includes means for calculating a discomfort index based on visual data through a consultative body in which entities from various industries participate, means for collecting user emotion data and reflecting that data in the evaluation of the discomfort index, and an information processing agent for providing educational information for ethical use. This enables the mitigation of ethical issues in the generation and use of information, adaptation to multicultural societies, and advanced ethical education.
[0147] "Visual data" refers to digital information in a visible form, including images generated by computers.
[0148] The "discomfort index" is an indicator that evaluates the degree of psychological discomfort that visual data causes to people.
[0149] "Emotional data" refers to information that indicates a user's psychological state, obtained from their facial expressions and voice.
[0150] An "information processing device" refers to a computer system that uses artificial intelligence and algorithms to analyze data and output specific results.
[0151] "Educational information" refers to informational materials that include knowledge and guidelines aimed at improving users' ethical awareness.
[0152] An "information processing agent" refers to a program that runs within an information processing device and automatically performs a specific task.
[0153] An "ethical audit" is a process of checking the ethical appropriateness of the generation and use of visual data and identifying any problems.
[0154] This invention is a system that utilizes artificial intelligence technology to support the ethical use of visual data. This system enables the appropriate use of generated visual data through collaborative data processing involving a server, terminal, and user.
[0155] First, the user uploads AI-generated visual data to the server using their own device. This device has an emotion engine built in, which collects the user's facial expressions and voice via the camera and microphone, and can analyze them in real time as emotion data.
[0156] The server receives visual data sent by the user and uses the collected emotional data to calculate the discomfort index. The algorithm used here employs an Asia-specific method that takes regional culture into account in its evaluation. The emotional data is adjusted to influence the discomfort index using weighting parameters.
[0157] Furthermore, the server provides users with ethical education content tailored to their needs, based on the collected emotional data. This content is delivered via an information processing agent, providing users with opportunities to improve their ethical awareness.
[0158] As a concrete example, consider a scenario where a company uses this system to evaluate the applicability of visual data generated when launching a campaign across a wide market. Users express their individual emotions through the emotion engine, and these are reflected in the final evaluation, enabling more culturally conscious content selection.
[0159] An example of a prompt to input into a generative AI model might be, "Please suggest ideas for generating images for an advertising campaign while taking into consideration the diverse cultures of Asia."
[0160] In this way, the system provides ethically and culturally appropriate methods throughout the entire process from the generation to the use of visual data, promoting fair and appropriate use by diverse stakeholders.
[0161] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0162] Step 1:
[0163] The user uploads AI-generated visual data to the server using their device. The input consists of image files and video data created by the AI model. The device is equipped with an emotion engine that uses the camera and microphone to record the user's facial expressions and voice in real time. The output is user emotion data, which is then used for subsequent processing.
[0164] Step 2:
[0165] The server processes the visual and emotional data received from the terminal. The inputs are the visual data obtained in step 1 and the user's emotional data. The server executes an algorithm to calculate a discomfort index based on the visual data. This algorithm uses criteria specific to Asian culture for evaluation. Emotional data influences the final discomfort index evaluation using weighting parameters. The output is the evaluation result, including the discomfort index.
[0166] Step 3:
[0167] The server creates an index to determine whether or not visual data can be used, based on the discomfort index evaluation results. The input for this step is the discomfort index evaluation results obtained from step 2. The server uses an algorithm to analyze the evaluation results and generates an index to determine the appropriateness of using the images. The output is an evaluation report that details the decision on whether or not to allow use and the reasoning behind it.
[0168] Step 4:
[0169] The server generates an evaluation report and sends it to the user's terminal. The input for this step is the usage decision indicators and evaluation report generated in step 3. The server organizes this information and outputs it as a report in a format that is easy for the user to understand. The output is an evaluation report that includes whether or not the images can be used.
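As one non-limiting illustration of step 4, the evaluation report could be organized as a small structured record and serialized for transmission to the terminal. The field names and the use of JSON are assumptions; the disclosure requires only a format that is easy for the user to understand.

```python
# Hypothetical sketch: field names and JSON serialization are assumptions.
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class EvaluationReport:
    """Report sent from the server to the user's terminal."""

    discomfort_index: float
    usable: bool
    reasons: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII reason text readable.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)
```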
[0170] Step 5:
[0171] The server provides ethics education content based on emotional data through an information processing agent. The input for this step is the emotional data collected in step 1. Based on this, the server generates customized educational content tailored to the user's learning needs. The output is the ethics education content for the user.
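As one non-limiting illustration of step 5, educational content could be selected according to the emotion observed most often while the user viewed the image. The emotion labels and content titles below are placeholders, and selecting by the most frequent label is an assumption rather than the disclosed method.

```python
# Hypothetical sketch: labels and content titles are placeholders.
from collections import Counter
from typing import Sequence

CONTENT_BY_EMOTION = {
    "discomfort": "Module: why this depiction may be received negatively",
    "confusion": "Module: cultural background of the depicted elements",
}
DEFAULT_CONTENT = "Module: general introduction to ethical image use"


def select_education_content(emotion_samples: Sequence[str]) -> str:
    """Pick content for the most frequent emotion label in the samples."""
    if not emotion_samples:
        return DEFAULT_CONTENT
    dominant, _count = Counter(emotion_samples).most_common(1)[0]
    return CONTENT_BY_EMOTION.get(dominant, DEFAULT_CONTENT)
```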
[0172] (Application Example 2)
[0173] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0174] Conventional image generation systems often fail to adequately verify the ethical and cultural appropriateness of the generated images, potentially leading to the use of inappropriate images that disregard user feelings. This poses a risk of cultural misunderstandings and negative reactions in various industries, including advertising. Furthermore, a lack of ethical education for users is also a challenge.
[0175] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0176] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; emotion analysis means for analyzing user emotional data in real time and reflecting it in the evaluation of the image; means for providing an index for determining whether the generated image is usable based on the measurement and analysis results; and means for assigning authentication information to entities that have established an appropriate bias prevention system. This makes it possible to appropriately evaluate the ethical and cultural appropriateness of the generated image and prevent the use of inappropriate images. Furthermore, it is possible to make suggestions for improving the advertising content based on user feedback and enhance ethical education for users.
[0177] A "consultative body" is an organization or group established to bring together entities from different industries and fields of expertise to cooperate in solving common problems.
[0178] "Artificial intelligence" refers to technologies and systems designed to mimic human intellectual behavior, possessing the ability to automatically perform data analysis and decision-making.
[0179] The "discomfort index" is a quantitative metric that evaluates how much discomfort generated images and content cause to users.
[0180] "Emotion analysis means" refers to technologies and devices that analyze a user's facial expressions and voice to infer their emotional state.
[0181] "Indicators for determining usability" are standards and rules used to determine whether the generated image is ethically and culturally appropriate.
[0182] A "bias prevention system" refers to measures and procedures established to mitigate biases in generated images and data and to ensure fair processing.
[0183] "Authentication information" refers to digital or physical proof given to an entity to demonstrate that appropriate bias countermeasures have been implemented.
[0184] "Emotional data" refers to data that reflects the user's emotional state, and includes information obtained from facial expressions, voice, and other sources.
[0185] This invention provides a system for evaluating the ethical and cultural appropriateness of generated images. The system operates through the user's terminal and a server, performing real-time emotion analysis and data evaluation.
[0186] The server receives images uploaded from the user's terminal and processes emotional data obtained from the user's facial expressions and voice using emotion analysis tools. Specifically, it captures facial images using video processing libraries such as OpenCV and analyzes the emotional data using an emotion recognition engine implemented in Python.
[0187] The analyzed emotional data is integrated into an algorithm that calculates a discomfort index, which is used as an indicator to judge the cultural and ethical appropriateness of an image. Using an AI model, the usability of an image is determined based on the discomfort index, and the results are fed back to the user's device as a detailed evaluation report.
[0188] In this way, users can create and improve images for advertisements and products in a culturally appropriate manner. A specific application example is that advertising agencies can use this system to optimize fashion advertisements to better suit their target market.
[0189] An example of a prompt message for a generative AI model would be: "Analyze the emotions of users who view the following ad image and calculate the discomfort index. The image file is attached. Please also suggest ways to improve the ad, taking user feedback into consideration."
[0190] This system is expected to enable the provision of ethically and culturally high-quality content, thereby contributing to increased user satisfaction.
[0191] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0192] Step 1:
[0193] The user uploads generated images from their device to the server. The device, such as a smartphone or computer, provides a function for selecting and sending image files, and it sends basic image information (resolution, file format, etc.) together with each image.
[0194] Step 2:
[0195] The server receives uploaded images and simultaneously acquires emotional data from the user's device. The device uses its camera and microphone to capture the user's facial expressions and voice, extracting emotional data in real time. This data is obtained by analyzing emotions from facial expressions using facial recognition technology.
[0196] Step 3:
[0197] The server inputs the received image into an AI model and calculates the discomfort index of the generated image. Image analysis evaluates elements considered culturally inappropriate based on pixel data. The discomfort index is output as a numerical value as a result of the analysis.
[0198] Step 4:
[0199] The server integrates user emotion data obtained using emotion analysis tools with a discomfort index to generate an index for determining whether an image is usable. Here, an integration algorithm is used to analyze how emotion data affects the discomfort index and output the final index.
[0200] Step 5:
[0201] The server provides the user's device with indicators for determining whether an image can be used and a detailed evaluation report. This report provides the user with information to evaluate the cultural and ethical appropriateness of the image and make improvements as needed.
[0202] Step 6:
[0203] Users are given suggestions for improving images based on the feedback report. The report includes specific approaches for improvement and the next steps the user should take. This allows users to enhance the cultural relevance of the images they create.
[0204] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0205] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0206] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0207] [Second Embodiment]
[0208] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0209] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0210] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0211] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0212] The microphone 238 accepts instructions from the user 20 by receiving the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0213] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0214] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0215] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0216] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0217] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0218] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0219] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0220] This invention is a system for mitigating cultural and social biases associated with artificial intelligence-generated images and promoting fair and ethical use. This system is realized through the participation of diverse industries within the consultative body and the integration of knowledge and technology through a shared platform.
[0221] Specifically, a mechanism that evaluates the potential bias in the generated AI images and calculates a discomfort index plays a crucial role. First, the user uploads the generated AI image from their device to the server. Next, the server sends the image to an Asia-specific feedback system to calculate the discomfort index. Here, an image analysis algorithm is activated, generating numerical data on bias while considering regional culture and social context.
[0222] After the calculation is complete, the server creates an index indicating whether the generated image is usable based on the discomfort index. By sending this index to the terminal, companies and organizations can obtain information to determine whether the generated content is ethically appropriate. For example, when a Japanese manufacturer creates advertisements for overseas markets, they can use this system to preemptively eliminate inappropriate expressions in order to take into account the local cultural context.
[0223] Furthermore, this invention provides users with relevant ethical education content through the intervention of an artificial intelligence agent. The server collects knowledge gathered from experts in various countries, and the AI agent distributes this information to terminals as educational content, thereby promoting a sustained improvement in ethical awareness. As a result, companies can develop market strategies while taking measures to counter bias, thereby increasing social acceptance and credibility.
[0224] Finally, the consultative body has established a process for granting authentication information to companies that have implemented appropriate bias prevention measures. The server verifies that the standards have been met and issues a certification mark as a form of recognition. This certification serves as an important benchmark for consumers to confidently choose products and services.
[0225] The following describes the processing flow.
[0226] Step 1:
[0227] The user uploads the generated AI image from their device to the server. Initial checks to ensure the image file format and size are appropriate are also performed on the device.
[0228] Step 2:
[0229] The server transfers the received image data to the analysis system. Here, an Asia-specific feedback algorithm analyzes the content of the image and calculates a discomfort index.
[0230] Step 3:
[0231] The server creates an index that determines whether to permit or prohibit the use of the generated image based on the calculated discomfort index. This index is quantified as an evaluation that takes into account the cultural or social elements associated with the image.
[0232] Step 4:
[0233] The server sends the generated index to the terminal as an evaluation report. The report includes the reasons for the decision on whether to permit use, as well as the specific numerical value of the discomfort index.
[0234] Step 5:
[0235] Users review the evaluation report received on their device. If necessary, they can edit the image and re-upload it to the server. They may also be prompted to receive further educational content.
[0236] Step 6:
[0237] The server automatically generates ethical education content via an artificial intelligence agent and delivers it to the terminal. This educational content is designed to deepen understanding of relevant cultural backgrounds and biases.
[0238] Step 7:
[0239] Based on the consultative body's standards, the server evaluates the discomfort index and the appropriateness of the countermeasures in place, and issues authentication information to user companies that meet the criteria. This authentication information can be accessed via terminals and used for marketing strategies.
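As one non-limiting illustration of step 7, authentication information could be issued as a signed token when an entity's recent discomfort indices satisfy a criterion. The criterion (an average at or below a threshold), the token layout, and the use of an HMAC-SHA256 signature are all assumptions; the disclosure does not specify how authentication information is produced or verified.

```python
# Hypothetical sketch: criterion, token layout, and signing scheme are assumptions.
import hashlib
import hmac
import json
from typing import Optional, Sequence


def issue_authentication_info(
    entity_id: str,
    recent_indices: Sequence[float],
    secret_key: bytes,
    max_average: float = 0.3,
) -> Optional[str]:
    """Return a signed token if the average index meets the criterion, else None."""
    if not recent_indices:
        return None
    if sum(recent_indices) / len(recent_indices) > max_average:
        return None
    payload = json.dumps({"entity": entity_id, "status": "certified"}, sort_keys=True)
    signature = hmac.new(secret_key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return payload + "." + signature


def verify_authentication_info(token: str, secret_key: bytes) -> bool:
    """Check that the token's signature matches its payload."""
    payload, _sep, signature = token.rpartition(".")
    expected = hmac.new(secret_key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```

Signing lets a terminal or third party confirm that the authentication information was issued by the holder of the key and has not been altered.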
[0240] (Example 1)
[0241] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0242] Visual information generated using artificial intelligence may contain diverse cultural and social biases, requiring careful consideration before use. However, current systems do not adequately automatically evaluate such biases and remove inappropriate elements. Furthermore, there is a lack of ethical consideration regarding the generated visual information, and means of providing appropriate education are needed. Therefore, the challenge lies in establishing a system that ensures stakeholders take appropriate bias prevention measures and verify that all visual information they encounter is appropriate.
[0243] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0244] This invention includes a server that transmits visual information to a feedback system that evaluates bias in visual information and calculates a discomfort index based on that evaluation; a server that provides criteria for determining whether or not to allow the use of the generated visual information based on the measurement results; and a server that grants credentials to relevant parties who have established appropriate bias prevention systems. This enables relevant parties to evaluate the cultural and social appropriateness of the generated visual information and use it as ethically safe content.
[0245] "Stakeholders from diverse fields" refers to individuals and organizations from different industries and areas of expertise who cooperate to achieve a common goal.
[0246] A "collaborative entity" is an organizational framework that brings together knowledge and technology to work towards a specific goal.
[0247] "Artificial intelligence" refers to systems and technologies that process large amounts of data and mimic human intelligence.
[0248] "Visual information" refers to digital data that can be seen with the eyes, such as images and videos.
[0249] The "discomfort index" is a numerical representation of how culturally or socially unpleasant the content of visual information is.
[0250] "Bias assessment" is the process of judging and measuring the degree of cultural or social bias contained in visual information.
[0251] "Ethical education" refers to activities that aim to raise awareness by providing information and knowledge about cultural and social ethics related to visual information.
[0252] This invention provides a system for evaluating the cultural and social appropriateness of visual information generated using artificial intelligence and for promoting ethically appropriate content use. This system consists of multiple components, including a user's terminal, a server, and an artificial intelligence agent.
[0253] Users upload visual information generated using their own devices to a server. The device uses either a standard internet browser or a dedicated application. Upon receiving the visual information from the user, the server sends the data to an Asia-specific feedback system. This system uses an image analysis algorithm to evaluate the presence or absence of bias and calculate a discomfort index. This algorithm extracts features related to the color, composition, and content of the visual information and quantifies the degree of bias by comparing them with existing cultural databases.
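As one non-limiting illustration, the comparison with a cultural database could be performed by measuring the similarity between the extracted feature vector and reference vectors, each of which carries a sensitivity value. The three-element feature vectors, the database entries, and the use of cosine similarity are assumptions made for illustration; the disclosure does not specify the comparison method.

```python
# Hypothetical sketch: vectors, entries, and similarity measure are assumptions.
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Each entry: (reference feature vector, assumed sensitivity in [0, 1]).
CULTURAL_DATABASE = [
    ((1.0, 0.0, 0.0), 0.9),
    ((0.0, 1.0, 0.0), 0.4),
]


def bias_score(features: Sequence[float]) -> float:
    """Return the strongest sensitivity-weighted match against the database."""
    return max(
        (
            max(cosine_similarity(features, reference), 0.0) * sensitivity
            for reference, sensitivity in CULTURAL_DATABASE
        ),
        default=0.0,
    )
```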
[0254] The server generates criteria for granting permission to use visual information based on the calculated discomfort index and sends them to the terminal. This allows users to verify whether the visual information is within ethically acceptable limits. The server also includes an ethics education function, providing users with appropriate educational content through an artificial intelligence agent. This content is a compilation of the knowledge of ethics experts from various countries and helps users raise their ethical awareness when using visual information.
[0255] As a concrete example, when a Japanese advertising production company creates advertisements for overseas markets, using this system will enable them to create appropriate content that takes into account the local cultural background. An example of a prompt is, "Analyze what cultural biases are contained in the generated AI image and calculate the discomfort index." Entering this prompt into the server will initiate the specified analysis process.
[0256] This system allows stakeholders to quickly and effectively determine whether the generated visual information is appropriate, thereby increasing the credibility and safety of the content for society.
[0257] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0258] Step 1:
[0259] Users upload visual information generated using their own devices to the server. The device reads the visual information file selected by the user and sends it to the server's designated upload portal via the internet. The input is a visual information file, and the output is data transferred to the server.
[0260] Step 2:
[0261] The server receives visual information sent by the user. After receiving the information, the server checks the image format and size and performs a security scan. The input here is visual information data from the user, and the output is clean data ready to be passed to the feedback system.
[0262] Step 3:
[0263] The server sends the clean visual information data to an Asia-specific feedback system. Here, an image analysis algorithm operates to evaluate the bias of the visual information. The input is the received image data, and the output is a discomfort index indicating the degree of bias. In this process, the system analyzes the cultural and social elements contained in the visual information and calculates a numerical value by comparing them with relevant databases.
[0264] Step 4:
[0265] The server generates criteria for granting permission to use the visual information based on the discomfort index. The generated criteria are structured as indicators of specific availability. The input is the discomfort index, and the output is a criteria document available to the user.
[0266] Step 5:
[0267] The server sends the usage permission criteria to the user's device. The user reviews these criteria on their device and obtains information to determine whether the visual information will be used appropriately. The input is the usage permission criteria, and the output is the evaluation result displayed on the user's device.
[0268] Step 6:
[0269] The server delivers ethical education content through an artificial intelligence agent and displays it on the user's terminal. This educational content instructs users on how to use visual information in an ethically responsible manner. The input is educational data based on expert knowledge, and the output is educational content provided to the user.
[0270] This series of steps allows users to verify the ethical applicability of the generated visual information and use it in a socially desirable manner.
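As an illustration of the format and size check in Step 2 above, the following is a minimal Python sketch. It is not the disclosed implementation: the size limit `MAX_UPLOAD_BYTES`, the set of accepted formats, and the names `UploadCheck` and `check_upload` are assumptions introduced here, and the security scan mentioned in Step 2 is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical limit; the source does not specify a maximum upload size.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024

# Well-known file signatures (magic bytes) for three common image formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


@dataclass
class UploadCheck:
    accepted: bool
    image_format: Optional[str]
    reason: str


def check_upload(data: bytes, max_bytes: int = MAX_UPLOAD_BYTES) -> UploadCheck:
    """Accept an upload only if it is non-empty, small enough, and a known format."""
    if not data:
        return UploadCheck(False, None, "empty upload")
    if len(data) > max_bytes:
        return UploadCheck(False, None, "file too large")
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return UploadCheck(True, name, "ok")
    return UploadCheck(False, None, "unsupported format")
```

Only data for which `check_upload` returns `accepted=True` would be passed on to the feedback system in Step 3.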
[0271] (Application Example 1)
[0272] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0273] Images generated by artificial intelligence can contain cultural and social biases, potentially resulting in inappropriate or misleading content. Such situations can lead to a decline in social credibility and ethical problems in the market. Therefore, there is a need for systems that can evaluate the ethical appropriateness of generated images and help mitigate bias.
[0274] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0275] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; means for providing an index for determining whether the generated image is usable based on the measurement results; means for assigning authentication information to entities that have established appropriate bias countermeasures; means for providing guidance information to support the reduction of cultural or social bias; and means equipped with a computing device for verifying the ethical appropriateness of the image in real time. This makes it possible to evaluate the ethical appropriateness of the generated content and develop appropriate market strategies.
[0276] A "consultative body" is an organizational structure in which entities from diverse industries participate to share knowledge and technology.
[0277] "Artificial intelligence" refers to a computer program or system that imitates human perception and reasoning.
[0278] The "image discomfort index" is an indicator that quantifies the cultural and social prejudices and inappropriateness of the generated image.
[0279] "Authentication information" is information that serves as proof of reliability given to a subject with an appropriate bias countermeasure system.
[0280] "Cultural prejudice" is a state that includes elements that may be inappropriate or cause misunderstandings in a specific culture or society.
[0281] "Social prejudice" is a state that reflects biases and preconceptions based on social backgrounds and values.
[0282] "Ethical compliance" is a state in which the generated content meets generally accepted ethical standards.
[0283] "Computing device" is a device for performing specific calculations quickly and accurately.
[0284] "Guidance information" is information that provides guidance and directions according to a specific purpose.
[0285] This invention relates to a system for evaluating the ethical compliance of images generated by artificial intelligence and taking appropriate bias countermeasures. To realize this system, the following procedures and configurations are required.
[0286] First, the user uploads the AI image generated using their own terminal to the server. After receiving this image, the server first evaluates the possibility of cultural and social prejudices through image analysis. For this, software libraries such as OpenCV and TensorFlow are used. As an analysis result, a discomfort index is calculated, and prejudices and inappropriate elements are quantified. Based on this discomfort index, the server creates an indicator for judging the usability of the image and provides it to the user. Thereby, the user can confirm whether the image is ethically compliant.
[0287] The server also provides guidance information to help account for culture-specific biases. This information is generated from a knowledge database compiled by the consultative body. This allows users to verify whether the generated images are appropriate for a particular socio-cultural context.
[0288] Furthermore, the server uses artificial intelligence agents to provide ethics-related educational content. This content is based on insights gathered from experts in various countries and helps users continuously enhance their ethical awareness. This allows users to increase their social credibility in the market.
[0289] For example, when a Japanese advertising production company promotes a new product to the Asian market, it can use this system to evaluate the discomfort index and eliminate cultural biases, thereby developing an effective advertising strategy.
[0290] Examples of prompt messages are as follows:
[0291] "The target market for this advertisement is India. The goal is for this commercial advertisement to respect India's cultural context. Please analyze this AI-generated image and provide a discomfort index. Please also provide guidelines for any necessary revisions."
[0292] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0293] Step 1:
[0294] The user uploads an AI-generated image from their device to the server. The AI image is provided as input, and the server receives the image file. The server holds this image and prepares it for subsequent processing.
[0295] Step 2:
[0296] The server analyzes the received AI images. At this stage, it processes the images using OpenCV or TensorFlow to extract features for detecting potential cultural and social biases. The input is the image data received in step 1, and the output is the feature data resulting from the analysis. The server then prepares to calculate the discomfort index based on this data.
[0297] Step 3:
[0298] The server calculates a discomfort index based on the analysis results from step 2. Using a numerical algorithm, it quantifies the extent to which specific cultural or social elements are present and generates the discomfort index. The input is the feature data, and the output is numerical data representing the discomfort index. The server uses this discomfort index to determine whether the image is usable.
[0299] Step 4:
[0300] The server uses the discomfort index to create an index indicating whether an image is usable. The input is a numerical discomfort index, and the output is visualized data as an index. The server notifies the user of this index to help them decide whether the image is ethically appropriate.
[0301] Step 5:
[0302] The server provides guidance information to mitigate cultural and social biases. This information is presented to the user from a knowledge database compiled by a consultative body. The input is the discomfort index and related databases, and the output is text data as guidance information. The user uses this information to modify the generated images.
[0303] Step 6:
[0304] The server distributes ethical education content through an artificial intelligence agent. This is aimed at continuously improving ethical awareness and supports education by providing relevant content to users. The input is data collected from experts in various countries, and the output is presentation materials as educational content.
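Steps 3 to 5 above (feature data to discomfort index, index to usability indicator, and lookup of guidance information) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed algorithm: the feature names, the weights, the 0 to 100 scale, the thresholds, and the contents of `GUIDANCE_DB` are all hypothetical, and the OpenCV or TensorFlow feature extraction of Step 2 is assumed to have already produced per-feature scores between 0 and 1.

```python
from typing import Dict, List

# Hypothetical feature names and weights; the source does not define them.
FEATURE_WEIGHTS: Dict[str, float] = {
    "sensitive_symbol": 0.5,
    "color_connotation": 0.2,
    "stereotyped_depiction": 0.3,
}

# Hypothetical thresholds on a 0-100 scale.
USABLE_BELOW = 30.0
REVIEW_BELOW = 60.0

# Hypothetical stand-in for the consultative body's knowledge database.
GUIDANCE_DB: Dict[str, str] = {
    "sensitive_symbol": "Review symbols that carry religious or political meaning in the target market.",
    "color_connotation": "Check color choices against local connotations.",
    "stereotyped_depiction": "Revise depictions that rely on stereotypes.",
}


def discomfort_index(features: Dict[str, float]) -> float:
    """Weighted sum of feature scores (each clamped to 0..1), scaled to 0..100."""
    total = 0.0
    for name, weight in FEATURE_WEIGHTS.items():
        score = min(max(features.get(name, 0.0), 0.0), 1.0)
        total += weight * score
    return round(100.0 * total, 1)


def usability(index: float) -> str:
    """Map a discomfort index to a coarse usability indicator."""
    if index < USABLE_BELOW:
        return "usable"
    if index < REVIEW_BELOW:
        return "needs_review"
    return "not_usable"


def guidance(features: Dict[str, float], min_score: float = 0.5) -> List[str]:
    """Return guidance text for every feature whose score reaches min_score."""
    return [
        GUIDANCE_DB[name]
        for name in FEATURE_WEIGHTS
        if features.get(name, 0.0) >= min_score
    ]
```

A linear weighted sum is used here only because it is easy to inspect; the source does not state which numerical algorithm is used.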
[0305] Furthermore, an emotion engine for estimating the emotions of users may be combined. That is, the specific processing unit 290 may estimate the emotions of users using the emotion identification model 59 and perform specific processing using the emotions of users.
[0306] The present invention is a system for reducing bias and promoting ethical use related to images generated by artificial intelligence, and further incorporates an emotion engine that recognizes and utilizes the emotions of users. This system measures the discomfort index of the generated images through a consultative body in which entities from various industries participate, provides an indicator for determining usage permission based on the results, and grants authentication information to entities that have established appropriate bias countermeasures.
[0307] First, the user uploads an image generated by AI to the server using their own terminal. At this time, an emotion engine is incorporated into the terminal, and emotion data can be collected in real time from the user's expression and voice. This emotion data is utilized to obtain subjective feedback on how the user feels about the image.
[0308] When the server receives the uploaded image, it first sends the image to a feedback system and calculates the discomfort index using an algorithm specialized for Asia. At the same time, the emotion data obtained from the emotion engine is also integrated, and a more refined evaluation of the discomfort index is performed while taking into account the user's emotional reaction. The influence of the emotion data on the discomfort index is adjusted based on preset weighting parameters.
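One possible reading of the weighted integration described above is a linear blend of the image-based discomfort index and an emotion-derived score. The sketch below assumes both values are on a 0 to 100 scale and that a single weighting parameter is used; the function name `integrate_discomfort` and the default weight are hypothetical and are not taken from the source.

```python
def integrate_discomfort(
    image_index: float,
    emotion_discomfort: float,
    emotion_weight: float = 0.3,  # hypothetical preset weighting parameter
) -> float:
    """Blend an image-based index with an emotion-derived score (both 0..100)."""
    if not 0.0 <= emotion_weight <= 1.0:
        raise ValueError("emotion_weight must be between 0 and 1")
    blended = (1.0 - emotion_weight) * image_index + emotion_weight * emotion_discomfort
    # Clamp to the assumed 0..100 range and round for reporting.
    return round(min(max(blended, 0.0), 100.0), 1)
```

With a weight of 0 the result equals the image-based index, and with a weight of 1 it equals the emotion-derived score, so the parameter directly controls how much the user's reaction influences the final value.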
[0309] Based on the calculated discomfort index and sentiment analysis results, the server creates reasonable criteria for determining whether to permit or prohibit the use of the image, and sends this as a detailed evaluation report to the terminal. The evaluation report includes feedback derived from the user's emotional state, along with the applicability of the image.
[0310] This system also provides ethical education content customized on the basis of the user emotion data that the server collects through the emotion engine. This content is delivered via an AI agent, providing users with learning opportunities to improve their ethical awareness.
[0311] As a concrete example, a Japanese advertising company could use this invention to pre-evaluate generated images when launching a campaign targeting the entire Asian region. Users can express their individual emotions towards the images through the emotion engine, and by incorporating these into the final evaluation, it becomes possible to select more culturally appropriate representations. This allows companies to make decisions that take diversity and acceptability into greater consideration in their market approach strategies.
[0312] The following describes the processing flow.
[0313] Step 1:
[0314] The user uploads image data generated by artificial intelligence from their device to the server. At this point, the device also activates an emotion engine, collecting emotional data in real time from the user's facial expressions and voice.
[0315] Step 2:
[0316] The server receives the uploaded image and sends it to a feedback system to calculate a discomfort index. This system analyzes the image content using a region-specific algorithm.
[0317] Step 3:
[0318] The server simultaneously receives user emotion data transmitted from the emotion engine and incorporates this data into the calculation of the discomfort index. This process quantifies the user's subjective evaluation, allowing for a more accurate assessment of the image's cultural appropriateness.
[0319] Step 4:
[0320] The server creates an index that determines whether an image is usable based on an integrated discomfort index. This index reflects the potential impact of the image and the user's emotional response.
[0321] Step 5:
[0322] The server delivers the generated metrics and analysis results to the terminal as an evaluation report. The terminal receives this report, and the user makes corrections or re-evaluates the images based on the report.
[0323] Step 6:
[0324] The server utilizes an emotion engine to generate ethical education content tailored to the user's emotional data and delivers it to the device via an AI agent. This content aims to promote cultural understanding and improve ethical awareness.
[0325] Step 7:
[0326] Users utilize the provided educational content to re-evaluate generated images and engage in creative production from an ethical perspective. This enables companies to choose culturally and cognitively appropriate approaches.
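The tailoring of educational content to emotion data in Step 6 above could, for example, be a lookup keyed on the user's dominant emotion. The following sketch assumes that the emotion engine outputs a score per emotion label; the labels and the catalogue entries in `EDUCATION_CATALOGUE` are hypothetical placeholders.

```python
from typing import Dict, List

# Hypothetical catalogue of educational content keyed by emotion label.
EDUCATION_CATALOGUE: Dict[str, List[str]] = {
    "uncomfortable": ["Why certain imagery can offend: regional case studies"],
    "confused": ["Introduction to cultural context in visual content"],
    "neutral": ["General guidelines for ethical use of generated images"],
}


def dominant_emotion(scores: Dict[str, float]) -> str:
    """Return the label with the highest score, or 'neutral' if there are none."""
    if not scores:
        return "neutral"
    return max(scores, key=scores.get)


def select_education_content(scores: Dict[str, float]) -> List[str]:
    """Pick content for the dominant emotion, falling back to the neutral set."""
    emotion = dominant_emotion(scores)
    return EDUCATION_CATALOGUE.get(emotion, EDUCATION_CATALOGUE["neutral"])
```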
[0327] (Example 2)
[0328] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0329] As artificial intelligence technology for generating visual data advances, ethical issues inherent in the generated data, such as cultural bias and offensive elements, are increasing. Therefore, it is necessary to ethically audit the generated visual data, implement appropriate bias countermeasures, and promote the use of data that considers diversity and cultural sensitivity. Furthermore, there is a challenge in providing a healthier information environment through feedback that considers the individual feelings of users and through ethical education.
[0330] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0331] In this invention, the server includes means for calculating a discomfort index based on visual data through a consultative body in which entities from various industries participate, means for collecting user emotion data and reflecting that data in the evaluation of the discomfort index, and an information processing agent for providing educational information for ethical use. This enables the mitigation of ethical issues in the generation and use of information, adaptation to multicultural societies, and advanced ethical education.
[0332] "Visual data" refers to digital information in a visible form, including images generated by computers.
[0333] The "discomfort index" is an indicator that evaluates the degree of psychological discomfort that visual data causes to people.
[0334] "Emotional data" refers to information that indicates a user's psychological state, obtained from their facial expressions and voice.
[0335] An "information processing device" refers to a computer system that uses artificial intelligence and algorithms to analyze data and output specific results.
[0336] "Educational information" refers to informational materials that include knowledge and guidelines aimed at improving users' ethical awareness.
[0337] An "information processing agent" refers to a program that runs within an information processing device and automatically performs a specific task.
[0338] An "ethical audit" is a process of checking the ethical appropriateness of the generation and use of visual data and identifying any problems.
[0339] This invention is a system that utilizes artificial intelligence technology to support the ethical use of visual data. This system enables the appropriate use of generated visual data through collaborative data processing involving a server, terminal, and user.
[0340] First, the user uploads AI-generated visual data to the server using their own device. This device has an emotion engine built in, which collects the user's facial expressions and voice via the camera and microphone, and can analyze them in real time as emotion data.
[0341] The server receives the visual data sent by the user and uses the collected emotional data to calculate the discomfort index. The algorithm used here employs an Asia-specific method that takes regional culture into account in its evaluation. The influence of the emotional data on the discomfort index is adjusted using weighting parameters.
[0342] Furthermore, the server provides users with ethical education content tailored to their needs, based on the collected emotional data. This content is delivered via an information processing agent, providing users with opportunities to improve their ethical awareness.
[0343] As a concrete example, consider a scenario where a company uses this system to evaluate the applicability of visual data generated when launching a campaign across a wide market. Users express their individual emotions through the emotion engine, and these are reflected in the final evaluation, enabling more culturally conscious content selection.
[0344] An example of a prompt to input into a generative AI model might be, "Please suggest ideas for generating images for an advertising campaign while taking into account the diverse cultures of Asia."
[0345] In this way, the system provides ethically and culturally appropriate methods throughout the entire process from the generation to the use of visual data, promoting fair and appropriate use by diverse stakeholders.
[0346] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0347] Step 1:
[0348] The user uploads AI-generated visual data to the server using their device. The input consists of image files and video data created by the AI model. The device is equipped with an emotion engine that uses the camera and microphone to record the user's facial expressions and voice in real time. The output is user emotion data, which is then used for subsequent processing.
[0349] Step 2:
[0350] The server processes the visual and emotional data received from the terminal. The inputs are the visual data obtained in step 1 and the user's emotional data. The server executes an algorithm to calculate a discomfort index based on the visual data. This algorithm uses criteria specific to Asian culture for evaluation. Emotional data influences the final discomfort index evaluation using weighting parameters. The output is the evaluation result, including the discomfort index.
[0351] Step 3:
[0352] The server creates an index to determine whether or not visual data can be used, based on the discomfort index evaluation results. The input for this step is the discomfort index evaluation results obtained from step 2. The server uses an algorithm to analyze the evaluation results and generates an index to determine the appropriateness of using the images. The output is an evaluation report that details the decision on whether or not to allow use and the reasoning behind it.
[0353] Step 4:
[0354] The server generates an evaluation report and sends it to the user's terminal. The input for this step is the usage decision indicators and evaluation report generated in step 3. The server organizes this information and outputs it as a report in a format that is easy for the user to understand. The output is an evaluation report that includes whether or not the images can be used.
[0355] Step 5:
[0356] The server provides ethics education content based on the emotional data through an information processing agent. The input for this step is the emotional data collected in step 1. Based on this, the server generates customized educational content tailored to the user's learning needs. The output is the ethics education content for the user.
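The evaluation report of Steps 3 and 4 above, which states the usage decision and the reasoning behind it, might be assembled as in the following sketch. The threshold `PERMIT_BELOW`, the field names, and the JSON rendering are assumptions made for illustration; the source specifies only that the report should be easy for the user to understand.

```python
import json
from typing import Dict

# Hypothetical threshold on a 0-100 discomfort scale.
PERMIT_BELOW = 50.0


def build_evaluation_report(
    image_id: str,
    discomfort_index: float,
    threshold: float = PERMIT_BELOW,
) -> Dict[str, object]:
    """Return the usage decision together with a short reason."""
    permitted = discomfort_index < threshold
    if permitted:
        reason = f"Discomfort index {discomfort_index} is below the threshold {threshold}."
    else:
        reason = f"Discomfort index {discomfort_index} is at or above the threshold {threshold}."
    return {
        "image_id": image_id,
        "discomfort_index": discomfort_index,
        "usage_permitted": permitted,
        "reason": reason,
    }


def render_report(report: Dict[str, object]) -> str:
    """Serialize the report as indented JSON for display on the terminal."""
    return json.dumps(report, ensure_ascii=False, indent=2)
```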
[0357] (Application Example 2)
[0358] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the smart glasses 214 as the "terminal".
[0359] Conventional image generation systems often fail to adequately verify the ethical and cultural appropriateness of the generated images, potentially leading to the use of inappropriate images that disregard user feelings. This poses a risk of cultural misunderstandings and negative reactions in various industries, including advertising. Furthermore, a lack of ethical education for users is also a challenge.
[0360] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0361] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; sentiment analysis means for analyzing user sentiment data in real time and reflecting it in the evaluation of the image; means for providing an index for determining whether the generated image is usable based on the measurement and analysis results; and means for assigning authentication information to entities that have established an appropriate bias prevention system. This makes it possible to appropriately evaluate the ethical and cultural appropriateness of the generated image and prevent the use of inappropriate images. Furthermore, it is possible to make suggestions for improving the advertising content based on user feedback and enhance ethical education for users.
[0362] A "consultative body" is an organization or group established to bring together entities from different industries and fields of expertise to cooperate in solving common problems.
[0363] "Artificial intelligence" refers to technologies and systems designed to mimic human intellectual behavior, possessing the ability to automatically perform data analysis and decision-making.
[0364] The "discomfort index" is a quantitative metric that evaluates how much discomfort generated images and content cause to users.
[0365] "Emotional analysis tools" refer to technologies and devices that analyze a user's facial expressions and voice to infer their emotional state.
[0366] "Indicators for determining usability" are standards or rules used to determine whether a generated image is ethically and culturally appropriate.
[0367] A "bias prevention system" refers to measures and procedures established to mitigate biases in generated images and data and to ensure fair processing.
[0368] "Authentication information" refers to digital or physical proof given to an entity to demonstrate that appropriate bias countermeasures have been implemented.
[0369] "Emotional data" refers to data that reflects the user's emotional state, and includes information obtained from facial expressions, voice, and other sources.
[0370] This invention provides a system for evaluating the ethical and cultural appropriateness of generated images. The system operates through the user's terminal and a server, performing real-time sentiment analysis and data evaluation.
[0371] The server receives images uploaded from the user's terminal and processes emotional data obtained from the user's facial expressions and voice using emotion analysis tools. Specifically, it captures facial images using video processing libraries such as OpenCV and analyzes the emotional data using an emotion recognition engine implemented in Python.
[0372] The analyzed emotional data is integrated into an algorithm that calculates a discomfort index, which is used as an indicator to judge the cultural and ethical appropriateness of an image. Using an AI model, the usability of the image is determined based on the discomfort index, and the results are fed back to the user's device as a detailed evaluation report.
[0373] In this way, users can create and improve images for advertisements and products in a culturally appropriate manner. A specific application example is that advertising agencies can use this system to optimize fashion advertisements to better suit their target market.
[0374] An example of a prompt message for a generative AI model would be: "Analyze the emotions of users who view the following ad image and calculate the discomfort index. The image file is attached. Please also suggest ways to improve the ad, taking user feedback into consideration."
[0375] This system is expected to enable the provision of ethically and culturally high-quality content, contributing to increased user satisfaction.
[0376] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0377] Step 1:
[0378] The user uploads generated images from their device to the server. The device, such as a smartphone or computer, has a function for selecting and sending image files. When sending, the device also provides basic image information (resolution, file format, etc.).
[0379] Step 2:
[0380] The server receives uploaded images and simultaneously acquires emotional data from the user's device. The device uses its camera and microphone to capture the user's facial expressions and voice, extracting emotional data in real time. This data is obtained by analyzing emotions from facial expressions using facial recognition technology.
[0381] Step 3:
[0382] The server inputs the received image into an AI model and calculates the discomfort index of the generated image. Image analysis evaluates elements considered culturally inappropriate based on pixel data. The discomfort index is output as a numerical value as a result of the analysis.
[0383] Step 4:
[0384] The server integrates user emotion data obtained using emotion analysis tools with a discomfort index to generate an index for determining whether an image is usable. Here, an integration algorithm is used to analyze how emotion data affects the discomfort index and output the final index.
[0385] Step 5:
[0386] The server provides the user's device with indicators for determining whether an image can be used and a detailed evaluation report. This report provides the user with information to evaluate the cultural and ethical appropriateness of the image and make improvements as needed.
[0387] Step 6:
[0388] Users improve their images based on the suggestions in the feedback report. The report includes specific approaches for improvement and the next steps the user should take. This allows users to enhance the cultural relevance of the images they create.
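The basic image information sent with the upload in Step 1 above (resolution, file format, etc.) can be read directly from the file header. The sketch below handles only PNG files, using the fixed layout of the PNG signature and IHDR chunk; the function name `png_basic_info` and the returned field names are assumptions, and other formats would need their own parsers or an imaging library.

```python
import struct
from typing import Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_basic_info(data: bytes) -> Dict[str, object]:
    """Read width and height from the IHDR chunk at the start of a PNG file."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    # Layout after the 8-byte signature: 4-byte chunk length, 4-byte chunk
    # type, then the IHDR data, which begins with big-endian width and height.
    if data[12:16] != b"IHDR":
        raise ValueError("missing IHDR chunk")
    width, height = struct.unpack(">II", data[16:24])
    return {
        "file_format": "png",
        "width": width,
        "height": height,
        "size_bytes": len(data),
    }
```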
[0389] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0390] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
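The pairing of a prompt containing instructions with inference data, as described above, can be pictured as a simple request structure. The sketch below is purely illustrative: the class `InferenceRequest`, its fields, and the flattened text format are hypothetical and do not correspond to the API of ChatGPT, Gemini, or any other service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InferenceRequest:
    """Hypothetical container for an instruction plus typed inference data."""

    instruction: str
    inputs: List[Dict[str, str]] = field(default_factory=list)

    def add_text(self, text: str) -> None:
        self.inputs.append({"type": "text", "value": text})

    def add_image(self, path: str) -> None:
        # Only a reference to the image is stored in this sketch.
        self.inputs.append({"type": "image", "value": path})

    def to_prompt(self) -> str:
        """Flatten the request into one text block, one input per line."""
        lines = [self.instruction]
        for item in self.inputs:
            lines.append(f"[{item['type']}] {item['value']}")
        return "\n".join(lines)
```

For example, a request whose instruction is one of the prompt messages quoted in this description and whose input is an uploaded image would be built with `add_image` and passed to whichever model interface is actually used.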
[0391] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0392] [Third Embodiment]
[0393] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0394] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0395] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0396] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0397] The microphone 238 accepts instructions from the user 20 by receiving the voice of the user 20. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0398] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0399] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0400] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0401] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0402] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0403] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0404] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0405] This invention is a system for mitigating cultural and social biases associated with artificial intelligence-generated images and promoting fair and ethical use. This system is realized through the participation of diverse industries within the consultative body and the integration of knowledge and technology through a shared platform.
[0406] Specifically, a mechanism that evaluates the potential bias in AI-generated images and calculates a discomfort index plays a crucial role. First, the user uploads the AI-generated image from their device to the server. Next, the server sends the image to an Asia-specific feedback system to calculate the discomfort index. Here, an image analysis algorithm runs and generates numerical data on bias while taking regional culture and social context into account.
[0407] After the calculation is complete, the server creates an index indicating whether the generated image is usable based on the discomfort index. By sending this index to the terminal, the server gives companies and organizations the information they need to determine whether the generated content is ethically appropriate. For example, when a Japanese manufacturer creates advertisements for overseas markets, it can use this system to eliminate inappropriate expressions in advance, taking the local cultural context into account.
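The mapping from a discomfort index to a usability index could be sketched as follows. This is only an illustration under stated assumptions: the source does not give a scale or thresholds, so the index is assumed to be normalized to [0, 1], and the names `usability_index`, `permit_below`, and `prohibit_from` and their default values are hypothetical.

```python
from enum import Enum


class Usability(Enum):
    PERMITTED = "permitted"
    NEEDS_REVIEW = "needs_review"
    PROHIBITED = "prohibited"


def usability_index(discomfort: float,
                    permit_below: float = 0.3,
                    prohibit_from: float = 0.7) -> Usability:
    """Map a discomfort index in [0, 1] to a usability decision.

    Both thresholds are hypothetical; the source does not specify them.
    """
    if not 0.0 <= discomfort <= 1.0:
        raise ValueError("discomfort index must lie in [0, 1]")
    if discomfort < permit_below:
        return Usability.PERMITTED
    if discomfort >= prohibit_from:
        return Usability.PROHIBITED
    # Values between the two thresholds are left to a human reviewer.
    return Usability.NEEDS_REVIEW


print(usability_index(0.12).value)  # permitted
```

A three-way result is used here instead of a plain yes/no so that borderline images can be sent back for editing, which matches the re-upload step described in the processing flow.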
[0408] Furthermore, this invention provides users with relevant ethical education content through the intervention of an artificial intelligence agent. The server collects knowledge gathered from experts in various countries, and the AI agent distributes this information to terminals as educational content, thereby promoting a sustained improvement in ethical awareness. As a result, companies can develop market strategies while taking measures to counter bias, thereby increasing social acceptance and credibility.
[0409] Finally, the consultative body establishes a process for granting certification information to companies that have implemented appropriate bias prevention measures. The server verifies that the standards have been met and issues a certification mark as a form of recognition. This certification serves as an important benchmark that lets consumers choose products and services with confidence.
[0410] The following describes the processing flow.
[0411] Step 1:
[0412] The user uploads the AI-generated image from their device to the server. The device also performs initial checks that the image file format and size are appropriate.
[0413] Step 2:
[0414] The server transfers the received image data to the analysis system. Here, an Asia-specific feedback algorithm analyzes the content of the image and calculates a discomfort index.
[0415] Step 3:
[0416] The server creates an index that determines whether to permit or prohibit the use of the generated image based on the calculated discomfort index. This index is quantified as an evaluation that takes into account the cultural or social elements associated with the image.
[0417] Step 4:
[0418] The server sends the generated index to the terminal as an evaluation report. The report includes the reasons for the decision on whether to permit use, as well as the specific numerical value of the discomfort index.
[0419] Step 5:
[0420] Users review the evaluation report received on their device. If necessary, they can edit the image and re-upload it to the server. They may also be prompted to receive further educational content.
[0421] Step 6:
[0422] The server automatically generates ethical education content via an artificial intelligence agent and delivers it to the terminal. This educational content is designed to deepen understanding of relevant cultural backgrounds and biases.
[0423] Step 7:
[0424] Based on the consultative body's standards, the server evaluates the discomfort index and the appropriateness of the countermeasures in place, and issues authentication information to user companies that meet the criteria. This authentication information can be accessed via terminals and used in marketing strategies.
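The certification check in Step 7 might look like the sketch below. The source does not state the actual criteria, so everything here is an assumption for illustration: the `CompanyRecord` type, the `REQUIRED_COUNTERMEASURES` set, and the `max_mean_discomfort` threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import mean

# Hypothetical list of countermeasures the consultative body might require.
REQUIRED_COUNTERMEASURES = frozenset({"pre_release_review", "staff_ethics_training"})


@dataclass
class CompanyRecord:
    name: str
    discomfort_indices: list[float]  # indices of images the company evaluated
    countermeasures: set[str] = field(default_factory=set)


def qualifies_for_certification(record: CompanyRecord,
                                max_mean_discomfort: float = 0.3) -> bool:
    """Return True when the record meets the (assumed) certification criteria."""
    if not record.discomfort_indices:
        return False  # nothing has been evaluated yet
    has_measures = REQUIRED_COUNTERMEASURES <= record.countermeasures
    return has_measures and mean(record.discomfort_indices) <= max_mean_discomfort


acme = CompanyRecord("Acme", [0.1, 0.2],
                     {"pre_release_review", "staff_ethics_training"})
print(qualifies_for_certification(acme))  # True
```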
[0425] (Example 1)
[0426] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0427] Visual information generated using artificial intelligence may contain diverse cultural and social biases, requiring careful consideration before use. However, current systems do not adequately automatically evaluate such biases and remove inappropriate elements. Furthermore, there is a lack of ethical consideration regarding the generated visual information, and means of providing appropriate education are needed. Therefore, the challenge lies in establishing a system that ensures stakeholders take appropriate bias prevention measures and verify that all visual information they encounter is appropriate.
[0428] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0429] This invention includes a server that transmits visual information to a feedback system that evaluates bias in visual information and calculates a discomfort index based on that evaluation; a server that provides criteria for determining whether or not to allow the use of the generated visual information based on the measurement results; and a server that grants credentials to relevant parties who have established appropriate bias prevention systems. This enables relevant parties to evaluate the cultural and social appropriateness of the generated visual information and use it as ethically safe content.
[0430] "Stakeholders from diverse fields" refers to individuals and organizations from different industries and areas of expertise who cooperate to achieve a common goal.
[0431] A "collaborative entity" is an organizational framework that brings together knowledge and technology to work towards a specific goal.
[0432] "Artificial intelligence" refers to systems and technologies that process large amounts of data and mimic human intelligence.
[0433] "Visual information" refers to digital data that can be seen with the eyes, such as images and videos.
[0434] A "discomfort index" is a numerical representation of how culturally or socially unpleasant the content of visual information is.
[0435] "Bias assessment" is the process of judging and measuring the degree of cultural or social bias contained in visual information.
[0436] "Ethical education" refers to activities that aim to raise awareness by providing information and knowledge about cultural and social ethics related to visual information.
[0437] This invention provides a system for evaluating the cultural and social appropriateness of visual information generated using artificial intelligence and for promoting ethically appropriate content use. This system consists of multiple components, including a user's terminal, a server, and an artificial intelligence agent.
[0438] Users upload generated visual information from their own devices to a server. The device uses either a standard internet browser or a dedicated application. Upon receiving the visual information from the user, the server sends the data to an Asia-specific feedback system. This system uses an image analysis algorithm to evaluate the presence or absence of bias and calculate a discomfort index. The algorithm extracts features related to the color, composition, and content of the visual information and quantifies the degree of bias by comparing them with existing cultural databases.
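One way to picture the comparison between extracted features and a cultural database is a weighted sum, sketched below. The source does not disclose the actual algorithm; the feature names, the per-region sensitivity weights, and the clipping to [0, 1] are all assumptions made for illustration.

```python
def discomfort_index(features: dict[str, float],
                     sensitivity: dict[str, float]) -> float:
    """Combine extracted features with region-specific sensitivity weights.

    features:    detected element -> strength in [0, 1] (output of image analysis)
    sensitivity: element -> weight in [0, 1] taken from a (hypothetical)
                 cultural database for the target region
    """
    score = sum(strength * sensitivity.get(name, 0.0)
                for name, strength in features.items())
    # Clip so the result stays on the assumed [0, 1] scale.
    return min(1.0, score)


# Hypothetical feature names and weights, for illustration only.
features = {"color_palette": 0.8, "hand_gesture": 0.5}
region_db = {"color_palette": 0.5}
index = discomfort_index(features, region_db)
```

Elements that are absent from the regional database contribute nothing, so the same image can yield different indices for different regions.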
[0439] The server generates criteria for granting permission to use visual information based on the calculated discomfort index and sends them to the terminal. This allows users to verify whether the visual information is within ethically acceptable limits. The server also includes an ethics education function, providing users with appropriate educational content through an artificial intelligence agent. This content is a compilation of the knowledge of ethics experts from various countries and helps users raise their ethical awareness when using visual information.
[0440] As a concrete example, when a Japanese advertising production company creates advertisements for overseas markets, using this system enables it to create appropriate content that takes the local cultural background into account. An example of a prompt message is, "Analyze what cultural biases are contained in the AI-generated image and calculate the discomfort index." Entering this prompt into the server initiates the specified analysis process.
[0441] This system allows stakeholders to quickly and effectively determine whether the generated visual information is appropriate, thereby increasing the credibility and safety of the content for society.
[0442] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0443] Step 1:
[0444] Users upload visual information generated using their own devices to the server. The device reads the visual information file selected by the user and sends it to the server's designated upload portal via the internet. The input is a visual information file, and the output is data transferred to the server.
[0445] Step 2:
[0446] The server receives visual information sent by the user. After receiving the information, the server checks the image format and size and performs a security scan. The input here is visual information data from the user, and the output is clean data ready to be passed to the feedback system.
[0447] Step 3:
[0448] The server sends the clean visual information data to an Asia-specific feedback system. Here, an image analysis algorithm operates to evaluate the bias of the visual information. The input is the received image data, and the output is a discomfort index indicating the degree of bias. In this process, the system analyzes the cultural and social elements contained in the visual information and calculates a numerical value by comparing them with relevant databases.
[0449] Step 4:
[0450] The server generates criteria for granting permission to use visual information based on the discomfort index. The generated criteria are structured as an index indicating whether the visual information may be used. The input is the discomfort index, and the output is a criteria document available to the user.
[0451] Step 5:
[0452] The server sends the usage permission criteria to the user's device. The user reviews these criteria on their device and obtains information to determine whether the visual information will be used appropriately. The input is the usage permission criteria, and the output is the evaluation result displayed on the user's device.
[0453] Step 6:
[0454] The server delivers ethical education content through an artificial intelligence agent and displays it on the user's terminal. This educational content instructs users on how to use visual information in an ethically responsible manner. The input is educational data based on expert knowledge, and the output is educational content provided to the user.
[0455] This series of steps allows users to verify the ethical applicability of the generated visual information and use it in a socially desirable manner.
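The format and size check in Step 2 above could be sketched as follows. This is a minimal illustration, not the described system's implementation: the size limit and the set of accepted formats are assumptions, and the security scan mentioned in Step 2 is out of scope here. The PNG and JPEG signatures are the standard leading bytes of those file formats.

```python
MAX_BYTES = 10 * 1024 * 1024  # hypothetical upload limit (10 MiB)

# Leading bytes ("magic numbers") of the accepted formats.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def detect_format(data: bytes, max_bytes: int = MAX_BYTES) -> str:
    """Return the image format name, or raise ValueError if the upload is rejected."""
    if len(data) > max_bytes:
        raise ValueError("file too large")
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    raise ValueError("unsupported image format")


print(detect_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16))  # png
```

Checking the leading bytes rather than the file extension means a renamed file is still classified by its actual content.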
[0456] (Application Example 1)
[0457] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0458] Images generated by artificial intelligence can contain cultural and social biases, potentially resulting in inappropriate or misleading content. Such situations can lead to a decline in social credibility and ethical problems in the market. Therefore, there is a need for systems that can evaluate the ethical appropriateness of generated images and help mitigate bias.
[0459] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0460] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; means for providing an index for determining whether the generated image is usable based on the measurement results; means for assigning authentication information to entities that have established appropriate bias countermeasures; means for providing guidance information to support the reduction of cultural or social bias; and means equipped with a computing device for verifying the ethical appropriateness of the image in real time. This makes it possible to evaluate the ethical appropriateness of the generated content and develop appropriate market strategies.
[0461] A "consultative body" is an organizational structure in which entities from diverse industries participate to share knowledge and technology.
[0462] "Artificial intelligence" refers to a computer program or system that imitates human perception and reasoning.
[0463] The "image discomfort index" is a numerical indicator that quantifies the cultural and social biases and inappropriateness inherent in a generated image.
[0464] "Authentication information" refers to information that serves as proof of reliability, granted to entities that possess appropriate bias prevention systems.
[0465] "Cultural bias" refers to a state that includes elements that may be inappropriate or misleading in a particular culture or society.
[0466] "Social prejudice" refers to a state that reflects biases and preconceptions based on social background and values.
[0467] "Ethical compliance" refers to the state in which the generated content meets generally accepted ethical standards.
[0468] A "calculating device" is a device used to perform specific calculations quickly and accurately.
[0469] "Guidance information" refers to information that provides guidance or directions for a specific purpose.
[0470] This invention relates to a system for evaluating the ethical appropriateness of images generated by artificial intelligence and for implementing appropriate bias countermeasures. The following procedures and configurations are necessary to realize this system.
[0471] First, the user uploads an AI-generated image from their own device to the server. Upon receiving this image, the server first evaluates the potential for cultural and social bias through image analysis. This is done using software libraries such as OpenCV and TensorFlow. As a result of the analysis, a discomfort index is calculated, quantifying bias and inappropriate elements. Based on this discomfort index, the server creates an index to determine whether the image is usable and provides it to the user. This allows the user to confirm whether the image is ethically appropriate.
[0472] The server also provides guidance information to help account for culture-specific biases. This information is generated from a knowledge database compiled by the consultative body. This allows users to verify whether the generated images are appropriate for a particular socio-cultural context.
[0473] Furthermore, the server uses artificial intelligence agents to provide ethics-related educational content. This content is based on insights gathered from experts in various countries and helps users continuously enhance their ethical awareness. This allows users to increase their social credibility in the market.
[0474] For example, when a Japanese advertising production company promotes a new product to the Asian market, it can use this system to evaluate the discomfort index and eliminate cultural biases, thereby developing an effective advertising strategy.
[0475] Examples of prompt messages are as follows:
[0476] "The target market for this advertisement is India. The goal is to respect India's cultural context as a commercial advertisement. Please analyze this AI-generated image and provide a discomfort index. Please also provide guidelines for any necessary revisions."
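A prompt of this shape can be produced from a template so that the target market and purpose vary per request. The sketch below is illustrative only; the function name `build_prompt` and its parameters are hypothetical and do not appear in the source.

```python
def build_prompt(target_market: str, purpose: str) -> str:
    """Fill the example prompt with a target market and an advertising purpose."""
    return (
        f"The target market for this advertisement is {target_market}. "
        f"The goal is to respect {target_market}'s cultural context as a {purpose}. "
        "Please analyze this AI-generated image and provide a discomfort index. "
        "Please also provide guidelines for any necessary revisions."
    )


prompt = build_prompt("India", "commercial advertisement")
```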
[0477] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0478] Step 1:
[0479] The user uploads an AI-generated image from their device to the server. The AI image is provided as input, and the server receives the image file. The server holds this image and prepares it for subsequent processing.
[0480] Step 2:
[0481] The server analyzes the received AI images. At this stage, it processes the images using OpenCV or TensorFlow to extract features for detecting potential cultural and social biases. The input is the image data received in step 1, and the output is the feature data resulting from the analysis. The server then prepares to calculate the discomfort index based on this data.
[0482] Step 3:
[0483] The server calculates a discomfort index based on the analysis results from step 2. Using a numerical algorithm, it quantifies the extent to which specific cultural or social elements are present and generates a discomfort index. The input is feature data, and the output is numerical data as the discomfort index. The server uses this discomfort index to determine whether the image is usable.
[0484] Step 4:
[0485] The server uses the discomfort index to create an index indicating whether an image is usable. The input is a numerical discomfort index, and the output is visualized data as an index. The server notifies the user of this index to help them decide whether the image is ethically appropriate.
[0486] Step 5:
[0487] The server provides guidance information to mitigate cultural and social biases. This information is drawn from a knowledge database compiled by the consultative body and presented to the user. The input is a discomfort index and related databases, and the output is text data as guidance information. The user uses this information to modify the generated images.
[0488] Step 6:
[0489] The server delivers ethical education content through an artificial intelligence agent. This aims to continuously improve ethical awareness and supports education by providing users with relevant content. Input consists of data collected from experts in various countries, and output is presentation materials as educational content.
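The guidance lookup in Step 5 above could be sketched as a simple keyed lookup. The structure of the consultative body's knowledge database is not described in the source, so the `(market, element)` key, the element names, and the guidance texts below are all hypothetical.

```python
def guidance_for(detected: list[str],
                 market: str,
                 knowledge_base: dict[tuple[str, str], str]) -> list[str]:
    """Collect guidance text for each detected element that has an entry
    for the given market; elements without an entry are skipped."""
    return [knowledge_base[(market, element)]
            for element in detected
            if (market, element) in knowledge_base]


# Hypothetical knowledge database entries, for illustration only.
kb = {
    ("India", "color_palette"): "Review the color palette with a local reviewer.",
    ("India", "hand_gesture"): "Confirm the gesture's meaning with a local reviewer.",
}
notes = guidance_for(["color_palette", "typography"], "India", kb)
print(len(notes))  # 1
```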
[0490] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0491] This invention is a system for mitigating bias and promoting ethical use of images generated by artificial intelligence, and further incorporates an emotion engine that recognizes and utilizes user emotions. Through a consultative body in which entities from various industries participate, this system measures the discomfort index of generated images, provides an index for determining permission to use based on the results, and grants authentication information to entities that have implemented appropriate bias countermeasures.
[0492] First, the user uploads an AI-generated image to the server using their device. The device incorporates an emotion engine that collects emotional data in real time from the user's facial expressions and voice. This emotional data is used to obtain subjective feedback on how the user perceives the image.
[0493] Upon receiving an uploaded image, the server first sends the image to a feedback system and calculates the discomfort index using an Asia-specific algorithm. Simultaneously, it integrates emotional data obtained from the emotion engine and, taking into account the user's emotional response, performs a more refined discomfort index evaluation. The influence of emotional data on the discomfort index is adjusted based on pre-set weighting parameters.
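The weighting described above can be written as a convex combination of the image-based index and an emotion-derived score. This is a sketch under assumptions: the source only says that a pre-set weighting parameter adjusts the influence of the emotional data, so the linear form, the [0, 1] scales, and the default weight are hypothetical.

```python
def integrate_emotion(image_index: float,
                      emotion_discomfort: float,
                      weight: float = 0.25) -> float:
    """Blend the image-based discomfort index with an emotion-derived score.

    weight is the pre-set weighting parameter: 0 ignores the emotional data,
    1 uses only the emotional data. The default value is hypothetical.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (1.0 - weight) * image_index + weight * emotion_discomfort


integrated = integrate_emotion(image_index=0.4, emotion_discomfort=0.8)
```

With both inputs on a [0, 1] scale, the result also stays within [0, 1], so it can be passed to the same usability decision as the image-only index.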
[0494] Based on the calculated discomfort index and sentiment analysis results, the server creates reasonable criteria for determining whether to permit or prohibit the use of the image, and sends this as a detailed evaluation report to the terminal. The evaluation report includes feedback derived from the user's emotional state, along with the applicability of the image.
[0495] The system also provides customized ethical education content based on the user's emotional data, which the server collects through the emotion engine. This content is delivered via an AI agent, giving users learning opportunities to improve their ethical awareness.
[0496] As a concrete example, a Japanese advertising company could use this invention to pre-evaluate generated images when launching a campaign targeting the entire Asian region. Users can express their individual emotions towards the images through the emotion engine, and by incorporating these into the final evaluation, it becomes possible to select more culturally appropriate representations. This allows companies to make decisions that take diversity and acceptability into greater consideration in their market approach strategies.
[0497] The following describes the processing flow.
[0498] Step 1:
[0499] The user uploads image data generated by artificial intelligence from their device to the server. At this point, the device also activates an emotion engine, collecting emotional data in real time from the user's facial expressions and voice.
[0500] Step 2:
[0501] The server receives the uploaded image and sends it to a feedback system to calculate a discomfort index. This system analyzes the image content using a region-specific algorithm.
[0502] Step 3:
[0503] The server simultaneously receives user emotion data transmitted from the emotion engine and incorporates this data into the calculation of the discomfort index. This process quantifies the user's subjective evaluation, allowing for a more accurate assessment of the image's cultural appropriateness.
[0504] Step 4:
[0505] The server creates an index that determines whether an image is usable based on an integrated discomfort index. This index reflects the potential impact of the image and the user's emotional response.
[0506] Step 5:
[0507] The server delivers the generated index and analysis results to the terminal as an evaluation report. The terminal receives this report, and the user corrects or re-evaluates the images based on the report.
[0508] Step 6:
[0509] The server utilizes an emotion engine to generate ethical education content tailored to the user's emotional data and delivers it to the device via an AI agent. This content aims to promote cultural understanding and improve ethical awareness.
[0510] Step 7:
[0511] Users utilize the provided educational content to re-evaluate generated images and engage in creative production from an ethical perspective. This enables companies to choose culturally and cognitively appropriate approaches.
[0512] (Example 2)
[0513] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0514] As artificial intelligence technology for generating visual data advances, ethical issues inherent in the generated data, such as cultural bias and offensive elements, are increasing. Therefore, it is necessary to ethically audit the generated visual data, implement appropriate bias countermeasures, and promote the use of data that considers diversity and cultural sensitivity. Furthermore, there is a challenge in providing a healthier information environment through feedback that considers the individual feelings of users and through ethical education.
[0515] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0516] In this invention, the server includes means for calculating a discomfort index based on visual data through a consultative body in which entities from various industries participate, means for collecting user emotion data and reflecting that data in the evaluation of the discomfort index, and an information processing agent for providing educational information for ethical use. This enables the mitigation of ethical issues in the generation and use of information, adaptation to multicultural societies, and advanced ethical education.
[0517] "Visual data" refers to digital information in a visible form, including images generated by computers.
[0518] The "discomfort index" is an indicator that evaluates the degree of psychological discomfort that visual data causes to people.
[0519] "Emotional data" refers to information that indicates a user's psychological state, obtained from their facial expressions and voice.
[0520] An "information processing device" refers to a computer system that uses artificial intelligence and algorithms to analyze data and output specific results.
[0521] "Educational information" refers to informational materials that include knowledge and guidelines aimed at improving users' ethical awareness.
[0522] An "information processing agent" refers to a program that runs within an information processing device and automatically performs a specific task.
[0523] An "ethical audit" is a process of checking the ethical appropriateness of the generation and use of visual data and identifying any problems.
[0524] This invention is a system that utilizes artificial intelligence technology to support the ethical use of visual data. This system enables the appropriate use of generated visual data through collaborative data processing involving a server, terminal, and user.
[0525] First, the user uploads AI-generated visual data to the server using their own device. This device has an emotion engine built in, which collects the user's facial expressions and voice via the camera and microphone, and can analyze them in real time as emotion data.
[0526] The server receives the visual data sent by the user and uses the collected emotional data in calculating the discomfort index. The algorithm employs an Asia-specific method that takes regional culture into account in its evaluation. The influence of the emotional data on the discomfort index is adjusted using weighting parameters.
[0527] Furthermore, the server provides users with ethical education content tailored to their needs, based on the collected emotional data. This content is delivered via an information processing agent, providing users with opportunities to improve their ethical awareness.
[0528] As a concrete example, consider a scenario where a company uses this system to evaluate the applicability of visual data generated when launching a campaign across a wide market. Users express their individual emotions through the emotion engine, and these are reflected in the final evaluation, enabling more culturally conscious content selection.
[0529] An example of a prompt to input into a generative AI model might be, "Please suggest ideas for generating images for an advertising campaign while taking into account the diverse cultures of Asia."
[0530] In this way, the system provides ethically and culturally appropriate methods throughout the entire process from the generation to the use of visual data, promoting fair and appropriate use by diverse stakeholders.
[0531] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0532] Step 1:
[0533] The user uploads AI-generated visual data to the server using their device. The input consists of image files and video data created by the AI model. The device is equipped with an emotion engine that uses the camera and microphone to record the user's facial expressions and voice in real time. The output is user emotion data, which is then used for subsequent processing.
[0534] Step 2:
[0535] The server processes the visual and emotional data received from the terminal. The inputs are the visual data obtained in step 1 and the user's emotional data. The server executes an algorithm to calculate a discomfort index based on the visual data. This algorithm uses criteria specific to Asian culture for evaluation. Emotional data influences the final discomfort index evaluation using weighting parameters. The output is the evaluation result, including the discomfort index.
[0536] Step 3:
[0537] The server creates an index to determine whether or not visual data can be used, based on the discomfort index evaluation results. The input for this step is the discomfort index evaluation results obtained from step 2. The server uses an algorithm to analyze the evaluation results and generates an index to determine the appropriateness of using the images. The output is an evaluation report that details the decision on whether or not to allow use and the reasoning behind it.
[0538] Step 4:
[0539] The server generates an evaluation report and sends it to the user's terminal. The input for this step is the usage decision indicators and evaluation report generated in step 3. The server organizes this information and outputs it as a report in a format that is easy for the user to understand. The output is an evaluation report that includes whether or not the images can be used.
[0540] Step 5:
[0541] The server provides ethics education content based on the emotional data through an information processing agent. The input for this step is the emotional data collected in step 1. Based on this, the server generates customized educational content tailored to the user's learning needs. The output is the ethics education content for the user.
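The evaluation report produced in Steps 3 and 4 above could be represented as a small structured record that the server serializes before sending it to the terminal. The field names and the JSON encoding are assumptions for illustration; the source only states that the report contains the usability decision, its reasoning, and the discomfort index.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class EvaluationReport:
    image_id: str            # hypothetical identifier for the uploaded visual data
    discomfort_index: float  # result of Step 2
    usable: bool             # usability decision from Step 3
    reasons: list[str]       # reasoning behind the decision

    def to_json(self) -> str:
        """Serialize the report for transmission to the terminal."""
        return json.dumps(asdict(self), ensure_ascii=False)


report = EvaluationReport(
    image_id="img-001",
    discomfort_index=0.72,
    usable=False,
    reasons=["Discomfort index exceeds the permitted range."],
)
payload = report.to_json()
```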
[0542] (Application Example 2)
[0543] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0544] Conventional image generation systems often fail to adequately verify the ethical and cultural appropriateness of the generated images, potentially leading to the use of inappropriate images that disregard user feelings. This poses a risk of cultural misunderstandings and negative reactions in various industries, including advertising. Furthermore, a lack of ethical education for users is also a challenge.
[0545] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0546] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; sentiment analysis means for analyzing user sentiment data in real time and reflecting it in the evaluation of the image; means for providing an index for determining whether the generated image is usable based on the measurement and analysis results; and means for assigning authentication information to entities that have established an appropriate bias prevention system. This makes it possible to appropriately evaluate the ethical and cultural appropriateness of the generated image and prevent the use of inappropriate images. Furthermore, it is possible to make suggestions for improving the advertising content based on user feedback and enhance ethical education for users.
[0547] A "consultative body" is an organization or group established to bring together entities from different industries and fields of expertise to cooperate in solving common problems.
[0548] "Artificial intelligence" refers to technologies and systems designed to mimic human intellectual behavior, possessing the ability to automatically perform data analysis and decision-making.
[0549] The "discomfort index" is a quantitative metric that evaluates how much discomfort generated images and content cause to users.
[0550] "Emotional analysis tools" refer to technologies and devices that analyze a user's facial expressions and voice to infer their emotional state.
[0551] "Indicators for determining usability" are standards or rules used to determine whether a generated image is ethically and culturally appropriate.
[0552] A "bias prevention system" refers to measures and procedures established to mitigate biases in generated images and data and to ensure fair processing.
[0553] "Authentication information" refers to digital or physical proof given to an entity to demonstrate that appropriate bias countermeasures have been implemented.
[0554] "Emotional data" refers to data that reflects the user's emotional state, and includes information obtained from facial expressions, voice, and other sources.
[0555] This invention provides a system for evaluating the ethical and cultural appropriateness of generated images. The system operates through the user's terminal and a server, performing real-time sentiment analysis and data evaluation.
[0556] The server receives images uploaded from the user's terminal and processes emotional data obtained from the user's facial expressions and voice using emotion analysis tools. Specifically, it captures facial images using video processing libraries such as OpenCV and analyzes the emotional data using an emotion recognition engine implemented in Python.
[0557] The analyzed sentiment data is integrated into an algorithm that calculates a discomfort index, which is used as an indicator to judge the cultural and ethical appropriateness of an image. Using an AI model, the usability of an image is determined based on the discomfort index, and the results are fed back to the user's device as a detailed evaluation report.
[0558] In this way, users can create and improve images for advertisements and products in a culturally appropriate manner. A specific application example is that advertising agencies can use this system to optimize fashion advertisements to better suit their target market.
[0559] An example of a prompt message for a generative AI model would be: "Analyze the emotions of users who view the following ad image and calculate the discomfort index. The image file is attached. Please also suggest ways to improve the ad, taking user feedback into consideration."
[0560] This system is expected to enable the provision of ethically and culturally high-quality content, contributing to increased user satisfaction.
[0561] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0562] Step 1:
[0563] The user uploads a generated image from their device to the server. The device, such as a smartphone or computer, provides a function for selecting and sending image files. When sending, the device also provides basic image information (resolution, file format, etc.).
[0564] Step 2:
[0565] The server receives uploaded images and simultaneously acquires emotional data from the user's device. The device uses its camera and microphone to capture the user's facial expressions and voice, extracting emotional data in real time. This data is obtained by analyzing emotions from facial expressions using facial recognition technology.
[0566] Step 3:
[0567] The server inputs the received image into an AI model and calculates the discomfort index of the generated image. Image analysis evaluates elements considered culturally inappropriate based on pixel data. The discomfort index is output as a numerical value as a result of the analysis.
[0568] Step 4:
[0569] The server integrates user emotion data obtained using emotion analysis tools with a discomfort index to generate an index for determining whether an image is usable. Here, an integration algorithm is used to analyze how emotion data affects the discomfort index and output the final index.
[0570] Step 5:
[0571] The server provides the user's device with indicators for determining whether an image can be used and a detailed evaluation report. This report provides the user with information to evaluate the cultural and ethical appropriateness of the image and make improvements as needed.
[0572] Step 6:
[0573] Based on the feedback report, the user works out improvements to the image. The report includes specific approaches for improvement and the next steps the user should take. This allows users to enhance the cultural relevance of the images they create.
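The integration in Step 4 can be illustrated with a minimal Python sketch. The source does not disclose the integration algorithm, so everything here is an assumption made for illustration: the emotion labels, the 0 to 1 scale of both inputs, the blending weight `emotion_weight`, the `threshold`, and the names `integrate` and `UsabilityIndicator` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical emotion labels treated as signs of viewer discomfort.
NEGATIVE_EMOTIONS = ("anger", "disgust", "fear", "sadness")


@dataclass
class UsabilityIndicator:
    combined_index: float  # blended discomfort value in [0, 1]
    usable: bool           # True when the blended value is below the threshold


def negative_emotion_score(emotion_scores: Mapping[str, float]) -> float:
    """Sum the scores of the negative emotions and clamp the result to [0, 1]."""
    total = sum(emotion_scores.get(name, 0.0) for name in NEGATIVE_EMOTIONS)
    return min(max(total, 0.0), 1.0)


def integrate(image_discomfort: float,
              emotion_scores: Mapping[str, float],
              emotion_weight: float = 0.3,   # assumed weighting, not from the source
              threshold: float = 0.5) -> UsabilityIndicator:
    """Blend the image-based discomfort index with the viewer's emotion data."""
    if not 0.0 <= image_discomfort <= 1.0:
        raise ValueError("image_discomfort must be in [0, 1]")
    if not 0.0 <= emotion_weight <= 1.0:
        raise ValueError("emotion_weight must be in [0, 1]")
    emotion_part = negative_emotion_score(emotion_scores)
    combined = (1.0 - emotion_weight) * image_discomfort + emotion_weight * emotion_part
    return UsabilityIndicator(combined_index=combined, usable=combined < threshold)
```

In this sketch a larger `emotion_weight` lets the viewer's reaction move the final index further from the image-only value; a real system would presumably calibrate both the weight and the threshold per target region.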
[0574] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0575] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0576] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0577] [Fourth Embodiment]
[0578] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0579] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0580] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0581] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0582] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0583] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0584] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0585] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0586] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0587] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0588] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0589] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0590] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0591] This invention is a system for mitigating cultural and social biases associated with artificial intelligence-generated images and promoting fair and ethical use. This system is realized through the participation of diverse industries within the consultative body and the integration of knowledge and technology through a shared platform.
[0592] Specifically, a mechanism that evaluates the potential bias in AI-generated images and calculates a discomfort index plays a crucial role. First, the user uploads the AI-generated image from their device to the server. Next, the server sends the image to an Asia-specific feedback system to calculate the discomfort index. Here, an image analysis algorithm is activated, generating numerical data on bias while considering regional culture and social context.
[0593] After the calculation is complete, the server creates an index indicating whether the generated image is usable based on the discomfort index. By sending this index to the terminal, companies and organizations can obtain information to determine whether the generated content is ethically appropriate. For example, when a Japanese manufacturer creates advertisements for overseas markets, they can use this system to preemptively eliminate inappropriate expressions in order to take into account the local cultural context.
[0594] Furthermore, this invention provides users with relevant ethical education content through the intervention of an artificial intelligence agent. The server collects knowledge gathered from experts in various countries, and the AI agent distributes this information to terminals as educational content, thereby promoting a sustained improvement in ethical awareness. As a result, companies can develop market strategies while taking measures to counter bias, thereby increasing social acceptance and credibility.
[0595] Finally, the consultative body has established a process for granting authentication information to companies that have implemented appropriate bias prevention measures. The server verifies that the standards have been met and issues a certification mark as a form of recognition. This certification serves as an important benchmark for consumers to confidently choose products and services.
[0596] The following describes the processing flow.
[0597] Step 1:
[0598] The user uploads the AI-generated image from their device to the server. The device also performs initial checks that the image file format and size are appropriate.
[0599] Step 2:
[0600] The server transfers the received image data to the analysis system. Here, an Asia-specific feedback algorithm analyzes the content of the image and calculates a discomfort index.
[0601] Step 3:
[0602] The server creates an index that determines whether to permit or prohibit the use of the generated image based on the calculated discomfort index. This index is quantified as an evaluation that takes into account the cultural or social elements associated with the image.
[0603] Step 4:
[0604] The server sends the generated index to the terminal as an evaluation report. The report includes the reasons for the decision on whether to permit use, as well as the specific numerical value of the discomfort index.
[0605] Step 5:
[0606] Users review the evaluation report received on their device. If necessary, they can edit the image and re-upload it to the server. They may also be prompted to receive further educational content.
[0607] Step 6:
[0608] The server automatically generates ethical education content via an artificial intelligence agent and delivers it to the terminal. This educational content is designed to deepen understanding of relevant cultural backgrounds and biases.
[0609] Step 7:
[0610] Based on the consultative body's standards, the server evaluates the discomfort index and the appropriateness of the countermeasures in place, and issues authentication information to user companies that meet the criteria. This authentication information can be accessed via terminals and used for marketing strategies.
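Steps 3 and 4 of this flow (turning the discomfort index into a permit-or-prohibit decision and packaging it as an evaluation report) might be sketched as follows. This is only an illustration under assumptions: the source does not specify a threshold, a report layout, or per-category scores, so `threshold`, the dictionary keys, and the category names `symbols`, `gestures`, and `colors` are hypothetical.

```python
import json


def build_evaluation_report(discomfort_index: float,
                            category_scores: dict,
                            threshold: float = 0.5) -> dict:
    """Return a report holding the permit/prohibit decision and its reasons."""
    permitted = discomfort_index < threshold
    # Categories whose own score reaches the threshold, highest score first.
    flagged = sorted(
        (name for name, score in category_scores.items() if score >= threshold),
        key=lambda name: category_scores[name],
        reverse=True,
    )
    if permitted:
        reason = "Discomfort index is below the threshold."
    else:
        reason = "Discomfort index is at or above the threshold."
    return {
        "discomfort_index": round(discomfort_index, 3),
        "threshold": threshold,
        "use_permitted": permitted,
        "reason": reason,
        "flagged_categories": flagged,
    }


# Example with made-up scores for a single image.
report = build_evaluation_report(
    0.62, {"symbols": 0.8, "gestures": 0.3, "colors": 0.55}
)
print(json.dumps(report, indent=2))
```

Keeping the numeric index, the threshold, and the reason together in one structure matches the description of a report that carries both the decision and the specific discomfort value.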
[0611] (Example 1)
[0612] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0613] Visual information generated using artificial intelligence may contain diverse cultural and social biases, requiring careful consideration before use. However, current systems do not adequately automatically evaluate such biases and remove inappropriate elements. Furthermore, there is a lack of ethical consideration regarding the generated visual information, and means of providing appropriate education are needed. Therefore, the challenge lies in establishing a system that ensures stakeholders take appropriate bias prevention measures and verify that all visual information they encounter is appropriate.
[0614] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0615] This invention includes a server that transmits visual information to a feedback system that evaluates bias in visual information and calculates a discomfort index based on that evaluation; a server that provides criteria for determining whether or not to allow the use of the generated visual information based on the measurement results; and a server that grants credentials to relevant parties who have established appropriate bias prevention systems. This enables relevant parties to evaluate the cultural and social appropriateness of the generated visual information and use it as ethically safe content.
[0616] "Stakeholders from diverse fields" refers to individuals and organizations from different industries and areas of expertise who cooperate to achieve a common goal.
[0617] A "collaborative entity" is an organizational framework that brings together knowledge and technology to work towards a specific goal.
[0618] "Artificial intelligence" refers to systems and technologies that process large amounts of data and mimic human intelligence.
[0619] "Visual information" refers to digital data that can be seen with the eyes, such as images and videos.
[0620] A "discomfort index" is a numerical representation of how culturally or socially unpleasant the content of visual information is.
[0621] "Bias assessment" is the process of judging and measuring the degree of cultural or social bias contained in visual information.
[0622] "Ethical education" refers to activities that aim to raise awareness by providing information and knowledge about cultural and social ethics related to visual information.
[0623] This invention provides a system for evaluating the cultural and social appropriateness of visual information generated using artificial intelligence and for promoting ethically appropriate content use. This system consists of multiple components, including a user's terminal, a server, and an artificial intelligence agent.
[0624] Users upload visual information generated using their own devices to a server. The device uses either a standard internet browser or a dedicated application. Upon receiving the visual information from the user, the server sends the data to an Asia-specific feedback system. This system uses an image analysis algorithm to evaluate the presence or absence of bias and calculate a discomfort index. This algorithm extracts features related to the color, composition, and content of the visual information and quantifies the degree of bias by comparing them with existing cultural databases.
[0625] The server generates criteria for granting permission to use visual information based on the calculated discomfort index and sends them to the terminal. This allows users to verify whether the visual information is within ethically acceptable limits. The server also includes an ethics education function, providing users with appropriate educational content through an artificial intelligence agent. This content is a compilation of the knowledge of ethics experts from various countries and helps users raise their ethical awareness when using visual information.
[0626] As a concrete example, when a Japanese advertising production company creates advertisements for overseas markets, using this system will enable them to create appropriate content that takes into account the local cultural background. An example of a prompt message is, "Analyze what cultural biases are contained in the AI-generated image and calculate the discomfort index." Entering this prompt into the server will initiate the specified analysis process.
[0627] This system allows stakeholders to quickly and effectively determine whether the generated visual information is appropriate, thereby increasing the credibility and safety of the content for society.
[0628] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0629] Step 1:
[0630] Users upload visual information generated using their own devices to the server. The device reads the visual information file selected by the user and sends it to the server's designated upload portal via the internet. The input is a visual information file, and the output is data transferred to the server.
[0631] Step 2:
[0632] The server receives visual information sent by the user. After receiving the information, the server checks the image format and size and performs a security scan. The input here is visual information data from the user, and the output is clean data ready to be passed to the feedback system.
[0633] Step 3:
[0634] The server sends the clean visual information data to an Asia-specific feedback system. Here, an image analysis algorithm operates to evaluate the bias of the visual information. The input is the received image data, and the output is a discomfort index indicating the degree of bias. In this process, the system analyzes the cultural and social elements contained in the visual information and calculates a numerical value while comparing it with relevant databases.
[0635] Step 4:
[0636] The server generates criteria for granting permission to use visual information based on the discomfort index. The generated criteria are structured as specific indicators of whether the information can be used. The input is the discomfort index, and the output is a criteria document available to the user.
[0637] Step 5:
[0638] The server sends the usage permission criteria to the user's device. The user reviews these criteria on their device and obtains information to determine whether the visual information will be used appropriately. The input is the usage permission criteria, and the output is the evaluation result displayed on the user's device.
[0639] Step 6:
[0640] The server delivers ethical education content through an artificial intelligence agent and displays it on the user's terminal. This educational content instructs users on how to use visual information in an ethically responsible manner. The input is educational data based on expert knowledge, and the output is educational content provided to the user.
[0641] This series of steps allows users to verify the ethical applicability of the generated visual information and use it in a socially desirable manner.
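The format and size check that the server performs in Step 2 could look roughly like the Python sketch below. It is an illustration under assumptions rather than the disclosed implementation: the size limit `MAX_UPLOAD_BYTES`, the set of accepted formats, and the function names are hypothetical, and the security scan mentioned in Step 2 is not modeled. The byte signatures themselves are the standard leading bytes of PNG, JPEG, and GIF files.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # hypothetical 10 MiB upload limit

# Leading "magic" bytes that identify common image file formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


def detect_format(data: bytes) -> Optional[str]:
    """Return the image format named by the file signature, or None."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def validate_upload(data: bytes) -> str:
    """Check size and format; return the detected format or raise ValueError."""
    if not data:
        raise ValueError("empty upload")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("upload exceeds the size limit")
    fmt = detect_format(data)
    if fmt is None:
        raise ValueError("unsupported or unrecognized image format")
    return fmt
```

Checking the leading bytes instead of trusting the file extension is one common way to make sure the data handed to the feedback system really is an image.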
[0642] (Application Example 1)
[0643] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0644] Images generated by artificial intelligence can contain cultural and social biases, potentially resulting in inappropriate or misleading content. Such situations can lead to a decline in social credibility and ethical problems in the market. Therefore, there is a need for systems that can evaluate the ethical appropriateness of generated images and help mitigate bias.
[0645] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0646] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; means for providing an index for determining whether the generated image is usable based on the measurement results; means for assigning authentication information to entities that have established appropriate bias countermeasures; means for providing guidance information to support the reduction of cultural or social bias; and means equipped with a computing device for verifying the ethical appropriateness of the image in real time. This makes it possible to evaluate the ethical appropriateness of the generated content and develop appropriate market strategies.
[0647] A "consultative body" is an organizational structure in which entities from diverse industries participate to share knowledge and technology.
[0648] "Artificial intelligence" refers to a computer program or system that mimics human perception and reasoning.
[0649] The "image discomfort index" is a numerical indicator that quantifies the cultural and social biases and inappropriateness inherent in a generated image.
[0650] "Authentication information" refers to information that serves as proof of reliability, granted to entities that possess appropriate bias prevention systems.
[0651] "Cultural bias" refers to a state that includes elements that may be inappropriate or misleading in a particular culture or society.
[0652] "Social prejudice" refers to a state that reflects biases and preconceptions based on social background and values.
[0653] "Ethical compliance" refers to the state in which the generated content meets generally accepted ethical standards.
[0654] A "calculating device" is a device used to perform specific calculations quickly and accurately.
[0655] "Guidance information" refers to information that provides guidance or directions for a specific purpose.
[0656] This invention relates to a system for evaluating the ethical appropriateness of images generated by artificial intelligence and for implementing appropriate bias countermeasures. The following procedures and configurations are necessary to realize this system.
[0657] First, the user uploads an AI-generated image created using their own device to the server. Upon receiving this image, the server first evaluates the potential for cultural and social bias through image analysis. This is done using software libraries such as OpenCV and TensorFlow. As a result of the analysis, a discomfort index is calculated, quantifying bias and inappropriate elements. Based on this discomfort index, the server creates an index to determine whether the image is usable and provides it to the user. This allows the user to confirm whether the image is ethically appropriate.
[0658] The server also provides guidance information to help account for culture-specific biases. This information is generated from a knowledge database compiled by the consultative body. This allows users to verify whether the generated images are appropriate for a particular socio-cultural context.
[0659] Furthermore, the server uses artificial intelligence agents to provide ethics-related educational content. This content is based on insights gathered from experts in various countries and helps users continuously enhance their ethical awareness. This allows users to increase their social credibility in the market.
[0660] For example, when a Japanese advertising production company promotes a new product to the Asian market, it can use this system to evaluate the discomfort index and eliminate cultural biases, thereby developing an effective advertising strategy.
[0661] Examples of prompt messages are as follows:
[0662] "The target market for this advertisement is India. As a commercial advertisement, it should respect India's cultural context. Please analyze this AI-generated image and provide a discomfort index. Please also provide guidelines for any necessary revisions."
[0663] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0664] Step 1:
[0665] The user uploads an AI-generated image from their device to the server. The AI image is provided as input, and the server receives the image file. The server holds this image and prepares it for subsequent processing.
[0666] Step 2:
[0667] The server analyzes the received AI images. At this stage, it processes the images using OpenCV or TensorFlow to extract features for detecting potential cultural and social biases. The input is the image data received in step 1, and the output is the feature data resulting from the analysis. The server then prepares to calculate the discomfort index based on this data.
[0668] Step 3:
[0669] The server calculates a discomfort index based on the analysis results from step 2. Using a numerical algorithm, it quantifies the extent to which specific cultural or social elements are present and generates a discomfort index. The input is feature data, and the output is numerical data as the discomfort index. The server uses this discomfort index to determine whether the image is usable.
[0670] Step 4:
[0671] The server uses the discomfort index to create an index indicating whether an image is usable. The input is a numerical discomfort index, and the output is visualized data as an index. The server notifies the user of this index to help them decide whether the image is ethically appropriate.
[0672] Step 5:
[0673] The server provides guidance information to mitigate cultural and social biases. This information is presented to the user from a knowledge database compiled by a consultative body. The input is the discomfort index and related databases, and the output is text data as guidance information. The user uses this information to modify the generated images.
[0674] Step 6:
[0675] The server delivers ethical education content through an artificial intelligence agent. This aims to continuously improve ethical awareness and supports education by providing users with relevant content. Input consists of data collected from experts in various countries, and output is presentation materials as educational content.
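Steps 3 and 5 can be made concrete with a small Python sketch. It assumes that the feature extraction of Step 2 (for example with OpenCV or TensorFlow) has already produced one score in the range 0 to 1 per feature; the feature names, the weights in `FEATURE_WEIGHTS`, and the entries of the `GUIDANCE` knowledge base are hypothetical placeholders, since the source does not disclose the numerical algorithm or the database contents.

```python
# Hypothetical per-feature weights; the source does not specify them.
FEATURE_WEIGHTS = {
    "religious_symbol": 0.5,
    "gesture": 0.3,
    "color_scheme": 0.2,
}

# Hypothetical stand-in for the knowledge database compiled by the consultative body.
GUIDANCE = {
    "religious_symbol": "Review religious symbols with a local expert.",
    "gesture": "Check that hand gestures are neutral in the target market.",
    "color_scheme": "Confirm color connotations for the target market.",
}


def discomfort_index(features: dict) -> float:
    """Weighted average of per-feature scores, each assumed to lie in [0, 1]."""
    total_weight = sum(FEATURE_WEIGHTS.values())
    weighted = sum(
        weight * features.get(name, 0.0)
        for name, weight in FEATURE_WEIGHTS.items()
    )
    return weighted / total_weight


def guidance_for(features: dict, min_score: float = 0.5) -> list:
    """Return guidance text for every feature scoring at or above min_score."""
    return [
        GUIDANCE[name]
        for name in FEATURE_WEIGHTS
        if features.get(name, 0.0) >= min_score
    ]


# Example with made-up feature scores from Step 2.
scores = {"religious_symbol": 0.8, "gesture": 0.1, "color_scheme": 0.6}
print(discomfort_index(scores))
print(guidance_for(scores))
```

A weighted average keeps the index on the same 0 to 1 scale as its inputs, which makes a single usability threshold in Step 4 easy to interpret; other aggregation rules, such as taking the maximum feature score, would be equally compatible with the description.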
[0676] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0677] This invention is a system for mitigating bias and promoting ethical use of images generated by artificial intelligence, and further incorporates an emotion engine that recognizes and utilizes user emotions. Through a consultative body in which entities from various industries participate, this system measures the discomfort index of generated images, provides an index for determining permission to use based on the results, and grants authentication information to entities that have implemented appropriate bias countermeasures.
[0678] First, the user uploads an AI-generated image to the server using their device. The device incorporates an emotion engine that collects emotional data in real time from the user's facial expressions and voice. This emotional data is used to obtain subjective feedback on how the user perceives the image.
[0679] Upon receiving an uploaded image, the server first sends the image to a feedback system and calculates the discomfort index using an Asia-specific algorithm. Simultaneously, it integrates emotional data obtained from the emotion engine and, taking into account the user's emotional response, performs a more refined discomfort index evaluation. The influence of emotional data on the discomfort index is adjusted based on pre-set weighting parameters.
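One simple reading of the weighting described above is a convex combination of the two signals. This form is an assumption, not something stated in the source; the symbols $D_{\mathrm{img}}$ (image-based discomfort index), $E$ (score derived from the emotional data), and $w$ (pre-set weighting parameter) are introduced here only for illustration, and all three are taken to lie in $[0, 1]$.

```latex
D_{\mathrm{adj}} = (1 - w)\, D_{\mathrm{img}} + w\, E, \qquad 0 \le w \le 1
```

For example, with $D_{\mathrm{img}} = 0.40$, $E = 0.70$, and $w = 0.25$, the adjusted index is $0.75 \times 0.40 + 0.25 \times 0.70 = 0.475$. Setting $w = 0$ ignores the emotional data entirely, while larger values of $w$ let the user's reaction shift the result further.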
[0680] Based on the calculated discomfort index and sentiment analysis results, the server creates reasonable criteria for determining whether to permit or prohibit the use of the image, and sends this as a detailed evaluation report to the terminal. The evaluation report includes feedback derived from the user's emotional state, along with the applicability of the image.
[0681] For educational purposes, this system also provides customized ethical education content based on the user's emotional data, which the server collects using the emotion engine. This content is delivered via an AI agent, providing users with learning opportunities to improve their ethical awareness.
[0682] As a concrete example, a Japanese advertising company could use this invention to pre-evaluate generated images when launching a campaign targeting the entire Asian region. Users can express their individual emotions towards the images through the emotion engine, and by incorporating these into the final evaluation, it becomes possible to select more culturally appropriate representations. This allows companies to make decisions that take diversity and acceptability into greater consideration in their market approach strategies.
[0683] The following describes the processing flow.
[0684] Step 1:
[0685] The user uploads image data generated by artificial intelligence from their device to the server. At this point, the device also activates an emotion engine, collecting emotional data in real time from the user's facial expressions and voice.
[0686] Step 2:
[0687] The server receives the uploaded image and sends it to a feedback system to calculate a discomfort index. This system analyzes the image content using a region-specific algorithm.
[0688] Step 3:
[0689] The server simultaneously receives user emotion data transmitted from the emotion engine and incorporates this data into the calculation of the discomfort index. This process quantifies the user's subjective evaluation, allowing for a more accurate assessment of the image's cultural appropriateness.
[0690] Step 4:
[0691] The server creates an index for determining whether an image is usable, based on the integrated discomfort index. This index reflects the potential impact of the image and the user's emotional response.
[0692] Step 5:
[0693] The server delivers the generated metrics and analysis results to the terminal as an evaluation report. The terminal receives this report, and the user makes corrections or re-evaluates the images based on the report.
[0694] Step 6:
[0695] The server utilizes an emotion engine to generate ethical education content tailored to the user's emotional data and delivers it to the device via an AI agent. This content aims to promote cultural understanding and improve ethical awareness.
[0696] Step 7:
[0697] Users utilize the provided educational content to re-evaluate generated images and engage in creative production from an ethical perspective. This enables companies to choose culturally and cognitively appropriate approaches.
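Step 4 of the flow above, deriving a usability decision from the integrated discomfort index, might be sketched as follows. The threshold values, the `Usability` enum, and in particular the intermediate "review" band are assumptions made for illustration; the disclosure only states that an index for permitting or prohibiting use is created.

```python
from enum import Enum


class Usability(Enum):
    PERMIT = "permit"
    REVIEW = "review"      # hypothetical middle band, not in the disclosure
    PROHIBIT = "prohibit"


def usability_from_index(integrated_index: float,
                         permit_below: float = 0.3,
                         prohibit_from: float = 0.7) -> Usability:
    """Map an integrated discomfort index in [0, 1] to a usability decision.

    Both thresholds are hypothetical defaults chosen for this sketch.
    """
    if not 0.0 <= integrated_index <= 1.0:
        raise ValueError("integrated_index must be in [0, 1]")
    if permit_below > prohibit_from:
        raise ValueError("permit_below must not exceed prohibit_from")
    if integrated_index < permit_below:
        return Usability.PERMIT
    if integrated_index >= prohibit_from:
        return Usability.PROHIBIT
    return Usability.REVIEW


if __name__ == "__main__":
    for value in (0.1, 0.5, 0.9):
        print(value, usability_from_index(value).value)
```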
[0698] (Example 2)
[0699] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0700] As artificial intelligence technology for generating visual data advances, ethical issues inherent in the generated data, such as cultural bias and offensive elements, are increasing. Therefore, it is necessary to ethically audit the generated visual data, implement appropriate bias countermeasures, and promote the use of data that considers diversity and cultural sensitivity. Furthermore, there is a challenge in providing a healthier information environment through feedback that considers the individual feelings of users and through ethical education.
[0701] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0702] In this invention, the server includes means for calculating a discomfort index based on visual data through a consultative body in which entities from various industries participate, means for collecting user emotion data and reflecting that data in the evaluation of the discomfort index, and an information processing agent for providing educational information for ethical use. This enables the mitigation of ethical issues in the generation and use of information, adaptation to multicultural societies, and advanced ethical education.
[0703] "Visual data" refers to digital information in a visible form, including images generated by computers.
[0704] The "discomfort index" is an indicator that evaluates the degree of psychological discomfort that visual data causes to people.
[0705] "Emotional data" refers to information that indicates a user's psychological state, obtained from their facial expressions and voice.
[0706] An "information processing device" refers to a computer system that uses artificial intelligence and algorithms to analyze data and output specific results.
[0707] "Educational information" refers to informational materials that include knowledge and guidelines aimed at improving users' ethical awareness.
[0708] An "information processing agent" refers to a program that runs within an information processing device and automatically performs a specific task.
[0709] An "ethical audit" is a process of checking the ethical appropriateness of the generation and use of visual data and identifying any problems.
[0710] This invention is a system that utilizes artificial intelligence technology to support the ethical use of visual data. This system enables the appropriate use of generated visual data through collaborative data processing involving a server, terminal, and user.
[0711] First, the user uploads AI-generated visual data to the server using their own device. This device has an emotion engine built in, which collects the user's facial expressions and voice via the camera and microphone, and can analyze them in real time as emotion data.
[0712] The server receives the visual data sent by the user and calculates the discomfort index, taking the collected emotional data into account. The algorithm employs an Asia-specific method that reflects regional culture in its evaluation. The influence of the emotional data on the discomfort index is adjusted using weighting parameters.
[0713] Furthermore, the server provides users with ethical education content tailored to their needs, based on the collected emotional data. This content is delivered via an information processing agent, providing users with opportunities to improve their ethical awareness.
[0714] As a concrete example, consider a scenario where a company uses this system to evaluate the applicability of visual data generated when launching a campaign across a wide market. Users express their individual emotions through the emotion engine, and these are reflected in the final evaluation, enabling more culturally conscious content selection.
[0715] An example of a prompt to input into a generative AI model might be, "Please suggest ideas for generating images for an advertising campaign while taking into account the diverse cultures of Asia."
[0716] In this way, the system provides ethically and culturally appropriate methods throughout the entire process from the generation to the use of visual data, promoting fair and appropriate use by diverse stakeholders.
[0717] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0718] Step 1:
[0719] The user uploads AI-generated visual data to the server using their device. The input consists of image files and video data created by the AI model. The device is equipped with an emotion engine that uses the camera and microphone to record the user's facial expressions and voice in real time. The output is user emotion data, which is then used for subsequent processing.
[0720] Step 2:
[0721] The server processes the visual and emotional data received from the terminal. The inputs are the visual data obtained in step 1 and the user's emotional data. The server executes an algorithm to calculate a discomfort index based on the visual data. This algorithm uses criteria specific to Asian culture for evaluation. Emotional data influences the final discomfort index evaluation using weighting parameters. The output is the evaluation result, including the discomfort index.
[0722] Step 3:
[0723] The server creates an index to determine whether or not visual data can be used, based on the discomfort index evaluation results. The input for this step is the discomfort index evaluation results obtained from step 2. The server uses an algorithm to analyze the evaluation results and generates an index to determine the appropriateness of using the images. The output is an evaluation report that details the decision on whether or not to allow use and the reasoning behind it.
[0724] Step 4:
[0725] The server generates an evaluation report and sends it to the user's terminal. The input for this step is the usage decision indicators and evaluation report generated in step 3. The server organizes this information and outputs it as a report in a format that is easy for the user to understand. The output is an evaluation report that includes whether or not the images can be used.
[0726] Step 5:
[0727] The server provides ethics education content based on the emotional data through an information processing agent. The input for this step is the emotional data collected in step 1. Based on this, the server generates customized educational content tailored to the user's learning needs. The output is the ethics education content for the user.
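As a rough illustration of step 5, the selection of educational content from emotional data could look like the sketch below. The emotion labels, the topic catalogue `EDUCATION_TOPICS`, and the rule of picking the strongest emotion are all assumptions introduced here; the disclosure does not describe how the content is chosen.

```python
from typing import Mapping

# Hypothetical catalogue of educational topics keyed by emotion label.
EDUCATION_TOPICS = {
    "discomfort": "Case studies on imagery perceived as offensive in specific cultures",
    "confusion": "Introductory guide to cultural context in visual media",
    "neutral": "General guidelines for fair and ethical use of generated images",
}


def select_education_topic(emotion_scores: Mapping[str, float]) -> str:
    """Pick a topic for the emotion with the highest score.

    Falls back to the "neutral" topic when no emotional data is available
    or when the strongest emotion has no dedicated topic.
    """
    if not emotion_scores:
        return EDUCATION_TOPICS["neutral"]
    dominant = max(emotion_scores, key=emotion_scores.get)
    return EDUCATION_TOPICS.get(dominant, EDUCATION_TOPICS["neutral"])


if __name__ == "__main__":
    print(select_education_topic({"discomfort": 0.7, "confusion": 0.2}))
```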
[0728] (Application Example 2)
[0729] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0730] Conventional image generation systems often fail to adequately verify the ethical and cultural appropriateness of the generated images, potentially leading to the use of inappropriate images that disregard user feelings. This poses a risk of cultural misunderstandings and negative reactions in various industries, including advertising. Furthermore, a lack of ethical education for users is also a challenge.
[0731] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0732] In this invention, the server includes means for measuring a discomfort index corresponding to an image generated by artificial intelligence through a consultative body in which entities from various industries participate; sentiment analysis means for analyzing user sentiment data in real time and reflecting it in the evaluation of the image; means for providing an index for determining whether the generated image is usable based on the measurement and analysis results; and means for assigning authentication information to entities that have established an appropriate bias prevention system. This makes it possible to appropriately evaluate the ethical and cultural appropriateness of the generated image and prevent the use of inappropriate images. Furthermore, it is possible to make suggestions for improving the advertising content based on user feedback and enhance ethical education for users.
[0733] A "consultative body" is an organization or group established to bring together entities from different industries and fields of expertise to cooperate in solving common problems.
[0734] "Artificial intelligence" refers to technologies and systems designed to mimic human intellectual behavior, possessing the ability to automatically perform data analysis and decision-making.
[0735] The "discomfort index" is a quantitative metric that evaluates how much discomfort generated images and content cause to users.
[0736] "Emotional analysis tools" refer to technologies and devices that analyze a user's facial expressions and voice to infer their emotional state.
[0737] "Indicators for determining usability" are standards or rules used to determine whether a generated image is ethically and culturally appropriate.
[0738] A "bias prevention system" refers to measures and procedures established to mitigate biases in generated images and data and to ensure fair processing.
[0739] "Authentication information" refers to digital or physical proof given to an entity to demonstrate that appropriate bias countermeasures have been implemented.
[0740] "Emotional data" refers to data that reflects the user's emotional state, and includes information obtained from facial expressions, voice, and other sources.
[0741] This invention provides a system for evaluating the ethical and cultural appropriateness of generated images. The system operates through the user's terminal and a server, performing real-time sentiment analysis and data evaluation.
[0742] The server receives images uploaded from the user's terminal and processes emotional data obtained from the user's facial expressions and voice using emotion analysis tools. Specifically, it captures facial images using video processing libraries such as OpenCV and analyzes the emotional data using an emotion recognition engine implemented in Python.
[0743] The analyzed sentiment data is integrated into an algorithm that calculates a discomfort index, which is used as an indicator to judge the cultural and ethical appropriateness of an image. Using an AI model, the usability of an image is determined based on the discomfort index, and the results are fed back to the user's device as a detailed evaluation report.
[0744] In this way, users can create and improve images for advertisements and products in a culturally appropriate manner. A specific application example is that advertising agencies can use this system to optimize fashion advertisements to better suit their target market.
[0745] An example of a prompt message for a generative AI model would be: "Analyze the emotions of users who view the following ad image and calculate the discomfort index. The image file is attached. Please also suggest ways to improve the ad, taking user feedback into consideration."
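The prompt quoted above could be assembled programmatically before it is passed to a generative AI model. The sketch below only builds the text; it does not call any model, and the helper name `build_prompt` and the bullet-list layout for user feedback are assumptions made for illustration.

```python
from typing import Sequence

# Prompt text taken from the example in the description.
PROMPT_TEMPLATE = (
    "Analyze the emotions of users who view the following ad image and "
    "calculate the discomfort index. The image file is attached. "
    "Please also suggest ways to improve the ad, taking user feedback "
    "into consideration."
)


def build_prompt(user_feedback: Sequence[str]) -> str:
    """Return the prompt, followed by any user feedback as a bullet list."""
    lines = [PROMPT_TEMPLATE]
    if user_feedback:
        lines.append("User feedback:")
        lines.extend(f"- {item}" for item in user_feedback)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(["The color scheme feels too aggressive"]))
```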
[0746] This system is expected to enable the provision of ethically and culturally high-quality content, contributing to increased user satisfaction.
[0747] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0748] Step 1:
[0749] The user uploads generated images from their device to the server. The user's device, such as a smartphone or computer, has a function for selecting and sending image files. When sending, the device also provides basic image information (resolution, file format, etc.).
[0750] Step 2:
[0751] The server receives uploaded images and simultaneously acquires emotional data from the user's device. The device uses its camera and microphone to capture the user's facial expressions and voice, extracting emotional data in real time. This data is obtained by analyzing emotions from facial expressions using facial recognition technology.
[0752] Step 3:
[0753] The server inputs the received image into an AI model and calculates the discomfort index of the generated image. Image analysis evaluates elements considered culturally inappropriate based on pixel data. The discomfort index is output as a numerical value as a result of the analysis.
[0754] Step 4:
[0755] The server integrates user emotion data obtained using emotion analysis tools with a discomfort index to generate an index for determining whether an image is usable. Here, an integration algorithm is used to analyze how emotion data affects the discomfort index and output the final index.
[0756] Step 5:
[0757] The server provides the user's device with indicators for determining whether an image can be used and a detailed evaluation report. This report provides the user with information to evaluate the cultural and ethical appropriateness of the image and make improvements as needed.
[0758] Step 6:
[0759] Users receive suggestions for improving images based on the feedback report. The report includes specific approaches for improvement and the next steps the user should take. This allows users to enhance the cultural relevance of the images they create.
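The evaluation report exchanged in steps 5 and 6 might be represented as a small structured record that the server serializes and sends to the user's device. The field names and the JSON format below are assumptions for illustration; the disclosure does not define a report format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class EvaluationReport:
    """Hypothetical report structure; field names are not from the disclosure."""
    image_name: str
    discomfort_index: float          # integrated index, assumed in [0, 1]
    usable: bool                     # result of the usability determination
    suggestions: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps non-ASCII suggestion text readable.
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    report = EvaluationReport(
        image_name="campaign_ad.png",
        discomfort_index=0.62,
        usable=False,
        suggestions=["Review the gesture shown in the foreground"],
    )
    print(report.to_json())
```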
[0760] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0761] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0762] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0763] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0764] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0765] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0766] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion is located, the more visible (expressed in actions) it becomes.
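The geometry of the emotion map described above (concentric rings, clock positions, left and right halves, upper and lower sides) can be modeled with polar coordinates. The class and function names below, the integer ring numbering, and the tolerance `eps` are assumptions for illustration; only the meaning of the directions is taken from the description.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MapPosition:
    ring: int          # 0 = center (most primitive); larger = more expressed in actions
    clock_hour: float  # 12 = top, 3 = right, 6 = bottom, 9 = left

    def xy(self) -> Tuple[float, float]:
        # 12 o'clock corresponds to 90 degrees; hours advance clockwise.
        angle = math.radians(90.0 - self.clock_hour * 30.0)
        return self.ring * math.cos(angle), self.ring * math.sin(angle)


def describe(position: MapPosition, eps: float = 1e-9) -> Tuple[str, str]:
    """Return (origin, valence) for a position on the map.

    origin:  "reaction" (left half), "situation" (right half), or "both"
             (on the vertical axis, i.e. directly above or below the center).
    valence: "pleasure" (upper side), "displeasure" (lower side), or "neutral".
    """
    x, y = position.xy()
    origin = "situation" if x > eps else "reaction" if x < -eps else "both"
    valence = "pleasure" if y > eps else "displeasure" if y < -eps else "neutral"
    return origin, valence


if __name__ == "__main__":
    print(describe(MapPosition(ring=2, clock_hour=3)))
```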
[0767] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
[0768] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0769] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
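Once emotion values have been obtained for each emotion on the map, determining the user's emotion can be as simple as ranking those values. The sketch below takes the values as a plain dictionary standing in for the neural network output; the function name `top_emotions` and the sample numbers are hypothetical.

```python
from typing import List, Mapping, Tuple


def top_emotions(emotion_values: Mapping[str, float],
                 k: int = 3) -> List[Tuple[str, float]]:
    """Return the k emotions with the highest values, strongest first."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(emotion_values.items(),
                    key=lambda item: item[1],
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Sample values only: neighboring emotions are given similar values,
    # mirroring the property described for the trained network.
    values = {"reassured": 0.81, "calm": 0.78, "confident": 0.75, "anxious": 0.10}
    print(top_emotions(values, k=1))
    print(top_emotions(values, k=3))
```

Returning several top-ranked emotions rather than a single one makes it visible when emotions located close together on the map receive similar values.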
[0770] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0771] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0772] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.
[0773] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0774] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0775] The following types of processors can be used as hardware resources to perform specific processing. One example is a CPU, which is a general-purpose processor that functions as a hardware resource performing specific processing by executing software, i.e., a program. Other examples include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0776] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0777] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0778] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0779] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of those aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements added, or elements replaced in the descriptions and illustrations presented above, as long as this does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0780] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted as being incorporated by reference.
[0781] The following is further disclosed regarding the embodiments described above.
[0782] (Claim 1)
[0783] Through a consultative body involving entities from diverse industries, a means for measuring discomfort in response to images generated by artificial intelligence,
[0784] A means for providing an index for determining whether an image generated based on measurement results is usable,
[0785] A means of granting authentication information to entities that have established appropriate bias prevention systems,
[0786] A system that includes this.
[0787] (Claim 2)
[0788] The system according to claim 1, comprising an algorithm for calculating the discomfort index of the generated image.
[0789] (Claim 3)
[0790] The system according to claim 1, comprising an artificial intelligence agent for providing ethical education content regarding the generated images to the relevant entity.
[0791] "Example 1"
[0792] (Claim 1)
[0793] Through a collaborative body involving stakeholders from diverse fields, a means for measuring discomfort indices related to visual information generated by artificial intelligence,
[0794] A means of providing criteria for determining whether or not to permit the use of visual information generated based on measurement results,
[0795] A means of providing qualification information to those who have established appropriate anti-bias systems,
[0796] A means for transmitting visual information to a feedback system that evaluates bias in visual information and calculates a discomfort index based on that evaluation,
[0797] Means for operating an artificial intelligence agent to provide relevant ethical education information based on the aforementioned discomfort index,
[0798] ...
[0799] A system that includes this.
[0800] (Claim 2)
[0801] The system according to claim 1, comprising an analysis algorithm for calculating an index of discomfort level of generated visual information.
[0802] (Claim 3)
[0803] The system according to claim 1, which has the function of providing ethical education information to relevant parties in relation to the generated visual information.
[0804] "Application Example 1"
[0805] (Claim 1)
[0806] Through a consultative body involving entities from diverse industries, a means for measuring discomfort in response to images generated by artificial intelligence,
[0807] A means for providing an index for determining whether an image generated based on measurement results is usable,
[0808] A means of granting authentication information to entities that have established appropriate bias prevention systems,
[0809] Means of providing guidance information to help reduce cultural or social prejudice,
[0810] A means equipped with a computing device for verifying the ethical appropriateness of an image in real time,
[0811] A system that includes this.
[0812] (Claim 2)
[0813] The system according to claim 1, comprising an algorithm for calculating the discomfort index of the generated image.
[0814] (Claim 3)
[0815] The system according to claim 1, comprising an artificial intelligence agent for providing ethical education content regarding the generated images to the relevant entity.
[0816] "Example 2 of combining an emotion engine"
[0817] (Claim 1)
[0818] Through a consultative body involving entities from diverse industries, a means for calculating a discomfort index corresponding to visual data generated by an information processing device,
[0819] A means of providing an index for determining whether or not the visual data generated based on the calculation results can be used,
[0820] A means of collecting user emotional data and reflecting that data in the evaluation of the discomfort index,
[0821] Means of providing educational information for ethical use,
[0822] A means of granting authentication information to entities that have established appropriate bias prevention systems,
[0823] A system that includes this.
[0824] (Claim 2)
[0825] The system according to claim 1, comprising an algorithm that analyzes the user's emotions in real time and evaluates the results by weighting them with a discomfort index.
[0826] (Claim 3)
[0827] The system according to claim 1, comprising an information processing agent for providing ethical education content to relevant entities based on collected emotional data.
[0828] "Application example 2 when combining with an emotional engine"
[0829] (Claim 1)
[0830] Through a consultative body involving entities from diverse industries, a means for measuring discomfort in response to images generated by artificial intelligence,
[0831] A sentiment analysis means that analyzes user sentiment data in real time and reflects it in image evaluation,
[0832] A means for providing an index for determining whether an image generated based on measurement and analysis results is usable,
[0833] A means of granting authentication information to entities that have established appropriate bias prevention systems,
[0834] A system that includes this.
[0835] (Claim 2)
[0836] The system according to claim 1, comprising an algorithm for calculating a discomfort index of a generated image and an algorithm for integrating analysis data based on the user's emotions.
[0837] (Claim 3)
[0838] The system according to claim 1, further comprising an artificial intelligence agent for providing ethical education content regarding generated images to relevant entities, and further providing suggestions for improving advertising content based on user feedback.
[Explanation of Symbols]
[0839] 10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A system comprising:
a means for measuring, through a consultative body involving entities from diverse industries, discomfort in response to images generated by artificial intelligence;
a means for providing, based on the measurement results, an index for determining whether a generated image is usable;
a means for granting authentication information to entities that have established appropriate bias prevention systems;
a means for providing guidance information to help reduce cultural or social prejudice; and
a means equipped with a computing device for verifying the ethical appropriateness of an image in real time.
2. The system according to claim 1, comprising an algorithm for calculating the discomfort index of the generated image.
3. The system according to claim 1, comprising an artificial intelligence agent for providing ethical education content regarding the generated images to the relevant entity.