system
A system analyzes facial features to suggest personalized makeup methods and products, addressing the challenge of selecting suitable cosmetics by using image analysis and feedback mechanisms for enhanced accuracy.
Patent Information
- Authority / Receiving Office: JP
- Patent Type: Applications
- Current Assignee / Owner: SOFTBANK GROUP CORP
- Filing Date: 2024-12-16
- Publication Date: 2026-06-26
AI Technical Summary
Selecting an optimal makeup method suitable for each individual user is difficult without professional knowledge, and finding an ideal cosmetic product from a vast array of options is challenging.
A system that acquires a user's facial image, extracts facial features using image analysis technology, compares them with a past database to suggest suitable makeup methods, and provides instructions on how to purchase relevant cosmetics, including visual guides and feedback mechanisms to improve accuracy.
Enables users to easily find personalized makeup methods and products, enhancing their beauty routine by providing tailored suggestions and improving accuracy through user feedback.
Abstract
Description
Technical Field
[0001] The technology of the present disclosure relates to a system.
Background Art
[0002] Patent Document 1 discloses a method for controlling a persona chatbot, which is performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.
Prior Art Documents
Patent Documents
[0003]
Patent Document 1
Summary of the Invention
Problems to be Solved by the Invention
[0004] Selecting an optimal makeup method suitable for each individual user is difficult for ordinary people without professional knowledge. Also, in the process of selecting cosmetics that the user actually uses, it is not easy to find an ideal product from a huge number of options. There is a need for means to solve such problems so that users can easily find a makeup method suitable for themselves and efficiently purchase related products.
Means for Solving the Problems
[0005] This invention provides a system that acquires a user's facial image, extracts facial features using image analysis technology, and automatically suggests suitable makeup methods by comparing the extracted features with a past database. Furthermore, it selects relevant cosmetics based on these suggestions and presents instructions on how to purchase them, making it easy for the user to acquire the products. The system also includes a function to generate a visual guide for the suggested makeup methods and means to improve the accuracy of the suggestions by incorporating user feedback into the database.
[0006] "Users" refers to individuals who use this system to analyze their facial images and receive suggestions for makeup techniques.
[0007] "Face image" refers to digital image data that captures the entire or a part of a user's face.
[0008] "Image analysis" refers to the technique of extracting features from facial images and processing them as numerical data using specific algorithms.
[0009] "Features" refer to information extracted from facial images, such as the shape of the face, skin color, and the position of the eyes and mouth, which are then quantified as data.
[0010] A "database" is a collection of information that has been accumulated from past cosmetic methods and product information, and is used in the matching process.
[0011] "Makeup techniques" refer to makeup methods and styles suggested based on the user's facial features.
[0012] "Products" refers to cosmetics and beauty-related products used in connection with the proposed makeup method.
[0013] "Purchase method" refers to information regarding the means of obtaining the proposed product and the purchase procedure.
[0014] "Visual guide" means media content that provides visual support for implementing the proposed makeup method.
[0015] "Feedback" refers to evaluations and opinions on the proposed content provided by users, and is data that can be used to improve the accuracy of the system.
Brief Description of Drawings
[0016]
[Figure 1] It is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] It is a conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] It is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] It is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] It is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] It is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] It is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 10] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Embodiment 2 when the emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when the emotion engine is combined.
Mode for Carrying Out the Invention
[0017] Hereinafter, an example of an embodiment of the system according to the technology of the present disclosure will be described with reference to the accompanying drawings.
[0018] First, the terms used in the following description will be explained.
[0019] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).
[0020] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.
[0021] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.
[0022] In the following embodiments, a communication interface (I / F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).
[0023] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."
[0024] [First Embodiment]
[0025] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.
[0026] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.
[0027] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0028] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.
[0029] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.
[0030] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.
[0031] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.
[0032] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.
[0033] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.
[0034] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0035] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0036] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0037] This invention is a system that uses AI to analyze a user's facial image and proposes a makeup method tailored to their individual facial characteristics. The system consists of a server and a user's terminal, and the specific processing is carried out as follows.
[0038] First, the user takes a picture of their face using their device's camera and uploads it to the system. This face image is then sent from the device to the server. The server receives this image, analyzes it using an AI model, and extracts facial features. These features are numerical representations of the shape of the face, skin tone, and the relative positions of features such as the eyes and mouth.
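As an illustration of what "numerical representations" of the face could look like, the following minimal sketch packs a face shape, a skin tone, and two relative positions into a single vector. The class name, the list of face shapes, and the choice of normalized measurements are all assumptions made for illustration; the disclosure does not specify a feature layout.

```python
from dataclasses import dataclass

# Hypothetical category list; the source only says that face shape is quantified.
FACE_SHAPES = ["round", "oval", "square", "heart"]


@dataclass
class FaceFeatures:
    face_shape: str        # one of FACE_SHAPES
    skin_tone_rgb: tuple   # average skin color, 0-255 per channel (assumed encoding)
    eye_distance: float    # distance between eye centers divided by face width
    mouth_height: float    # vertical mouth position divided by face height

    def to_vector(self):
        """Return the features as a flat list of floats in the range 0..1."""
        one_hot = [1.0 if self.face_shape == s else 0.0 for s in FACE_SHAPES]
        tone = [channel / 255.0 for channel in self.skin_tone_rgb]
        return one_hot + tone + [self.eye_distance, self.mouth_height]


features = FaceFeatures("round", (224, 189, 150), 0.42, 0.78)
vector = features.to_vector()
print(vector)
```

A fixed-length vector of this kind is convenient because it can be compared directly against vectors stored with past makeup records.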
[0039] Next, the server compares the extracted features with existing data in the database to match the user's face with suitable makeup techniques. The database contains past makeup styles and success stories based on various facial features. Based on the matching results, the server suggests the most suitable makeup technique for the user.
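The comparison against the database can be pictured as a nearest-neighbor search. The sketch below is one possible reading, not the method defined by the disclosure: it assumes each stored record carries a feature vector and a technique label, and picks the record with the smallest Euclidean distance to the user's vector. The record contents and the vector layout are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical records: [is_round, is_oval, red, green, blue], values in 0..1.
RECORDS = [
    {"technique": "natural beige foundation with light pink blush",
     "vector": [1.0, 0.0, 0.88, 0.74, 0.59]},
    {"technique": "soft contour with coral lip",
     "vector": [0.0, 1.0, 0.55, 0.40, 0.30]},
]


def match_technique(query, records):
    """Return the stored record closest to the query vector."""
    return min(records, key=lambda record: euclidean(query, record["vector"]))


best = match_technique([1.0, 0.0, 0.86, 0.75, 0.60], RECORDS)
print(best["technique"])
```

A production system would more likely use an indexed similarity search, but a linear scan is enough to show the idea.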
[0040] This suggestion includes specific makeup steps and a list of recommended cosmetics. It also provides links to purchase the suggested cosmetics, allowing users to easily buy the products. Furthermore, to make the suggestions clearer, the server generates visual guides and explains the makeup process with videos and images.
[0041] After trying out the suggested makeup, users can provide feedback on the results and their impressions using their device. The server uses this feedback to update the database and further improve matching accuracy. This system makes it easier for users to discover the perfect makeup for themselves, enriching their daily beauty routine.
[0042] For example, when a user accesses the system and submits a facial image, the server can determine that the face has a round shape and a light ochre skin tone. Based on this, suggestions such as "natural beige foundation" and "light pink blush" are made, and links to related products are displayed. In this way, it is possible to customize makeup suggestions for each user and provide a high level of satisfaction.
[0043] The following describes the processing flow.
[0044] Step 1:
[0045] The user takes a picture of their face with the device's camera and uploads it to the system. The device checks the image format and resolution, and prepares it for transmission to the server.
[0046] Step 2:
[0047] The device sends the user's facial image data to the server. The transmitted data is protected through secure communication.
[0048] Step 3:
[0049] The server preprocesses the received facial images, standardizing the image resolution and removing noise. This prepares the images for improved facial recognition accuracy.
[0050] Step 4:
[0051] The server performs AI-based image analysis to extract facial features. These features include face shape, skin tone, and the position of the eyes and lips.
[0052] Step 5:
[0053] The server compares the extracted features with existing makeup technique data in the database. The database contains records of diverse aesthetic styles, and the server matches the most suitable makeup technique.
[0054] Step 6:
[0055] The server suggests the most suitable makeup routine for the user based on the matching results. The suggestion includes text and visual guides, including the cosmetics to be used and their application procedures.
[0056] Step 7:
[0057] The terminal displays makeup suggestions received from the server to the user. In addition, it provides links to purchase the suggested cosmetics online, giving the user an easy way to acquire them.
[0058] Step 8:
[0059] After the user performs the makeup application suggested by the system, they provide feedback on their experience and areas for improvement. This information is sent to the server via the terminal.
[0060] Step 9:
[0061] The server collects user feedback and updates the database to improve the accuracy of future suggestions. This allows the system to continuously evolve and provide a more personalized experience.
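The preprocessing in Step 3 (standardizing resolution and removing noise) can be sketched with two simple operations on a grayscale image held as a list of rows. Nearest-neighbor resizing and a 3x3 mean filter are assumptions chosen for brevity; the disclosure does not name specific algorithms, and the function names are hypothetical.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a grayscale image (list of rows) with nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def mean_filter(image):
    """Reduce noise by replacing each pixel with the mean of its 3x3 neighborhood."""
    h, w = len(image), len(image[0])
    smoothed = []
    for r in range(h):
        row = []
        for c in range(w):
            neighbors = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(neighbors) / len(neighbors))
        smoothed.append(row)
    return smoothed


def preprocess(image, size=4):
    """Standardize the resolution to size x size, then denoise."""
    return mean_filter(resize_nearest(image, size, size))


result = preprocess([[0, 100], [100, 200]])
print(len(result), len(result[0]))
```

In practice an image library would handle color images and better interpolation; the sketch only shows the order of operations.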
[0062] (Example 1)
[0063] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0064] In today's world, there is a demand for simple and effective beauty methods tailored to individual users. To achieve this, it is necessary to accurately analyze the user's facial characteristics and provide optimal beauty suggestions based on that analysis. Furthermore, it is crucial that these suggestions are visually understandable and that the accuracy of these suggestions is continuously improved by utilizing user feedback.
[0065] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0066] In this invention, the server includes means for acquiring a user's facial image, means for analyzing the acquired facial image and extracting features by quantifying the shape of the face, skin tone, and the positions of the eyes and mouth, and means for comparing the extracted features with past information and proposing a suitable beauty method. This makes it possible to propose personalized beauty methods to the user. Furthermore, by using a generative AI model with natural language processing technology to effectively generate proposals and create visual guides, it becomes possible to make the makeup procedure easier for the user to understand. In addition, the information can be updated based on feedback obtained from the user, improving the accuracy of the suggestions.
[0067] A "user" refers to an individual who uses this system to analyze their facial image and receive suggestions for beauty treatments.
[0068] A "face image" refers to digital image data of a user's face, which is used for analysis.
[0069] "Features" refer to numerical data extracted from facial images, such as facial shape, skin tone, and the position of eyes and mouth.
[0070] "Past information" refers to data on past makeup styles and success stories accumulated in the database.
[0071] "Beauty methods" refer to specific makeup procedures and combinations of cosmetics that are suggested based on individual characteristics.
[0072] "Natural language processing technology" refers to the technology used to analyze, understand, and generate human language using computers.
[0073] A "generative AI model" refers to a program model that uses AI technology to generate new text or suggestions from data.
[0074] A "visual guide" refers to videos or images created to clearly explain a proposed beauty method.
[0075] "Feedback" refers to information about the results and impressions of makeup application provided by users, and this information is used to improve the system.
[0076] This invention is a system that uses AI technology to analyze a user's facial image and propose a beauty treatment method tailored to their individual needs. The system mainly consists of a server and user terminals, and processing progresses through communication between each terminal and the server.
[0077] First, the user takes a picture of their face using the camera on their device. The device then uses communication technology to send the captured face image to the server. Wi-Fi or mobile data is typically used for data transmission, and HTTP or HTTPS are commonly used protocols.
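A minimal sketch of how a terminal might package the face image for upload is shown below. It builds a multipart POST request with the Python standard library and does not send it. The endpoint URL, the form field name "face_image", and the decision to reject plain HTTP are assumptions for illustration; the text above allows either HTTP or HTTPS.

```python
import urllib.request
import uuid


def build_upload_request(url, image_bytes, filename="face.jpg"):
    """Build (but do not send) a multipart/form-data POST carrying the image."""
    if not url.startswith("https://"):
        # Assumption of this sketch: face images are only sent over HTTPS.
        raise ValueError("face images should be sent over HTTPS")
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="face_image"; filename="{filename}"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        url,
        data=head + image_bytes + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical endpoint; the three bytes stand in for real JPEG data.
request = build_upload_request("https://example.com/api/face-images", b"\xff\xd8\xff")
print(request.get_method(), request.full_url)
# Passing the request to urllib.request.urlopen would perform the actual upload.
```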
[0078] The server receives the transmitted facial image and analyzes the image data. This analysis applies an AI model built using a library such as TensorFlow (registered trademark) or PyTorch to extract features such as facial shape, skin tone, and the position of the eyes and mouth.
[0079] After features are extracted, the server compares them with historical information stored in a database. SQL or NoSQL databases are used in this comparison process. Based on the comparison results, the server suggests the most suitable beauty routine for the user. This suggestion is generated using a generative AI model and is expressed in natural language, including specific makeup steps and a list of recommended cosmetics.
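For the SQL case, the lookup might resemble the following sketch, which uses an in-memory SQLite database. The table name, column names, sample rows, and the "success_score" ranking are all hypothetical; the disclosure only states that past makeup styles and success stories are stored and compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE makeup_styles ("
    "id INTEGER PRIMARY KEY, face_shape TEXT, skin_tone TEXT, "
    "technique TEXT, success_score REAL)"
)
conn.executemany(
    "INSERT INTO makeup_styles (face_shape, skin_tone, technique, success_score) "
    "VALUES (?, ?, ?, ?)",
    [
        ("round", "light ochre", "natural beige foundation, light pink blush", 0.9),
        ("round", "light ochre", "sheer tint, coral blush", 0.7),
        ("oval", "fair", "rose blush, soft brown liner", 0.8),
    ],
)


def find_styles(connection, face_shape, skin_tone, limit=3):
    """Return past styles for matching features, most successful first."""
    return connection.execute(
        "SELECT technique, success_score FROM makeup_styles "
        "WHERE face_shape = ? AND skin_tone = ? "
        "ORDER BY success_score DESC LIMIT ?",
        (face_shape, skin_tone, limit),
    ).fetchall()


styles = find_styles(conn, "round", "light ochre")
print(styles)
```

Exact-match columns keep the query simple; numeric features would need a range or distance-based comparison instead.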
[0080] To provide visually clear suggestions, the server generates video and image-based visual guides. Video editing software such as Adobe Premiere Pro or Final Cut Pro is used to create these visual guides. The created guides are provided to the user as links.
[0081] As a concrete example, when a user submits an image of their face to the system, the server determines that the face shape is round and the skin tone is light ochre. Based on this analysis, it recommends cosmetics such as "natural beige foundation" and "light pink blush." In this way, it is possible to provide makeup suggestions tailored to the user. The following prompt is used for the generative AI model: "Generate beauty recommendations based on features extracted from the user's face image. These features include a round face shape and light ochre skin tone."
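A prompt of this form could be assembled from the extracted features along the following lines. This is a sketch only: the function name and the dictionary layout are assumptions, and the call to the generative AI model itself is omitted because the disclosure does not name a specific model or API.

```python
def build_prompt(features):
    """Turn extracted features into a prompt in the style quoted in the text."""
    if not features:
        raise ValueError("at least one feature is required")
    parts = [f"{value} {name}" for name, value in features.items()]
    if len(parts) > 1:
        described = ", ".join(parts[:-1]) + " and " + parts[-1]
    else:
        described = parts[0]
    return (
        "Generate beauty recommendations based on features extracted from "
        "the user's face image. "
        f"These features include {described}."
    )


prompt = build_prompt({"face shape": "round", "skin tone": "light ochre"})
print(prompt)
```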
[0082] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0083] Step 1:
[0084] The user takes a picture of their face using the device's camera. The input is the face image acquired through the camera. Specifically, the user launches the camera app and correctly frames their face on the screen to take a picture. This image must be high resolution as it will be used in subsequent analysis processing.
[0085] Step 2:
[0086] The device sends the captured facial image to the server. The input is the captured facial image, and the output is the digital image data sent to the server. The device uploads the image to the server via the internet using the HTTP or HTTPS protocol. This transmission process is initiated when the user taps the "Send" button.
[0087] Step 3:
[0088] The server analyzes the received facial images. The input is a facial image sent from the terminal, and the output is digitized features. The server uses an AI model to analyze the image and extract features such as facial shape, skin tone, and the position of the eyes and mouth. The specific techniques used here are image processing algorithms using libraries such as TensorFlow and PyTorch.
[0089] Step 4:
[0090] The server compares the extracted features with historical information in the database. The input is the features, and the output is a suggestion of suitable beauty methods. The server uses SQL or NoSQL databases to search for past makeup styles with similar features and refers to particularly successful cases. Through this process, the user receives the most suitable beauty suggestions.
[0091] Step 5:
[0092] The server uses a generative AI model to generate suggestions as text and provides them to the user. The input is the result of matching against the database, and the output is a beauty method suggestion expressed in natural language. To generate the suggestion, the server inputs a prompt into the generative AI model, for example: "Generate a beauty treatment based on features extracted from the user's facial image. The features include a round face shape and a light ochre skin tone."
[0093] Step 6:
[0094] The server creates visual guides based on the suggestions, using videos and images. The input is the suggested beauty method, and the output is a digital guide showing the makeup process that the user can view. Adobe Premiere Pro and Final Cut Pro are used for video editing. The generated guides are provided as links for the user to use.
[0095] Step 7:
[0096] Users try out makeup based on beauty suggestions received from the server and send their results and impressions as feedback to the server. The input is the user's actions, and the output is feedback data. Users enter their results into a dedicated feedback form and tap the "Submit" button to return the information to the server.
[0097] Step 8:
[0098] The server receives feedback from users and updates the database. The input is the feedback data, and the output is the updated database. The feedback is used as further training data for the AI model, contributing to improved accuracy of suggestions. This improves the overall system performance.
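The feedback step (Step 8) can be illustrated with a much simpler stand-in than retraining an AI model: keeping a running average rating per suggested technique. The 1 to 5 rating scale, the dictionary used as a database, and the function name are assumptions for this sketch.

```python
def apply_feedback(database, technique, rating):
    """Fold one user rating (1-5) into the running mean stored for a technique."""
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    record = database.setdefault(technique, {"count": 0, "mean_rating": 0.0})
    record["count"] += 1
    # Incremental mean: new_mean = old_mean + (x - old_mean) / n
    record["mean_rating"] += (rating - record["mean_rating"]) / record["count"]
    return record


feedback_db = {}
apply_feedback(feedback_db, "natural beige foundation", 5)
apply_feedback(feedback_db, "natural beige foundation", 3)
print(feedback_db)
```

The stored mean could then be used to rank candidate techniques during matching, so that well-received suggestions surface first.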
[0099] (Application Example 1)
[0100] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."
[0101] In today's beauty market, it is difficult for consumers to find makeup techniques and beauty products that suit them, and opportunities to actually try out cosmetics in stores to see which ones are right for them are limited. This problem contributes to decreased consumer satisfaction. Furthermore, the lack of convenient ways to try out and purchase suggested makeup techniques and beauty products means that consumer purchasing intent is not being fully stimulated.
[0102] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0103] In this invention, the server includes a device for acquiring a user's facial image, a device for analyzing the acquired facial image and extracting facial features, a device for comparing the extracted features with past data and proposing a suitable beauty method, a device for selecting consumer goods related to the proposed beauty method and providing means for purchasing them, and means for trying out the proposed consumer goods in a store. This enables the user to find the makeup method and beauty products that are best suited to them, effectively try them out, and then make an appropriate purchasing decision.
[0104] An "apparatus" is a machine or device configured to perform a specific function.
[0105] A "server" is a computer system that provides information and services to other computers via a network.
[0106] "User" refers to an individual or group that uses the system or service.
[0107] A "face image" is a still image or video data of a person's face.
[0108] A "feature" is a numerical value or indicator that represents a specific pattern or trend in data analysis.
[0109] "Matching" is the process of comparing one dataset with another to identify similarities and differences.
[0110] "Beauty treatments" refer to a series of procedures and techniques used to improve the appearance of the face and body according to specific purposes.
[0111] "Consumer goods" are products that consumers purchase for personal use or consumption.
[0112] "Trial use" refers to the act of actually using a product or service before purchasing it to check its effectiveness and how it feels to use.
[0113] "Means of purchase" refers to the methods and processes used to buy goods or services.
[0114] To implement this invention, a system is constructed that primarily utilizes a user terminal, a server, and an AI model. The user terminal is equipped with a camera and used for image acquisition. The user takes a picture of their face using the terminal's camera and uploads it to the server via the internet.
[0115] The server analyzes the received facial images using an AI model. The AI model is built on frameworks such as TensorFlow and PyTorch, and it extracts features from the facial images. This model extracts elements such as facial shape, skin tone, and the relative positions of facial features such as eyes and mouth as numerical data.
[0116] The server compares the extracted features with a pre-stored database. This database contains past makeup styles and success stories, and uses this information to suggest appropriate beauty methods. These suggestions include specific makeup steps and recommended cosmetics. Furthermore, the server generates links to provide purchasing options, allowing users to easily buy the suggested cosmetics.
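Link generation for the suggested cosmetics might look like the sketch below, which builds a search URL for each product name. The shop URL and the "q" query parameter are hypothetical; the disclosure does not say which store or URL scheme is used.

```python
from urllib.parse import urlencode

# Hypothetical online shop search endpoint.
SHOP_SEARCH_URL = "https://shop.example.com/search"


def purchase_link(product_name):
    """Return a search link for one suggested product, with the name URL-encoded."""
    return f"{SHOP_SEARCH_URL}?{urlencode({'q': product_name})}"


links = [purchase_link(name) for name in ("light beige foundation", "peach pink lipstick")]
print(links)
```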
[0117] Furthermore, when the system is used in a store, users can try out the suggested consumer goods on the spot, which makes immediate purchases more convenient.
[0118] For example, when a user accesses the system and sends a facial image, the server immediately analyzes the facial contours and skin tone and suggests products such as "light beige foundation" and "peach pink lipstick." This information is visually displayed on the user's device, allowing consumers to try it immediately. An example of a prompt message would be: "Based on the user's facial data, please suggest the most suitable makeup method. Please provide detailed instructions on the cosmetics to use and the application procedure, taking into account skin tone, face shape, and eye features."
[0119] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0120] Step 1:
[0121] The user uses the device's camera to capture an image of their face. At this point, the input is the camera image, and the output is the facial image data. The user saves this facial image and prepares to upload it to the system.
[0122] Step 2:
[0123] The terminal sends the acquired facial image to the server. The input is the user's facial image, and the output is the image data transferred to the server. This image data reaches the server via the internet.
[0124] Step 3:
[0125] The server analyzes the received image data by running it through an AI model. The input is the user's facial image data, and the output is facial features. Specifically, it analyzes the facial contour, eye position, and skin tone, and extracts them as numerical data. A generative AI model is used for this analysis.
[0126] Step 4:
[0127] The server compares the extracted features with existing data. The input is facial features, and the output is a profile of suitable beauty treatments. In this process, the current features are compared with past beauty profiles in the database to select the most suitable beauty suggestion.
[0128] Step 5:
[0129] The server suggests beauty methods to the user and generates data to display them visually. The input is a profile of the beauty method, and the output is a visual guide and a list of cosmetics. The visualized suggestions are sent to the user's terminal, which the user receives and confirms.
[0130] Step 6:
[0131] Based on the presented beauty methods, users can try out consumer goods within the store. The input is a visual guide and information about the consumer goods, and the output is the trial result. Through this process, users can gain a feel for actually using the consumer goods.
[0132] Step 7:
[0133] The server receives user feedback and updates the database. The input is the user's satisfaction and usability feedback, and the output is the updated database. This feedback contributes to improving the accuracy of future suggestions.
[0134] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0135] This invention is a system that uses AI technology to analyze a user's facial image and expressions, determine their emotional state, and suggest the optimal makeup application. The system consists of a user's terminal and a server, and includes an emotion engine to provide a more personalized experience.
[0136] First, the user takes a picture of their face with the device's camera and uploads it to the system. The device then sends the image data to the server. At this point, the image may be tagged with a digital tag to identify the facial attributes.
[0137] Next, the server receives the facial image, performs image analysis using an AI model, and extracts facial features. Simultaneously, the emotion engine performs facial expression analysis to identify the user's emotional state. The emotional state indicates how the user is feeling and, for example, whether they are preparing for a specific event.
[0138] The server then simultaneously considers the extracted features and emotional states and compares them with the database. The database contains past success stories along with makeup techniques based on diverse facial features and emotions. Based on this, the server suggests the most suitable makeup technique for the user.
[0139] The suggestions include specific cosmetic product selections and application procedures, tailored to the user's emotional state. Visual guides are also included to help users understand the makeup process more easily.
[0140] Furthermore, the device displays makeup suggestions received from the server and provides links that allow users to easily purchase the suggested cosmetics. Through these links, users can obtain related products without any hassle.
[0141] After trying out the makeup, users can provide feedback on their experience and impressions. This feedback is sent to the server via their device, and the server uses this information to improve the accuracy of its suggestions by updating its database. This cycle allows the system to continuously learn and enhance the value it provides to users.
[0142] For example, when a user accesses the system to prepare for a date, the server's emotion engine recognizes that the user is in an "excited" state. In this case, the system can meet the user's needs by suggesting makeup techniques appropriate for the situation, such as "glamorous eyeshadow" or "calm-toned lipstick," and providing access to related products.
[0143] The following describes the processing flow.
[0144] Step 1:
[0145] The user takes a picture of their face with the device's camera and uploads it to the system. The device then provides an interface where the user can select a situation or purpose along with the face image.
[0146] Step 2:
[0147] The device sends facial image data and user selection information to the server. The data is encrypted during transmission to ensure secure transmission.
[0148] Step 3:
[0149] The server processes the received facial images and extracts facial features using an AI model. Specifically, the shape of the contours, skin tone, and the position of facial features are analyzed.
[0150] Step 4:
[0151] The server simultaneously activates the emotion engine and recognizes the user's emotional state from their facial expression through image analysis. The emotional state is quantified as scores for categories such as smile, anger, and surprise.
[0152] Step 5:
[0153] The server compares the extracted facial features and the recognized emotional state against a database. The database contains makeup techniques suited to various emotional states and face shapes.
[0154] Step 6:
[0155] Based on the matching results, the server adjusts and suggests the most suitable makeup routine for the user. This suggestion reflects appropriate color choices and styles according to the user's mood and is provided along with a list of specific cosmetics.
[0156] Step 7:
[0157] The terminal displays makeup suggestions from the server to the user, along with a visual makeup guide. It also provides links to purchase the suggested cosmetics.
[0158] Step 8:
[0159] The user performs the makeup application and enters feedback about the results into the terminal. The terminal then sends this feedback to the server.
[0160] Step 9:
[0161] The server analyzes the received feedback and updates the database. The system continues to learn by utilizing the feedback to improve the accuracy of suggestions to other users.
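Steps 4 to 6 above can be illustrated with a short sketch. It assumes the emotion engine returns one score per category and that the database can be filtered by the highest-scoring category before the nearest facial features are chosen. The routine entries, the two-element feature vectors, and the function names are hypothetical and only serve the illustration.

```python
import math

# Hypothetical database rows: (emotion label, face feature vector, makeup routine).
ROUTINES = [
    ("smile", (0.8, 0.3), "bright coral lip, light blush"),
    ("smile", (0.3, 0.7), "soft pink lip, peach blush"),
    ("surprise", (0.8, 0.3), "neutral lip, matte base"),
]


def dominant_emotion(scores):
    """Pick the category with the highest score from the emotion engine output."""
    return max(scores, key=scores.get)


def suggest_routine(features, scores, routines=ROUTINES):
    """Filter routines by dominant emotion, then take the nearest face features."""
    emotion = dominant_emotion(scores)
    # Fall back to all routines if no entry exists for this emotion.
    candidates = [r for r in routines if r[0] == emotion] or routines
    best = min(candidates, key=lambda r: math.dist(features, r[1]))
    return emotion, best[2]


emotion, routine = suggest_routine(
    (0.35, 0.65), {"smile": 0.7, "anger": 0.1, "surprise": 0.2}
)
```

Filtering first and measuring distance second is one possible ordering; a combined score over both inputs would also fit the description.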
[0162] (Example 2)
[0163] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".
[0164] Conventional makeup application suggestion systems often failed to provide personalized suggestions based on the user's emotional state, tending instead to offer generic advice. This made it difficult to provide advice optimized for the user's specific situation or mood. Furthermore, the system lacked effective mechanisms for suggesting related products, making it difficult for users to easily select and purchase them.
[0165] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0166] In this invention, the server includes means for acquiring the user's visual information, means for analyzing the acquired visual information to extract facial attributes, and means for identifying the emotional state based on the extracted attributes. This makes it possible to propose an optimal makeup method that simultaneously considers the user's facial features and emotional state.
[0167] "Visual information" refers to digital information such as images and videos of the user's face.
[0168] "Facial attributes" refer to various data that describe facial features, including age, gender, and the shape of the eyes and mouth.
[0169] "Emotional state" refers to a psychological state obtained by analyzing the user's facial expressions, and includes emotions such as joy, sadness, and surprise.
[0170] "Knowledge" refers to the contents of databases collected and accumulated in the past, and includes information on makeup techniques and their effects.
[0171] "Makeup techniques" refer to the methods and procedures for applying makeup to the face, including the cosmetics used and the order in which they are applied.
[0172] "General-purpose products" refer to commercially available cosmetics and related products that are proposed for specific situations.
[0173] "Evaluation" refers to users' opinions and impressions of the proposed makeup methods, and includes feedback information used to improve the system.
[0174] This system provides technology to analyze a user's facial image, determine their emotional state, and suggest the most suitable makeup application. First, the user takes a picture of their face with their device's camera and uploads it to the system via a dedicated application or web interface. When the device sends this facial image to the server, it may attach a digital tag to identify facial attributes. The transmitted image data is received and analyzed by the server.
[0175] The server performs image analysis using advanced AI models (for example, face recognition models using TensorFlow or PyTorch). In this process, facial features are extracted, and an emotion engine further analyzes facial expressions to identify the user's emotional state. The server then compares the extracted facial features and emotional state with a database. The database contains a wealth of makeup techniques based on diverse facial features and emotions, and the server suggests the optimal makeup technique by comparing it with past success stories.
[0176] The suggestions include specific cosmetic product selections and application procedures, which are displayed to the user via their device. The suggestions also include visual guides to help users understand the makeup process. Furthermore, links are provided for easy purchase of the suggested cosmetics, allowing users to acquire related products on the spot.
[0177] For example, when a user uses the system to prepare for an event, the server's emotion engine identifies a state of high anticipation. Based on this information, the server makes suggestions such as "glamorous eyeshadow" or "calm-toned lipstick," and the terminal provides purchase links for related products, quickly and easily meeting the user's needs.
[0178] An example of a prompt for a generative AI model would be: "Please specify a suitable makeup routine when the user is identified as being in an excited state in preparation for an event. Please include specific cosmetics and their application steps." Such a prompt allows the system to obtain appropriate makeup suggestions from the model.
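As an illustration of how such a prompt might be assembled from the identified state, the following sketch fills a template with the emotional state and the occasion. The function name `build_prompt` and the template wording are assumptions; the description gives only the single example sentence, and no model call is shown.

```python
def build_prompt(emotional_state, occasion):
    """Compose an instruction for a generative model from the identified state."""
    return (
        f"The user is preparing for {occasion} and has been identified as being "
        f"in the following emotional state: {emotional_state}. "
        "Please specify a suitable makeup routine, including specific cosmetics "
        "and their application steps."
    )


prompt = build_prompt("excited", "an event")
```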
[0179] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0180] Step 1:
[0181] The user takes a picture of their face using the device's camera and uploads the image data to the system via a dedicated application or web interface. Here, the input is the face image, and the output is the image data uploaded to the system. This image is then sent to the server for subsequent processing.
[0182] Step 2:
[0183] The device uses a basic face detection algorithm to locate faces in uploaded facial images and adds digital tags. The input is a user's facial image, and the output is image data with facial location information added. This data is then transmitted to the server using the communication function.
[0184] Step 3:
[0185] The server receives image data sent from the terminal and performs image analysis using an advanced AI model. The input is tagged image data received from the terminal, and the output is facial features obtained through image analysis. Specifically, it performs feature extraction using an AI model (for example, a TensorFlow-based facial recognition model).
[0186] Step 4:
[0187] The server analyzes facial expressions using an emotion engine based on facial features to identify the user's emotional state. The input is facial features, and the output is the identified emotional state. This process allows for the identification of the user's psychological state.
[0188] Step 5:
[0189] The server accesses an internal database to match facial features with identified emotional states. The input consists of facial features and emotional states, and the output is a suggestion of the optimal makeup application based on the matching results. The database contains data on past success stories and recommended makeup applications.
[0190] Step 6:
[0191] The server generates the makeup application method deemed optimal and prepares its details (selection of cosmetics and application procedures). The input is the database information obtained from the matching, and the output is the specific suggested makeup application method, including visual guidance.
[0192] Step 7:
[0193] The terminal displays makeup suggestions received from the server to the user. The input is makeup suggestion data from the server, and the output is specific makeup instructions and purchase links shown to the user. A user-friendly design is used for the display.
[0194] Step 8:
[0195] Users try out makeup based on the suggested methods and input their results and impressions as feedback. The input is the user's feedback, and the output is its transmission to the server. This feedback is used to update the database and improve the accuracy of suggestions.
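The tagged image data passed from the terminal to the server in Steps 2 and 3 could take a form like the following. This is only a sketch under stated assumptions: the tag is modeled as a bounding box, the payload is JSON with a base64-encoded image, and the names `FaceTag`, `build_payload`, and `parse_payload` are hypothetical. The description does not specify the tag layout or the wire format.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class FaceTag:
    # Bounding box of the detected face in pixels (assumed tag layout).
    x: int
    y: int
    width: int
    height: int


def build_payload(image_bytes, tag):
    """Terminal side: bundle the image and its face-location tag as JSON."""
    return json.dumps({
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
        "face_tag": asdict(tag),
    })


def parse_payload(payload):
    """Server side: recover the image bytes and the tag."""
    data = json.loads(payload)
    return base64.b64decode(data["image_b64"]), FaceTag(**data["face_tag"])


payload = build_payload(b"\x00\x01fake-image", FaceTag(x=40, y=32, width=128, height=160))
image, tag = parse_payload(payload)
```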
[0196] (Application Example 2)
[0197] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as a "server" and the smart device 14 as a "terminal".
[0198] Currently, makeup application systems offer fixed suggestions based solely on facial features, making it difficult to provide personalized recommendations that take into account the user's emotional state. Furthermore, feedback systems for improving the accuracy and applicability of these suggestions are insufficient. Therefore, there is a need for more sophisticated makeup application methods that respond to user needs and emotions.
[0199] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0200] In this invention, the server includes means for acquiring user facial information, means for analyzing the acquired facial information to identify emotional states, means for generating adaptive makeup methods based on emotional states, means for selecting products related to the generated makeup methods and providing methods for acquiring them, and means for presenting a visual guide to the makeup methods using a visual imitation device. This makes it possible to suggest more personalized makeup methods that are tailored to the user's emotions and individual characteristics. Furthermore, the accuracy of the suggestions can be improved through a feedback function, thereby enhancing the quality of the user experience.
[0201] "User facial information" refers to data related to the user's facial image and expressions, which is acquired through cameras and sensors.
[0202] "Identifying emotional states" refers to the process of analyzing and identifying the user's feelings and mood from acquired facial information.
[0203] "Generating adaptive makeup techniques" refers to calculating and determining the optimal makeup techniques and product selections based on the user's emotional state.
[0204] "Selecting products and providing methods for obtaining them" means selecting suitable products based on the generated makeup method and providing users with information on how to obtain those products.
[0205] A "visual imitation device" is a device that visually reproduces the steps and results of a makeup application in real time and presents them to the user in an easy-to-understand manner.
[0206] The system necessary to implement this invention mainly consists of a terminal, a server, and a visual imitation device. First, the terminal is equipped with a high-precision camera and sensors, which are used to acquire the user's facial information. The facial information is collected in real time and then transmitted to the server.
[0207] The server uses a TensorFlow-based facial expression recognition model to process the received facial information. This process identifies the user's emotional state through data analysis. Based on the recognized emotional state, the server generates optimal suggestions from a variety of makeup techniques stored in the database.
[0208] Furthermore, products corresponding to the generated makeup method are selected, and how users can obtain those products is indicated. Product information is often provided as links to e-commerce sites such as online stores.
[0209] Furthermore, the server provides a visual guide to the generated makeup techniques using a visual imitation device equipped with AR technology such as MirageXR, to visually demonstrate the techniques to the user. This allows the user to easily understand and practice the proposed makeup techniques.
[0210] As a concrete example, when a user takes a photo of their face with their device before attending a social event, the server recognizes it as an "expression of joy." In this case, the server suggests "makeup that gives a glamorous and positive impression" and selects cosmetics for it. The visual imitation device demonstrates the makeup technique using an avatar, and the user can proceed with the makeup application according to the steps.
[0211] An example of a prompt to input into a generative AI model would be, "What makeup technique is appropriate when the user's facial expression is 'joyful'?"
[0212] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0213] Step 1:
[0214] The device uses a high-precision camera to acquire the user's facial information. It takes a real-time image of the user's face as input and sends this as image data to the server.
[0215] Step 2:
[0216] The server analyzes the received face image using a TensorFlow-based facial expression recognition model. The input is the face image data sent in step 1, and the data is processed to extract facial features in order to identify the emotional state. The output is the identification result indicating the user's emotional state.
[0217] Step 3:
[0218] The server generates adaptive makeup techniques from a database based on identified emotional states. It uses the identified emotional state as input and performs data calculations by matching it against various makeup patterns in the database. The output is a suggestion of the optimal makeup technique.
[0219] Step 4:
[0220] The server selects products related to the generated makeup method and sends the product information to the terminal. Taking the generated makeup method data as input, it searches the online store database and selects product information, including how to obtain each product. The output is a proposal that includes product information links.
[0221] Step 5:
[0222] The visual imitation device uses AR technology such as MirageXR to present the user with a visual guide to makeup application. It takes suggested makeup application content as input and performs real-time video processing for visualization. The output is an AR-based makeup application guide presented to the user.
[0223] Step 6:
[0224] The user selects their desired cosmetic product based on the presented product information and proceeds with the purchase. Product information links are used as input, and the selection and purchase process takes place on the online store. The output is confirmation information for the purchased product.
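The product selection in Step 4 can be sketched as a lookup that turns a generated makeup technique into product links. The catalog contents, the store address `shop.example.com`, and the function `product_links` are placeholders introduced for illustration; the description does not name a store or a query format.

```python
from urllib.parse import urlencode

# Hypothetical catalog: makeup technique keyword -> product names.
CATALOG = {
    "glamorous": ["shimmer eyeshadow", "gloss lipstick"],
    "natural": ["beige foundation", "pink blush"],
}
STORE_SEARCH_URL = "https://shop.example.com/search"  # placeholder store address


def product_links(technique):
    """Return (product, link) pairs for a generated makeup technique."""
    return [
        (name, f"{STORE_SEARCH_URL}?{urlencode({'q': name})}")
        for name in CATALOG.get(technique, [])
    ]


links = product_links("glamorous")
```

An unknown technique yields an empty list here, which leaves it to the caller to decide whether to fall back to a default suggestion.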
[0225] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0226] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. A prompt containing instructions is input to the data generation model 58, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions in the prompt and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0227] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.
[0228] [Second Embodiment]
[0229] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.
[0230] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.
[0231] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0232] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.
[0233] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.
[0234] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0235] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0236] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0237] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0238] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0239] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0240] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0241] This invention is a system that uses AI to analyze a user's facial image and proposes a makeup method tailored to their individual facial characteristics. The system consists of a server and a user's terminal, and the specific processing is carried out as follows.
[0242] First, the user takes a picture of their face using their device's camera and uploads it to the system. This face image is then sent from the device to the server. The server receives this image, analyzes it using an AI model, and extracts facial features. These features are numerical representations of the shape of the face, skin tone, and the relative positions of features such as the eyes and mouth.
[0243] Next, the server compares the extracted features with existing data in the database to match the user's face with suitable makeup techniques. The database contains past makeup styles and success stories based on various facial features. Based on the matching results, the server suggests the most suitable makeup technique for the user.
[0244] This suggestion includes specific makeup steps and a list of recommended cosmetics. It also provides links to purchase the suggested cosmetics, allowing users to easily buy the products. Furthermore, to make the suggestions clearer, the server generates visual guides and explains the makeup process with videos and images.
[0245] After trying out the suggested makeup, users can provide feedback on the results and their impressions using their device. The server uses this feedback to update the database and further improve matching accuracy. This system makes it easier for users to discover the perfect makeup for themselves, enriching their daily beauty routine.
[0246] For example, when a user accesses the system and submits a facial image, the server can determine that the face has a round shape and a light ochre skin tone. Based on this, suggestions such as "natural beige foundation" and "light pink blush" are made, and links to related products are displayed. In this way, it is possible to customize makeup suggestions for each user and provide a high level of satisfaction.
[0247] The following describes the processing flow.
[0248] Step 1:
[0249] The user takes a picture of their face with the device's camera and uploads it to the system. The device checks the image format and resolution, and prepares it for transmission to the server.
[0250] Step 2:
[0251] The device sends the user's facial image data to the server. The transmitted data is protected through secure communication.
[0252] Step 3:
[0253] The server preprocesses the received facial images, standardizing the image resolution and removing noise. This prepares the images for improved facial recognition accuracy.
[0254] Step 4:
[0255] The server performs AI-based image analysis to extract facial features. These features include face shape, skin tone, and the position of the eyes and lips.
[0256] Step 5:
[0257] The server compares the extracted features with existing makeup technique data in the database. The database contains records of diverse makeup styles, and the server selects the best-matching makeup technique.
[0258] Step 6:
[0259] The server suggests the most suitable makeup routine for the user based on the matching results. The suggestion includes text and visual guides, including the cosmetics to be used and their application procedures.
[0260] Step 7:
[0261] The terminal displays makeup suggestions received from the server to the user. In addition, it provides links to purchase the suggested cosmetics online, giving the user an easy way to acquire them.
[0262] Step 8:
[0263] After the user performs the makeup application suggested by the system, they provide feedback on their experience and areas for improvement. This information is sent to the server via the terminal.
[0264] Step 9:
[0265] The server collects user feedback and updates the database to improve the accuracy of future suggestions. This allows the system to continuously evolve and provide a more personalized experience.
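The format and resolution check that the terminal performs in Step 1 can be illustrated for one format. The sketch below assumes PNG input and reads the width and height from the IHDR chunk that follows the PNG signature. The minimum side length of 256 pixels and the function name `check_png` are assumptions, since the description gives no threshold or format list.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 256  # assumed minimum resolution; the description gives no figure


def check_png(data, min_side=MIN_SIDE):
    """Return (ok, detail) after checking the PNG signature and IHDR size."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE:
        return False, "not a PNG file"
    if data[12:16] != b"IHDR":
        return False, "missing IHDR chunk"
    # Width and height are big-endian 32-bit integers at bytes 16..23.
    width, height = struct.unpack(">II", data[16:24])
    if min(width, height) < min_side:
        return False, f"resolution {width}x{height} is too low"
    return True, f"{width}x{height}"


# Build just enough of a PNG header to exercise the check.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 640, 480)
result = check_png(header)
```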
[0266] (Example 1)
[0267] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0268] In today's world, there is a demand for simple and effective beauty methods tailored to individual users. To achieve this, it is necessary to accurately analyze the user's facial characteristics and provide optimal beauty suggestions based on that analysis. Furthermore, it is crucial that these suggestions are visually understandable and that the accuracy of these suggestions is continuously improved by utilizing user feedback.
[0269] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0270] In this invention, the server includes means for acquiring a user's facial image, means for analyzing the acquired facial image and extracting features by quantifying the shape of the face, skin tone, and the positions of the eyes and mouth, and means for comparing the extracted features with past information and proposing a suitable beauty method. This makes it possible to propose personalized beauty methods to the user. Furthermore, by using a generative AI model with natural language processing technology to effectively generate proposals and create visual guides, it becomes possible to make the makeup procedure easier for the user to understand. In addition, the information can be updated based on feedback obtained from the user, improving the accuracy of the suggestions.
[0271] A "user" refers to an individual who uses this system to analyze their facial image and receive suggestions for beauty treatments.
[0272] A "face image" refers to digital image data of a user's face, which is used for analysis.
[0273] "Features" refer to numerical data extracted from facial images, such as facial shape, skin tone, and the position of eyes and mouth.
[0274] "Past information" refers to data on past makeup styles and success stories accumulated in the database.
[0275] "Beauty methods" refer to specific makeup procedures and combinations of cosmetics that are suggested based on individual characteristics.
[0276] "Natural language processing technology" refers to the technology used to analyze, understand, and generate human language using computers.
[0277] A "generative AI model" refers to a program model that uses AI technology to generate new text or suggestions from data.
[0278] A "visual guide" refers to videos or images created to clearly explain a proposed beauty method.
[0279] "Feedback" refers to information about the results and impressions of makeup application provided by users, and this information is used to improve the system.
[0280] This invention is a system that uses AI technology to analyze a user's facial image and propose a beauty treatment method tailored to their individual needs. The system mainly consists of a server and user terminals, and processing progresses through communication between each terminal and the server.
[0281] First, the user takes a facial image using the camera of their own terminal. The terminal uses communication technology to send the captured facial image to the server. For data transmission, Wi-Fi or mobile data is generally used, and HTTP or HTTPS is commonly used as the protocol.
[0282] The server receives the sent facial image and analyzes the image data. For this analysis, an AI model built using libraries such as TensorFlow or PyTorch is applied to extract feature quantities such as the shape of the face, skin tone, and the positions of eyes and mouth.
[0283] After the feature quantities are extracted, the server compares them with the past information stored in the database. In this comparison process, SQL or NoSQL databases are used. Based on the comparison results, the server proposes the optimal beauty method to the user. This proposal is formed in natural language using a generative AI model and includes specific makeup procedures and a list of recommended cosmetics.
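For the SQL case, the comparison with past information might look like the following sketch, which uses an in-memory SQLite database and orders stored styles by squared distance to the extracted feature quantities. The table name `past_styles`, its columns, the sample rows, and the two-feature representation are all hypothetical and chosen only to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE past_styles ("
    "style TEXT, face_roundness REAL, skin_tone REAL, success REAL)"
)
conn.executemany(
    "INSERT INTO past_styles VALUES (?, ?, ?, ?)",
    [
        ("natural beige base", 0.9, 0.3, 0.8),
        ("contoured look", 0.2, 0.3, 0.9),
        ("warm bronze look", 0.9, 0.8, 0.7),
    ],
)


def most_similar_style(face_roundness, skin_tone):
    """Return the stored style with the smallest squared feature distance.

    Ties are broken in favor of the higher success score.
    """
    row = conn.execute(
        """
        SELECT style FROM past_styles
        ORDER BY (face_roundness - ?) * (face_roundness - ?)
               + (skin_tone - ?) * (skin_tone - ?) ASC,
                 success DESC
        LIMIT 1
        """,
        (face_roundness, face_roundness, skin_tone, skin_tone),
    ).fetchone()
    return row[0]


best = most_similar_style(0.85, 0.35)
```

Scanning every row is acceptable for a small table; a larger collection would probably need an index or a dedicated similarity search, which the description leaves open.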
[0284] To make a visually understandable proposal, the server generates visual guides in the form of videos or images. Video editing software such as Adobe Premiere Pro or Final Cut Pro is used to create the visual guides. The created guides are provided to the user as links.
[0285] As a specific example, when the user sends their own facial image to the system, the server analyzes that the face shape is round and the skin color is light ochre. As a result, cosmetics such as "natural beige foundation" and "light pink blush" are recommended. In this way, it is possible to make makeup proposals that suit the user. The following prompt sentence is used for the generative AI model: "Please generate a beauty method based on the features extracted from the user's facial image. The features include a round face contour and light ochre skin color."
[0286] The flow of the specific process in Example 1 will be described using FIG. 11.
[0287] Step 1:
[0288] The user uses the camera of the terminal to take a picture of their own face image. The input is the face image obtained through the camera. As a specific operation, the user launches the camera app and takes a picture with the face correctly framed on the screen. This image must be of high resolution because it will be used in subsequent analysis processing.
[0289] Step 2:
[0290] The terminal sends the captured face image to the server. The input is the captured face image, and the output is the digital image data sent to the server. The terminal uploads the image to the server via the Internet using the HTTP or HTTPS protocol. This transmission process is initiated when the user taps the "Send" button.
[0291] Step 3:
[0292] The server analyzes the received face image. The input is the face image sent from the terminal, and the output is the quantified feature values. The server uses an AI model to analyze the image and extracts feature values such as the shape of the face, skin tone, and the positions of the eyes and mouth. The specific technology used here is the image processing algorithm based on libraries such as TensorFlow and PyTorch.
[0293] Step 4:
[0294] The server compares the extracted feature values with the past information in the database. The input is the feature values, and the output is a proposal for a suitable beauty method. The server uses a SQL or NoSQL database to search for past makeup styles with similar feature values and refers to particularly successful cases. Through this process, an optimal beauty proposal is obtained for the user.
[0295] Step 5:
[0296] The server uses a generative AI model to generate the suggestion as text and provides it to the user. The input is the result of matching against the database, and the output is a suggestion of beauty treatments expressed in natural language. The server inputs a prompt sentence into the generative AI model, for example, "Generate a beauty treatment based on features extracted from the user's facial image. The features include a round face shape and a light ochre skin tone," and the model generates the suggestion in response.
[0297] Step 6:
[0298] The server creates visual guides based on the suggestions, using videos and images. The input is the suggested beauty method, and the output is a digital guide showing the makeup process that the user can view. Adobe Premiere Pro and Final Cut Pro are used for video editing. The generated guides are provided as links for the user to use.
[0299] Step 7:
[0300] Users try out makeup based on beauty suggestions received from the server and send their results and impressions as feedback to the server. The input is the user's actions, and the output is feedback data. Users enter their results into a dedicated feedback form and tap the "Submit" button to return the information to the server.
[0301] Step 8:
[0302] The server receives feedback from users and updates the database. The input is the feedback data, and the output is the updated database. The feedback is used as further training data for the AI model, contributing to improved accuracy of suggestions. This improves the overall system performance.
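As a non-limiting illustration of Step 8, the database update could maintain a running average of user ratings for each makeup style, as sketched below. The numeric rating scale and the `StyleRecord` structure are assumptions for illustration. The disclosure does not specify the feedback format, and retraining the AI model on the feedback is a separate matter not shown here.

```python
from dataclasses import dataclass


@dataclass
class StyleRecord:
    name: str
    mean_rating: float = 0.0
    n_feedback: int = 0

    def add_feedback(self, rating: float) -> None:
        """Fold one rating into the running mean without storing every rating."""
        self.n_feedback += 1
        self.mean_rating += (rating - self.mean_rating) / self.n_feedback


record = StyleRecord("natural beige foundation with light pink blush")
for rating in (4.0, 5.0, 3.0):  # hypothetical ratings on a 1-5 scale
    record.add_feedback(rating)
```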
[0303] (Application Example 1)
[0304] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as a "server", and the smart glasses 214 are referred to as a "terminal".
[0305] In the modern beauty market, it is difficult for users to find makeup methods and beauty products suitable for themselves, and the opportunity to actually try out which cosmetics are suitable for themselves in the store is limited. This problem has caused a decrease in user satisfaction. In addition, since there is no means to easily try out and purchase the proposed makeup methods and beauty products, there is an issue that the purchasing desire of consumers cannot be fully stimulated.
[0306] The specific processing by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0307] In this invention, the server includes a device for acquiring a face image of a user, a device for analyzing the acquired face image to extract face feature amounts, a device for comparing the extracted feature amounts with past data to propose a suitable beauty method, a device for selecting consumer goods related to the proposed beauty method and providing a purchasing means therefor, and a means for trying out the proposed consumer goods in the store. Thereby, the user can find the most suitable makeup method and beauty products for themselves, effectively try them out, and then make an appropriate purchasing decision.
[0308] A "device" is a machine or instrument configured to realize a specific function.
[0309] A "server" is a computer system that provides information and services to other computers through a network.
[0310] A "user" is an individual or group that uses a system or service.
[0311] A "face image" is still image or video data obtained by photographing a person's face.
[0312] A "feature" is a numerical value or indicator that represents a specific pattern or trend in data analysis.
[0313] "Matching" is the process of comparing one dataset with another to identify similarities and differences.
[0314] "Beauty treatments" refer to a series of procedures and techniques used to improve the appearance of the face and body according to specific purposes.
[0315] "Consumer goods" are products that consumers purchase for personal use or consumption.
[0316] "Trial use" refers to the act of actually using a product or service before purchasing it to check its effectiveness and how it feels to use.
[0317] "Means of purchase" refers to the methods and processes used to buy goods or services.
[0318] To implement this invention, a system is constructed that primarily utilizes a user terminal, a server, and an AI model. The user terminal is equipped with a camera and used for image acquisition. The user takes a picture of their face using the terminal's camera and uploads it to the server via the internet.
[0319] The server analyzes the received facial images using an AI model. The AI model is built on frameworks such as TensorFlow and PyTorch, and it extracts features from the facial images. This model extracts elements such as facial shape, skin tone, and the relative positions of facial features such as eyes and mouth as numerical data.
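As a non-limiting illustration, the relative positions of facial features could be expressed as numerical data by normalizing landmark coordinates against the face bounding box, as sketched below. The landmark names, pixel coordinates, and the function `relative_positions` are assumptions for illustration. In practice the landmarks would be produced by the AI model.

```python
def relative_positions(landmarks, face_box):
    """Convert pixel landmarks into coordinates relative to the face box.

    landmarks: mapping of name -> (x, y) in pixels
    face_box:  (left, top, width, height) in pixels
    """
    left, top, width, height = face_box
    return {
        name: ((x - left) / width, (y - top) / height)
        for name, (x, y) in landmarks.items()
    }


# Hypothetical detector output for one image.
landmarks = {"left_eye": (140, 180), "right_eye": (260, 180), "mouth": (200, 320)}
features = relative_positions(landmarks, face_box=(100, 100, 200, 300))
```

Normalizing in this way makes the values independent of image resolution and of where the face sits in the frame.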
[0320] The server compares the extracted features with a pre-stored database. This database contains past makeup styles and success stories, and uses this information to suggest appropriate beauty methods. These suggestions include specific makeup steps and recommended cosmetics. Furthermore, the server generates links to provide purchasing options, allowing users to easily buy the suggested cosmetics.
[0321] Furthermore, when used in stores, it allows users to actually try out the suggested consumer goods, improving the convenience of making immediate purchases.
[0322] For example, when a user accesses the system and sends a facial image, the server immediately analyzes the facial contours and skin tone and suggests products such as "light beige foundation" and "peach pink lipstick." This information is visually displayed on the user's device, allowing consumers to try it immediately. An example of a prompt message would be: "Based on the user's facial data, please suggest the most suitable makeup method. Please provide detailed instructions on the cosmetics to use and the application procedure, taking into account skin tone, face shape, and eye features."
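As a non-limiting illustration, links providing a purchasing means for the suggested products could be generated as sketched below. The store URL and the query parameter name `q` are assumptions for illustration. The disclosure does not specify a particular store or link format.

```python
from urllib.parse import urlencode

# Hypothetical online store search endpoint.
STORE_SEARCH_URL = "https://store.example/search"


def purchase_links(products):
    """Map each suggested product name to a URL-encoded store search link."""
    return {
        product: f"{STORE_SEARCH_URL}?{urlencode({'q': product})}"
        for product in products
    }


links = purchase_links(["light beige foundation", "peach pink lipstick"])
```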
[0323] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0324] Step 1:
[0325] The user uses the device's camera to capture an image of their face. At this point, the input is the camera image, and the output is the facial image data. The user saves this facial image and prepares to upload it to the system.
[0326] Step 2:
[0327] The terminal sends the acquired facial image to the server. The input is the user's facial image, and the output is the image data transferred to the server. This image data reaches the server via the internet.
[0328] Step 3:
[0329] The server analyzes the received image data by running it through an AI model. The input is the user's facial image data, and the output is facial features. Specifically, it analyzes the facial contour, eye position, and skin tone, and extracts them as numerical data. A generative AI model is used for this analysis.
[0330] Step 4:
[0331] The server compares the extracted features with existing data. The input is facial features, and the output is a profile of suitable beauty treatments. In this process, the current features are compared with past beauty profiles in the database to select the most suitable beauty suggestion.
[0332] Step 5:
[0333] The server suggests beauty methods to the user and generates data to display them visually. The input is a profile of the beauty method, and the output is a visual guide and a list of cosmetics. The visualized suggestions are sent to the user's terminal, which the user receives and confirms.
[0334] Step 6:
[0335] Based on the presented beauty methods, users can try out consumer goods within the store. The input is a visual guide and information about the consumer goods, and the output is the trial result. Through this process, users can gain a feel for actually using the consumer goods.
[0336] Step 7:
[0337] The server receives user feedback and updates the database. Input is user satisfaction and feedback based on usability, and output is the updated database. This feedback contributes to improving the accuracy of future suggestions.
[0338] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0339] This invention is a system that uses AI technology to analyze a user's facial image and expressions, determine their emotional state, and suggest the optimal makeup application. The system consists of a user's terminal and a server, and includes an emotion engine to provide a more personalized experience.
[0340] First, the user takes a picture of their face with the device's camera and uploads it to the system. The device then sends the image data to the server. At this point, the image may be tagged with a digital tag to identify the facial attributes.
[0341] Next, the server receives the facial image, performs image analysis using an AI model, and extracts facial features. Simultaneously, the emotion engine performs facial expression analysis to identify the user's emotional state. This emotional state serves as a cue for determining how the user is feeling and, for example, whether they are preparing for a specific event.
[0342] The server then simultaneously considers the extracted features and emotional states and compares them with the database. The database contains past success stories along with makeup techniques based on diverse facial features and emotions. Based on this, the server suggests the most suitable makeup technique for the user.
[0343] The suggestions include specific cosmetic product selections and application procedures, tailored to the user's emotional state. Visual guides are also included to help users understand the makeup process more easily.
[0344] Furthermore, the device displays makeup suggestions received from the server and provides links that allow users to easily purchase the suggested cosmetics. Through these links, users can obtain related products without any hassle.
[0345] After trying out the makeup, users can provide feedback on their experience and impressions. This feedback is sent to the server via their device, and the server uses this information to improve the accuracy of its suggestions by updating its database. This cycle allows the system to continuously learn and enhance the value it provides to users.
[0346] For example, when a user accesses the system to prepare for a date, the server's emotion engine recognizes that the user is in an "excited" state. In this case, the system can meet the user's needs by suggesting makeup techniques appropriate for the situation, such as "glamorous eyeshadow" or "calm-toned lipstick," and providing access to related products.
[0347] The following describes the processing flow.
[0348] Step 1:
[0349] The user takes a picture of their face with the device's camera and uploads it to the system. The device then provides an interface where the user can select a situation or purpose along with the face image.
[0350] Step 2:
[0351] The device sends facial image data and user selection information to the server. The data is encrypted during transmission to ensure secure transmission.
[0352] Step 3:
[0353] The server processes the received facial images and extracts facial features using an AI model. Specifically, the shape of the contours, skin tone, and the position of facial features are analyzed.
[0354] Step 4:
[0355] The server simultaneously activates an emotion engine and recognizes the user's emotional state from their facial expressions through image analysis. The emotional state is quantified into categories such as smile, anger, and surprise.
[0356] Step 5:
[0357] The server compares the extracted facial features and the recognized emotional state against a database. The database contains makeup techniques suited to various emotional states and face shapes.
[0358] Step 6:
[0359] Based on the matching results, the server adjusts and suggests the most suitable makeup routine for the user. This suggestion reflects appropriate color choices and styles according to the user's mood and is provided along with a list of specific cosmetics.
[0360] Step 7:
[0361] The terminal displays makeup suggestions from the server to the user, along with a visual makeup guide. It also provides links to purchase the suggested cosmetics.
[0362] Step 8:
[0363] The user performs the makeup application and enters feedback about the results into the terminal. The terminal then sends this feedback to the server.
[0364] Step 9:
[0365] The server analyzes the received feedback and updates the database. The system continues to learn by utilizing the feedback to improve the accuracy of suggestions to other users.
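As a non-limiting illustration of Steps 5 and 6, the matching of the emotional state and facial features against the database could be a keyed lookup with a fallback, as sketched below. The table contents, the emotion and face-shape labels, and the fallback rule are assumptions for illustration, reusing product names that appear in the examples of this description.

```python
# Hypothetical lookup table keyed by (emotional state, face shape).
TECHNIQUES = {
    ("excited", "round"): ["glamorous eyeshadow", "calm-toned lipstick"],
    ("calm", "round"): ["natural beige foundation", "light pink blush"],
}

# Hypothetical fallback used when no entry exists for the face shape.
DEFAULT_BY_EMOTION = {"excited": ["glamorous eyeshadow"]}


def suggest(emotion: str, face_shape: str) -> list[str]:
    """Return suggested items for the pair, falling back to emotion only."""
    exact = TECHNIQUES.get((emotion, face_shape))
    if exact is not None:
        return exact
    return DEFAULT_BY_EMOTION.get(emotion, [])


date_look = suggest("excited", "round")
fallback_look = suggest("excited", "oval")
```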
[0366] (Example 2)
[0367] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".
[0368] Conventional makeup application suggestion systems often failed to provide personalized suggestions based on the user's emotional state, tending instead to offer generic advice. This made it difficult to provide advice optimized for the user's specific situation or mood. Furthermore, the system lacked effective mechanisms for suggesting related products, making it difficult for users to easily select and purchase them.
[0369] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0370] In this invention, the server includes means for acquiring the user's visual information, means for analyzing the acquired visual information to extract facial attributes, and means for identifying the emotional state based on the extracted attributes. This makes it possible to propose an optimal makeup method that simultaneously considers the user's facial features and emotional state.
[0371] "Visual information" refers to digital information such as images and videos of the user's face.
[0372] "Facial attributes" refer to various data that describe facial features, including age, gender, and the shape of the eyes and mouth.
[0373] "Emotional state" refers to a psychological state obtained by analyzing the user's facial expressions, and includes emotions such as joy, sadness, and surprise.
[0374] "Knowledge" refers to the contents of databases collected and accumulated in the past, and includes information on makeup techniques and their effects.
[0375] "Makeup techniques" refer to the methods and procedures for applying makeup to the face, including the cosmetics used and the order in which they are applied.
[0376] "General-purpose products" refer to commercially available cosmetics and related products that are proposed for specific situations.
[0377] "Evaluation" refers to users' opinions and impressions of the proposed makeup methods, and includes feedback information used to improve the system.
[0378] This system provides technology to analyze a user's facial image, determine their emotional state, and suggest the most suitable makeup application. First, the user takes a picture of their face with their device's camera and uploads it to the system via a dedicated application or web interface. When the device sends this facial image to the server, it may attach a digital tag to identify facial attributes. The transmitted image data is received and analyzed by the server.
[0379] The server performs image analysis using advanced AI models (for example, face recognition models using TensorFlow or PyTorch). In this process, facial features are extracted, and an emotion engine further analyzes facial expressions to identify the user's emotional state. The server then compares the extracted facial features and emotional state with a database. The database contains a wealth of makeup techniques based on diverse facial features and emotions, and the server suggests the optimal makeup technique by comparing it with past success stories.
[0380] The suggestions include specific cosmetic product selections and application procedures, which are displayed to the user via their device. The suggestions also include visual guides to help users understand the makeup process. Furthermore, links are provided for easy purchase of the suggested cosmetics, allowing users to acquire related products on the spot.
[0381] For example, when a user uses the system to prepare for an event, the server identifies a "highly anticipated" state through its emotion engine. Based on this information, the server makes suggestions such as "glamorous eyeshadow" or "calm-toned lipstick," and provides purchase links for related products on the terminal, thus quickly and easily meeting the user's needs.
[0382] An example of a prompt for a generative AI model would be: "Please specify a suitable makeup routine when the user is identified as being in an excited state in preparation for an event. Please include specific cosmetics and their application steps." This would allow the system to prompt for appropriate makeup suggestions.
[0383] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0384] Step 1:
[0385] The user takes a picture of their face using the device's camera and uploads the image data to the system via a dedicated application or web interface. Here, the input is the face image, and the output is the image data uploaded to the system. This image is then sent to the server for subsequent processing.
[0386] Step 2:
[0387] The device uses a basic face detection algorithm to locate the face in the captured image and adds digital tags. The input is the user's facial image, and the output is image data with facial location information added. This data is then transmitted to the server using the communication function.
[0388] Step 3:
[0389] The server receives image data sent from the terminal and performs image analysis using an advanced AI model. The input is tagged image data received from the terminal, and the output is facial features obtained through image analysis. Specifically, it performs feature extraction using an AI model (for example, a TensorFlow-based facial recognition model).
[0390] Step 4:
[0391] The server analyzes facial expressions using an emotion engine based on facial features to identify the user's emotional state. The input is facial features, and the output is the identified emotional state. This process allows for the identification of the user's psychological state.
[0392] Step 5:
[0393] The server accesses an internal database to match facial features with identified emotional states. The input consists of facial features and emotional states, and the output is a suggestion of the optimal makeup application based on the matching results. The database contains data on past success stories and recommended makeup applications.
[0394] Step 6:
[0395] The server generates a suggested makeup application method deemed optimal and prepares its details (selection of cosmetics and application procedures). Input is database information as a result of matching, and output is the specific suggested makeup application method, including visual guidance.
[0396] Step 7:
[0397] The terminal displays makeup suggestions received from the server to the user. The input is makeup suggestion data from the server, and the output is specific makeup instructions and purchase links shown to the user. A user-friendly design is used for the display.
[0398] Step 8:
[0399] Users try out makeup based on the suggested methods and input their results and impressions as feedback. The input is the user's results and impressions, and the output is the feedback data transmitted to the server. This feedback is used to update the database and improve the accuracy of suggestions.
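As a non-limiting illustration of Step 2, the image data with facial location information added could be packaged as sketched below. The JSON layout, the field names, and the `FaceTag` structure are assumptions for illustration. The disclosure does not define the format of the digital tag.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class FaceTag:
    """Hypothetical digital tag: face bounding box in pixels."""
    x: int
    y: int
    width: int
    height: int


def tag_image(image_bytes: bytes, face: FaceTag) -> str:
    """Serialize the image together with its face-location tag as JSON text."""
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "face_box": asdict(face),
    })


# Placeholder bytes stand in for a real captured image.
message = tag_image(b"raw image bytes", FaceTag(x=40, y=60, width=200, height=240))
```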
[0400] (Application Example 2)
[0401] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."
[0402] Currently, makeup application systems simply offer fixed suggestions based on facial features, making it difficult to provide personalized suggestions that take into account the user's emotional state. Furthermore, feedback systems for improving the accuracy and applicability of these suggestions are insufficient. Therefore, there is a need for more sophisticated makeup application methods that respond to user needs and emotions.
[0403] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0404] In this invention, the server includes means for acquiring user facial information, means for analyzing the acquired facial information to identify emotional states, means for generating adaptive makeup methods based on emotional states, means for selecting products related to the generated makeup methods and providing methods for acquiring them, and means for presenting a visual guide to the makeup methods using a visual imitation device. This makes it possible to suggest more personalized makeup methods that are tailored to the user's emotions and individual characteristics. Furthermore, the accuracy of the suggestions can be improved through a feedback function, thereby enhancing the quality of the user experience.
[0405] "User facial information" refers to data related to the user's facial image and expressions, which is acquired through cameras and sensors.
[0406] "Identifying emotional states" refers to the process of analyzing and identifying the user's feelings and mood from acquired facial information.
[0407] "Generating adaptive makeup techniques" refers to calculating and determining the optimal makeup techniques and product selections based on the user's emotional state.
[0408] "Selecting products and providing methods for obtaining them" means selecting suitable products based on the generated makeup technique and providing users with information on how to obtain those products.
[0409] A "visual imitation device" is a device that visually reproduces the steps and results of a makeup application in real time and presents them to the user in an easy-to-understand manner.
[0410] The system necessary to implement this invention mainly consists of a terminal, a server, and a visual imitation device. First, the terminal is equipped with a high-precision camera and sensors, which are used to acquire the user's facial information. The facial information is collected in real time and then transmitted to the server.
[0411] The server uses a TensorFlow-based facial expression recognition model to process the received facial information. This process identifies the user's emotional state through data analysis. Based on the recognized emotional state, the server generates optimal suggestions from a variety of makeup techniques stored in the database.
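As a non-limiting illustration, the final step of identifying the emotional state from the output of a facial expression recognition model could be a softmax followed by selection of the most probable label, as sketched below. The label set, the example scores, and the function names are assumptions for illustration. The recognition model itself is not shown, and its raw output scores are taken as given.

```python
import math

EMOTIONS = ("joy", "sadness", "surprise")  # hypothetical label set


def softmax(logits):
    """Convert raw model scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def identify_emotion(logits):
    """Return the most probable emotion label and its probability."""
    probs = softmax(logits)
    index = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[index], probs[index]


# Hypothetical raw scores from the recognition model for one face image.
label, confidence = identify_emotion([2.0, 0.5, 0.1])
```

Keeping the probability alongside the label would allow the server to fall back to a suggestion based on facial features alone when the confidence is low.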
[0412] Furthermore, products corresponding to the generated makeup method are selected, and how users can obtain those products is indicated. Product information is often provided as links to e-commerce sites such as online stores.
[0413] Furthermore, the server provides a visual guide to the generated makeup techniques using a visual imitation device equipped with AR technology such as MirageXR, to visually demonstrate the techniques to the user. This allows the user to easily understand and practice the proposed makeup techniques.
[0414] As a concrete example, when a user takes a photo of their face with their device before attending a social event, the server recognizes it as an "expression of joy." In this case, the server suggests "makeup that gives a glamorous and positive impression" and selects cosmetics for it. The visual imitation device demonstrates the makeup technique using an avatar, and the user can proceed with the makeup application according to the steps.
[0415] An example of a prompt to input into a generative AI model would be, "What makeup technique is appropriate when the user's facial expression is 'joyful'?"
[0416] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0417] Step 1:
[0418] The device uses a high-precision camera to acquire the user's facial information. It takes a real-time image of the user's face as input and sends this as image data to the server.
[0419] Step 2:
[0420] The server analyzes the received face image using a TensorFlow-based facial expression recognition model. The input is the face image data sent in step 1, and the data is processed to extract facial features in order to identify the emotional state. The output is the identification result indicating the user's emotional state.
[0421] Step 3:
[0422] The server generates adaptive makeup techniques from a database based on identified emotional states. It uses the identified emotional state as input and performs data calculations by matching it against various makeup patterns in the database. The output is a suggestion of the optimal makeup technique.
[0423] Step 4:
[0424] The server selects products related to the generated makeup technique and sends the product information to the terminal. Using the generated makeup technique data as input, it searches the online store database and selects product information, including how to obtain each product. The output is a proposal that includes product information links.
[0425] Step 5:
[0426] The visual imitation device uses AR technology such as MirageXR to present the user with a visual guide to makeup application. It takes suggested makeup application content as input and performs real-time video processing for visualization. The output is an AR-based makeup application guide presented to the user.
[0427] Step 6:
[0428] The user selects their desired cosmetic product based on the presented product information and proceeds with the purchase. Product information links are used as input, and the selection and purchase process takes place on the online store. The output is confirmation information for the purchased product.
[0429] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0430] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, and inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0431] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.
[0432] [Third Embodiment]
[0433] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.
[0434] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.
[0435] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).
[0436] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.
[0437] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.
[0438] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0439] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0440] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0441] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0442] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0443] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0444] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".
[0445] This invention is a system that uses AI to analyze a user's facial image and proposes a makeup method tailored to their individual facial characteristics. The system consists of a server and a user's terminal, and the specific processing is carried out as follows.
[0446] First, the user photographs their face with the terminal's camera and uploads the image to the system. The face image is sent from the terminal to the server. The server receives the image, analyzes it using an AI model, and extracts facial features. These features are numerical representations of the face shape, the skin tone, and the relative positions of facial parts such as the eyes and mouth.
[0447] Next, the server compares the extracted features with existing data in the database to match the user's face with suitable makeup techniques. The database contains past makeup styles and success stories based on various facial features. Based on the matching results, the server suggests the most suitable makeup technique for the user.
[0448] This suggestion includes specific makeup steps and a list of recommended cosmetics. It also provides links to purchase the suggested cosmetics, allowing users to easily buy the products. Furthermore, to make the suggestions clearer, the server generates visual guides and explains the makeup process with videos and images.
[0449] After trying out the suggested makeup, users can provide feedback on the results and their impressions using their device. The server uses this feedback to update the database and further improve matching accuracy. This system makes it easier for users to discover the perfect makeup for themselves, enriching their daily beauty routine.
[0450] For example, when a user accesses the system and submits a facial image, the server can determine that the face has a round shape and a light ochre skin tone. Based on this, suggestions such as "natural beige foundation" and "light pink blush" are made, and links to related products are displayed. In this way, it is possible to customize makeup suggestions for each user and provide a high level of satisfaction.
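The matching described above can be sketched as a nearest-neighbor lookup over numerical feature vectors. This is only a minimal illustration under stated assumptions: the three-element vector layout, the stored records, and the names `PAST_STYLES` and `suggest` are hypothetical and do not come from the disclosure, which does not specify a distance measure.

```python
import math

# Hypothetical past records: (feature vector, suggested cosmetics).
# Assumed vector layout: [face width/height ratio, skin lightness 0-1, eye spacing ratio].
PAST_STYLES = [
    ([0.95, 0.70, 0.45], ["natural beige foundation", "light pink blush"]),
    ([0.75, 0.40, 0.42], ["warm tan foundation", "coral blush"]),
]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def suggest(features, records=PAST_STYLES):
    """Return the cosmetics of the stored record closest to the given features."""
    _, products = min(records, key=lambda rec: euclidean(features, rec[0]))
    return products


# A round face with a light skin tone lands nearest the first record.
print(suggest([0.93, 0.68, 0.44]))
```

In a real deployment the vectors would come from the AI model's feature extraction, and the database would hold far more records than this two-entry list.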
[0451] The following describes the processing flow.
[0452] Step 1:
[0453] The user takes a picture of their face with the terminal's camera and uploads it to the system. The terminal checks the image format and resolution and prepares the image for transmission to the server.
[0454] Step 2:
[0455] The terminal sends the user's facial image data to the server. The transmitted data is protected by secure communication.
[0456] Step 3:
[0457] The server preprocesses the received facial images, standardizing the image resolution and removing noise. This prepares the images for improved facial recognition accuracy.
[0458] Step 4:
[0459] The server performs AI-based image analysis to extract facial features. These features include face shape, skin tone, and the position of the eyes and lips.
[0460] Step 5:
[0461] The server compares the extracted features with existing makeup technique data in the database. The database contains records of diverse aesthetic styles, and the server matches the most suitable makeup technique.
[0462] Step 6:
[0463] The server suggests the most suitable makeup routine for the user based on the matching results. The suggestion includes text and visual guides, including the cosmetics to be used and their application procedures.
[0464] Step 7:
[0465] The terminal displays makeup suggestions received from the server to the user. In addition, it provides links to purchase the suggested cosmetics online, giving the user an easy way to acquire them.
[0466] Step 8:
[0467] After the user performs the makeup application suggested by the system, they provide feedback on their experience and areas for improvement. This information is sent to the server via the terminal.
[0468] Step 9:
[0469] The server collects user feedback and updates the database to improve the accuracy of future suggestions. This allows the system to continuously evolve and provide a more personalized experience.
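Step 1 of the flow above has the terminal check the image format and resolution before transmission. One way such a check might look is sketched below. The accepted formats, the `MIN_SIDE` threshold, and all function names are assumptions made for illustration; the disclosure does not state which formats or resolutions are required.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"
MIN_SIDE = 256  # hypothetical minimum width/height in pixels


def detect_format(data: bytes):
    """Identify PNG or JPEG by the leading magic bytes; None if unknown."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JPEG_SIGNATURE):
        return "jpeg"
    return None


def png_size(data: bytes):
    """Read (width, height) from the IHDR chunk that follows the PNG signature."""
    if len(data) < 24 or data[12:16] != b"IHDR":
        raise ValueError("not a valid PNG header")
    return struct.unpack(">II", data[16:24])


def ready_to_send(data: bytes, min_side: int = MIN_SIDE) -> bool:
    """Return True if the image has a known format and, for PNG, enough resolution."""
    fmt = detect_format(data)
    if fmt is None:
        return False
    if fmt == "png":
        try:
            width, height = png_size(data)
        except ValueError:
            return False
        return min(width, height) >= min_side
    # JPEG dimensions require scanning for a frame marker; omitted in this sketch.
    return True
```

Only the PNG branch validates resolution here; a production check would also parse JPEG frame headers or rely on an imaging library.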
[0470] (Example 1)
[0471] Next, Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server," and the headset terminal 314 is referred to as the "terminal."
[0472] In today's world, there is a demand for simple and effective beauty methods tailored to individual users. To achieve this, it is necessary to accurately analyze the user's facial characteristics and provide optimal beauty suggestions based on that analysis. Furthermore, it is crucial that these suggestions are visually understandable and that the accuracy of these suggestions is continuously improved by utilizing user feedback.
[0473] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0474] In this invention, the server includes means for acquiring a user's facial image, means for analyzing the acquired facial image and extracting features by quantifying the shape of the face, skin tone, and the positions of the eyes and mouth, and means for comparing the extracted features with past information and proposing a suitable beauty method. This makes it possible to propose personalized beauty methods to the user. Furthermore, by using a generative AI model with natural language processing technology to effectively generate proposals and create visual guides, it becomes possible to make the makeup procedure easier for the user to understand. In addition, the information can be updated based on feedback obtained from the user, improving the accuracy of the suggestions.
[0475] A "user" refers to an individual who uses this system to have their facial image analyzed and to receive suggestions for beauty methods.
[0476] A "face image" refers to digital image data of a user's face, which is used for analysis.
[0477] "Features" refer to numerical data extracted from facial images, such as facial shape, skin tone, and the position of eyes and mouth.
[0478] "Past information" refers to data on past makeup styles and success stories accumulated in the database.
[0479] "Beauty methods" refer to specific makeup procedures and combinations of cosmetics that are suggested based on individual characteristics.
[0480] "Natural language processing technology" refers to the technology used to analyze, understand, and generate human language using computers.
[0481] A "generative AI model" refers to a program model that uses AI technology to generate new text or suggestions from data.
[0482] A "visual guide" refers to videos or images created to clearly explain a proposed beauty method.
[0483] "Feedback" refers to information about the results and impressions of makeup application provided by users, and this information is used to improve the system.
[0484] This invention is a system that uses AI technology to analyze a user's facial image and propose a beauty treatment method tailored to their individual needs. The system mainly consists of a server and user terminals, and processing progresses through communication between each terminal and the server.
[0485] First, the user takes a picture of their face using the camera on their device. The device then uses communication technology to send the captured face image to the server. Wi-Fi or mobile data is typically used for data transmission, and HTTP or HTTPS are commonly used protocols.
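As an illustration of the transmission described above, the following sketch prepares an HTTPS upload request with Python's standard library. The endpoint URL and the function name `build_upload_request` are hypothetical; the disclosure names only the protocols, not an API. The request is constructed but not sent, so the sketch runs without a network.

```python
import urllib.request

# Hypothetical endpoint; the disclosure does not define one.
UPLOAD_URL = "https://server.example/api/face-images"


def build_upload_request(image_bytes: bytes, url: str = UPLOAD_URL) -> urllib.request.Request:
    """Build a POST request carrying the face image, refusing non-HTTPS URLs."""
    if not url.startswith("https://"):
        raise ValueError("face images should be sent over HTTPS")
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )


request = build_upload_request(b"\xff\xd8\xff...image bytes...")
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request, timeout=10)
```

Rejecting plain HTTP in this sketch is a design choice made here, since a face image is personal data; the original text permits either protocol.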
[0486] The server receives the transmitted facial image and analyzes the image data. This analysis involves applying an AI model built using libraries such as TensorFlow and PyTorch to extract features such as facial shape, skin tone, and the positions of the eyes and mouth.
[0487] After features are extracted, the server compares them with historical information stored in a database. SQL or NoSQL databases are used in this comparison process. Based on the comparison results, the server suggests the most suitable beauty routine for the user. This suggestion is generated using a generative AI model and is expressed in natural language, including specific makeup steps and a list of recommended cosmetics.
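The database comparison above could, for instance, be an SQL lookup of past styles that share the extracted attributes, ordered by how well they worked. The sketch below uses an in-memory SQLite database; the table name `makeup_styles`, its columns, and the sample rows are assumptions made for illustration and are not taken from the disclosure.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a few hypothetical past styles."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE makeup_styles ("
        "style TEXT, face_shape TEXT, skin_tone TEXT, success_rate REAL)"
    )
    conn.executemany(
        "INSERT INTO makeup_styles VALUES (?, ?, ?, ?)",
        [
            ("soft natural", "round", "light ochre", 0.92),
            ("bold contour", "round", "light ochre", 0.61),
            ("classic matte", "oval", "fair", 0.88),
        ],
    )
    return conn


def find_styles(conn, face_shape, skin_tone, limit=3):
    """Return (style, success_rate) rows matching the features, best first."""
    cur = conn.execute(
        "SELECT style, success_rate FROM makeup_styles "
        "WHERE face_shape = ? AND skin_tone = ? "
        "ORDER BY success_rate DESC LIMIT ?",
        (face_shape, skin_tone, limit),
    )
    return cur.fetchall()


conn = open_demo_db()
print(find_styles(conn, "round", "light ochre"))
```

Exact matching on categorical labels is the simplest case; numerical features would need a range query or a similarity search instead.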
[0488] To provide visually clear suggestions, the server generates video and image-based visual guides. Video editing software such as Adobe Premiere Pro or Final Cut Pro is used to create these visual guides. The created guides are provided to the user as links.
[0489] As a concrete example, when a user submits an image of their face to the system, the server determines that the face shape is round and the skin tone is light ochre. Based on this analysis, it recommends cosmetics such as "natural beige foundation" and "light pink blush." In this way, makeup suggestions tailored to the user can be provided. The following prompt is used for the generative AI model: "Generate beauty recommendations based on features extracted from the user's face image. These features include a round face shape and light ochre skin tone."
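The prompt quoted above can be assembled from the extracted features with a small template function. The following is a minimal sketch; the function name `build_prompt` and its two-parameter signature are assumptions, and the call to the generative AI model itself is left out because the disclosure does not name a specific model or API.

```python
def build_prompt(face_shape: str, skin_tone: str) -> str:
    """Fill the example prompt with the extracted face shape and skin tone."""
    if not face_shape or not skin_tone:
        raise ValueError("both features are required")
    # Choose "a" or "an" so the sentence stays grammatical for shapes like "oval".
    article = "an" if face_shape[0].lower() in "aeiou" else "a"
    return (
        "Generate beauty recommendations based on features extracted from "
        "the user's face image. "
        f"These features include {article} {face_shape} face shape "
        f"and {skin_tone} skin tone."
    )


print(build_prompt("round", "light ochre"))
```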
[0490] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0491] Step 1:
[0492] The user takes a picture of their face using the device's camera. The input is the face image acquired through the camera. Specifically, the user launches the camera app and correctly frames their face on the screen to take a picture. This image must be high resolution as it will be used in subsequent analysis processing.
[0493] Step 2:
[0494] The device sends the captured facial image to the server. The input is the captured facial image, and the output is the digital image data sent to the server. The device uploads the image to the server via the internet using the HTTP or HTTPS protocol. This transmission process is initiated when the user taps the "Send" button.
[0495] Step 3:
[0496] The server analyzes the received facial images. The input is a facial image sent from the terminal, and the output is digitized features. The server uses an AI model to analyze the image and extract features such as facial shape, skin tone, and the position of the eyes and mouth. The specific techniques used here are image processing algorithms using libraries such as TensorFlow and PyTorch.
[0497] Step 4:
[0498] The server compares the extracted features with historical information in the database. The input is the features, and the output is a suggestion of suitable beauty methods. The server uses SQL or NoSQL databases to search for past makeup styles with similar features and refers to particularly successful cases. Through this process, the user receives the most suitable beauty suggestions.
[0499] Step 5:
[0500] The server uses a generative AI model to generate the suggestion as text and provides it to the user. The input is the result of matching against the database, and the output is a beauty method suggestion expressed in natural language. To generate the suggestion, the server inputs a prompt into the generative AI model, for example: "Generate a beauty treatment based on features extracted from the user's facial image. The features include a round face shape and a light ochre skin tone."
[0501] Step 6:
[0502] The server creates visual guides based on the suggestions, using videos and images. The input is the suggested beauty method, and the output is a digital guide showing the makeup process that the user can view. Adobe Premiere Pro and Final Cut Pro are used for video editing. The generated guides are provided as links for the user to use.
[0503] Step 7:
[0504] Users try out makeup based on beauty suggestions received from the server and send their results and impressions as feedback to the server. The input is the user's actions, and the output is feedback data. Users enter their results into a dedicated feedback form and tap the "Submit" button to return the information to the server.
[0505] Step 8:
[0506] The server receives feedback from users and updates the database. The input is the feedback data, and the output is the updated database. The feedback is used as further training data for the AI model, contributing to improved accuracy of suggestions. This improves the overall system performance.
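The database update in Step 8 can be illustrated with an incremental mean of user ratings per makeup style. This sketch covers only the stored score; retraining the AI model on feedback, which the text also mentions, is outside its scope. The 1 to 5 rating scale and the `StyleRecord` class are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StyleRecord:
    """A stored makeup style with a running mean of user ratings."""
    name: str
    mean_rating: float = 0.0
    n_feedback: int = 0

    def add_feedback(self, rating: float) -> None:
        """Fold one rating (assumed scale: 1 to 5) into the running mean."""
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.n_feedback += 1
        # Incremental mean: no need to keep every past rating.
        self.mean_rating += (rating - self.mean_rating) / self.n_feedback


record = StyleRecord("soft natural")
for rating in (5, 3, 4):
    record.add_feedback(rating)
print(record)
```

A score maintained this way could then rank candidate styles during the comparison in Step 4.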
[0507] (Application Example 1)
[0508] Next, Application Example 1 will be described. In the following description, the data processing device 12 is referred to as the "server," and the headset terminal 314 is referred to as the "terminal."
[0509] In today's beauty market, it is difficult for consumers to find makeup techniques and beauty products that suit them, and opportunities to actually try out cosmetics in stores to see which ones are right for them are limited. This problem contributes to decreased consumer satisfaction. Furthermore, the lack of convenient ways to try out and purchase suggested makeup techniques and beauty products means that consumer purchasing intent is not being fully stimulated.
[0510] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0511] In this invention, the server includes a device for acquiring a user's facial image, a device for analyzing the acquired facial image and extracting facial features, a device for comparing the extracted features with past data and proposing a suitable beauty method, a device for selecting consumer goods related to the proposed beauty method and providing means for purchasing them, and means for trying out the proposed consumer goods in a store. This enables the user to find the makeup method and beauty products that are best suited to them, effectively try them out, and then make an appropriate purchasing decision.
[0512] A "device" is a machine or apparatus configured to perform a specific function.
[0513] A "server" is a computer system that provides information and services to other computers via a network.
[0514] "User" refers to an individual or group that uses the system or service.
[0515] A "face image" is a still image or video data of a person's face.
[0516] A "feature" is a numerical value or indicator that represents a specific pattern or trend in data analysis.
[0517] "Matching" is the process of comparing one dataset with another to identify similarities and differences.
[0518] "Beauty treatments" refer to a series of procedures and techniques used to improve the appearance of the face and body according to specific purposes.
[0519] "Consumer goods" are products that consumers purchase for personal use or consumption.
[0520] "Trial use" refers to the act of actually using a product or service before purchasing it to check its effectiveness and how it feels to use.
[0521] "Means of purchase" refers to the methods and processes used to buy goods or services.
[0522] To implement this invention, a system is constructed that primarily utilizes a user terminal, a server, and an AI model. The user terminal is equipped with a camera and used for image acquisition. The user takes a picture of their face using the terminal's camera and uploads it to the server via the internet.
[0523] The server analyzes the received facial images using an AI model. The AI model is built on frameworks such as TensorFlow and PyTorch, and it extracts features from the facial images. This model extracts elements such as facial shape, skin tone, and the relative positions of facial features such as eyes and mouth as numerical data.
[0524] The server compares the extracted features with a pre-stored database. This database contains past makeup styles and success stories, and uses this information to suggest appropriate beauty methods. These suggestions include specific makeup steps and recommended cosmetics. Furthermore, the server generates links to provide purchasing options, allowing users to easily buy the suggested cosmetics.
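Generating purchase links for the suggested cosmetics might look like the following sketch, which builds one search URL per product name. The shop base URL and the `q` query parameter are hypothetical placeholders; the disclosure does not identify any particular store or link format.

```python
from urllib.parse import urlencode

# Hypothetical shop search endpoint.
SHOP_BASE_URL = "https://shop.example/search"


def purchase_links(cosmetics):
    """Map each suggested cosmetic name to a URL-encoded search link."""
    return {
        name: f"{SHOP_BASE_URL}?{urlencode({'q': name})}"
        for name in cosmetics
    }


for name, link in purchase_links(["light beige foundation", "peach pink lipstick"]).items():
    print(name, "->", link)
```

Using `urlencode` keeps spaces and other special characters in product names from breaking the link.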
[0525] Furthermore, when the system is used in a store, users can actually try out the suggested consumer goods, which makes an immediate purchase more convenient.
[0526] For example, when a user accesses the system and sends a facial image, the server immediately analyzes the facial contours and skin tone and suggests products such as "light beige foundation" and "peach pink lipstick." This information is visually displayed on the user's device, allowing consumers to try it immediately. An example of a prompt message would be: "Based on the user's facial data, please suggest the most suitable makeup method. Please provide detailed instructions on the cosmetics to use and the application procedure, taking into account skin tone, face shape, and eye features."
[0527] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0528] Step 1:
[0529] The user uses the device's camera to capture an image of their face. At this point, the input is the camera image, and the output is the facial image data. The user saves this facial image and prepares to upload it to the system.
[0530] Step 2:
[0531] The terminal sends the acquired facial image to the server. The input is the user's facial image, and the output is the image data transferred to the server. This image data reaches the server via the internet.
[0532] Step 3:
[0533] The server analyzes the received image data by running it through an AI model. The input is the user's facial image data, and the output is facial features. Specifically, it analyzes the facial contour, eye position, and skin tone, and extracts them as numerical data. A generative AI model is used for this analysis.
[0534] Step 4:
[0535] The server compares the extracted features with existing data. The input is facial features, and the output is a profile of suitable beauty treatments. In this process, the current features are compared with past beauty profiles in the database to select the most suitable beauty suggestion.
[0536] Step 5:
[0537] The server suggests beauty methods to the user and generates data to display them visually. The input is a profile of the beauty method, and the output is a visual guide and a list of cosmetics. The visualized suggestions are sent to the user's terminal, which the user receives and confirms.
[0538] Step 6:
[0539] Based on the presented beauty methods, users can try out consumer goods within the store. The input is a visual guide and information about the consumer goods, and the output is the trial result. Through this process, users can gain a feel for actually using the consumer goods.
[0540] Step 7:
[0541] The server receives user feedback and updates the database. The input is feedback on user satisfaction and usability, and the output is the updated database. This feedback contributes to improving the accuracy of future suggestions.
[0542] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the estimated emotions.
[0543] This invention is a system that uses AI technology to analyze a user's facial image and expressions, determine their emotional state, and suggest the optimal makeup application. The system consists of a user's terminal and a server, and includes an emotion engine to provide a more personalized experience.
[0544] First, the user takes a picture of their face with the terminal's camera and uploads it to the system. The terminal then sends the image data to the server. At this point, a digital tag identifying facial attributes may be attached to the image.
[0545] Next, the server receives the facial image, performs image analysis using an AI model, and extracts facial features. Simultaneously, the emotion engine performs facial expression analysis to identify the user's emotional state. This emotional state provides a means to determine how the user is feeling, whether they are preparing for a specific event, etc.
[0546] The server then simultaneously considers the extracted features and emotional states and compares them with the database. The database contains past success stories along with makeup techniques based on diverse facial features and emotions. Based on this, the server suggests the most suitable makeup technique for the user.
[0547] The suggestions include specific cosmetic product selections and application procedures, tailored to the user's emotional state. Visual guides are also included to help users understand the makeup process more easily.
[0548] Furthermore, the device displays makeup suggestions received from the server and provides links that allow users to easily purchase the suggested cosmetics. Through these links, users can obtain related products without any hassle.
[0549] After trying out the makeup, users can provide feedback on their experience and impressions. This feedback is sent to the server via their device, and the server uses this information to improve the accuracy of its suggestions by updating its database. This cycle allows the system to continuously learn and enhance the value it provides to users.
[0550] For example, when a user accesses the system to prepare for a date, the server's emotion engine recognizes that the user is in an "excited" state. In this case, the system can meet the user's needs by suggesting makeup techniques appropriate for the situation, such as "glamorous eyeshadow" or "calm-toned lipstick," and providing access to related products.
[0551] The following describes the processing flow.
[0552] Step 1:
[0553] The user takes a picture of their face with the device's camera and uploads it to the system. The device then provides an interface where the user can select a situation or purpose along with the face image.
[0554] Step 2:
[0555] The device sends facial image data and user selection information to the server. The data is encrypted during transmission to ensure secure transmission.
[0556] Step 3:
[0557] The server processes the received facial images and extracts facial features using an AI model. Specifically, the shape of the contours, skin tone, and the position of facial features are analyzed.
[0558] Step 4:
[0559] In parallel, the server activates the emotion engine and recognizes the user's emotional state from their facial expression through image analysis. The emotional state is quantified for each category, such as smile, anger, and surprise.
[0560] Step 5:
[0561] The server compares the extracted facial features and the recognized emotional state against the database. The database contains makeup techniques suited to various emotional states and face shapes.
[0562] Step 6:
[0563] Based on the matching results, the server adjusts and suggests the most suitable makeup routine for the user. This suggestion reflects appropriate color choices and styles according to the user's mood and is provided along with a list of specific cosmetics.
[0564] Step 7:
[0565] The terminal displays makeup suggestions from the server to the user, along with a visual makeup guide. It also provides links to purchase the suggested cosmetics.
[0566] Step 8:
[0567] The user performs the makeup application and enters feedback about the results into the terminal. The terminal then sends this feedback to the server.
[0568] Step 9:
[0569] The server analyzes the received feedback and updates the database. The system continues to learn by utilizing the feedback to improve the accuracy of suggestions to other users.
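Step 4 of this flow quantifies the emotional state by category. One common way to turn raw per-category scores into comparable values is a softmax, sketched below. The raw scores are assumed to come from an expression-analysis model that is not shown, and the disclosure does not say that softmax is the method actually used.

```python
import math


def softmax(scores: dict) -> dict:
    """Convert raw category scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def top_emotion(scores: dict):
    """Return the most likely category and its probability."""
    probs = softmax(scores)
    label = max(probs, key=probs.get)
    return label, probs[label]


# Hypothetical raw scores for the categories named in Step 4.
label, probability = top_emotion({"smile": 2.0, "anger": 0.1, "surprise": 0.5})
print(label, round(probability, 3))
```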
[0570] (Example 2)
[0571] Next, Example 2 will be described. In the following description, the data processing device 12 is referred to as the "server," and the headset terminal 314 is referred to as the "terminal."
[0572] Conventional makeup application suggestion systems often failed to provide personalized suggestions based on the user's emotional state, tending instead to offer generic advice. This made it difficult to provide advice optimized for the user's specific situation or mood. Furthermore, the system lacked effective mechanisms for suggesting related products, making it difficult for users to easily select and purchase them.
[0573] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0574] In this invention, the server includes means for acquiring the user's visual information, means for analyzing the acquired visual information to extract facial attributes, and means for identifying the emotional state based on the extracted attributes. This makes it possible to propose an optimal makeup method that simultaneously considers the user's facial features and emotional state.
[0575] "Visual information" refers to digital information such as images and videos of the user's face.
[0576] "Facial attributes" refer to various data that describe facial features, including age, gender, and the shape of the eyes and mouth.
[0577] "Emotional state" refers to a psychological state obtained by analyzing the user's facial expressions, and includes emotions such as joy, sadness, and surprise.
[0578] "Knowledge" refers to the contents of databases collected and accumulated in the past, and includes information on makeup techniques and their effects.
[0579] "Makeup techniques" refer to the methods and procedures for applying makeup to the face, including the cosmetics used and the order in which they are applied.
[0580] "General-purpose products" refer to commercially available cosmetics and related products that are proposed for specific situations.
[0581] "Evaluation" refers to users' opinions and impressions of the proposed makeup methods, and includes feedback information used to improve the system.
[0582] This system provides technology to analyze a user's facial image, determine their emotional state, and suggest the most suitable makeup application. First, the user takes a picture of their face with their device's camera and uploads it to the system via a dedicated application or web interface. When the device sends this facial image to the server, it may attach a digital tag to identify facial attributes. The transmitted image data is received and analyzed by the server.
[0583] The server performs image analysis using advanced AI models (for example, face recognition models using TensorFlow or PyTorch). In this process, facial features are extracted, and an emotion engine further analyzes facial expressions to identify the user's emotional state. The server then compares the extracted facial features and emotional state with a database. The database contains a wealth of makeup techniques based on diverse facial features and emotions, and the server suggests the optimal makeup technique by comparing it with past success stories.
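Matching on facial features and emotional state together can be sketched as a combined score: the distance between feature vectors plus a penalty when the stored technique was recorded for a different emotion. The `Technique` class, the two-element feature vectors, and the `EMOTION_PENALTY` weight are all assumptions made for illustration; the disclosure does not describe how the two factors are weighed.

```python
import math
from dataclasses import dataclass


@dataclass
class Technique:
    """A stored makeup technique with the features and emotion it suited."""
    name: str
    features: list
    emotion: str


EMOTION_PENALTY = 0.5  # hypothetical cost added when emotions differ


def score(technique: Technique, features, emotion: str) -> float:
    """Lower is better: feature distance plus a penalty for emotion mismatch."""
    distance = math.dist(features, technique.features)
    return distance + (0.0 if technique.emotion == emotion else EMOTION_PENALTY)


def best_technique(techniques, features, emotion: str) -> Technique:
    """Pick the stored technique with the lowest combined score."""
    return min(techniques, key=lambda t: score(t, features, emotion))


techniques = [
    Technique("glamorous eyeshadow", [0.90, 0.70], "excited"),
    Technique("soft neutral look", [0.90, 0.70], "calm"),
]
print(best_technique(techniques, [0.88, 0.69], "excited").name)
```

With identical stored features, the emotion term alone decides between the two records, which is the behavior the text describes for a user preparing for an event.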
[0584] The suggestions include specific cosmetic product selections and application procedures, which are displayed to the user via their device. The suggestions also include visual guides to help users understand the makeup process. Furthermore, links are provided for easy purchase of the suggested cosmetics, allowing users to acquire related products on the spot.
[0585] For example, when a user uses the system to prepare for an event, the server's emotion engine identifies a state of "high anticipation." Based on this information, the server makes suggestions such as "glamorous eyeshadow" or "calm-toned lipstick" and provides purchase links for related products on the terminal, thus meeting the user's needs quickly and easily.
[0586] An example of a prompt for a generative AI model would be: "Please specify a suitable makeup routine when the user is identified as being in an excited state in preparation for an event. Please include specific cosmetics and their application steps." This would allow the system to prompt for appropriate makeup suggestions.
[0587] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0588] Step 1:
[0589] The user takes a picture of their face using the terminal's camera and uploads the image data to the system via a dedicated application or web interface. Here, the input is the face image, and the output is the image data uploaded to the system. This image is then sent to the server for subsequent processing.
[0590] Step 2:
[0591] The terminal uses a basic face detection algorithm to locate the face in the uploaded facial image and adds a digital tag. The input is the user's facial image, and the output is image data with facial location information added. This data is then transmitted to the server using the communication function.
[0592] Step 3:
[0593] The server receives image data sent from the terminal and performs image analysis using an advanced AI model. The input is tagged image data received from the terminal, and the output is facial features obtained through image analysis. Specifically, it performs feature extraction using an AI model (for example, a TensorFlow-based facial recognition model).
[0594] Step 4:
[0595] The server analyzes facial expressions using an emotion engine based on facial features to identify the user's emotional state. The input is facial features, and the output is the identified emotional state. This process allows for the identification of the user's psychological state.
[0596] Step 5:
[0597] The server accesses an internal database to match facial features with identified emotional states. The input consists of facial features and emotional states, and the output is a suggestion of the optimal makeup application based on the matching results. The database contains data on past success stories and recommended makeup applications.
[0598] Step 6:
[0599] The server generates the makeup application method deemed optimal as a suggestion and prepares its details (the selection of cosmetics and the application procedure). The input is the database information obtained as a result of the matching, and the output is the specific suggested makeup application method, including visual guidance.
[0600] Step 7:
[0601] The terminal displays makeup suggestions received from the server to the user. The input is makeup suggestion data from the server, and the output is specific makeup instructions and purchase links shown to the user. A user-friendly design is used for the display.
[0602] Step 8:
[0603] Users try out makeup based on the suggested methods and input their results and impressions as feedback. The input is the user's feedback, and the output is its transmission to the server. This feedback is used to update the database and improve the accuracy of suggestions.
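Step 2 of this flow attaches a digital tag with facial location information to the image before transmission. A possible shape for that tag is a bounding box serialized as JSON alongside an image identifier, as sketched below. The field names, the `FaceTag` class, and the bounds check are assumptions; the disclosure does not define the tag format, and the face detection that produces the box is not shown.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FaceTag:
    """Bounding box of the detected face, in pixels from the top-left corner."""
    x: int
    y: int
    width: int
    height: int


def build_tagged_payload(image_id: str, tag: FaceTag, image_size) -> str:
    """Serialize the tag as JSON after checking it lies inside the image."""
    image_width, image_height = image_size
    inside = (
        tag.x >= 0
        and tag.y >= 0
        and tag.x + tag.width <= image_width
        and tag.y + tag.height <= image_height
    )
    if not inside:
        raise ValueError("face box falls outside the image")
    return json.dumps({"image_id": image_id, "face": asdict(tag)}, sort_keys=True)


print(build_tagged_payload("img-001", FaceTag(100, 80, 200, 240), (640, 480)))
```

Sending the box as metadata lets the server crop or focus its analysis without repeating the detection step.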
[0604] (Application Example 2)
[0605] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."
[0606] Currently, makeup application systems simply offer fixed suggestions based on facial features, making it difficult to provide personalized suggestions that take into account the user's emotional state. Furthermore, feedback systems for improving the accuracy and applicability of these suggestions are insufficient. Therefore, there is a need for more sophisticated makeup application methods that respond to user needs and emotions.
[0607] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0608] In this invention, the server includes means for acquiring user facial information, means for analyzing the acquired facial information to identify emotional states, means for generating adaptive makeup methods based on emotional states, means for selecting products related to the generated makeup methods and providing methods for acquiring them, and means for presenting a visual guide to the makeup methods using a visual imitation device. This makes it possible to suggest more personalized makeup methods that are tailored to the user's emotions and individual characteristics. Furthermore, the accuracy of the suggestions can be improved through a feedback function, thereby enhancing the quality of the user experience.
[0609] "User facial information" refers to data related to the user's facial image and expressions, which is acquired through cameras and sensors.
[0610] "Identifying emotional states" refers to the process of analyzing and identifying the user's feelings and mood from acquired facial information.
[0611] "Generating adaptive makeup techniques" refers to calculating and determining the optimal makeup techniques and product selections based on the user's emotional state.
[0612] "Selecting products and providing methods for obtaining them" means selecting suitable products based on the generated makeup method and providing users with information on how to obtain those products.
[0613] A "visual imitation device" is a device that visually reproduces the steps and results of a makeup application in real time and presents them to the user in an easy-to-understand manner.
[0614] The system necessary to implement this invention mainly consists of a terminal, a server, and a visual imitation device. First, the terminal is equipped with a high-precision camera and sensors, which are used to acquire the user's facial information. The facial information is collected in real time and then transmitted to the server.
[0615] The server uses a TensorFlow-based facial expression recognition model to process the received facial information. This process identifies the user's emotional state through data analysis. Based on the recognized emotional state, the server generates optimal suggestions from a variety of makeup techniques stored in the database.
[0616] Furthermore, products corresponding to the generated makeup method are selected, and how users can obtain those products is indicated. Product information is often provided as links to e-commerce sites such as online stores.
[0617] Furthermore, the server provides a visual guide to the generated makeup techniques using a visual imitation device equipped with AR technology such as MirageXR, to visually demonstrate the techniques to the user. This allows the user to easily understand and practice the proposed makeup techniques.
[0618] As a concrete example, when a user takes a photo of their face with their device before attending a social event, the server recognizes it as an "expression of joy." In this case, the server suggests "makeup that gives a glamorous and positive impression" and selects cosmetics for it. The visual imitation device demonstrates the makeup technique using an avatar, and the user can proceed with the makeup application according to the steps.
[0619] An example of a prompt to input into a generative AI model would be, "What makeup technique is appropriate when the user's facial expression is 'joyful'?"
[0620] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0621] Step 1:
[0622] The terminal uses a high-precision camera to acquire the user's facial information. It takes a real-time image of the user's face as input and sends this as image data to the server.
[0623] Step 2:
[0624] The server analyzes the received face image using a TensorFlow-based facial expression recognition model. The input is the face image data sent in step 1, and the data is processed to extract facial features in order to identify the emotional state. The output is the identification result indicating the user's emotional state.
[0625] Step 3:
[0626] The server generates adaptive makeup techniques from a database based on identified emotional states. It uses the identified emotional state as input and performs data calculations by matching it against various makeup patterns in the database. The output is a suggestion of the optimal makeup technique.
[0627] Step 4:
[0628] The server selects products related to the generated makeup method and sends the product information to the terminal. Using the generated makeup method data as input, it searches the online store database and selects product information, including how to obtain each product. The output is a proposal that includes product information links.
[0629] Step 5:
[0630] The visual imitation device uses AR technology such as MirageXR to present the user with a visual guide to makeup application. It takes suggested makeup application content as input and performs real-time video processing for visualization. The output is an AR-based makeup application guide presented to the user.
[0631] Step 6:
[0632] The user selects their desired cosmetic product based on the presented product information and proceeds with the purchase. Product information links are used as input, and the selection and purchase process takes place on the online store. The output is confirmation information for the purchased product.
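Steps 3 and 4 above (choosing a makeup technique from the identified emotional state and attaching product links) can be sketched as follows. The lookup table, the `neutral` fallback, the function name `build_proposal`, and the store URL are all hypothetical placeholders; the source does not specify how the database or the online store is queried.

```python
from urllib.parse import urlencode

STORE_SEARCH_URL = "https://store.example.com/search"  # placeholder, not a real store

# Hypothetical table mapping an emotion label to a technique and its products.
TECHNIQUES = {
    "joy": {
        "technique": "makeup that gives a glamorous and positive impression",
        "products": ["glamorous eyeshadow", "lipstick"],
    },
    "neutral": {
        "technique": "natural everyday makeup",
        "products": ["foundation"],
    },
}


def build_proposal(emotion: str) -> dict:
    """Look up a technique for the emotion and attach one search link per product."""
    # Unknown emotion labels fall back to the assumed "neutral" entry.
    entry = TECHNIQUES.get(emotion, TECHNIQUES["neutral"])
    links = [f"{STORE_SEARCH_URL}?{urlencode({'q': p})}" for p in entry["products"]]
    return {
        "emotion": emotion,
        "technique": entry["technique"],
        "product_links": links,
    }


proposal = build_proposal("joy")
print(proposal["technique"])
print(proposal["product_links"][0])
```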
[0633] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0634] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0635] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.
[0636] [Fourth Embodiment]
[0637] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.
[0638] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.
[0639] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).
[0640] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.
[0641] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice of the user 20, converts it into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio in accordance with instructions from the processor 46.
[0642] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).
[0643] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.
[0644] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.
[0645] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.
[0646] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.
[0647] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.
[0648] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.
[0649] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0650] This invention is a system that uses AI to analyze a user's facial image and proposes a makeup method tailored to their individual facial characteristics. The system consists of a server and a user's terminal, and the specific processing is carried out as follows.
[0651] First, the user takes a picture of their face using their device's camera and uploads it to the system. This face image is then sent from the device to the server. The server receives this image, analyzes it using an AI model, and extracts facial features. These features are numerical representations of the shape of the face, skin tone, and the relative positions of features such as the eyes and mouth.
[0652] Next, the server compares the extracted features with existing data in the database to match the user's face with suitable makeup techniques. The database contains past makeup styles and success stories based on various facial features. Based on the matching results, the server suggests the most suitable makeup technique for the user.
[0653] This suggestion includes specific makeup steps and a list of recommended cosmetics. It also provides links to purchase the suggested cosmetics, allowing users to easily buy the products. Furthermore, to make the suggestions clearer, the server generates visual guides and explains the makeup process with videos and images.
[0654] After trying out the suggested makeup, users can provide feedback on the results and their impressions using their device. The server uses this feedback to update the database and further improve matching accuracy. This system makes it easier for users to discover the perfect makeup for themselves, enriching their daily beauty routine.
[0655] For example, when a user accesses the system and submits a facial image, the server can determine that the face has a round shape and a light ochre skin tone. Based on this, suggestions such as "natural beige foundation" and "light pink blush" are made, and links to related products are displayed. In this way, it is possible to customize makeup suggestions for each user and provide a high level of satisfaction.
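The matching of extracted features against the database described above can be sketched as a nearest-neighbour search. The feature layout (face width-to-height ratio, skin lightness, eye spacing ratio), the sample records, and the function name `closest_style` are assumptions made for illustration; the source only states that features are numerical and are compared with past data.

```python
import math

# Hypothetical records: (face width/height ratio, skin lightness 0-1, eye spacing ratio)
DATABASE = [
    {"style": "natural beige foundation with light pink blush",
     "features": (0.95, 0.70, 0.42)},
    {"style": "contoured look with warm bronzer",
     "features": (0.75, 0.45, 0.38)},
    {"style": "soft matte look with coral lip",
     "features": (0.85, 0.60, 0.40)},
]


def closest_style(features, database=DATABASE):
    """Return the record whose feature vector is nearest in Euclidean distance."""
    return min(database, key=lambda rec: math.dist(rec["features"], features))


user_features = (0.93, 0.72, 0.41)  # e.g. a round face with a light skin tone
match = closest_style(user_features)
print(match["style"])
```

Euclidean distance treats every feature equally, so a practical system would probably need to normalise or weight the features first; that choice is left open here.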
[0656] The following describes the processing flow.
[0657] Step 1:
[0658] The user takes a picture of their face with the terminal's camera and uploads it to the system. The terminal checks the image format and resolution, and prepares the image for transmission to the server.
[0659] Step 2:
[0660] The terminal sends the user's facial image data to the server. The transmitted data is protected through secure communication.
[0661] Step 3:
[0662] The server preprocesses the received facial images, standardizing the image resolution and removing noise. This prepares the images for improved facial recognition accuracy.
[0663] Step 4:
[0664] The server performs AI-based image analysis to extract facial features. These features include face shape, skin tone, and the position of the eyes and lips.
[0665] Step 5:
[0666] The server compares the extracted features with existing makeup technique data in the database. The database contains records of diverse aesthetic styles, and the server matches the most suitable makeup technique.
[0667] Step 6:
[0668] The server suggests the most suitable makeup routine for the user based on the matching results. The suggestion includes text and visual guides, including the cosmetics to be used and their application procedures.
[0669] Step 7:
[0670] The terminal displays makeup suggestions received from the server to the user. In addition, it provides links to purchase the suggested cosmetics online, giving the user an easy way to acquire them.
[0671] Step 8:
[0672] After the user performs the makeup application suggested by the system, they provide feedback on their experience and areas for improvement. This information is sent to the server via the terminal.
[0673] Step 9:
[0674] The server collects user feedback and updates the database to improve the accuracy of future suggestions. This allows the system to continuously evolve and provide a more personalized experience.
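The preprocessing in Step 3 (standardizing the resolution and removing noise) can be sketched in plain Python as follows. The source does not name the algorithms, so nearest-neighbour resizing and a 3x3 mean filter are assumptions chosen for brevity, and the image is modelled as a grayscale list of rows.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a grayscale image (list of rows) by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def mean_filter(image):
    """Smooth with a 3x3 mean filter; border pixels average the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def preprocess(image, size=4):
    """Standardize to size x size pixels, then smooth."""
    return mean_filter(resize_nearest(image, size, size))


raw = [[10, 10], [10, 200]]  # tiny 2x2 image with one bright pixel
result = preprocess(raw)
print(len(result), len(result[0]))
```

A production system would more likely rely on an image library for these operations; the pure-Python version is only meant to make the two stages explicit.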
[0675] (Example 1)
[0676] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0677] In today's world, there is a demand for simple and effective beauty methods tailored to individual users. To achieve this, it is necessary to accurately analyze the user's facial characteristics and provide optimal beauty suggestions based on that analysis. Furthermore, it is crucial that these suggestions are visually understandable and that the accuracy of these suggestions is continuously improved by utilizing user feedback.
[0678] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.
[0679] In this invention, the server includes means for acquiring a user's facial image, means for analyzing the acquired facial image and extracting features by quantifying the shape of the face, skin tone, and the positions of the eyes and mouth, and means for comparing the extracted features with past information and proposing a suitable beauty method. This makes it possible to propose personalized beauty methods to the user. Furthermore, by using a generative AI model with natural language processing technology to effectively generate proposals and create visual guides, it becomes possible to make the makeup procedure easier for the user to understand. In addition, the information can be updated based on feedback obtained from the user, improving the accuracy of the suggestions.
[0680] A "user" refers to an individual who uses this system to analyze their facial image and receive suggestions for beauty treatments.
[0681] A "face image" refers to digital image data of a user's face, which is used for analysis.
[0682] "Features" refer to numerical data extracted from facial images, such as facial shape, skin tone, and the position of eyes and mouth.
[0683] "Past information" refers to data on past makeup styles and success stories accumulated in the database.
[0684] "Beauty methods" refer to specific makeup procedures and combinations of cosmetics that are suggested based on individual characteristics.
[0685] "Natural language processing technology" refers to the technology used to analyze, understand, and generate human language using computers.
[0686] A "generative AI model" refers to a program model that uses AI technology to generate new text or suggestions from data.
[0687] A "visual guide" refers to videos or images created to clearly explain a proposed beauty method.
[0688] "Feedback" refers to information about the results and impressions of makeup application provided by users, and this information is used to improve the system.
[0689] This invention is a system that uses AI technology to analyze a user's facial image and propose a beauty treatment method tailored to their individual needs. The system mainly consists of a server and user terminals, and processing progresses through communication between each terminal and the server.
[0690] First, the user takes a picture of their face using the camera on their device. The device then uses communication technology to send the captured face image to the server. Wi-Fi or mobile data is typically used for data transmission, and HTTP or HTTPS are commonly used protocols.
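The upload from the terminal can be sketched with the Python standard library as follows. The endpoint URL and the function name `build_upload_request` are hypothetical, and the sketch only prepares the request without sending it. Although the text allows HTTP or HTTPS, this sketch accepts HTTPS only, which is a design assumption made because a facial image is personal data.

```python
import urllib.request

UPLOAD_URL = "https://api.example.com/v1/face-images"  # placeholder endpoint


def build_upload_request(image_bytes: bytes,
                         url: str = UPLOAD_URL) -> urllib.request.Request:
    """Prepare (but do not send) an HTTPS POST carrying the raw JPEG bytes."""
    if not url.startswith("https://"):
        raise ValueError("face images should only be sent over HTTPS")
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg"},
        method="POST",
    )


request = build_upload_request(b"\xff\xd8\xff\xe0 fake jpeg bytes")
print(request.get_method(), request.full_url)
# Sending would then be: urllib.request.urlopen(request, timeout=10)
```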
[0691] The server receives the transmitted facial image and analyzes the image data. This analysis involves applying an AI model built using libraries such as TensorFlow and PyTorch to extract features such as facial shape, skin tone, and the positions of the eyes and mouth.
[0692] After features are extracted, the server compares them with historical information stored in a database. SQL or NoSQL databases are used in this comparison process. Based on the comparison results, the server suggests the most suitable beauty routine for the user. This suggestion is generated using a generative AI model and is expressed in natural language, including specific makeup steps and a list of recommended cosmetics.
[0693] To provide visually clear suggestions, the server generates video and image-based visual guides. Video editing software such as Adobe Premiere Pro or Final Cut Pro is used to create these visual guides. The created guides are provided to the user as links.
[0694] As a concrete example, when a user submits an image of their face to the system, the server determines that the face shape is round and the skin tone is light ochre. Based on this analysis, it recommends cosmetics such as "natural beige foundation" and "light pink blush." In this way, it is possible to provide makeup suggestions tailored to the user. The following prompt is used for the generative AI model: "Generate beauty recommendations based on features extracted from the user's face image. These features include a round face shape and light ochre skin tone."
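A prompt of this kind could be assembled from the extracted features as in the following sketch. The function name `build_prompt` and the dictionary layout are assumptions; only the wording of the prompt follows the example in the text.

```python
def build_prompt(features: dict[str, str]) -> str:
    """Turn extracted features into a prompt sentence for a generative AI model."""
    if not features:
        raise ValueError("at least one feature is required")
    # e.g. {"face shape": "round"} becomes "a round face shape"
    phrases = [f"a {value} {name}" for name, value in features.items()]
    return (
        "Generate beauty recommendations based on features extracted from the "
        f"user's face image. These features include {' and '.join(phrases)}."
    )


prompt = build_prompt({"face shape": "round", "skin tone": "light ochre"})
print(prompt)
```

Keeping the prompt text in one function makes it easier to adjust the wording without touching the feature extraction code.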
[0695] The flow of the specific processing in Example 1 will be explained using Figure 11.
[0696] Step 1:
[0697] The user takes a picture of their face using the device's camera. The input is the face image acquired through the camera. Specifically, the user launches the camera app and correctly frames their face on the screen to take a picture. This image must be high resolution as it will be used in subsequent analysis processing.
[0698] Step 2:
[0699] The terminal sends the captured facial image to the server. The input is the captured facial image, and the output is the digital image data sent to the server. The terminal uploads the image to the server via the internet using the HTTP or HTTPS protocol. This transmission process is initiated when the user taps the "Send" button.
[0700] Step 3:
[0701] The server analyzes the received facial images. The input is a facial image sent from the terminal, and the output is digitized features. The server uses an AI model to analyze the image and extract features such as facial shape, skin tone, and the position of the eyes and mouth. The specific techniques used here are image processing algorithms using libraries such as TensorFlow and PyTorch.
[0702] Step 4:
[0703] The server compares the extracted features with historical information in the database. The input is the features, and the output is a suggestion of suitable beauty methods. The server uses SQL or NoSQL databases to search for past makeup styles with similar features and refers to particularly successful cases. Through this process, the user receives the most suitable beauty suggestions.
[0704] Step 5:
[0705] The server uses a generative AI model to generate suggestions as text and provide them to the user. The input is the result of matching against the database, and the output is a beauty method suggestion expressed in natural language. The server generates the suggestion by inputting a prompt into the generative AI model, for example: "Generate a beauty treatment based on features extracted from the user's facial image. The features include a round face shape and a light ochre skin tone."
[0706] Step 6:
[0707] The server creates visual guides based on the suggestions, using videos and images. The input is the suggested beauty method, and the output is a digital guide showing the makeup process that the user can view. Adobe Premiere Pro and Final Cut Pro are used for video editing. The generated guides are provided as links for the user to use.
[0708] Step 7:
[0709] Users try out makeup based on beauty suggestions received from the server and send their results and impressions as feedback to the server. The input is the user's actions, and the output is feedback data. Users enter their results into a dedicated feedback form and tap the "Submit" button to return the information to the server.
[0710] Step 8:
[0711] The server receives feedback from users and updates the database. The input is the feedback data, and the output is the updated database. The feedback is used as further training data for the AI model, contributing to improved accuracy of suggestions. This improves the overall system performance.
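The database search in Step 4 can be sketched with an SQL database as follows, here using the `sqlite3` module from the Python standard library. The table name, column names, and sample rows are hypothetical; the source only states that SQL or NoSQL databases are used and that particularly successful cases are referenced.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database for illustration only
conn.execute(
    "CREATE TABLE makeup_styles ("
    " name TEXT, face_shape TEXT, skin_tone TEXT, success_rate REAL)"
)
conn.executemany(
    "INSERT INTO makeup_styles VALUES (?, ?, ?, ?)",
    [
        ("natural beige foundation + light pink blush", "round", "light ochre", 0.82),
        ("matte foundation + coral blush", "round", "light ochre", 0.64),
        ("bronzer + nude lip", "oval", "tan", 0.90),
    ],
)


def find_styles(conn, face_shape, skin_tone, limit=3):
    """Return past styles for matching features, most successful first."""
    rows = conn.execute(
        "SELECT name, success_rate FROM makeup_styles"
        " WHERE face_shape = ? AND skin_tone = ?"
        " ORDER BY success_rate DESC LIMIT ?",
        (face_shape, skin_tone, limit),
    )
    return rows.fetchall()


best = find_styles(conn, "round", "light ochre")
print(best[0][0])
```

Parameterized queries (the `?` placeholders) keep user-derived values out of the SQL text itself.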
[0712] (Application Example 1)
[0713] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0714] In today's beauty market, it is difficult for consumers to find makeup techniques and beauty products that suit them, and opportunities to actually try out cosmetics in stores to see which ones are right for them are limited. This problem contributes to decreased consumer satisfaction. Furthermore, the lack of convenient ways to try out and purchase suggested makeup techniques and beauty products means that consumer purchasing intent is not being fully stimulated.
[0715] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.
[0716] In this invention, the server includes a device for acquiring a user's facial image, a device for analyzing the acquired facial image and extracting facial features, a device for comparing the extracted features with past data and proposing a suitable beauty method, a device for selecting consumer goods related to the proposed beauty method and providing means for purchasing them, and means for trying out the proposed consumer goods in a store. This enables the user to find the makeup method and beauty products that are best suited to them, effectively try them out, and then make an appropriate purchasing decision.
[0717] A "device" is a machine or apparatus configured to perform a specific function.
[0718] A "server" is a computer system that provides information and services to other computers via a network.
[0719] "User" refers to an individual or group that uses the system or service.
[0720] A "face image" is a still image or video data of a person's face.
[0721] A "feature" is a numerical value or indicator that represents a specific pattern or trend in data analysis.
[0722] "Matching" is the process of comparing one dataset with another to identify similarities and differences.
[0723] "Beauty treatments" refer to a series of procedures and techniques used to improve the appearance of the face and body according to specific purposes.
[0724] "Consumer goods" are products that consumers purchase for personal use or consumption.
[0725] "Trial use" refers to the act of actually using a product or service before purchasing it to check its effectiveness and how it feels to use.
[0726] "Means of purchase" refers to the methods and processes used to buy goods or services.
[0727] To implement this invention, a system is constructed that primarily utilizes a user terminal, a server, and an AI model. The user terminal is equipped with a camera and used for image acquisition. The user takes a picture of their face using the terminal's camera and uploads it to the server via the internet.
[0728] The server analyzes the received facial images using an AI model. The AI model is built on frameworks such as TensorFlow and PyTorch, and it extracts features from the facial images. This model extracts elements such as facial shape, skin tone, and the relative positions of facial features such as eyes and mouth as numerical data.
[0729] The server compares the extracted features with a pre-stored database. This database contains past makeup styles and success stories, and uses this information to suggest appropriate beauty methods. These suggestions include specific makeup steps and recommended cosmetics. Furthermore, the server generates links to provide purchasing options, allowing users to easily buy the suggested cosmetics.
[0730] Furthermore, when used in stores, it allows users to actually try out the suggested consumer goods, improving the convenience of making immediate purchases.
[0731] For example, when a user accesses the system and sends a facial image, the server immediately analyzes the facial contours and skin tone and suggests products such as "light beige foundation" and "peach pink lipstick." This information is visually displayed on the user's device, allowing consumers to try it immediately. An example of a prompt message would be: "Based on the user's facial data, please suggest the most suitable makeup method. Please provide detailed instructions on the cosmetics to use and the application procedure, taking into account skin tone, face shape, and eye features."
[0732] The flow of a specific process in Application Example 1 will be explained using Figure 12.
[0733] Step 1:
[0734] The user uses the device's camera to capture an image of their face. At this point, the input is the camera image, and the output is the facial image data. The user saves this facial image and prepares to upload it to the system.
[0735] Step 2:
[0736] The terminal sends the acquired facial image to the server. The input is the user's facial image, and the output is the image data transferred to the server. This image data reaches the server via the internet.
[0737] Step 3:
[0738] The server analyzes the received image data by running it through an AI model. The input is the user's facial image data, and the output is facial features. Specifically, it analyzes the facial contour, eye position, and skin tone, and extracts them as numerical data. A generative AI model is used for this analysis.
[0739] Step 4:
[0740] The server compares the extracted features with existing data. The input is facial features, and the output is a profile of suitable beauty treatments. In this process, the current features are compared with past beauty profiles in the database to select the most suitable beauty suggestion.
[0741] Step 5:
[0742] The server suggests beauty methods to the user and generates data to display them visually. The input is a profile of the beauty method, and the output is a visual guide and a list of cosmetics. The visualized suggestions are sent to the user's terminal, which the user receives and confirms.
[0743] Step 6:
[0744] Based on the presented beauty methods, users can try out consumer goods within the store. The input is a visual guide and information about the consumer goods, and the output is the trial result. Through this process, users can gain a feel for actually using the consumer goods.
[0745] Step 7:
[0746] The server receives user feedback and updates the database. Input is user satisfaction and feedback based on usability, and output is the updated database. This feedback contributes to improving the accuracy of future suggestions.
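The data that the server sends to the terminal in Step 5 (a visual guide and a list of cosmetics) could be structured as in the following sketch. The class names, field names, and URLs are hypothetical, and JSON is assumed as the transfer format; the source does not specify one.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Cosmetic:
    name: str
    purchase_url: str  # link the terminal shows so the user can buy the product


@dataclass
class Proposal:
    steps: list[str]                                 # application procedure
    cosmetics: list[Cosmetic] = field(default_factory=list)
    guide_url: str = ""                              # link to the visual guide


def to_payload(proposal: Proposal) -> str:
    """Serialize the proposal (including nested cosmetics) to a JSON string."""
    return json.dumps(asdict(proposal), ensure_ascii=False)


proposal = Proposal(
    steps=["apply foundation", "apply lipstick"],
    cosmetics=[
        Cosmetic("light beige foundation", "https://store.example.com/p/1"),
        Cosmetic("peach pink lipstick", "https://store.example.com/p/2"),
    ],
    guide_url="https://guides.example.com/g/1",
)
payload = to_payload(proposal)
print(payload)
```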
[0747] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.
[0748] This invention is a system that uses AI technology to analyze a user's facial image and expressions, determine their emotional state, and suggest the optimal makeup application. The system consists of a user's terminal and a server, and includes an emotion engine to provide a more personalized experience.
[0749] First, the user takes a picture of their face with the device's camera and uploads it to the system. The device then sends the image data to the server. At this point, the image may be tagged with a digital tag to identify the facial attributes.
[0750] Next, the server receives the facial image, performs image analysis using an AI model, and extracts facial features. Simultaneously, the emotion engine performs facial expression analysis to identify the user's emotional state. The emotional state indicates how the user is feeling and, for example, whether they are preparing for a specific event.
[0751] The server then simultaneously considers the extracted features and emotional states and compares them with the database. The database contains past success stories along with makeup techniques based on diverse facial features and emotions. Based on this, the server suggests the most suitable makeup technique for the user.
[0752] The suggestions include specific cosmetic product selections and application procedures, tailored to the user's emotional state. Visual guides are also included to help users understand the makeup process more easily.
[0753] Furthermore, the device displays makeup suggestions received from the server and provides links that allow users to easily purchase the suggested cosmetics. Through these links, users can obtain related products without any hassle.
[0754] After trying out the makeup, users can provide feedback on their experience and impressions. This feedback is sent to the server via their device, and the server uses this information to improve the accuracy of its suggestions by updating its database. This cycle allows the system to continuously learn and enhance the value it provides to users.
[0755] For example, when a user accesses the system to prepare for a date, the server's emotion engine recognizes that the user is in an "excited" state. In this case, the system can meet the user's needs by suggesting makeup techniques appropriate for the situation, such as "glamorous eyeshadow" or "calm-toned lipstick," and providing access to related products.
[0756] The following describes the processing flow.
[0757] Step 1:
[0758] The user takes a picture of their face with the device's camera and uploads it to the system. The device then provides an interface where the user can select a situation or purpose along with the face image.
[0759] Step 2:
[0760] The device sends the facial image data and the user's selection information to the server. The data is encrypted in transit to keep the transmission secure.
[0761] Step 3:
[0762] The server processes the received facial images and extracts facial features using an AI model. Specifically, the shape of the contours, skin tone, and the position of facial features are analyzed.
[0763] Step 4:
[0764] The server simultaneously activates an emotion engine and recognizes the user's emotional state from their facial expressions through image analysis. The emotional state is quantified into categories such as smile, anger, and surprise.
[0765] Step 5:
[0766] The server compares the extracted facial features and the recognized emotional state against a database. The database contains makeup techniques suited to various emotional states and different face shapes.
[0767] Step 6:
[0768] Based on the matching results, the server adjusts and suggests the most suitable makeup routine for the user. This suggestion reflects appropriate color choices and styles according to the user's mood and is provided along with a list of specific cosmetics.
[0769] Step 7:
[0770] The terminal displays makeup suggestions from the server to the user, along with a visual makeup guide. It also provides links to purchase the suggested cosmetics.
[0771] Step 8:
[0772] The user performs the makeup application and enters feedback about the results into the terminal. The terminal then sends this feedback to the server.
[0773] Step 9:
[0774] The server analyzes the received feedback and updates the database. The system continues to learn by utilizing the feedback to improve the accuracy of suggestions to other users.
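Steps 4 through 6 above can be sketched as follows. This is an illustrative sketch only: the softmax normalization, the face-shape labels, and the contents of `TECHNIQUES` are assumptions made for the example (only "glamorous eyeshadow" and "calm-toned lipstick" appear in the description), and the raw scores stand in for the output of an expression analysis model that is not shown.

```python
from math import exp

# Hypothetical database: (face shape, dominant emotion) -> makeup technique.
TECHNIQUES = {
    ("oval", "smile"): "glamorous eyeshadow",
    ("oval", "surprise"): "calm-toned lipstick",
    ("round", "smile"): "soft contour with bright blush",
}
DEFAULT_TECHNIQUE = "neutral everyday look"  # assumed fallback when nothing matches


def quantify_emotion(raw_scores: dict[str, float]) -> dict[str, float]:
    """Step 4: turn raw per-category scores into probabilities (softmax)."""
    peak = max(raw_scores.values())
    weights = {k: exp(v - peak) for k, v in raw_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def suggest(face_shape: str, raw_scores: dict[str, float]) -> tuple[str, str]:
    """Steps 5 and 6: match face shape plus dominant emotion against the database."""
    probs = quantify_emotion(raw_scores)
    emotion = max(probs, key=probs.get)
    return emotion, TECHNIQUES.get((face_shape, emotion), DEFAULT_TECHNIQUE)
```

For example, `suggest("oval", {"smile": 2.0, "anger": 0.1, "surprise": 0.5})` selects the entry keyed by `("oval", "smile")`. An exact-key lookup is the simplest possible matching; a similarity search over stored cases would be closer to the "past success stories" wording.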
[0775] (Example 2)
[0776] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0777] Conventional makeup application suggestion systems often failed to provide personalized suggestions based on the user's emotional state, tending instead to offer generic advice. This made it difficult to provide advice optimized for the user's specific situation or mood. Furthermore, the system lacked effective mechanisms for suggesting related products, making it difficult for users to easily select and purchase them.
[0778] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.
[0779] In this invention, the server includes means for acquiring the user's visual information, means for analyzing the acquired visual information to extract facial attributes, and means for identifying the emotional state based on the extracted attributes. This makes it possible to propose an optimal makeup method that simultaneously considers the user's facial features and emotional state.
[0780] "Visual information" refers to digital information such as images and videos of the user's face.
[0781] "Facial attributes" refer to various data that describe facial features, including age, gender, and the shape of the eyes and mouth.
[0782] "Emotional state" refers to a psychological state obtained by analyzing the user's facial expressions, and includes emotions such as joy, sadness, and surprise.
[0783] "Knowledge" refers to the contents of databases collected and accumulated in the past, and includes information on makeup techniques and their effects.
[0784] "Makeup techniques" refer to the methods and procedures for applying makeup to the face, including the cosmetics used and the order in which they are applied.
[0785] "General-purpose products" refer to commercially available cosmetics and related products that are proposed for specific situations.
[0786] "Evaluation" refers to users' opinions and impressions of the proposed makeup methods, and includes feedback information used to improve the system.
[0787] This system provides technology to analyze a user's facial image, determine their emotional state, and suggest the most suitable makeup application. First, the user takes a picture of their face with their device's camera and uploads it to the system via a dedicated application or web interface. When the device sends this facial image to the server, it may attach a digital tag to identify facial attributes. The transmitted image data is received and analyzed by the server.
[0788] The server performs image analysis using advanced AI models (for example, face recognition models using TensorFlow or PyTorch). In this process, facial features are extracted, and an emotion engine further analyzes facial expressions to identify the user's emotional state. The server then compares the extracted facial features and emotional state with a database. The database contains a wealth of makeup techniques based on diverse facial features and emotions, and the server suggests the optimal makeup technique by comparing it with past success stories.
[0789] The suggestions include specific cosmetic product selections and application procedures, which are displayed to the user via their device. The suggestions also include visual guides to help users understand the makeup process. Furthermore, links are provided for easy purchase of the suggested cosmetics, allowing users to acquire related products on the spot.
[0790] For example, when a user uses the system to prepare for an event, the server's emotion engine identifies a state of "high anticipation." Based on this information, the server makes suggestions such as "glamorous eyeshadow" or "calm-toned lipstick," and the terminal provides purchase links for related products, thus quickly and easily meeting the user's needs.
[0791] An example of a prompt for a generative AI model would be: "Please specify a suitable makeup routine when the user is identified as being in an excited state in preparation for an event. Please include specific cosmetics and their application steps." Such a prompt allows the system to obtain appropriate makeup suggestions from the model.
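As a sketch of how such a prompt could be assembled from the identified emotional state, the following function fills the example wording above with the detected values. The function names and parameters are hypothetical, and the call that sends the prompt to a generative AI model is deliberately omitted because the description does not specify one.

```python
def _article(word: str) -> str:
    """Return "an" before a vowel sound (approximated by the first letter), else "a"."""
    return "an" if word and word[0].lower() in "aeiou" else "a"


def build_prompt(emotion: str, occasion: str) -> str:
    """Compose the prompt text from the identified emotion and the occasion."""
    return (
        "Please specify a suitable makeup routine when the user is identified "
        f"as being in {_article(emotion)} {emotion} state in preparation for {occasion}. "
        "Please include specific cosmetics and their application steps."
    )
```

Calling `build_prompt("excited", "an event")` reproduces the example prompt quoted above; other values such as `build_prompt("calm", "a job interview")` follow the same template.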
[0792] The flow of the specific processing in Example 2 will be explained using Figure 13.
[0793] Step 1:
[0794] The user takes a picture of their face using the device's camera and uploads the image data to the system via a dedicated application or web interface. Here, the input is the face image, and the output is the image data uploaded to the system. This image is then sent to a server for post-processing.
[0795] Step 2:
[0796] The device uses a basic face detection algorithm to locate faces in uploaded facial images and adds digital tags. The input is a user's facial image, and the output is image data with facial location information added. This data is then transmitted to the server using the communication function.
[0797] Step 3:
[0798] The server receives image data sent from the terminal and performs image analysis using an advanced AI model. The input is tagged image data received from the terminal, and the output is facial features obtained through image analysis. Specifically, it performs feature extraction using an AI model (for example, a TensorFlow-based facial recognition model).
[0799] Step 4:
[0800] The server analyzes facial expressions using an emotion engine based on facial features to identify the user's emotional state. The input is facial features, and the output is the identified emotional state. This process allows for the identification of the user's psychological state.
[0801] Step 5:
[0802] The server accesses an internal database to match facial features with identified emotional states. The input consists of facial features and emotional states, and the output is a suggestion of the optimal makeup application based on the matching results. The database contains data on past success stories and recommended makeup applications.
[0803] Step 6:
[0804] The server generates the makeup application method deemed optimal and prepares its details (selection of cosmetics and application procedures). The input is the database information obtained from the matching, and the output is the specific suggested makeup application method, including visual guidance.
[0805] Step 7:
[0806] The terminal displays makeup suggestions received from the server to the user. The input is makeup suggestion data from the server, and the output is specific makeup instructions and purchase links shown to the user. A user-friendly design is used for the display.
[0807] Step 8:
[0808] Users try out makeup based on the suggested methods and input their results and impressions as feedback. The input is the user's feedback, and the output is its transmission to the server. This feedback is used to update the database and improve the accuracy of suggestions.
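The hand-off between Step 2 (the terminal attaches a digital tag) and Step 3 (the server receives the tagged image data) can be sketched as a single message. The message layout is an assumption for illustration: the `FaceTag` fields, the JSON envelope, and the base64 encoding are not taken from the description, and the face detection that would produce the tag is not shown.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class FaceTag:
    """Hypothetical digital tag: face bounding box in pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


def build_upload(image_bytes: bytes, tag: FaceTag) -> str:
    """Terminal side (Step 2): bundle the image and its tag into one JSON message."""
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "face_tag": asdict(tag),
    })


def parse_upload(message: str) -> tuple[bytes, FaceTag]:
    """Server side (Step 3): recover the image bytes and the tag."""
    payload = json.loads(message)
    return base64.b64decode(payload["image"]), FaceTag(**payload["face_tag"])
```

Keeping the tag next to the image lets the server crop to the face region before feature extraction, which is one plausible reason for tagging on the terminal.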
[0809] (Application Example 2)
[0810] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".
[0811] Currently, makeup application systems simply offer fixed suggestions based on facial features, making it difficult to provide personalized suggestions that take into account the user's emotional state. Furthermore, feedback systems for improving the accuracy and applicability of these suggestions are insufficient. Therefore, there is a need for more sophisticated makeup application methods that respond to user needs and emotions.
[0812] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.
[0813] In this invention, the server includes means for acquiring user facial information, means for analyzing the acquired facial information to identify emotional states, means for generating adaptive makeup methods based on emotional states, means for selecting products related to the generated makeup methods and providing methods for acquiring them, and means for presenting a visual guide to the makeup methods using a visual imitation device. This makes it possible to suggest more personalized makeup methods that are tailored to the user's emotions and individual characteristics. Furthermore, the accuracy of the suggestions can be improved through a feedback function, thereby enhancing the quality of the user experience.
[0814] "User facial information" refers to data related to the user's facial image and expressions, which is acquired through cameras and sensors.
[0815] "Identifying emotional states" refers to the process of analyzing and identifying the user's feelings and mood from acquired facial information.
[0816] "Generating adaptive makeup techniques" refers to calculating and determining the optimal makeup techniques and product selections based on the user's emotional state.
[0817] "Selecting products and providing methods for obtaining them" means selecting suitable products based on the generated makeup technique and providing users with information on how to obtain those products.
[0818] A "visual imitation device" is a device that visually reproduces the steps and results of a makeup application in real time and presents them to the user in an easy-to-understand manner.
[0819] The system necessary to implement this invention mainly consists of a terminal, a server, and a visual imitation device. First, the terminal is equipped with a high-precision camera and sensors, which are used to acquire the user's facial information. The facial information is collected in real time and then transmitted to the server.
[0820] The server uses a TensorFlow-based facial expression recognition model to process the received facial information. This process identifies the user's emotional state through data analysis. Based on the recognized emotional state, the server generates optimal suggestions from a variety of makeup techniques stored in the database.
[0821] Furthermore, products corresponding to the generated makeup method are selected, and how users can obtain those products is indicated. Product information is often provided as links to e-commerce sites such as online stores.
[0822] Furthermore, the server provides a visual guide to the generated makeup techniques using a visual imitation device equipped with AR technology such as MirageXR, to visually demonstrate the techniques to the user. This allows the user to easily understand and practice the proposed makeup techniques.
[0823] As a concrete example, when a user takes a photo of their face with their device before attending a social event, the server recognizes it as an "expression of joy." In this case, the server suggests "makeup that gives a glamorous and positive impression" and selects cosmetics for it. The visual imitation device demonstrates the makeup technique using an avatar, and the user can proceed with the makeup application according to the steps.
[0824] An example of a prompt to input into a generative AI model would be, "What makeup technique is appropriate when the user's facial expression is 'joyful'?"
[0825] The flow of a specific process in Application Example 2 will be explained using Figure 14.
[0826] Step 1:
[0827] The device uses a high-precision camera to acquire the user's facial information. It takes a real-time image of the user's face as input and sends this as image data to the server.
[0828] Step 2:
[0829] The server analyzes the received face image using a TensorFlow-based facial expression recognition model. The input is the face image data sent in step 1, and the data is processed to extract facial features in order to identify the emotional state. The output is the identification result indicating the user's emotional state.
[0830] Step 3:
[0831] The server generates adaptive makeup techniques from a database based on identified emotional states. It uses the identified emotional state as input and performs data calculations by matching it against various makeup patterns in the database. The output is a suggestion of the optimal makeup technique.
[0832] Step 4:
[0833] The server selects products related to the generated makeup technique and sends the product information to the terminal. Taking the generated makeup technique data as input, it searches the online store database and selects product information, including how to obtain each product. The output is a proposal that includes product information links.
[0834] Step 5:
[0835] The visual imitation device uses AR technology such as MirageXR to present the user with a visual guide to makeup application. It takes suggested makeup application content as input and performs real-time video processing for visualization. The output is an AR-based makeup application guide presented to the user.
[0836] Step 6:
[0837] The user selects their desired cosmetic product based on the presented product information and proceeds with the purchase. Product information links are used as input, and the selection and purchase process takes place on the online store. The output is confirmation information for the purchased product.
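The product selection in Step 4 can be sketched as a lookup against a catalog that returns purchase links. Everything concrete here is a placeholder: the `Product` fields, the catalog entries, and the `store.example.com` address are hypothetical stand-ins for the online store database mentioned above.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Product:
    sku: str
    name: str
    category: str  # e.g. "eyeshadow", "lipstick"


# Hypothetical catalog standing in for the online store database.
CATALOG = [
    Product("ES-01", "Shimmer eyeshadow", "eyeshadow"),
    Product("LP-07", "Coral lipstick", "lipstick"),
    Product("FD-03", "Light foundation", "foundation"),
]

STORE_URL = "https://store.example.com/item"  # placeholder address


def purchase_link(product: Product) -> str:
    """Build the link through which the user can obtain the product."""
    query = urlencode({"sku": product.sku})
    return f"{STORE_URL}?{query}"


def select_products(required_categories: list[str]) -> list[dict[str, str]]:
    """Step 4: pick catalog items for each category the makeup technique needs."""
    wanted = set(required_categories)
    return [
        {"name": p.name, "link": purchase_link(p)}
        for p in CATALOG
        if p.category in wanted
    ]
```

The returned list corresponds to the "proposal that includes product information links" sent to the terminal; results follow catalog order rather than the order of the requested categories.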
[0838] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating the user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.
[0839] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
[0840] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.
[0841] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.
[0842] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
[0843] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.
[0844] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion lies, the more visible (expressed in actions) it becomes.
[0845] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.
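The relationship between balances and pleasure or discomfort described above can be sketched numerically. The scoring rule is an assumption made for illustration: the description states only the direction of the effect, so the linear falloff, the tolerance values, and the [-1, 1] range below are hypothetical.

```python
def pleasure_level(
    readings: dict[str, float],
    ideals: dict[str, float],
    tolerances: dict[str, float],
) -> float:
    """Return a value in [-1, 1]: positive near the ideal, negative far from it.

    Each balance (e.g. posture, battery level) scores 1.0 at its ideal value and
    falls linearly to -1.0 once it deviates by two tolerances or more.
    """
    scores = []
    for name, ideal in ideals.items():
        deviation = abs(readings[name] - ideal) / tolerances[name]
        scores.append(1.0 - min(deviation, 2.0))
    return sum(scores) / len(scores)
```

For a robot, `readings` might hold a posture tilt in degrees and a battery percentage; a result near 1.0 would correspond to "pleasure" and a result near -1.0 to "discomfort" in the sense used above.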
[0846] The emotion map defines two emotions that promote learning. One is the negative emotion around the middle of "repentance" and "reflection" on the "situation" side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the positive emotion around "desire" on the "response" side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."
[0847] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
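The way emotion values relate to positions on the emotion map can be sketched as follows. The pre-trained neural network is not shown; the `values` dictionary stands in for its output. The coordinates in `EMOTION_MAP` are hypothetical placements chosen for the example (clock position and ring index), and "sad" is an added illustrative entry, so only the general layout follows the description.

```python
from math import cos, hypot, radians, sin

# Hypothetical placements: (clock position in hours, ring index).
# Ring 1 is nearest the center; larger rings lie further outward.
EMOTION_MAP = {
    "reassured": (3.0, 2),
    "calm": (3.2, 2),
    "confident": (2.8, 3),
    "sad": (7.0, 2),
}


def to_xy(hour: float, ring: int) -> tuple[float, float]:
    """Convert a clock position and ring to x, y (12 o'clock points up)."""
    angle = radians(90.0 - hour * 30.0)
    return ring * cos(angle), ring * sin(angle)


def dominant_emotion(values: dict[str, float]) -> str:
    """Pick the emotion with the highest emotion value."""
    return max(values, key=values.get)


def neighbors(emotion: str, radius: float = 1.5) -> list[str]:
    """List emotions lying within `radius` of the given emotion on the map."""
    ox, oy = to_xy(*EMOTION_MAP[emotion])
    near = []
    for name, position in EMOTION_MAP.items():
        x, y = to_xy(*position)
        if name != emotion and hypot(x - ox, y - oy) <= radius:
            near.append(name)
    return near
```

With these placements, the emotions near "reassured" are "calm" and "confident", which matches the idea that emotions mapped close together are expected to receive similar emotion values.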
[0848] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.
[0849] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.
[0850] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.
[0851] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.
[0852] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.
[0853] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.
[0854] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.
[0855] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.
[0856] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.
[0857] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.
[0858] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.
[0859] The following is further disclosed regarding the embodiments described above.
[0860] (Claim 1)
[0861] A means of obtaining a user's facial image,
[0862] A means for analyzing acquired facial images and extracting facial features,
[0863] A means for proposing a suitable cosmetic method by comparing extracted features with past databases,
[0864] A system that includes means for selecting products related to a proposed cosmetic method and providing a method for purchasing them.
[0865] (Claim 2)
[0866] The system according to claim 1, further comprising means for generating a visual guide for the proposed makeup method.
[0867] (Claim 3)
[0868] The system according to claim 1, further comprising means for obtaining user feedback to update the database and improve the accuracy of suggestions.
[0869] "Example 1"
[0870] (Claim 1)
[0871] A means of obtaining a user's facial image,
[0872] A means for analyzing acquired facial images and extracting features by quantifying facial shape, skin tone, and the positions of eyes and mouth,
[0873] A means of proposing a suitable beauty method by comparing extracted features with past information,
[0874] A means of selecting products based on the proposed beauty method and providing information on their purchase,
[0875] A system that includes means of using a generative AI model that generates proposed content using natural language processing technology.
[0876] (Claim 2)
[0877] The system according to claim 1, which generates a visual guide of a proposed beauty method and edits it as an image or video.
[0878] (Claim 3)
[0879] The system according to claim 1, which obtains user feedback and updates information to improve the accuracy of suggestions.
[0880] "Application Example 1"
[0881] (Claim 1)
[0882] A device that acquires the user's facial image,
[0883] A device that analyzes acquired facial images and extracts facial features,
[0884] A device that compares extracted features with past data to propose a suitable beauty method,
[0885] A device that selects consumer goods related to the proposed beauty method and provides a means of purchasing them,
[0886] A system that includes means for customers to try out consumer goods offered within a store.
[0887] (Claim 2)
[0888] The system according to claim 1, further comprising a device for generating visual instructions for a proposed beauty method.
[0889] (Claim 3)
[0890] The system according to claim 1, further comprising a device for obtaining user feedback, updating data, and improving the accuracy of suggestions.
[0891] "Example 2 of combining an emotion engine"
[0892] (Claim 1)
[0893] A means for acquiring the user's visual information,
[0894] A means for analyzing the acquired visual information to extract facial attributes,
[0895] A means for identifying an emotional state based on the extracted attributes,
[0896] A means for suggesting an optimal makeup method by comparing the identified emotional state and facial attributes with past knowledge,
[0897] A system including means for selecting general-purpose products related to the proposed makeup method and providing a method for obtaining them.
[0898] (Claim 2)
[0899] The system according to claim 1, further comprising means for visually demonstrating the proposed makeup method.
[0900] (Claim 3)
[0901] The system according to claim 1, further comprising means for obtaining user feedback to update knowledge and improve the accuracy of proposals.
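One way to read the comparison of the identified emotional state and facial attributes with past knowledge is sketched below. This is an illustration under stated assumptions, not the method of the disclosure: emotional states are assumed to be plain labels, facial attributes a numeric vector, and past knowledge a list of dictionaries. The function name `suggest_makeup` and the keys `emotion`, `attributes`, and `method` are hypothetical.

```python
def suggest_makeup(emotion, attributes, knowledge):
    """Pick the makeup method of the closest past entry.

    Entries recorded for the same emotional state are preferred;
    if none exist, all entries are considered (assumed fallback).
    """
    candidates = [e for e in knowledge if e["emotion"] == emotion] or knowledge
    best = min(
        candidates,
        # Squared Euclidean distance between attribute vectors
        key=lambda e: sum((a - b) ** 2 for a, b in zip(attributes, e["attributes"])),
    )
    return best["method"]
```

Filtering by emotional state first and then ranking by attribute similarity keeps the two inputs separate; a weighted combined score would be an equally plausible design.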
[0902] "Application Example 2 when combined with an emotion engine"
[0903] (Claim 1)
[0904] A means for obtaining the user's facial information,
[0905] A means for analyzing the acquired facial information to identify an emotional state,
[0906] A means for generating adaptive makeup techniques based on the identified emotional state,
[0907] A means for selecting products related to the generated makeup techniques and providing a method for obtaining them,
[0908] A means for presenting a visual guide to the makeup techniques using a visual imitation device,
[0909] ...
[0910] A system that includes the above.
[0911] (Claim 2)
[0912] The system according to claim 1, which obtains, from a database, information corresponding to the emotional state of the user in order to improve the accuracy of the makeup application.
[0913] (Claim 3)
[0914] The system according to claim 1, which has a function to improve the accuracy of makeup technique suggestions by collecting evaluations from users and updating the database.
[Explanation of Symbols]
[0915] 10, 210, 310, 410 Data processing system
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot
Claims
1. A device that acquires the user's facial image,
A device that analyzes the acquired facial image and extracts facial features,
A device that compares the extracted features with past data to propose a suitable beauty method,
A device that selects consumer goods related to the proposed beauty method and provides a means of purchasing them,
A system that includes means for customers to try out consumer goods offered within a store.
2. The system according to claim 1, further comprising a device for generating visual instructions for a proposed beauty method.
3. The system according to claim 1, further comprising a device that obtains user feedback, updates data, and improves the accuracy of suggestions.