system

A system automatically classifies and integrates digital content with supplementary information, generating unified digital books stored in cloud storage for seamless access and sharing, addressing the challenges of managing diverse and analog content.

JP2026105350A · Pending · Publication Date: 2026-06-26 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Applications
Current Assignee / Owner
SOFTBANK GROUP CORP
Filing Date
2024-12-16
Publication Date
2026-06-26

AI Technical Summary

Technical Problem

Users struggle to centrally organize and save memories recorded on multiple devices and in multiple formats, to handle analog content that has not been digitized, to integrate that content with auxiliary information, and to keep digital books in an optimal state as their devices are updated.

Method used

A system that automatically classifies user-uploaded digital content into a unified format, retrieves supplementary information from external databases, and generates digital books considering user profiles and life journeys, storing them in cloud storage for optimal management and accessibility.

Benefits of technology

Enables users to view and share valuable records seamlessly across devices without manual effort, integrating memories with supplementary information and ensuring consistency across device updates.

✦ Generated by Eureka AI based on patent content.


Abstract

A system is provided. [Solution] The system includes: means for automatically classifying uploaded information based on its attributes; means for converting the classified information into an integrated format; means for obtaining additional information, such as related information and historical background, from external sources; means for generating digital records that take into account the user's characteristics and history; means for providing the generated digital record to the user's device in a viewable format; and means for adding narration and music to digital content using a generative AI model to create visual deliverables based on a storyline.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] One problem to be solved by the present invention is that it is difficult for a user to centrally organize and preserve, in a valuable form, memories recorded on multiple devices and in multiple formats. A further problem is that it is difficult to handle analog content that has not been digitized and to integrate it with auxiliary information such as regional information and the background of the era. In addition, the digital book needs to follow updates to the user's devices and be kept in an optimal state.

Means for Solving the Problems

[0005] This invention provides a system that automatically classifies user-uploaded digital content according to its format and converts it to a unified format as needed. The system also includes means for retrieving supplementary information such as regional and historical context from external databases and generating digital books while considering the user's profile and life journey. Furthermore, the generated digital books are stored in cloud storage and automatically optimized when the user's device is updated. This allows users to view and share valuable records with family and friends anytime, anywhere, without cumbersome manual work.

[0006] "Uploading" refers to the act of a user transferring digital data from their device to a server.

[0007] "Digital data" refers to information that is stored in an electronic format and can be read by a computer.

[0008] "Classification" is the process of dividing uploaded digital data into specific categories based on its format and content.

[0009] "Conversion" refers to the process of changing one digital data format to another.

[0010] An "external database" refers to a data storage system located outside the system that stores a large amount of information about local areas and historical context.

[0011] "Local information" refers to information related to a specific geographical area, including its history and local characteristics.

[0012] "Historical context" refers to events and social conditions related to a particular era.

[0013] A "profile" refers to data that compiles basic information and background details about a user.

[0014] "Life's journey" refers to information that records a user's life history and important events in chronological order.

[0015] A "digital book" is a book or album created in electronic format that contains photographs and text.

[0016] "Cloud storage" is a remote digital storage service that allows you to save and manage data via the internet.

[0017] "Optimization" is the act of adjusting data or processing so that it performs best under specific conditions.

[0018] An "interface" refers to the point of contact between a user and a system for exchanging data and instructions.

Brief Description of the Drawings

[0019]
[Figure 1] This is a conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] This is a conceptual diagram showing an example of the essential functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] This is a conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] This is a conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] This is a conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] This is a conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] This is a conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] It is a conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 10] It shows an emotion map to which a plurality of emotions are mapped.
[Figure 11] It is a sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] It is a sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] It is a sequence diagram showing the processing flow of the data processing system in Example 2 when the emotion engine is combined.
[Figure 14] It is a sequence diagram showing the processing flow of the data processing system in Application Example 2 when the emotion engine is combined.

Mode for Carrying Out the Invention

[0020] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0021] First, the terms used in the following description will be explained.

[0022] In the following embodiments, a processor with a reference number (hereinafter simply referred to as "processor") may be one arithmetic unit or a combination of a plurality of arithmetic units. Further, the processor may be one type of arithmetic unit or a combination of a plurality of types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), an APU (Accelerated Processing Unit), and the like.

[0023] In the following embodiments, a RAM (Random Access Memory) with a reference number is a memory that temporarily stores information and is used as work memory by the processor.

[0024] In the following embodiments, a storage with a reference number is one or more non-volatile storage devices that store various programs and various parameters. Examples of non-volatile storage devices include flash memory (SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0025] In the following embodiments, a communication interface (I/F) with a reference number is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0026] In the following embodiments, "A and / or B" is synonymous with "at least one of A and B." That is, "A and / or B" means that it may be A alone, or B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and / or B" applies when expressing three or more things linked by "and / or."

[0027] [First Embodiment]

[0028] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0029] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0030] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0031] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0032] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0033] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and / or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0034] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0035] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0036] As shown in Figure 2, in the data processing device 12, a specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0037] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0038] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used in conjunction with a specific processing program 56 by the data processing system 10. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0039] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0040] To implement this invention, the process begins with the user uploading digital content to a server using their digital device (e.g., a smartphone or PC). The user selects digitized photos and videos and sends them to the system. In the case of analog content, it is digitized using a dedicated scanner or digitizing application and then uploaded to the server.

[0041] The server automatically analyzes the received digital data and classifies it based on its format. This classification is performed by utilizing content metadata to determine the date and time of capture, location, and type of content. Next, the server uses generative AI to convert data in different formats into a consistent format and optimize it. This is done to adjust the data according to the viewing environment and improve user convenience.
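The format-based classification described above can be illustrated with a minimal sketch. It assumes that the media type can be guessed from the file extension using Python's standard `mimetypes` module; the function name `classify_by_format` and the sample file names are hypothetical and do not come from the disclosure.

```python
import mimetypes
from collections import defaultdict


def classify_by_format(filenames):
    """Group file names by the top-level media type guessed from the extension."""
    groups = defaultdict(list)
    for name in filenames:
        mime, _ = mimetypes.guess_type(name)
        # "image/jpeg" -> "image"; unrecognized extensions go to "other".
        kind = mime.split("/")[0] if mime else "other"
        groups[kind].append(name)
    return dict(groups)


# Hypothetical upload batch used only for illustration.
uploads = ["beach.jpg", "party.png", "wedding.mp4", "notes.xyz123"]
print(classify_by_format(uploads))
```

A production system would more likely inspect file contents rather than trust extensions, but the grouping step would look similar.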

[0042] The server also communicates with external databases to retrieve additional information about local area and historical context. This information is linked to the user's profile and life journey and integrated into the digital book. As a result, the user's personal history and memories are compiled in a richer and more meaningful way. The generative AI naturally combines this information, taking care to maintain consistency in the visuals and narratives that the user desires.
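As a rough illustration of how supplementary information might be attached to content, the sketch below looks up a note by region and decade. The in-memory dictionary `EXTERNAL_DB` is only a stand-in for the external database, and its key scheme, the `Photo` type, and the placeholder note are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    name: str
    region: str
    year: int


# Hypothetical stand-in for an external database, keyed by (region, decade).
EXTERNAL_DB = {
    ("Kyoto", 1990): "Placeholder note about Kyoto in the 1990s.",
}


def lookup_context(photo, db=EXTERNAL_DB):
    """Return the note for the photo's region and decade, or None if absent."""
    decade = photo.year - photo.year % 10
    return db.get((photo.region, decade))


def annotate(photos, db=EXTERNAL_DB):
    """Pair each photo with whatever supplementary note was found for it."""
    return [(photo, lookup_context(photo, db)) for photo in photos]


for photo, note in annotate([Photo("temple.jpg", "Kyoto", 1994)]):
    print(photo.name, "->", note)
```

Returning `None` for missing entries lets the later generation step decide how to handle content for which no context was found.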

[0043] The generated digital books are securely stored in cloud storage and can be viewed at any time by users using a dedicated application or web browser. Users can add comments to the digital books or share specific pages with friends and family as needed.

[0044] For example, when a user uploads family photos and videos taken over many years, the images are organized chronologically, and AI adds supplementary information such as important events of that period and trivia about the locations where the photos were taken. With these digital books, users can not only look back on their own history but also preserve it as a valuable record to pass on to future generations.

[0045] The following describes the processing flow.

[0046] Step 1:

[0047] Users upload digital content (photos, videos, etc.) to the server using digital devices. At the same time, users scan and digitize analog content as needed and upload it in the same manner.

[0048] Step 2:

[0049] The server analyzes the received digital data and automatically identifies the file format (JPEG, PNG, MP4, etc.). Then, the server uses metadata (e.g., date and time of shooting, location) to classify and categorize the data.

[0050] Step 3:

[0051] The server uses generative AI to perform the necessary conversions and optimizations to standardize the format and file size of digital content. Specifically, it converts images and videos of different formats into a specified unified format and optimizes them according to the viewing environment.

[0052] Step 4:

[0053] The server queries an external database to retrieve regional and historical context information related to the uploaded content. This information is used in the generation process, adding value to the digital book.

[0054] Step 5:

[0055] The server uses generative AI to integrate the user's profile information, life history, and acquired supplementary information to generate a digital book. This process applies design templates and arranges photos and text harmoniously.

[0056] Step 6:

[0057] The generated digital books are stored in cloud storage by the server, allowing users to access them anytime, anywhere. Furthermore, automatic adjustments are made to ensure that the digital books remain in the correct format as the user's device is updated.

[0058] Step 7:

[0059] Users can access, view, and edit digital books through a dedicated application or web browser. They can also share specific pages and add new information and media.
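The disclosure does not specify how the automatic adjustment in Step 6 is carried out. One simple possibility, shown here only as an assumption, is to recompute image dimensions for the screen of the new device while preserving the aspect ratio. The function name `fit_within` and the screen sizes are hypothetical.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the screen, preserving aspect ratio."""
    # The 1.0 cap means images smaller than the screen are never upscaled.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 4000x3000 photo shown on a hypothetical 1920x1080 display.
print(fit_within(4000, 3000, 1920, 1080))
```

When the user switches to a device with a different screen, the same function can be called again with the new limits to derive a fresh rendition from the stored original.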

[0060] (Example 1)

[0061] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0062] As the volume of electronic data increases, there is a growing need to manage it efficiently and provide information in a format that is easily accessible to users. In particular, when diverse data formats or supplementary information are required, there is a lack of unified means to handle them, which impairs user convenience. In addition, providing data in a way that connects past information with present information is an important element for deepening user understanding, but there is a lack of effective means to achieve this.

[0063] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0064] In this invention, the server includes means for automatically classifying uploaded electronic data based on attributes, means for converting the classified electronic data into unified attributes, and means for obtaining location information and supplementary information such as historical context from an external database. This enables efficient management and provision of diverse data, and allows users to search and view electronic data in a unified format. Furthermore, by associating additional information with the user's characteristics and history, a deeper understanding and the provision of valuable information can be achieved.

[0065] "Electronic data" refers to a collection of information stored or transmitted in digital format, including photographs, videos, and documents.

[0066] "Attributes" refer to specific characteristics or properties of data, and include information that serves as a basis for classification and transformation.

[0067] An "external database" is a collection of information located outside the system, and is particularly used to provide supplementary information such as regional information and historical context.

[0068] "Supplemental information" refers to details and background information added to basic information, providing additional value and context to the data.

[0069] A "generative AI model" refers to a structure or data processing method generated by artificial intelligence, and is used in the creation of ebooks and other similar materials.

[0070] An "eBook" is a book presented in digital format, integrating text, images, and other media formats.

[0071] The "connection interface" refers to the interface through which a user interacts with the system, and includes means for displaying and manipulating information.

[0072] "Remote storage" refers to storage located in the cloud, providing a mechanism for securely and efficiently storing data.

[0073] To implement this invention, users need to transmit digital content to a server using their own electronic devices (such as smartphones or computers). Users select photos and videos they have saved and upload the data to the server via the internet. If analog data exists, it is digitized using a scanner or digitizing app and then transmitted to the server.

[0074] The server analyzes the received digital data and automatically classifies it based on its attributes. Digital analysis algorithms are used for metadata analysis, classifying the data by date and time of capture, location, and content. Image recognition technology may also be applied during this process. The classified data is then formatted into consistent attributes using a generative AI model. The generative AI model supports data format optimization and conversion, ensuring smooth display across different device environments.
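The metadata-based classification can be sketched as a grouping by capture month and location. The dictionary fields `name`, `taken_at`, and `place` are assumed for illustration; real metadata (for example EXIF) would have to be extracted first, and that extraction step is not shown.

```python
from collections import defaultdict
from datetime import datetime


def group_by_month_and_place(items):
    """Group item names by (capture month, place); missing places become 'unknown'."""
    groups = defaultdict(list)
    for item in items:
        taken = datetime.fromisoformat(item["taken_at"])
        key = (taken.strftime("%Y-%m"), item.get("place", "unknown"))
        groups[key].append(item["name"])
    # Sort groups and their members so the result is deterministic.
    return {key: sorted(names) for key, names in sorted(groups.items())}


# Hypothetical metadata records used only for illustration.
sample = [
    {"name": "b.jpg", "taken_at": "2023-08-02T10:00:00", "place": "beach"},
    {"name": "a.jpg", "taken_at": "2023-08-01T09:00:00", "place": "beach"},
    {"name": "c.mp4", "taken_at": "2022-12-24T18:30:00"},
]
print(group_by_month_and_place(sample))
```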

[0075] In parallel, the server collaborates with external information sources to obtain relevant location information and supplementary historical context. For example, it can add historical background and general knowledge about a region to a photograph taken in that area. This gives user-uploaded content a richer context.

[0076] Ultimately, the server utilizes a generative AI model to generate an ebook from the aggregated information. This ebook is customized to take into account the user's characteristics and history, ensuring visual and narrative consistency. The generated ebook is securely and conveniently stored in remote storage (cloud storage) and can be accessed from the user's electronic device via a dedicated interface.

[0077] As a concrete example, when a user uploads a folder titled "Summer Family Trip," the server analyzes and categorizes the photos within, supplementing them with information about landmarks and events at the travel destination. The resulting ebook becomes a rich resource for reminiscing about the trip. An example of a prompt used in this process might be, "Please add events related to this photo to create a visual story."
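To show how such a prompt might be assembled from classified metadata, the following sketch prepends the available details to the instruction quoted above. The function `build_prompt`, its parameters, and the sample location are hypothetical; the disclosure does not describe the actual prompt template.

```python
INSTRUCTION = "Please add events related to this photo to create a visual story."


def build_prompt(photo_title, place=None, taken_on=None):
    """Build a prompt from whichever metadata fields are available."""
    details = [f"Photo: {photo_title}"]
    if place:
        details.append(f"Location: {place}")
    if taken_on:
        details.append(f"Date: {taken_on}")
    return "\n".join(details + [INSTRUCTION])


# The location is an invented example value.
print(build_prompt("Summer Family Trip", place="Okinawa"))
```

Omitting fields that are unknown keeps the prompt from asserting details the metadata does not support.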

[0078] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0079] Step 1:

[0080] Users select digital content using electronic devices and send it to the server. Digital data such as photos and videos are prepared as input and uploaded to the server via an interface. If analog data exists, it is digitized using a scanner or digitizing application. The output is the digital data sent to the server.

[0081] Step 2:

[0082] The server analyzes the received digital data. Based on the input digital data, the server extracts metadata and automatically classifies the data by attributes such as the date and time of capture, location, and device information. Specifically, it uses image recognition technology to identify and classify the content (e.g., people, landscapes). The output is a list of the classified data.

[0083] Step 3:

[0084] The server converts classified digital data into a unified format using a generative AI model. It uses classified data as input and converts it to formats and resolutions suitable for different devices. Specifically, the generative AI model adjusts the size and resolution to ensure optimal display across different viewing environments. The output is the converted data in a unified format.

[0085] Step 4:

[0086] The server connects to external information sources to obtain supplementary information. Taking as input the location information and historical context derived from the classified data, it retrieves relevant information from external databases. Specifically, it adds information such as geographical and historical context to the data. The output is data with the added supplementary information.

[0087] Step 5:

[0088] The server utilizes a generative AI model to generate ebooks based on all acquired information. It uses supplementary data as input to design pages and structure content. Specifically, it constructs the ebook while considering user history and characteristics, maintaining visual and narrative consistency. The final ebook is then generated as output.

[0089] Step 6:

[0090] The server saves the generated ebooks to remote storage, making them accessible from the user's device. The generated ebooks are used as input and securely stored in cloud storage. Specifically, users can view and share the ebooks at any time using a dedicated application or web browser. As output, the ebooks are stored in a format accessible to the user.
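The page-structuring part of Step 5 can be sketched without any AI model as a chronological grouping of items into pages. This is a simplified stand-in, not the generative process the disclosure describes; the `Page` type, the one-page-per-year rule, and the field names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    heading: str
    images: list = field(default_factory=list)
    notes: list = field(default_factory=list)


def build_pages(items):
    """Create one page per year, in chronological order, from annotated items."""
    pages = {}
    for item in sorted(items, key=lambda i: (i["year"], i["name"])):
        page = pages.setdefault(item["year"], Page(heading=str(item["year"])))
        page.images.append(item["name"])
        note = item.get("note")
        if note and note not in page.notes:  # avoid repeating the same note
            page.notes.append(note)
    return list(pages.values())


# Hypothetical annotated items used only for illustration.
book = build_pages([
    {"name": "trip.jpg", "year": 2023, "note": "Placeholder local note."},
    {"name": "school.jpg", "year": 1998},
])
print([page.heading for page in book])
```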

[0091] (Application Example 1)

[0092] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0093] There is a problem in that users cannot efficiently manage the diverse digital content they possess and utilize it as a consistent visual narrative, thus failing to fully extract the value from individual memories. In addition, there is no easy way for users to share the digital content they generate with family and friends, making it difficult to effectively share memories with others.

[0094] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0095] In this invention, the server includes means for automatically classifying uploaded information based on its attributes, means for converting the classified information into an integrated format, and means for adding narration and music to digital content using a generative AI model to create visual deliverables based on a storyline. This enables users to integrate diverse digital content and manage and share memories in a visually appealing way.

[0096] "Information" refers to digital data owned by a user, including multimedia content such as photographs and videos.

[0097] "Attributes" refer to characteristics and metadata related to content, such as the date and time of shooting, location, and type of content.

[0098] "Unified format" refers to a standard data structure or layout for organizing digital content as a coherent visual narrative.

[0099] A "generative AI model" refers to a system that uses artificial intelligence technology to automatically add relevant information, music, and narration to digital content.

[0100] A "storyline" refers to a framework for structuring content in a visual product as a narrative, either temporally or logically.

[0101] "Visual deliverables" refer to the final product created from the user's digital content in a format that is easy to understand visually.

[0102] The specific system for implementing this invention uses a program that can run on various devices. The server receives digital data uploaded by the user and activates a data classification module, which automatically classifies the digital data based on its attributes. The classified data is converted into an integrated format through a format conversion module, after which a generative AI model begins operation. The generative AI model uses natural language processing technology to automatically add voice narration and background music to the digital content, creating a visual output.
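A storyline in the sense defined above can be approximated by ordering clips in time and assigning each a narration slot. The sketch below assumes a fixed duration per clip and captions that are already available; in the described system the narration text would come from the generative AI model, which is not modeled here. All names are hypothetical.

```python
def build_storyline(clips, seconds_per_clip=4.0):
    """Order clips by capture time and give each a fixed-length narration slot."""
    timeline = []
    start = 0.0
    for clip in sorted(clips, key=lambda c: c["taken_at"]):
        timeline.append({
            "start": start,
            "end": start + seconds_per_clip,
            "name": clip["name"],
            "narration": clip["caption"],
        })
        start += seconds_per_clip
    return timeline


# Hypothetical clips; ISO 8601 timestamps sort correctly as plain strings.
clips = [
    {"name": "sunset.jpg", "taken_at": "2023-08-02T18:40:00", "caption": "Evening on the shore."},
    {"name": "arrival.jpg", "taken_at": "2023-08-01T09:15:00", "caption": "Arriving at the beach."},
]
for segment in build_storyline(clips):
    print(segment["start"], segment["name"], segment["narration"])
```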

[0103] The server utilizes cloud services to perform these processes and employs open-source libraries (e.g., TensorFlow® and PyTorch) for data processing. It also uses cloud services such as AWS® and Google® Cloud for storage. On the user's device, a specific application receives these visual artifacts, allowing them to be displayed, edited, and shared with others through an interface.

[0104] As a concrete example, when a user uploads photos from their summer vacation to the system, the server analyzes the photos and generates a video with narration introducing the shooting locations and summer events. This video is delivered to the user's application via cloud storage, and the user can share the video with family and friends through the app.

[0105] An example of a prompt for a generative AI model would be: "Analyze the data captured by the user and provide narration with historical and geographical information related to the data. Specifically, add information about the history and climate of the beach mentioned as 'a trip to a beach visited in the summer of 2023.'"

[0106] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0107] Step 1:

[0108] Users select digital content from their smartphones or PCs and upload it to the server. The input data consists of photos and videos, and the server receives raw data as output.

[0109] Step 2:

[0110] The server analyzes the metadata of the received data (e.g., date and time of capture, location) and automatically classifies the data based on its attributes. This process utilizes information from a database and employs an attribute clustering algorithm. The output is digital data classified by attribute.

[0111] Step 3:

[0112] The server converts the classified digital data into a unified format. Here, it converts different media formats into a standard data format (e.g., MP4 or JPEG) for unified processing. The input is the classified digital data, and the output is the unified format data.

[0113] Step 4:

[0114] The server uses a generative AI model to add narration and music to digital content. This process employs natural language generation and speech synthesis technologies to generate prompts based on the user's digital data. The input is integrated digital data, and the output is a visual artifact with added music and narration.

[0115] Step 5:

[0116] The generated visual artifacts are saved to cloud storage by the server. Services such as AWS and Google Cloud are used, and the data is managed with backups and security measures in place. The output is the visual artifact stored in the cloud.

[0117] Step 6:

[0118] The user's device downloads visual artifacts from cloud storage and displays them within a dedicated application. Here, the device's GUI presents the user interface through which the artifact is output. The user can then share the artifact with family and friends. The input is the visual artifact in the cloud, and the output is the artifact displayed on the device.
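The unification in Step 3 of this flow names MP4 and JPEG as example target formats. A conversion plan along those lines might be sketched as follows. The extension sets and the function `conversion_plan` are assumptions, and the actual transcoding, which would need an image or video library, is deliberately left out.

```python
from pathlib import PurePath

# Assumed extension sets; a real system would likely inspect file contents.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff"}
VIDEO_EXTS = {".mp4", ".mov", ".avi", ".mkv"}
ALREADY_UNIFIED = {".jpg", ".jpeg", ".mp4"}


def conversion_plan(filenames):
    """Return (source, target) pairs; target is None for unsupported files."""
    plan = []
    for name in filenames:
        ext = PurePath(name).suffix.lower()
        if ext in IMAGE_EXTS:
            target_ext = ".jpg"
        elif ext in VIDEO_EXTS:
            target_ext = ".mp4"
        else:
            plan.append((name, None))
            continue
        if ext in ALREADY_UNIFIED:
            plan.append((name, name))  # already in a target format
        else:
            plan.append((name, str(PurePath(name).with_suffix(target_ext))))
    return plan


print(conversion_plan(["a.png", "b.MOV", "c.jpg", "d.txt"]))
```

Separating the plan from the conversion itself makes it easy to review or log what will change before any file is rewritten.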

[0119] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0120] This invention combines a system for organizing and utilizing digital content with an emotion engine that recognizes user emotions and reflects them in content suggestions.

[0121] Users upload digital content such as photos and videos to the server using their own digital devices. If they have analog content, they scan it, digitize it, and upload it to the server as well.

[0122] The server analyzes the uploaded digital data and automatically classifies it based on its format. Generative AI is used to convert and optimize the format and file size of the digital content. Then, supplementary information such as regional and historical context is retrieved from an external database, and this information is integrated with the user's profile and life story. This process generates a digital book.

[0123] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. This estimate is used in the digital book generation process to suggest content tailored to the user's emotions. For example, if the engine determines that the user is feeling nostalgic, it suggests specific photos and layouts that evoke that emotion.

[0124] The generated digital books are securely stored in cloud storage. Users can access, view, and edit their digital books at any time through a dedicated application or web browser.

[0125] For example, when a user uploads photos and videos documenting their travel experiences, the server organizes them chronologically, adding historical context and related stories about the places visited. Furthermore, when the emotion engine senses the user's level of excitement at the time of upload, it automatically adjusts the color scheme and content placement to enhance that excitement. Ultimately, the user can enjoy a personalized digital book that strongly reflects their emotions.

[0126] The following describes the processing flow.

[0127] Step 1:

[0128] Users select digital content such as photos and videos using their own digital devices and upload them to the server. If analog content is available, they use scanners or digital conversion functions to digitize it before uploading it.

[0129] Step 2:

[0130] The server analyzes the received data and identifies the file format of each piece of content (JPEG, PNG, MP4, etc.). Subsequently, the server uses metadata (date and time of capture, location information, etc.) to classify the content and organize it into categories.
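
As a minimal sketch of Step 2, the following assumes that the server sees the first bytes of each file and a capture timestamp taken from its metadata. The function names `sniff_format` and `categorize`, the dictionary keys, and the choice of grouping by capture month are hypothetical and are not specified in this disclosure.

```python
from collections import defaultdict
from datetime import datetime

# Leading "magic" bytes for two of the formats named in the text.
MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def sniff_format(head: bytes) -> str:
    """Guess the file format from the first bytes of a file."""
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    # MP4 files typically carry the box type "ftyp" at byte offset 4.
    if head[4:8] == b"ftyp":
        return "MP4"
    return "UNKNOWN"


def categorize(items):
    """Group item names by (format, capture month). Keys are hypothetical."""
    groups = defaultdict(list)
    for item in items:
        fmt = sniff_format(item["head"])
        month = datetime.fromisoformat(item["captured_at"]).strftime("%Y-%m")
        groups[(fmt, month)].append(item["name"])
    return dict(groups)


# Illustrative input only; the byte strings are shortened file headers.
items = [
    {"name": "a.jpg", "head": b"\xff\xd8\xff\xe0" + b"\x00" * 8,
     "captured_at": "2024-08-03T10:15:00"},
    {"name": "b.png", "head": b"\x89PNG\r\n\x1a\n" + b"\x00" * 4,
     "captured_at": "2024-08-20T09:00:00"},
    {"name": "c.mp4", "head": b"\x00\x00\x00\x18ftypmp42",
     "captured_at": "2024-09-01T18:30:00"},
]
categories = categorize(items)
```

Location information could be added to the grouping key in the same way as the capture month.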

[0131] Step 3:

[0132] The server uses generative AI to convert data in different formats into a unified format. At the same time, it optimizes the resolution and file size to suit the user's terminal environment, enabling efficient data management.

[0133] Step 4:

[0134] The server connects to an external database to retrieve local information and historical background data related to the uploaded content. This information, along with the user's profile information and life story, is incorporated into the creation of the digital book.

[0135] Step 5:

[0136] The emotion engine activates and analyzes the user's emotions based on past user behavior data and current interactions. Based on this, the server generates recommendations tailored to the user's emotions and creates a digital book that reflects them.
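
The disclosure does not specify how past behavior and current interactions are combined. As one possible sketch, the following assumes that each source has already been reduced to per-emotion scores between 0 and 1 and blends them with a fixed weight. The function `estimate_emotion`, the weight of 0.6, and the emotion labels are assumptions made for illustration.

```python
def estimate_emotion(past, current, current_weight=0.6):
    """Blend past and current per-emotion scores and return the top label.

    `past` and `current` map emotion labels to scores in [0, 1].
    The weighting scheme is an assumption, not taken from the disclosure.
    """
    labels = set(past) | set(current)
    blended = {
        label: (1 - current_weight) * past.get(label, 0.0)
        + current_weight * current.get(label, 0.0)
        for label in labels
    }
    # Sorting first makes ties resolve alphabetically and deterministically.
    top = max(sorted(blended), key=blended.get)
    return top, blended


# Illustrative scores only.
past_scores = {"nostalgia": 0.7, "excitement": 0.2}
current_scores = {"nostalgia": 0.5, "excitement": 0.4}
label, scores = estimate_emotion(past_scores, current_scores)
```

The returned label can then drive the recommendations described in this step.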

[0137] Step 6:

[0138] The completed digital book is stored in cloud storage by the server. Users can access this digital book from their devices. Furthermore, they can add new information and media, and share it with family and friends.

[0139] Step 7:

[0140] Users view and edit digital books using a dedicated application or web browser. The emotional feedback users provide to the system is continuously stored in the emotion engine as learning data and used to suggest content for future use.

[0141] (Example 2)

[0142] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0143] There is a need to efficiently organize the vast amounts of digital content data and provide personalized content suggestions based on the user's emotional state and characteristics. Furthermore, it is essential that the generated digital deliverables are always stored in an up-to-date state and are consistently accessible across multiple devices. Additionally, an interface that allows users to easily share and add information through sharing functions is crucial.

[0144] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0145] This invention includes a server that automatically classifies uploaded digital data according to its format, a server that utilizes generative AI technology to convert and optimize the classified digital data into a unified format, and a server that acquires regional information and historical contextual information from an external storage device. This enables users to efficiently manage vast amounts of digital content and receive personalized content suggestions based on their emotions and characteristics. Furthermore, the generated digital output is always stored in an up-to-date state, providing an environment in which users can consistently access and share it across multiple devices.

[0146] "Uploaded digital data" refers to electronic information such as photos, videos, and audio sent by users from their information terminals to a server.

[0147] "Generative AI technology" is a technique that uses artificial intelligence to automatically perform data format transformation and optimization, and it is a technology that utilizes machine learning models.

[0148] An "external storage device" is an information source that allows a server to access and obtain supplementary information such as regional information and historical context.

[0149] "Digital deliverables" refer to digital books or similar content in electronic format that are generated by a server and provided to the user.

[0150] "Emotional state" refers to the user's feelings at a given time, estimated based on their past actions and interactions.

[0151] An "information terminal" is a device such as a computer or smartphone that a user uses to view, edit, and upload digital data.

[0152] This system is designed to help users efficiently organize and utilize digital content. Users upload digital data such as photos and videos from their information terminals to the server. For analog content, they digitize it using the terminal's scanner before uploading. The server receives this data and performs analysis and optimization using a generative AI model. The generative AI model may be implemented with machine learning frameworks such as TensorFlow and PyTorch.

[0153] The server automatically classifies uploaded data based on its format. This process utilizes machine learning clustering algorithms and image recognition technologies. After classification, the data is converted into a unified format using generative AI and optimized for size. The optimized data is integrated with regional and historical contextual supplementary information collected via external storage devices. The server then generates and delivers a digital book as a digital output.

[0154] The emotion engine estimates the user's emotional state based on their past behavior and current interactions. This estimate is reflected in the content presented by the server, providing a personalized experience. The IBM Watson® Sentiment Analysis API may be used for the analysis.

[0155] The generated digital books are securely stored in cloud storage and can be accessed at any time from the user's device. The digital books can be viewed and edited through a dedicated application or web browser.

[0156] As a concrete example, a user uploads photos and videos from their trip to a server, from which a digital book containing historical information and stories about the places they visited is generated. This digital book's color scheme and layout are optimally adjusted based on an emotion engine that analyzes the user's state of excitement.

[0157] An example of a prompt might be, "Analyze your emotions during your trip, gather information about your destinations, and compile it into a digital book. Include a suggested color palette, especially to express your excitement." This allows users to obtain a personalized digital product that strongly reflects their emotions and experiences.

[0158] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0159] Step 1:

[0160] Users upload digital content such as photos and videos to the server using their information terminals. This serves as input, and the server receives the digital content. Specifically, the HTTP protocol is used to send the files. As output, the server saves the received data to its storage.
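
The text states only that HTTP is used to send the files. As a minimal sketch, the following builds a `multipart/form-data` POST request with the Python standard library without sending it. The endpoint URL, the form field name `file`, and the helper `build_upload_request` are hypothetical.

```python
import urllib.request
import uuid


def build_upload_request(url, filename, payload, content_type):
    """Build (but do not send) a multipart/form-data POST carrying one file."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        payload,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# Hypothetical endpoint; the payload is a shortened JPEG header.
request = build_upload_request(
    "https://example.invalid/upload", "trip.jpg", b"\xff\xd8\xff", "image/jpeg"
)
# urllib.request.urlopen(request) would transmit the file to a real server.
```

On the server side, the received body would be written to storage, which corresponds to the output of this step.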

[0161] Step 2:

[0162] The server analyzes uploaded digital data and automatically classifies it based on its format. The input is uploaded digital data, and image recognition and metadata analysis are performed. Specifically, image features are extracted using OpenCV, and the classified results are output.

[0163] Step 3:

[0164] The server applies a generative AI model to classified digital data, performing format conversion and size optimization. The input is classified digital data, and format conversion and compression are performed using tools such as TensorFlow and PyTorch. The output is optimized data.

[0165] Step 4:

[0166] The server retrieves regional information and supplementary information about the historical context from external storage devices. The input here is classified digital data, and the process involves retrieving appropriate information via external APIs. Specifically, a REST API is used to retrieve destination-related information, and integrated data is output.
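
As a sketch of how the REST retrieval in this step might look, the following builds a query URL from an item's location and year and merges a JSON response into the item. The endpoint, the query parameters, and the response fields `region` and `history` are assumptions, and the response shown is a made-up sample rather than output from a real service.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.example.invalid/v1/places"  # hypothetical endpoint


def build_context_query(lat, lon, year):
    """Return the GET URL used to ask for regional and historical context."""
    query = urlencode({"lat": lat, "lon": lon, "year": year})
    return f"{BASE_URL}?{query}"


def integrate(item, response_text):
    """Merge the (assumed) JSON response fields into a copy of the item."""
    context = json.loads(response_text)
    return {
        **item,
        "region": context.get("region"),
        "history": context.get("history", []),
    }


item = {"name": "a.jpg", "lat": 35.0116, "lon": 135.7681, "year": 2024}
url = build_context_query(item["lat"], item["lon"], item["year"])

# Made-up sample response standing in for the external service.
sample_response = '{"region": "Kyoto", "history": ["Former imperial capital"]}'
integrated = integrate(item, sample_response)
```

The dictionary returned by `integrate` plays the role of the integrated data output by this step.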

[0167] Step 5:

[0168] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. The input is user behavior history data, which is processed using tools such as the IBM Watson Sentiment Analysis API. The output is the estimated emotional state, which is then used in the next step of the process.

[0169] Step 6:

[0170] The server generates personalized digital artifacts based on the estimated emotional state. Both auxiliary and emotional information are input for digital book generation, and color tones and layouts appropriate to the user's emotions are applied. The output is the final digital book provided to the user.
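
How an emotional state maps to color tones and layouts is not detailed in the text. One simple possibility is a lookup table with a neutral fallback, sketched below. The theme names, color codes, and layout names are illustrative assumptions.

```python
# Hypothetical themes keyed by estimated emotion.
THEMES = {
    "excitement": {"palette": ["#FF6B35", "#FFD23F"], "layout": "full-bleed"},
    "nostalgia": {"palette": ["#C9A66B", "#F4E9D8"], "layout": "scrapbook"},
}
DEFAULT_THEME = {"palette": ["#FFFFFF", "#333333"], "layout": "grid"}


def choose_theme(emotion):
    """Return the theme for an emotion, or a neutral default if unknown."""
    return THEMES.get(emotion, DEFAULT_THEME)


theme = choose_theme("excitement")
```

A generative AI model could replace the table, with the estimated emotion passed to it as part of the prompt.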

[0171] Step 7:

[0172] The generated digital book is securely stored in cloud storage. The input is the digital book itself, and the output is the storage location information on the cloud. Users can access this digital book through a dedicated app or web browser, and further content updates are possible by viewing and editing it.
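
The form of the storage location information is not specified. As a sketch under that caveat, the following derives a storage key from the user identifier and a SHA-256 hash of the book's bytes, so that identical content always maps to the same location. The bucket name, the key layout, and the `.book` suffix are hypothetical.

```python
import hashlib


def storage_location(user_id, book_bytes, bucket="digital-books"):
    """Derive a deterministic storage key from the book's content."""
    digest = hashlib.sha256(book_bytes).hexdigest()
    return f"{bucket}/{user_id}/{digest[:16]}.book"


location = storage_location("user-001", b"example book contents")
```

Because the key depends on the content, an edited book receives a new key, which is one way to keep earlier versions available.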

[0173] (Application Example 2)

[0174] Next, we will describe Application Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0175] With the increasing volume of digital content, there is a growing need to organize and effectively utilize personalized content that responds to the emotions of individual users. Traditional methods have struggled to suggest content that reflects the emotional state of users, making it difficult to generate electronic documents that strongly convey individual experiences.

[0176] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0177] In this invention, the server includes means for automatically classifying uploaded electronic information according to its format, means for converting the classified electronic information into a unified format, means for acquiring regional information and auxiliary information such as temporal background from an external storage device, and means for analyzing the user's emotional state using an emotion engine and selecting and editing content. This makes it possible to generate personalized electronic documents that are in line with the user's emotions.

[0178] "Electronic information" refers to all data expressed in digital format, and includes a variety of media such as audio, video, text, and images.

[0179] "Format" refers to attributes that indicate the structure, type, and representation of data, and is an indicator of how data is organized and processed.

[0180] "External storage devices" refer to external databases and cloud storage accessed via the internet or other connections, allowing data to be retrieved from a variety of sources.

[0181] "Local information" refers to data related to a specific geographical area, including information such as geographical names and locations, and cultural and social backgrounds.

[0182] "Temporal context" refers to information about a specific time or era, including historical events, historical background, and characteristic events of each era.

[0183] "User" refers to an individual or legal entity that uses the present invention and is a person who receives the services provided by this system.

[0184] "Personal data" refers to information about an individual, such as the user's name, address, age, and occupation, and includes information that can identify the user.

[0185] "Life history" refers to information about a user's past behavior, experiences, and activities, including their personal history and life events.

[0186] "Electronic documents" refer to books and documents generated in digital format, and are publications that integrate multiple media, including text, images, and videos.

[0187] An "emotion engine" refers to a system or software that analyzes and interprets a user's emotional state and has the function of adjusting content according to the user's emotions.

[0188] "Electronic devices" refer to equipment used for processing, storing, and displaying electronic information, and include smartphones, tablets, and personal computers.

[0189] The system that implements this application operates by uploading digital information to the cloud through an application installed on a user's device, such as a smartphone or tablet. The uploaded electronic information is automatically classified by the server based on its format and converted into a unified format. Furthermore, auxiliary information such as regional information and temporal context is retrieved from an external storage device and used together with the electronic information.

[0190] The server is equipped with an emotion engine that analyzes the user's emotional state based on their past behavior. For this analysis, an emotion analysis library and generative AI models are used. Based on the emotional data analyzed by the emotion engine, content appropriate to the user is selected and edited. As a result, personalized electronic documents are generated.

[0191] The generated electronic documents are stored in network storage and provided in a format viewable on the user's device. This system allows users to obtain a personalized experience tailored to their own emotions.

[0192] As a concrete example, when a user uploads photos or videos taken in a park on a holiday to the application, the system analyzes the feelings of happiness and relaxation experienced at that time. Based on this, it can generate a story-driven video with calming music playing in the background. An appropriate prompt would be, "Please generate an emotional and calming video story that evokes a relaxed mood using photos and videos taken in a park on a holiday."

[0193] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0194] Step 1:

[0195] The user uploads electronic information such as photos and videos to the cloud using their device. The input for this step is the digital data captured by the user, and the output is the data stored in cloud storage. The file structure is modified based on the user's actions.

[0196] Step 2:

[0197] The server automatically classifies electronic information stored in the cloud according to its format. The input is uploaded digital data, and the output is a dataset classified by category. The server uses machine learning algorithms to analyze the data format and assign it to the appropriate category.

[0198] Step 3:

[0199] The server converts the classified data into a unified format. The input for this step is classified digital data, and the output is the converted data in a unified format. The server uses a data conversion tool to convert data in different formats into compatible formats.

[0200] Step 4:

[0201] The server retrieves regional information and temporal contextual information from external storage. The input consists of location information and timestamps contained within the data, while the output is this information along with related auxiliary data. The server retrieves this information by calling an external API.

[0202] Step 5:

[0203] The server analyzes the user's emotional state using an emotion engine. The input is the user's past behavioral data, and the output is the estimated emotional state. The server utilizes an emotion analysis library and a generative AI model to estimate the user's emotions.

[0204] Step 6:

[0205] The server selects and edits content based on emotional states. The input is a dataset containing estimated emotional states and supplementary information, and the output is a personalized electronic document. The server uses an editing algorithm to select appropriate elements that highlight the emotions and generate a digital book.
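
The editing algorithm itself is not described. As one possible sketch, the following assumes that each item already carries per-emotion scores and selects the items that score highest for the estimated emotion. The field name `emotion_scores` and the function `select_content` are hypothetical.

```python
def select_content(items, emotion, limit=2):
    """Return the names of the items that best match the given emotion."""
    ranked = sorted(
        items,
        key=lambda item: item["emotion_scores"].get(emotion, 0.0),
        reverse=True,
    )
    return [item["name"] for item in ranked[:limit]]


# Illustrative items and scores only.
park_items = [
    {"name": "picnic.jpg", "emotion_scores": {"relaxed": 0.9, "excited": 0.1}},
    {"name": "race.mp4", "emotion_scores": {"relaxed": 0.2, "excited": 0.8}},
    {"name": "pond.jpg", "emotion_scores": {"relaxed": 0.7}},
]
selected = select_content(park_items, "relaxed")
```

The selected items would then be arranged into the electronic document together with the supplementary information.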

[0206] Step 7:

[0207] The server stores the generated electronic documents in network storage and prepares them in a format that can be provided to the user's terminal. The input is a personalized electronic document, and the output is a digital book in a format viewable by the user. The server uses prompts to perform necessary format conversions.

[0208] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0209] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives as input a prompt containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions in the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
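
The pairing of a prompt with inference data can be pictured as a small request structure. The sketch below is an assumption made for illustration and does not reproduce the request format of ChatGPT, Gemini, or any other real service. The class `InferenceRequest` and its fields are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InferenceRequest:
    """A prompt (instruction) bundled with optional inference data."""
    instruction: str
    text: str = ""
    image_paths: list = field(default_factory=list)
    audio_paths: list = field(default_factory=list)

    def to_payload(self):
        """Flatten the request into an ordered list of typed parts."""
        parts = [{"type": "instruction", "value": self.instruction}]
        if self.text:
            parts.append({"type": "text", "value": self.text})
        parts += [{"type": "image", "value": p} for p in self.image_paths]
        parts += [{"type": "audio", "value": p} for p in self.audio_paths]
        return parts


inference_request = InferenceRequest(
    instruction="Please add events related to this photo to create a visual story.",
    text="Summer Family Trip",
    image_paths=["beach.jpg"],
)
payload = inference_request.to_payload()
```

The model's output would come back in a data format such as text data or audio data, as described above.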

[0210] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0211] [Second Embodiment]

[0212] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0213] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0214] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0215] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0216] The microphone 238 receives instructions from the user 20 by capturing the user's voice. The microphone 238 converts the captured voice into audio data and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.

[0217] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0218] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0219] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0220] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0221] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0222] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0223] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0224] To implement this invention, the process begins with the user uploading digital content to a server using their digital device (e.g., a smartphone or PC). The user selects digitized photos and videos and sends them to the system. In the case of analog content, it is digitized using a dedicated scanner or digitizing application and then uploaded to the server.

[0225] The server automatically analyzes the received digital data and classifies it based on its format. This classification is performed by utilizing content metadata to determine the date and time of capture, location, and type of content. Next, the server uses generative AI to convert data in different formats into a consistent format and optimize it. This is done to adjust the data according to the viewing environment and improve user convenience.

[0226] The server also communicates with external databases to retrieve additional information about local area and historical context. This information is linked to the user's profile and life journey and integrated into the digital book. As a result, the user's personal history and memories are compiled in a richer and more meaningful way. The generative AI naturally combines this information, taking care to maintain consistency in the visuals and narratives that the user desires.

[0227] The generated digital books are securely stored in cloud storage and can be viewed at any time by users using a dedicated application or web browser. Users can add comments to the digital books or share specific pages with friends and family as needed.

[0228] For example, when a user uploads family photos and videos taken over many years, the images are organized chronologically, and AI adds supplementary information such as important events from that period and trivia about the locations where the photos were taken. With these digital books, users can not only look back on their own history but also preserve it as a valuable record to pass on to future generations.

[0229] The following describes the processing flow.

[0230] Step 1:

[0231] Users upload digital content (photos, videos, etc.) to the server using digital devices. At the same time, users scan and digitize analog content as needed and upload it in the same manner.

[0232] Step 2:

[0233] The server analyzes the received digital data and automatically identifies the file format (JPEG, PNG, MP4, etc.). Then, the server uses metadata (e.g., date and time of shooting, location) to classify and categorize the data.

[0234] Step 3:

[0235] The server uses generative AI to perform the conversions and optimizations needed to standardize the format and file size of digital content. Specifically, it converts images and videos of different formats into a specified unified format and optimizes them according to the viewing environment.

[0236] Step 4:

[0237] The server queries an external database to retrieve regional and historical context information related to the uploaded content. This information is used in the generation process, adding value to the digital book.

[0238] Step 5:

[0239] The server uses generative AI to integrate the user's profile information, life history, and acquired supplementary information to generate a digital book. This process applies design templates and arranges photos and text harmoniously.

[0240] Step 6:

[0241] The generated digital books are stored in cloud storage by the server, allowing users to access them anytime, anywhere. Furthermore, automatic adjustments are made to ensure that the digital books remain in the correct format as the user's device is updated.

[0242] Step 7:

[0243] Users can access, view, and edit digital books through a dedicated application or web browser. They can also share specific pages and add new information and media.

[0244] (Example 1)

[0245] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0246] As the volume of electronic data increases, there is a growing need to manage it efficiently and provide information in a format that is easily accessible to users. In particular, when diverse data formats or supplementary information are required, there is a lack of unified means to handle them, which impairs user convenience. In addition, providing data in a way that connects past information with present information is an important element for deepening user understanding, but there is a lack of effective means to achieve this.

[0247] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0248] In this invention, the server includes means for automatically classifying uploaded electronic data based on attributes, means for converting the classified electronic data into unified attributes, and means for obtaining location information and supplementary information such as historical context from an external database. This enables efficient management and provision of diverse data, and allows users to search and view electronic data in a unified format. Furthermore, by associating additional information with the user's characteristics and history, a deeper understanding and the provision of valuable information can be achieved.

[0249] "Electronic data" refers to a collection of information stored or transmitted in digital format, including photographs, videos, and documents.

[0250] "Attributes" refer to specific characteristics or properties of data, and include information that serves as a basis for classification and transformation.

[0251] An "external database" is a collection of information located outside the system, and is particularly used to provide supplementary information such as regional information and historical context.

[0252] "Supplemental information" refers to details and background information added to basic information, providing additional value and context to the data.

[0253] A "generative AI model" refers to a structure or data processing method generated by artificial intelligence, and is used in the creation of ebooks and other similar materials.

[0254] An "eBook" is a book presented in digital format, integrating text, images, and other media formats.

[0255] The "connection interface" refers to the interface through which a user interacts with the system, and includes means for displaying and manipulating information.

[0256] "Remote storage" refers to storage located in the cloud, providing a mechanism for securely and efficiently storing data.

[0257] To implement this invention, users need to transmit digital content to a server using their own electronic devices (such as smartphones or computers). Users select photos and videos they have saved and upload the data to the server via the internet. If analog data exists, it is digitized using a scanner or digitizing app and then transmitted to the server.

[0258] The server analyzes the received digital data and automatically classifies it based on its attributes. Digital analysis algorithms are used for metadata analysis, classifying the data by date and time of capture, location, and content. Image recognition technology may also be applied during this process. The classified data is then formatted into consistent attributes using a generative AI model. The generative AI model supports data format optimization and conversion, ensuring smooth display across different device environments.

[0259] In parallel, the server collaborates with external information sources to obtain relevant location information and supplementary historical context. For example, it can add historical background and general knowledge about a region to a photograph taken in that area. This gives user-uploaded content a richer context.

[0260] Ultimately, the server utilizes a generative AI model to generate an ebook from the aggregated information. This ebook is customized to take into account the user's characteristics and history, ensuring visual and narrative consistency. The generated ebook is securely and conveniently stored in remote storage (cloud storage) and can be accessed from the user's electronic device via a dedicated interface.

[0261] As a concrete example, when a user uploads a folder titled "Summer Family Trip," the server analyzes and categorizes the photos within, supplementing them with information about landmarks and events at the travel destination. The resulting ebook becomes a rich resource for reminiscing about the trip. An example of a prompt used in this process might be, "Please add events related to this photo to create a visual story."

[0262] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0263] Step 1:

[0264] Users select digital content using electronic devices and send it to the server. Digital data such as photos and videos are prepared as input and uploaded to the server via an interface. If analog data exists, it is digitized using a scanner or digitizing application. The output is the digital data sent to the server.

[0265] Step 2:

[0266] The server analyzes the received digital data. Based on the input digital data, the server extracts metadata and automatically classifies the data by attributes such as the date and time of capture, location, and device information. Specifically, it uses image recognition technology to identify and classify the content (e.g., people, landscapes). The output is a list of the classified data.
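The metadata-based classification in this step can be sketched as follows. This is a minimal illustration, not the classification algorithm of the disclosure: the `MediaItem` record, its field names, the grouping key (capture month plus location), and the sample place names are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MediaItem:
    # Hypothetical record; the field names are illustrative, not from the source.
    name: str
    captured_at: datetime
    location: str
    device: str


def classify_by_metadata(items):
    """Group items by (capture month, location) and sort each group by time."""
    groups = defaultdict(list)
    for item in items:
        key = (item.captured_at.strftime("%Y-%m"), item.location)
        groups[key].append(item)
    return {key: sorted(group, key=lambda i: i.captured_at)
            for key, group in groups.items()}


# Made-up sample data.
items = [
    MediaItem("b.jpg", datetime(2023, 8, 2, 15, 0), "Okinawa", "phone"),
    MediaItem("a.jpg", datetime(2023, 8, 1, 9, 30), "Okinawa", "phone"),
    MediaItem("c.mp4", datetime(2023, 12, 24, 18, 0), "Tokyo", "camera"),
]
classified = classify_by_metadata(items)
```

Content-based classification (e.g., people versus landscapes) would add an image recognition result to the grouping key; that part is omitted here.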

[0267] Step 3:

[0268] The server converts classified digital data into a unified format using a generative AI model. It uses classified data as input and converts it to formats and resolutions suitable for different devices. Specifically, the generative AI model adjusts the size and resolution to ensure optimal display across different viewing environments. The output is the converted data in a unified format.
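The size and resolution adjustment described here can be illustrated with a plain aspect-ratio calculation. The sketch below assumes that "suitable for different devices" means fitting an image inside a per-device bounding box without upscaling; the function name `fit_within` and the `DEVICE_BOXES` table are hypothetical, and in the disclosure a generative AI model performs this adjustment.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, preserving aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    scale = min(max_width / width, max_height / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical viewing environments (names and sizes are illustrative).
DEVICE_BOXES = {"phone": (1080, 1920), "tablet": (2048, 1536)}

# Target sizes for a 4000x3000 source image on each device.
targets = {name: fit_within(4000, 3000, *box) for name, box in DEVICE_BOXES.items()}
```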

[0269] Step 4:

[0270] The server connects to external information sources to obtain supplementary information. The input is the location information and historical context derived from the classified data, and based on this input the server retrieves relevant information from external databases. Specifically, it adds information such as geographical and historical context to the data. The output is the data with the supplementary information added.

[0271] Step 5:

[0272] The server utilizes a generative AI model to generate an ebook based on all of the acquired information. It uses the data with the added supplementary information as input to design pages and structure content. Specifically, it constructs the ebook while considering the user's history and characteristics, maintaining visual and narrative consistency. The output is the final ebook.

[0273] Step 6:

[0274] The server saves the generated e-books to remote storage, making them accessible from the user's device. The generated e-books are used as input and securely stored in cloud storage. Specifically, users can view and share the e-books at any time using a dedicated application or web browser. As output, the e-books are stored in a format accessible to the user.

[0275] (Application Example 1)

[0276] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0277] There is a problem in that users cannot efficiently manage the diverse digital content they possess and utilize it as a consistent visual narrative, thus failing to fully extract the value from individual memories. In addition, there is no easy way for users to share the digital content they generate with family and friends, making it difficult to effectively share memories with others.

[0278] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0279] In this invention, the server includes means for automatically classifying the uploaded information based on attributes, means for converting the classified information into an integrated format, and means for adding narration or music to digital content using a generative AI model and creating a visual output based on a storyline. As a result, users can integrate various digital content and manage and share memories in a visually appealing form.

[0280] "Information" refers to digital-form data held by users and refers to multimedia content such as photos and videos.

[0281] "Attribute" refers to features or metadata related to content and refers to the shooting date and time, location, type of content, etc.

[0282] "Integrated format" refers to a standard data structure or layout for organizing digital content as a coherent visual story.

[0283] "Generative AI model" refers to a system that automatically adds relevant information, music, and narration to digital content using artificial intelligence technology.

[0284] "Storyline" refers to a framework for logically or temporally structuring content as a story in a visual output.

[0285] "Visual output" refers to the final product in a form that is easy to understand visually and is created based on the user's digital content.

[0286] The specific system for implementing this invention uses programs that can operate on various devices. The server receives digital data uploaded by users and activates the data classification module. Here, the digital data is automatically classified based on the attributes of the data. The classified data is converted into an integrated format through the format conversion module, and then the generative AI model starts operating. The generative AI model uses natural language processing technology to automatically add voice narration and background music to the digital content and compose visual outputs.

[0287] When performing these processes, the server utilizes cloud services and uses open-source libraries (e.g., TensorFlow and PyTorch) for data processing. For storage, cloud services such as AWS and Google Cloud are used. On the user's terminal, a specific application can receive these visual outputs and enables display, editing, and sharing with others through the interface.

[0288] As a specific example, when a user uploads photos of a summer vacation to the system, the server analyzes the photos and generates a video with a narration introducing the shooting location and events related to summer. This video is delivered to the user's application via cloud storage, and the user can share the video with family and friends through the application.

[0289] Examples of prompt texts for the generative AI model are as follows. "Analyze the data captured by the user and provide narration on the historical background and geographical information related to the data. Specifically, add information about the history and climate of the beach, which can be cited as 'a trip to the beach visited in the summer of 2023'."
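A prompt of this kind could be assembled from the classified attributes rather than written by hand. The following is a minimal sketch under that assumption; the function `build_narration_prompt` and its parameters are hypothetical, and the wording only loosely follows the example prompt.

```python
def build_narration_prompt(description, topics):
    """Assemble a narration prompt from a content description and topic list."""
    topic_text = " and ".join(topics)
    return (
        "Analyze the data captured by the user and provide narration on the "
        f"{topic_text} related to the data. "
        f"The data can be described as '{description}'."
    )


prompt = build_narration_prompt(
    "a trip to the beach visited in the summer of 2023",
    ["historical background", "geographical information"],
)
```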

[0290] The flow of the specific processing in Application Example 1 will be explained using Figure 12.

[0291] Step 1:

[0292] Users select digital content from their smartphones or PCs and upload it to the server. The input is the photos and videos selected by the user, and the output is the raw data received by the server.

[0293] Step 2:

[0294] The server analyzes the metadata of the received data (e.g., date and time of capture, location) and automatically classifies the data based on its attributes. This process utilizes information from a database and employs an attribute clustering algorithm. The output is digital data classified by attribute.

[0295] Step 3:

[0296] The server converts the classified digital data into a unified format. Here, it converts different media formats into a standard data format (e.g., MP4 or JPEG) for unified processing. The input is the classified digital data, and the output is the unified format data.

[0297] Step 4:

[0298] The server uses a generative AI model to add narration and music to digital content. This process employs natural language generation and speech synthesis technologies to generate prompts based on the user's digital data. The input is the digital data in the integrated format, and the output is a visual output with added music and narration.

[0299] Step 5:

[0300] The server saves the generated visual output to cloud storage. Services such as AWS and Google Cloud are used, and the data is managed with backups and security measures in place. The output is the visual output stored in the cloud.

[0301] Step 6:

[0302] The user's terminal downloads the visual output from the cloud storage and displays it within a dedicated application. The terminal's GUI presents the visual output, which the user can share with family and friends. The input is the visual output stored in the cloud, and the output is the visual output displayed on the terminal.

[0303] Furthermore, an emotion engine for estimating the user's emotions may be combined. That is, the specific processing unit 290 may estimate the user's emotions using the emotion recognition model 59 and perform specific processing using the user's emotions.

[0304] In addition to a system for organizing and utilizing digital content, the present invention combines an emotion engine that recognizes the user's emotions and reflects them in content recommendations.

[0305] The user uploads digital content such as photos and videos to the server using their digital device. Also, if there is analog content, it is scanned and digitized and uploaded to the server in the same way.

[0306] The server analyzes the uploaded digital data and automatically classifies it based on the format. Using generative AI, the format and file size of the digital content are converted and optimized. Then, auxiliary information regarding regional information and the background of the era is acquired from an external database, and this information is integrated with the user's profile and life journey. Thereby, a digital book is generated.

[0307] The emotion engine analyzes the user's past actions and current interactions to estimate the emotional state. This information is utilized in the digital book generation process to make content recommendations tailored to the user's emotions. For example, if it is analyzed that the user is feeling nostalgic, specific photos or arrangements that enhance that emotion are recommended.

[0308] The generated digital books are securely stored in cloud storage. Users can access, view, and edit their digital books at any time through a dedicated application or web browser.

[0309] For example, when a user uploads photos and videos documenting their travel experiences, the server organizes them chronologically, adding historical context and related stories about the places visited. Furthermore, when the emotion engine senses the user's level of excitement at the time of upload, it automatically adjusts the color scheme and content placement to enhance that excitement. Ultimately, the user can enjoy a personalized digital book that strongly reflects their emotions.

[0310] The following describes the processing flow.

[0311] Step 1:

[0312] Users select digital content such as photos and videos using their own digital devices and upload them to the server. If analog content is available, they use scanners or digital conversion functions to digitize it before uploading it.

[0313] Step 2:

[0314] The server analyzes the received data and identifies the file format of each piece of content (JPEG, PNG, MP4, etc.). Subsequently, the server uses metadata (date and time of capture, location information, etc.) to classify the content and organize it into categories.
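File-format identification of the kind described in this step is commonly done by inspecting the leading bytes of a file rather than trusting its extension. The sketch below assumes that approach, which the source does not specify; the function name `sniff_format` is hypothetical, while the byte signatures are the standard ones for JPEG, PNG, and ISO base media files.

```python
def sniff_format(header: bytes) -> str:
    """Identify JPEG, PNG, or MP4 from the first bytes of a file."""
    if header.startswith(b"\xff\xd8\xff"):
        return "JPEG"
    if header.startswith(b"\x89PNG\r\n\x1a\n"):
        return "PNG"
    # ISO base media files (including MP4) carry the box type 'ftyp' at offset 4.
    # This also matches related containers such as MOV or HEIC; telling them
    # apart would require checking the brand that follows 'ftyp'.
    if len(header) >= 8 and header[4:8] == b"ftyp":
        return "MP4"
    return "UNKNOWN"


formats = [
    sniff_format(b"\xff\xd8\xff\xe0" + b"\x00" * 8),
    sniff_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8),
    sniff_format(b"\x00\x00\x00\x18ftypisom"),
    sniff_format(b"plain text"),
]
```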

[0315] Step 3:

[0316] The server activates a generative AI model to convert data in different formats into a unified format. At the same time, it optimizes the resolution and file size to suit the user's terminal environment, enabling efficient data management.

[0317] Step 4:

[0318] The server connects to an external database to retrieve local information and historical background data related to the uploaded content. This information, along with the user's profile information and life story, is incorporated into the creation of the digital book.

[0319] Step 5:

[0320] The emotion engine activates and analyzes the user's emotions based on past user behavior data and current interactions. Based on this, the server generates recommendations tailored to the user's emotions and creates a digital book that reflects them.
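How past behavior and current interactions might be combined into one estimate can be sketched with a weighted vote. This is only a stand-in for the emotion engine, whose internals the source does not describe: the label vocabulary, the `current_weight` parameter, and the tie-breaking rule are all assumptions.

```python
from collections import Counter


def estimate_emotion(history, current, current_weight=2.0):
    """Return the emotion label with the highest weighted count.

    history: labels inferred from past behavior.
    current: labels inferred from the current interaction (weighted higher).
    """
    scores = Counter()
    for label in history:
        scores[label] += 1.0
    for label in current:
        scores[label] += current_weight
    if not scores:
        return "neutral"
    # Highest score wins; ties resolve alphabetically so the result is stable.
    return min(scores, key=lambda label: (-scores[label], label))


state = estimate_emotion(["calm", "calm", "nostalgic"], ["nostalgic"])
```

Weighting the current interaction above the history lets a recent signal override an older pattern, which matches the nostalgic-user example given earlier.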

[0321] Step 6:

[0322] The completed digital book is stored in cloud storage by the server. Users can access this digital book from their devices. Furthermore, they can add new information and media, and share it with family and friends.

[0323] Step 7:

[0324] Users view and edit digital books using a dedicated application or web browser. The emotional feedback users provide to the system is continuously stored in the emotion engine as learning data and used to suggest content for future use.

[0325] (Example 2)

[0326] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0327] There is a need to efficiently organize the vast amounts of digital content data and provide personalized content suggestions based on the user's emotional state and characteristics. Furthermore, it is essential that the generated digital deliverables are always stored in an up-to-date state and are consistently accessible across multiple devices. Additionally, an interface that allows users to easily share and add information through sharing functions is crucial.

[0328] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0329] In this invention, the server automatically classifies uploaded digital data according to its format, utilizes generative AI technology to convert and optimize the classified digital data into a unified format, and acquires regional information and historical contextual information from an external storage device. This enables users to efficiently manage vast amounts of digital content and receive personalized content suggestions based on their emotions and characteristics. Furthermore, the generated digital output is always stored in an up-to-date state, providing an environment in which users can consistently access and share it across multiple devices.

[0330] "Uploaded digital data" refers to electronic information such as photos, videos, and audio sent by users from their information terminals to a server.

[0331] "Generative AI technology" is a technique that uses artificial intelligence to automatically perform data format transformation and optimization, and it is a technology that utilizes machine learning models.

[0332] An "external storage device" is an information source that allows a server to access and obtain supplementary information such as regional information and historical context.

[0333] "Digital deliverables" refer to digital books or similar content in electronic format that are generated by a server and provided to the user.

[0334] "Emotional state" refers to the user's feelings at a given time, estimated based on their past actions and interactions.

[0335] An "information terminal" is a device such as a computer or smartphone that a user uses to view, edit, and upload digital data.

[0336] This system is designed to help users efficiently organize and utilize digital content. Users upload digital data such as photos and videos from their information terminals to the server. For analog content, they digitize it using the terminal's scanner before uploading. The server receives this data and performs analysis and optimization using a generative AI model. Frameworks such as TensorFlow and PyTorch may be used to implement the generative AI model.

[0337] The server automatically classifies uploaded data based on its format. This process utilizes machine learning clustering algorithms and image recognition technologies. After classification, the data is converted into a unified format using generative AI and optimized for size. The optimized data is integrated with regional and historical contextual supplementary information collected via external storage devices. The server then generates and delivers a digital book as a digital output.

[0338] The emotion engine is used to estimate a user's emotional state based on their past behavior and current interactions. This information is reflected in the content presented by the server, providing a personalized experience. The IBM Watson Sentiment Analysis API may be used for sentiment analysis.

[0339] The generated digital books are securely stored in cloud storage and can be accessed at any time from the user's device. The digital books can be viewed and edited through a dedicated application or web browser.

[0340] As a concrete example, a user uploads photos and videos from their trip to a server, from which a digital book containing historical information and stories about the places they visited is generated. This digital book's color scheme and layout are optimally adjusted based on an emotion engine that analyzes the user's state of excitement.

[0341] An example of a prompt might be, "Analyze your emotions during your trip, gather information about your destinations, and compile it into a digital book. Include a suggested color palette, especially to express your excitement." This allows users to obtain a personalized digital product that strongly reflects their emotions and experiences.

[0342] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0343] Step 1:

[0344] Users upload digital content such as photos and videos to the server using their information terminals. This serves as input, and the server receives the digital content. Specifically, the HTTP protocol is used to send the files. As output, the server saves the received data to its storage.
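A file transfer over HTTP as described in this step is typically a multipart/form-data POST. The sketch below builds such a request with the Python standard library without sending it. The endpoint URL, the form field name `file`, and the function name `build_upload_request` are placeholders; the source states only that the HTTP protocol is used.

```python
import urllib.request
import uuid


def build_upload_request(url, filename, payload: bytes,
                         content_type="application/octet-stream"):
    """Build (but do not send) a multipart/form-data POST request for one file."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + payload + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# "https://example.com/upload" is a placeholder endpoint, not from the source.
request = build_upload_request(
    "https://example.com/upload", "beach.jpg", b"\xff\xd8\xff", "image/jpeg"
)
# Passing the request to urllib.request.urlopen would perform the transfer.
```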

[0345] Step 2:

[0346] The server analyzes uploaded digital data and automatically classifies it based on its format. The input is uploaded digital data, and image recognition and metadata analysis are performed. Specifically, image features are extracted using OpenCV, and the classified results are output.

[0347] Step 3:

[0348] The server applies a generative AI model to classified digital data, performing format conversion and size optimization. The input is classified digital data, and format conversion and compression are performed using tools such as TensorFlow and PyTorch. The output is optimized data.

[0349] Step 4:

[0350] The server retrieves regional information and supplementary information about the historical context from external storage devices. The input here is classified digital data, and the process involves retrieving appropriate information via external APIs. Specifically, a REST API is used to retrieve destination-related information, and integrated data is output.
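The REST call in this step can be illustrated by showing how a query might be built from the location and capture date of a classified item. The base URL and the parameter names `lat`, `lon`, and `date` are hypothetical; the source does not name a specific external API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_context_query(base_url, latitude, longitude, captured_on):
    """Return a GET URL asking for regional and historical context."""
    params = {
        "lat": f"{latitude:.4f}",
        "lon": f"{longitude:.4f}",
        "date": captured_on,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and made-up coordinates.
url = build_context_query(
    "https://example.com/api/context", 26.2124, 127.6809, "2023-08-01"
)
query = parse_qs(urlsplit(url).query)
```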

[0351] Step 5:

[0352] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. The input is user behavior history data, which is processed using tools such as the IBM Watson Sentiment Analysis API. The output is the estimated emotional state, which is then used in the next step of the process.

[0353] Step 6:

[0354] The server generates personalized digital artifacts based on the estimated emotional state. Both auxiliary and emotional information are input for digital book generation, and color tones and layouts appropriate to the user's emotions are applied. The output is the final digital book provided to the user.
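Applying color tones and layouts according to the estimated emotion can be sketched as a lookup from emotion label to page style. The table below is purely illustrative: the labels, color values, and layout names are assumptions, and in the disclosure this choice is made during generation of the digital book.

```python
# Hypothetical style table; labels, colors, and layouts are illustrative only.
STYLE_BY_EMOTION = {
    "excited": {"palette": ["#FF6B35", "#FFD23F"], "layout": "full-bleed"},
    "nostalgic": {"palette": ["#C8A97E", "#6B4F3A"], "layout": "framed"},
    "calm": {"palette": ["#A8DADC", "#457B9D"], "layout": "grid"},
}
DEFAULT_STYLE = {"palette": ["#FFFFFF", "#333333"], "layout": "grid"}


def style_for(emotion):
    """Look up the page style for an emotion label, with a neutral fallback."""
    return STYLE_BY_EMOTION.get(emotion.lower(), DEFAULT_STYLE)


def build_pages(items, emotion):
    """Attach the emotion-specific style to every content item."""
    style = style_for(emotion)
    return [{"content": item, **style} for item in items]


pages = build_pages(["a.jpg", "b.jpg"], "Excited")
```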

[0355] Step 7:

[0356] The generated digital book is securely stored in cloud storage. The input is the digital book itself, and the output is the storage location information on the cloud. Users can access this digital book through a dedicated app or web browser, and further content updates are possible by viewing and editing it.
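The storage location information output by this step could take the form of an object key derived from the book's content. The following sketch assumes a content-addressed naming scheme, which the source does not specify; the key layout, the `epub` extension, and the function name `storage_key` are hypothetical.

```python
import hashlib


def storage_key(user_id, book_bytes, extension="epub"):
    """Derive a deterministic object key from the user ID and book content."""
    digest = hashlib.sha256(book_bytes).hexdigest()
    return f"users/{user_id}/books/{digest[:16]}.{extension}"


key_v1 = storage_key("user-123", b"first edition")
key_v1_again = storage_key("user-123", b"first edition")
key_v2 = storage_key("user-123", b"second edition")
```

With this scheme, saving an unchanged book yields the same key, while an edited book yields a new key, so earlier versions are not overwritten.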

[0357] (Application Example 2)

[0358] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0359] With the increasing volume of digital content, there is a growing need to organize and effectively utilize personalized content that responds to the emotions of individual users. Traditional methods have struggled to suggest content that reflects the emotional state of users, making it difficult to generate electronic documents that strongly convey individual experiences.

[0360] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0361] In this invention, the server includes means for automatically classifying uploaded electronic information according to its format, means for converting the classified electronic information into a unified format, means for acquiring auxiliary information such as regional information and temporal context from an external storage device, and means for analyzing the user's emotional state using an emotion engine and selecting and editing content. This makes it possible to generate personalized electronic documents that are in line with the user's emotions.

[0362] "Electronic information" refers to all data expressed in digital format, and includes a variety of media such as audio, video, text, and images.

[0363] "Format" refers to attributes that indicate the structure, type, and representation of data, and is an indicator of how data is organized and processed.

[0364] "External storage devices" refer to external databases and cloud storage accessed via the internet or other connections, allowing data to be retrieved from a variety of sources.

[0365] "Regional information" refers to data related to a specific geographical area, including information such as geographical names and locations, and cultural and social backgrounds.

[0366] "Temporal context" refers to information about a specific time or era, including historical events, historical background, and characteristic events of each era.

[0367] "User" refers to an individual or legal entity that uses the present invention and is a person who receives the services provided by this system.

[0368] "Personal data" refers to information about an individual, such as the user's name, address, age, and occupation, and includes information that can identify the user.

[0369] "Life history" refers to information about a user's past behavior, experiences, and activities, including their personal history and life events.

[0370] "Electronic documents" refer to books and documents generated in digital format, and are publications that integrate multiple media, including text, images, and videos.

[0371] An "emotion engine" refers to a system or software that analyzes and interprets a user's emotional state and has the function of adjusting content according to the user's emotions.

[0372] "Electronic devices" refer to equipment used for processing, storing, and displaying electronic information, and include smartphones, tablets, and personal computers.

[0373] The system that implements this application operates by uploading digital information to the cloud through an application installed on a user's device, such as a smartphone or tablet. The uploaded electronic information is automatically classified by the server based on its format and converted into a unified format. Furthermore, auxiliary information such as regional information and temporal context is retrieved from an external storage device and used together with the electronic information.

[0374] The server is equipped with an emotion engine that analyzes the user's emotional state based on their past behavior. For this analysis, an emotion analysis library and generative AI models are used. Based on the emotional data analyzed by the emotion engine, content appropriate to the user is selected and edited. As a result, personalized electronic documents are generated.

[0375] The generated electronic documents are stored in network storage and provided in a format viewable on the user's device. This system allows users to obtain a personalized experience tailored to their own emotions.

[0376] As a concrete example, when a user uploads photos or videos taken in a park on a holiday to the application, the system analyzes the feelings of happiness and relaxation experienced at that time. Based on this, it can generate a story-driven video with calming music playing in the background. An appropriate prompt would be, "Please generate an emotional and calming video story that evokes a relaxed mood using photos and videos taken in a park on a holiday."

[0377] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0378] Step 1:

[0379] The user uploads electronic information such as photos and videos to the cloud using their device. The input for this step is the digital data captured by the user, and the output is the data stored in cloud storage. The file structure is modified based on the user's actions.

[0380] Step 2:

[0381] The server automatically classifies electronic information stored in the cloud according to its format. The input is uploaded digital data, and the output is a dataset classified by category. The server uses machine learning algorithms to analyze the data format and assign it to the appropriate category.

[0382] Step 3:

[0383] The server converts the classified data into a unified format. The input for this step is classified digital data, and the output is the converted data in a unified format. The server uses a data conversion tool to convert data in different formats into compatible formats.

[0384] Step 4:

[0385] The server retrieves regional information and temporal contextual information from external storage. The input consists of location information and timestamps contained within the data, while the output is this information along with related auxiliary data. The server retrieves this information by calling an external API.

[0386] Step 5:

[0387] The server analyzes the user's emotional state using an emotion engine. The input is the user's past behavioral data, and the output is the estimated emotional state. The server utilizes an emotion analysis library and a generative AI model to estimate the user's emotions.

[0388] Step 6:

[0389] The server selects and edits content based on emotional states. The input is a dataset containing estimated emotional states and supplementary information, and the output is a personalized electronic document. The server uses an editing algorithm to select appropriate elements that highlight the emotions and generate a digital book.
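Selecting elements that highlight the estimated emotion can be sketched as a ranking over per-item emotion scores. The source does not describe the editing algorithm, so the scoring scheme, the `limit` parameter, and the sample file names below are assumptions for illustration.

```python
def select_content(items, emotion, limit=3):
    """Return up to `limit` item names that score above zero for the emotion.

    items: list of (name, {emotion_label: score}) pairs.
    """
    ranked = sorted(items, key=lambda item: (-item[1].get(emotion, 0.0), item[0]))
    return [name for name, scores in ranked[:limit] if scores.get(emotion, 0.0) > 0]


# Made-up sample data.
items = [
    ("rain.jpg", {"sad": 0.8}),
    ("picnic.mp4", {"relaxed": 0.6, "happy": 0.7}),
    ("park.jpg", {"relaxed": 0.9}),
]
selected = select_content(items, "relaxed")
```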

[0390] Step 7:

[0391] The server stores the generated electronic documents in network storage and prepares them in a format that can be provided to the user's terminal. The input is a personalized electronic document, and the output is a digital book in a format viewable by the user. The server uses prompts to perform necessary format conversions.

[0392] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0393] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions shown by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0394] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0395] [Third Embodiment]

[0396] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0397] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0398] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0399] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0400] The microphone 238 accepts instructions from the user 20 by capturing the voice of the user 20. The microphone 238 converts the captured voice into audio data and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0401] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0402] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0403] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0404] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0405] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0406] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0407] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0408] To implement this invention, the process begins with the user uploading digital content to a server using their digital device (e.g., a smartphone or PC). The user selects digitized photos and videos and sends them to the system. In the case of analog content, it is digitized using a dedicated scanner or digitizing application and then uploaded to the server.

[0409] The server automatically analyzes the received digital data and classifies it based on its format. This classification is performed by utilizing content metadata to determine the date and time of capture, location, and type of content. Next, the server uses generative AI to convert data in different formats into a consistent format and optimize it. This is done to adjust the data according to the viewing environment and improve user convenience.
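
The classification described above can be illustrated with a minimal sketch. It assumes the file format is recognized from well-known leading byte signatures and that capture date and location are already available as metadata. The item layout (`name`, `data`, `captured_at`, `location`) and the function names are hypothetical and do not come from the disclosure.

```python
from collections import defaultdict
from datetime import datetime

# Leading byte signatures for two common image formats.
SIGNATURES = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
}


def detect_format(data: bytes) -> str:
    """Return a format label based on the leading bytes of a file."""
    for magic, label in SIGNATURES.items():
        if data.startswith(magic):
            return label
    # MP4 (ISO base media) files carry the ASCII tag "ftyp" at byte offset 4.
    if data[4:8] == b"ftyp":
        return "MP4"
    return "UNKNOWN"


def classify(items):
    """Group item names by (format, capture year, location).

    Each item is a hypothetical dict with the keys
    name, data, captured_at (datetime) and location.
    """
    groups = defaultdict(list)
    for item in items:
        key = (detect_format(item["data"]), item["captured_at"].year, item["location"])
        groups[key].append(item["name"])
    return dict(groups)


# Illustrative uploads; the payloads contain only the signature bytes.
uploads = [
    {"name": "beach.jpg", "data": b"\xff\xd8\xff\xe0" + bytes(8),
     "captured_at": datetime(2023, 8, 1), "location": "coast"},
    {"name": "hike.png", "data": b"\x89PNG\r\n\x1a\n" + bytes(8),
     "captured_at": datetime(2023, 8, 2), "location": "mountain"},
    {"name": "waves.mp4", "data": bytes(3) + b"\x18ftypmp42",
     "captured_at": datetime(2023, 8, 1), "location": "coast"},
]
groups = classify(uploads)
```

Grouping on a tuple key keeps the sketch small; a real system would likely read the capture date and location from embedded metadata rather than receive them as separate fields.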

[0410] The server also communicates with external databases to retrieve additional information about the local area and its historical context. This information is linked to the user's profile and life journey and integrated into the digital book. As a result, the user's personal history and memories are compiled in a richer and more meaningful way. The generative AI combines this information naturally, taking care to maintain the consistency of visuals and narrative that the user desires.

[0411] The generated digital books are securely stored in cloud storage and can be viewed at any time by users using a dedicated application or web browser. Users can add comments to the digital books or share specific pages with friends and family as needed.

[0412] For example, when a user uploads family photos and videos taken over many years, the images are organized chronologically, and the AI adds supplementary information such as important events of that period and trivia about the locations where the photos were taken. With these digital books, users can not only look back on their own history but also preserve it as a valuable record to pass on to future generations.

[0413] The following describes the processing flow.

[0414] Step 1:

[0415] Users upload digital content (photos, videos, etc.) to the server using digital devices. At the same time, users scan and digitize analog content as needed and upload it in the same manner.

[0416] Step 2:

[0417] The server analyzes the received digital data and automatically identifies the file format (JPEG, PNG, MP4, etc.). Then, the server uses metadata (e.g., date and time of shooting, location) to classify and categorize the data.

[0418] Step 3:

[0419] The server uses generative AI to perform the necessary conversions and optimizations to standardize the format and file size of digital content. Specifically, it converts images and videos of different formats into a specified unified format and optimizes them according to the viewing environment.

[0420] Step 4:

[0421] The server queries an external database to retrieve regional and historical context information related to the uploaded content. This information is used in the generation process, adding value to the digital book.

[0422] Step 5:

[0423] The server uses generative AI to integrate the user's profile information, life history, and acquired supplementary information to generate a digital book. This process applies design templates and arranges photos and text harmoniously.

[0424] Step 6:

[0425] The generated digital books are stored in cloud storage by the server, allowing users to access them anytime, anywhere. Furthermore, automatic adjustments are made to ensure that the digital books remain in the correct format as the user's device is updated.

[0426] Step 7:

[0427] Users can access, view, and edit digital books through a dedicated application or web browser. They can also share specific pages and add new information and media.

[0428] (Example 1)

[0429] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0430] As the volume of electronic data increases, there is a growing need to manage it efficiently and provide information in a format that is easily accessible to users. In particular, when diverse data formats or supplementary information are required, there is a lack of unified means to handle them, which impairs user convenience. In addition, providing data in a way that connects past information with present information is an important element for deepening user understanding, but there is a lack of effective means to achieve this.

[0431] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0432] In this invention, the server includes means for automatically classifying uploaded electronic data based on attributes, means for converting the classified electronic data into unified attributes, and means for obtaining location information and supplementary information such as historical context from an external database. This enables efficient management and provision of diverse data, and allows users to search and view electronic data in a unified format. Furthermore, by associating additional information with the user's characteristics and history, a deeper understanding and the provision of valuable information can be achieved.

[0433] "Electronic data" refers to a collection of information stored or transmitted in digital format, including photographs, videos, and documents.

[0434] "Attributes" refer to specific characteristics or properties of data, and include information that serves as a basis for classification and transformation.

[0435] An "external database" is a collection of information located outside the system, and is particularly used to provide supplementary information such as regional information and historical context.

[0436] "Supplemental information" refers to details and background information added to basic information, providing additional value and context to the data.

[0437] A "generative AI model" refers to an artificial intelligence model that generates or transforms data, and is used in the creation of ebooks and other similar materials.

[0438] An "eBook" is a book presented in digital format, integrating text, images, and other media formats.

[0439] The "connection interface" refers to the interface through which a user interacts with the system, and includes means for displaying and manipulating information.

[0440] "Remote storage" refers to storage located in the cloud, providing a mechanism for securely and efficiently storing data.

[0441] To implement this invention, users need to transmit digital content to a server using their own electronic devices (such as smartphones or computers). Users select photos and videos they have saved and upload the data to the server via the internet. If analog data exists, it is digitized using a scanner or digitizing app and then transmitted to the server.

[0442] The server analyzes the received digital data and automatically classifies it based on its attributes. Digital analysis algorithms are used for metadata analysis, classifying the data by date and time of capture, location, and content. Image recognition technology may also be applied during this process. The classified data is then formatted into consistent attributes using a generative AI model. The generative AI model supports data format optimization and conversion, ensuring smooth display across different device environments.

[0443] In parallel, the server collaborates with external information sources to obtain relevant location information and supplementary historical context. For example, it can add historical background and general knowledge about a region to a photograph taken in that area. This gives user-uploaded content a richer context.
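
As a rough illustration of attaching supplementary context to a photograph, the sketch below uses an in-memory dictionary as a stand-in for the external information source. The `Photo` type, the `CONTEXT_DB` table, its placeholder strings, and the `enrich` function are all assumptions made for illustration; the disclosure does not specify the lookup mechanism.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    name: str
    region: str
    year: int
    notes: list = field(default_factory=list)


# Hypothetical stand-in for the external database.
# Keys are (region, year); a year of None marks general background for a region.
# The values are placeholder strings, not real historical facts.
CONTEXT_DB = {
    ("Kyoto", 2019): "placeholder: regional event recorded for Kyoto in 2019",
    ("Kyoto", None): "placeholder: general background for Kyoto",
}


def enrich(photo: Photo, db: dict) -> Photo:
    """Append year-specific context first, then general regional context."""
    for key in ((photo.region, photo.year), (photo.region, None)):
        text = db.get(key)
        if text:
            photo.notes.append(text)
    return photo


temple = enrich(Photo("temple.jpg", "Kyoto", 2019), CONTEXT_DB)
garden = enrich(Photo("garden.jpg", "Kyoto", 2001), CONTEXT_DB)
castle = enrich(Photo("castle.jpg", "Osaka", 2019), CONTEXT_DB)
```

Falling back from a year-specific entry to a general regional entry means a photo still receives some context when no record exists for its exact year.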

[0444] Ultimately, the server utilizes a generative AI model to generate an ebook from the aggregated information. This ebook is customized to take into account the user's characteristics and history, ensuring visual and narrative consistency. The generated ebook is securely and conveniently stored in remote storage (cloud storage) and can be accessed from the user's electronic device via a dedicated interface.

[0445] As a concrete example, when a user uploads a folder titled "Summer Family Trip," the server analyzes and categorizes the photos within, supplementing them with information about landmarks and events at the travel destination. The resulting ebook becomes a rich resource for reminiscing about the trip. An example of a prompt used in this process might be, "Please add events related to this photo to create a visual story."

[0446] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0447] Step 1:

[0448] Users select digital content using electronic devices and send it to the server. Digital data such as photos and videos are prepared as input and uploaded to the server via an interface. If analog data exists, it is digitized using a scanner or digitizing application. The output is the digital data sent to the server.

[0449] Step 2:

[0450] The server analyzes the received digital data. Based on the input digital data, the server extracts metadata and automatically classifies the data by attributes such as the date and time of capture, location, and device information. Specifically, it uses image recognition technology to identify and classify the content (e.g., people, landscapes). The output is a list of the classified data.

[0451] Step 3:

[0452] The server converts classified digital data into a unified format using a generative AI model. It uses classified data as input and converts it to formats and resolutions suitable for different devices. Specifically, the generative AI model adjusts the size and resolution to ensure optimal display across different viewing environments. The output is the converted data in a unified format.

[0453] Step 4:

[0454] The server connects to external information sources to obtain supplementary information. The input is the location information and historical context obtained from the classified data; based on this input, the server retrieves relevant information from external databases. Specifically, it adds information such as geographical and historical context to the data. The output is the data with the supplementary information added.

[0455] Step 5:

[0456] The server utilizes a generative AI model to generate ebooks based on all acquired information. It uses supplementary data as input to design pages and structure content. Specifically, it constructs the ebook while considering user history and characteristics, maintaining visual and narrative consistency. The final ebook is then generated as output.

[0457] Step 6:

[0458] The server saves the generated e-books to remote storage, making them accessible from the user's device. The generated e-books are used as input and securely stored in cloud storage. Specifically, users can view and share the e-books at any time using a dedicated application or web browser. As output, the e-books are stored in a format accessible to the user.
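
The size and resolution adjustment in Step 3 above can be sketched as a simple aspect-ratio-preserving fit. This is a deterministic illustration, not the generative AI model itself, and the device profiles and function names are hypothetical.

```python
# Hypothetical viewing environments: name -> (max width, max height) in pixels.
DEVICE_PROFILES = {
    "phone": (1080, 1920),
    "desktop": (1920, 1080),
}


def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the box, keeping the aspect ratio.

    Images that already fit are left unchanged (no upscaling).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def sizes_for_all_devices(width, height):
    """Return the target size of one image for every device profile."""
    return {
        name: fit_within(width, height, max_w, max_h)
        for name, (max_w, max_h) in DEVICE_PROFILES.items()
    }


targets = sizes_for_all_devices(4000, 3000)
```

For a 4000 x 3000 source image this yields 1080 x 810 for the assumed phone profile and 1440 x 1080 for the assumed desktop profile.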

[0459] (Application Example 1)

[0460] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0461] There is a problem in that users cannot efficiently manage the diverse digital content they possess and utilize it as a consistent visual narrative, thus failing to fully extract the value from individual memories. In addition, there is no easy way for users to share the digital content they generate with family and friends, making it difficult to effectively share memories with others.

[0462] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0463] In this invention, the server includes means for automatically classifying uploaded information based on its attributes, means for converting the classified information into an integrated format, and means for adding narration and music to digital content using a generative AI model to create visual deliverables based on a storyline. This enables users to integrate diverse digital content and manage and share memories in a visually appealing way.

[0464] "Information" refers to digital data owned by a user, including multimedia content such as photographs and videos.

[0465] "Attributes" refer to characteristics and metadata related to content, such as the date and time of shooting, location, and type of content.

[0466] "Unified format" refers to a standard data structure or layout for organizing digital content as a coherent visual narrative.

[0467] A "generative AI model" refers to a system that uses artificial intelligence technology to automatically add relevant information, music, and narration to digital content.

[0468] A "storyline" refers to a framework for structuring content in a visual product as a narrative, either temporally or logically.

[0469] "Visual deliverables" refer to the final product created from the user's digital content in a format that is easy to understand visually.

[0470] The specific system for implementing this invention uses a program that can run on various devices. The server receives digital data uploaded by the user and activates a data classification module, which automatically classifies the digital data based on its attributes. The classified data is converted into an integrated format through a format conversion module, after which a generative AI model begins operation. The generative AI model uses natural language processing technology to automatically add voice narration and background music to the digital content, creating a visual output.

[0471] The server utilizes cloud services to perform these processes and employs open-source libraries (e.g., TensorFlow and PyTorch) for data processing. It also uses cloud services such as AWS and Google Cloud for storage. On the user's device, a specific application receives these visual artifacts, allowing them to be displayed, edited, and shared with others through an interface.

[0472] As a concrete example, when a user uploads photos from their summer vacation to the system, the server analyzes the photos and generates a video with narration introducing the shooting locations and summer events. This video is delivered to the user's application via cloud storage, and the user can share the video with family and friends through the app.

[0473] An example of a prompt for a generative AI model would be: "Analyze the data captured by the user and provide narration with historical and geographical information related to the data. Specifically, add information about the history and climate of the beach mentioned as 'a trip to a beach visited in the summer of 2023.'"
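
A prompt like the one above could be assembled from the classified metadata. The following is a minimal sketch under that assumption; the function name, its parameters, and the sample values are hypothetical.

```python
def build_narration_prompt(title, locations, period):
    """Compose a narration prompt from album metadata.

    Duplicate locations are removed and the rest are sorted so that
    the same album always produces the same prompt text.
    """
    places = ", ".join(sorted(set(locations)))
    return (
        "Analyze the data captured by the user and provide narration with "
        "historical and geographical information related to the data. "
        f"Album: '{title}'. Locations: {places}. Period: {period}."
    )


prompt = build_narration_prompt(
    title="a trip to a beach visited in the summer of 2023",
    locations=["Beach B", "Beach A", "Beach B"],
    period="summer 2023",
)
```

Keeping the prompt deterministic makes it easier to cache results for an album and to compare outputs between runs.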

[0474] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0475] Step 1:

[0476] Users select digital content from their smartphones or PCs and upload it to the server. The input data consists of photos and videos, and the server receives raw data as output.

[0477] Step 2:

[0478] The server analyzes the metadata of the received data (e.g., date and time of capture, location) and automatically classifies the data based on its attributes. This process utilizes information from a database and employs an attribute clustering algorithm. The output is digital data classified by attribute.

[0479] Step 3:

[0480] The server converts the classified digital data into a unified format. Here, it converts different media formats into a standard data format (e.g., MP4 or JPEG) for unified processing. The input is the classified digital data, and the output is the unified format data.

[0481] Step 4:

[0482] The server uses a generative AI model to add narration and music to digital content. This process employs natural language generation and speech synthesis technologies to generate prompts based on the user's digital data. The input is integrated digital data, and the output is a visual artifact with added music and narration.

[0483] Step 5:

[0484] The generated visual artifacts are saved to cloud storage by the server. Services such as AWS and Google Cloud are used, and the data is managed with backups and security measures in place. The output is the visual artifact stored in the cloud.

[0485] Step 6:

[0486] The user's device downloads visual artifacts from cloud storage and displays them visually within a dedicated application. Here, a user interface is generated via the device's GUI, providing the output. The user can then share this with family and friends. The input is the visual artifact in the cloud, and the output is the artifact displayed on the device.
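
One way to picture how classified content becomes a storyline is a chronological timeline with start and end offsets, to which narration and music can later be aligned. The sketch below assumes a fixed on-screen time for photos. The `Clip` type, the `PHOTO_SECONDS` constant, and `build_storyline` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime

PHOTO_SECONDS = 4.0  # assumed on-screen time for a still photo


@dataclass
class Clip:
    name: str
    captured_at: datetime
    kind: str  # "photo" or "video"
    duration_s: float = 0.0  # only meaningful for videos


def build_storyline(clips):
    """Order clips by capture time and assign consecutive time slots."""
    timeline = []
    cursor = 0.0
    for clip in sorted(clips, key=lambda c: c.captured_at):
        length = clip.duration_s if clip.kind == "video" else PHOTO_SECONDS
        timeline.append(
            {"name": clip.name, "start_s": cursor, "end_s": cursor + length}
        )
        cursor += length
    return timeline


storyline = build_storyline([
    Clip("fireworks.mp4", datetime(2023, 8, 3, 20, 0), "video", 12.5),
    Clip("arrival.jpg", datetime(2023, 8, 1, 9, 0), "photo"),
    Clip("beach.jpg", datetime(2023, 8, 2, 14, 0), "photo"),
])
```

Each entry's time window gives the narration and music steps a concrete segment to target.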

[0487] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0488] This invention combines a system for organizing and utilizing digital content with an emotion engine that recognizes user emotions and reflects them in content suggestions.

[0489] Users upload digital content such as photos and videos to the server using their own digital devices. If they have analog content, they scan it, digitize it, and upload it to the server as well.

[0490] The server analyzes the uploaded digital data and automatically classifies it based on its format. Generative AI is used to convert and optimize the format and file size of the digital content. Then, supplementary information such as regional and historical context is retrieved from an external database, and this information is integrated with the user's profile and life story. This process generates a digital book.

[0491] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. This information is used in the digital book generation process to suggest content tailored to the user's emotions. For example, if the engine analyzes that the user is feeling nostalgic, it will suggest specific photos and layouts that evoke that emotion.
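
The suggestion step can be sketched as a lookup from an estimated emotion to a presentation style, plus a ranking of photos by tag overlap. The emotion labels, palettes, layouts, and tags below are illustrative assumptions, and the emotion estimate itself is taken as given.

```python
# Hypothetical mapping from an estimated emotion to presentation choices.
STYLE_BY_EMOTION = {
    "nostalgic": {"palette": "sepia", "layout": "scrapbook",
                  "tags": {"family", "childhood"}},
    "excited": {"palette": "vivid", "layout": "full-bleed",
                "tags": {"travel", "festival"}},
}
DEFAULT_STYLE = {"palette": "neutral", "layout": "grid", "tags": set()}


def suggest(emotion, photos, limit=3):
    """Pick a style for the emotion and rank photos by matching tags.

    Each photo is a hypothetical dict with the keys name and tags (a set).
    Photos with equal scores keep their original order.
    """
    style = STYLE_BY_EMOTION.get(emotion, DEFAULT_STYLE)
    ranked = sorted(
        photos, key=lambda p: len(style["tags"] & p["tags"]), reverse=True
    )
    return {
        "palette": style["palette"],
        "layout": style["layout"],
        "photos": [p["name"] for p in ranked[:limit]],
    }


library = [
    {"name": "office.jpg", "tags": {"work"}},
    {"name": "birthday.jpg", "tags": {"family", "childhood"}},
    {"name": "picnic.jpg", "tags": {"family"}},
]
suggestion = suggest("nostalgic", library, limit=2)
```

An unknown emotion label falls back to the neutral default style rather than raising an error.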

[0492] The generated digital books are securely stored in cloud storage. Users can access, view, and edit their digital books at any time through a dedicated application or web browser.

[0493] For example, when a user uploads photos and videos documenting their travel experiences, the server organizes them chronologically, adding historical context and related stories about the places visited. Furthermore, when the emotion engine senses the user's level of excitement at the time of upload, it automatically adjusts the color scheme and content placement to enhance that excitement. Ultimately, the user can enjoy a personalized digital book that strongly reflects their emotions.

[0494] The following describes the processing flow.

[0495] Step 1:

[0496] Users select digital content such as photos and videos using their own digital devices and upload them to the server. If analog content is available, they use scanners or digital conversion functions to digitize it before uploading it.

[0497] Step 2:

[0498] The server analyzes the received data and identifies the file format of each piece of content (JPEG, PNG, MP4, etc.). Subsequently, the server uses metadata (date and time of capture, location information, etc.) to classify the content and organize it into categories.

[0499] Step 3:

[0500] The server activates generative AI to convert data in different formats into a unified format. At the same time, it optimizes the resolution and file size to suit the user's terminal environment, enabling efficient data management.

[0501] Step 4:

[0502] The server connects to an external database to retrieve local information and historical background data related to the uploaded content. This information, along with the user's profile information and life story, is incorporated into the creation of the digital book.

[0503] Step 5:

[0504] The emotion engine activates and analyzes the user's emotions based on past user behavior data and current interactions. Based on this, the server generates recommendations tailored to the user's emotions and creates a digital book that reflects them.

[0505] Step 6:

[0506] The completed digital book is stored in cloud storage by the server. Users can access this digital book from their devices. Furthermore, they can add new information and media, and share it with family and friends.

[0507] Step 7:

[0508] Users view and edit digital books using a dedicated application or web browser. The emotional feedback users provide to the system is continuously stored in the emotion engine as learning data and used to suggest content for future use.
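
The disclosure does not say how emotional feedback is accumulated. As one possible illustration, the sketch below blends each new feedback score into a stored preference with an exponential moving average. That technique is substituted here for illustration only, and the function name, score range, and `alpha` parameter are assumptions.

```python
def update_preferences(prefs, feedback, alpha=0.3):
    """Blend new feedback scores into stored preferences.

    prefs and feedback map an emotion label to a score (assumed 0.0 to 1.0).
    alpha controls how strongly the newest feedback counts.
    An emotion seen for the first time starts at its feedback score.
    The input dict is not modified.
    """
    updated = dict(prefs)
    for emotion, score in feedback.items():
        previous = updated.get(emotion, score)
        updated[emotion] = (1 - alpha) * previous + alpha * score
    return updated


stored = {"nostalgic": 0.5}
stored_after = update_preferences(stored, {"nostalgic": 1.0, "excited": 0.25}, alpha=0.5)
```

With `alpha=0.5`, the stored score for "nostalgic" moves from 0.5 to 0.75, and "excited" is initialized at 0.25.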

[0509] (Example 2)

[0510] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0511] There is a need to efficiently organize the vast amounts of digital content data and provide personalized content suggestions based on the user's emotional state and characteristics. Furthermore, it is essential that the generated digital deliverables are always stored in an up-to-date state and are consistently accessible across multiple devices. Additionally, an interface that allows users to easily share and add information through sharing functions is crucial.

[0512] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0513] This invention includes a server that automatically classifies uploaded digital data according to its format, a server that utilizes generative AI technology to convert and optimize the classified digital data into a unified format, and a server that acquires regional information and historical contextual information from an external storage device. This enables users to efficiently manage vast amounts of digital content and receive personalized content suggestions based on their emotions and characteristics. Furthermore, the generated digital output is always stored in an up-to-date state, providing an environment in which users can consistently access and share it across multiple devices.

[0514] "Uploaded digital data" refers to electronic information such as photos, videos, and audio sent by users from their information terminals to a server.

[0515] "Generative AI technology" is a technique that uses artificial intelligence to automatically perform data format transformation and optimization, and it is a technology that utilizes machine learning models.

[0516] An "external storage device" is an information source that allows a server to access and obtain supplementary information such as regional information and historical context.

[0517] "Digital deliverables" refer to digital books or similar content in electronic format that are generated by a server and provided to the user.

[0518] "Emotional state" refers to the user's feelings at a given time, estimated based on their past actions and interactions.

[0519] An "information terminal" is a device such as a computer or smartphone that a user uses to view, edit, and upload digital data.

[0520] This system is designed to help users efficiently organize and utilize digital content. Users upload digital data such as photos and videos from their information terminals to the server. For analog content, they digitize it using the terminal's scanner before uploading. The server receives this data and performs analysis and optimization using a generative AI model. The generative AI model may be built with machine learning frameworks such as TensorFlow and PyTorch.

[0521] The server automatically classifies uploaded data based on its format. This process utilizes machine learning clustering algorithms and image recognition technologies. After classification, the data is converted into a unified format using generative AI and optimized for size. The optimized data is integrated with regional and historical contextual supplementary information collected via external storage devices. The server then generates and delivers a digital book as a digital output.

[0522] The emotion engine is used to estimate a user's emotional state based on their past behavior and current interactions. This information is reflected in the content presented by the server, providing a personalized experience. The IBM Watson Sentiment Analysis API may be used for sentiment analysis.

[0523] The generated digital books are securely stored in cloud storage and can be accessed at any time from the user's device. The digital books can be viewed and edited through a dedicated application or web browser.

[0524] As a concrete example, a user uploads photos and videos from their trip to a server, from which a digital book containing historical information and stories about the places they visited is generated. This digital book's color scheme and layout are optimally adjusted based on an emotion engine that analyzes the user's state of excitement.

[0525] An example of a prompt might be, "Analyze the user's emotions during the trip, gather information about the destinations, and compile it into a digital book. In particular, include a suggested color palette that expresses the user's excitement." This allows users to obtain a personalized digital product that strongly reflects their emotions and experiences.

[0526] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0527] Step 1:

[0528] Users upload digital content such as photos and videos to the server using their information terminals. This serves as input, and the server receives the digital content. Specifically, the HTTP protocol is used to send the files. As output, the server saves the received data to its storage.

[0529] Step 2:

[0530] The server analyzes uploaded digital data and automatically classifies it based on its format. The input is uploaded digital data, and image recognition and metadata analysis are performed. Specifically, image features are extracted using OpenCV, and the classified results are output.

[0531] Step 3:

[0532] The server applies a generative AI model to classified digital data, performing format conversion and size optimization. The input is classified digital data, and format conversion and compression are performed using tools such as TensorFlow and PyTorch. The output is optimized data.

[0533] Step 4:

[0534] The server retrieves regional information and supplementary information about the historical context from external storage devices. The input here is classified digital data, and the process involves retrieving appropriate information via external APIs. Specifically, a REST API is used to retrieve destination-related information, and integrated data is output.

[0535] Step 5:

[0536] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. The input is user behavior history data, which is processed using tools such as the IBM Watson Sentiment Analysis API. The output is the estimated emotional state, which is then used in the next step of the process.

[0537] Step 6:

[0538] The server generates personalized digital artifacts based on the estimated emotional state. Both auxiliary and emotional information are input for digital book generation, and color tones and layouts appropriate to the user's emotions are applied. The output is the final digital book provided to the user.

[0539] Step 7:

[0540] The generated digital book is securely stored in cloud storage. The input is the digital book itself, and the output is the storage location information on the cloud. Users can access this digital book through a dedicated app or web browser, and further content updates are possible by viewing and editing it.
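
Step 4 above mentions a REST API for destination-related information without naming one. The sketch below only shows how such a query URL might be built with Python's standard library. The endpoint `https://example.com/api/context` and the `place` and `year` parameter names are hypothetical, and no network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://example.com/api/context"  # hypothetical endpoint


def build_context_query(place, year):
    """Return a GET URL asking for context about a place in a given year."""
    query = urlencode({"place": place, "year": year})
    return f"{BASE_URL}?{query}"


url = build_context_query("Nara Park", 2023)
# Parse the URL back to confirm the parameters survive encoding.
params = parse_qs(urlsplit(url).query)
```

`urlencode` handles the escaping of spaces and reserved characters, so place names taken from photo metadata can be passed through without manual quoting.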

[0541] (Application Example 2)

[0542] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0543] With the increasing volume of digital content, there is a growing need to organize and effectively utilize personalized content that responds to the emotions of individual users. Traditional methods have struggled to suggest content that reflects the emotional state of users, making it difficult to generate electronic documents that strongly convey individual experiences.

[0544] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0545] In this invention, the server includes means for automatically classifying uploaded electronic information according to its format, means for converting the classified electronic information into a unified format, means for acquiring regional information and auxiliary information such as temporal background from an external storage device, and means for analyzing the user's emotional state using an emotion engine and selecting and editing content. This makes it possible to generate personalized electronic documents that are in line with the user's emotions.

[0546] "Electronic information" refers to all data expressed in digital format, and includes a variety of media such as audio, video, text, and images.

[0547] "Format" refers to attributes that indicate the structure, type, and representation of data, and is an indicator of how data is organized and processed.

[0548] "External storage devices" refer to external databases and cloud storage accessed via the internet or other connections, allowing data to be retrieved from a variety of sources.

[0549] "Local information" refers to data related to a specific geographical area, including information such as geographical names and locations, and cultural and social backgrounds.

[0550] "Temporal context" refers to information about a specific time or era, including historical events, historical background, and characteristic events of each era.

[0551] "User" refers to an individual or legal entity that uses the present invention and is a person who receives the services provided by this system.

[0552] "Personal data" refers to information about an individual, such as the user's name, address, age, and occupation, and includes information that can identify the user.

[0553] "Life history" refers to information about a user's past behavior, experiences, and activities, including their personal history and life events.

[0554] "Electronic documents" refer to books and documents generated in digital format, and are publications that integrate multiple media, including text, images, and videos.

[0555] An "emotion engine" refers to a system or software that analyzes and interprets a user's emotional state and has the function of adjusting content according to the user's emotions.

[0556] "Electronic devices" refer to equipment used for processing, storing, and displaying electronic information, and include smartphones, tablets, and personal computers.

[0557] The system that implements this application operates by uploading digital information to the cloud through an application installed on a user's device, such as a smartphone or tablet. The uploaded electronic information is automatically classified by the server based on its format and converted into a unified format. Furthermore, auxiliary information such as regional information and temporal context is retrieved from an external storage device and used together with the electronic information.

[0558] The server is equipped with an emotion engine that analyzes the user's emotional state based on their past behavior. For this analysis, an emotion analysis library and generative AI models are used. Based on the emotional data analyzed by the emotion engine, content appropriate to the user is selected and edited. As a result, personalized electronic documents are generated.

[0559] The generated electronic documents are stored in network storage and provided in a format viewable on the user's device. This system allows users to obtain a personalized experience tailored to their own emotions.

[0560] As a concrete example, when a user uploads photos or videos taken in a park on a holiday to the application, the system analyzes the feelings of happiness and relaxation experienced at that time. Based on this, it can generate a story-driven video with calming music playing in the background. An appropriate prompt would be, "Please generate an emotional and calming video story that evokes a relaxed mood using photos and videos taken in a park on a holiday."
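A prompt like the one quoted above could be assembled from the estimated emotions and a short scene description. The following is a minimal sketch under that assumption; the function name `build_story_prompt` and its parameters are hypothetical and do not appear in the source.

```python
def build_story_prompt(emotions, scene, media_kinds=("photos", "videos")):
    """Compose a natural-language instruction for a generative model.

    All parameter names are illustrative assumptions, not part of the source.
    """
    if not emotions:
        raise ValueError("at least one emotion label is required")
    mood = " and ".join(emotions)
    media = " and ".join(media_kinds)
    return (
        f"Please generate a video story that evokes a {mood} mood "
        f"using {media} taken {scene}."
    )


prompt = build_story_prompt(["calm", "relaxed"], "in a park on a holiday")
print(prompt)
```

Keeping the emotion labels and the scene as separate inputs would let the same template be reused for other uploads.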

[0561] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0562] Step 1:

[0563] The user uploads electronic information such as photos and videos to the cloud using their device. The input for this step is the digital data captured by the user, and the output is the data stored in cloud storage. The file structure is modified based on the user's actions.

[0564] Step 2:

[0565] The server automatically classifies electronic information stored in the cloud according to its format. The input is uploaded digital data, and the output is a dataset classified by category. The server uses machine learning algorithms to analyze the data format and assign it to the appropriate category.

[0566] Step 3:

[0567] The server converts the classified data into a unified format. The input for this step is classified digital data, and the output is the converted data in a unified format. The server uses a data conversion tool to convert data in different formats into compatible formats.

[0568] Step 4:

[0569] The server retrieves regional information and temporal contextual information from external storage. The input consists of location information and timestamps contained within the data, while the output is this information along with related auxiliary data. The server retrieves this information by calling an external API.

[0570] Step 5:

[0571] The server analyzes the user's emotional state using an emotion engine. The input is the user's past behavioral data, and the output is the estimated emotional state. The server utilizes an emotion analysis library and a generative AI model to estimate the user's emotions.

[0572] Step 6:

[0573] The server selects and edits content based on emotional states. The input is a dataset containing estimated emotional states and supplementary information, and the output is a personalized electronic document. The server uses an editing algorithm to select appropriate elements that highlight the emotions and generate a digital book.

[0574] Step 7:

[0575] The server stores the generated electronic documents in network storage and prepares them in a format that can be provided to the user's terminal. The input is a personalized electronic document, and the output is a digital book in a format viewable by the user. The server uses prompts to perform necessary format conversions.
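The format-based classification in Step 2 above can be illustrated with a simple extension lookup. This is only a sketch: it substitutes Python's standard `mimetypes` module for the machine learning algorithm the text mentions, and the function name `classify_by_format` and the category names are assumptions.

```python
import mimetypes
from collections import defaultdict


def classify_by_format(filenames):
    """Group file names by top-level media type (image, video, audio, other)."""
    groups = defaultdict(list)
    for name in filenames:
        mime, _ = mimetypes.guess_type(name)
        # guess_type returns None for the type when the extension is unknown.
        kind = mime.split("/")[0] if mime else "other"
        if kind not in {"image", "video", "audio"}:
            kind = "other"
        groups[kind].append(name)
    return dict(groups)


uploads = ["park.jpg", "picnic.png", "walk.mp4", "notes.unknownext"]
groups = classify_by_format(uploads)
print(groups)
```

A real system would likely inspect file contents as well, since extensions can be missing or wrong.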

[0576] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0577] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions shown by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and / or summarization.

[0578] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0579] [Fourth Embodiment]

[0580] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0581] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0582] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0583] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0584] The microphone 238 accepts instructions from the user 20 by receiving the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0585] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0586] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0587] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0588] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0589] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0590] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0591] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0592] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0593] To implement this invention, the process begins with the user uploading digital content to a server using their digital device (e.g., a smartphone or PC). The user selects digitized photos and videos and sends them to the system. In the case of analog content, it is digitized using a dedicated scanner or digitizing application and then uploaded to the server.

[0594] The server automatically analyzes the received digital data and classifies it based on its format. This classification is performed by utilizing content metadata to determine the date and time of capture, location, and type of content. Next, the server uses generative AI to convert data in different formats into a consistent format and optimize it. This is done to adjust the data according to the viewing environment and improve user convenience.

[0595] The server also communicates with external databases to retrieve additional information about local area and historical context. This information is linked to the user's profile and life journey and integrated into the digital book. As a result, the user's personal history and memories are compiled in a richer and more meaningful way. The generative AI naturally combines this information, taking care to maintain consistency in the visuals and narratives that the user desires.
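The retrieval of local and historical context can be pictured as a lookup keyed by place and year. The sketch below uses an in-memory dictionary as a stand-in for the external database; the `Photo` class, the `CONTEXT_DB` contents, and the `enrich` function are all hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    name: str
    place: str
    year: int
    notes: list = field(default_factory=list)


# Hypothetical stand-in for the external database, keyed by (place, year).
CONTEXT_DB = {
    ("Example City", 2010): ["Placeholder note about a local event in 2010."],
}


def enrich(photo, context_db):
    """Attach any matching context notes; leave the photo unchanged otherwise."""
    photo.notes.extend(context_db.get((photo.place, photo.year), []))
    return photo


p = enrich(Photo("family.jpg", "Example City", 2010), CONTEXT_DB)
print(p.notes)
```

In practice the lookup would be a network call, so a missing or failed response should leave the content usable without the extra notes, as the empty default does here.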

[0596] The generated digital books are securely stored in cloud storage and can be viewed at any time by users using a dedicated application or web browser. Users can add comments to the digital books or share specific pages with friends and family as needed.

[0597] For example, when a user uploads family photos and videos taken over many years, the images are organized chronologically, and AI adds supplementary information such as important events during that period and trivia about the locations where the photos were taken. With these digital books, users can not only look back on their own history but also preserve them as valuable records to pass on to future generations.
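Chronological organization of this kind can be sketched as sorting by capture time and grouping by year. The input shape (a list of name and timestamp pairs) and the function name `group_by_year` are assumptions made for illustration.

```python
from datetime import datetime
from itertools import groupby


def group_by_year(items):
    """Return {year: [names...]} with items sorted by capture time.

    `items` is a list of (name, captured_at) pairs; this shape is assumed.
    """
    ordered = sorted(items, key=lambda item: item[1])
    return {
        year: [name for name, _ in group]
        for year, group in groupby(ordered, key=lambda item: item[1].year)
    }


album = [
    ("graduation.jpg", datetime(2012, 3, 20)),
    ("first_steps.mp4", datetime(2005, 6, 1)),
    ("birthday.jpg", datetime(2005, 9, 14)),
]
chapters = group_by_year(album)
print(chapters)
```

Each year could then become one chapter of the digital book, with supplementary information attached per chapter.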

[0598] The following describes the processing flow.

[0599] Step 1:

[0600] Users upload digital content (photos, videos, etc.) to the server using digital devices. At the same time, users scan and digitize analog content as needed and upload it in the same manner.

[0601] Step 2:

[0602] The server analyzes the received digital data and automatically identifies the file format (JPEG, PNG, MP4, etc.). Then, the server uses metadata (e.g., date and time of shooting, location) to classify and categorize the data.

[0603] Step 3:

[0604] The server uses generative AI to perform the necessary conversions and optimizations to standardize the format and file size of digital content. Specifically, it converts images and videos of different formats into a specified unified format and optimizes them according to the viewing environment.

[0605] Step 4:

[0606] The server queries an external database to retrieve regional and historical context information related to the uploaded content. This information is used in the generation process, adding value to the digital book.

[0607] Step 5:

[0608] The server uses generative AI to integrate the user's profile information, life history, and acquired supplementary information to generate a digital book. This process applies design templates and arranges photos and text harmoniously.

[0609] Step 6:

[0610] The generated digital books are stored in cloud storage by the server, allowing users to access them anytime, anywhere. Furthermore, automatic adjustments are made to ensure that the digital books remain in the correct format as the user's device is updated.

[0611] Step 7:

[0612] Users can access, view, and edit digital books through a dedicated application or web browser. They can also share specific pages and add new information and media.
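The automatic adjustment to device updates mentioned in Step 6 could, for example, select among stored renditions according to the width of the current device. This is one possible reading, sketched with a hypothetical `pick_rendition` function; the source does not state how the adjustment is made.

```python
def pick_rendition(device_width, available_widths):
    """Pick the smallest stored width that still covers the device width.

    Falls back to the largest available width when none is wide enough.
    """
    if not available_widths:
        raise ValueError("no renditions available")
    candidates = [w for w in available_widths if w >= device_width]
    return min(candidates) if candidates else max(available_widths)


widths = [720, 1080, 2160]
print(pick_rendition(1000, widths))
print(pick_rendition(4000, widths))
```

Choosing the smallest sufficient rendition keeps downloads small while avoiding upscaling on the device.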

[0613] (Example 1)

[0614] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0615] As the volume of electronic data increases, there is a growing need to manage it efficiently and provide information in a format that is easily accessible to users. In particular, when diverse data formats or supplementary information are required, there is a lack of unified means to handle them, which impairs user convenience. In addition, providing data in a way that connects past information with present information is an important element for deepening user understanding, but there is a lack of effective means to achieve this.

[0616] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0617] In this invention, the server includes means for automatically classifying uploaded electronic data based on attributes, means for converting the classified electronic data into unified attributes, and means for obtaining location information and supplementary information such as historical context from an external database. This enables efficient management and provision of diverse data, and allows users to search and view electronic data in a unified format. Furthermore, by associating additional information with the user's characteristics and history, a deeper understanding and the provision of valuable information can be achieved.

[0618] "Electronic data" refers to a collection of information stored or transmitted in digital format, including photographs, videos, and documents.

[0619] "Attributes" refer to specific characteristics or properties of data, and include information that serves as a basis for classification and transformation.

[0620] An "external database" is a collection of information located outside the system, and is particularly used to provide supplementary information such as regional information and historical context.

[0621] "Supplemental information" refers to details and background information added to basic information, providing additional value and context to the data.

[0622] A "generative AI model" refers to a structure or data processing method generated by artificial intelligence, and is used in the creation of ebooks and other similar materials.

[0623] An "eBook" is a book presented in digital format, integrating text, images, and other media formats.

[0624] The "connection interface" refers to the interface through which a user interacts with the system, and includes means for displaying and manipulating information.

[0625] "Remote storage" refers to storage located in the cloud, providing a mechanism for securely and efficiently storing data.

[0626] To implement this invention, users need to transmit digital content to a server using their own electronic devices (such as smartphones or computers). Users select photos and videos they have saved and upload the data to the server via the internet. If analog data exists, it is digitized using a scanner or digitizing app and then transmitted to the server.

[0627] The server analyzes the received digital data and automatically classifies it based on its attributes. Digital analysis algorithms are used for metadata analysis, classifying the data by date and time of capture, location, and content. Image recognition technology may also be applied during this process. The classified data is then formatted into consistent attributes using a generative AI model. The generative AI model supports data format optimization and conversion, ensuring smooth display across different device environments.

[0628] In parallel, the server collaborates with external information sources to obtain relevant location information and supplementary historical context. For example, it can add historical background and general knowledge about a region to a photograph taken in that area. This gives user-uploaded content a richer context.

[0629] Ultimately, the server utilizes a generative AI model to generate an ebook from the aggregated information. This ebook is customized to take into account the user's characteristics and history, ensuring visual and narrative consistency. The generated ebook is securely and conveniently stored in remote storage (cloud storage) and can be accessed from the user's electronic device via a dedicated interface.

[0630] As a concrete example, when a user uploads a folder titled "Summer Family Trip," the server analyzes and categorizes the photos within, supplementing them with information about landmarks and events at the travel destination. The resulting ebook becomes a rich resource for reminiscing about the trip. An example of a prompt used in this process might be, "Please add events related to this photo to create a visual story."

[0631] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0632] Step 1:

[0633] Users select digital content using electronic devices and send it to the server. Digital data such as photos and videos are prepared as input and uploaded to the server via an interface. If analog data exists, it is digitized using a scanner or digitizing application. The output is the digital data sent to the server.

[0634] Step 2:

[0635] The server analyzes the received digital data. Based on the input digital data, the server extracts metadata and automatically classifies the data by attributes such as the date and time of capture, location, and device information. Specifically, it uses image recognition technology to identify and classify the content (e.g., people, landscapes). The output is a list of the classified data.

[0636] Step 3:

[0637] The server converts classified digital data into a unified format using a generative AI model. It uses classified data as input and converts it to formats and resolutions suitable for different devices. Specifically, the generative AI model adjusts the size and resolution to ensure optimal display across different viewing environments. The output is the converted data in a unified format.

[0638] Step 4:

[0639] The server connects to external information sources to obtain supplementary information. The input is the location information and historical context obtained from the classified data; based on this, the server retrieves relevant information from external databases. Specifically, it adds information such as geographical and historical context to the data. The output is data with the added supplementary information.

[0640] Step 5:

[0641] The server utilizes a generative AI model to generate ebooks based on all acquired information. It uses supplementary data as input to design pages and structure content. Specifically, it constructs the ebook while considering user history and characteristics, maintaining visual and narrative consistency. The final ebook is then generated as output.

[0642] Step 6:

[0643] The server saves the generated ebooks to remote storage, making them accessible from the user's device. The input is the generated ebooks, which are securely stored in cloud storage. Users can view and share the ebooks at any time using a dedicated application or web browser. The output is the ebooks stored in a format accessible to the user.
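The size and resolution adjustment described in Step 3 can be illustrated by the arithmetic of fitting an image inside a target box while preserving its aspect ratio. The sketch below shows only that calculation; the function name `fit_within` and the no-upscaling rule are assumptions, and the source attributes the actual conversion to a generative AI model.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit the box, preserving aspect ratio.

    Images already inside the box are returned unchanged (no upscaling).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4000, 3000, 1920, 1080))
print(fit_within(800, 600, 1920, 1080))
```

For a 4000x3000 photo and a 1920x1080 target, the height is the limiting dimension, so both sides are scaled by 1080/3000.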

[0644] (Application Example 1)

[0645] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0646] There is a problem in that users cannot efficiently manage the diverse digital content they possess and utilize it as a consistent visual narrative, thus failing to fully extract the value from individual memories. In addition, there is no easy way for users to share the digital content they generate with family and friends, making it difficult to effectively share memories with others.

[0647] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0648] In this invention, the server includes means for automatically classifying uploaded information based on its attributes, means for converting the classified information into an integrated format, and means for adding narration and music to digital content using a generative AI model to create visual deliverables based on a storyline. This enables users to integrate diverse digital content and manage and share memories in a visually appealing way.

[0649] "Information" refers to digital data owned by a user, including multimedia content such as photographs and videos.

[0650] "Attributes" refer to characteristics and metadata related to content, such as the date and time of shooting, location, and type of content.

[0651] "Unified format" refers to a standard data structure or layout for organizing digital content as a coherent visual narrative.

[0652] A "generative AI model" refers to a system that uses artificial intelligence technology to automatically add relevant information, music, and narration to digital content.

[0653] A "storyline" refers to a framework for structuring content in a visual product as a narrative, either temporally or logically.

[0654] "Visual deliverables" refer to the final product created from the user's digital content in a format that is easy to understand visually.

[0655] The specific system for implementing this invention uses a program that can run on various devices. The server receives digital data uploaded by the user and activates a data classification module, which automatically classifies the digital data based on its attributes. The classified data is converted into an integrated format through a format conversion module, after which a generative AI model begins operation. The generative AI model uses natural language processing technology to automatically add voice narration and background music to the digital content, creating a visual output.
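The module sequence described above (classification, format conversion, then the generative AI model) can be sketched as a chain of functions applied in order. The stage functions below are placeholders that only tag a dictionary; their names and behavior are assumptions, not the modules of the source.

```python
from functools import reduce


def run_pipeline(data, stages):
    """Feed `data` through each stage in order and return the final result."""
    return reduce(lambda acc, stage: stage(acc), stages, data)


# Placeholder stages; real modules would do far more than tag a dict.
def classify(item):
    category = "image" if item["name"].endswith(".jpg") else "other"
    return {**item, "category": category}


def convert(item):
    return {**item, "format": "unified"}


def add_narration(item):
    return {**item, "narration": f"Narration for {item['name']}"}


result = run_pipeline({"name": "beach.jpg"}, [classify, convert, add_narration])
print(result)
```

Expressing the stages as a list makes it straightforward to insert or reorder steps, such as an emotion-estimation stage.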

[0656] The server utilizes cloud services to perform these processes and employs open-source libraries (e.g., TensorFlow and PyTorch) for data processing. It also uses cloud services such as AWS and Google Cloud for storage. On the user's device, a specific application receives these visual artifacts, allowing them to be displayed, edited, and shared with others through an interface.

[0657] As a concrete example, when a user uploads photos from their summer vacation to the system, the server analyzes the photos and generates a video with narration introducing the shooting locations and summer events. This video is delivered to the user's application via cloud storage, and the user can share the video with family and friends through the app.

[0658] An example of a prompt for a generative AI model would be: "Analyze the data captured by the user and provide narration with historical and geographical information related to the data. Specifically, add information about the history and climate of the beach mentioned as 'a trip to a beach visited in the summer of 2023.'"

[0659] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0660] Step 1:

[0661] Users select digital content from their smartphones or PCs and upload it to the server. The input data consists of photos and videos, and the server receives raw data as output.

[0662] Step 2:

[0663] The server analyzes the metadata of the received data (e.g., date and time of capture, location) and automatically classifies the data based on its attributes. This process utilizes information from a database and employs an attribute clustering algorithm. The output is digital data classified by attribute.

[0664] Step 3:

[0665] The server converts the classified digital data into a unified format. Here, it converts different media formats into a standard data format (e.g., MP4 or JPEG) for unified processing. The input is the classified digital data, and the output is the unified format data.

[0666] Step 4:

[0667] The server uses a generative AI model to add narration and music to digital content. This process employs natural language generation and speech synthesis technologies to generate prompts based on the user's digital data. The input is integrated digital data, and the output is a visual artifact with added music and narration.

[0668] Step 5:

[0669] The generated visual artifacts are saved to cloud storage by the server. Services such as AWS and Google Cloud are used, and the data is managed with backups and security measures in place. The output is the visual artifact stored in the cloud.

[0670] Step 6:

[0671] The user's device downloads visual artifacts from cloud storage and displays them visually within a dedicated application. Here, a user interface is generated via the device's GUI, providing the output. The user can then share this with family and friends. The input is the visual artifact in the cloud, and the output is the artifact displayed on the device.
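The attribute clustering in Step 2 is not specified further in the text. One simple possibility, shown here as an assumption, is to split capture timestamps into separate events wherever the gap between consecutive shots exceeds a threshold. The function name `cluster_by_gap` and the six-hour default are hypothetical.

```python
from datetime import datetime, timedelta


def cluster_by_gap(timestamps, max_gap=timedelta(hours=6)):
    """Split timestamps into clusters wherever the gap exceeds `max_gap`."""
    clusters = []
    for ts in sorted(timestamps):
        if clusters and ts - clusters[-1][-1] <= max_gap:
            clusters[-1].append(ts)
        else:
            clusters.append([ts])
    return clusters


shots = [
    datetime(2023, 8, 1, 10, 0),
    datetime(2023, 8, 1, 11, 30),
    datetime(2023, 8, 3, 9, 0),
]
events = cluster_by_gap(shots)
print(len(events))
```

Each resulting cluster could serve as one scene in the storyline, with narration generated per cluster.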

[0672] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0673] This invention combines a system for organizing and utilizing digital content with an emotion engine that recognizes user emotions and reflects them in content suggestions.

[0674] Users upload digital content such as photos and videos to the server using their own digital devices. If they have analog content, they scan it, digitize it, and upload it to the server as well.

[0675] The server analyzes the uploaded digital data and automatically classifies it based on its format. Generative AI is used to convert and optimize the format and file size of the digital content. Then, supplementary information such as regional and historical context is retrieved from an external database, and this information is integrated with the user's profile and life story. This process generates a digital book.

[0676] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. This information is used in the digital book generation process to suggest content tailored to the user's emotions. For example, if the engine analyzes that the user is feeling nostalgic, it will suggest specific photos and layouts that evoke that emotion.

[0677] The generated digital books are securely stored in cloud storage. Users can access, view, and edit their digital books at any time through a dedicated application or web browser.

[0678] For example, when a user uploads photos and videos documenting their travel experiences, the server organizes them chronologically, adding historical context and related stories about the places visited. Furthermore, when the emotion engine senses the user's level of excitement at the time of upload, it automatically adjusts the color scheme and content placement to enhance that excitement. Ultimately, the user can enjoy a personalized digital book that strongly reflects their emotions.
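Adjusting the color scheme and layout to the estimated emotion can be sketched as a lookup from the dominant emotion to a presentation preset. The preset table, the emotion labels, and the function name `choose_theme` are all hypothetical; the source does not specify any particular mapping.

```python
# Hypothetical presentation presets; the source does not specify any.
THEMES = {
    "nostalgia": {"palette": "sepia", "layout": "scrapbook"},
    "excitement": {"palette": "vivid", "layout": "full-bleed"},
    "calm": {"palette": "pastel", "layout": "wide-margin"},
}
DEFAULT_THEME = {"palette": "neutral", "layout": "grid"}


def choose_theme(emotion_scores):
    """Return the preset for the highest-scoring emotion, or a default."""
    if not emotion_scores:
        return DEFAULT_THEME
    dominant = max(emotion_scores, key=emotion_scores.get)
    return THEMES.get(dominant, DEFAULT_THEME)


print(choose_theme({"nostalgia": 0.7, "excitement": 0.2}))
```

Falling back to a neutral preset keeps the digital book presentable when the emotion estimate is missing or unrecognized.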

[0679] The following describes the processing flow.

[0680] Step 1:

[0681] Users select digital content such as photos and videos using their own digital devices and upload them to the server. If analog content is available, they use scanners or digital conversion functions to digitize it before uploading it.

[0682] Step 2:

[0683] The server analyzes the received data and identifies the file format of each piece of content (JPEG, PNG, MP4, etc.). Subsequently, the server uses metadata (date and time of capture, location information, etc.) to classify the content and organize it into categories.

[0684] Step 3:

[0685] The server activates generative AI to convert data in different formats into a unified format. At the same time, it optimizes the resolution and file size to suit the user's terminal environment, enabling efficient data management.

[0686] Step 4:

[0687] The server connects to an external database to retrieve local information and historical background data related to the uploaded content. This information, along with the user's profile information and life story, is incorporated into the creation of the digital book.

[0688] Step 5:

[0689] The emotion engine activates and analyzes the user's emotions based on past user behavior data and current interactions. Based on this, the server generates recommendations tailored to the user's emotions and creates a digital book that reflects them.

[0690] Step 6:

[0691] The completed digital book is stored in cloud storage by the server. Users can access this digital book from their devices. Furthermore, they can add new information and media, and share it with family and friends.

[0692] Step 7:

[0693] Users view and edit digital books using a dedicated application or web browser. The emotional feedback users provide to the system is continuously stored in the emotion engine as learning data and used to suggest content for future use.
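The accumulation of emotional feedback in Step 7 could be modeled, as one assumption among many, by an exponential moving average that blends each new feedback score into a stored preference. The function name `update_preference`, the `rate` parameter, and the per-tag dictionary are hypothetical.

```python
def update_preference(current, feedback, rate=0.2):
    """Blend a new feedback score into the stored one (exponential moving average)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return (1.0 - rate) * current + rate * feedback


# Stored affinity per content tag, updated as the user reacts to suggestions.
preferences = {"travel": 0.5}
preferences["travel"] = update_preference(preferences["travel"], 1.0)
print(round(preferences["travel"], 2))
```

A small `rate` makes the stored preference change slowly, so a single reaction does not dominate later suggestions.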

[0694] (Example 2)

[0695] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0696] There is a need to efficiently organize the vast amounts of digital content data and provide personalized content suggestions based on the user's emotional state and characteristics. Furthermore, it is essential that the generated digital deliverables are always stored in an up-to-date state and are consistently accessible across multiple devices. Additionally, an interface that allows users to easily share and add information through sharing functions is crucial.

[0697] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0698] This invention includes a server that automatically classifies uploaded digital data according to its format, a server that utilizes generative AI technology to convert and optimize the classified digital data into a unified format, and a server that acquires regional information and historical contextual information from an external storage device. This enables users to efficiently manage vast amounts of digital content and receive personalized content suggestions based on their emotions and characteristics. Furthermore, the generated digital output is always stored in an up-to-date state, providing an environment in which users can consistently access and share it across multiple devices.

[0699] "Uploaded digital data" refers to electronic information such as photos, videos, and audio sent by users from their information terminals to a server.

[0700] "Generative AI technology" is a technique that uses artificial intelligence to automatically perform data format transformation and optimization, and it is a technology that utilizes machine learning models.

[0701] An "external storage device" is an information source that allows a server to access and obtain supplementary information such as regional information and historical context.

[0702] "Digital deliverables" refer to digital books or similar content in electronic format that are generated by a server and provided to the user.

[0703] "Emotional state" refers to the user's feelings at a given time, estimated based on their past actions and interactions.

[0704] An "information terminal" is a device such as a computer or smartphone that a user uses to view, edit, and upload digital data.

[0705] This system is designed to help users efficiently organize and utilize digital content. Users upload digital data such as photos and videos from their information terminals to the server. Analog content is first digitized using the terminal's scanner and then uploaded. The server receives this data and performs analysis and optimization using a generative AI model. The generative AI model may be built with a machine learning framework such as TensorFlow or PyTorch.

[0706] The server automatically classifies uploaded data based on its format. This process utilizes machine learning clustering algorithms and image recognition technologies. After classification, the data is converted into a unified format using generative AI and optimized for size. The optimized data is integrated with regional and historical contextual supplementary information collected via external storage devices. The server then generates and delivers a digital book as a digital output.

[0707] The emotion engine estimates the user's emotional state based on their past behavior and current interactions. The estimated state is reflected in the content presented by the server, providing a personalized experience. The IBM Watson Sentiment Analysis API may be used for the emotion analysis.

[0708] The generated digital books are securely stored in cloud storage and can be accessed at any time from the user's device. The digital books can be viewed and edited through a dedicated application or web browser.

[0709] As a concrete example, a user uploads photos and videos from their trip to a server, from which a digital book containing historical information and stories about the places they visited is generated. This digital book's color scheme and layout are optimally adjusted based on an emotion engine that analyzes the user's state of excitement.

[0710] An example of a prompt might be, "Analyze the user's emotions during the trip, gather information about the destinations, and compile everything into a digital book. Include a suggested color palette that expresses the user's excitement." This allows users to obtain a personalized digital deliverable that strongly reflects their emotions and experiences.

[0711] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0712] Step 1:

[0713] Users upload digital content such as photos and videos to the server using their information terminals. This serves as input, and the server receives the digital content. Specifically, the HTTP protocol is used to send the files. As output, the server saves the received data to its storage.

[0714] Step 2:

[0715] The server analyzes uploaded digital data and automatically classifies it based on its format. The input is uploaded digital data, and image recognition and metadata analysis are performed. Specifically, image features are extracted using OpenCV, and the classified results are output.
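
As a minimal sketch of the format-based classification in Step 2, the following Python groups uploaded file names by coarse media category using the standard-library `mimetypes` module. It only stands in for the image recognition and metadata analysis that the text attributes to OpenCV; the function name `classify_by_format` and the sample file names are illustrative assumptions and are not part of the disclosure.

```python
import mimetypes
from collections import defaultdict


def classify_by_format(filenames):
    """Group file names by the top-level MIME type guessed from the extension."""
    groups = defaultdict(list)
    for name in filenames:
        mime, _encoding = mimetypes.guess_type(name)
        # "image/jpeg" -> "image"; names with no recognizable type -> "unknown"
        category = mime.split("/")[0] if mime else "unknown"
        groups[category].append(name)
    return dict(groups)


# Hypothetical upload batch used only for illustration.
uploads = ["trip.jpg", "beach.png", "clip.mp4", "memo.txt", "notes.unknownext"]
classified = classify_by_format(uploads)
```

A real classifier would inspect file contents rather than trust extensions; the extension-based guess is used here only to keep the sketch dependency-free.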

[0716] Step 3:

[0717] The server applies a generative AI model to classified digital data, performing format conversion and size optimization. The input is classified digital data, and format conversion and compression are performed using tools such as TensorFlow and PyTorch. The output is optimized data.

[0718] Step 4:

[0719] The server retrieves regional information and supplementary information about the historical context from external storage devices. The input here is classified digital data, and the process involves retrieving appropriate information via external APIs. Specifically, a REST API is used to retrieve destination-related information, and integrated data is output.
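
The REST retrieval in Step 4 can be illustrated by how a request for destination-related information might be assembled. This is a sketch under stated assumptions: the endpoint `https://api.example.com/v1/places` and the query parameter names (`lat`, `lon`, `date`, `lang`) are hypothetical, since the text does not name a concrete external service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the source does not identify the external API.
BASE_URL = "https://api.example.com/v1/places"


def build_context_request(lat, lon, taken_on, lang="en"):
    """Build a GET URL asking for regional and historical context near a capture location."""
    params = {
        "lat": f"{lat:.4f}",   # latitude rounded to 4 decimal places
        "lon": f"{lon:.4f}",   # longitude rounded to 4 decimal places
        "date": taken_on,      # capture date as an ISO 8601 string
        "lang": lang,
    }
    return f"{BASE_URL}?{urlencode(params)}"


url = build_context_request(35.0116, 135.7681, "2024-04-02")
```

The resulting URL would then be fetched with any HTTP client, and the response merged with the classified data to form the integrated data that the step outputs.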

[0720] Step 5:

[0721] The emotion engine analyzes the user's past behavior and current interactions to estimate their emotional state. The input is user behavior history data, which is processed using tools such as the IBM Watson Sentiment Analysis API. The output is the estimated emotional state, which is then used in the next step of the process.
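
One possible way to turn behavior history into an estimated emotional state, shown only as a sketch, is a recency-weighted average of per-event scores. This is not the IBM Watson API mentioned above and is not claimed to be the method of the disclosure; the event names, the valence table, and the 30-day half-life are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    kind: str        # hypothetical event name, e.g. "like", "share", "skip"
    age_days: float  # how long ago the interaction happened


# Hypothetical valence per event kind, in the range [-1, 1].
VALENCE = {"like": 0.8, "share": 1.0, "rewatch": 0.6, "skip": -0.5, "delete": -1.0}


def estimate_emotion(history, half_life_days=30.0):
    """Return (score, label), where score is a recency-weighted valence in [-1, 1]."""
    total = 0.0
    weight_sum = 0.0
    for item in history:
        # Exponential decay: an event half_life_days old counts half as much.
        weight = 0.5 ** (item.age_days / half_life_days)
        total += weight * VALENCE.get(item.kind, 0.0)
        weight_sum += weight
    score = total / weight_sum if weight_sum else 0.0
    label = "positive" if score > 0.2 else "negative" if score < -0.2 else "neutral"
    return score, label
```

For example, a "share" made today combined with a "skip" made 30 days ago yields weights of 1 and 0.5 and therefore a score of 0.5, which this sketch labels "positive".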

[0722] Step 6:

[0723] The server generates personalized digital deliverables based on the estimated emotional state. The supplementary information and the estimated emotional state are both input for digital book generation, and color tones and layouts appropriate to the user's emotions are applied. The output is the final digital book provided to the user.
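
The step of applying color tones and layouts according to the estimated emotion can be sketched as a lookup from an emotion label to a presentation theme. The labels, color values, and layout names below are hypothetical presets chosen for illustration; the text does not specify any particular palette or layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Theme:
    palette: tuple  # hex color strings
    layout: str     # name of a page layout preset


# Hypothetical presets keyed by emotion label.
THEMES = {
    "excited": Theme(("#FF6B35", "#FFD23F", "#FFFFFF"), "full-bleed collage"),
    "calm": Theme(("#5B8E7D", "#BCD8C1", "#F4F1DE"), "single photo per page"),
    "neutral": Theme(("#444444", "#BBBBBB", "#FFFFFF"), "two-column grid"),
}


def select_theme(emotion):
    """Pick the theme for an emotion label, falling back to the neutral preset."""
    return THEMES.get(emotion, THEMES["neutral"])


def build_page(photo, caption, context, emotion):
    """Combine media, supplementary context, and the emotion-dependent theme into one page."""
    theme = select_theme(emotion)
    return {
        "photo": photo,
        "caption": caption,
        "context": context,
        "palette": theme.palette,
        "layout": theme.layout,
    }
```

Calling `build_page("kyoto.jpg", "Day 1", "supplementary note", "excited")` would produce a page dictionary that uses the "full-bleed collage" layout, while an unrecognized label falls back to the neutral preset.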

[0724] Step 7:

[0725] The generated digital book is securely stored in cloud storage. The input is the digital book itself, and the output is the storage location information on the cloud. Users can access this digital book through a dedicated app or web browser, and further content updates are possible by viewing and editing it.

[0726] (Application Example 2)

[0727] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0728] With the increasing volume of digital content, there is a growing need to organize and effectively utilize personalized content that responds to the emotions of individual users. Traditional methods have struggled to suggest content that reflects the emotional state of users, making it difficult to generate electronic documents that strongly convey individual experiences.

[0729] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0730] In this invention, the server includes means for automatically classifying uploaded electronic information according to its format, means for converting the classified electronic information into a unified format, means for acquiring regional information and auxiliary information such as temporal background from an external storage device, and means for analyzing the user's emotional state using an emotion engine and selecting and editing content. This makes it possible to generate personalized electronic documents that are in line with the user's emotions.

[0731] "Electronic information" refers to all data expressed in digital format, and includes a variety of media such as audio, video, text, and images.

[0732] "Format" refers to attributes that indicate the structure, type, and representation of data, and is an indicator of how data is organized and processed.

[0733] "External storage devices" refer to external databases and cloud storage accessed via the internet or other connections, allowing data to be retrieved from a variety of sources.

[0734] "Regional information" refers to data related to a specific geographical area, including information such as geographical names and locations, and cultural and social backgrounds.

[0735] "Temporal context" refers to information about a specific time or era, including historical events, historical background, and characteristic events of each era.

[0736] "User" refers to an individual or legal entity that uses the present invention and is a person who receives the services provided by this system.

[0737] "Personal data" refers to information about an individual, such as the user's name, address, age, and occupation, and includes information that can identify the user.

[0738] "Life history" refers to information about a user's past behavior, experiences, and activities, including their personal history and life events.

[0739] "Electronic documents" refer to books and documents generated in digital format, and are publications that integrate multiple media, including text, images, and videos.

[0740] An "emotion engine" refers to a system or software that analyzes and interprets a user's emotional state and has the function of adjusting content according to the user's emotions.

[0741] "Electrical devices" refer to equipment used for processing, storing, and displaying electronic information, and include smartphones, tablets, and personal computers.

[0742] The system that implements this application operates by uploading digital information to the cloud through an application installed on a user's device, such as a smartphone or tablet. The uploaded electronic information is automatically classified by the server based on its format and converted into a unified format. Furthermore, auxiliary information such as regional information and temporal context is retrieved from an external storage device and used together with the electronic information.

[0743] The server is equipped with an emotion engine that analyzes the user's emotional state based on their past behavior. For this analysis, an emotion analysis library and generative AI models are used. Based on the emotional data analyzed by the emotion engine, content appropriate to the user is selected and edited. As a result, personalized electronic documents are generated.

[0744] The generated electronic documents are stored in network storage and provided in a format viewable on the user's device. This system allows users to obtain a personalized experience tailored to their own emotions.

[0745] As a concrete example, when a user uploads photos or videos taken in a park on a holiday to the application, the system analyzes the feelings of happiness and relaxation experienced at that time. Based on this, it can generate a story-driven video with calming music playing in the background. An appropriate prompt would be, "Please generate an emotional and calming video story that evokes a relaxed mood using photos and videos taken in a park on a holiday."

[0746] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0747] Step 1:

[0748] The user uploads electronic information such as photos and videos to the cloud using their device. The input for this step is the digital data captured by the user, and the output is the data stored in cloud storage. The file structure is modified based on the user's actions.

[0749] Step 2:

[0750] The server automatically classifies electronic information stored in the cloud according to its format. The input is uploaded digital data, and the output is a dataset classified by category. The server uses machine learning algorithms to analyze the data format and assign it to the appropriate category.

[0751] Step 3:

[0752] The server converts the classified data into a unified format. The input for this step is classified digital data, and the output is the converted data in a unified format. The server uses a data conversion tool to convert data in different formats into compatible formats.
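
The conversion into a unified format can be sketched as a planning step that decides, per classified file, which target format applies and whether conversion is needed at all. The target extensions in `UNIFIED_TARGETS` and the function name `plan_conversion` are assumptions for illustration; the text does not state which unified formats are used, and the actual transcoding is left to whatever conversion tool the server employs.

```python
from pathlib import PurePosixPath

# Hypothetical unified target extension per media category.
UNIFIED_TARGETS = {"image": ".jpg", "video": ".mp4", "audio": ".m4a", "text": ".txt"}


def plan_conversion(path, category):
    """Return (source, target, needs_conversion) for one classified file."""
    src = PurePosixPath(path)
    target_ext = UNIFIED_TARGETS.get(category)
    if target_ext is None:
        # Unknown category: no target format, leave the file untouched.
        return (str(src), None, False)
    if src.suffix.lower() == target_ext:
        # Already in the unified format: keep the file as it is.
        return (str(src), str(src), False)
    return (str(src), str(src.with_suffix(target_ext)), True)
```

For instance, `plan_conversion("album/scan.tiff", "image")` returns `("album/scan.tiff", "album/scan.jpg", True)`, whereas a file that already has the target extension is passed through unchanged.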

[0753] Step 4:

[0754] The server retrieves regional information and temporal contextual information from external storage. The input consists of location information and timestamps contained within the data, while the output is this information along with related auxiliary data. The server retrieves this information by calling an external API.

[0755] Step 5:

[0756] The server analyzes the user's emotional state using an emotion engine. The input is the user's past behavioral data, and the output is the estimated emotional state. The server utilizes an emotion analysis library and a generative AI model to estimate the user's emotions.

[0757] Step 6:

[0758] The server selects and edits content based on emotional states. The input is a dataset containing estimated emotional states and supplementary information, and the output is a personalized electronic document. The server uses an editing algorithm to select appropriate elements that highlight the emotions and generate a digital book.

[0759] Step 7:

[0760] The server stores the generated electronic documents in network storage and prepares them in a format that can be provided to the user's terminal. The input is a personalized electronic document, and the output is a digital book in a format viewable by the user. The server uses prompts to perform necessary format conversions.
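
The flow of Steps 2 through 6 above can be sketched as a small pipeline. This is an illustrative skeleton only: the upload (Step 1) and storage (Step 7) are omitted, the emotion label is passed in rather than estimated, and every name (`Item`, `classify`, `unify`, `enrich`, `build_document`, `run_pipeline`) as well as the extension tables are assumptions that do not appear in the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    category: str = "unknown"
    unified_name: str = ""
    context: dict = field(default_factory=dict)


def classify(item):
    """Step 2: assign a category from the file extension (simplified)."""
    ext = item.name.rsplit(".", 1)[-1].lower() if "." in item.name else ""
    table = {"jpg": "image", "png": "image", "mp4": "video", "mov": "video"}
    item.category = table.get(ext, "unknown")
    return item


def unify(item):
    """Step 3: compute the name the item would have in the unified format."""
    target = {"image": "jpg", "video": "mp4"}.get(item.category)
    stem = item.name.rsplit(".", 1)[0]
    item.unified_name = f"{stem}.{target}" if target else item.name
    return item


def enrich(item, lookup):
    """Step 4: attach auxiliary information returned by a caller-supplied lookup."""
    item.context = lookup(item)
    return item


def build_document(items, emotion):
    """Steps 5-6: select usable items and assemble the document for a given emotion."""
    chosen = [i for i in items if i.category != "unknown"]
    pages = [{"media": i.unified_name, "context": i.context} for i in chosen]
    return {"emotion": emotion, "pages": pages}


def run_pipeline(names, lookup, emotion):
    items = [enrich(unify(classify(Item(n))), lookup) for n in names]
    return build_document(items, emotion)
```

The `lookup` callable stands in for the external API call of Step 4, so the skeleton can be exercised without network access, for example with `run_pipeline(["park.PNG", "walk.mov"], lambda item: {"place": "park"}, "relaxed")`.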

[0761] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0762] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions given in the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0763] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0764] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0765] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.
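
The geometry of the emotion map described above can be sketched as a small data structure in which each emotion has a ring (distance from the center) and an angle. This is an illustrative model only: the class name `MappedEmotion`, the ring numbering, the angle convention, and any placement of a specific emotion are assumptions and are not taken from Figure 9.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedEmotion:
    name: str
    ring: int          # 1 = innermost (most primitive); larger = further out
    angle_deg: float   # 0 = 3 o'clock, 90 = 12 o'clock (counter-clockwise)

    def xy(self):
        """Cartesian position, rounded to suppress floating-point noise near the axes."""
        rad = math.radians(self.angle_deg)
        return (round(self.ring * math.cos(rad), 9), round(self.ring * math.sin(rad), 9))

    def side(self):
        """Left half is the reaction side, right half the situation side."""
        x, _ = self.xy()
        return "situation" if x > 0 else "reaction" if x < 0 else "both"

    def tone(self):
        """Upper half corresponds to pleasure, lower half to displeasure."""
        _, y = self.xy()
        return "pleasure" if y > 0 else "displeasure" if y < 0 else "neutral"


def map_distance(a, b):
    """Euclidean distance on the map; emotions likely to co-occur lie close together."""
    return math.dist(a.xy(), b.xy())
```

With this convention, an emotion placed at 135 degrees falls on the reaction side with a pleasant tone, and one placed at exactly 90 degrees is reported as belonging to both sides, matching the description of the regions above and below the center.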

[0766] These emotions are distributed at the 3 o'clock position on the emotion map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the emotion map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0767] The inside of the emotion map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the emotion map 400 an emotion is located, the more visible it becomes (that is, the more it is expressed in actions).

[0768] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0769] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0770] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
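
One way to obtain the property that emotions located close together have similar values is to build training targets that decay with distance on the map from the labeled emotion. The following sketch shows such a target construction together with the final selection of the dominant emotion. It is a hypothetical illustration, not the training procedure of the disclosure: the 2-D positions, the Gaussian decay, and the `sigma` value are all assumptions.

```python
import math

# Hypothetical 2-D positions on the emotion map (not taken from Figure 9 or 10).
POSITIONS = {
    "reassured": (1.0, 0.2),
    "calm": (1.1, 0.3),
    "confident": (1.2, 0.1),
    "anxious": (1.0, -1.5),
}


def soft_targets(labeled, sigma=0.5):
    """Emotion values that equal 1 at the labeled emotion and decay with map distance."""
    origin = POSITIONS[labeled]
    return {
        name: math.exp(-math.dist(origin, pos) ** 2 / (2 * sigma ** 2))
        for name, pos in POSITIONS.items()
    }


def dominant_emotion(values):
    """Determine the user's emotion as the one with the highest emotion value."""
    return max(values, key=values.get)


targets = soft_targets("calm")
```

With these assumed positions, the targets for "reassured" and "confident" stay close to the target for "calm", while the target for the distant "anxious" is near zero, which is the kind of similarity among neighboring emotions described for Figure 10.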

[0771] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0772] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0773] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes the specific processing according to the specific processing program 56.

[0774] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0775] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0776] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0777] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0778] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0779] Furthermore, the hardware structure of these various processors can more specifically utilize electrical circuits that combine circuit elements such as semiconductor devices. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as it does not deviate from the main purpose.

[0780] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that you may delete unnecessary parts, add new elements, or replace elements in the descriptions and illustrations presented above, as long as you do not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0781] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted to be incorporated by reference.

[0782] The following is further disclosed regarding the embodiments described above.

[0783] (Claim 1)

[0784] A means of automatically classifying uploaded digital data according to its format,

[0785] A means of converting classified digital data into a unified format,

[0786] A means of obtaining regional information and supplementary information on historical context from external databases,

[0787] A means of generating digital books that takes into account the user's profile and life journey,

[0788] A means of providing the generated digital book to the user's terminal in a viewable format,

[0789] A system that includes this.

[0790] (Claim 2)

[0791] The system according to claim 1, comprising means for storing the generated digital book in cloud storage and for automatically adjusting it in response to user device updates.

[0792] (Claim 3)

[0793] The system according to claim 1, comprising means for adding additional information to a digital book or providing an interface for sharing specific pages.

[0794] "Example 1"

[0795] (Claim 1)

[0796] A means of automatically classifying uploaded electronic data based on its attributes,

[0797] A means of converting classified electronic data into unified attributes,

[0798] A means of obtaining location information and supplementary information on historical context from an external database,

[0799] A means of generating e-books using a generative AI model that takes into account user characteristics and history,

[0800] A means of providing the generated e-book to the user's device in a referable format,

[0801] A system that includes this.

[0802] (Claim 2)

[0803] The system according to claim 1, comprising means for storing the generated e-books in a remote storage device and for automatically adjusting them in response to updates to the user's device.

[0804] (Claim 3)

[0805] The system according to claim 1, comprising means for adding additional information to an ebook or providing a connection surface for sharing specific pages.

[0806] "Application Example 1"

[0807] (Claim 1)

[0808] A means of automatically classifying uploaded information based on its attributes,

[0809] A means of converting classified information into an integrated format,

[0810] Means of obtaining additional information on relevant information and historical background from external sources,

[0811] A means of generating digital records that take into account the user's characteristics and history,

[0812] A means of providing the generated digital record to the user's device in a viewable format,

[0813] A means of adding narration and music to digital content using a generative AI model to create visual deliverables based on a storyline,

[0814] A system that includes this.

[0815] (Claim 2)

[0816] The system according to claim 1, comprising means for storing the generated digital records in an online storage system and for automatically adjusting them in accordance with updates to the user's device.

[0817] (Claim 3)

[0818] The system according to claim 1, comprising means for adding supplementary information to a digital record or providing an operating environment for sharing specific content.

[0819] "Example 2 of combining an emotion engine"

[0820] (Claim 1)

[0821] A means of automatically classifying uploaded digital data according to its format,

[0822] A means of using generative AI technology to convert classified digital data into a unified format and optimize it,

[0823] A means of obtaining regional information and supplementary information on historical context from an external storage device,

[0824] A means of generating digital deliverables, taking into account user characteristics and the passage of time,

[0825] A means of analyzing the user's emotional state and reflecting it in the proposed content of the deliverables,

[0826] A means of providing the generated digital output in a viewable format to the user's information terminal,

[0827] A system that includes this.

[0828] (Claim 2)

[0829] The system according to claim 1, comprising means for storing the generated digital output in a remote information storage system and for automatically adjusting it in response to changes in the user's equipment.

[0830] (Claim 3)

[0831] The system according to claim 1, comprising means for providing means for adding additional information to digital deliverables or for sharing specific parts of them.

[0832] "Application example 2 when combining with an emotional engine"

[0833] (Claim 1)

[0834] A means for automatically classifying uploaded electronic information according to its format,

[0835] A means of converting classified electronic information into a unified format,

[0836] A means of obtaining regional information and auxiliary information regarding the temporal background from an external storage device,

[0837] A means of generating electronic documents while considering the user's personal data and lifestyle history,

[0838] A means of analyzing the user's emotional state using an emotion engine and selecting and editing content accordingly,

[0839] A means of providing the generated electronic document to the user's electrical device in a displayable format,

[0840] A system that includes this.

[0841] (Claim 2)

[0842] The system according to claim 1, which includes means for storing generated electronic documents in a network storage device and for automatically adjusting them in response to user terminal updates.

[0843] (Claim 3)

[0844] The system according to claim 1, comprising means for adding supplementary information to an electronic document or providing an operation screen for sharing a specific page.

[Explanation of symbols]

[0845] 10, 210, 310, 410: Data processing system; 12: Data processing device; 14: Smart device; 214: Smart glasses; 314: Headset-type terminal; 414: Robot

Claims

1. A system comprising: means for automatically classifying uploaded information based on its attributes; means for converting the classified information into an integrated format; means for obtaining supplementary information on related information and historical background from external sources; means for generating a digital record that takes into account the user's characteristics and history; means for providing the generated digital record to the user's device in a viewable format; and means for adding narration and music to digital content using a generative AI model to create a visual deliverable based on a storyline.

2. The system according to claim 1, comprising means for storing the generated digital records in an online storage system and for automatically adjusting them in accordance with updates to the user's device.

3. The system according to claim 1, comprising means for adding supplementary information to a digital record or providing an operating environment for sharing specific content.