system

The system automates biodata and clinical trial data processing, using machine learning and generative AI to efficiently identify and visualize new drug candidates, addressing the inefficiencies and high costs of traditional drug development methods.

JP2026104531A · Pending · Publication Date: 2026-06-25 · SOFTBANK GROUP CORP

Patent Information

Authority / Receiving Office: JP
Patent Type: Application
Current Assignee / Owner: SOFTBANK GROUP CORP
Filing Date: 2024-12-13
Publication Date: 2026-06-25

Abstract

[Problem] To provide a system. [Solution] A system including: means for collecting biological information and clinical trial information; means for analyzing the collected information with a machine learning algorithm to identify new drug candidate substances; means for simulating the identified new drug candidate substances in a virtual space; means for generating new chemical structures using a generative AI model; means for analyzing relevant materials to extract useful findings; output device means for visualizing and displaying the analysis results; means for a user to construct a logical hypothesis using those results and select candidate substances; and means for generating and recommending customized content based on the user's interests.

Description

Technical Field

[0001] The technology of the present disclosure relates to a system.

Background Art

[0002] Patent Document 1 discloses a persona chatbot control method performed by at least one processor, including steps of receiving a user utterance, adding the user utterance to a prompt including an instruction sentence related to an explanation of a chatbot character, encoding the prompt, and inputting the encoded prompt into a language model to generate a chatbot utterance in response to the user utterance.

Prior Art Documents

Patent Documents

[0003]

Patent Document 1

Summary of the Invention

Problems to be Solved by the Invention

[0004] The new drug development process is very time-consuming and costly and has a low success rate. In particular, it is difficult to efficiently analyze huge amounts of biodata, clinical trial data, and complex genetic information to identify promising new drug candidates, and this work is often carried out manually on the basis of experience. In addition, conventional methods require a great deal of time for hypothesis construction and experimental planning, so there is demand for optimizing research and development.

Means for Solving the Problems

[0005] This invention provides a means for automatically collecting biodata and clinical trial data and performing preprocessing, including noise reduction. The collected data is then analyzed using a machine learning model to identify potential drug molecules. Furthermore, in silico simulations are performed on the identified candidates, and new chemical structures are generated using a generative AI model. Finally, relevant literature is analyzed to extract useful insights. The analysis results are visualized on a terminal, enabling the user to construct hypotheses and select candidate molecules based on the results. This provides a system that enables efficient and highly accurate new drug development.

[0006] "Biodata" refers to digital data of biological information obtained from living organisms, including gene sequences, protein information, and cellular activity data.

[0007] "Clinical trial data" refers to data collected to evaluate the safety and effectiveness of new drugs or treatments, and includes subject responses and trial results.

[0008] A "machine learning model" is an artificial intelligence algorithm trained to identify patterns in data and make predictions.

[0009] "In silico simulation" is a method of virtually experimenting with biological processes using a computer, and it predicts results without actually conducting an experiment.

[0010] A "generative AI model" is an artificial intelligence model that has the ability to generate new data and structures, and is particularly used for creative tasks.

[0011] "Visualization" is a method of representing data visually and providing information in a way that is easy for people to understand intuitively.

[0012] A "hypothesis" is a theoretical premise established to explain observational results, and experiments and verifications are conducted based on this hypothesis.

[0013] A "molecule" is a collection of atoms linked together by chemical bonds, and is the basic building block of matter.

Brief Description of the Drawings

[0014]
[Figure 1] A conceptual diagram showing an example of the configuration of a data processing system according to the first embodiment.
[Figure 2] A conceptual diagram showing an example of the main functions of a data processing device and a smart device according to the first embodiment.
[Figure 3] A conceptual diagram showing an example of the configuration of a data processing system according to the second embodiment.
[Figure 4] A conceptual diagram showing an example of the main functions of a data processing device and smart glasses according to the second embodiment.
[Figure 5] A conceptual diagram showing an example of the configuration of a data processing system according to the third embodiment.
[Figure 6] A conceptual diagram showing an example of the main functions of a data processing device and a headset-type terminal according to the third embodiment.
[Figure 7] A conceptual diagram showing an example of the configuration of a data processing system according to the fourth embodiment.
[Figure 8] A conceptual diagram showing an example of the main functions of a data processing device and a robot according to the fourth embodiment.
[Figure 9] An emotion map on which multiple emotions are mapped.
[Figure 10] An emotion map on which multiple emotions are mapped.
[Figure 11] A sequence diagram showing the processing flow of the data processing system in Example 1.
[Figure 12] A sequence diagram showing the processing flow of the data processing system in Application Example 1.
[Figure 13] A sequence diagram showing the processing flow of the data processing system in Example 2, which incorporates an emotion engine.
[Figure 14] A sequence diagram showing the processing flow of the data processing system in Application Example 2, which incorporates an emotion engine.

Mode for Carrying Out the Invention

[0015] Hereinafter, an example of an embodiment of a system according to the technology of the present disclosure will be described with reference to the accompanying drawings.

[0016] First, the terms used in the following description will be explained.

[0017] In the following embodiments, a processor denoted by a reference numeral (hereinafter simply referred to as "processor") may be a single arithmetic unit or a combination of multiple arithmetic units. The processor may also be a single type of arithmetic unit or a combination of multiple types of arithmetic units. Examples of arithmetic units include a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a GPGPU (General-Purpose computing on Graphics Processing Units), and an APU (Accelerated Processing Unit).

[0018] In the following embodiments, a RAM (Random Access Memory) denoted by a reference numeral is a memory in which information is temporarily stored and which is used as a work memory by the processor.

[0019] In the following embodiments, a storage denoted by a reference numeral is one or more non-volatile storage devices that store various programs, various parameters, and the like. Examples of non-volatile storage devices include flash memory (e.g., an SSD (Solid State Drive)), magnetic disks (e.g., hard disks), and magnetic tapes.

[0020] In the following embodiments, a communication interface (I/F) denoted by a reference numeral is an interface that includes a communication processor, an antenna, and the like. The communication interface manages communication between multiple computers. Examples of communication standards applicable to the communication interface include wireless communication standards such as 5G (5th Generation Mobile Communication System), Wi-Fi (registered trademark), and Bluetooth (registered trademark).

[0021] In the following embodiments, "A and/or B" is synonymous with "at least one of A and B." That is, "A and/or B" means that it may be A alone, B alone, or a combination of A and B. Furthermore, in this specification, the same concept as "A and/or B" applies when three or more items are linked by "and/or."

[0022] [First Embodiment]

[0023] Figure 1 shows an example of the configuration of the data processing system 10 according to the first embodiment.

[0024] As shown in Figure 1, the data processing system 10 includes a data processing device 12 and a smart device 14. An example of the data processing device 12 is a server.

[0025] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and/or a LAN (Local Area Network).

[0026] The smart device 14 comprises a computer 36, a reception device 38, an output device 40, a camera 42, and a communication interface 44. The computer 36 comprises a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The reception device 38, output device 40, and camera 42 are also connected to the bus 52.

[0027] The reception device 38 is equipped with a touch panel 38A and a microphone 38B, etc., and receives user input. The touch panel 38A receives user input by detecting contact with an object (e.g., a pen or finger). The microphone 38B receives user input by detecting the user's voice. The control unit 46A transmits data indicating the user input received by the touch panel 38A and microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the data indicating the user input.

[0028] The output device 40 includes a display 40A and a speaker 40B, and presents data to the user 20 by outputting the data in a form perceptible to the user 20 (e.g., audio and/or text). The display 40A displays visible information such as text and images according to instructions from the processor 46. The speaker 40B outputs audio according to instructions from the processor 46. The camera 42 is a small digital camera equipped with an optical system such as a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor.

[0029] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various types of information between processor 46 and processor 28 via network 54.

[0030] Figure 2 shows an example of the main functions of the data processing device 12 and the smart device 14.

[0031] As shown in Figure 2, in the data processing device 12, specific processing is performed by the processor 28. A specific processing program 56 is stored in the storage 32. The specific processing program 56 is an example of a "program" related to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes it on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 according to the specific processing program 56 executed on the RAM 30.

[0032] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0033] In the smart device 14, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The reception output program 60 is used by the data processing system 10 in conjunction with the specific processing program 56. The processor 46 reads the reception output program 60 from the storage 50 and executes it on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0034] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0035] This invention is implemented as an automated system that accelerates new drug development and improves its success rate. The system is built around three elements: a server, a terminal, and a user.

[0036] Server functions

[0037] The server first automatically collects large amounts of biodata and clinical trial data from diverse databases. Next, it preprocesses this data to remove noise and improve quality. The purified data is then analyzed by machine learning models to identify molecules and compounds that could be candidates for new drugs. The server further performs in silico simulations to verify the biological effects of candidate molecules in a virtual environment. It also uses generative AI models to generate novel chemical structures and explore new possibilities. The server also analyzes relevant literature, supporting research by extracting highly relevant insights.

[0038] Terminal functions

[0039] On the terminal, analysis results and generated information sent from the server are displayed visually. The terminal provides an interface that allows users to intuitively understand these results and decide on the next action. Through data visualization, users can easily grasp the relationships and trends between datasets and formulate concrete hypotheses.

[0040] User functions

[0041] Based on the data presented by the terminal, users select the most promising new drug candidates. Furthermore, users can formulate hypotheses based on the analysis results and design and adjust protocols for actual experiments and clinical trials. The results of the experiments conducted by the user are fed back to the server and used to improve the accuracy of the AI model.

[0042] Specific example

[0043] For example, in the early stages of new drug development, the server collects relevant information from diverse gene databases and analyzes it to discover promising molecules for specific disease conditions. The function of these molecules, including any newly generated molecules, is then verified and evaluated through in silico simulations. Finally, drawing on their experience and expertise, the user plans the next research and development steps from the presented options, enabling efficient new drug development.

[0044] The following describes the processing flow.

[0045] Step 1:

[0046] The server automatically collects necessary data from various biodatabases and clinical trial databases. It retrieves data using APIs and web scraping and stores it in the database.

[0047] Step 2:

[0048] The server preprocesses the collected data. It performs noise reduction and imputation of missing values to improve data quality and ensure the accuracy of the analysis.

[0049] Step 3:

[0050] The server feeds the pre-processed data into a machine learning model for analysis. The model extracts patterns and features from the data to identify potential drug molecules.

[0051] Step 4:

[0052] The server performs in silico simulations to evaluate the biological effects of identified new drug candidate molecules in a virtual environment. The simulation results are then used in the next step.

[0053] Step 5:

[0054] The server uses a generative AI model to generate new chemical structures. This allows for the exploration of further possibilities by comparing them with existing molecules.

[0055] Step 6:

[0056] The server analyzes relevant literature from the bibliographic database and extracts information and insights useful for research. This information is provided to users as support information.

[0057] Step 7:

[0058] The terminal visualizes analysis results, simulation data, generated molecular information, and related literature information sent from the server and displays them on the user interface.

[0059] Step 8:

[0060] Based on the information provided on the terminal, users formulate hypotheses and plan the next steps in new drug development. They then design experimental plans for the selected drug candidate molecules.

[0061] Step 9:

[0062] The user's experimental results are fed back to the server. This allows the AI model to be further trained, improving the accuracy of the analysis.

[0063] (Example 1)

[0064] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0065] The drug development process consumes a great deal of time and resources, making it necessary to identify and evaluate drug candidates more efficiently and effectively. This invention was developed to improve the success rate of drug development by processing the vast amount and complexity of data generated in this process, and by enabling highly accurate data analysis and hypothesis building.

[0066] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0067] In this invention, the server includes means for collecting and storing information, means for analyzing the information using a machine learning algorithm to identify new product candidates, and means for generating new structures using a generative AI algorithm. This makes it possible to efficiently analyze vast amounts of data and rapidly identify promising new drug candidates.

[0068] "Means for collecting and storing information" refers to a function that automatically retrieves relevant data from various sources and stores that data in storage as needed.

[0069] "Means for analyzing the information using a machine learning algorithm to identify new product candidates" refers to a function that applies a specific algorithm to the collected data to detect useful patterns and trends, and then uses them to identify new and potentially valuable products.

[0070] "Means for performing virtual simulations" refers to a function that runs simulated experiments on a computer to predict real-world effects and analyzes the results.

[0071] "Means for generating new structures using a generative AI algorithm" refers to a function that uses artificial intelligence technology to automatically create new compounds and designs based on existing data.

[0072] "Means for analyzing related literature and extracting useful insights" refers to a function that scans literature databases and identifies important information and insights related to a specific theme.

[0073] "Means for providing a user interface for visualizing and displaying analysis results" refers to a function that provides an interface for graphically representing complex data and results, enabling users to understand them intuitively.

[0074] "Means for users to construct hypotheses using the results and select candidate products" refers to support functions that enable users to make inferences based on the provided information and to make important decisions based on those inferences.

[0075] This invention is a system for streamlining data analysis and decision-making in new drug development and product design. The system mainly consists of three components: a server, a terminal, and a user.

[0076] The server first collects and stores information such as biodata and research results from multiple databases. Specifically, it retrieves data through APIs using the Python programming language and processes it with the Pandas data processing library. After collection, the server applies machine learning algorithms to the data to identify promising new drug candidates. Machine learning frameworks such as TensorFlow® and scikit-learn are used for this analysis.

[0077] Next, the server uses a generative AI model to generate new chemical structures. For example, it uses OpenAI® models to generate new ideas through prompts that utilize natural language processing technology. Furthermore, a text analysis library is used to analyze relevant literature based on the collected data and extract useful insights.

[0078] The terminal visualizes the analysis results and generated information from the server, providing them as graphs and charts for easy user understanding. Tableau and Power BI are used as data visualization tools. This enables users to make intuitive decisions.

[0079] Based on the information provided on the terminal, users construct hypotheses about new products and make selections. The users' selections, experimental results, and observations are fed back to the server and used to continuously improve the machine learning model.

[0080] A concrete example of a prompt is: "In the development of next-generation anticancer drugs, please tell me the steps necessary to select a compound that is effective against a specific gene mutation." Using this prompt, the AI model provides new insights and supports the user's decision-making.

[0081] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0082] Step 1:

[0083] The server first accesses databases to collect information. Specifically, it retrieves necessary data from biodatabases and clinical trial databases via APIs. Inputs include search queries and access keys, and the output is a structured dataset. This data is stored and prepared for use in the next processing step.
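As one non-limiting illustration of this collection step, the following Python sketch sends a search query and an access key to a database API and flattens the JSON reply into a structured dataset. The endpoint URL, the `query` and `api_key` parameter names, and the `results` response field are hypothetical and are not taken from this disclosure; the fetch function is injectable so that the parsing can be exercised without network access.

```python
import json
import urllib.parse
import urllib.request


def http_get(url):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def collect_records(base_url, query, api_key, fetch=http_get):
    """Query one database API and return a list of flat record dicts.

    `query`, `api_key`, and the `results` field are assumed names used
    only for illustration; a real database defines its own.
    """
    params = urllib.parse.urlencode({"query": query, "api_key": api_key})
    payload = json.loads(fetch(f"{base_url}?{params}"))
    records = []
    for item in payload.get("results", []):
        # Keep only scalar fields so every record has a tabular shape.
        records.append({k: v for k, v in item.items()
                        if v is None or isinstance(v, (str, int, float, bool))})
    return records


def fake_fetch(url):
    """Stand-in for a live API so the sketch runs offline."""
    return json.dumps({"results": [
        {"id": "T1", "gene": "EGFR", "response_rate": 0.42, "sites": ["A", "B"]},
        {"id": "T2", "gene": "KRAS", "response_rate": None},
    ]})


dataset = collect_records("https://example.org/api/trials", "EGFR mutation",
                          "DEMO_KEY", fetch=fake_fetch)
print(dataset)
```

Missing values are kept as `None` at this stage so that the subsequent preprocessing step can decide whether to impute or remove them.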

[0084] Step 2:

[0085] The server preprocesses the collected data. Specifically, it uses the Python Pandas library to detect missing values and outliers and to impute or remove them. The input is the collected raw data, and the output is a clean, unified dataset. This dataset is then ready for further analysis.
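As a minimal sketch of this preprocessing step, the example below imputes missing values with the median and removes outliers by their distance from the median, measured in units of the median absolute deviation (MAD). The choice of median imputation, the MAD rule, the `max_dev` threshold, and the `ic50` column are assumptions made for illustration. The disclosure names Pandas for this step; the sketch uses only the Python standard library so that it stays self-contained.

```python
import statistics


def impute_and_filter(rows, column, max_dev=5.0):
    """Fill missing values with the median, then drop MAD-based outliers."""
    present = [r[column] for r in rows if r[column] is not None]
    median = statistics.median(present)
    # Median absolute deviation: a spread estimate that one extreme
    # value cannot inflate, unlike the standard deviation.
    mad = statistics.median(abs(v - median) for v in present)
    cleaned = []
    for r in rows:
        value = median if r[column] is None else r[column]
        if mad == 0 or abs(value - median) / mad <= max_dev:
            cleaned.append({**r, column: value})
    return cleaned


raw = [
    {"id": "A", "ic50": 1.0}, {"id": "B", "ic50": 1.2},
    {"id": "C", "ic50": None}, {"id": "D", "ic50": 0.9},
    {"id": "E", "ic50": 1.1}, {"id": "F", "ic50": 50.0},
]
clean = impute_and_filter(raw, "ic50")
print([r["id"] for r in clean])
```

A robust rule is used here because, in very small samples, a single extreme value can widen quartile-based or standard-deviation-based limits enough to hide itself.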

[0086] Step 3:

[0087] The server applies machine learning algorithms to preprocessed data for analysis. Using TensorFlow or Scikit-learn, it identifies promising new drug candidates from the input data. The input is a preprocessed dataset, and the output is the results of identifying new drug candidates. Ranking and scoring are then performed based on these results.
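The scoring and ranking described in this step can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: a small logistic regression trained by batch gradient descent stands in for a TensorFlow or scikit-learn model, and the two features (a binding score and a toxicity score), the training labels, and the candidate names are all hypothetical.

```python
import math


def predict(weights, bias, x):
    """Return the predicted probability that a molecule is active."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights with batch gradient descent."""
    n = len(features)
    weights, bias = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            err = predict(weights, bias, x) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical training data: [binding score, toxicity score] -> active?
X = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
y = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic(X, y)

candidates = {"cand-1": [0.85, 0.15], "cand-2": [0.4, 0.6], "cand-3": [0.6, 0.3]}
ranking = sorted(((name, predict(weights, bias, x))
                  for name, x in candidates.items()),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The output of this step is the ranked list of (candidate, score) pairs, which matches the "ranking and scoring" described above.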

[0088] Step 4:

[0089] The server runs in silico simulations, which are used to evaluate the biological effects in a virtual environment and are performed using simulation software. The input is a new drug candidate identified by machine learning, and the output is the simulation results. These results are used to predict the efficacy and safety of the molecule.
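The disclosure does not specify which simulation is run. Purely as an assumed stand-in, the following sketch simulates a one-compartment pharmacokinetic model with first-order absorption and elimination, integrated with the explicit Euler method, and reports the peak plasma concentration and the time at which it occurs. The model choice and all parameter values (`ka`, `ke`, the dose, and the volume) are illustrative assumptions.

```python
def simulate_concentration(dose_mg, ka, ke, volume_l, hours, dt=0.01):
    """Euler-integrate a one-compartment model with first-order absorption.

    ka and ke are the absorption and elimination rate constants (1/h).
    Returns (times in hours, plasma concentrations in mg/L).
    """
    gut, central = dose_mg, 0.0
    times, conc = [0.0], [0.0]
    steps = int(round(hours / dt))
    for i in range(1, steps + 1):
        absorbed = ka * gut * dt          # amount moving gut -> plasma
        eliminated = ke * central * dt    # amount cleared from plasma
        gut -= absorbed
        central += absorbed - eliminated
        times.append(i * dt)
        conc.append(central / volume_l)
    return times, conc


times, conc = simulate_concentration(100, ka=1.0, ke=0.2, volume_l=10, hours=24)
c_max = max(conc)
t_max = times[conc.index(c_max)]
print(f"peak {c_max:.2f} mg/L at {t_max:.2f} h")
```

Summary values such as the peak concentration could then be compared against assumed efficacy and safety thresholds when predicting how a candidate molecule behaves.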

[0090] Step 5:

[0091] The server generates new chemical structures using a generative AI model. This process takes text prompts as input and runs the generative model to produce unique structures. Specifically, it uses OpenAI's generative model, providing prompt text as input to create innovative molecules and compounds.
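As a hedged sketch of this prompt-driven generation step, the example below builds a text prompt, passes it to a generative model, and keeps only those returned structures that are not already known. The `generate` callable is a hypothetical stand-in for whichever generative model API is used, and the one-SMILES-string-per-line reply format is an assumption; no real vendor API is called, and the example strings are simple placeholder molecules.

```python
def build_prompt(target, known_smiles, n_candidates):
    """Compose the text prompt sent to the generative model."""
    lines = [
        f"Propose {n_candidates} new small molecules that may act on {target}.",
        "Return one SMILES string per line and nothing else.",
        "Do not repeat any of these known structures:",
    ]
    lines.extend(f"- {s}" for s in known_smiles)
    return "\n".join(lines)


def propose_structures(target, known_smiles, generate, n_candidates=3):
    """Call the model and return novel, de-duplicated candidate strings."""
    reply = generate(build_prompt(target, known_smiles, n_candidates))
    seen, novel = set(known_smiles), []
    for line in reply.splitlines():
        candidate = line.strip().lstrip("- ").strip()
        if candidate and candidate not in seen:
            seen.add(candidate)
            novel.append(candidate)
    return novel


def stub_model(prompt):
    """Canned reply standing in for a real generative model call."""
    return "CCO\n- CC(=O)O\nCCN\nCCO\n"


new_structures = propose_structures("EGFR", ["CCO"], stub_model)
print(new_structures)
```

In practice, the returned strings would also need chemical validity checks before being passed to simulation; that validation is outside the scope of this sketch.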

[0092] Step 6:

[0093] The server analyzes relevant materials and extracts useful insights. It utilizes natural language processing techniques to analyze text data and identify important information. The input is text data from literature, and the output is analyzed key points and insights. These insights serve as a guide for determining the direction of research and development.
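One simple way to picture this literature analysis step is frequency-based extraction: count the informative terms in a text, then rank sentences by the summed frequency of the terms they contain. The sketch below assumes this technique for illustration only; the disclosure says merely that natural language processing is used, and the stopword list and sample text are hypothetical.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "was", "with",
             "for", "on", "that", "this", "were", "by", "at", "no", "not"}


def tokenize(text):
    """Lowercase the text and return its non-stopword tokens."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]


def key_points(text, top_n=1):
    """Return (top sentences, top terms) ranked by term frequency."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text)
                 if s.strip()]
    freq = Counter(tokenize(text))
    ranked = sorted(sentences,
                    key=lambda s: sum(freq[w] for w in tokenize(s)),
                    reverse=True)
    return ranked[:top_n], freq.most_common(3)


abstract = ("Compound X inhibits EGFR in lung cancer cells. "
            "EGFR inhibition reduced tumor growth in mice. "
            "The weather was mild.")
sentences, terms = key_points(abstract)
print(sentences, terms)
```

Summed frequency favors longer sentences, so a production system would likely normalize by sentence length or use a trained language model instead.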

[0094] Step 7:

[0095] The terminal visually displays the analysis results and generated information. Data visualization tools are used to represent the results in graphs and charts. The input is the analysis results, and the output is visualized information provided in a user-friendly format.
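As an illustrative sketch of preparing analysis results for display, the example below serializes a ranked candidate list to CSV text, a format that charting tools can generally import, and renders a plain-text bar preview. The column names and the scores are hypothetical, and the text bars are only a stand-in for the graphs and charts described above.

```python
import csv
import io


def to_chart_csv(ranking):
    """Serialize (name, score) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["candidate", "score"])
    for name, score in ranking:
        writer.writerow([name, f"{score:.3f}"])
    return buffer.getvalue()


def text_bars(ranking, width=20):
    """Render one horizontal bar per candidate, scaled to the top score."""
    top = max(score for _, score in ranking)
    return "\n".join(f"{name:<8}{'#' * round(width * score / top)}"
                     for name, score in ranking)


results = [("cand-1", 0.9), ("cand-3", 0.6), ("cand-2", 0.3)]
csv_text = to_chart_csv(results)
bars = text_bars(results)
print(csv_text)
print(bars)
```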

[0096] Step 8:

[0097] The user builds hypotheses and makes selections based on the information visualized on the terminal, reviewing the results and selecting the most suitable new drug candidate. The input is the visualized information, and the output is the next action and the feedback determined by the user. This feedback is sent to the server and contributes to further improving the accuracy of the AI model.
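The feedback loop in this step can be sketched as a store that turns each user-reported outcome into a labeled example and merges it with the existing training data before the model is retrained. The class name, the field names, and the rule that a later result replaces an earlier one for the same candidate are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Accumulates user feedback as labeled examples for retraining."""
    examples: dict = field(default_factory=dict)

    def record(self, candidate_id, features, succeeded):
        # A later result for the same candidate replaces the earlier one.
        self.examples[candidate_id] = (list(features), 1 if succeeded else 0)

    def training_set(self, base_features, base_labels):
        """Return the base data extended with all recorded feedback."""
        features = list(base_features) + [f for f, _ in self.examples.values()]
        labels = list(base_labels) + [y for _, y in self.examples.values()]
        return features, labels


store = FeedbackStore()
store.record("cand-1", [0.85, 0.15], succeeded=True)
store.record("cand-2", [0.40, 0.60], succeeded=False)
store.record("cand-2", [0.40, 0.60], succeeded=True)  # repeated experiment

features, labels = store.training_set([[0.9, 0.1]], [1])
print(features, labels)
```

The extended dataset would then be passed to whatever training routine the server uses, so each feedback cycle adds experimentally confirmed labels to the model's training data.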

[0098] (Application Example 1)

[0099] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0100] Traditionally, efficiently analyzing large amounts of biodata and clinical trial data in new drug development and rapidly identifying promising drug candidates has been difficult. Furthermore, visualizing this data and tailoring content generation based on user interests required significant manual effort and time. Therefore, there was a need to automate the delivery of individually optimized content to users simultaneously with the discovery of new drug candidates.

[0101] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0102] In this invention, the server includes means for collecting biological information and clinical trial information, means for analyzing the information with machine learning algorithms to identify new drug candidates, means for generating new chemical structures using generative AI models, and means for generating and recommending customized content based on the user's interests. This enables the user to quickly and accurately identify new drug candidates and receive personalized content.

[0103] "Biological information" refers to data related to biological or health status, including genetic information and clinical test results.

[0104] "Clinical trial information" refers to data obtained from trials conducted to evaluate the safety and efficacy of pharmaceuticals.

[0105] A "machine learning algorithm" is a set of techniques for learning patterns from data and performing predictions and classifications.

[0106] A "new drug candidate" refers to a molecule or substance that is expected to be developed as a pharmaceutical product.

[0107] "Simulation in a virtual space" is a process that uses computer technology to simulate real-world phenomena.

[0108] A "generative AI model" is a model that generates new data and content based on artificial intelligence technology.

[0109] "Materials" refer to relevant documents and datasets, which are sources of information for gaining insights.

[0110] "Output device means" refers to a device or interface that displays analysis results and other information in a way that allows the user to visually confirm them.

[0111] A "logical hypothesis" is an unproven theory set up to explain observed phenomena.

[0112] "Personal interest-based content" refers to information and entertainment that is individually customized based on a user's past behavior and preferences.

[0113] To implement this invention, a system is constructed that uses three main components: a server, a terminal, and a user.

[0114] The server automatically collects biological information and clinical trial information from various databases and performs preprocessing such as noise reduction to refine the information. Next, machine learning algorithms are used to analyze this information and identify potential new drug candidates. The efficacy and safety of the initially identified candidate substances are then evaluated through simulations in a virtual space. In parallel, generative AI models generate new chemical structures, advancing the exploration of new drug possibilities.

[0115] The terminal visually displays to the user the analysis results and the new chemical structures generated on the server. This gives the user an intuitive interface for grasping the overall picture of the data and constructing logical hypotheses. Furthermore, the terminal provides content based on the user's interests and preferences, supporting the personalization of information.

[0116] Users utilize the information provided via their terminals to select the most promising new drug candidates. Furthermore, users build hypotheses based on their expertise and design protocols for actual experiments and clinical trials. The results of the experiments conducted by users are sent back to the server through a feedback function, contributing to updating the AI model and improving its accuracy.

[0117] As a concrete example, when generating new movie content, a generative AI model can be used to suggest a new story based on the user's past viewing history. An example of this prompt might be the text, "Please suggest the most suitable new story from all movie genres based on the user's interests."
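As a rough illustration of interest-based recommendation, the sketch below derives a genre profile from a viewing history and ranks catalog items by how well their genres match that profile. The scoring rule (summed profile weight divided by the square root of the number of genres), the titles, and the genres are hypothetical; the disclosure itself describes using a generative AI model with a prompt for this purpose.

```python
import math
from collections import Counter


def interest_profile(history):
    """Return each genre's share of all genre tags in the history."""
    counts = Counter(g for item in history for g in item["genres"])
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.items()}


def recommend(profile, catalog, top_n=2):
    """Rank catalog titles by profile overlap, damped by genre count."""
    def score(item):
        overlap = sum(profile.get(g, 0.0) for g in item["genres"])
        return overlap / math.sqrt(len(item["genres"]))
    ranked = sorted(catalog, key=score, reverse=True)
    return [item["title"] for item in ranked[:top_n]]


history = [
    {"title": "A", "genres": ["sci-fi", "thriller"]},
    {"title": "B", "genres": ["sci-fi"]},
    {"title": "C", "genres": ["drama"]},
]
catalog = [
    {"title": "Orbit", "genres": ["sci-fi"]},
    {"title": "Verdict", "genres": ["drama", "thriller"]},
    {"title": "Laughs", "genres": ["comedy"]},
    {"title": "Signal", "genres": ["sci-fi", "thriller"]},
]
profile = interest_profile(history)
picks = recommend(profile, catalog)
print(picks)
```

The resulting profile could equally be inserted into a text prompt for a generative model, in line with the prompt example given above.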

[0118] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0119] Step 1:

[0120] The server automatically collects biological information and clinical trial information from various databases. It uses various databases as input and outputs an initial, unprocessed dataset. This data contains the information necessary for discovering new drug candidates and forms the basis for analysis in subsequent processing steps.

[0121] Step 2:

[0122] The server preprocesses the collected data to remove noise. Using the initial raw dataset as input, it produces improved, clean data as output. Specifically, it identifies outliers and duplicate data and removes or corrects them.
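The duplicate handling mentioned in this step can be sketched as follows, assuming that two records are duplicates when a chosen set of key fields match after trimming whitespace and ignoring letter case. The key fields and the sample identifiers are hypothetical.

```python
def drop_duplicates(rows, key_fields):
    """Keep the first row for each normalized key; return (kept, dropped)."""
    seen, kept, dropped = set(), [], []
    for row in rows:
        # Normalize so that " nct-1 " and "NCT-1" compare as equal.
        key = tuple(str(row.get(f, "")).strip().lower() for f in key_fields)
        (dropped if key in seen else kept).append(row)
        seen.add(key)
    return kept, dropped


rows = [
    {"trial_id": "NCT-1", "compound": "X", "response": 0.42},
    {"trial_id": " nct-1 ", "compound": "x", "response": 0.42},
    {"trial_id": "NCT-2", "compound": "X", "response": 0.37},
]
kept, dropped = drop_duplicates(rows, ["trial_id", "compound"])
print(len(kept), len(dropped))
```

Returning the dropped rows alongside the kept ones lets the removals be logged or reviewed rather than silently discarded.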

[0123] Step 3:

[0124] The server uses machine learning algorithms to analyze clean data and identify potential new drug candidates. It takes clean data as input and generates a list of potential drug candidates as output. Through the analysis, a model is trained to understand the relationships between data and the effectiveness of specific molecules.

[0125] Step 4:

[0126] The server performs virtual simulations of identified new drug candidates to evaluate their biological effects. It uses a list of new drug candidates as input and collects simulation results as output. The simulator virtually tests the molecular structure and function of the compounds, virtually confirming their effectiveness.

[0127] Step 5:

[0128] The server automatically generates new chemical structures using a generative AI model. It takes a trained AI model and simulation results as input and generates candidate new chemical structures as output. This includes the generative AI exploring new chemical possibilities while referring to prompts.

[0129] Step 6:

[0130] The terminal visually displays analysis results and generated chemical structure information obtained from the server to the user. It uses data provided by the server as input and generates a visual report as output. The user interface presents information in an easily understandable format through data visualization.

[0131] Step 7:

[0132] The user constructs hypotheses and selects new drug candidates based on the information presented on the terminal. Visual reports are used as input, and the system provides a refined list of candidate substances as output. The user then leverages their expertise to design the protocols for subsequent experiments and tests.

[0133] Step 8:

[0134] The user selection results and experimental protocols are fed back to the server and used to update the AI model. The system receives the user selection results as input and builds a more accurate machine learning model as output. This cycle improves prediction accuracy and the potential for new drug development in subsequent trials.
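
The preprocessing in Step 2 of this flow (removing duplicates and outliers) can be illustrated with a minimal Python sketch. The record layout, the field name `ic50`, and the outlier criterion (a modified z-score based on the median absolute deviation) are assumptions made for illustration; the description does not specify them.

```python
from statistics import median


def remove_duplicates(records):
    """Drop exact duplicate records while preserving first-seen order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def remove_outliers(records, field, threshold=3.5):
    """Drop records whose modified z-score for `field` exceeds `threshold`.

    The median absolute deviation (MAD) criterion is an assumed choice.
    """
    values = [rec[field] for rec in records]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(records)  # no spread to judge against; keep everything
    return [
        rec for rec in records
        if 0.6745 * abs(rec[field] - med) / mad <= threshold
    ]


def preprocess(records, field):
    """Raw dataset in, cleaned dataset out."""
    return remove_outliers(remove_duplicates(records), field)


# Hypothetical raw measurements: one duplicate row and one extreme value.
raw = [
    {"id": 1, "ic50": 1.0},
    {"id": 1, "ic50": 1.0},
    {"id": 2, "ic50": 1.2},
    {"id": 3, "ic50": 0.9},
    {"id": 4, "ic50": 1.1},
    {"id": 5, "ic50": 50.0},
]
clean = preprocess(raw, "ic50")
```

The sketch removes outliers rather than correcting them; a correction strategy would replace the filter in `remove_outliers`.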

[0135] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0136] This invention provides a system that enhances the user experience and enables efficient research progress by incorporating emotion recognition into the new drug development process. The system consists of components including a server, terminals, a user interface, and an emotion engine.

[0137] Server-based functionality

[0138] The server performs traditional data collection, preprocessing, analysis using machine learning models, in silico simulations, and molecular generation using generative AI models. Analysis results and the latest research insights are transmitted to the terminal to support user decision-making. The server also analyzes user sentiment data collected by the emotion engine and incorporates it into the research process.

[0139] Terminal functions

[0140] The terminal visually displays data and analysis results sent from the server, providing an interface that is easy for the user to understand and operate. Furthermore, the terminal analyzes the user's facial expressions and voice tone through its built-in emotion engine to recognize their emotional state.

[0141] Functions powered by the emotion engine

[0142] The emotion engine monitors the user's emotions in real time and analyzes changes in the user's stress levels and interests. This information is sent to the server, allowing the system's responses and interface settings to be dynamically adjusted according to the user's emotions. For example, if a user shows excitement in response to a particular analysis result, the system will quickly provide relevant additional information to support the user's curiosity. Also, if a high stress level is detected in the user, the emotion engine can recommend content to promote relaxation.
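
The two example rules in this paragraph can be sketched as a small decision function. The 0-to-1 scales, the threshold values, the action names, and the choice to give stress priority over interest are all assumptions for illustration and are not stated in the description.

```python
from dataclasses import dataclass


@dataclass
class EmotionReading:
    stress: float    # assumed scale: 0.0 (calm) to 1.0 (highly stressed)
    interest: float  # assumed scale: 0.0 (indifferent) to 1.0 (excited)


def choose_response(reading, stress_limit=0.7, interest_limit=0.7):
    """Pick how the system should adapt to the latest emotion reading."""
    # Assumed priority: relieve stress before offering more material.
    if reading.stress >= stress_limit:
        return "recommend_relaxation_content"
    if reading.interest >= interest_limit:
        return "show_related_information"
    return "no_change"
```

For instance, `choose_response(EmotionReading(stress=0.2, interest=0.9))` selects the additional-information action, while a reading with high stress selects the relaxation content regardless of interest.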

[0143] User functions

[0144] Users utilize the analysis results and emotional feedback displayed on the terminal to formulate hypotheses and plan experiments and clinical trials. The insights provided by the emotion engine help users understand their own emotional state and support better decision-making. Furthermore, users can adjust the responsiveness of the emotion engine and the system through the feedback function.

[0145] Specific example

[0146] For example, suppose the system detects tension from the user's facial expression while they are analyzing a new drug candidate. In this case, the terminal starts playing relaxing background music and simultaneously presents the analysis results again from a different perspective, helping the user consider the most appropriate option without feeling rushed. Furthermore, additional relevant materials may be provided based on the user's interests, further improving research efficiency.

[0147] The following describes the processing flow.

[0148] Step 1:

[0149] The server automatically collects diverse biodata and clinical trial data and stores it in a database. This prepares the rich information necessary for analysis.

[0150] Step 2:

[0151] The server preprocesses the collected data, removing noise and imputing missing values. By creating a clean dataset, it improves the accuracy of the analysis.

[0152] Step 3:

[0153] The server inputs pre-processed data into a machine learning model and performs data analysis. This analysis identifies molecules and targets that could be potential new drug candidates.

[0154] Step 4:

[0155] The server performs in silico simulations on identified candidate molecules and evaluates their biological effects in a virtual environment. This allows for the prediction of the effects of promising molecules.

[0156] Step 5:

[0157] The server generates new chemical structures using a generative AI model. This allows it to explore new possibilities based on existing data.

[0158] Step 6:

[0159] The server analyzes relevant literature and extracts useful insights. This information is provided in a format that is relevant to the research context.

[0160] Step 7:

[0161] The terminal visualizes the analysis results, simulation data, and generated molecular information transmitted from the server, and displays them in an easy-to-understand manner for the user.

[0162] Step 8:

[0163] The emotion engine within the terminal recognizes the user's emotions in real time from their facial expressions and voice, and sends the analysis results to the server.

[0164] Step 9:

[0165] Based on the emotional data it receives, the server adjusts the interface and presents information according to the user's state. For example, if the user is feeling stressed, it suggests content on the terminal that promotes relaxation.

[0166] Step 10:

[0167] Users formulate hypotheses based on the presented analysis results and emotion-based insights, and plan the next steps in new drug development. They develop specific experimental plans and continuously improve the AI model by providing feedback to the system.
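
The imputation of missing values in Step 2 of this flow can be sketched as follows. Mean imputation, the use of `None` to mark a missing value, and the field name `dose` are assumptions for illustration; the description does not name an imputation method.

```python
from statistics import mean


def impute_missing(records, field):
    """Return copies of `records` with None in `field` replaced by the mean
    of the observed values. The input records are left unmodified."""
    observed = [r[field] for r in records if r[field] is not None]
    if not observed:
        raise ValueError(f"no observed values for {field!r}")
    fill = mean(observed)
    return [
        {**r, field: fill if r[field] is None else r[field]}
        for r in records
    ]


# Hypothetical rows with one missing measurement.
rows = [{"dose": 10.0}, {"dose": None}, {"dose": 20.0}]
imputed = impute_missing(rows, "dose")
```

Mean imputation is only one option; a median or model-based fill could be substituted by changing how `fill` is computed.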

[0168] (Example 2)

[0169] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart device 14 as the "terminal".

[0170] The research process in new drug development requires time-consuming and resource-intensive analysis and the extraction of insights from diverse data. However, conventional methods struggle to provide users with the most relevant information efficiently and dynamically, lacking responsiveness that considers users' emotional states and interests. This leads to a decline in the quality of decision-making, which is a significant challenge.

[0171] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0172] In this invention, the server includes means for collecting biological information and test data, means for identifying new drug candidates using an artificial intelligence model, and computing means for visualizing and displaying the analysis results. This enables efficient processing of the diverse data collected and dynamic information provision that takes into account the user's emotional state.

[0173] "Biological information" refers to data about the structure, function, or evolution of living organisms, and is used in new drug development and health analysis.

[0174] "Test data" refers to data collected through scientific and clinical trials, and is information used to evaluate the efficacy and safety of products and drugs.

[0175] An "artificial intelligence model" is a computer program that learns patterns from data and makes predictions and judgments based on machine learning and deep learning technologies.

[0176] "Virtual simulation" is a technique that uses computer software to mimic real-world processes and systems, and is a method for verifying hypotheses and designs before conducting physical tests.

[0177] A "generative artificial intelligence model" is an AI model that utilizes machine learning to create new data and information, and is used for generating molecular structures of new drugs and natural language text.

[0178] "Computing means" refers to devices and systems that use computers and electronic devices to process, analyze, and visualize data.

[0179] "Emotional data" refers to data that indicates a user's psychological or emotional state, obtained from things like their facial expressions and tone of voice.

[0180] "Solution" refers to a technical or methodological means of overcoming a specific problem and achieving an objective.

[0181] This invention is a system that enables efficient and dynamic information provision in the new drug development process. The system mainly consists of three components: a server, a terminal, and a user.

[0182] The server first collects biological information and test data from various sources. The collected data is preprocessed using programming languages such as Python and R to prepare it for appropriate analysis. This preprocessing removes noise from the data and makes it easier to extract important features. Next, the server analyzes the data using artificial intelligence models (e.g., Scikit-learn, TensorFlow) to identify potential new drugs. Furthermore, it utilizes generative artificial intelligence models (e.g., OpenAI models) to generate new chemical structures based on the prompt "Generate a potential new drug molecule."

[0183] The terminal plays the role of visually displaying the analysis results to the user. Using data visualization tools (e.g., Tableau, Power BI), it presents information clearly using methods such as heatmaps and box plots. Furthermore, the terminal is equipped with an emotion engine that analyzes the user's facial expressions and tone of voice to evaluate their emotional state. This enables adaptive information presentation tailored to the user's level of tension and interest.

[0184] Users utilize the information provided by the terminal to formulate hypotheses and select potential new drugs. During this process, feedback, including user emotional data, is sent to the server, allowing the responsiveness of the overall system to be adjusted dynamically. For example, if a user expresses interest during drug development analysis, relevant additional information and materials can be quickly provided, improving research efficiency.

[0185] In this way, this system enables the rapid and effective progress of the new drug development process through information provision and analysis that takes into account the emotional state of the user.

[0186] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0187] Step 1:

[0188] The server collects biological information and test data from external data sources. This input data is often provided as CSV files or through database APIs. The server then uses Python scripts to remove noise and process outliers in the collected data, formatting it for analysis. The pre-processed data is output, and the server proceeds to the next analysis step.

[0189] Step 2:

[0190] The server feeds pre-processed data into an artificial intelligence model. Based on the formatted data, it performs data analysis such as classification, regression analysis, and clustering using machine learning libraries (e.g., Scikit-learn, TensorFlow). The model's accuracy is verified through holdout validation or cross-validation, and useful analysis results are obtained as output.

[0191] Step 3:

[0192] The server inputs the prompt "Generate a new drug candidate molecule" into a generative artificial intelligence model based on the analysis results. Data generation is performed based on this prompt, and a new chemical structure is proposed. The generated molecular data is used as input for in silico simulations.

[0193] Step 4:

[0194] The terminal visually presents the analysis results and generated chemical structures transmitted from the server to the user. Based on the data provided as input, it generates visualized information using data visualization software (e.g., Tableau, Power BI). The results are output in a visually easy-to-understand format to support decision-making.

[0195] Step 5:

[0196] The terminal uses an emotion engine to analyze the user's facial expressions and voice tone. Emotional data collected through the camera and microphone is evaluated using Affectiva or similar emotion analysis software. Based on this input, the user's stress and interest levels are output and fed back to the server.

[0197] Step 6:

[0198] The server dynamically adjusts the system's response based on emotional data sent from the terminal. If the user is experiencing high stress levels, it generates content to promote relaxation and instructs the terminal to display it. For example, it may provide relaxing music or related materials as output to support the user's exploration activities.
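
The accuracy check in Step 2 of this flow can be illustrated with a holdout split. The sketch below uses only the Python standard library rather than Scikit-learn or TensorFlow; the one-feature threshold classifier and the synthetic data are assumptions chosen to keep the example small.

```python
import random


def holdout_split(rows, test_fraction=0.25, seed=0):
    """Shuffle `rows` and return (train, test) without overlap."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Place a decision threshold midway between the two class means.

    Assumes both classes (labels 0 and 1) occur in the training set.
    """
    active = [x for x, label in train if label == 1]
    inactive = [x for x, label in train if label == 0]
    return (sum(active) / len(active) + sum(inactive) / len(inactive)) / 2


def accuracy(threshold, rows):
    """Fraction of rows for which `x > threshold` matches the label."""
    hits = sum((x > threshold) == bool(label) for x, label in rows)
    return hits / len(rows)


# Synthetic, well-separated data: (feature value, label).
data = [(float(i), 0) for i in range(10)] + [(float(i), 1) for i in range(20, 30)]
train, test = holdout_split(data)
model = fit_threshold(train)
held_out_accuracy = accuracy(model, test)
```

The key point is that `accuracy` is computed on rows the model never saw during fitting; evaluating on `train` instead would overstate performance.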

[0199] (Application Example 2)

[0200] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart device 14 will be referred to as the "terminal."

[0201] This invention aims to solve the problem of difficulty in considering the influence of changes in human emotions during the new drug development process. Conventional approaches often involve data analysis and decision-making without considering the psychological state of researchers, and there is a need to improve the user experience. Furthermore, even in the home environment, there is a lack of systems that recognize users' emotions and support their daily activities.

[0202] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0203] In this invention, the server includes means for collecting biodata and clinical trial data, means for analyzing the data with a machine learning model to identify new drug candidates, means for simulating the identified new drug candidate molecules in a computer environment, means for recognizing emotions and analyzing the user's psychological state, and means for applying emotion recognition data to home appliances to support the user's daily activities. This makes it possible to improve the user experience in new drug development and to provide flexible support tailored to the user's emotional state in daily life.

[0204] "Biodata" refers to information obtained from living organisms and plays a crucial role in new drug development and medical treatment.

[0205] "Clinical trial data" refers to data based on trials conducted to confirm the effectiveness and safety of drugs or treatments.

[0206] A "machine learning model" is a collection of algorithms that automatically find patterns and regularities in data and perform predictions and classifications.

[0207] "Simulation in a computer environment" is a method of virtually reproducing real chemical processes on a computer and predicting their effects and properties before conducting an experiment.

[0208] A "generative AI model" is a computational model that uses artificial intelligence to generate new data and structures.

[0209] "Related literature" refers to research papers, databases, and other literature information that are useful in new drug development.

[0210] "Visualization" is a technique that visually represents data and its analysis results, making them easy to understand.

[0211] "Means for recognizing emotions" refers to technologies that use sensors to detect a user's facial expressions, voice, and behavior, and then analyze their psychological state.

[0212] "Means for applying emotion recognition data to home appliances" refers to technologies that use the obtained emotion data to adjust the operation of home appliances and support users in living comfortably.

[0213] To realize this invention, a system integrating multiple components is required. The server efficiently collects biodata and clinical trial data, preprocesses this data, and then analyzes it using machine learning models. Noise is removed during preprocessing, and useful information to aid in decision-making is extracted. Then, new drug candidates are identified, and simulations are performed on their molecules in a computer environment. Subsequently, new chemical structures are created using generative AI models, and relevant literature is analyzed to gain insights. The obtained results are visualized and provided to the user through the terminal.

[0214] The terminal functions as the user interface, displaying analysis results in a visual and user-friendly format. Furthermore, it incorporates an emotion engine to recognize emotions and monitor the user's psychological state in real time. Based on this information, in a home environment, it can adjust the operation of home appliances to match the user's emotional state. For example, it can play relaxation music when the user is feeling stressed.

[0215] Users build hypotheses based on visualized data and proceed with planning research and experiments. Furthermore, emotional feedback enables users to make better decisions and gain new insights that take their psychological state into account.

[0216] An example of a prompt that could be applied is: "Please give an example of how a home robot could suggest appropriate relaxation methods based on the user's emotions."

[0217] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0218] Step 1:

[0219] The server collects biodata and clinical trial data from users. This data includes, for example, genetic information and test results. The input data is preprocessed within the program to remove noise and output as a clean dataset.

[0220] Step 2:

[0221] Within the server, a machine learning model analyzes a clean dataset. This analysis uses an algorithm that extracts correlations and features from each data point to identify potential new drugs. The output is a list of identified drug candidates.

[0222] Step 3:

[0223] The server performs simulations of the identified drug candidate molecules in a computing environment. This step involves virtual experiments to determine how the molecular structure and biological activity function. The input is data on the drug candidate, and the output is the simulation results.

[0224] Step 4:

[0225] The server generates new chemical structures using a generative AI model. Based on the simulation results, it searches for the optimal molecular structure. In this process, existing structural data is taken as input, and new structural data generated by the AI is output.

[0226] Step 5:

[0227] The server analyzes relevant literature and extracts useful insights. Here, information from a literature database is used as input, and text containing insights useful for research is output.

[0228] Step 6:

[0229] The terminal presents the user with visualized analysis results. This output, displayed on the monitor, is converted into graphs and charts that are easy for the user to understand.

[0230] Step 7:

[0231] An emotion engine built into the terminal monitors the user's psychological state in real time. The user's facial expressions and voice input are analyzed by emotion recognition software and output as data indicating their state.

[0232] Step 8:

[0233] The terminal adjusts home appliances based on emotional data to support the user's daily life. For example, if it determines that the user is stressed, it plays relaxation music or performs other actions. The input emotional data acts as a trigger, and the resulting operation of the appliances is the output.

[0234] Step 9:

[0235] Users build hypotheses and plan experiments and tests based on the data visualized on the terminal and on emotional feedback. In this step, text information is input to support the user's decision-making, and a specific plan is output.
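
Step 2 of this flow, which extracts correlations and features to identify candidates, can be illustrated by ranking descriptor columns by their correlation with measured activity. The use of Pearson correlation, the column names `logp`, `mass`, and `activity`, and the values are assumptions for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (spread_x * spread_y)


def rank_features(table, target):
    """Order feature columns by absolute correlation with `target`."""
    scores = {
        name: pearson(column, table[target])
        for name, column in table.items()
        if name != target
    }
    return sorted(scores, key=lambda name: abs(scores[name]), reverse=True)


# Hypothetical descriptor table: each key is a column of per-molecule values.
table = {
    "logp": [1.0, 2.0, 3.0, 4.0],
    "mass": [5.0, 5.0, 6.0, 5.0],
    "activity": [2.0, 4.0, 6.0, 8.0],
}
ranking = rank_features(table, "activity")
```

Correlation only flags linear association; a trained machine learning model, as the step describes, would capture richer relationships.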

[0236] The specific processing unit 290 transmits the result of the specific processing to the smart device 14. In the smart device 14, the control unit 46A causes the output device 40 to output the result of the specific processing. The microphone 38B acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 38B to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0237] Data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (registered trademark) (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (registered trademark) (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 is input with prompts containing instructions, and with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 infers from the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0238] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart device 14.

[0239] [Second Embodiment]

[0240] Figure 3 shows an example of the configuration of the data processing system 210 according to the second embodiment.

[0241] As shown in Figure 3, the data processing system 210 includes a data processing device 12 and smart glasses 214. An example of the data processing device 12 is a server.

[0242] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0243] The smart glasses 214 include a computer 36, a microphone 238, a speaker 240, a camera 42, and a communication interface 44. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, and camera 42 are also connected to the bus 52.

[0244] The microphone 238 receives voice signals from the user 20 and receives instructions from the user 20. The microphone 238 captures the voice signals from the user 20, converts the captured voice into audio data, and outputs it to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0245] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0246] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0247] Figure 4 shows an example of the main functions of the data processing device 12 and the smart glasses 214. As shown in Figure 4, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0248] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0249] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0250] In the smart glasses 214, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0251] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0252] This invention is implemented as an automated system to accelerate new drug development and improve its success rate. The system is built around three core components: a server, a terminal, and a user.

[0253] Server-based functionality

[0254] The server first automatically collects large amounts of biodata and clinical trial data from diverse databases. Next, it preprocesses this data to remove noise and improve quality. The purified data is then analyzed by machine learning models to identify molecules and compounds that could be candidates for new drugs. The server further performs in silico simulations to verify the biological effects of candidate molecules in a virtual environment. It also uses generative AI models to generate novel chemical structures and explore new possibilities. The server also analyzes relevant literature, supporting research by extracting highly relevant insights.

[0255] Terminal functions

[0256] On the terminal, analysis results and generated information sent from the server are displayed visually. The terminal provides an interface that allows users to intuitively understand these results and decide on the next action. Through data visualization, users can easily grasp the relationships and trends between datasets and formulate concrete hypotheses.

[0257] User functions

[0258] Based on the data presented by the terminal, users select the most promising new drug candidates. Furthermore, users can formulate hypotheses based on the analysis results and design and adjust protocols for actual experiments and clinical trials. The results of the experiments conducted by the user are fed back to the server and used to improve the accuracy of the AI model.

[0259] Specific example

[0260] For example, in the early stages of new drug development, the server collects relevant information from diverse gene databases and analyzes it to discover promising molecules for specific disease conditions. The function of these molecules, including any newly generated molecules, is then verified and evaluated through in silico simulations. Finally, drawing on their experience and expertise, the user can plan the next research and development steps from the presented options, enabling efficient new drug development.

[0261] The following describes the processing flow.

[0262] Step 1:

[0263] The server automatically collects necessary data from various biodatabases and clinical trial databases. It retrieves data using APIs and web scraping and stores it in the database.

[0264] Step 2:

[0265] The server preprocesses the collected data. It performs noise reduction and imputation of missing values to improve data quality and ensure the accuracy of the analysis.

[0266] Step 3:

[0267] The server feeds the pre-processed data into a machine learning model for analysis. The model extracts patterns and features from the data to identify potential drug molecules.

[0268] Step 4:

[0269] The server performs in silico simulations to evaluate the biological effects of identified new drug candidate molecules in a virtual environment. The simulation results are then used in the next step.

[0270] Step 5:

[0271] The server uses a generative AI model to generate new chemical structures. This allows for the exploration of further possibilities by comparing them with existing molecules.

[0272] Step 6:

[0273] The server analyzes relevant literature from the bibliographic database and extracts information and insights useful for research. This information is provided to users as support information.

[0274] Step 7:

[0275] The terminal visualizes analysis results, simulation data, generated molecular information, and related literature information sent from the server and displays them on the user interface.

[0276] Step 8:

[0277] Based on the information provided on the terminal, users formulate hypotheses and plan the next steps in new drug development. They then design experimental plans for the selected drug candidate molecules.

[0278] Step 9:

[0279] The user's experimental results are fed back to the server. This allows the AI model to be further trained, improving the accuracy of the analysis.
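
The storage part of Step 1 of this flow can be sketched with Python's built-in `sqlite3` module. The table name `trial_data`, its columns, and the sample records are hypothetical, and the retrieval itself (API calls or web scraping) is left out because the description does not name the data sources.

```python
import sqlite3


def store_records(conn, records):
    """Insert collected records, skipping ids that are already stored.

    Returns the total number of rows in the table after the insert.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS trial_data ("
        "record_id TEXT PRIMARY KEY, source TEXT, payload TEXT)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO trial_data VALUES (?, ?, ?)",
            [(r["record_id"], r["source"], r["payload"]) for r in records],
        )
    return conn.execute("SELECT COUNT(*) FROM trial_data").fetchone()[0]


# Hypothetical records; the second fetch repeats an id already collected.
conn = sqlite3.connect(":memory:")
batch = [
    {"record_id": "r1", "source": "biodatabase", "payload": "{}"},
    {"record_id": "r2", "source": "clinical_trials", "payload": "{}"},
    {"record_id": "r1", "source": "biodatabase", "payload": "{}"},
]
stored = store_records(conn, batch)
```

Using the record id as the primary key with `INSERT OR IGNORE` makes repeated collection runs idempotent, which suits data that is fetched automatically and periodically.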

[0280] (Example 1)

[0281] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0282] The drug development process consumes a great deal of time and resources, making it necessary to identify and evaluate drug candidates more efficiently and effectively. This invention was developed to improve the success rate of drug development by processing the vast amount and complexity of data generated in this process, and by enabling highly accurate data analysis and hypothesis building.

[0283] The specific processing by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0284] In this invention, the server includes means for collecting and storing information, means for analyzing the information by a machine learning algorithm to identify new product candidates, and means for generating a new structure using a generative AI algorithm. Thereby, it becomes possible to efficiently analyze a huge amount of data and quickly identify promising new drug candidates.

[0285] The "means for collecting and storing information" is a function that automatically acquires relevant data from various sources and stores the data in storage as needed.

[0286] The "means for analyzing by a machine learning algorithm to identify new product candidates" is a function that applies a specific algorithm to the collected data, detects useful patterns and trends, and identifies new potential products based on this.

[0287] The "means for performing virtual simulation" is a function that executes a simulation experiment on a computer to predict the impact in the real world and analyzes the results.

[0288] The "means for generating a new structure using a generative AI algorithm" is a process that utilizes artificial intelligence technology to automatically generate new compounds or designs based on existing data.

[0289] The "means for analyzing related literature and extracting useful findings" is a function that scans a literature database and identifies important information and insights related to a specific theme.

[0290] The "means for providing a user interface for visualizing and displaying analysis results" is a function that provides an interface for graphically representing complex data and results so that users can intuitively understand them.

[0291] "Means for users to construct hypotheses using the results and select candidate products" refers to support functions that enable users to make inferences based on the provided information and to make important decisions based on those inferences.

[0292] This invention is a system for streamlining data analysis and decision-making in new drug development and product design. The system mainly consists of three components: a server, a terminal, and a user.

[0293] The server first has the function of collecting and storing information such as biodata and research results from multiple databases. Specifically, it retrieves data from APIs using the programming language Python and the data processing library Pandas. After collection, the server analyzes the data by applying machine learning algorithms to identify promising new drug candidates. Machine learning frameworks such as TensorFlow and Scikit-learn are used for this analysis.

[0294] Next, the server uses a generative AI model to generate new chemical structures. For example, it uses OpenAI models to generate new ideas through prompts that utilize natural language processing technology. Furthermore, a text analysis library is used to analyze relevant literature based on the collected data and extract useful insights.

[0295] The terminal visualizes the analysis results and generated information from the server, providing them as graphs and charts for easy user understanding. Tableau and Power BI are used as data visualization tools. This enables users to make intuitive decisions.

[0296] Based on the information provided on the device, users construct hypotheses about new products and make selections. The experimental results and observations selected by the users are fed back to the server and used to continuously improve the machine learning model.

[0297] As a concrete example, a prompt might read: "In the development of next-generation anticancer drugs, please tell me the steps necessary to select a compound that is effective against a specific gene mutation." Using this prompt, the AI model provides new insights and supports the user's decision-making.

[0298] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0299] Step 1:

[0300] The server first accesses databases to collect information. Specifically, it retrieves necessary data from biodatabases and clinical trial databases via APIs. Inputs include search queries and access keys, and the output is a structured dataset. This data is stored and prepared for use in the next processing step.
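
As a non-limiting illustration, the collection step above may be sketched as follows. The sketch only builds a request from a search query and an access key and converts a JSON response into a structured dataset; it does not contact any real service. The endpoint URL, the parameter names (q, key, limit), and the response fields (results, id, target, activity) are all hypothetical, since the embodiment does not name a specific database API.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint; the embodiment does not name a specific database API.
BASE_URL = "https://example.org/api/v1/records"


def build_request_url(query: str, access_key: str, limit: int = 100) -> str:
    """Combine the search query and access key into a request URL."""
    params = {"q": query, "key": access_key, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


def parse_response(body: str) -> list[dict]:
    """Turn a JSON response body into a structured dataset (a list of rows)."""
    payload = json.loads(body)
    rows = []
    for item in payload.get("results", []):
        rows.append({
            "compound_id": item.get("id"),
            "target": item.get("target"),
            "activity": item.get("activity"),  # None when the field is missing
        })
    return rows


# Canned response standing in for a real API call (assumed response shape).
sample_body = (
    '{"results": ['
    '{"id": "C1", "target": "EGFR", "activity": 0.82}, '
    '{"id": "C2", "target": "KRAS"}]}'
)
url = build_request_url("EGFR mutation", "MY_KEY")
dataset = parse_response(sample_body)
```

Missing fields are kept as None rather than dropped, so that the subsequent preprocessing step can decide whether to impute or remove them.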

[0301] Step 2:

[0302] The server preprocesses the collected data. Specifically, it uses the Python Pandas library to detect and impute or remove missing or outlier values. The input is the collected raw data, and the output is a clean, unified dataset. This dataset is then prepared for further analysis.
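
As a non-limiting illustration, the imputation and outlier handling described above may be sketched as follows. The embodiment uses the Pandas library; this sketch deliberately uses only the Python standard library so that it runs without additional packages. The choice of median imputation and of an interquartile-range (IQR) rule with factor 1.5 is an assumption made for the example, not something specified by the embodiment.

```python
from statistics import median, quantiles


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def remove_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Raw measurements with one missing value and one implausible reading.
raw = [0.8, 0.9, None, 1.1, 1.0, 25.0, 0.95]
clean = remove_outliers(impute_missing(raw))
```

In this example the missing entry is filled first, and the reading of 25.0 is then discarded as an outlier, leaving six values.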

[0303] Step 3:

[0304] The server applies machine learning algorithms to preprocessed data for analysis. Using TensorFlow or Scikit-learn, it identifies promising new drug candidates from the input data. The input is a preprocessed dataset, and the output is the results of identifying new drug candidates. Ranking and scoring are then performed based on these results.
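
As a non-limiting illustration, the ranking and scoring mentioned above may be sketched as follows. The sketch assumes that a trained model has already assigned each candidate a score between 0 and 1; the candidate identifiers and score values are hypothetical, and the model itself is outside the scope of the example.

```python
def rank_candidates(scores: dict[str, float], top_n: int = 3) -> list[tuple[int, str, float]]:
    """Sort candidates by score (highest first) and attach a 1-based rank."""
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [
        (rank, name, score)
        for rank, (name, score) in enumerate(ordered[:top_n], start=1)
    ]


# Hypothetical model outputs: candidate id -> predicted score.
predicted = {"C1": 0.62, "C2": 0.91, "C3": 0.47, "C4": 0.78}
ranking = rank_candidates(predicted)
```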

[0305] Step 4:

[0306] The server performs in silico simulations. This is for evaluating biological effects in a virtual environment and is carried out using simulation software. The input is new drug candidates identified by machine learning, and the output is the result of the simulation. This result is used to predict the effectiveness and safety of molecules.

[0307] Step 5:

[0308] The server uses a generative AI model to generate new chemical structures. In this process, a text prompt is used as the input, and by operating the generative model, a unique structure is output. Specifically, the generative model of OpenAI is used, and by giving a prompt sentence as the input, innovative molecules and compounds are created.
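
As a non-limiting illustration, the prompt-driven generation step above may be sketched as follows. The generative model is passed in as an ordinary callable so that the sketch does not depend on any particular vendor interface; fake_model is a placeholder that returns a fixed string and is not a real model. The prompt wording and the use of a SMILES string as the output format are assumptions made for the example.

```python
from typing import Callable


def build_prompt(target: str, constraints: list[str]) -> str:
    """Assemble a text prompt asking for one new chemical structure."""
    lines = [f"Propose a novel molecule that acts on {target}.", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    lines.append("Answer with a single SMILES string.")
    return "\n".join(lines)


def generate_structure(prompt: str, model: Callable[[str], str]) -> str:
    """Send the prompt to the model and return its trimmed text output."""
    return model(prompt).strip()


def fake_model(prompt: str) -> str:
    # Placeholder only: a real deployment would call a generative model here.
    return " CCO \n"


prompt = build_prompt("EGFR", ["molecular weight below 500", "orally available"])
structure = generate_structure(prompt, fake_model)
```

Keeping the model behind a callable makes it possible to swap the placeholder for an actual generative model without changing the surrounding processing.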

[0309] Step 6:

[0310] The server analyzes relevant materials and extracts useful insights. Utilizing natural language processing technology, it analyzes text data and identifies important information. The input is the text data of the literature, and the output is the analyzed key points and insights. This insight serves as a reference for determining the direction of research and development.
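
As a non-limiting illustration, the extraction of key points from literature text may be sketched as follows. The embodiment does not specify a natural language processing method; this sketch substitutes a very simple term-frequency count, and the stop-word list and sample sentences are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the embodiment).
STOPWORDS = {"the", "a", "of", "in", "and", "to", "is", "for", "with", "was"}


def top_terms(texts, n=3):
    """Return the n most frequent non-stop-word terms across the texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z][a-z0-9-]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


abstracts = [
    "Inhibition of EGFR reduced tumor growth in mice.",
    "EGFR mutation status predicts response to the inhibitor.",
    "Tumor response was associated with EGFR expression.",
]
terms = top_terms(abstracts)
```

A production system would more likely use a dedicated text analysis library or a language model; the count above only shows the input and output shape of the step.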

[0311] Step 7:

[0312] The terminal visually displays the analysis results and the generated information. Using data visualization tools, the results are presented in graphs and charts. The input is the analysis results, and the output is information visualized in a form that is easy for users to understand.

[0313] Step 8:

[0314] Users build hypotheses and make selections based on information visualized on their devices. They then review the results and make a decision to select the most suitable new drug candidate. The input is visualized information, and the output is the next action and feedback determined by the user. This is fed back to the server and contributes to further improving the accuracy of the AI model.

[0315] (Application Example 1)

[0316] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0317] Traditionally, efficiently analyzing large amounts of biodata and clinical trial data in new drug development and rapidly identifying promising drug candidates has been difficult. Furthermore, visualizing this data and tailoring content generation based on user interests required significant manual effort and time. Therefore, there was a need to automate the delivery of individually optimized content to users simultaneously with the discovery of new drug candidates.

[0318] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0319] In this invention, the server includes means for collecting biometric and clinical trial information, means for analyzing the information with machine learning algorithms to identify new drug candidates, means for generating new chemical structures using generative AI models, and means for generating and recommending customized content based on the user's interests. This enables the user to quickly and accurately identify new drug candidates and receive personalized content.

[0320] "Biometric information" refers to data related to biological or health status, including genetic information and clinical test results.

[0321] "Clinical trial information" refers to data obtained from trials conducted to evaluate the safety and efficacy of pharmaceuticals.

[0322] A "machine learning algorithm" is a set of techniques for learning patterns from data and performing predictions and classifications.

[0323] A "new drug candidate" refers to a molecule or substance that is expected to be developed as a pharmaceutical product.

[0324] "Simulation in a virtual space" is a process that uses computer technology to simulate real-world phenomena.

[0325] A "generative AI model" is a model that generates new data and content based on artificial intelligence technology.

[0326] "Materials" refer to relevant documents and datasets, which are sources of information for gaining insights.

[0327] "Output device means" refers to a device or interface that displays analysis results and other information in a way that allows the user to visually confirm them.

[0328] A "logical hypothesis" is an unproven theory set up to explain observed phenomena.

[0329] "Personal interest-based content" refers to information and entertainment that is individually customized based on a user's past behavior and preferences.

[0330] To implement this invention, a system is constructed that uses three main components: a server, a terminal, and a user.

[0331] The server automatically collects biometric and clinical trial information from various databases and performs preprocessing such as noise reduction to refine the information. Next, machine learning algorithms are used to analyze this information and identify potential new drug candidates. The initially identified candidate substances are then virtually validated for their efficacy and safety through simulations in a virtual space. Simultaneously, generative AI models generate new chemical structures, advancing the exploration of new drug possibilities.

[0332] The terminal visually displays analysis results and new chemical structures generated on the server to the user. This allows the user to easily grasp the overall picture of the data and obtain an intuitive interface for constructing logical hypotheses. Furthermore, the terminal provides content based on the user's interests and preferences, supporting the personalization of information.

[0333] Users utilize information provided via their devices to select the most promising new drug candidates. Furthermore, users build hypotheses based on their expertise and design protocols for actual experiments and clinical trials. The results of the experiments conducted by users are sent back to the server through a feedback function, contributing to the updating and accuracy improvement of the AI model.

[0334] As a concrete example, when generating new movie content, a generative AI model can be used to suggest a new story based on the user's past viewing history. An example of this prompt might be the text, "Please suggest the most suitable new story from all movie genres based on the user's interests."

[0335] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0336] Step 1:

[0337] The server automatically collects biometric and clinical trial information from various databases. It uses various databases as input and outputs an initial, unprocessed dataset. This data contains information necessary for discovering new drug candidates and forms the basis for analysis in subsequent processing steps.

[0338] Step 2:

[0339] The server preprocesses the collected data to remove noise. Using the initial raw dataset as input, it produces improved, clean data as output. Specifically, it identifies outliers and duplicate data and removes or corrects them.
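
As a non-limiting illustration, the removal of duplicate data described above may be sketched as follows. The record fields (compound_id, assay, value) and the choice of which fields define a duplicate are hypothetical.

```python
def drop_duplicates(records, key_fields):
    """Keep the first record for each distinct combination of key fields."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


records = [
    {"compound_id": "C1", "assay": "A", "value": 0.82},
    {"compound_id": "C1", "assay": "A", "value": 0.82},  # exact repeat
    {"compound_id": "C1", "assay": "B", "value": 0.40},
    {"compound_id": "C2", "assay": "A", "value": 0.91},
]
deduplicated = drop_duplicates(records, key_fields=("compound_id", "assay"))
```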

[0340] Step 3:

[0341] The server uses machine learning algorithms to analyze clean data and identify potential new drug candidates. It takes clean data as input and generates a list of potential drug candidates as output. Through the analysis, a model is trained to understand the relationships between data and the effectiveness of specific molecules.

[0342] Step 4:

[0343] The server performs virtual simulations of identified new drug candidates to evaluate their biological effects. It uses a list of new drug candidates as input and collects simulation results as output. The simulator virtually tests the molecular structure and function of the compounds, virtually confirming their effectiveness.

[0344] Step 5:

[0345] The server automatically generates new chemical structures using a generative AI model. It takes a trained AI model and simulation results as input and generates candidate new chemical structures as output. This includes the generative AI exploring new chemical possibilities while referring to prompts.

[0346] Step 6:

[0347] The terminal visually displays analysis results and generated chemical structure information obtained from the server to the user. It uses data provided by the server as input and generates a visual report as output. The user interface presents information in an easily understandable format through data visualization.

[0348] Step 7:

[0349] The user constructs hypotheses and selects new drug candidates based on the information presented on the device. Visual reports are used as input, and the system provides a refined list of candidate substances as output. The user then leverages their expertise to make decisions for designing the protocols for subsequent experiments and tests.

[0350] Step 8:

[0351] The user selection results and experimental protocols are fed back to the server and used to update the AI model. The system receives the user selection results as input and builds a more accurate machine learning model as output. This cycle improves prediction accuracy and the potential for new drug development in subsequent trials.
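
As a non-limiting illustration, the feedback cycle above may be sketched as follows. The sketch only shows how user selections could be accumulated as labeled examples and how a model update could be triggered once enough new feedback has arrived; the class name FeedbackBuffer, the threshold of three items, and the 1/0 labeling are assumptions, and the actual model update is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackBuffer:
    """Collect user decisions and signal when a model update is due."""
    retrain_threshold: int = 3
    pending: list = field(default_factory=list)
    training_set: list = field(default_factory=list)

    def add(self, candidate_id: str, selected: bool) -> bool:
        """Store one user decision; return True when an update is due."""
        self.pending.append((candidate_id, int(selected)))
        if len(self.pending) >= self.retrain_threshold:
            # Move the new labels into the training set and reset the buffer.
            self.training_set.extend(self.pending)
            self.pending.clear()
            return True
        return False


buffer = FeedbackBuffer()
signals = [
    buffer.add("C2", True),
    buffer.add("C4", False),
    buffer.add("C1", True),
]
```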

[0352] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0353] This invention provides a system that enhances the user experience and enables efficient research progress by incorporating emotion recognition into the new drug development process. The system consists of components including a server, terminals, a user interface, and an emotion engine.

[0354] Server-based functionality

[0355] The server performs traditional data collection, preprocessing, analysis using machine learning models, in silico simulations, and molecular generation using generative AI models. Analysis results and the latest research insights are transmitted to the terminal to support user decision-making. The server also analyzes user sentiment data collected by the emotion engine and incorporates it into the research process.

[0356] Terminal functions

[0357] The terminal visually displays data and analysis results sent from the server, providing an interface that is easy for the user to understand and operate. Furthermore, the terminal analyzes the user's facial expressions and voice tone through its built-in emotion engine to recognize their emotional state.

[0358] Functions powered by the emotion engine

[0359] The emotion engine monitors the user's emotions in real time and analyzes changes in the user's stress levels and interests. This information is sent to the server, allowing the system's responses and interface settings to be dynamically adjusted according to the user's emotions. For example, if a user shows excitement in response to a particular analysis result, the system will quickly provide relevant additional information to support the user's curiosity. Also, if a high stress level is detected in the user, the emotion engine can recommend content to promote relaxation.

[0360] User functions

[0361] Users utilize the analysis results and emotional feedback displayed on their devices to formulate hypotheses and plan experiments and clinical trials. The insights provided by the emotional engine help users understand their own emotional state and support better decision-making. Furthermore, users can adjust the responsiveness of the emotional engine and the system through the feedback function.

[0362] Specific example

[0363] For example, suppose the system detects tension from the user's facial expression while they are analyzing a new drug candidate. In this case, the device starts playing relaxing background music and simultaneously presents the analysis results again from a different perspective, helping the user consider the most appropriate option without feeling rushed. Furthermore, additional relevant materials may be provided based on the user's interests, further improving research efficiency.

[0364] The following describes the processing flow.

[0365] Step 1:

[0366] The server automatically collects diverse biodata and clinical trial data and stores it in a database. This prepares the rich information necessary for analysis.

[0367] Step 2:

[0368] The server preprocesses the collected data, removing noise and imputing missing values. By creating a clean dataset, it improves the accuracy of the analysis.

[0369] Step 3:

[0370] The server inputs pre-processed data into a machine learning model and performs data analysis. This analysis identifies molecules and targets that could be potential new drug candidates.

[0371] Step 4:

[0372] The server performs in silico simulations on identified candidate molecules and evaluates their biological effects in a virtual environment. This allows for the prediction of the effects of promising molecules.

[0373] Step 5:

[0374] The server generates new chemical structures using a generative AI model. This allows it to explore new possibilities based on existing data.

[0375] Step 6:

[0376] The server analyzes relevant literature and extracts useful insights. This information is provided in a format that is relevant to the research context.

[0377] Step 7:

[0378] The terminal visualizes the analysis results, simulation data, and generated molecular information transmitted from the server, and displays them in an easy-to-understand manner for the user.

[0379] Step 8:

[0380] The emotion engine within the device recognizes the user's emotions in real time from their facial expressions and voice, and sends the analysis results to the server.

[0381] Step 9:

[0382] Based on the emotional data it receives, the server adjusts the interface and presents information according to the user's state. For example, if the user is feeling stressed, it will suggest content on the device that promotes relaxation.

[0383] Step 10:

[0384] Users formulate hypotheses based on the presented analysis results and emotion-based insights, and plan the next steps in new drug development. They develop specific experimental plans and continuously improve the AI model by providing feedback to the system.

[0385] (Example 2)

[0386] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the smart glasses 214 will be referred to as the "terminal".

[0387] The research process in new drug development requires time-consuming and resource-intensive analysis and the extraction of insights from diverse data. However, conventional methods struggle to provide users with the most relevant information efficiently and dynamically, lacking responsiveness that considers users' emotional states and interests. This leads to a decline in the quality of decision-making, which is a significant challenge.

[0388] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0389] In this invention, the server includes means for collecting biological information and test data, means for identifying new drug candidates using an artificial intelligence model, and computing means for visualizing and displaying the analysis results. This enables efficient processing of the diverse data collected and dynamic information provision that takes into account the user's emotional state.

[0390] "Biological information" refers to data about the structure, function, or evolution of living organisms, and is used in new drug development and health analysis.

[0391] "Test data" refers to data collected through scientific and clinical trials, and is information used to evaluate the efficacy and safety of products and drugs.

[0392] An "artificial intelligence model" is a computer program that learns patterns from data and makes predictions and judgments based on machine learning and deep learning technologies.

[0393] "Virtual simulation" is a technique that uses computer software to mimic real-world processes and systems, and is a method for verifying hypotheses and designs before conducting physical tests.

[0394] A "generative artificial intelligence model" is an AI model that utilizes machine learning to create new data and information, and is used for generating molecular structures of new drugs and natural language text.

[0395] "Computing means" refers to devices and systems that use computers and electronic devices to process, analyze, and visualize data.

[0396] "Emotional data" refers to data that indicates a user's psychological or emotional state, obtained from things like their facial expressions and tone of voice.

[0397] "Solution" refers to a technical or methodological means of overcoming a specific problem and achieving an objective.

[0398] This invention is a system that enables efficient and dynamic information provision in the new drug development process. The system mainly consists of three components: a server, a terminal, and a user.

[0399] The server first collects biological information and test data from various sources. The collected data is preprocessed using programming languages such as Python and R to prepare it for appropriate analysis. This preprocessing removes noise from the data and makes it easier to extract important features. Next, the server analyzes the data using artificial intelligence models (e.g., Scikit-learn, TensorFlow) to identify potential new drugs. Furthermore, it utilizes generative artificial intelligence models (e.g., OpenAI models) to generate new chemical structures based on the prompt "Generate a potential new drug molecule."

[0400] The terminal plays the role of visually displaying the analysis results to the user. Using data visualization tools (e.g., Tableau, Power BI), it presents information clearly using methods such as heatmaps and box plots. Furthermore, the terminal is equipped with an emotion engine that analyzes the user's facial expressions and tone of voice to evaluate their emotional state. This enables adaptive information presentation tailored to the user's level of tension and interest.

[0401] Users utilize information provided from their devices to formulate hypotheses and select potential new drugs. During this process, feedback, including user sentiment data, is sent to the server, allowing for dynamic adjustments to the overall system's responsiveness. For example, if a user expresses interest during drug development analysis, relevant additional information and materials can be quickly provided, improving research efficiency.

[0402] In this way, this system enables the rapid and effective progress of the new drug development process through information provision and analysis that takes into account the emotional state of the user.

[0403] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0404] Step 1:

[0405] The server collects biological information and test data from external data sources. This input data is often provided as CSV files or through database APIs. The server then uses Python scripts to remove noise and process outliers, formatting the data for analysis. The pre-processed data is then output, and the server proceeds to the next analysis step.

[0406] Step 2:

[0407] The server feeds pre-processed data into an artificial intelligence model. Based on the formatted data, it performs data analysis such as classification, regression analysis, and clustering using machine learning libraries (e.g., Scikit-learn, TensorFlow). The model's accuracy is verified through holdout validation or cross-validation, and useful analysis results are obtained as output.
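
As a non-limiting illustration, the accuracy verification mentioned above may be sketched in two forms: a single holdout split and a k-fold cross-validation split. The sketch only produces index lists and uses the Python standard library; in practice a library such as Scikit-learn would typically provide these splits. The sample count, the number of folds, and the test fraction are hypothetical, and no shuffling is performed.

```python
def holdout_split(n_samples, test_fraction=0.2):
    """Reserve the last part of the data as a single held-out test set."""
    cut = n_samples - int(n_samples * test_fraction)
    return list(range(cut)), list(range(cut, n_samples))


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, test) index pairs."""
    indices = list(range(n_samples))
    # Distribute any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


train_idx, test_idx = holdout_split(10)
folds = k_fold_indices(10, 3)
```

With cross-validation every sample is used for testing exactly once, whereas the holdout split evaluates the model on one fixed subset only.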

[0408] Step 3:

[0409] The server inputs the prompt "Generate a new drug candidate molecule" into a generative artificial intelligence model based on the analysis results. Data generation is performed based on this prompt, and a new chemical structure is proposed. The generated molecular data is used as input for in silico simulations.

[0410] Step 4:

[0411] The terminal visually presents the analysis results and generated chemical structures transmitted from the server to the user. Based on the data provided as input, it generates visualized information using data visualization software (e.g., Tableau, Power BI). The results are output in a visually easy-to-understand format to support decision-making.

[0412] Step 5:

[0413] The device uses an emotion engine to analyze the user's facial expressions and voice tone. Emotional data collected through the camera and microphone is evaluated using Affectiva or similar emotion analysis software. Based on this input, the user's stress and interest levels are output and fed back to the server.

[0414] Step 6:

[0415] The server dynamically adjusts the system's response based on emotional data sent from the terminal. If the user is experiencing high stress levels, it generates content to promote relaxation and instructs the terminal to display it. For example, it may provide relaxing music or related materials as output to support the user's exploration activities.
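
As a non-limiting illustration, the dynamic adjustment above may be sketched as a simple rule applied to the stress and interest levels fed back from the terminal. The 0.0 to 1.0 scale, the threshold values, and the action names are assumptions made for the example; the embodiment does not specify how the levels are encoded.

```python
from dataclasses import dataclass


@dataclass
class EmotionReading:
    stress: float    # assumed scale: 0.0 (calm) to 1.0 (highly stressed)
    interest: float  # assumed scale: 0.0 (none) to 1.0 (strong interest)


def choose_response(reading, stress_limit=0.7, interest_limit=0.6):
    """Pick the instruction the server sends to the terminal."""
    if reading.stress >= stress_limit:
        return "show_relaxation_content"
    if reading.interest >= interest_limit:
        return "show_related_materials"
    return "no_change"


responses = [
    choose_response(EmotionReading(stress=0.9, interest=0.8)),
    choose_response(EmotionReading(stress=0.2, interest=0.8)),
    choose_response(EmotionReading(stress=0.2, interest=0.1)),
]
```

Stress is checked before interest in this sketch, so a highly stressed user receives relaxation content even when interest is also high.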

[0416] (Application Example 2)

[0417] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the smart glasses 214 will be referred to as the "terminal."

[0418] This invention aims to solve the problem of difficulty in considering the influence of changes in human emotions during the new drug development process. Conventional approaches often involve data analysis and decision-making without considering the psychological state of researchers, and there is a need to improve the user experience. Furthermore, even in the home environment, there is a lack of systems that recognize users' emotions and support their daily activities.

[0419] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0420] In this invention, the server includes means for collecting biodata and clinical trial data, means for analyzing the data with a machine learning model to identify new drug candidates, means for simulating the identified new drug candidate molecules in a computer environment, means for recognizing emotions and analyzing the user's psychological state, and means for applying emotion recognition data to home appliances to support the user's daily activities. This makes it possible to improve the user experience in new drug development and to provide flexible support tailored to the user's emotional state in daily life.

[0421] "Biodata" refers to information obtained from living organisms and plays a crucial role in new drug development and medical treatment.

[0422] "Clinical trial data" refers to data based on trials conducted to confirm the effectiveness and safety of drugs or treatments.

[0423] A "machine learning model" is a collection of algorithms that automatically find patterns and regularities in data and perform predictions and classifications.

[0424] "Simulation in a computer environment" is a method of virtually reproducing real chemical processes on a computer and predicting their effects and properties before conducting an experiment.

[0425] A "generative AI model" is a computational model that uses artificial intelligence to generate new data and structures.

[0426] "Related literature" refers to research papers, databases, and other literature information that are useful in new drug development.

[0427] "Visualization" is a technique that visually represents data and its analysis results, making them easy to understand.

[0428] "Means of recognizing emotions" refers to technologies that use sensors to detect a user's facial expressions, voice, and behavior, and then analyze their psychological state.

[0429] "Methods for applying emotion recognition data to home appliances" refers to technologies that use the obtained emotion data to adjust the operation of home appliances and support users in living comfortably.

[0430] To realize this invention, a system integrating multiple components is required. The server efficiently collects biodata and clinical trial data, preprocesses this data, and then analyzes it using machine learning models. Noise is removed during the analysis process, and useful information to aid in decision-making is extracted. Then, new drug candidates are identified, and simulations are performed on their molecules in a computer environment. Subsequently, new chemical structures are created using generative AI models, and relevant literature is analyzed to gain insights. The obtained results are visualized and provided to the user through a data terminal.

[0431] The device functions as the user interface, displaying analysis results in a visual and user-friendly format. Furthermore, it incorporates an emotion engine to recognize emotions and monitor the user's psychological state in real time. Based on this information, in a home environment, it can adjust the operation of home appliances to match the user's emotional state. Specifically, it can play relaxation music when the user is feeling stressed.

[0432] Users build hypotheses based on visualized data and proceed with planning research and experiments. Furthermore, emotional feedback enables users to make better decisions and gain new insights that take their psychological state into account.

[0433] An example of a prompt that could be applied is: "Please give an example of how a home robot could suggest appropriate relaxation methods based on the user's emotions."

[0434] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0435] Step 1:

[0436] The server collects biodata and clinical trial data from users. This data includes, for example, genetic information and test results. The input data is preprocessed within the program to remove noise and output as a clean dataset.

[0437] Step 2:

[0438] Within the server, a machine learning model analyzes a clean dataset. This analysis uses an algorithm that extracts correlations and features from each data point to identify potential new drugs. The output is a list of identified drug candidates.

[0439] Step 3:

[0440] The server performs simulations of the identified drug candidate molecules in a computing environment. This step involves virtual experiments to determine how the molecular structure and biological activity function. The input is data on the drug candidate, and the output is the simulation results.

[0441] Step 4:

[0442] The server generates new chemical structures using a generative AI model. Based on the simulation results, it searches for the optimal molecular structure. In this process, existing structural data is taken as input, and new structural data generated by the AI is output.

[0443] Step 5:

[0444] The server analyzes relevant literature and extracts useful insights. Here, information from a literature database is used as input, and text containing insights useful for research is output.

[0445] Step 6:

[0446] The terminal presents the user with visualized analysis results. This output, displayed on the monitor, is converted into graphs and charts that are easy for the user to understand.

[0447] Step 7:

[0448] An emotion engine built into the device monitors the user's psychological state in real time. The user's facial expressions and voice input are analyzed by emotion recognition software and output as data indicating their state.

[0449] Step 8:

[0450] The device adjusts home appliances based on emotional data to support the user's daily life. For example, if it determines that the user is stressed, it will play relaxation music or perform other actions. The input emotional data acts as a trigger for appliance operation, resulting in the operation of the appliances as the output.
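
As a non-limiting illustration, the use of emotional data as a trigger for appliance operation may be sketched as follows. To avoid switching an appliance on and off in response to a single reading, the sketch acts only when the stress level stays above a threshold for several consecutive readings; this persistence rule, the threshold, and the command strings are assumptions added for the example and are not specified by the embodiment.

```python
def appliance_commands(readings, threshold=0.7, persist=3):
    """Return appliance commands triggered by a sequence of stress levels."""
    commands = []
    streak = 0       # consecutive readings at or above the threshold
    playing = False  # whether relaxation music is currently playing
    for level in readings:
        streak = streak + 1 if level >= threshold else 0
        if streak >= persist and not playing:
            commands.append("speaker:play_relaxation_music")
            playing = True
        elif streak == 0 and playing:
            commands.append("speaker:stop")
            playing = False
    return commands


stress_levels = [0.8, 0.9, 0.5, 0.8, 0.8, 0.9, 0.75, 0.3]
commands = appliance_commands(stress_levels)
```

In this example the first two high readings do not trigger anything because they are interrupted; music starts only after three consecutive high readings and stops when the level drops again.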

[0451] Step 9:

[0452] Users build hypotheses and plan experiments and tests based on data visualized on their devices and emotional feedback. In this step, text information is input to support the user's decision-making, and a specific plan is output.

[0453] The specific processing unit 290 transmits the result of the specific processing to the smart glasses 214. In the smart glasses 214, the control unit 46A causes the speaker 240 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0454] The data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0455] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the smart glasses 214.

[0456] [Third Embodiment]

[0457] Figure 5 shows an example of the configuration of the data processing system 310 according to the third embodiment.

[0458] As shown in Figure 5, the data processing system 310 includes a data processing device 12 and a headset terminal 314. An example of the data processing device 12 is a server.

[0459] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0460] The headset terminal 314 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a display 343. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and display 343 are also connected to the bus 52.

[0461] The microphone 238 receives instructions from the user 20 by picking up the user's voice. The microphone 238 captures the voice of the user 20, converts the captured voice into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to the instructions from the processor 46.

[0462] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0463] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0464] Figure 6 shows an example of the main functions of the data processing device 12 and the headset terminal 314. As shown in Figure 6, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0465] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0466] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0467] In the headset terminal 314, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0468] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the headset terminal 314 will be referred to as the "terminal".

[0469] This invention is implemented as an automated system that accelerates new drug development and improves its success rate. The system is built around three components: a server, a terminal, and a user.

[0470] Server-based functionality

[0471] The server first automatically collects large amounts of biodata and clinical trial data from diverse databases. Next, it preprocesses this data to remove noise and improve quality. The purified data is then analyzed by machine learning models to identify molecules and compounds that could be candidates for new drugs. The server further performs in silico simulations to verify the biological effects of candidate molecules in a virtual environment. It also uses generative AI models to generate novel chemical structures and explore new possibilities. The server also analyzes relevant literature, supporting research by extracting highly relevant insights.

[0472] Terminal-based functionality

[0473] On the terminal, analysis results and generated information sent from the server are displayed visually. The terminal provides an interface that allows users to intuitively understand these results and decide on the next action. Through data visualization, users can easily grasp the relationships and trends between datasets and formulate concrete hypotheses.

[0474] User-based functionality

[0475] Based on the data presented by the terminal, users select the most promising new drug candidates. Furthermore, users can formulate hypotheses based on the analysis results and design and adjust protocols for actual experiments and clinical trials. The results of the experiments conducted by the user are fed back to the server and used to improve the accuracy of the AI model.

[0476] Specific example

[0477] For example, in the early stages of new drug development, the server collects relevant information from diverse gene databases and analyzes it to discover promising molecules for specific disease conditions. The function of these molecules, including any newly generated molecules, is then verified and evaluated through in silico simulations. Finally, drawing on their experience and expertise, the user plans the next research and development steps from the presented options, enabling efficient new drug development.

[0478] The following describes the processing flow.

[0479] Step 1:

[0480] The server automatically collects necessary data from various biodatabases and clinical trial databases. It retrieves data using APIs and web scraping and stores it in the database.

[0481] Step 2:

[0482] The server preprocesses the collected data. It performs noise reduction and imputation of missing values to improve data quality and ensure the accuracy of the analysis.

[0483] Step 3:

[0484] The server feeds the pre-processed data into a machine learning model for analysis. The model extracts patterns and features from the data to identify potential drug molecules.

[0485] Step 4:

[0486] The server performs in silico simulations to evaluate the biological effects of identified new drug candidate molecules in a virtual environment. The simulation results are then used in the next step.

[0487] Step 5:

[0488] The server uses a generative AI model to generate new chemical structures. This allows for the exploration of further possibilities by comparing them with existing molecules.

[0489] Step 6:

[0490] The server analyzes relevant literature from the bibliographic database and extracts information and insights useful for research. This information is provided to users as support information.

[0491] Step 7:

[0492] The terminal visualizes analysis results, simulation data, generated molecular information, and related literature information sent from the server and displays them on the user interface.

[0493] Step 8:

[0494] Based on the information provided on the device, users formulate hypotheses and plan the next steps in new drug development. They then design experimental plans for the selected drug candidate molecules.

[0495] Step 9:

[0496] The user's experimental results are fed back to the server. This allows the AI model to be further trained, improving the accuracy of the analysis.
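The processing flow above can be pictured as a chain of functions that each read and extend a shared state. The following is a minimal sketch under that assumption, covering only Steps 1 to 3. The function names, record fields, and the 0.5 threshold are hypothetical placeholders and do not come from the disclosure.

```python
from typing import Callable, Dict, List

Step = Callable[[Dict], Dict]


def run_pipeline(steps: List[Step], state: Dict) -> Dict:
    """Apply each step in order; every step reads and extends a shared state dict."""
    for step in steps:
        state = step(state)
    return state


# Placeholder steps: each body is a stand-in, not the processing in the disclosure.
def collect(state: Dict) -> Dict:
    raw = [{"id": "mol-1", "activity": 0.8}, {"id": "mol-2", "activity": None}]
    return {**state, "raw": raw}


def preprocess(state: Dict) -> Dict:
    # Drop records with a missing activity value.
    clean = [r for r in state["raw"] if r["activity"] is not None]
    return {**state, "clean": clean}


def analyze(state: Dict) -> Dict:
    # Assumed rule: keep records whose activity exceeds 0.5.
    candidates = [r["id"] for r in state["clean"] if r["activity"] > 0.5]
    return {**state, "candidates": candidates}


result = run_pipeline([collect, preprocess, analyze], {})
print(result["candidates"])  # ['mol-1']
```

Later steps (simulation, generation, literature analysis, display, and feedback) would be appended to the same list in the same way.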

[0497] (Example 1)

[0498] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0499] The drug development process consumes a great deal of time and resources, making it necessary to identify and evaluate drug candidates more efficiently and effectively. This invention was developed to improve the success rate of drug development by processing the vast amount and complexity of data generated in this process, and by enabling highly accurate data analysis and hypothesis building.

[0500] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0501] In this invention, the server includes means for collecting and storing information, means for analyzing the information using a machine learning algorithm to identify new product candidates, and means for generating new structures using a generative AI algorithm. This makes it possible to efficiently analyze vast amounts of data and rapidly identify promising new drug candidates.

[0502] "Means for collecting and storing information" refers to a function that automatically retrieves relevant data from various sources and stores that data in storage as needed.

[0503] "Means for analyzing the information using a machine learning algorithm to identify new product candidates" refers to a function that applies a specific algorithm to the collected data to detect useful patterns and trends, and then uses them to identify new and potentially valuable products.

[0504] "Means for performing virtual simulations" refers to a function that runs simulated experiments on a computer to predict real-world impact and analyzes the results.

[0505] "Means for generating new structures using a generative AI algorithm" refers to a process that uses artificial intelligence technology to automatically create new compounds and designs based on existing data.

[0506] "Means for analyzing relevant literature and extracting useful insights" refers to a function that scans literature databases and identifies important information and insights related to a specific theme.

[0507] "Means for providing a user interface for visualizing and displaying analysis results" refers to a function that provides an interface for graphically representing complex data and results, enabling users to understand them intuitively.

[0508] "Means for users to construct hypotheses using the results and select candidate products" refers to support functions that enable users to make inferences based on the provided information and to make important decisions based on those inferences.

[0509] This invention is a system for streamlining data analysis and decision-making in new drug development and product design. The system mainly consists of three components: a server, a terminal, and a user.

[0510] The server first has the function of collecting and storing information such as biodata and research results from multiple databases. Specifically, it retrieves data from APIs using the programming language Python and the data processing library Pandas. After collection, the server analyzes the data by applying machine learning algorithms to identify promising new drug candidates. Machine learning frameworks such as TensorFlow and Scikit-learn are used for this analysis.

[0511] Next, the server uses a generative AI model to generate new chemical structures. For example, it uses OpenAI models to generate new ideas through prompts that utilize natural language processing technology. Furthermore, a text analysis library is used to analyze relevant literature based on the collected data and extract useful insights.

[0512] The terminal visualizes the analysis results and generated information from the server, providing them as graphs and charts for easy user understanding. Tableau and Power BI are used as data visualization tools. This enables users to make intuitive decisions.

[0513] Based on the information provided on the terminal, users construct hypotheses about new products and make selections. The users' selections, experimental results, and observations are fed back to the server and used to continuously improve the machine learning model.

[0514] As a concrete example, a prompt may read: "In the development of next-generation anticancer drugs, please tell me the steps necessary to select a compound that is effective against a specific gene mutation." Using this prompt, the AI model provides new insights and supports the user's decision-making.
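A prompt of this kind could be produced from a template so that the drug class and the target can vary. The sketch below assumes exactly that. The wording follows the example prompt, while the placeholder names `drug_class` and `target` are hypothetical.

```python
from string import Template

# Wording follows the example prompt in the text; the two placeholders are assumptions.
PROMPT = Template(
    "In the development of $drug_class, please tell me the steps necessary "
    "to select a compound that is effective against $target."
)


def build_prompt(drug_class: str, target: str) -> str:
    """Fill the template with the drug class and the target of interest."""
    return PROMPT.substitute(drug_class=drug_class, target=target)


prompt = build_prompt("next-generation anticancer drugs", "a specific gene mutation")
print(prompt)
```

The resulting string would then be passed to whichever generative model the server uses. That call is omitted here because the disclosure does not specify an interface.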

[0515] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0516] Step 1:

[0517] The server first accesses databases to collect information. Specifically, it retrieves necessary data from biodatabases and clinical trial databases via APIs. Inputs include search queries and access keys, and the output is a structured dataset. This data is stored and prepared for use in the next processing step.
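The mapping from a search query and an access key to a structured dataset might look like the following sketch. The endpoint URL, the parameter names `q` and `key`, and the response layout are all hypothetical, since the disclosure names no concrete API. No network request is made; a sample response body stands in for the server reply.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names; the disclosure names no concrete API.
BASE_URL = "https://example.org/api/trials"


def build_request_url(query: str, access_key: str) -> str:
    """Encode the search query and access key into a request URL."""
    return f"{BASE_URL}?{urlencode({'q': query, 'key': access_key})}"


def to_structured_dataset(response_body: str) -> list:
    """Parse a JSON response and keep a fixed set of fields per record."""
    records = json.loads(response_body)["results"]
    return [{"id": r["id"], "condition": r.get("condition")} for r in records]


url = build_request_url("EGFR mutation", "DEMO_KEY")
sample = '{"results": [{"id": "T1", "condition": "NSCLC", "extra": 1}]}'
dataset = to_structured_dataset(sample)
print(url)
print(dataset)  # [{'id': 'T1', 'condition': 'NSCLC'}]
```

Many real services expect the access key in a request header rather than in the URL; the query-string form is used here only to keep the sketch short.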

[0518] Step 2:

[0519] The server preprocesses the collected data. Specifically, it uses the Python Pandas library to detect and impute or remove missing or outlier values. The input is the collected raw data, and the output is a clean, unified dataset. This dataset is then prepared for further analysis.
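As an illustration of imputing missing values, the sketch below replaces missing entries with the median of the observed values. The disclosure names the Pandas library for this step; this sketch uses only the Python standard library so that it stays self-contained, and the choice of the median as the fill value is an assumption.

```python
from statistics import median


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing to impute from
    fill = median(observed)
    return [fill if v is None else v for v in values]


print(impute_missing([1.0, None, 3.0, 5.0]))  # [1.0, 3.0, 3.0, 5.0]
```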

[0520] Step 3:

[0521] The server applies machine learning algorithms to preprocessed data for analysis. Using TensorFlow or Scikit-learn, it identifies promising new drug candidates from the input data. The input is a preprocessed dataset, and the output is the results of identifying new drug candidates. Ranking and scoring are then performed based on these results.
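The ranking step can be sketched as a sort over per-candidate scores. The candidate identifiers and score values below are hypothetical, and the scores are assumed to come from whichever model was trained in this step.

```python
def rank_candidates(scores: dict, top_k: int = 3) -> list:
    """Return up to top_k (candidate, score) pairs sorted by descending score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Hypothetical model scores in [0, 1]; identifiers are placeholders.
model_scores = {"cand-A": 0.42, "cand-B": 0.91, "cand-C": 0.67, "cand-D": 0.15}
print(rank_candidates(model_scores, top_k=2))  # [('cand-B', 0.91), ('cand-C', 0.67)]
```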

[0522] Step 4:

[0523] The server runs in silico simulations using simulation software to evaluate biological effects in a virtual environment. The input is a new drug candidate identified by machine learning, and the output is the simulation results. These results are used to predict the efficacy and safety of the molecule.

[0524] Step 5:

[0525] The server generates new chemical structures using a generative AI model. This process takes text prompts as input and runs the generative model to produce unique structures. Specifically, it uses OpenAI's generative model, providing prompt text as input to create innovative molecules and compounds.

[0526] Step 6:

[0527] The server analyzes relevant materials and extracts useful insights. It utilizes natural language processing techniques to analyze text data and identify important information. The input is text data from literature, and the output is analyzed key points and insights. These insights serve as a guide for determining the direction of research and development.
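One very simple form of key-point extraction is counting frequent terms. The sketch below uses that as a stand-in for the natural language processing technique, which the disclosure does not specify. The stop-word list and the sample abstract are illustrative assumptions.

```python
import re
from collections import Counter

# A small illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "of", "in", "and", "a", "to", "is", "with", "for"}


def top_terms(text: str, n: int = 3) -> list:
    """Return the n most frequent non-stop-word terms in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(n)]


abstract = (
    "Inhibition of kinase activity reduced tumor growth. "
    "The kinase inhibitor showed tumor selectivity in kinase assays."
)
print(top_terms(abstract, n=2))  # ['kinase', 'tumor']
```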

[0528] Step 7:

[0529] The terminal visually displays the analysis results and generated information. Data visualization tools are used to represent the results in graphs and charts. The input is the analysis results, and the output is visualized information provided in a user-friendly format.

[0530] Step 8:

[0531] Users build hypotheses and make selections based on the information visualized on the terminal. They review the results and select the most suitable new drug candidate. The input is the visualized information, and the output is the next action and feedback determined by the user. This feedback is sent to the server and contributes to further improving the accuracy of the AI model.

[0532] (Application Example 1)

[0533] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0534] Traditionally, efficiently analyzing large amounts of biodata and clinical trial data in new drug development and rapidly identifying promising drug candidates has been difficult. Furthermore, visualizing this data and tailoring content generation based on user interests required significant manual effort and time. Therefore, there was a need to automate the delivery of individually optimized content to users simultaneously with the discovery of new drug candidates.

[0535] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0536] In this invention, the server includes means for collecting biometric and clinical trial information, means for analyzing the information with machine learning algorithms to identify new drug candidates, means for generating new chemical structures using generative AI models, and means for generating and recommending customized content based on the user's interests. This enables the user to quickly and accurately identify new drug candidates and receive personalized content.

[0537] "Biometric information" refers to data related to biological or health status, including genetic information and clinical test results.

[0538] "Clinical trial information" refers to data obtained from trials conducted to evaluate the safety and efficacy of pharmaceuticals.

[0539] A "machine learning algorithm" is a set of techniques for learning patterns from data and performing predictions and classifications.

[0540] A "new drug candidate" refers to a molecule or substance that is expected to be developed as a pharmaceutical product.

[0541] "Simulation in a virtual space" is a process that uses computer technology to simulate real-world phenomena.

[0542] A "generative AI model" is a model that generates new data and content based on artificial intelligence technology.

[0543] "Materials" refer to relevant documents and datasets, which are sources of information for gaining insights.

[0544] "Output device means" refers to a device or interface that displays analysis results and other information in a way that allows the user to visually confirm them.

[0545] A "logical hypothesis" is an unproven theory set up to explain observed phenomena.

[0546] "Personal interest-based content" refers to information and entertainment that is individually customized based on a user's past behavior and preferences.

[0547] To implement this invention, a system is constructed that uses three main components: a server, a terminal, and a user.

[0548] The server automatically collects biometric and clinical trial information from various databases and performs preprocessing such as noise reduction to refine the information. Next, machine learning algorithms are used to analyze this information and identify potential new drug candidates. The identified candidate substances are then validated for efficacy and safety through simulations in a virtual space. Simultaneously, generative AI models generate new chemical structures, advancing the exploration of new drug possibilities.

[0549] The terminal visually displays analysis results and new chemical structures generated on the server to the user. This allows the user to easily grasp the overall picture of the data and obtain an intuitive interface for constructing logical hypotheses. Furthermore, the terminal provides content based on the user's interests and preferences, supporting the personalization of information.

[0550] Users utilize information provided via the terminal to select the most promising new drug candidates. Furthermore, users build hypotheses based on their expertise and design protocols for actual experiments and clinical trials. The results of the experiments conducted by users are sent back to the server through a feedback function, contributing to updating the AI model and improving its accuracy.

[0551] As a concrete example, when generating new movie content, a generative AI model can be used to suggest a new story based on the user's past viewing history. An example of this prompt might be the text, "Please suggest the most suitable new story from all movie genres based on the user's interests."

[0552] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0553] Step 1:

[0554] The server automatically collects biometric and clinical trial information from various databases. It uses various databases as input and outputs an initial, unprocessed dataset. This data contains information necessary for discovering new drug candidates and forms the basis for analysis in subsequent processing steps.

[0555] Step 2:

[0556] The server preprocesses the collected data to remove noise. Using the initial raw dataset as input, it produces improved, clean data as output. Specifically, it identifies outliers and duplicate data and removes or corrects them.
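Removing duplicates and outliers could be done as in the following sketch. The interquartile-range rule with a factor of 1.5 is an assumed criterion, since the disclosure does not say how outliers are identified, and the record fields `id` and `dose` are hypothetical.

```python
from statistics import quantiles


def drop_duplicates(records, key):
    """Keep the first record seen for each key value."""
    seen, unique = set(), []
    for rec in records:
        if rec[key] not in seen:
            seen.add(rec[key])
            unique.append(rec)
    return unique


def drop_outliers(records, field, k=1.5):
    """Remove records whose field lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([r[field] for r in records], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in records if low <= r[field] <= high]


raw = [{"id": i, "dose": d} for i, d in enumerate([10, 11, 12, 13, 14, 15, 500])]
raw.append({"id": 0, "dose": 10})  # duplicate of the first record
clean = drop_outliers(drop_duplicates(raw, "id"), "dose")
print([r["dose"] for r in clean])
```

The text also allows correcting a value instead of removing it; that variant would replace the filter in `drop_outliers` with a clamp to the same bounds.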

[0557] Step 3:

[0558] The server uses machine learning algorithms to analyze clean data and identify potential new drug candidates. It takes clean data as input and generates a list of potential drug candidates as output. Through the analysis, a model is trained to understand the relationships between data and the effectiveness of specific molecules.

[0559] Step 4:

[0560] The server performs virtual simulations of identified new drug candidates to evaluate their biological effects. It uses a list of new drug candidates as input and collects simulation results as output. The simulator tests the molecular structure and function of the compounds in the virtual environment to confirm their effectiveness.

[0561] Step 5:

[0562] The server automatically generates new chemical structures using a generative AI model. It takes a trained AI model and simulation results as input and generates candidate new chemical structures as output. This includes the generative AI exploring new chemical possibilities while referring to prompts.

[0563] Step 6:

[0564] The terminal visually displays analysis results and generated chemical structure information obtained from the server to the user. It uses data provided by the server as input and generates a visual report as output. The user interface presents information in an easily understandable format through data visualization.

[0565] Step 7:

[0566] The user constructs hypotheses and selects new drug candidates based on the information presented on the device. Visual reports are used as input, and the system provides a refined list of candidate substances as output. The user then leverages their expertise to make decisions for designing the protocols for subsequent experiments and tests.

[0567] Step 8:

[0568] The user's selection results and experimental protocols are fed back to the server and used to update the AI model. The system receives the user's selection results as input and builds a more accurate machine learning model as output. This cycle improves prediction accuracy and the potential for new drug development in subsequent trials.
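To make the feedback cycle concrete, the sketch below updates a small logistic model with one gradient step per feedback example. The disclosure does not state which model or update rule is used, so the logistic model, the learning rate, and the two-value feature vector are all assumptions chosen for illustration.

```python
import math


def predict(weights, features):
    """Logistic model: estimated probability that a candidate is promising."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def update(weights, features, label, lr=0.1):
    """One gradient step on the log loss for a single feedback example."""
    error = predict(weights, features) - label
    return [w - lr * error * x for w, x in zip(weights, features)]


weights = [0.0, 0.0]
features = [1.0, 2.0]  # hypothetical descriptor values for one candidate
before = predict(weights, features)  # 0.5 with all-zero weights
for _ in range(20):
    # label=1 stands for a candidate the user reported as successful
    weights = update(weights, features, label=1)
after = predict(weights, features)
print(before, after)
```

After repeated positive feedback, the predicted probability for that candidate rises above its initial value of 0.5.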

[0569] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0570] This invention provides a system that enhances the user experience and enables efficient research progress by incorporating emotion recognition into the new drug development process. The system consists of components including a server, terminals, a user interface, and an emotion engine.

[0571] Server-based functionality

[0572] The server performs traditional data collection, preprocessing, analysis using machine learning models, in silico simulations, and molecular generation using generative AI models. Analysis results and the latest research insights are transmitted to the terminal to support user decision-making. The server also analyzes user sentiment data collected by the emotion engine and incorporates it into the research process.

[0573] Terminal-based functionality

[0574] The terminal visually displays data and analysis results sent from the server, providing an interface that is easy for the user to understand and operate. Furthermore, the terminal analyzes the user's facial expressions and voice tone through its built-in emotion engine to recognize their emotional state.

[0575] Functions powered by the emotion engine

[0576] The emotion engine monitors the user's emotions in real time and analyzes changes in the user's stress levels and interests. This information is sent to the server, allowing the system's responses and interface settings to be dynamically adjusted according to the user's emotions. For example, if a user shows excitement in response to a particular analysis result, the system will quickly provide relevant additional information to support the user's curiosity. Also, if a high stress level is detected in the user, the emotion engine can recommend content to promote relaxation.
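The adjustment described here amounts to a rule that maps an estimated emotion to a system response. A minimal sketch follows. The emotion labels, the normalized intensity value, the 0.6 threshold, and the response names are hypothetical and are not defined in the disclosure.

```python
from dataclasses import dataclass


@dataclass
class EmotionReading:
    label: str        # assumed labels such as "excited", "stressed", "neutral"
    intensity: float  # assumed to be normalized to [0, 1]


def choose_response(reading: EmotionReading, threshold: float = 0.6) -> str:
    """Pick a system response; weak signals fall back to the current view."""
    if reading.intensity < threshold:
        return "keep_current_view"
    if reading.label == "excited":
        return "show_related_information"
    if reading.label == "stressed":
        return "recommend_relaxation_content"
    return "keep_current_view"


print(choose_response(EmotionReading("excited", 0.9)))   # show_related_information
print(choose_response(EmotionReading("stressed", 0.2)))  # keep_current_view
```

The threshold keeps the interface from changing on weak or noisy readings, which is one possible way to handle real-time monitoring.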

[0577] User-based functionality

[0578] Users utilize the analysis results and emotional feedback displayed on the terminal to formulate hypotheses and plan experiments and clinical trials. The insights provided by the emotion engine help users understand their own emotional state and support better decision-making. Furthermore, users can adjust the responsiveness of the emotion engine and the system through the feedback function.

[0579] Specific example

[0580] For example, suppose the system detects tension from the user's facial expression while they are analyzing a new drug candidate. In this case, the device starts playing relaxing background music and simultaneously presents the analysis results again from a different perspective, helping the user consider the most appropriate option without feeling rushed. Furthermore, additional relevant materials may be provided based on the user's interests, further improving research efficiency.

[0581] The following describes the processing flow.

[0582] Step 1:

[0583] The server automatically collects diverse biodata and clinical trial data and stores it in a database. This prepares the rich information necessary for analysis.

[0584] Step 2:

[0585] The server preprocesses the collected data, removing noise and imputing missing values. By creating a clean dataset, it improves the accuracy of the analysis.

[0586] Step 3:

[0587] The server inputs pre-processed data into a machine learning model and performs data analysis. This analysis identifies molecules and targets that could be potential new drug candidates.

[0588] Step 4:

[0589] The server performs in silico simulations on identified candidate molecules and evaluates their biological effects in a virtual environment. This allows for the prediction of the effects of promising molecules.

[0590] Step 5:

[0591] The server generates new chemical structures using a generative AI model. This allows it to explore new possibilities based on existing data.

[0592] Step 6:

[0593] The server analyzes relevant literature and extracts useful insights. This information is provided in a format that is relevant to the research context.

[0594] Step 7:

[0595] The terminal visualizes the analysis results, simulation data, and generated molecular information transmitted from the server, and displays them in an easy-to-understand manner for the user.

[0596] Step 8:

[0597] The emotion engine within the device recognizes the user's emotions in real time from their facial expressions and voice, and sends the analysis results to the server.

[0598] Step 9:

[0599] Based on the emotional data it receives, the server adjusts the interface and presents information according to the user's state. For example, if the user is feeling stressed, it will suggest content on the device that promotes relaxation.

[0600] Step 10:

[0601] Users formulate hypotheses based on the presented analysis results and emotion-based insights, and plan the next steps in new drug development. They develop specific experimental plans and continuously improve the AI model by providing feedback to the system.

[0602] (Example 2)

[0603] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0604] The research process in new drug development requires time-consuming and resource-intensive analysis and the extraction of insights from diverse data. However, conventional methods struggle to provide users with the most relevant information efficiently and dynamically, lacking responsiveness that considers users' emotional states and interests. This leads to a decline in the quality of decision-making, which is a significant challenge.

[0605] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0606] In this invention, the server includes means for collecting biological information and test data, means for identifying new drug candidates using an artificial intelligence model, and computing means for visualizing and displaying the analysis results. This enables efficient processing of the diverse data collected and dynamic information provision that takes into account the user's emotional state.

[0607] "Biological information" refers to data about the structure, function, or evolution of living organisms, and is used in new drug development and health analysis.

[0608] "Test data" refers to data collected through scientific and clinical trials, and is information used to evaluate the efficacy and safety of products and drugs.

[0609] An "artificial intelligence model" is a computer program that learns patterns from data and makes predictions and judgments based on machine learning and deep learning technologies.

[0610] "Virtual simulation" is a technique that uses computer software to mimic real-world processes and systems, and is a method for verifying hypotheses and designs before conducting physical tests.

[0611] A "generative artificial intelligence model" is an AI model that utilizes machine learning to create new data and information, and is used for generating molecular structures of new drugs and natural language text.

[0612] "Computing means" refers to devices and systems that use computers and electronic devices to process, analyze, and visualize data.

[0613] "Emotional data" refers to data that indicates a user's psychological or emotional state, obtained from things like their facial expressions and tone of voice.

[0614] "Solution" refers to a technical or methodological means of overcoming a specific problem and achieving an objective.

[0615] This invention is a system that enables efficient and dynamic information provision in the new drug development process. The system mainly consists of three components: a server, a terminal, and a user.

[0616] The server first collects biological information and test data from various sources. The collected data is preprocessed using programming languages such as Python and R to prepare it for analysis. This preprocessing removes noise from the data and makes it easier to extract important features. Next, the server analyzes the data using artificial intelligence models (built with libraries such as Scikit-learn or TensorFlow) to identify new drug candidates. Furthermore, it uses a generative artificial intelligence model (e.g., an OpenAI model) to generate new chemical structures from a prompt such as "Generate a potential new drug molecule."

[0617] The terminal plays the role of visually displaying the analysis results to the user. Using data visualization tools (e.g., Tableau, Power BI), it presents information clearly using methods such as heatmaps and box plots. Furthermore, the terminal is equipped with an emotion engine that analyzes the user's facial expressions and tone of voice to evaluate their emotional state. This enables adaptive information presentation tailored to the user's level of tension and interest.
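
As an illustration of what a box plot presents, the following minimal sketch computes the five values such a plot draws. It uses only the Python standard library rather than the visualization tools named above, and the function name and sample readings are hypothetical.

```python
import statistics

def box_plot_summary(values):
    """Return the five numbers a box plot draws: min, Q1, median, Q3, max."""
    ordered = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "q1": q1, "median": median,
            "q3": q3, "max": ordered[-1]}

# Hypothetical assay readings for one candidate compound.
readings = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = box_plot_summary(readings)
print(summary)
```

A visualization tool would draw the box from Q1 to Q3 and the whiskers toward the minimum and maximum; the sketch only prepares those numbers.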

[0618] Users use the information provided by the terminal to formulate hypotheses and select new drug candidates. During this process, feedback, including the user's emotional data, is sent to the server, which dynamically adjusts the responses of the overall system. For example, if a user shows interest during drug development analysis, relevant additional information and materials can be provided promptly, improving research efficiency.

[0619] In this way, this system enables the rapid and effective progress of the new drug development process through information provision and analysis that takes into account the emotional state of the user.

[0620] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0621] Step 1:

[0622] The server collects biological information and test data from external data sources. This input data is often provided as CSV files or database APIs. Based on the collected data, the server uses Python scripts to remove noise and process outliers, formatting the data for analysis. The pre-processed data is then output, and the server proceeds to the next analysis step.
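
The preprocessing in Step 1 could be sketched as follows. This is a minimal example under stated assumptions, not the disclosed implementation: the column names `sample_id` and `value` are hypothetical, unparseable rows are treated as noise, and outliers are removed with the common 1.5 × IQR rule, which the source does not specify.

```python
import csv
import io
import statistics

def load_rows(csv_text):
    """Parse CSV text into dicts, skipping rows whose value is not numeric."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            rows.append({"sample_id": row["sample_id"],
                         "value": float(row["value"])})
        except (KeyError, ValueError, TypeError):
            continue  # assumption: unparseable rows are noise and are dropped
    return rows

def drop_outliers(rows, k=1.5):
    """Keep rows whose value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [r["value"] for r in rows]
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if low <= r["value"] <= high]

# Hypothetical input standing in for a CSV file from an external data source.
raw = ("sample_id,value\n"
       "s1,10.0\ns2,11.0\ns3,n/a\ns4,9.5\ns5,10.5\ns6,250.0\n")
clean = drop_outliers(load_rows(raw))
print([r["sample_id"] for r in clean])
```

Here the row with the non-numeric value and the row with the extreme reading are both excluded from the formatted dataset.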

[0623] Step 2:

[0624] The server feeds the pre-processed data into an artificial intelligence model. Based on the formatted data, it performs analyses such as classification, regression, and clustering using machine learning libraries (e.g., Scikit-learn, TensorFlow). The model's accuracy is verified on held-out data, for example by holdout validation or cross-validation, and the analysis results are obtained as output.
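
The two usual ways of verifying accuracy on held-out data, a single holdout split and k-fold cross-validation, can be sketched as index splits. The example below uses only the standard library instead of the machine learning libraries named above; the function names, split ratio, and fold count are assumptions made for illustration.

```python
import random

def holdout_split(n_samples, test_ratio=0.2, seed=0):
    """Shuffle indices once and reserve the last test_ratio share for testing."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(n_samples * test_ratio))
    return indices[:-n_test], indices[-n_test:]

def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

train_idx, test_idx = holdout_split(10)
folds = list(k_fold_splits(10, k=5))
print(len(train_idx), len(test_idx), len(folds))
```

A model would be fitted on the training indices and scored on the test indices; with k-fold splitting, the k scores are typically averaged.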

[0625] Step 3:

[0626] The server inputs the prompt "Generate a new drug candidate molecule" into a generative artificial intelligence model based on the analysis results. Data generation is performed based on this prompt, and a new chemical structure is proposed. The generated molecular data is used as input for in silico simulations.

[0627] Step 4:

[0628] The terminal visually presents the analysis results and generated chemical structures transmitted from the server to the user. Based on the data provided as input, it generates visualized information using data visualization software (e.g., Tableau, Power BI). The results are output in a visually easy-to-understand format to support decision-making.

[0629] Step 5:

[0630] The terminal uses an emotion engine to analyze the user's facial expressions and tone of voice. Emotional data collected through the camera and microphone is evaluated using emotion analysis software such as Affectiva. Based on this input, the user's stress and interest levels are output and fed back to the server.

[0631] Step 6:

[0632] The server dynamically adjusts the system's response based on emotional data sent from the terminal. If the user is experiencing high stress levels, it generates content to promote relaxation and instructs the terminal to display it. For example, it may provide relaxing music or related materials as output to support the user's exploration activities.
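
The adjustment in Step 6 can be pictured as a simple rule that maps the emotional data to an instruction for the terminal. The sketch below is illustrative only: the 0-to-1 scales, the thresholds, and the action names are assumptions and do not appear in the source.

```python
from dataclasses import dataclass

@dataclass
class EmotionReading:
    stress: float    # assumed scale: 0.0 (calm) to 1.0 (highly stressed)
    interest: float  # assumed scale: 0.0 (bored) to 1.0 (highly engaged)

def choose_response(reading, stress_threshold=0.7, interest_threshold=0.6):
    """Map an emotion reading to an instruction for the terminal."""
    # Stress is checked first so that relaxation content takes priority.
    if reading.stress >= stress_threshold:
        return {"action": "show_relaxation_content", "detail": "relaxing music"}
    if reading.interest >= interest_threshold:
        return {"action": "show_related_materials", "detail": "additional analysis"}
    return {"action": "continue", "detail": "no change"}

stressed = choose_response(EmotionReading(stress=0.9, interest=0.8))
curious = choose_response(EmotionReading(stress=0.2, interest=0.8))
print(stressed["action"], curious["action"])
```

Checking stress before interest is a design choice in this sketch; it reflects the example above in which a highly stressed user is shown relaxation content.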

[0633] (Application Example 2)

[0634] Next, we will explain Application Example 2. In the following explanation, the data processing device 12 will be referred to as the "server," and the headset-type terminal 314 will be referred to as the "terminal."

[0635] This invention aims to solve the problem of difficulty in considering the influence of changes in human emotions during the new drug development process. Conventional approaches often involve data analysis and decision-making without considering the psychological state of researchers, and there is a need to improve the user experience. Furthermore, even in the home environment, there is a lack of systems that recognize users' emotions and support their daily activities.

[0636] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0637] In this invention, the server includes means for collecting biodata and clinical trial data, means for analyzing the data with a machine learning model to identify new drug candidates, means for simulating the identified new drug candidate molecules in a computer environment, means for recognizing emotions and analyzing the user's psychological state, and means for applying emotion recognition data to home appliances to support the user's daily activities. This makes it possible to improve the user experience in new drug development and to provide flexible support tailored to the user's emotional state in daily life.

[0638] "Biodata" refers to information obtained from living organisms and plays a crucial role in new drug development and medical treatment.

[0639] "Clinical trial data" refers to data based on trials conducted to confirm the effectiveness and safety of drugs or treatments.

[0640] A "machine learning model" is a collection of algorithms that automatically find patterns and regularities in data and perform predictions and classifications.

[0641] "Simulation in a computer environment" is a method of virtually reproducing real chemical processes on a computer and predicting their effects and properties before conducting an experiment.

[0642] A "generative AI model" is a computational model that uses artificial intelligence to generate new data and structures.

[0643] "Related literature" refers to research papers, databases, and other literature information that are useful in new drug development.

[0644] "Visualization" is a technique that visually represents data and its analysis results, making them easy to understand.

[0645] "Means of recognizing emotions" refers to technologies that use sensors to detect a user's facial expressions, voice, and behavior, and then analyze their psychological state.

[0646] "Methods for applying emotion recognition data to home appliances" refers to technologies that use the obtained emotion data to adjust the operation of home appliances and support users in living comfortably.

[0647] To realize this invention, a system integrating multiple components is required. The server efficiently collects biodata and clinical trial data, preprocesses this data, and then analyzes it using machine learning models. Noise is removed during the analysis process, and useful information to aid in decision-making is extracted. Then, new drug candidates are identified, and simulations are performed on their molecules in a computer environment. Subsequently, new chemical structures are created using generative AI models, and relevant literature is analyzed to gain insights. The obtained results are visualized and provided to the user through a data terminal.

[0648] The terminal functions as the user interface, displaying analysis results in a visual and user-friendly format. Furthermore, it incorporates an emotion engine that recognizes emotions and monitors the user's psychological state in real time. Based on this information, in a home environment, it can adjust the operation of home appliances to match the user's emotional state. For example, it can play relaxation music when the user is feeling stressed.

[0649] Users build hypotheses based on visualized data and proceed with planning research and experiments. Furthermore, emotional feedback enables users to make better decisions and gain new insights that take their psychological state into account.

[0650] An example of a prompt that could be applied is: "Please give an example of how a home robot could suggest appropriate relaxation methods based on the user's emotions."

[0651] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0652] Step 1:

[0653] The server collects biodata and clinical trial data from users. This data includes, for example, genetic information and test results. The input data is preprocessed within the program to remove noise and output as a clean dataset.

[0654] Step 2:

[0655] Within the server, a machine learning model analyzes the clean dataset. This analysis uses an algorithm that extracts correlations and features from the data to identify new drug candidates. The output is a list of identified drug candidates.

[0656] Step 3:

[0657] The server performs simulations of the identified drug candidate molecules in a computing environment. This step involves virtual experiments to determine how the molecular structure and biological activity function. The input is data on the drug candidate, and the output is the simulation results.

[0658] Step 4:

[0659] The server generates new chemical structures using a generative AI model. Based on the simulation results, it searches for an optimal molecular structure. In this process, existing structural data is taken as input, and new structural data generated by the AI is output.

[0660] Step 5:

[0661] The server analyzes relevant literature and extracts useful insights. Here, information from a literature database is used as input, and text containing insights useful for research is output.

[0662] Step 6:

[0663] The terminal presents the user with visualized analysis results. This output, displayed on the monitor, is converted into graphs and charts that are easy for the user to understand.

[0664] Step 7:

[0665] An emotion engine built into the terminal monitors the user's psychological state in real time. The user's facial expressions and voice input are analyzed by emotion recognition software and output as data indicating the user's state.

[0666] Step 8:

[0667] The terminal adjusts home appliances based on emotional data to support the user's daily life. For example, if it determines that the user is stressed, it plays relaxation music or performs other actions. The input emotional data acts as the trigger, and the output is the operation of the appliances.

[0668] Step 9:

[0669] Users build hypotheses and plan experiments and tests based on data visualized on their devices and emotional feedback. In this step, text information is input to support the user's decision-making, and a specific plan is output.

[0670] The specific processing unit 290 transmits the result of the specific processing to the headset terminal 314. In the headset terminal 314, the control unit 46A causes the speaker 240 and display 343 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0671] The data generation model 58 is a so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. The data generation model 58 receives a prompt containing instructions together with inference data such as audio data representing speech, text data representing text, and image data representing images. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompt, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.
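
To make the relationship between the prompt and the inference data concrete, the following sketch assembles an instruction and text data into a single prompt string. The layout of the string and the function name are hypothetical; the source does not define a prompt format, and no model API is called here.

```python
def build_prompt(instruction, text_items):
    """Combine an instruction with text inference data into one prompt string."""
    lines = [f"Instruction: {instruction}", "Data:"]
    lines += [f"- {item}" for item in text_items]
    return "\n".join(lines)

# Hypothetical inference data: short text summaries of earlier analysis results.
prompt = build_prompt(
    "Summarize the findings for the research team.",
    ["Candidate A scored highest in the screening model.",
     "Candidate B showed a predicted safety concern."],
)
print(prompt)
```

The resulting string would be passed to whichever generative model serves as the data generation model 58; audio and image data would need their own encodings, which this sketch does not cover.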

[0672] In the above embodiment, an example was given in which specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and specific processing may also be performed by the headset terminal 314.

[0673] [Fourth Embodiment]

[0674] Figure 7 shows an example of the configuration of the data processing system 410 according to the fourth embodiment.

[0675] As shown in Figure 7, the data processing system 410 includes a data processing device 12 and a robot 414. An example of the data processing device 12 is a server.

[0676] The data processing device 12 comprises a computer 22, a database 24, and a communication interface 26. The computer 22 is an example of a "computer" related to the technology of this disclosure. The computer 22 comprises a processor 28, RAM 30, and storage 32. The processor 28, RAM 30, and storage 32 are connected to a bus 34. The database 24 and the communication interface 26 are also connected to the bus 34. The communication interface 26 is connected to a network 54. An example of the network 54 is a WAN (Wide Area Network) and / or a LAN (Local Area Network).

[0677] The robot 414 includes a computer 36, a microphone 238, a speaker 240, a camera 42, a communication interface 44, and a controlled object 443. The computer 36 includes a processor 46, RAM 48, and storage 50. The processor 46, RAM 48, and storage 50 are connected to a bus 52. The microphone 238, speaker 240, camera 42, and controlled object 443 are also connected to the bus 52.

[0678] The microphone 238 receives instructions from the user 20 by picking up the voice of the user 20. The microphone 238 captures the voice uttered by the user 20, converts it into audio data, and outputs the audio data to the processor 46. The speaker 240 outputs audio according to instructions from the processor 46.

[0679] Camera 42 is a small digital camera equipped with an optical system including a lens, aperture, and shutter, and an image sensor such as a CMOS (Complementary Metal-Oxide-Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor, and captures images of the area around the user 20 (for example, an imaging range defined by a field of view equivalent to the width of a typical healthy person's field of vision).

[0680] Communication interface 44 is connected to network 54. Communication interfaces 44 and 26 are responsible for the exchange of various information between processor 46 and processor 28 via network 54. The exchange of various information between processor 46 and processor 28 using communication interfaces 44 and 26 is performed in a secure manner.

[0681] The controlled object 443 includes a display device, LEDs in the eyes, and motors that drive the arms, hands, and feet. The posture and gestures of the robot 414 are controlled by controlling the motors of the arms, hands, and feet. Some of the robot 414's emotions can be expressed by controlling these motors. Furthermore, the robot 414's facial expressions can also be expressed by controlling the illumination state of the LEDs in its eyes.

[0682] Figure 8 shows an example of the main functions of the data processing device 12 and the robot 414. As shown in Figure 8, the data processing device 12 performs specific processing using the processor 28. The storage 32 stores the specific processing program 56.

[0683] The specific processing program 56 is an example of a "program" relating to the technology of this disclosure. The processor 28 reads the specific processing program 56 from the storage 32 and executes the read specific processing program 56 on the RAM 30. The specific processing is realized by the processor 28 operating as a specific processing unit 290 in accordance with the specific processing program 56 executed on the RAM 30.

[0684] The storage 32 stores the data generation model 58 and the emotion identification model 59. The data generation model 58 and the emotion identification model 59 are used by the specific processing unit 290.

[0685] In robot 414, the processor 46 performs the reception output processing. The storage 50 stores the reception output program 60. The processor 46 reads the reception output program 60 from the storage 50 and executes the read reception output program 60 on the RAM 48. The reception output processing is realized by the processor 46 operating as a control unit 46A according to the reception output program 60 executed on the RAM 48.

[0686] Next, the specific processing performed by the specific processing unit 290 of the data processing device 12 will be described. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0687] This invention is implemented as an automated system to accelerate new drug development and improve its success rate. The system is built around three components: a server, a terminal, and a user.

[0688] Functions of the server

[0689] The server first automatically collects large amounts of biodata and clinical trial data from diverse databases. Next, it preprocesses this data to remove noise and improve quality. The purified data is then analyzed by machine learning models to identify molecules and compounds that could be candidates for new drugs. The server further performs in silico simulations to verify the biological effects of candidate molecules in a virtual environment. It also uses generative AI models to generate novel chemical structures and explore new possibilities. The server also analyzes relevant literature, supporting research by extracting highly relevant insights.

[0690] Functions of the terminal

[0691] On the terminal, analysis results and generated information sent from the server are displayed visually. The terminal provides an interface that allows users to intuitively understand these results and decide on the next action. Through data visualization, users can easily grasp the relationships and trends between datasets and formulate concrete hypotheses.

[0692] Functions of the user

[0693] Based on the data presented by the terminal, users select the most promising new drug candidates. Furthermore, users can formulate hypotheses based on the analysis results and design and adjust protocols for actual experiments and clinical trials. The results of the experiments conducted by the user are fed back to the server and used to improve the accuracy of the AI model.

[0694] Specific example

[0695] For example, in the early stages of new drug development, the server collects relevant information from diverse gene databases and analyzes it to discover promising molecules for specific disease conditions. Then, the function of these molecules is verified through in silico simulations, and they are evaluated together with any newly generated molecules. Finally, drawing on their experience and expertise, the user can plan the next research and development steps from the presented options, enabling efficient new drug development.

[0696] The following describes the processing flow.

[0697] Step 1:

[0698] The server automatically collects necessary data from various biodatabases and clinical trial databases. It retrieves data using APIs and web scraping and stores it in the database.
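
The storage half of Step 1 could look like the sketch below, which writes already-retrieved records into a database. It is a minimal example under stated assumptions: SQLite stands in for the database 24, the table layout and the source name `trial_db` are hypothetical, and the retrieval itself (API call or web scraping) is replaced by a hard-coded list.

```python
import json
import sqlite3

def store_records(conn, source, records):
    """Insert raw records as JSON text, keyed by source and record id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_records ("
        "source TEXT, record_id TEXT, payload TEXT, "
        "PRIMARY KEY (source, record_id))"
    )
    # INSERT OR REPLACE keeps one row per (source, record_id) on re-collection.
    conn.executemany(
        "INSERT OR REPLACE INTO raw_records VALUES (?, ?, ?)",
        [(source, r["id"], json.dumps(r)) for r in records],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
# Hypothetical records standing in for the result of an API request.
fetched = [{"id": "T-001", "phase": 2}, {"id": "T-002", "phase": 3}]
store_records(conn, "trial_db", fetched)
store_records(conn, "trial_db", fetched)  # collecting again adds no duplicates
row_count = conn.execute("SELECT COUNT(*) FROM raw_records").fetchone()[0]
print(row_count)
```

Keying rows by source and record identifier means that repeated automatic collection updates existing entries instead of duplicating them.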

[0699] Step 2:

[0700] The server preprocesses the collected data. It performs noise reduction and imputation of missing values to improve data quality and ensure the accuracy of the analysis.
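
Imputation of missing values can be illustrated with mean imputation, one simple strategy among several; the source does not say which is used. The column name `dose` and the sample rows are hypothetical.

```python
import statistics

def impute_missing(rows, column):
    """Replace None in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = statistics.fmean(observed)
    # Build new dicts so the input rows are left unchanged.
    return [{**r, column: fill if r[column] is None else r[column]}
            for r in rows]

rows = [
    {"id": "a", "dose": 10.0},
    {"id": "b", "dose": None},   # missing value to be imputed
    {"id": "c", "dose": 20.0},
]
filled = impute_missing(rows, "dose")
print(filled[1])
```

Median imputation or model-based imputation would follow the same pattern with a different fill value.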

[0701] Step 3:

[0702] The server feeds the pre-processed data into a machine learning model for analysis. The model extracts patterns and features from the data to identify potential drug molecules.

[0703] Step 4:

[0704] The server performs in silico simulations to evaluate the biological effects of identified new drug candidate molecules in a virtual environment. The simulation results are then used in the next step.

[0705] Step 5:

[0706] The server uses a generative AI model to generate new chemical structures. This allows for the exploration of further possibilities by comparing them with existing molecules.

[0707] Step 6:

[0708] The server analyzes relevant literature from the bibliographic database and extracts information and insights useful for research. This information is provided to users as support information.

[0709] Step 7:

[0710] The terminal visualizes analysis results, simulation data, generated molecular information, and related literature information sent from the server and displays them on the user interface.

[0711] Step 8:

[0712] Based on the information provided on the terminal, users formulate hypotheses and plan the next steps in new drug development. They then design experimental plans for the selected drug candidate molecules.

[0713] Step 9:

[0714] The user's experimental results are fed back to the server. This allows the AI model to be further trained, improving the accuracy of the analysis.

[0715] (Example 1)

[0716] Next, we will describe Example 1. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0717] The drug development process consumes a great deal of time and resources, making it necessary to identify and evaluate drug candidates more efficiently and effectively. This invention was developed to improve the success rate of drug development by processing the vast amount and complexity of data generated in this process, and by enabling highly accurate data analysis and hypothesis building.

[0718] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 1 is realized by the following means.

[0719] In this invention, the server includes means for collecting and storing information, means for analyzing the information using a machine learning algorithm to identify new product candidates, and means for generating new structures using a generative AI algorithm. This makes it possible to efficiently analyze vast amounts of data and rapidly identify promising new drug candidates.

[0720] "Means for collecting and storing information" refers to a function that automatically retrieves relevant data from various sources and stores that data in storage as needed.

[0721] "Means of analyzing data using machine learning algorithms to identify potential new products" refers to a function that applies a specific algorithm to collected data to detect useful patterns and trends, and then uses this to identify new and potentially valuable products.

[0722] "Means of performing virtual simulations" refers to a function that runs simulated experiments on a computer to predict the impact in the real world and analyzes the results.

[0723] "Methods for generating new structures using generative AI algorithms" refer to processes that utilize artificial intelligence technology to automatically create new compounds and designs based on existing data.

[0724] "Means for analyzing related literature and extracting useful insights" refers to a function that scans literature databases and identifies important information and insights related to a specific theme.

[0725] "Means of providing a user interface for visualizing and displaying analysis results" refers to a function that provides an interface for graphically representing complex data and results, enabling users to understand them intuitively.

[0726] "Means for users to construct hypotheses using the results and select candidate products" refers to support functions that enable users to make inferences based on the provided information and to make important decisions based on those inferences.

[0727] This invention is a system for streamlining data analysis and decision-making in new drug development and product design. The system mainly consists of three components: a server, a terminal, and a user.

[0728] The server first has the function of collecting and storing information such as biodata and research results from multiple databases. Specifically, it retrieves data from APIs using the programming language Python and the data processing library Pandas. After collection, the server analyzes the data by applying machine learning algorithms to identify promising new drug candidates. Machine learning frameworks such as TensorFlow and Scikit-learn are used for this analysis.

[0729] Next, the server uses a generative AI model to generate new chemical structures. For example, it uses OpenAI models to generate new ideas through prompts that utilize natural language processing technology. Furthermore, a text analysis library is used to analyze relevant literature based on the collected data and extract useful insights.

[0730] The terminal visualizes the analysis results and generated information from the server, providing them as graphs and charts for easy user understanding. Tableau and Power BI are used as data visualization tools. This enables users to make intuitive decisions.

[0731] Based on the information provided on the terminal, users construct hypotheses about new products and select candidates. The experimental results and observations for the selected candidates are fed back to the server and used to continuously improve the machine learning model.

[0732] A concrete example of a prompt is: "In the development of next-generation anticancer drugs, please tell me the steps necessary to select a compound that is effective against a specific gene mutation." Using this prompt, the AI model provides new insights and supports the user's decision-making.

[0733] The flow of the specific processing in Example 1 will be explained using Figure 11.

[0734] Step 1:

[0735] The server first accesses databases to collect information. Specifically, it retrieves necessary data from biodatabases and clinical trial databases via APIs. Inputs include search queries and access keys, and the output is a structured dataset. This data is stored and prepared for use in the next processing step.

[0736] Step 2:

[0737] The server preprocesses the collected data. Specifically, it uses the Python Pandas library to detect and impute or remove missing or outlier values. The input is the collected raw data, and the output is a clean, unified dataset. This dataset is then prepared for further analysis.

[0738] Step 3:

[0739] The server applies machine learning algorithms to preprocessed data for analysis. Using TensorFlow or Scikit-learn, it identifies promising new drug candidates from the input data. The input is a preprocessed dataset, and the output is the results of identifying new drug candidates. Ranking and scoring are then performed based on these results.
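
The ranking and scoring at the end of Step 3 can be sketched as a sort over per-candidate scores. The compound names and score values below are hypothetical placeholders for the output of the machine learning model.

```python
def rank_candidates(scores, top_n=3):
    """Sort candidates by predicted score, highest first, with a 1-based rank."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [
        {"rank": i, "candidate": name, "score": score}
        for i, (name, score) in enumerate(ordered[:top_n], start=1)
    ]

# Hypothetical model outputs: a higher score means a more promising candidate.
predicted = {"cmpd-A": 0.42, "cmpd-B": 0.91, "cmpd-C": 0.77, "cmpd-D": 0.15}
ranking = rank_candidates(predicted, top_n=2)
print(ranking)
```

The top-ranked candidates would then be passed to the in silico simulation in the next step.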

[0740] Step 4:

[0741] The server runs in silico simulations, which are used to evaluate the biological effects in a virtual environment and are performed using simulation software. The input is a new drug candidate identified by machine learning, and the output is the simulation results. These results are used to predict the efficacy and safety of the molecule.

[0742] Step 5:

[0743] The server generates new chemical structures using a generative AI model. This process takes text prompts as input and runs the generative model to produce unique structures. Specifically, it uses OpenAI's generative model, providing prompt text as input to create innovative molecules and compounds.

[0744] Step 6:

[0745] The server analyzes relevant materials and extracts useful insights. It utilizes natural language processing techniques to analyze text data and identify important information. The input is text data from literature, and the output is analyzed key points and insights. These insights serve as a guide for determining the direction of research and development.
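
As a rough illustration of Step 6, the sketch below extracts frequent terms from literature text. Term frequency is a deliberately simple stand-in for the natural language processing techniques mentioned above, and the stopword list and sample abstract are assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "was", "with", "for"}

def key_terms(text, top_n=3):
    """Return the most frequent non-stopword terms as a proxy for key points."""
    words = re.findall(r"[a-z][a-z0-9-]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [term for term, _ in counts.most_common(top_n)]

# Hypothetical abstract text standing in for an entry in a literature database.
abstract = ("The kinase inhibitor reduced tumor growth. "
            "Inhibitor binding to the kinase was confirmed in vitro.")
terms = key_terms(abstract, top_n=2)
print(terms)
```

The extracted terms could be used to index or filter documents before a language model produces the actual summary of insights.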

[0746] Step 7:

[0747] The terminal visually displays the analysis results and generated information. Data visualization tools are used to represent the results in graphs and charts. The input is the analysis results, and the output is visualized information provided in a user-friendly format.

[0748] Step 8:

[0749] Users build hypotheses and make selections based on the information visualized on the terminal. They review the results and decide on the most suitable new drug candidate. The input is the visualized information, and the output is the next action and feedback determined by the user. This feedback is sent to the server and contributes to further improving the accuracy of the AI model.

[0750] (Application Example 1)

[0751] Next, we will explain Application Example 1. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0752] Traditionally, efficiently analyzing large amounts of biodata and clinical trial data in new drug development and rapidly identifying promising drug candidates has been difficult. Furthermore, visualizing this data and tailoring content generation based on user interests required significant manual effort and time. Therefore, there was a need to automate the delivery of individually optimized content to users simultaneously with the discovery of new drug candidates.

[0753] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 1 is realized by the following means.

[0754] In this invention, the server includes means for collecting biometric and clinical trial information, means for analyzing the information with machine learning algorithms to identify new drug candidates, means for generating new chemical structures using generative AI models, and means for generating and recommending customized content based on the user's interests. This enables the user to quickly and accurately identify new drug candidates and receive personalized content.

[0755] "Biometric information" refers to data related to biological or health status, including genetic information and clinical test results.

[0756] "Clinical trial information" refers to data obtained from trials conducted to evaluate the safety and efficacy of pharmaceuticals.

[0757] A "machine learning algorithm" is a set of techniques for learning patterns from data and performing predictions and classifications.

[0758] A "new drug candidate" refers to a molecule or substance that is expected to be developed as a pharmaceutical product.

[0759] "Simulation in a virtual space" is a process that uses computer technology to simulate real-world phenomena.

[0760] A "generative AI model" is a model that generates new data and content based on artificial intelligence technology.

[0761] "Materials" refer to relevant documents and datasets, which are sources of information for gaining insights.

[0762] "Output device means" refers to a device or interface that displays analysis results and other information in a way that allows the user to visually confirm them.

[0763] A "logical hypothesis" is an unproven theory set up to explain observed phenomena.

[0764] "Personal interest-based content" refers to information and entertainment that is individually customized based on a user's past behavior and preferences.

[0765] To implement this invention, a system is constructed that uses three main components: a server, a terminal, and a user.

[0766] The server automatically collects biometric and clinical trial information from various databases and performs preprocessing such as noise reduction to refine the information. Next, machine learning algorithms are used to analyze this information and identify potential new drug candidates. The initially identified candidate substances are then virtually validated for their efficacy and safety through simulations in a virtual space. Simultaneously, generative AI models generate new chemical structures, advancing the exploration of new drug possibilities.

[0767] The terminal visually displays analysis results and new chemical structures generated on the server to the user. This allows the user to easily grasp the overall picture of the data and obtain an intuitive interface for constructing logical hypotheses. Furthermore, the terminal provides content based on the user's interests and preferences, supporting the personalization of information.

[0768] Users use the information provided via the terminal to select the most promising new drug candidates. Furthermore, users build hypotheses based on their expertise and design protocols for actual experiments and clinical trials. The results of the experiments conducted by users are sent back to the server through a feedback function and are used to update the AI model and improve its accuracy.

[0769] As a concrete example, when generating new movie content, a generative AI model can be used to suggest a new story based on the user's past viewing history. An example of this prompt might be the text, "Please suggest the most suitable new story from all movie genres based on the user's interests."

[0770] The flow of a specific process in Application Example 1 will be explained using Figure 12.

[0771] Step 1:

[0772] The server automatically collects biometric and clinical trial information from various databases. It uses various databases as input and outputs an initial, unprocessed dataset. This data contains information necessary for discovering new drug candidates and forms the basis for analysis in subsequent processing steps.

[0773] Step 2:

[0774] The server preprocesses the collected data to remove noise. Using the initial raw dataset as input, it produces improved, clean data as output. Specifically, it identifies outliers and duplicate data and removes or corrects them.
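The preprocessing in Step 2 can be sketched as follows. This is a minimal illustration only: the record fields (`sample_id`, `marker_level`) are hypothetical, and the outlier rule (a modified z-score based on the median absolute deviation) is an assumption of this sketch, since the disclosure does not name a specific criterion.

```python
from statistics import median


def remove_duplicates(records):
    """Drop exact duplicate records while keeping first-seen order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def remove_outliers(records, field, threshold=3.5):
    """Drop records whose `field` has a modified z-score above `threshold`.

    The median absolute deviation (MAD) rule is an assumption of this sketch.
    """
    values = [rec[field] for rec in records]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(records)  # no spread to judge outliers against
    return [
        rec for rec in records
        if 0.6745 * abs(rec[field] - med) / mad <= threshold
    ]


# Hypothetical raw dataset produced by Step 1.
raw = [
    {"sample_id": "s1", "marker_level": 1.0},
    {"sample_id": "s2", "marker_level": 1.2},
    {"sample_id": "s2", "marker_level": 1.2},   # duplicate record
    {"sample_id": "s3", "marker_level": 0.9},
    {"sample_id": "s4", "marker_level": 1.1},
    {"sample_id": "s5", "marker_level": 50.0},  # implausible value
]

clean = remove_outliers(remove_duplicates(raw), "marker_level")
```

Here the duplicate `s2` record and the implausible `s5` record are dropped, leaving four records. The sketch removes outliers rather than correcting them; correction would need domain-specific rules that the text does not specify.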

[0775] Step 3:

[0776] The server uses machine learning algorithms to analyze clean data and identify potential new drug candidates. It takes clean data as input and generates a list of potential drug candidates as output. Through the analysis, a model is trained to understand the relationships between data and the effectiveness of specific molecules.

[0777] Step 4:

[0778] The server performs virtual simulations of identified new drug candidates to evaluate their biological effects. It uses a list of new drug candidates as input and collects simulation results as output. The simulator virtually tests the molecular structure and function of the compounds, virtually confirming their effectiveness.

[0779] Step 5:

[0780] The server automatically generates new chemical structures using a generative AI model. It takes a trained AI model and simulation results as input and generates candidate new chemical structures as output. This includes the generative AI exploring new chemical possibilities while referring to prompts.

[0781] Step 6:

[0782] The terminal visually displays analysis results and generated chemical structure information obtained from the server to the user. It uses data provided by the server as input and generates a visual report as output. The user interface presents information in an easily understandable format through data visualization.

[0783] Step 7:

[0784] The user constructs hypotheses and selects new drug candidates based on the information presented on the terminal. Visual reports are used as input, and the system provides a refined list of candidate substances as output. The user then leverages their expertise to make decisions for designing the protocols for subsequent experiments and tests.

[0785] Step 8:

[0786] The user selection results and experimental protocols are fed back to the server and used to update the AI model. The system receives the user selection results as input and builds a more accurate machine learning model as output. This cycle improves prediction accuracy and the potential for new drug development in subsequent trials.
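The way each server-side step consumes the previous step's output can be sketched as a chain of functions. This is an illustrative skeleton only: all names (`PipelineState`, `collect`, `identify`, and so on), the toy data, and the fixed threshold standing in for the trained model are assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineState:
    """Data handed from step to step; every field name here is illustrative."""
    raw: list = field(default_factory=list)
    clean: list = field(default_factory=list)
    candidates: list = field(default_factory=list)
    simulated: dict = field(default_factory=dict)
    generated: list = field(default_factory=list)
    completed: list = field(default_factory=list)


def collect(state):
    # Step 1: stand-in for querying external databases.
    state.raw = [
        {"compound": "A", "activity": 0.8},
        {"compound": "B", "activity": None},
        {"compound": "C", "activity": 0.3},
    ]
    return state


def preprocess(state):
    # Step 2: drop incomplete records.
    state.clean = [r for r in state.raw if r["activity"] is not None]
    return state


def identify(state):
    # Step 3: a fixed threshold stands in for the trained model.
    state.candidates = [r["compound"] for r in state.clean if r["activity"] >= 0.5]
    return state


def simulate(state):
    # Step 4: placeholder entry per candidate; no real simulation is run.
    state.simulated = {c: "not yet simulated" for c in state.candidates}
    return state


def generate(state):
    # Step 5: placeholder names for structures a generative model would propose.
    state.generated = [f"{c}-variant" for c in state.candidates]
    return state


STEPS = [collect, preprocess, identify, simulate, generate]


def run(steps):
    """Run the steps in order, recording which ones have completed."""
    state = PipelineState()
    for step in steps:
        state = step(state)
        state.completed.append(step.__name__)
    return state


result = run(STEPS)
```

Keeping each step as a separate function with a shared state object makes it straightforward to swap a placeholder for a real implementation, or to re-run the chain after the feedback in Step 8 updates the model.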

[0787] Furthermore, an emotion engine that estimates the user's emotions may be incorporated. That is, the specific processing unit 290 may use the emotion identification model 59 to estimate the user's emotions and perform the specific processing using the user's emotions.

[0788] This invention provides a system that enhances the user experience and enables efficient research progress by incorporating emotion recognition into the new drug development process. The system consists of components including a server, terminals, a user interface, and an emotion engine.

[0789] Server-based functionality

[0790] The server performs traditional data collection, preprocessing, analysis using machine learning models, in silico simulations, and molecular generation using generative AI models. Analysis results and the latest research insights are transmitted to the terminal to support user decision-making. The server also analyzes user sentiment data collected by the emotion engine and incorporates it into the research process.

[0791] Terminal functions

[0792] The terminal visually displays data and analysis results sent from the server, providing an interface that is easy for the user to understand and operate. Furthermore, the terminal analyzes the user's facial expressions and voice tone through its built-in emotion engine to recognize their emotional state.

[0793] Functions powered by the emotion engine

[0794] The emotion engine monitors the user's emotions in real time and analyzes changes in the user's stress levels and interests. This information is sent to the server, allowing the system's responses and interface settings to be dynamically adjusted according to the user's emotions. For example, if a user shows excitement in response to a particular analysis result, the system will quickly provide relevant additional information to support the user's curiosity. Also, if a high stress level is detected in the user, the emotion engine can recommend content to promote relaxation.
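The adjustment described above can be sketched as a small rule table that maps an emotion reading to system responses. The 0-to-1 scales, the threshold values, and the response names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmotionReading:
    stress: float    # 0.0 (calm) to 1.0 (highly stressed); scale is assumed
    interest: float  # 0.0 (indifferent) to 1.0 (excited); scale is assumed


def choose_responses(reading, stress_limit=0.7, interest_limit=0.6):
    """Return the list of responses the system would issue for one reading."""
    responses = []
    if reading.stress >= stress_limit:
        responses.append("recommend_relaxation_content")
    if reading.interest >= interest_limit:
        responses.append("provide_additional_information")
    if not responses:
        responses.append("keep_current_view")
    return responses


stressed = choose_responses(EmotionReading(stress=0.9, interest=0.2))
excited = choose_responses(EmotionReading(stress=0.1, interest=0.8))
neutral = choose_responses(EmotionReading(stress=0.3, interest=0.3))
```

A reading can trigger both rules at once, for example a user who is excited but also stressed, in which case both responses are returned.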

[0795] User actions

[0796] Users use the analysis results and emotional feedback displayed on the terminal to formulate hypotheses and plan experiments and clinical trials. The insights provided by the emotion engine help users understand their own emotional state and support better decision-making. Furthermore, users can adjust the responsiveness of the emotion engine and the system through the feedback function.

[0797] Specific example

[0798] For example, suppose the system detects tension from the user's facial expression while they are analyzing a new drug candidate. In this case, the terminal starts playing relaxing background music and simultaneously presents the analysis results again from a different perspective, helping the user consider the most appropriate option without feeling rushed. Furthermore, additional relevant materials may be provided based on the user's interests, further improving research efficiency.

[0799] The following describes the processing flow.

[0800] Step 1:

[0801] The server automatically collects diverse biodata and clinical trial data and stores it in a database. This prepares the rich information necessary for analysis.

[0802] Step 2:

[0803] The server preprocesses the collected data, removing noise and imputing missing values. By creating a clean dataset, it improves the accuracy of the analysis.
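The imputation of missing values mentioned in this step can be sketched as below. Median imputation is an assumption of this sketch (the disclosure does not specify a method), and the column names `age` and `dose_mg` are hypothetical.

```python
from statistics import median


def impute_missing(rows, columns):
    """Return new rows in which None in `columns` is replaced by the column median."""
    fill = {}
    for col in columns:
        observed = [row[col] for row in rows if row[col] is not None]
        # If a column has no observed values at all, leave its gaps as None.
        fill[col] = median(observed) if observed else None
    imputed = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        for col in columns:
            if new_row[col] is None:
                new_row[col] = fill[col]
        imputed.append(new_row)
    return imputed


# Hypothetical collected records with gaps.
rows = [
    {"subject": "p01", "age": 34, "dose_mg": 10.0},
    {"subject": "p02", "age": None, "dose_mg": 12.5},
    {"subject": "p03", "age": 51, "dose_mg": None},
    {"subject": "p04", "age": 42, "dose_mg": 10.0},
]

imputed = impute_missing(rows, ["age", "dose_mg"])
```

In this example the missing age is filled with 42 and the missing dose with 10.0, while the original `rows` list is left untouched so the raw data remains available for auditing.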

[0804] Step 3:

[0805] The server inputs pre-processed data into a machine learning model and performs data analysis. This analysis identifies molecules and targets that could be potential new drug candidates.

[0806] Step 4:

[0807] The server performs in silico simulations on identified candidate molecules and evaluates their biological effects in a virtual environment. This allows for the prediction of the effects of promising molecules.

[0808] Step 5:

[0809] The server generates new chemical structures using a generative AI model. This allows it to explore new possibilities based on existing data.

[0810] Step 6:

[0811] The server analyzes relevant literature and extracts useful insights. This information is provided in a format that is relevant to the research context.

[0812] Step 7:

[0813] The terminal visualizes the analysis results, simulation data, and generated molecular information transmitted from the server, and displays them in an easy-to-understand manner for the user.

[0814] Step 8:

[0815] The emotion engine in the terminal recognizes the user's emotions in real time from their facial expressions and voice, and sends the analysis results to the server.

[0816] Step 9:

[0817] Based on the emotional data it receives, the server adjusts the interface and presents information according to the user's state. For example, if the user is feeling stressed, it will suggest content on the terminal that promotes relaxation.

[0818] Step 10:

[0819] Users formulate hypotheses based on the presented analysis results and emotion-based insights, and plan the next steps in new drug development. They develop specific experimental plans and continuously improve the AI model by providing feedback to the system.

[0820] (Example 2)

[0821] Next, we will describe Example 2. In the following description, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0822] The research process in new drug development requires time-consuming and resource-intensive analysis and the extraction of insights from diverse data. However, conventional methods struggle to provide users with the most relevant information efficiently and dynamically, lacking responsiveness that considers users' emotional states and interests. This leads to a decline in the quality of decision-making, which is a significant challenge.

[0823] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Example 2 is realized by the following means.

[0824] In this invention, the server includes means for collecting biological information and test data, means for identifying new drug candidates using an artificial intelligence model, and computing means for visualizing and displaying the analysis results. This enables efficient processing of the diverse data collected and dynamic information provision that takes into account the user's emotional state.

[0825] "Biological information" refers to data about the structure, function, or evolution of living organisms, and is used in new drug development and health analysis.

[0826] "Test data" refers to data collected through scientific and clinical trials, and is information used to evaluate the efficacy and safety of products and drugs.

[0827] An "artificial intelligence model" is a computer program that learns patterns from data and makes predictions and judgments based on machine learning and deep learning technologies.

[0828] "Virtual simulation" is a technique that uses computer software to mimic real-world processes and systems, and is a method for verifying hypotheses and designs before conducting physical tests.

[0829] A "generative artificial intelligence model" is an AI model that utilizes machine learning to create new data and information, and is used for generating molecular structures of new drugs and natural language text.

[0830] "Computing means" refers to devices and systems that use computers and electronic devices to process, analyze, and visualize data.

[0831] "Emotional data" refers to data that indicates a user's psychological or emotional state, obtained from things like their facial expressions and tone of voice.

[0832] "Solution" refers to a technical or methodological means of overcoming a specific problem and achieving an objective.

[0833] This invention is a system that enables efficient and dynamic information provision in the new drug development process. The system mainly consists of three components: a server, a terminal, and a user.

[0834] The server first collects biological information and test data from various sources. The collected data is preprocessed using programming languages such as Python and R to prepare it for analysis. This preprocessing removes noise from the data and makes it easier to extract important features. Next, the server analyzes the data using artificial intelligence models (built with libraries such as Scikit-learn or TensorFlow) to identify potential new drugs. Furthermore, it utilizes generative artificial intelligence models (e.g., OpenAI models) to generate new chemical structures based on the prompt "Generate a potential new drug molecule."

[0835] The terminal plays the role of visually displaying the analysis results to the user. Using data visualization tools (e.g., Tableau, Power BI), it presents information clearly using methods such as heatmaps and box plots. Furthermore, the terminal is equipped with an emotion engine that analyzes the user's facial expressions and tone of voice to evaluate their emotional state. This enables adaptive information presentation tailored to the user's level of tension and interest.

[0836] Users use the information provided by the terminal to formulate hypotheses and select potential new drugs. During this process, feedback, including the user's emotional data, is sent to the server, allowing for dynamic adjustments to the overall system's responsiveness. For example, if a user expresses interest during drug development analysis, relevant additional information and materials can be quickly provided, improving research efficiency.

[0837] In this way, this system enables the rapid and effective progress of the new drug development process through information provision and analysis that takes into account the emotional state of the user.

[0838] The flow of the specific processing in Example 2 will be explained using Figure 13.

[0839] Step 1:

[0840] The server collects biological information and test data from external data sources. This input data is often provided as CSV files or through database APIs. The server then uses Python scripts to remove noise and process outliers, formatting the collected data for analysis. The pre-processed data is then output, and the server proceeds to the next analysis step.
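As a minimal sketch of ingesting CSV input in Step 1, the following parses CSV text with Python's standard `csv` module, converts fields to numeric types, and sets aside rows that cannot be parsed. The column names (`subject_id`, `biomarker`, `response`) and the sample content are hypothetical.

```python
import csv
import io

# Hypothetical CSV content standing in for a file from an external source.
CSV_TEXT = """subject_id,biomarker,response
p01,1.8,1
p02,n/a,0
p03,2.4,1
"""


def load_rows(text):
    """Parse CSV text into typed rows; return (rows, number_of_rejected_rows)."""
    rows = []
    rejected = 0
    for raw in csv.DictReader(io.StringIO(text)):
        try:
            row = {
                "subject_id": raw["subject_id"],
                "biomarker": float(raw["biomarker"]),
                "response": int(raw["response"]),
            }
        except (ValueError, TypeError):
            # Unparseable or missing field: count it instead of guessing a value.
            rejected += 1
            continue
        rows.append(row)
    return rows, rejected


rows, rejected = load_rows(CSV_TEXT)
```

Counting rejected rows rather than silently dropping them gives a simple data-quality signal that can be logged before the analysis step.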

[0841] Step 2:

[0842] The server feeds pre-processed data into an artificial intelligence model. Based on the formatted data, it performs data analysis such as classification, regression analysis, and clustering using machine learning libraries (e.g., Scikit-learn, TensorFlow). The model's accuracy is verified through holdout validation or cross-validation, and useful analysis results are obtained as output.
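The accuracy check in Step 2 can be illustrated with k-fold cross-validation written in plain Python. This is a sketch under stated assumptions: a one-dimensional nearest-centroid classifier stands in for the actual model, the data are invented, and in practice a library such as Scikit-learn would normally handle the splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, non-overlapping folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def fit_centroids(xs, ys):
    """Compute the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def cross_validate(xs, ys, k):
    """Return the accuracy obtained on each of the k held-out folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        centroids = fit_centroids([xs[i] for i in train_idx],
                                  [ys[i] for i in train_idx])
        hits = sum(predict(centroids, xs[i]) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Invented, well-separated data: low values are class 0, high values class 1.
xs = [0.1, 0.9, 0.2, 0.8, 0.15, 0.95, 0.05, 0.85]
ys = [0, 1, 0, 1, 0, 1, 0, 1]

scores = cross_validate(xs, ys, k=4)
```

A single holdout split is the special case of evaluating on one such fold only; averaging over all folds gives a less split-dependent estimate.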

[0843] Step 3:

[0844] The server inputs the prompt "Generate a new drug candidate molecule" into a generative artificial intelligence model based on the analysis results. Data generation is performed based on this prompt, and a new chemical structure is proposed. The generated molecular data is used as input for in silico simulations.
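One way to base the prompt on the analysis results is to append the highest-scoring findings to the base prompt as context. The sketch below only builds the prompt text; the structure of `analysis_results` and the example findings are hypothetical, and the call to the generative model itself is intentionally omitted.

```python
BASE_PROMPT = "Generate a new drug candidate molecule"


def build_prompt(analysis_results, max_items=3):
    """Append the top-ranked analysis findings to the base prompt as context."""
    ranked = sorted(analysis_results, key=lambda r: r["score"], reverse=True)
    lines = [f"- {r['finding']} (score {r['score']:.2f})" for r in ranked[:max_items]]
    if not lines:
        return BASE_PROMPT + "."
    return (BASE_PROMPT + " consistent with these analysis results:\n"
            + "\n".join(lines))


# Hypothetical output of the analysis in Step 2.
analysis_results = [
    {"finding": "cluster 2 is enriched for responders", "score": 0.74},
    {"finding": "feature_a correlates with response", "score": 0.91},
    {"finding": "feature_c is uninformative", "score": 0.05},
]

prompt = build_prompt(analysis_results, max_items=2)
```

Limiting the number of findings keeps the prompt short and avoids passing low-value results to the model.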

[0845] Step 4:

[0846] The terminal visually presents the analysis results and generated chemical structures transmitted from the server to the user. Based on the data provided as input, it generates visualized information using data visualization software (e.g., Tableau, Power BI). The results are output in a visually easy-to-understand format to support decision-making.

[0847] Step 5:

[0848] The terminal uses an emotion engine to analyze the user's facial expressions and voice tone. Emotional data collected through the camera and microphone is evaluated using Affectiva or similar emotion analysis software. Based on this input, the user's stress and interest levels are output and fed back to the server.

[0849] Step 6:

[0850] The server dynamically adjusts the system's response based on emotional data sent from the terminal. If the user is experiencing high stress levels, it generates content to promote relaxation and instructs the terminal to display it. For example, it may provide relaxing music or related materials as output to support the user's exploration activities.

[0851] (Application Example 2)

[0852] Next, we will explain application example 2. In the following explanation, the data processing device 12 will be referred to as the "server" and the robot 414 as the "terminal".

[0853] This invention aims to solve the problem of difficulty in considering the influence of changes in human emotions during the new drug development process. Conventional approaches often involve data analysis and decision-making without considering the psychological state of researchers, and there is a need to improve the user experience. Furthermore, even in the home environment, there is a lack of systems that recognize users' emotions and support their daily activities.

[0854] The specific processing performed by the specific processing unit 290 of the data processing device 12 in Application Example 2 is realized by the following means.

[0855] In this invention, the server includes means for collecting biodata and clinical trial data, means for analyzing the data with a machine learning model to identify new drug candidates, means for simulating the identified new drug candidate molecules in a computer environment, means for recognizing emotions and analyzing the user's psychological state, and means for applying emotion recognition data to home appliances to support the user's daily activities. This makes it possible to improve the user experience in new drug development and to provide flexible support tailored to the user's emotional state in daily life.

[0856] "Biodata" refers to information obtained from living organisms and plays a crucial role in new drug development and medical treatment.

[0857] "Clinical trial data" refers to data based on trials conducted to confirm the effectiveness and safety of drugs or treatments.

[0858] A "machine learning model" is a collection of algorithms that automatically find patterns and regularities in data and perform predictions and classifications.

[0859] "Simulation in a computer environment" is a method of virtually reproducing real chemical processes on a computer and predicting their effects and properties before conducting an experiment.

[0860] A "generative AI model" is a computational model that uses artificial intelligence to generate new data and structures.

[0861] "Related literature" refers to research papers, databases, and other literature information that are useful in new drug development.

[0862] "Visualization" is a technique that visually represents data and its analysis results, making them easy to understand.

[0863] "Means of recognizing emotions" refers to technologies that use sensors to detect a user's facial expressions, voice, and behavior, and then analyze their psychological state.

[0864] "Means for applying emotion recognition data to home appliances" refers to technologies that use the obtained emotion data to adjust the operation of home appliances and support users in living comfortably.

[0865] To realize this invention, a system integrating multiple components is required. The server efficiently collects biodata and clinical trial data, preprocesses this data, and then analyzes it using machine learning models. Noise is removed during the analysis process, and useful information to aid in decision-making is extracted. Then, new drug candidates are identified, and simulations are performed on their molecules in a computer environment. Subsequently, new chemical structures are created using generative AI models, and relevant literature is analyzed to gain insights. The obtained results are visualized and provided to the user through a data terminal.

[0866] The terminal functions as the user interface, displaying analysis results in a visual and user-friendly format. Furthermore, it incorporates an emotion engine to recognize emotions and monitor the user's psychological state in real time. Based on this information, in a home environment, it can adjust the operation of home appliances to match the user's emotional state. For example, it can play relaxation music when the user is feeling stressed.

[0867] Users build hypotheses based on visualized data and proceed with planning research and experiments. Furthermore, emotional feedback enables users to make better decisions and gain new insights that take their psychological state into account.

[0868] An example of a prompt that could be applied is: "Please give an example of how a home robot could suggest appropriate relaxation methods based on the user's emotions."

[0869] The flow of a specific process in Application Example 2 will be explained using Figure 14.

[0870] Step 1:

[0871] The server collects biodata and clinical trial data from users. This data includes, for example, genetic information and test results. The input data is preprocessed within the program to remove noise and output as a clean dataset.

[0872] Step 2:

[0873] Within the server, a machine learning model analyzes a clean dataset. This analysis uses an algorithm that extracts correlations and features from each data point to identify potential new drugs. The output is a list of identified drug candidates.
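As a minimal sketch of extracting correlations in Step 2, the following ranks features by the absolute Pearson correlation with an outcome column. The choice of Pearson correlation, the table layout, and the feature names are assumptions for illustration; the disclosure does not name a specific algorithm.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(table, target):
    """Return (feature, correlation) pairs sorted by absolute correlation."""
    scores = {
        name: pearson(column, table[target])
        for name, column in table.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Invented clean dataset in column form.
table = {
    "feature_a": [1, 2, 3, 4, 5],
    "feature_b": [2, 1, 4, 3, 5],
    "feature_c": [5, 5, 5, 5, 5],
    "response": [2, 4, 6, 8, 10],
}

ranking = rank_features(table, "response")
```

With this data, `feature_a` ranks first, `feature_b` second, and the constant `feature_c` last. Correlation alone does not establish that a feature drives the response, so a ranking like this is a screening aid rather than a final selection.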

[0874] Step 3:

[0875] The server performs simulations of the identified drug candidate molecules in a computing environment. This step involves virtual experiments to determine how the molecular structure and biological activity function. The input is data on the drug candidate, and the output is the simulation results.

[0876] Step 4:

[0877] The server generates new chemical structures using a generative AI model. Based on the simulation results, it searches for the optimal molecular structure. In this process, existing structural data is taken as input, and new structural data generated by the AI is output.

[0878] Step 5:

[0879] The server analyzes relevant literature and extracts useful insights. Here, information from a literature database is used as input, and text containing insights useful for research is output.

[0880] Step 6:

[0881] The terminal presents the user with visualized analysis results. This output, displayed on the monitor, is converted into graphs and charts that are easy for the user to understand.

[0882] Step 7:

[0883] An emotion engine built into the terminal monitors the user's psychological state in real time. The user's facial expressions and voice input are analyzed by emotion recognition software and output as data indicating their state.

[0884] Step 8:

[0885] The terminal adjusts home appliances based on emotional data to support the user's daily life. For example, if it determines that the user is stressed, it will play relaxation music or perform other actions. The input emotional data acts as a trigger for appliance operation, and the resulting appliance operation is the output.
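The trigger behavior in Step 8 can be sketched as a small state machine. To avoid switching an appliance on a single noisy reading, this sketch requires several consecutive readings on the same side of a threshold; that smoothing rule, the threshold, and the command names are assumptions added for illustration and are not stated in the disclosure.

```python
from collections import deque


class ApplianceTrigger:
    """Turn relaxation music on or off based on recent stress readings."""

    def __init__(self, stress_limit=0.7, window=3):
        self.stress_limit = stress_limit
        self.recent = deque(maxlen=window)  # keeps only the latest readings
        self.music_on = False

    def update(self, stress):
        """Feed one stress reading; return an appliance command or None."""
        self.recent.append(stress)
        window_full = len(self.recent) == self.recent.maxlen
        if not self.music_on and window_full and min(self.recent) >= self.stress_limit:
            self.music_on = True
            return "play_relaxation_music"
        if self.music_on and window_full and max(self.recent) < self.stress_limit:
            self.music_on = False
            return "stop_relaxation_music"
        return None


trigger = ApplianceTrigger()
readings = [0.8, 0.9, 0.75, 0.8, 0.4, 0.3, 0.2]
commands = [trigger.update(r) for r in readings]
```

Music starts on the third reading, once three consecutive readings are at or above the limit, and stops on the seventh, once three consecutive readings are below it.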

[0886] Step 9:

[0887] Users build hypotheses and plan experiments and tests based on data visualized on the terminal and emotional feedback. In this step, text information is input to support the user's decision-making, and a specific plan is output.

[0888] The specific processing unit 290 transmits the result of the specific processing to the robot 414. In the robot 414, the control unit 46A causes the speaker 240 and the controlled object 443 to output the result of the specific processing. The microphone 238 acquires audio indicating user input for the result of the specific processing. The control unit 46A transmits the audio data indicating user input acquired by the microphone 238 to the data processing device 12. In the data processing device 12, the specific processing unit 290 acquires the audio data.

[0889] Data generation model 58 is a type of so-called generative AI (Artificial Intelligence). Examples of the data generation model 58 include generative AI such as ChatGPT (Internet search <URL: https://openai.com/blog/chatgpt>) and Gemini (Internet search <URL: https://gemini.google.com/?hl=ja>). The data generation model 58 is obtained by performing deep learning on a neural network. Prompts containing instructions, together with inference data such as audio data representing speech, text data representing text, and image data representing images, are input to the data generation model 58. The data generation model 58 performs inference on the input inference data according to the instructions indicated by the prompts, and outputs the inference results in data formats such as audio data and text data. Here, inference refers to, for example, analysis, classification, prediction, and/or summarization.

[0890] In the above embodiment, an example was given in which the specific processing is performed by the data processing device 12, but the technology of this disclosure is not limited thereto, and the specific processing may also be performed by the robot 414.

[0891] Furthermore, the emotion identification model 59, acting as an emotion engine, may determine the user's emotion according to a specific mapping, namely an emotion map (see Figure 9). Similarly, the emotion identification model 59 may also determine the robot's emotion, and the specific processing unit 290 may perform the specific processing using the robot's emotion.

[0892] Figure 9 shows an emotion map 400 in which multiple emotions are mapped. In the emotion map 400, emotions are arranged in concentric circles radiating from the center. The closer to the center of the concentric circles, the more primitive the emotions are located. Further out of the concentric circles, emotions representing states and actions arising from mental states are located. Emotion is a concept that includes feelings and mental states. On the left side of the concentric circles, emotions that are generally generated from reactions occurring in the brain are located. On the right side of the concentric circles, emotions that are generally induced by situational judgment are located. Above and below the concentric circles, emotions that are generally generated from reactions occurring in the brain and induced by situational judgment are located. In addition, the emotion of "pleasure" is located on the upper side of the concentric circles, and the emotion of "displeasure" is located on the lower side. Thus, in the emotion map 400, multiple emotions are mapped based on the structure in which emotions arise, and emotions that are likely to occur simultaneously are mapped close together.

[0893] These emotions are distributed at the 3 o'clock position on the Emotion Map 400, and usually fluctuate between feelings of security and anxiety. In the right half of the Emotion Map 400, situational awareness takes precedence over internal feelings, resulting in a calm impression.

[0894] The inside of the Emotion Map 400 represents inner thoughts, while the outside represents actions. Therefore, the further toward the outside of the Emotion Map 400 an emotion is located, the more visible (expressed in actions) it becomes.

[0895] Here, human emotions are based on various balances, such as posture and blood sugar levels. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. Similarly, in robots, cars, motorcycles, etc., emotions can be created based on various balances, such as posture and battery level. When these balances deviate from the ideal, it results in discomfort, and when they approach the ideal, it results in pleasure. The emotion map can be generated, for example, based on Dr. Mitsuyoshi's emotion map (Research on a system for analyzing brain physiological signals of speech emotion recognition and emotion, Tokushima University, doctoral dissertation: https://ci.nii.ac.jp/naid/500000375379). The left half of the emotion map contains emotions belonging to a region called "response," where sensation is dominant. The right half of the emotion map contains emotions belonging to a region called "situation," where situational awareness is dominant.

[0896] The emotion map defines two emotions that promote learning. One is the emotion around the middle of the negative "repentance" and "reflection" on the situation side. In other words, it is when the robot experiences negative emotions such as "I never want to feel this way again" or "I don't want to be scolded again." The other is the emotion around the positive "desire" on the reaction side. In other words, it is when the robot has positive feelings such as "I want more" or "I want to know more."

[0897] The emotion identification model 59 inputs user input into a pre-trained neural network, obtains emotion values representing each emotion shown in the emotion map 400, and determines the user's emotion. This neural network is pre-trained based on multiple training data sets, which are combinations of user input and emotion values representing each emotion shown in the emotion map 400. Furthermore, this neural network is trained so that emotions located close together have similar values, as shown in the emotion map 900 in Figure 10. Figure 10 shows an example where multiple emotions such as "reassured," "calm," and "confident" have similar emotion values.
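One possible way to obtain training targets in which nearby emotions have similar values is to spread each label over its neighbors on the map according to distance. The sketch below does this with a Gaussian weighting. The 2-D coordinates, the `sigma` width, and the weighting scheme itself are all hypothetical; the disclosure states only that nearby emotions are trained to have similar values, not how.

```python
from math import exp

# Hypothetical 2-D positions of a few emotions on an emotion map.
EMOTION_POSITIONS = {
    "reassured": (1.0, 0.2),
    "calm": (1.1, 0.1),
    "confident": (1.2, 0.3),
    "anxious": (1.0, -1.5),
}


def soft_targets(label, positions, sigma=0.5):
    """Return emotion values that decay with map distance from `label`.

    Values are normalized to sum to 1 so they can serve as a training target.
    """
    cx, cy = positions[label]
    weights = {
        name: exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
        for name, (x, y) in positions.items()
    }
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def dominant_emotion(values):
    """Pick the emotion with the highest emotion value."""
    return max(values, key=values.get)


targets = soft_targets("calm", EMOTION_POSITIONS)
top = dominant_emotion(targets)
```

With these invented coordinates, "reassured," "calm," and "confident" receive similar values while the distant "anxious" receives a value close to zero, mirroring the pattern described for Figure 10.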

[0898] The above description primarily focuses on the functions of the data processing device 12 in relation to this disclosure. However, the system related to this disclosure is not necessarily implemented on a server. The system related to this disclosure may be implemented as a general information processing system. This disclosure may be implemented, for example, as a software program that runs on a personal computer or as an application that runs on a smartphone. The method related to this disclosure may be provided to users in SaaS (Software as a Service) format.

[0899] In the above embodiment, an example was given in which a specific process is performed by a single computer 22. However, the technology of this disclosure is not limited thereto, and a distributed processing of the specific process may be performed by multiple computers, including computer 22. For example, a data generation model 58 may be provided in an external device of the data processing device 12, and the external device may generate data according to the input data.

[0900] In the above embodiment, an example was given in which the specific processing program 56 is stored in the storage 32, but the technology of this disclosure is not limited thereto. For example, the specific processing program 56 may be stored in a portable, computer-readable, non-transitory storage medium such as a USB (Universal Serial Bus) memory. The specific processing program 56 stored in the non-transitory storage medium is installed in the computer 22 of the data processing device 12. The processor 28 executes specific processing according to the specific processing program 56.

[0901] Alternatively, the specific processing program 56 may be stored in a storage device such as a server connected to the data processing device 12 via the network 54, and the specific processing program 56 may be downloaded and installed on the computer 22 in response to a request from the data processing device 12.

[0902] Furthermore, it is not necessary to store the entirety of the specific processing program 56 in a storage device such as a server connected to the data processing device 12 via the network 54, or to store the entirety of the specific processing program 56 in the storage 32; it is acceptable to store only a portion of the specific processing program 56.

[0903] The following types of processors can be used as hardware resources to perform specific processing. Examples of processors include a CPU, a general-purpose processor that functions as a hardware resource to perform specific processing by executing software, i.e., a program. Other examples of processors include dedicated electrical circuits, such as FPGAs (Field-Programmable Gate Arrays), PLDs (Programmable Logic Devices), or ASICs (Application Specific Integrated Circuits), which have circuit configurations specifically designed to perform specific processing. All of these processors have built-in or connected memory, and all of them perform specific processing by using memory.

[0904] The hardware resource that performs a specific process may consist of one of these various processors, or it may consist of a combination of two or more processors of the same or different types (for example, a combination of multiple FPGAs, or a combination of a CPU and an FPGA). Alternatively, the hardware resource that performs a specific process may consist of a single processor.

[0905] Examples of configurations using a single processor include, firstly, a configuration in which one or more CPUs and software are combined to form a single processor, and this processor functions as a hardware resource that performs a specific process. Secondly, there is a configuration using a processor that realizes the functions of the entire system, including multiple hardware resources that perform a specific process, on a single IC chip, as exemplified by SoCs (System-on-a-chip). In this way, a specific process is realized using one or more of the above types of processors as hardware resources.

[0906] Furthermore, as the hardware structure of these various processors, more specifically, an electrical circuit that combines circuit elements such as semiconductor devices can be used. Also, the specific processing described above is merely an example. Therefore, it goes without saying that unnecessary steps can be deleted, new steps added, or the processing order rearranged, as long as doing so does not deviate from the main purpose.

[0907] The descriptions and illustrations presented above are detailed explanations of the technical aspects of this disclosure and are merely examples of the technical aspects. For example, the above descriptions of the structure, function, operation, and effect are examples of the structure, function, operation, and effect of the technical aspects of this disclosure. Therefore, it goes without saying that unnecessary parts may be deleted, new elements may be added, or elements may be replaced in the descriptions and illustrations presented above, as long as doing so does not deviate from the essence of the technical aspects of this disclosure. Furthermore, in order to avoid confusion and facilitate understanding of the technical aspects of this disclosure, explanations of common technical knowledge and the like that do not require special explanation to enable the implementation of the technical aspects of this disclosure have been omitted from the descriptions and illustrations presented above.

[0908] All documents, patent applications, and technical standards described herein are incorporated by reference to the same extent as if each individual document, patent application, and technical standard were specifically and individually noted as being incorporated by reference.

[0909] The following is further disclosed regarding the embodiments described above.

[0910] (Claim 1)

[0911] Means for collecting biodata and clinical trial data,

[0912] A means for analyzing the aforementioned data using a machine learning model to identify new drug candidates,

[0913] A means of performing in silico simulations of identified new drug candidate molecules,

[0914] A means of generating new chemical structures using a generative AI model,

[0915] A means of analyzing related literature to extract useful insights,

[0916] A terminal device for visualizing and displaying the analysis results,

[0917] A means for the user to construct a hypothesis and select candidate molecules using the results described above,

[0918] A system including the above.

[0919] (Claim 2)

[0920] The system according to claim 1, comprising means for preprocessing collected data and removing noise.

[0921] (Claim 3)

[0922] The system according to claim 1, which includes means for feeding experimental results back into the system and updating the AI model.
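
The preprocessing of claim 2 and the feedback of claim 3 can be sketched in minimal form as follows. This is an illustration under stated assumptions, not the disclosed implementation: the median-absolute-deviation outlier rule, the `max_deviations` parameter, and the `CandidateScorer` class (a toy stand-in for the AI model that keeps a running mean per candidate) are all hypothetical.

```python
import statistics


def remove_noise(values, max_deviations=3.0):
    """Drop missing entries and outliers far from the median (MAD rule)."""
    present = [v for v in values if v is not None]
    if not present:
        return []
    median = statistics.median(present)
    mad = statistics.median(abs(v - median) for v in present)
    if mad == 0:
        # Spread cannot be estimated; keep all non-missing values.
        return present
    return [v for v in present if abs(v - median) <= max_deviations * mad]


class CandidateScorer:
    """Toy stand-in for the AI model: a running mean score per candidate."""

    def __init__(self):
        self._sums = {}
        self._counts = {}

    def update(self, candidate, measured_activity):
        """Feed one experimental result back into the model."""
        self._sums[candidate] = self._sums.get(candidate, 0.0) + measured_activity
        self._counts[candidate] = self._counts.get(candidate, 0) + 1

    def score(self, candidate):
        """Return the current score, or None if no result has been fed back."""
        if candidate not in self._counts:
            return None
        return self._sums[candidate] / self._counts[candidate]
```

A real system would replace the running mean with retraining or fine-tuning of the machine learning model; the sketch only shows the direction of data flow from experiment back to model.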

[0923] "Example 1"

[0924] (Claim 1)

[0925] Means for collecting and storing information,

[0926] A means for analyzing the aforementioned information using a machine learning algorithm to identify potential new products,

[0927] A means of performing a virtual simulation of the identified new product candidates,

[0928] A means of generating a new structure using a generative AI algorithm,

[0929] A means of analyzing related books and extracting useful insights,

[0930] A means of providing a user interface for visualizing and displaying analysis results,

[0931] A means by which the user constructs a hypothesis using the results and selects candidate products,

[0932] A system including the above.

[0933] (Claim 2)

[0934] The system according to claim 1, comprising means for preprocessing acquired information and removing unnecessary data.

[0935] (Claim 3)

[0936] The system according to claim 1, comprising means for re-entering the test results into the system and updating the artificial intelligence algorithm.

[0937] "Application Example 1"

[0938] (Claim 1)

[0939] Means for collecting biometric information and clinical trial information,

[0940] A means for analyzing the aforementioned information using a machine learning algorithm to identify new drug candidates,

[0941] A means of performing simulations of identified new drug candidates in a virtual space,

[0942] A means for generating new chemical structures using generative AI models,

[0943] A means of analyzing related materials and extracting useful insights,

[0944] An output device means for visualizing and displaying the analysis results,

[0945] A means for the user to construct a logical hypothesis using the results and select candidate substances,

[0946] A means of generating and recommending customized content based on user interests,

[0947] A system including the above.

[0948] (Claim 2)

[0949] The system according to claim 1, comprising means for preprocessing collected information and removing unnecessary parts.

[0950] (Claim 3)

[0951] The system according to claim 1, comprising means for feeding experimental results back into the system and updating the artificial intelligence model.

[0952] "Example 2 of combining an emotion engine"

[0953] (Claim 1)

[0954] Means for collecting biological information and test data,

[0955] A means for analyzing the aforementioned data using an artificial intelligence model to identify new drug candidates,

[0956] A means of performing virtual simulations of identified new drug candidates,

[0957] A means of generating new chemical structures using a generative artificial intelligence model,

[0958] A means of analyzing relevant literature and extracting useful insights,

[0959] A computer means for visualizing and displaying the analysis results,

[0960] A means for analyzing emotional data received from a computer and dynamically adjusting the system's response,

[0961] A means for the user to construct a hypothesis and select candidates using the aforementioned analysis results,

[0962] A system including the above.

[0963] (Claim 2)

[0964] The system according to claim 1, comprising means for preprocessing collected data and removing noise.

[0965] (Claim 3)

[0966] The system according to claim 1, comprising means for feeding test results back into the system and updating the generative artificial intelligence model.
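
The emotion-based adjustment recited in [0960] might look like the sketch below, which changes how an analysis result is presented according to the user's dominant emotion. The emotion names, the `threshold` value, and the wording rules are purely hypothetical; the disclosure does not state how the response is adjusted.

```python
def adjust_response(summary, emotion_values, threshold=0.6):
    """Adapt the presentation of an analysis result to the dominant emotion."""
    emotion, value = max(emotion_values.items(), key=lambda item: item[1])
    if value < threshold:
        # No emotion is strong enough: present the result unchanged.
        return summary
    if emotion == "anxious":
        # Hypothetical rule: lead with the conclusion for an anxious user.
        return "Key point first: " + summary
    if emotion == "confident":
        # Hypothetical rule: offer more depth to a confident user.
        return summary + " (full details available on request)"
    return summary
```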

[0967] "Application example 2 when combining with an emotional engine"

[0968] (Claim 1)

[0969] Means for collecting biodata and clinical trial data,

[0970] A means for analyzing the aforementioned data using a machine learning model to identify new drug candidates,

[0971] A means of performing computer simulations of identified new drug candidate molecules,

[0972] A means of generating new chemical structures using a generative AI model,

[0973] A means of analyzing related literature to extract useful insights,

[0974] A terminal device for visualizing and displaying the analysis results,

[0975] A means for the user to construct a hypothesis and select candidate molecules using the results described above,

[0976] A means of recognizing emotions and analyzing the user's psychological state,

[0977] A means of supporting users' daily activities by applying emotion recognition data to home appliances,

[0978] A system including the above.

[0979] (Claim 2)

[0980] The system according to claim 1, comprising means for preprocessing collected data and removing noise.

[0981] (Claim 3)

[0982] The system according to claim 1, comprising means for dynamically adjusting the analysis results based on the user's emotional state and presenting information to support the user's choices.

[Explanation of Symbols]

[0983]
10, 210, 310, 410 Data processing systems
12 Data processing device
14 Smart device
214 Smart glasses
314 Headset-type terminal
414 Robot

Claims

1. Means for collecting biometric information and clinical trial information,
A means for analyzing the aforementioned information using a machine learning algorithm to identify new drug candidates,
A means of performing simulations of identified new drug candidates in a virtual space,
A means for generating new chemical structures using generative AI models,
A means of analyzing related materials and extracting useful insights,
An output device means for visualizing and displaying the analysis results,
A means for the user to construct a logical hypothesis using the results and select candidate substances,
A means of generating and recommending customized content based on user interests,
A system including the above.

2. The system according to claim 1, comprising means for preprocessing collected information and removing unnecessary parts.

3. The system according to claim 1, which includes means for feeding experimental results back into the system and updating the artificial intelligence model.