Generating chemical data associated with a target material
A data-driven model combining measured and historical data efficiently generates chemical data for target materials, enhancing production monitoring and control.
Patent Information
- Authority / Receiving Office: WO · WO
- Patent Type: Applications
- Current Assignee / Owner: BASF SE
- Filing Date: 2025-12-16
- Publication Date: 2026-07-02
Abstract
Description
[0001] 231252
[0002] GENERATING CHEMICAL DATA ASSOCIATED WITH A TARGET MATERIAL
[0003] FIELD OF THE INVENTION
[0004] The present disclosure relates to co-pilot systems, methods and computer programs for generating chemical data associated with a target material, and to a use of any of the foregoing for monitoring and / or controlling a production of a target material.
[0005] BACKGROUND OF THE INVENTION
[0006] It is known that analytical data can be analyzed by using neural networks built for receiving defined analytical data and providing defined analytical results. With this approach, considerable resources are required to build a variety of specific tools that provide further insights into analytical data.
[0007] SUMMARY OF THE INVENTION
[0008] One object addressed by the present disclosure is to provide more insightful chemical data associated with a target material in a resource-efficient manner.
[0009] In a first aspect, a method for generating chemical data associated with a target material is disclosed, the method comprising i) receiving measured experimental data associated with a target material and indicative of a property of the target material, ii) receiving historical experimental data associated with the target material and indicative of a historical property of the target material, iii) providing task instructions for determining chemical data to a data-driven model configured to follow task instructions, the task instructions provided to the data-driven model comprising the measured experimental data and the historical experimental data, and iv) providing the chemical data. It will be understood that the method can particularly be a computer-implemented method.
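The four steps above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the function names, the JSON serialization of the experimental data, and the `echo_model` stand-in for the data-driven model are all assumptions introduced here.

```python
import json
from typing import Callable


def build_task_instructions(measured: dict, historical: dict) -> str:
    """Embed both kinds of experimental data in one instruction string."""
    return (
        "Determine chemical data for the target material.\n"
        f"Measured experimental data: {json.dumps(measured, sort_keys=True)}\n"
        f"Historical experimental data: {json.dumps(historical, sort_keys=True)}"
    )


def generate_chemical_data(
    measured: dict, historical: dict, model: Callable[[str], str]
) -> str:
    """Steps iii) and iv): instruct the model, then provide its output."""
    instructions = build_task_instructions(measured, historical)
    return model(instructions)


def echo_model(instructions: str) -> str:
    """Hypothetical stand-in for a data-driven model that follows instructions."""
    return f"received {len(instructions.splitlines())} instruction lines"


# Steps i) and ii): the two data sets are assumed to have been received already.
result = generate_chemical_data({"type": "uv_vis"}, {"type": "ir"}, echo_model)
```

In practice the `model` argument would be replaced by a call to an instruction-following model; the stand-in only shows where the two data sets enter the task instructions.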
[0010] By instructing a data-driven model with task instructions for determining the chemical data, wherein the task instructions comprise both measured and historical experimental data, chemical data providing further insights into the target material can be generated in a resource-efficient manner. As an example, the measured experimental data may comprise a spectrum obtained by UV/Vis spectrophotometry and the historical experimental data may comprise a spectrum obtained by infrared spectroscopy. The historical experimental data may be suitable for deriving a qualitative composition of a target chemical and the measured experimental data may be suitable for determining a quantitative composition of the target chemical product. Hence, the chemical data may comprise a qualitative and a quantitative composition. According to another example, the measured experimental data may comprise chromatographic data, and the historical experimental data may comprise nuclear magnetic resonance data. The chromatographic data may be suitable for determining a presence of one or more different chemicals, wherein a target material may comprise two or more different chemicals. The nuclear magnetic resonance data may be suitable for determining functional groups associated with the target material. Hence, combining the historical experimental data and the measured experimental data may result in identifying the two or more different chemicals. Thus, the chemical data may be indicative of two or more different chemicals of a target chemical.
[0013] The receiving of the measured and the historical experimental data could also be regarded as an obtaining of the measured and the historical experimental data. The measured experimental data may be indicative of one or more properties of the target material. The one or more properties indicated by the measured experimental data may be biological properties, physical properties and / or chemical properties. Similarly, the historical property indicated by the historical experimental data may be a biological property, a physical property and / or a chemical property.
[0014] The chemical data could be provided for monitoring and / or controlling a production of materials, particularly a target material. Thus, the chemical data could be data for monitoring and / or controlling producing and / or processing a target material based on chemical data associated with a target material. Accordingly, also a method for monitoring and / or controlling a production of a material, particularly a target material, is presented, wherein the method includes a step of receiving measured experimental data associated with the material and indicative of a property of the material, a step of receiving historical experimental data associated with the target material and indicative of a historical property of the target material, and a step of providing task instructions for determining chemical data to a data-driven model configured to follow task instructions, the task instructions provided to the data-driven model comprising the measured experimental data and the historical experimental data. It will be understood that also this method can particularly be computer-implemented.
[0015] The chemical data could be provided via a user interface, for instance. In this case, based on the chemical data, a user may initiate monitoring and / or control measures for a production of the target material. It is also possible that the chemical data is provided via a respective interface to a monitoring and / or control apparatus configured to carry out a monitoring and / or controlling of a production of the target material based on the chemical data.
[0016] In an embodiment, the measured experimental data may be obtained after having obtained the historical experimental data. Preferably, the historical property associated with the historical experimental data may be obtained prior to obtaining the measured experimental data and / or prior to determining the chemical data associated with the measured experimental data. In other words, the historical experimental data may be obtained before the measured experimental data are obtained.
[0017] The task instruction for determining the chemical data may further comprise the historical property of the target material. The historical experimental data may further comprise and / or may further be associated with the historical property of the target material.
[0018] The measured experimental data may be received via a user interface. This allows a high degree of user control. In particular, the receiving of the measured experimental data may refer to receiving a user request including the measured experimental data and / or include receiving a user request including an indication of the measured experimental data, wherein the indication is suitable for obtaining the measured experimental data. The user request may be a user request for obtaining chemical data. Based on an indication of measured experimental data, wherein the indication is received as part of a user request, the measured experimental data may be retrieved from a database. Thus, in an embodiment, the measured experimental data are retrieved from a database, preferably upon receiving an indication of the measured experimental data, wherein the indication may be received via a user interface, and may particularly be included in a user request for obtaining chemical data, for instance.
[0019] Additionally or alternatively, the measured experimental data may, in an embodiment, be received via an interface to a measurement device. This can make it simpler for users to have the chemical data provided. A combination of both options, in which the measured experimental data could be received, for instance, based on an interaction of a user with the user interface and also a data transfer via the interface to the measurement device, can be a good compromise between user control and usability. The two options could also correspond to two separate modes of receiving the measured experimental data, wherein in each mode the measured experimental data would only be received according to one of the options.
[0020] The measurement device may be a sensor. The sensor may be installed in an environment for performing a chemical reaction associated with the measured experimental data. Measured experimental data may be suitable for determining a property of a material, which may particularly be a chemical. Measured experimental data may be indicative of the property of the material. In case the material is a chemical, the measured experimental data may be recorded while synthesizing the chemical from one or more educts. Moreover, in that case the measured experimental data may comprise analytical data associated with the chemical. The measured experimental data may comprise at least one of numerical data, in particular tabular data, string data, image data or a combination thereof. The measured experimental data may be indicative of an interaction of a material, particularly a chemical, with a probe. Probes can be electromagnetic radiation, particles such as electrons, neutrons or the like.
[0023] In an embodiment, the historical experimental data may be received from a database comprising historical experimental data associated with a plurality of materials. Without such a database, also the historical experimental data might need to be provided by the user, which could render the providing of the chemical data laborious.
[0024] For instance, the historical experimental data may be received from the database by providing a retrieval request to the database. The retrieval request may indicate the historical experimental data which shall be retrieved from the database, which could also be referred to as target historical experimental data. The request may be obtained from the target historical experimental data. The request may also be obtained from the measured experimental data. For instance, historical experimental data stored in the database may be selected by determining a similarity score indicative of a distance between the historical experimental data and the measured experimental data, in particular a representation of the measured experimental data and the historical experimental data. The representation of the measured experimental data and the historical experimental data may be a numerical representation such as a tensor, in particular a first-rank tensor. Determining a distance may comprise determining a Euclidean and / or cosine distance of the representation of the measured experimental data and the historical experimental data.
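The distance-based selection described above can be illustrated with a short sketch. It assumes that the measured and the historical experimental data have already been mapped to non-zero first-rank tensors (plain lists of floats here); the function names and the in-memory dictionary standing in for the database are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity); assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve_closest(query, database, distance=cosine_distance):
    """Return the key of the stored representation closest to the query."""
    return min(database, key=lambda key: distance(query, database[key]))


# Hypothetical stored representations of historical experimental data.
historical = {
    "ir_batch_7": [1.0, 0.0, 0.0],
    "nmr_batch_3": [0.0, 1.0, 0.0],
}
closest = retrieve_closest([0.9, 0.1, 0.0], historical)
```

A smaller distance corresponds to a higher similarity, so the entry with the minimum distance is the one selected. Passing `distance=euclidean_distance` switches the metric without changing the retrieval logic.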
[0025] Thus, in an embodiment, the historical experimental data may satisfy a predefined relation with respect to the measured experimental data. The predefined relation may be formulated in terms of a similarity score as indicated above. However, the predefined relation may also be chosen in a different way. For instance, the historical experimental data and the measured experimental data may be required, based on the predefined relation, to match each other regarding an acquisition modality. On the other hand, the historical experimental data and the measured experimental data may also be required to not match each other in at least one regard. In particular, the historical experimental data may be required to be indicative of at least one property of the target material that is different from properties of which the measured experimental data are indicative. This at least one property may be referred to herein as the historical property.
[0026] The historical experimental data and the measured experimental data may be obtained by sensors installed in an environment for analyzing one or more target materials associated with the historical experimental data and the measured experimental data. The historical experimental data may have been obtained before the measured experimental data.
[0028] The historical experimental data are not necessarily measured data, particularly not necessarily sensor data. In an embodiment, the historical experimental data may include data indicating a functional relationship satisfied by a property of the target material. The functional relationship may indicate a dependence of the property on an experimental context, wherein the experimental context may be expressed in terms of one or more physical, biological and / or chemical parameters, for instance. The historical property may in that case refer to the property whose dependence on the parameters is indicated by the functional relationship, or to the dependence. The functional relationship may be encoded in text form, one or more equations and / or one or more graphical representations, such as a graph of a function or a diagram showing the functional relationship. Thus, particularly the historical experimental data may not only include numerical data, but also text data, symbolic data and / or image data. Symbolic data may refer to text data, to image data or to a data form of its own kind. Equations comprising variables, such as for indicating the historical property and parameters expressing an experimental context on which it depends, may be provided as at least partially symbolic data, since symbols may be used in the equations to indicate the variables and their relation with respect to each other.
[0029] Functional relationships can efficiently encode information which might otherwise only be expressible by relatively large amounts of data. Therefore, if the historical experimental data include data indicating a functional relationship satisfied by the historical property of the target material, less historical experimental data may need to be received for the same amount of additional information which can be extracted from the received historical experimental data. Conversely, for a given amount of historical experimental data received, more additional information can be extracted from it, and thus more useful chemical data can be generated. In other words, more useful chemical data associated with the target material may be generated more efficiently.
[0030] According to an illustrative example, the measured experimental data may indicate a conductivity of the target material at a certain temperature T. The historical experimental data may correspond to an equation, graph or diagram indicating how the conductivity of the target material depends on temperature for different chemical and / or physical compositions of the target material. Based on these two kinds of experimental data, the conductivity of the target material at a temperature different from T may be determined, and / or it may be determined that the target material should have a different chemical and / or physical composition in order to achieve a desired conductivity at temperature T or a desired temperature dependence of the conductivity. Both would be understood as chemical data herein, and may be obtained by instructing a data-driven model and / or one or more analysis engines accordingly. A production and / or processing of the target material may then be changed based on such chemical data, such as in a manner affecting the chemical and / or physical composition of the target material.
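The conductivity example can be made concrete with a small sketch. The disclosure does not specify the functional relationship; purely for illustration, an Arrhenius-type dependence sigma(T) = sigma_0 * exp(-E_a / (k_B * T)) is assumed here, with the activation energy playing the role of the historical experimental data and a single conductivity value at temperature T playing the role of the measured experimental data. All names and numbers are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def extrapolate_conductivity(sigma_measured, t_measured, t_target, activation_energy_ev):
    """Conductivity at t_target from one measurement at t_measured.

    Assumes sigma(T) = sigma_0 * exp(-E_a / (k_B * T)), so the unknown
    prefactor sigma_0 cancels in the ratio sigma(t_target) / sigma(t_measured).
    Temperatures are in kelvin, the activation energy in electronvolts.
    """
    exponent = -activation_energy_ev / BOLTZMANN_EV_PER_K * (
        1.0 / t_target - 1.0 / t_measured
    )
    return sigma_measured * math.exp(exponent)


# Hypothetical values: 2.0 S/m measured at 300 K, E_a = 0.3 eV from historical data.
sigma_at_350 = extrapolate_conductivity(2.0, 300.0, 350.0, 0.3)
```

The sketch shows why a functional relationship is an efficient form of historical data: one parameter and one measurement are enough to estimate the conductivity at any other temperature, provided the assumed relationship actually holds for the material.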
[0032] In an embodiment, the chemical data can be determined by determining at least one of a) the property of the target material of which the measured experimental data are indicative and b) the historical property, and determining the chemical data based on the determined at least one property. Thus, the chemical data can be determined based on the property of the target material indicated by the measured experimental data, and / or on the historical property, which is indicated by the historical experimental data. This can be preferred, since the determination of the respective property can function as an intermediate step guiding the determination of the chemical data by the data-driven model. For instance, the property indicated by the measured experimental data may be determined based on the measured experimental data and used for determining the chemical data. Likewise, the historical property may be determined based on the historical experimental data and used for determining the chemical data.
[0033] For determining the property indicated by the measured experimental data and / or for determining the historical property based on the respective experimental data, one or more analysis engines may be used. Thus, the analysis of the respective experimental data with respect to the respective property of the target material does not have to be carried out by the data-driven model itself. Instead, existing analysis engines can be used, which can save resources. In particular, it is not necessary that a new data-driven model is established for each type of measured experimental data or historical experimental data.
[0034] In an embodiment, the task instructions may further comprise selection task instructions for selecting an analysis engine configured to determine the at least one property based on the respective experimental data, which are indicative of the at least one property, wherein the selection task instructions comprise one or more indications of a) a plurality of candidate engines including the analysis engine and b) one or more functions associated with the plurality of candidate engines and the respective experimental data. In this embodiment, the method may further comprise i) providing the task instructions to the data-driven model for selecting the analysis engine, and ii) using the analysis engine for determining the at least one property.
[0035] Selecting, from a plurality of candidates, an analysis engine suitable for determining the at least one property from the respective experimental data can be a challenging task. A data-driven model can accomplish this task in an efficient manner. Moreover, the selection can be very precise if the model is instructed using instructions indicating functions associated with the plurality of candidate engines and the respective experimental data.
[0037] The candidate engines indicated by the selection task instructions may refer to available engines. The one or more functions indicated by the selection task instructions may refer to a respective capability of the candidate engines regarding an analysis of the respective experimental data with respect to the at least one property. For instance, the selection task instructions may indicate, for each of a plurality of available analysis engines, what types of experimental data the analysis engine is configured to analyse. A type of the measured experimental data and / or of the historical experimental data may be indicated by the task instructions. For instance, the task instructions may comprise metadata associated with the respective type of the experimental data. Similarly, the selection task instructions may indicate the plurality of candidate engines in terms of a representation comprising metadata associated with the type of experimental data to be received by a respective engine.
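One way to picture the selection task instructions is as a text listing each candidate engine together with metadata on the data types it accepts. The sketch below is an assumption-laden illustration: the `CandidateEngine` fields, the engine names, and the instruction wording are hypothetical, and `rule_based_selector` is only a stand-in for the selection that the data-driven model would make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateEngine:
    """Metadata describing one available analysis engine (hypothetical fields)."""
    name: str
    accepted_data_types: tuple
    determined_property: str


def build_selection_instructions(candidates, data_type):
    """List the candidate engines and their functions for the model."""
    lines = [
        "Select the analysis engine suited to the experimental data.",
        f"Experimental data type: {data_type}",
        "Candidate engines:",
    ]
    for engine in candidates:
        accepted = ", ".join(engine.accepted_data_types)
        lines.append(
            f"- {engine.name}: accepts [{accepted}]; "
            f"determines {engine.determined_property}"
        )
    return "\n".join(lines)


def rule_based_selector(candidates, data_type):
    """Stand-in for the model: pick the first engine accepting the data type."""
    for engine in candidates:
        if data_type in engine.accepted_data_types:
            return engine.name
    return None


candidates = [
    CandidateEngine("peak_integrator", ("chromatogram",), "quantitative composition"),
    CandidateEngine("group_assigner", ("nmr_spectrum",), "functional groups"),
]
instructions = build_selection_instructions(candidates, "chromatogram")
selected = rule_based_selector(candidates, "chromatogram")
```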
[0038] In an embodiment, the analysis engine is used for determining the at least one property by providing the respective experimental data to the analysis engine, and receiving the at least one property in response. In particular, it may not be necessary to provide the analysis engine with further instructions, since it may be specifically adapted for analysing experimental data of the respective type with respect to the at least one property. Instead, if selected correctly, the only input that might be needed by the analysis engine may be the respective experimental data. A respective analysis engine may be an application-specific data-driven model such as a classification model, or may be a mechanistic model comprising one or more equations for relating respective experimental data to the at least one property of the target material.
[0039] The analysis engine may require a predefined data structure associated with the experimental data. In an embodiment, the respective experimental data may therefore be structured according to an input structure required by the analysis engine, wherein the respective structured experimental data are provided to the analysis engine. For instance, it can be preferred that the experimental data is provided, preferably as part of task instructions to transform the experimental data into a required structure, to a structure data-driven model together with a) structure requirements associated with the selected analysis engine or b) a plurality of structure requirements associated with the plurality of (available) analysis engines and preferably also the selection of an analysis engine provided as output by the data-driven model instructed with the selection task instructions. The structure data-driven model, which may be configured to follow task instructions, may generate experimental data associated with the predefined data structure required by the selected analysis engine. The transformed experimental data generated by the structure data-driven model may be referred to as structured experimental data. Further, the structured experimental data may be provided from the structure data-driven model to a structure validation unit. The structure validation unit may determine if the structured experimental data may be suitable for being received by the selected analysis engine. In response to determining that the structured experimental data may be suitable for being received by the selected analysis engine, the structured experimental data may be provided to the selected analysis engine.
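The role of the structure validation unit can be sketched as a check of the structured experimental data against the input structure required by the selected analysis engine. The representation of the structure requirements as a mapping from field names to Python types, and the field names themselves, are assumptions made for this illustration.

```python
def validate_structure(structured_data, required_fields):
    """Return a list of problems; an empty list means the data are suitable."""
    problems = []
    for field, expected_type in required_fields.items():
        if field not in structured_data:
            problems.append(f"missing field: {field}")
        elif not isinstance(structured_data[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def forward_if_valid(structured_data, required_fields, engine):
    """Pass the data to the engine only if the validation finds no problems."""
    problems = validate_structure(structured_data, required_fields)
    if problems:
        raise ValueError("; ".join(problems))
    return engine(structured_data)


# Hypothetical input structure of a UV/Vis analysis engine.
requirements = {"wavelength_nm": list, "absorbance": list}
spectrum = {"wavelength_nm": [400, 450, 500], "absorbance": [0.2, 0.8, 0.5]}

# Stand-in engine returning the maximum absorbance.
peak = forward_if_valid(spectrum, requirements, lambda data: max(data["absorbance"]))
```

If the validation fails, the reported problems could be fed back to the structure data-driven model for another transformation attempt; the disclosure leaves this handling open.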
[0042] Alternatively, the selection task instructions may also indicate structure requirements associated with the plurality of analysis engines. In this case, the data-driven model being instructed with the selection task instructions, which could also be referred to as a selection data-driven model, may select the analysis engine in accordance with the structure requirements, such that no transformation of the respective experimental data may be necessary. It is also possible that the selection data-driven model provides, along with an output indicating the selected analysis engine, structured experimental data, i.e., the respective experimental data after transformation in accordance with the structure requirements.
[0043] While the data-driven model may be configured to output a selection of an analysis engine to be used for determining the chemical data, the data-driven model may also be configured to output the chemical data itself. Thus, separate analysis engines may not be needed. Still, it may be preferred that the chemical data are determined based on at least one of the properties of the target material of which the experimental data are indicative. Accordingly, in an embodiment, the task instructions may further comprise analysis instructions instructing the data-driven model to determine the at least one property based on the respective experimental data, which are indicative of the at least one property.
[0044] The one or more analysis engines may be configured to perform symbolic and / or algebraic manipulations of received input. In particular, the one or more analysis engines may include one or more computer algebra systems. Analysis engines of this type may be particularly suitable for analysing historical experimental data indicating functional relationships. Thus, for instance, for analysing historical experimental data indicating a functional relationship in terms of one or more equations, an analysis engine configured to perform symbolic and / or algebraic manipulations may be selected. A data-driven model may be trained and / or instructed accordingly, i.e., to make analysis engine selections of this kind.
[0045] In an embodiment, the at least one property is used for determining the chemical data by instructing the data-driven model to determine the chemical data based on the at least one property. Hence, irrespective of whether the at least one property is determined by an analysis engine or the data-driven model itself, it may subsequently be used by the data-driven model for determining the chemical data. The determined at least one property as such might not yet be suitable to be provided, for instance, as output to a user or a monitoring and / or control apparatus. This may be due, for instance, to an output format in which the analysis engine or the data-driven model outputs the at least one property. By use of the data-driven model, the determined property can be efficiently converted into chemical data that are suitable to be provided, for instance, as output to a user or a monitoring and / or control apparatus.
[0048] In an embodiment, the chemical data may comprise the property associated with and / or indicated by the experimental data.
[0049] It can be preferred that the chemical data are determined based on both the property of which the measured experimental data are indicative, and the historical property. Thus, in the above, "the at least one property" can refer to both the property of which the measured experimental data are indicative and the historical property. According to one option, both properties are determined by use of one or more analysis engines. According to another option, both properties are determined by the data-driven model. According to yet other options, any one of the two properties is determined by use of one or more analysis engines, and the other of the two properties is determined by use of the data-driven model. If both properties are determined by use of one or more analysis engines, at least two analysis engines may be selected for doing so by the data-driven model. It can be preferred that a first and a second analysis engine are selected, wherein the first analysis engine may be configured to analyse the measured experimental data with respect to the property of which the measured experimental data are indicative, and the second analysis engine may be configured to analyse the received historical experimental data with respect to the historical property. On the other hand, it is also possible that a single analysis engine is selected which is configured to analyse both the measured experimental data and the historical experimental data.
[0050] It is also possible that none of the properties for which the experimental data are indicative is determined as a basis for determining the chemical data. If one or both of the properties for which the experimental data are indicative are not used as a basis for determining the chemical data, the respective experimental data may be used directly by the data-driven model to determine the chemical data. Whether the data-driven model uses the respective experimental data first for determining the respective property of the target material in order to determine the chemical data based thereon, or whether it uses the experimental data directly for doing so, may be indicated by the task instructions.
[0051] The data-driven model may be configured for receiving task instructions comprising at least two of numerical data, in particular tabular data, string data or image data. The data-driven model may be configured for mapping the task instructions to a machine-processable representation thereof. The machine-processable representation may be a numerical representation such as a tensor, in particular a first-rank tensor. The data-driven model may be trained for mapping the machine-processable representation of the task instructions to the at least one property of the target material and / or to the chemical data. The training may include supervised training, wherein training experimental data associated with properties of materials, particularly chemicals, and / or associated chemical data may be provided, preferably together with corresponding task instructions, to the data-driven model for adapting the parameters of the data-driven model to reduce the deviation between the properties and / or chemical data determined by the data-driven model from the training experimental data and the properties and / or associated chemical data associated with the training experimental data. For further optional details regarding the training, reference is made to the article "VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text" by H. Akbari et al., 35th Conference on Neural Information Processing Systems (2021), https://arxiv.org/pdf/2104.11178v3.pdf, which is herewith incorporated herein by reference in its entirety.
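The supervised training described above, in which parameters are adapted to reduce the deviation between determined and reference values, can be written as a generic empirical risk minimization. The notation is an assumption introduced here, not taken from the disclosure: f_theta denotes the data-driven model with parameters theta, x_i the machine-processable representation of the i-th task instructions with training experimental data, y_i the associated reference property or chemical data, and ell an unspecified loss measuring the deviation.

```latex
\theta^{*} = \arg\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_{\theta}(x_i),\, y_i\bigr)
```

The disclosure does not fix the choice of loss; a squared error for numerical properties or a cross-entropy for categorical outputs would be typical options.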
[0054] In a further aspect, a system for generating chemical data associated with a target material is disclosed, the system comprising a) a measured experimental data receiving unit for receiving measured experimental data associated with a target material and indicative of a property of the target material, b) a historical experimental data receiving unit for receiving historical experimental data associated with the target material and indicative of a historical property of the target material, c) a model instructor for providing task instructions for determining chemical data to a data-driven model configured to follow task instructions, the task instructions provided to the data-driven model comprising the measured experimental data and the historical experimental data, and d) a data providing unit for providing the chemical data.
[0055] Furthermore, an aspect disclosed herein relates to a use of chemical data associated with a target material for monitoring and / or controlling a production of materials, wherein the chemical data have been generated according to a method as defined above and / or with a system as defined above.
[0056] Moreover, the present disclosure also relates to a method for monitoring and / or controlling producing and / or processing a target material based on chemical data associated with a target material.
[0057] The methods presented herein may particularly be computer-implemented. Accordingly, further aspects disclosed herein relate to one or more computer programs comprising instructions which, when executed by a computer, cause the computer to carry out the respective method.
[0059] It shall be understood that the method of claim 1, the system of claim 14 and the use according to claim 15 have similar and / or identical preferred embodiments as defined in the dependent claims.
[0060] It shall be understood that a preferred embodiment can also be any combination of the dependent claims with the respective independent claim.
[0061] These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
[0062] In the following, the present disclosure is further described with reference to the enclosed figures. The same reference numbers in the drawings and this disclosure are intended to refer to the same or like elements, components, and / or parts.
BRIEF DESCRIPTION OF THE DRAWINGS
[0063] FIG. 1 illustrates an embodiment of a method for generating chemical data associated with a target material.
[0064] FIG. 2 illustrates an embodiment of a system for generating chemical data associated with the target material.
[0065] FIG. 3 illustrates an embodiment of a training of an embedding layer.
[0066] FIG. 4 illustrates an embodiment of a transformer encoder architecture.
[0067] FIG. 5 illustrates an embodiment of a transformer decoder architecture.
[0068] FIG. 6 illustrates an embodiment of a transformer encoder-decoder architecture.
[0069] FIG. 7 illustrates an embodiment of training and / or deploying the transformer encoder, the transformer decoder and / or the transformer encoder-decoder.
[0070] FIG. 8 illustrates an embodiment of input embedding.
[0071] FIG. 9 illustrates a further embodiment of input embedding.
[0072] DETAILED DESCRIPTION OF EMBODIMENTS
[0073] The following embodiments are mere examples for implementing the method, system and use disclosed herein and shall not be considered limiting.
[0074] FIG. 1 shows schematically and exemplarily a computer-implemented method for generating chemical data associated with a target material. FIG. 2 shows schematically and exemplarily a system for carrying out the method.
[0075] In step 10 of the method, measured experimental data associated with the target material are received. Step 10 is carried out by a measured experimental data receiving unit 1. As seen in FIG. 2, for instance, the measured experimental data may be input by a user via a user interface, and / or received via an interface to a measurement device with which the measured experimental data have been acquired. The measured experimental data are indicative of a property of the target material.
[0076] In step 20, historical experimental data associated with the target material are received by a historical experimental data receiving unit 2. The historical experimental data are indicative of a historical property of the target material. The historical property may or may not be the property of which the measured experimental data are indicative. As seen in FIG. 2, the historical experimental data may be received from a database comprising historical experimental data associated with a plurality of materials, one of which may be the target material.
[0077] For instance, the historical experimental data may be received by having the receiving unit 2 send a retrieval request to the database. If the historical experimental data should have a particular relation with respect to the measured experimental data, this relation may be specified by the retrieval request. For instance, the historical experimental data may be required to be similar to the measured experimental data, wherein the similarity between data in the database and the measured experimental data may be measured according to a predefined similarity score. The similarity score may be defined in a predefined numerical representation space for experimental data, into which the measured experimental data and the experimental data from the database may be mapped for determining the similarity score. In response to the retrieval request, the historical experimental data may be provided from the database to the receiving unit 2.
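The similarity-based retrieval described above can be illustrated with a minimal sketch. It assumes that the measured and historical experimental data have already been mapped into a common numerical representation space and that cosine similarity serves as the predefined similarity score; the function names, record identifiers and vectors are hypothetical and not taken from the disclosure.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_similar(query_vec, database, top_k=2, min_score=0.0):
    """Return up to top_k (record_id, score) pairs, most similar first."""
    scored = [(rid, cosine_similarity(query_vec, vec)) for rid, vec in database.items()]
    scored = [item for item in scored if item[1] >= min_score]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical embedded historical records and one embedded measurement.
historical = {
    "batch_A": [0.9, 0.1, 0.0],
    "batch_B": [0.0, 1.0, 0.0],
    "batch_C": [0.8, 0.2, 0.1],
}
measured = [1.0, 0.1, 0.0]
hits = retrieve_similar(measured, historical, top_k=2)
```

In this sketch the retrieval request corresponds to the call to `retrieve_similar`, and the required relation between historical and measured data is expressed through `top_k` and `min_score`.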
[0078] The measured experimental data and the historical experimental data may be of different types and may thus complement each other, thereby allowing for chemical data to be generated that could not be generated based on either of the two types of experimental data alone. For instance, the experimental data may refer to spectral data acquired with different spectroscopic techniques, or to image data acquired with different imaging modalities.
[0079] In a step 25, it may be decided how the chemical data associated with the target material should be generated, wherein the options include at least a first option a) and a second option b). The decision between the available options may be taken by a user in terms of a corresponding user input provided to the system, for instance. In other embodiments, the decision may be predefined. The two options are not mutually exclusive. For instance, the received experimental data may be partially processed according to the first option, and partially according to the second option. Both the first and the second option assume that the chemical data are determined by determining a property of the target material from the experimental data, and determining the chemical data based on the property. For instance, the property may be inherently linked to the respective type of experimental data. Examples of this would be a qualitative or quantitative composition of a target material derivable from spectroscopic data, a presence of one or more chemicals derivable from chromatographic data, or functional groups of the target material derivable from magnetic resonance images. However, in other embodiments, the intermediate step of determining a property of the target material could also be omitted.
[0080] In a step 30, a first data-driven model is instructed by a model instructor 3 with first task instructions as far as option a) has been taken, and a second data-driven model is instructed by the model instructor 3 with second task instructions as far as option b) has been taken. In the following it will be assumed that the first data-driven model and the second data-driven model are different models, although in principle the models could also be the same. Thus, it is also possible that differences in subsequent steps only arise from differences between the first task instructions and the second task instructions. The first task instructions and the second task instructions may differ in which part of the experimental data they comprise. However, it is also possible that the first task instructions and the second task instructions both comprise the measured experimental data and the historical experimental data. In both cases, the first task instructions and the second task instructions may differ from each other in further instruction parts, which may be added by the model instructor 3 to the experimental data. Both the first data-driven model and the second data-driven model are preferentially configured to follow general task instructions, which may be provided to the respective model as multi-modal input comprising, for instance, both text and numerical data. The first and the second data-driven model could be, for instance, large language models (LLMs). More generally, they could be any generative artificial intelligence.
[0081] According to the first option a), the task instructions may comprise selection task instructions for selecting an analysis engine from a plurality of available analysis engines. The analysis engine to be selected should be configured to determine, based on the respective experimental data, the property of the target material of which the respective experimental data are indicative. In other words, the selected analysis engine should have the required analysis capabilities. Therefore, the selection task instructions may indicate, apart from the available analysis engines, respective functions regarding the capabilities of the analysis engines to analyse experimental data. For instance, the selection task instructions may indicate an available library containing analysis engines having functions for analysing different types of experimental data, among them the type of the experimental data based on which the property of the target material is to be determined using the analysis engine to be selected. In an example, the available engines in the library may be represented including metadata indicating their functions, particularly which types of experimental data they are configured to analyse with respect to which material properties. The data-driven model may match this metadata to the experimental data received as part of the task instructions. For this purpose, the task instructions may include metadata indicating a type of the experimental data, but the data-driven model may also be configured for inferring the type of the experimental data without being provided with such metadata as input.
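How engine metadata might be represented and matched can be sketched as follows. In the disclosure the matching is performed by the data-driven model; the deterministic `select_engine` function below is only a stand-in that shows which information the selection task instructions would need to carry. All engine names, data types and properties are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnalysisEngine:
    name: str
    data_type: str      # type of experimental data the engine accepts
    properties: tuple   # material properties the engine can determine

# Hypothetical library of available analysis engines.
LIBRARY = [
    AnalysisEngine("ir_composition", "ir_spectrum", ("composition",)),
    AnalysisEngine("gc_peaks", "chromatogram", ("chemical_presence",)),
]

def build_selection_instructions(data_type, target_property, library):
    """Assemble selection task instructions as plain text for a model."""
    lines = [
        f"Select an engine for data type '{data_type}' "
        f"to determine '{target_property}'.",
        "Available engines:",
    ]
    for engine in library:
        lines.append(
            f"- {engine.name}: accepts {engine.data_type}, "
            f"determines {', '.join(engine.properties)}"
        )
    return "\n".join(lines)

def select_engine(data_type, target_property, library):
    """Deterministic stand-in for the model's metadata matching."""
    for engine in library:
        if engine.data_type == data_type and target_property in engine.properties:
            return engine
    return None

prompt = build_selection_instructions("chromatogram", "chemical_presence", LIBRARY)
chosen = select_engine("chromatogram", "chemical_presence", LIBRARY)
```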
[0084] Upon providing the task instructions including the selection task instructions to the data-driven model in step 30, the model may output the selected analysis engine from the library of available analysis engines or an indication thereof in a step 40a, the selection being made such that the selected analysis engine has the required analysis capability. In a subsequent step 41a, the respective experimental data are provided by the instructor 3 to the selected analysis engine, whereupon the analysis engine analyses the data and provides, in a step 42a, a result of the analysis, being the property of the target material to be determined, back to the instructor 3. The instructor 3 could therefore also be referred to as an analysis engine instructor or operator. More generally, the instructor 3 could be referred to as an agent orchestrating how, when and by which system units the method steps described herein are carried out.
[0085] Even if an analysis engine is carefully selected to accept experimental data of the respective type received as input, the respective experimental data may still require a restructuring to meet the particular input requirements of the selected analysis engine. For instance, the measured experimental data may be received via the user interface in a form freely chosen by the user, and the historical experimental data may be stored in the database with a structure not identical to the input structure of the analysis engine. For instance, while not shown in FIG. 2, a structure data-driven model may be used for the restructuring of the respective experimental data. The structure data-driven model may be different from the shown data-driven models, or may be one of them. The instructor 3 may be configured to instruct the structure data-driven model using restructuring instructions comprising, apart from the respective experimental data, an indication of a required structure. The indication may also be a reference to the selected analysis engine. In order to increase the likelihood that the restructured experimental data can indeed be provided as input to the selected analysis engine, a structure validation unit may be provided (also not shown in FIG. 2).
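A structure validation unit could, for instance, check restructured data against the input structure required by the selected analysis engine before the data are forwarded. The following sketch assumes a simple dictionary-based record and a required structure given as field names with expected types; both are hypothetical illustrations rather than the structure used by any particular engine.

```python
def validate_structure(record, required_fields):
    """Return a list of problems; an empty list means the record is accepted."""
    problems = []
    for field, expected_type in required_fields.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems

# Hypothetical input structure of a selected analysis engine.
REQUIRED = {"wavenumbers": list, "intensities": list, "unit": str}

# Hypothetical output of the restructuring step, lacking the unit.
restructured = {"wavenumbers": [400.0, 401.0], "intensities": [0.12, 0.15]}
problems = validate_structure(restructured, REQUIRED)
```

A non-empty `problems` list could be fed back into the restructuring instructions so that the structure data-driven model can retry.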
[0086] Based on the property of the target material obtained by the selected analysis engine, the instructor 3 instructs the data-driven model to determine the chemical data in a step 50. The determined chemical data received in response from the model may then be forwarded by the instructor 3 to a data provider 4, wherein the data provider 4 provides the chemical data, in a step 60, to an output interface of the system. The output interface may, for instance, be the user interface via which the measured experimental data may have been received, or it may be an interface to a monitoring and / or control apparatus monitoring and / or controlling a production of the target material based on the chemical data.
According to the second option b) selectable in step 25, the data-driven model is instructed by the instructor 3 to determine the chemical data itself. As with the first option a), however, the determining of the chemical data may again be split into a determining of the respective property of the target material based on the respective experimental data and a subsequent determining of the chemical data based on the respective property. For instance, as illustrated in FIG. 1, the instructor 3 may, in step 30, provide first task instructions to the data-driven model for determining the respective property based on the respective experimental data, wherein the determined property is received by the instructor 3 from the data-driven model in step 40b. The method may then continue with steps 50 and 60 as indicated above for option a).
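The orchestration by the instructor 3 across steps 30 to 60 can be summarized in a short sketch. It assumes that the data-driven model is available as a callable that accepts a dictionary of task instructions and that analysis engines are plain callables; these interfaces, the stub model and the engine are hypothetical placeholders, not the interfaces of the disclosed system.

```python
def run_pipeline(measured, historical, option, model, engines):
    """Orchestrate steps 30 to 60 for option 'a' or 'b' (hypothetical interfaces)."""
    data = {"measured": measured, "historical": historical}
    if option == "a":
        # Steps 30 and 40a: the model selects an analysis engine.
        engine_name = model(
            {"task": "select_engine", "data": data, "engines": sorted(engines)}
        )
        # Steps 41a and 42a: the selected engine determines the property.
        prop = engines[engine_name](data)
    else:
        # Steps 30 and 40b: the model determines the property itself.
        prop = model({"task": "determine_property", "data": data})
    # Step 50: the model determines the chemical data from the property.
    # Step 60 (providing the data to an output interface) is the return value.
    return model({"task": "determine_chemical_data", "property": prop})

def stub_model(instructions):
    """Stand-in for a data-driven model; returns canned answers per task."""
    task = instructions["task"]
    if task == "select_engine":
        return instructions["engines"][0]
    if task == "determine_property":
        return {"purity": 0.97}
    return f"Report based on {instructions['property']}"

def mean_engine(data):
    """Hypothetical analysis engine: mean of the measured signal."""
    values = data["measured"]
    return {"mean_signal": sum(values) / len(values)}

engines = {"mean_engine": mean_engine}
report_a = run_pipeline([1.0, 3.0], [2.0], "a", stub_model, engines)
report_b = run_pipeline([1.0, 3.0], [2.0], "b", stub_model, engines)
```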
[0087] Thus, one or more data-driven models may, by providing respective instructions to them, be used to carry out one or more tasks. The tasks can particularly include selecting an analysis engine that is to be used for determining a respective property of the target material based on respective experimental data, determining the property based on the experimental data by the model itself, and / or determining the chemical data based on the property. It is also possible for a respective data-driven model to determine the chemical data based on the experimental data without first determining a property of the target material as an intermediate basis for determining the chemical data.
[0088] For configuring a respective data-driven model for following the task instructions, the data-driven model may be trained based on training data. Training data may be unstructured data. Unstructured data may comprise one or more sequences. A sequence may comprise two or more elements. The elements may be associated with the historical property and the property of which the measured experimental data are indicative. The data-driven model may be a foundation model. The data-driven model may be parametrized for mapping the historical experimental data and the measured experimental data to a machine-processable representation of the historical experimental data and the measured experimental data. Further, the data-driven model may be configured for mapping the machine-processable representation to properties of the target material and / or to chemical product data.
[0089] In the following, particular features of possible data-driven models as considered herein will be described with reference to FIG. 3 to FIG. 9.
[0090] FIG. 3 illustrates an embodiment of obtaining an embedding layer usable in a data-driven model. The embedding layer may be obtained by training for example a continuous bag of words model (CBOW) or a skip-gram model. The embedding layer may be suitable for generating embedded input data based on input data. Generating embedded input data may refer to embedding input data.
[0092] As is the case throughout the subsequent description of FIG. 3 to FIG. 9, the input data may be unstructured or structured data. For instance, the input data may be or comprise general text, numerical and / or image data. However, more particularly, the input data throughout the subsequent description of FIG. 3 to FIG. 9 may also refer to task instructions as indicated above, which may, in step 30, include the respective measured and historical experimental data associated with a target material and / or respective indications regarding possible analysis engines to be selected, or, in step 50, include the respective one or more properties based on which the chemical data should be generated. Such particular input data may comprise text, numerical and / or image data structured in a particular form. An exemplary form would be, for instance, such that the input data split into a first part corresponding to the experimental data or the one or more properties and a second part corresponding to model instructions further specifying how the first part is to be handled by the model. In such a case, the model instructions may or may not be specific to the first part.
[0093] Embedding input data may result in a representation associated with the input data. Thus, the embedded input 114 may be the representation associated with the input data. The input data may comprise one or more elements. The one or more elements may be represented by the input vector 106. In particular, the embedded input 114 and / or the input vector 106 may be machine-readable and / or processable by a processor. For this purpose, the embedded input 114 and / or the input vector 106 may be a tensor, in particular a first-rank tensor. Specifically, the input vector 106 may be a one-hot vector or a summation of a plurality of one-hot vectors. A one-hot vector may be a vector with one entry unequal to zero. Examples for one-hot vectors may be 108, 110 and 112. The entries unequal to zero in the one-hot vector and / or in the input vector 106 may indicate the element. For example, a lookup table may define the relation between the position of the entries unequal to zero and the element indicated by the one-hot vector. The lookup table may specify a plurality of different elements. The number of different elements may be equal to the number of entries in the one-hot vector. The number of different elements may be referred to as vocabulary size. In an example, the elements may be represented by tokens and a sequence of elements may refer to at least a part of a sentence. The at least a part of the sentence may be represented by a plurality of tokens. A token may represent at least a part of the element and / or word. For example, where one element would be associated with only one word, words such as "embeddings", "embedding" or "embed" would constitute different elements. A first token may represent the stem "embed" and the endings, typically appearing in a plurality of words, may be represented by a second token, a third token and a fourth token. The second token, the third token and the fourth token may be used for representing other words such as "look", "looking" or the like, preferably together with a fifth token representing the stem "look". Ultimately, this tokenization of elements associated with a plurality of stems and a plurality of endings results in fewer tokens being needed for representing a plurality of elements and thus uses less computational resources.
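The relation between a lookup table, one-hot vectors and an input vector formed as a sum of one-hot vectors can be sketched as follows. The four-token vocabulary and the pre-tokenized sequences are hypothetical and chosen only to mirror the stem and ending example above.

```python
# Hypothetical lookup table: position in the vector -> token.
VOCAB = ["embed", "look", "ing", "s"]
TOKEN_TO_INDEX = {token: index for index, token in enumerate(VOCAB)}

def one_hot(token):
    """Vector with a single non-zero entry at the token's position."""
    vec = [0] * len(VOCAB)
    vec[TOKEN_TO_INDEX[token]] = 1
    return vec

def input_vector(tokens):
    """Sum of the one-hot vectors of all tokens in a sequence."""
    total = [0] * len(VOCAB)
    for token in tokens:
        total = [a + b for a, b in zip(total, one_hot(token))]
    return total

looking = input_vector(["look", "ing"])   # stem plus shared ending
looks = input_vector(["look", "s"])
```

Because the endings are shared between stems, two stems and two endings already cover several words with a vocabulary size of four.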
[0095] A lookup table specifying a subset of the vocabulary, e.g. of the English language, may comprise 10,000 words or more. The embedded input 114 may be a lower-dimensional representation than the input vector 106. For example, typical embedded inputs 114 may comprise some hundreds of entries. Consequently, the embedded inputs 114 constitute a densified representation of one or more elements using less computational resources. Moreover, the embedded input 114 may represent a relation between two or more elements. For example, the words "Italy" and "Germany" may be similar or more closely related since they both denote European countries, whereas the word "embodiment" may be very different from these two words. The larger the dot product between two embedded inputs 114 of comparable norm, the more similar the two elements associated with the embedded inputs 114 may be. Hence, the embedded inputs 114 may represent one or more elements accurately and lead to accurate results based on processing the embedded inputs 114.
[0096] For transforming the input vector 106 into the embedded input 114, the embedding layer may comprise a number of neurons equal to the number of entries in the embedded input 114. Based on the embedded inputs 114, the output layer may generate the output vector 116. The output vector may be a vector and / or may indicate one or more elements. The output vector 116 may indicate one or more elements different from the input vector 106 and / or the one-hot vectors associated with the input vector 106. For this purpose, the output layer may comprise a number of neurons equal to the number of entries of the input vector 106 and / or the output vector 116. The output layer may apply a softmax function to the embedded inputs 114. By doing so, the output vector may comprise the probabilities associated with the elements associated with the entries of the output vector 116 unequal to zero. Hence, from the output vector 116 one or more elements may be obtained with a corresponding probability. Where the input vector 106 may specify one or more sequence(s) of elements, the output vector 116 may specify one or more elements corresponding to the sequence(s) of elements specified by the input vector 106. In the example of FIG. 3, the element associated with vector 118 may correspond to the input vector with a probability of 71 %. Additional or alternative elements may correspond to the input vector as indicated by the output vector with lower probability. By defining a threshold to which the probability may be compared, the selection of the corresponding elements may be tailored to the needs of the user. The elements generated by the model comprising the embedding layer 102 and the output layer 104 may refer to the most probable elements indicated by the output vector 116. Hence, the model depicted in FIG. 3 may generate the element associated with the vector 118 with a confidence score of 71 %.
[0097] The model of FIG. 3 may be a continuous bag of words (CBOW) model. The CBOW model may be trained based on a training data set comprising a plurality of input vectors and corresponding output vectors. As the training data set may not be labeled, the training of the CBOW model may be referred to as self-supervised. Before training of the CBOW model, the CBOW model may be initialized with random values assigned to the weights of the neurons. During the training of the CBOW model, the input vectors may be passed through the initialized embedding layer and the output layer and a loss may be determined by comparing the output vector obtained by passing the input vector 106 through the model to the output vector corresponding to the input vector 106 as specified by the training data set. Based on the determined loss, backpropagation may be applied to determine the gradients associated with the neurons of the embedding layer 102 and the output layer 104 to lower the loss. According to the determined gradients, the weights of the neurons may be updated by using a gradient descent algorithm. Once a predetermined loss is achieved by the CBOW model, the training may be terminated and a trained CBOW model may be obtained. From the trained CBOW model, the embedding layer 102 may be suitable for embedding input data comprising one or more elements. This embedding layer 102 may be used in other machine-learning architectures requiring an embedding layer 102 such as a transformer encoder, transformer decoder or transformer encoder-decoder architecture as described within the context of FIG. 4, FIG. 5 and FIG. 6, all of which are possible architectures for the data-driven models considered herein. For training these architectures, a trained embedding layer 102 may be required. Hence, a model such as a CBOW model may be trained prior to training the transformer encoder, transformer decoder or transformer encoder-decoder architecture.
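The training procedure described above (random initialization, forward pass through the embedding layer and the output layer, loss, backpropagation, gradient descent) can be sketched for a toy model. The sketch assumes a cross-entropy loss, layers without bias terms, a vocabulary size of four and an embedding dimension of two; these choices, the learning rate and the single training pair are illustrative assumptions only.

```python
import math
import random

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) with a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(w_embed, w_out, context_vec, target_index, lr=0.1):
    """One gradient-descent update; returns the loss before the update."""
    embedded = matvec(w_embed, context_vec)      # embedding layer
    probs = softmax(matvec(w_out, embedded))     # output layer with softmax
    loss = -math.log(probs[target_index])        # cross-entropy loss
    # Gradient of the loss with respect to the logits: probs - one_hot(target).
    d_logits = [p - (1.0 if j == target_index else 0.0)
                for j, p in enumerate(probs)]
    # Backpropagate into the embedding before the output weights change.
    d_embedded = [sum(d_logits[j] * w_out[j][k] for j in range(len(w_out)))
                  for k in range(len(embedded))]
    for j, row in enumerate(w_out):
        for k in range(len(row)):
            row[k] -= lr * d_logits[j] * embedded[k]
    for k, row in enumerate(w_embed):
        for v in range(len(row)):
            row[v] -= lr * d_embedded[k] * context_vec[v]
    return loss

random.seed(0)
VOCAB_SIZE, EMBED_DIM = 4, 2
w_embed = [[random.uniform(-0.5, 0.5) for _ in range(VOCAB_SIZE)]
           for _ in range(EMBED_DIM)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(EMBED_DIM)]
         for _ in range(VOCAB_SIZE)]

context = [1, 0, 1, 0]   # sum of the one-hot vectors of two context tokens
target = 1               # index of the token to be predicted
losses = [train_step(w_embed, w_out, context, target) for _ in range(200)]
```

After training, the columns of `w_embed` play the role of the embedded inputs for the individual tokens and could be reused as an embedding layer in another architecture.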
[0100] FIG. 4 illustrates an embodiment of a transformer encoder architecture. The transformer encoder comprises an encoder input 278, one or more encoder blocks 274, 214 and an encoder output. The transformer encoder architecture may be derived from the transformer encoder-decoder architecture as known in the art and shown in FIG. 6. In particular, the transformer encoder may be referred to as X-former. The transformer encoder architecture may correspond to the encoder architecture associated with the transformer encoder-decoder architecture with an additional encoder output instead of connecting the encoder block directly to the decoder of the transformer encoder-decoder architecture. A plurality of transformer encoder architectures are available in the art, such as the bidirectional encoder representations from transformers (BERT).
[0101] The input data may be received at the encoder input 278. The encoder input 278 may apply an input embedding 202. Applying the input embedding 202 may refer to passing the input data through an embedding layer, e.g. as described within the context of FIG. 3. Further, the encoder input 278 may apply positional encoding 203. Applying positional encoding 203 may refer to adding a positional factor to the embedded input obtained via input embedding. Preferably, the input data may specify a sequence of elements. The positional factor Ppos may be indicative of the position of the elements within the sequence. For example, the positional factor Ppos may be obtained based on the following equations:
[0103] P_{pos}(2i) = \sin\left(\frac{pos}{10000^{2i/d}}\right)
[0104] P_{pos}(2i + 1) = \cos\left(\frac{pos}{10000^{2i/d}}\right)
[0109] where pos may refer to the position of the element within the sequence, i may refer to the dimension index associated with the input embedding and d may refer to the dimension of the model, e.g. transformer decoder, transformer encoder or transformer encoder-decoder. This may be referred to as absolute positional embeddings. Alternatively, the positional encoding may be based on rotary positional embeddings (RoPE). Positional encoding is beneficial since it enables the processing of sequential data without requiring further dimensions indicating the position of each element. Consequently, the positional encoding 203 reduces the computational resources needed for embedding the input data. By passing the input data through the encoder input, the input data may be transformed into a second-rank tensor representing the sequence of elements. This second-rank tensor may be referred to as embedded input data. The embedded input data may be processed by the encoder block. The embedded input data may be provided to the layer normalization 208 by a residual connection.
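A minimal sketch of the absolute (sinusoidal) positional encoding is given below, following the formulation in "Attention Is All You Need": even dimensions use the sine and odd dimensions the cosine of pos / 10000^(2i/d). The function names and the toy sizes are hypothetical.

```python
import math

def positional_encoding(seq_len, d_model):
    """Table of positional factors with one row per position."""
    table = []
    for pos in range(seq_len):
        row = [0.0] * d_model
        for dim in range(d_model):
            i = dim // 2                              # pair index i
            angle = pos / (10000 ** (2 * i / d_model))
            row[dim] = math.sin(angle) if dim % 2 == 0 else math.cos(angle)
        table.append(row)
    return table

def add_positions(embedded, table):
    """Add the positional factors element-wise to the embedded input."""
    return [[e + p for e, p in zip(e_row, p_row)]
            for e_row, p_row in zip(embedded, table)]

pe = positional_encoding(seq_len=3, d_model=4)
embedded_input = [[0.1, 0.2, 0.3, 0.4]] * 3           # hypothetical embeddings
encoded = add_positions(embedded_input, pe)
```

The result keeps the shape of the embedded input (sequence length by model dimension), which is why no additional dimensions are needed to carry the position.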
[0110] Multi-head self-attention 206 may be applied to the embedded input data. Multi-head self-attention 206 may comprise the two components multi-head and self-attention. Self-attention may be understood as being a filter applied to the embedded input data. By applying the filter to the embedded input data, the elements associated with the embedded input data that contribute to the output data to be generated may be identified. Hence, the filter may represent the degree to which the elements associated with the embedded input data contribute to the output data to be generated. Applying the filter may be referred to as weighting the elements associated with the embedded input data. This is advantageous specifically regarding long sequences of elements. The filter may be learned and improved during the training by learning to identify the contribution of elements associated with the embedded input data. For example, in the partial sentence "I went to the bakery to buy a", the last word may be generated by the data-driven model such as the transformer encoder. The self-attention may cause the transformer encoder to attend mostly to the words "bakery" and "buy" in order to generate the word "bread". Self-attention may refer to attention generated based on the input data. Hence, the filter may be determined based on the input data, preferably the embedded input data.
[0111] The embedded input data may serve as query Q, key K and value V with respect to the self-attention operation. The self-attention may refer to attention based on the received input data. Hence, the filter may be calculated based on the following formula by inserting the respective tensors based on the embedded input data:
[0115] Attention(Q, K, V) = softmax\left(\frac{QK^T}{\sqrt{d_k}}\right)V
[0117] where d_k corresponds to the dimension of the key.
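The scaled dot-product attention formula can be evaluated directly with small matrices. The sketch below uses plain Python lists; the helper names and the toy query, key and value matrices are hypothetical.

```python
import math

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax(row):
    peak = max(row)
    exps = [math.exp(z - peak) for z in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Return softmax(Q K^T / sqrt(d_k)) V together with the weights."""
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# One query attending to two identical keys: both values are weighted equally.
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [1.0, 0.0]]
v = [[2.0, 0.0], [4.0, 0.0]]
context, weights = attention(q, k, v)
```

The rows of `weights` correspond to the filter described above: each row sums to one and states how strongly each element contributes to the output for the respective query.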
[0118] For improving the efficiency of the transformer encoder further, the multiple heads are used to apply the filter resulting in the multi-head self-attention 206. Multi-head self-attention 206 may comprise applying the filter to two or more parts of the embedded input data. Hence, the tensor may be split into two or more parts and the filter may be applied to the two or more parts separately by two or more heads according to the following equation:
[0119] head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)
[0120] with parameter matrices W_i^Q \in \mathbb{R}^{d \times d_q}, W_i^K \in \mathbb{R}^{d \times d_k}, W_i^V \in \mathbb{R}^{d \times d_v}, where i may refer to the index of the head, and d_v, d_k and d_q may refer to the dimensions of the value, key and query.
[0121] The result of the two or more heads may be concatenated according to the following equation:
[0122] MultiHead(Q, K, V) = Concat(head_1, ..., head_h)W^O
[0123] where W^O \in \mathbb{R}^{hd_v \times d} and h may refer to the number of heads.
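The splitting into heads and the subsequent concatenation can be sketched with two heads acting on a model dimension of four. The projection matrices below simply select the first or the last two dimensions and the output matrix is the identity; in a trained model all of these would be learned parameters, so the values here are purely illustrative.

```python
import math

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(row):
    peak = max(row)
    exps = [math.exp(z - peak) for z in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    d_k = len(k[0])
    scores = matmul(q, [list(col) for col in zip(*k)])
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def multi_head(x, heads, w_o):
    """heads: one (W_q, W_k, W_v) triple per head; w_o maps h*d_v back to d."""
    outputs = [attention(matmul(x, w_q), matmul(x, w_k), matmul(x, w_v))
               for w_q, w_k, w_v in heads]
    concat = []
    for t in range(len(x)):
        row = []
        for head in outputs:
            row.extend(head[t])          # Concat(head_1, ..., head_h)
        concat.append(row)
    return matmul(concat, w_o)

# Hypothetical projections: head 0 sees dims 0-1, head 1 sees dims 2-3.
first_half = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
second_half = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
heads = [(first_half,) * 3, (second_half,) * 3]
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

x = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
out = multi_head(x, heads, identity)
```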
[0124] For further details regarding possible self-attention operations which may be applied in block 206, reference is made to the article "Attention Is All You Need" by A. Vaswani et al., Advances in Neural Information Processing Systems (2017), https://arxiv.org/abs/1706.03762, which is herewith incorporated herein by reference in its entirety.
[0125] The embedded input data may be transformed via the multi-head self-attention 206 into a context tensor. The context tensor may represent the sequence of elements and the relation between two or more elements of the input data. The context tensor may be a second-rank tensor and / or may comprise one or more first-rank tensor(s). After the multi-head self-attention 206, layer normalization 208 may be applied based on the context tensor and / or the embedded input data from the residual connection. Applying layer normalization 208 may refer to normalizing the context tensor. Normalizing the context tensor may lower the values of the entries of the context tensor. This reduces the computational cost associated with processing the context tensor. Further, it improves the training by helping the loss to converge and preventing instabilities.
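The combination of residual connection and layer normalization can be sketched as follows. The sketch normalizes each element's vector to zero mean and unit variance and omits the learned gain and bias that layer normalization commonly includes; the function names and input values are hypothetical.

```python
import math

def layer_norm(vec, eps=1e-5):
    """Normalize one vector to zero mean and (approximately) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]

def add_and_norm(residual, sublayer_out):
    """Add the residual input to the sublayer output, then normalize per row."""
    return [layer_norm([r + s for r, s in zip(r_row, s_row)])
            for r_row, s_row in zip(residual, sublayer_out)]

embedded = [[1.0, 2.0, 3.0]]            # input carried by the residual connection
attended = [[0.5, 0.5, 0.5]]            # hypothetical attention output
normalized = add_and_norm(embedded, attended)
```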
[0126] Layer normalization 208 may be followed by passing the context tensor to a feed-forward layer 210, again followed by layer normalization 212 based on the residual connection to the context tensor and / or the output of the feed-forward layer 210. The feed-forward layer 210 may be a feed-forward neural network. The feed-forward neural network may comprise a plurality of fully connected neurons. Passing the context tensor through the feed-forward neural network may result in transforming the context tensor linearly. Additionally or alternatively, the neural network may comprise one or more activation functions such as a rectified linear unit (ReLU). Hence, the neural network may be configured for performing one or more non-linear operations on the context tensor and / or transforming the context tensor non-linearly. After the context tensor has been transformed and / or normalized by the feed-forward layer 210 and the layer normalization 212, the context tensor may be provided to one or more further encoder blocks 214. Passing the context tensor through the feed-forward layer 210 may adapt the context tensor for the processing by a further attention layer of the one or more further encoder blocks 214 for applying a self-attention filter, preferably multi-head self-attention 206. The context tensor after being transformed by the feed-forward layer 210 and the layer normalization 212 may be referred to as hidden state.
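A position-wise feed-forward layer with a ReLU activation between two fully connected layers can be sketched as below. The two-layer form follows the common transformer design; the weights and biases are hypothetical toy values rather than trained parameters.

```python
def linear(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def relu(vec):
    return [max(0.0, x) for x in vec]

def feed_forward(vec, w1, b1, w2, b2):
    """Linear transformation, ReLU non-linearity, linear transformation."""
    return linear(relu(linear(vec, w1, b1)), w2, b2)

w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
w2, b2 = [[1.0, 1.0]], [0.5]
result = feed_forward([1.0, -2.0], w1, b1, w2, b2)
```

The negative entry is set to zero by the ReLU, which is what makes the overall transformation non-linear.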
[0127] The encoder output 276 comprises a linear layer 216 and a softmax layer 218. The linear layer 216 may transform the context tensor into a logits vector. The linear layer may be fully-connected. The logits vector obtained by passing the context tensor through the linear layer 216 may be passed through the softmax layer 218. Passing the logits vector through the softmax layer 218 may refer to applying the softmax function to the logits vector. Applying the softmax function to the logits vector may result in a probability distribution of one or more elements corresponding to the sequence of elements in the input data. From the probability distribution based on predefined selection criteria, one or more elements may be chosen. The one or more chosen elements may be referred to as the one or more elements generated by the transformer encoder. The one or more generated elements may be provided to the encoder input for generating further one or more elements corresponding to the sequence of the input data and the one or more elements generated by the transformer encoder as described within the context of FIG. 7.
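The output stage (linear layer, softmax, selection) can be sketched for a three-token vocabulary. Choosing the most probable element is used here as one possible predefined selection criterion; the hidden state, weights and vocabulary are hypothetical.

```python
import math

def linear(vec, weights, bias):
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate_element(hidden, weights, bias, vocab):
    """Map a hidden state to logits, then to probabilities, then pick the top."""
    probs = softmax(linear(hidden, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best], probs

vocab = ["bread", "cake", "milk"]
weights = [[2.0, 0.0], [0.5, 0.0], [0.1, 0.0]]   # one row of weights per token
bias = [0.0, 0.0, 0.0]
token, probs = generate_element([1.0, 0.0], weights, bias, vocab)
```

Other selection criteria, such as a probability threshold or sampling, would replace only the line that computes `best`.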
[0128] FIG. 5 illustrates an embodiment of a transformer decoder architecture. The transformer decoder comprises a decoder input 284, one or more decoder blocks 280, 232 and a decoder output 292. The transformer decoder architecture may be derived from the transformer encoder-decoder architecture as known in the art and shown in FIG. 6. The transformer decoder may be referred to as X-former. The transformer decoder architecture may correspond to the decoder architecture associated with the transformer encoder-decoder architecture independent of receiving one or more hidden states from the encoder of the transformer encoder-decoder. A plurality of transformer decoder architectures are available in the art, such as the generative pretrained transformers (GPT).
[0129] The decoder input 284 may apply input embedding 220 and positional encoding 222 analogous to the input embedding 202 and the positional encoding 203 as described within the context of FIG. 4.
[0130] The decoder block 280 may comprise the layer normalization 226, the masked multi-head self-attention 224, the feed-forward layer 228 and / or the layer normalization 230. The embedded input data resulting from passing the input data through the decoder input 284 may be provided to the layer normalization 226 via a residual connection. Further, masked multi-head self-attention 224 may be applied to the embedded input data. Masked multi-head self-attention 224 corresponds to the multi-head self-attention 206 as described within the context of FIG. 4, with additional masking of the part of the embedded input data associated with elements later in the sequence than the element to be generated. Additionally or alternatively, the part of the input data associated with elements later in the sequence than the element to be generated may not be received and / or transformed into the embedded input data. Thus, the transformer decoder may be suitable for generating a subsequent element to a sequence, whereas the transformer encoder may be suitable for generating a missing element within one sequence and / or between two or more sequences. Therefore, the transformer encoder may be configured for classification tasks. The transformer decoder may be configured for text generation.
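The masking of later elements can be sketched as follows. This is a simplified single-head example under stated assumptions: queries, keys and values are the embedded inputs themselves, whereas an actual attention layer would first apply learned projections and use several heads.

```python
import math

def masked_self_attention(x):
    """Single-head self-attention with a causal mask.

    Assumption for brevity: queries, keys and values are the embedded
    inputs themselves, with no learned projections.
    """
    dim = len(x[0])
    out = []
    for i, query in enumerate(x):
        # Only positions 0..i are visible; later positions are masked out.
        scores = [sum(q * k for q, k in zip(query, x[j])) / math.sqrt(dim)
                  for j in range(i + 1)]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        out.append([sum(w * x[j][d] for j, w in enumerate(weights))
                    for d in range(dim)])
    return out

embedded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical embedded input
context = masked_self_attention(embedded)
```

Because of the mask, changing the last element of `embedded` leaves the first two rows of `context` unchanged.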
[0131] Similar to the transformer encoder as described within the context of FIG. 4, a context tensor may be generated by applying the masked multi-head self-attention 224 and the layer normalization 226. The context tensor may be provided to the layer normalization 230 via a residual connection. Further, the feed-forward layer 228 and the layer normalization 230 may be analogous to the feed-forward layer 210 and the layer normalization 212 as described within the context of FIG. 4. The context tensor may be provided to one or more further decoder blocks 232.
[0132] The decoder output 292 may comprise a linear layer 234 and a softmax layer 236. The linear layer 234 and the softmax layer 236 may be analogous to the linear layer 216 and the softmax layer 218 as described within the context of FIG. 4.
FIG. 6 illustrates an embodiment of a transformer encoder-decoder architecture. The transformer encoder-decoder may comprise the encoder input 288, the one or more encoder blocks 286, 264, the decoder input 294, the decoder block 290 and the decoder output 292. The encoder input 288 may correspond to the encoder input 278 of FIG. 4. The one or more encoder blocks 286, 264 may correspond to the one or more encoder blocks 274, 214 of FIG. 4. The decoder input 294 may correspond to the decoder input 284 of FIG. 5.
[0133] The decoder block 290 may comprise a masked multi-head self-attention 270, a layer normalization 272, a feed-forward layer 238 and a layer normalization 240 analogous to the masked multi-head self-attention 224, the layer normalization 226, the feed-forward layer 228 and the layer normalization 230 as described within the context of FIG. 5. The decoder block 290 may further comprise a multi-head self-attention 250 and a layer normalization 248. Analogous to the description of FIG. 5, the context tensor may be obtained from the masked multi-head self-attention 270 and the layer normalization 272. Multi-head self-attention 250 analogous to the multi-head self-attention 206 of FIG. 4 may be applied to the context vector obtained from the layer normalization 272 and the hidden states of the one or more encoder blocks 286, 264. Layer normalization 248 may be applied to the context vector obtained from the multi-head self-attention 250 and the context vector obtained from the layer normalization 272 provided via a residual connection. The context vector resulting from the layer normalization 248 may be processed via the feed-forward layer 238 and the layer normalization 240 analogous to the description of FIG. 5. The context vector resulting from the layer normalization 240 may be provided to further decoder blocks 242 analogous to the decoder block 290. The context vector obtained from the one or more decoder blocks 290, 242 may be provided to the decoder output 292. The decoder output 292 may correspond to the decoder output 282 of FIG. 5.
[0134] With the above-described architecture, the transformer encoder-decoder may receive and process input data at the encoder input 288 and the one or more encoder blocks 286, 264 and the decoder block 290 and the decoder output 292. Based on the input data, the transformer encoder-decoder may generate output data part by part or sequentially. The sequentially generated output data may be provided to and / or may be processed by the decoder input 294, the one or more decoder blocks 290, 242 and the decoder output 292. Preferably, a sequence may be provided to the encoder input 288 and, after at least a part of the output data has been generated, the decoder input 294 may be provided with at least the part of the elements of the output data already generated. By doing so, the next elements of the output data may be generated with a higher accuracy by taking the input data and the generated output data into account since more data may be received by the transformer encoder-decoder over time. Because of the transformer encoder-decoder architecture, the transformer encoder-decoder may be configured for transforming a sequence into another representation of the sequence. An example of transforming one sequence into another representation may be the translation of a sentence into another language. A plurality of transformer encoder-decoders are available in the art, such as BART, T5 or the like.
[0135] In an embodiment, the layer normalization 208, 212 may be applied prior to the masked multi-head self-attention 224, multi-head self-attention 206 and / or the feed-forward layer 210 in the transformer decoder, the transformer encoder and / or the transformer encoder-decoder. By doing so, the computational resources for applying the multi-head self-attention 206 and / or the feed-forward layer 210 to the embedded input data and / or the context tensor may be decreased as the entries of the respective tensors may be lower after normalization.
[0136] In an embodiment, the decoder output 292 may comprise a classification neural network, further feed-forward layers, convolutional layers, fully connected layers or the like. For example, the transformer encoder-decoder may be configured for choosing between a plurality of options. For this purpose, the transformer encoder-decoder may be provided with three different input data sets and may classify the context vectors obtained from the one or more decoder blocks 290 via one or more linear layers. Accordingly, the architecture may be extended depending on the use case to be solved.
[0137] FIG. 7 illustrates an embodiment of training and / or deploying the transformer encoder, the transformer decoder and / or the transformer encoder-decoder.
[0138] The encoder / decoder / encoder-decoder architecture 302 may correspond to the transformer decoder, the transformer encoder and / or the transformer encoder-decoder as described within the context of FIG. 4 - FIG. 6.
[0139] The output data generated by the encoder / decoder / encoder-decoder architecture 302 may comprise one or more elements, in particular a sequence of elements. The previously generated elements of the output data may be provided as input for generating the next element in the sequence of the output data.
[0140] If, for instance, the input data corresponds to task instructions for determining chemical data associated with a target material from measured and historical experimental data, then the output data may correspond to the chemical data or intermediate output data based on which the chemical data is subsequently determined, such as a selected analysis engine or a property of the target material obtained based on the experimental data.
[0141] In the example of FIG. 7, the input data may comprise N elements, in particular input tokens. For instance, any request for generating chemical product data and / or associated model instructions, which may initially be input in a format comprising text and / or numerical data, may be tokenized, thereby converting it into a sequence of tokens. An input token may be a token dedicated to be inputted into a data-driven model such as the transformer decoder, the transformer encoder or the transformer encoder-decoder. The output data to be generated may comprise M elements. The encoder / decoder / encoder-decoder architecture 302 may generate one element of the output data based on receiving the input data and optionally previously generated elements of the output data at a time step. Hence, for generating M elements, M time steps are required. A time step comprises providing input 310, 312, 314 to the encoder / decoder / encoder-decoder architecture 302 and receiving output data 304, 308, 306 from the encoder / decoder / encoder-decoder architecture 302. In a first time step, the input 310 may comprise N input tokens. The N input tokens may be associated e.g. with N words, stems or endings. Preferably, the N input tokens may specify a question or other form of request, and / or associated model instructions. One or more input tokens may specify the beginning of the sequence of tokens and / or the end of the sequence of tokens. The input 310 may be processed by the encoder / decoder / encoder-decoder architecture 302. Based on the input 310 at least a part of the output data 304 may be generated. The at least a part of the output data may comprise a first output token. In the next time step, the generated first output token may be provided together with the input 312.
Specifically, where the input 312 may be received by a transformer encoder-decoder the input tokens may be received at the encoder input 288 and the first output token may be received at the decoder input 294. Where the input 312 may be received by the transformer encoder, the input 312 may be received by the encoder input 278 and analogously regarding the transformer decoder and the decoder input 284. Based on the input 312, the output data 308 comprising the first output token and a second output token may be generated. Generating the output data 308 based on the input 312 may refer to generating the second token based on the first token and the N input tokens, wherein the first token may have been generated based on the N input tokens. This process may be repeated until the last token in the sequence of the output data 306 may be generated. Preferably, the last token may be an end token. The end token may terminate the generation of a further output token.
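The time-step loop described for FIG. 7 can be sketched as below. The function `toy_model` is a hypothetical stand-in for the encoder / decoder / encoder-decoder architecture 302 and simply echoes the input tokens in reverse; the end-token string and the `max_steps` guard are likewise assumptions for illustration.

```python
END_TOKEN = "<end>"  # hypothetical end token

def toy_model(input_tokens, generated_tokens):
    """Hypothetical stand-in for architecture 302: returns the next output token.

    It echoes the input tokens in reverse order and then emits the end token.
    """
    remaining = list(reversed(input_tokens))[len(generated_tokens):]
    return remaining[0] if remaining else END_TOKEN

def generate(input_tokens, model, max_steps=50):
    """Generate one token per time step until the end token appears."""
    generated = []
    for _ in range(max_steps):
        # Each time step sees the N input tokens plus all tokens generated so far.
        next_token = model(input_tokens, generated)
        if next_token == END_TOKEN:
            break
        generated.append(next_token)
    return generated

output_tokens = generate(["a", "b", "c"], toy_model)
```

Each loop iteration corresponds to one time step; the end token terminates the generation and is not appended to the returned sequence in this sketch.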
[0142] Similarly to the data processing during deployment of the encoder / decoder / encoder-decoder architecture 302, the encoder / decoder / encoder-decoder architecture 302 may be trained. The training data set may comprise a plurality of sequences comprising a plurality of elements. The sequences may be associated with the input data and / or the output data. Additionally or alternatively, the sequences may be independent of the input data and / or the output data. For example, where the input data and the output data may refer to chemical compositions represented via text, the training data set may comprise sequential text data independent of chemical compositions. In this example, the training data set may comprise sequences of words originating from a conversation. In an embodiment, the training data set may comprise at least partially input data sets and / or output data sets.
[0144] The training may be initialized by initializing the encoder / decoder / encoder-decoder architecture 302. In an embodiment, the parameters associated with the encoder / decoder / encoder-decoder architecture 302 may be initialized randomly. Additionally or alternatively, the input embedding of the encoder / decoder / encoder-decoder architecture 302 may be obtained by training a CBOW model or a skip-gram model as described within the context of FIG. 3. The trained embedding layer may be used during training. The parameters associated with the embedding layer may be kept constant and / or may be updated after a predefined number of training epochs. By doing so, the number of parameters to be updated is lower, enabling faster training that consumes fewer computational resources. Further, the accuracy associated with the embedding layer may be constant and / or may be increased by avoiding error compensation in relation to the just initialized encoder / decoder / encoder-decoder architecture 302.
[0145] During the training of the encoder / decoder / encoder-decoder architecture 302, at least a part of the sequences of the training data set may be provided to the encoder / decoder / encoder-decoder architecture 302 one after another and one or more elements may be generated based on the sequences of the training data set one after another. The elements generated based on the sequences may follow the elements of the parts of sequences the encoder / decoder / encoder-decoder architecture 302 may have been provided with. The generated one or more elements may be compared to the one or more elements following the at least a part of the sequences provided to the encoder / decoder / encoder-decoder architecture 302 as specified by the training data set. Hence, during the training the encoder / decoder / encoder-decoder architecture 302 may generate a guess on the next element and the guess on the next element in a sequence may be compared to the ground truth specifying the actual next element according to the training data set. Based on the guess on the next element and the ground truth a loss may be determined. The loss may define the similarity between the guess on the next element and the ground truth. The loss may be determined by forming a vector dot product between the token associated with the one or more elements and the token associated with the ground truth. A loss unequal to zero may result in updating the parameters associated with the encoder / decoder / encoder-decoder architecture 302. Preferably the parameters associated with the encoder / decoder / encoder-decoder architecture 302 may be independent of the embedding layer. For example, the parameters associated with the encoder / decoder / encoder-decoder architecture 302 may be weights of the neurons of the encoder / decoder / encoder-decoder architecture 302.
[0148] Based on the determined loss, backpropagation may be applied to determine the gradients associated with the parameters of the encoder / decoder / encoder-decoder architecture 302 to lower the loss. According to the determined gradients, the parameters associated with the encoder / decoder / encoder-decoder architecture 302, preferably the weights of the neurons associated with the encoder / decoder / encoder-decoder architecture 302, may be updated by using a gradient descent algorithm.
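A loss-driven parameter update of this kind can be sketched on a toy model. Note the substitutions: the sketch uses a cross-entropy loss, a common choice for next-element prediction, rather than the dot-product-based loss mentioned above, and the model's only parameters are its three logits, so the gradient is written out by hand instead of being obtained via backpropagation through several layers. All names and values are assumptions for illustration.

```python
import math

def softmax(logits):
    """Convert logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Loss is small when the ground-truth element gets a high probability."""
    return -math.log(probs[target_index])

def training_step(params, target_index, learning_rate=0.5):
    """One gradient-descent update on a toy model whose parameters are its logits."""
    probs = softmax(params)
    loss = cross_entropy(probs, target_index)
    # For softmax with cross-entropy: d(loss)/d(logit_k) = probs[k] - 1[k == target].
    grads = [p - (1.0 if k == target_index else 0.0) for k, p in enumerate(probs)]
    new_params = [w - learning_rate * g for w, g in zip(params, grads)]
    return new_params, loss

params = [0.0, 0.0, 0.0]  # hypothetical initial parameters
losses = []
for _ in range(5):
    # The ground truth is assumed to be the element with index 2.
    params, loss = training_step(params, target_index=2)
    losses.append(loss)
```

Over the five steps the recorded loss decreases, since each update raises the logit of the ground-truth element relative to the others.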
[0149] The training data set may be unlabeled. The sequences of elements within the training data set may inherently comprise the ground truth for determining the loss with respect to the one or more elements generated during the training of the encoder / decoder / encoder-decoder architecture 302. Hence, the encoder / decoder / encoder-decoder architecture 302 may be trained in a self-supervised manner. This is advantageous since time and resources for creating a labeled training data set may be saved. Furthermore, this enables the usage of large training data sets associated with a size of several terabytes. Consequently, the data-driven model may be accurate in generating elements of a sequence. In addition, the large training data set enables few-shot predictions or even zero-shot predictions. Hence, the data-driven models trained as described above are versatile, contributing to saving resources needed for training and / or hosting a plurality of purpose-driven models such as convolutional neural networks. The training described above may be referred to as pretraining. The data-driven model may be configured for performing few-shot or even zero-shot predictions with respect to a plurality of use cases after pretraining. The performance of the data-driven model may be increased further by additional training referred to as fine-tuning. The training data used for fine-tuning may comprise pairs of training input data and training output data.
For fine-tuning the model to generate chemical product data as output upon being provided with a request for doing so and / or associated model instructions as input, the training input data may comprise historical task instructions of, for instance, the types used in the above indicated method steps 30 and 50, and the training output data may comprise verified model responses of, for instance, the types received in above indicated steps 40a, 40b and 60, i.e., model responses that have been considered to be useful, suitable and / or desirable outputs for the respective historical task instructions.
[0150] FIG. 8 illustrates an embodiment of input embedding. Where the sequence of elements associated with the input data, preferably comprised in the input data, may be of one type, the input embedding 202, 220, 252, 266 as described within the context of FIG. 4 - FIG. 6 may be used. For example, a type of input data may be text where the elements may be associated with at least a part of a word, a punctuation character, a start token specifying the beginning of one or more sequences associated with the input data and / or the end token. In another example, the input data may be at least partially numerical. Hence, the input data may comprise a plurality of numbers. A request for generating chemical product data, for instance, may comprise both text and numerical data. Numerical input data may be for example tabular data. Tabular data may specify one or more rows and / or one or more columns. Hence, the tabular data may comprise one or more cells, wherein the cells may be associated with one or more numerical values.
[0153] Numerical input data may require a different embedding than text input data. Input embeddings for numerical input data may comprise a token embedding, a positional embedding, a column embedding, a row embedding or a combination thereof.
[0154] Applying a token embedding to one or more elements, in particular tokens, associated with the input data may result in a machine-processable representation associated with the one or more elements, in particular tokens. Applying the token embedding to one or more elements may refer to passing the one or more elements through the embedding layer, e.g. as described within the context of FIG. 3. Hence, token embeddings may specify the one or more elements, in particular tokens, in a machine-processable representation. For example, the token embedding may transform a numerical value into a vector. This is advantageous since this representation can be enriched by further information such as the position of the token within the sequence and / or within a table associated with the sequence of tokens. The positional embedding may be analogous to the positional embedding as described within the context of FIG. 3, FIG. 4 - FIG. 6. Where the input data may be tabular data, column embedding may be applied. Applying a column embedding to one or more elements, in particular tokens, associated with the input data may result in a machine-processable representation specifying the location of the one or more elements within a table 402, preferably within the columns of the table 402. Applying the column embedding may refer to adding a column factor to the input data embedded via token embeddings, in particular the embedded input data. The column factor may be the same for elements associated with the same column and / or may differ between two or more elements associated with different columns. Analogously, row embeddings may be applied where the input data may be tabular data. Applying a row embedding to one or more elements, in particular tokens, associated with the input data may result in a machine-processable representation specifying the location of the one or more elements within a table 402, preferably within the rows of the table 402.
Applying the row embedding may refer to adding a row factor to the input data embedded via token embeddings, in particular the embedded input data. The row factor may be the same for elements associated with the same row and / or may differ between two or more elements associated with different rows.
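The combination of token, positional, column and row embeddings for tabular data can be sketched as an element-wise sum. The embedding tables, the dimension of 2 and the toy token embedding (scaling a fixed vector by the cell value) are assumptions for illustration only; in practice these would be learned.

```python
def add_vectors(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(values) for values in zip(*vectors)]

def embed_cell(value, position, row, column, tables):
    """Combine token, positional, row and column embeddings for one table cell."""
    token_vec = [value * w for w in tables["token"]]  # toy numerical token embedding
    return add_vectors(token_vec,
                       tables["position"][position],
                       tables["row"][row],
                       tables["column"][column])

# Hypothetical embedding tables with dimension 2.
TABLES = {
    "token": [1.0, 0.5],
    "position": [[0.0, 0.0], [0.1, 0.1], [0.2, 0.2], [0.3, 0.3]],
    "row": [[1.0, 0.0], [2.0, 0.0]],      # row factor, shared by cells in the same row
    "column": [[0.0, 1.0], [0.0, 2.0]],   # column factor, shared within a column
}

table = [[4.0, 6.0], [8.0, 10.0]]  # two rows, two columns
embedded = [embed_cell(v, r * 2 + c, r, c, TABLES)
            for r, row in enumerate(table) for c, v in enumerate(row)]
```

Cells in the same row receive the same row factor and cells in the same column the same column factor, so the embedded vectors carry the table location of each value.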
[0157] In an embodiment, input data may be at least partially numerical and at least partially text. As indicated above, this may be the case for a request for generating chemical product data. Hence, the input data may comprise two or more types of data. A type of data may refer to a modality. Accordingly, different embeddings may be applied to the input data. To parts of the input data comprising text, the input embedding referred to in FIG. 3, FIG. 4 - FIG. 6 may be applied. To parts of the input data being numerical, token embeddings, positional embeddings, column embeddings and row embeddings may be applied. Further, segment embeddings may be applied to the input data independent of the type of input data. The segment embedding may specify the type of input data one or more elements may be associated with. For example, if the input data comprises text and numbers, the input data may comprise two types of input data. Applying the segment embedding to the input data may refer to adding a segment factor to the input data, preferably the embedded input data and / or the input data after having applied the token embedding. The segment factor may specify the type of data associated with the one or more elements. The segment factor may be the same for one or more elements associated with the same type of input data and / or may differ between two or more elements associated with different types of input data.
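A segment embedding for mixed text and numerical input can be sketched as follows. The segment factors, the placeholder token embeddings and the rule used to tell numbers from text are assumptions for illustration and are not taken from the disclosure.

```python
SEGMENT_FACTORS = {"text": [0.0, 1.0], "number": [1.0, 0.0]}  # hypothetical values

def segment_type(element):
    """Classify an element as numerical or text (simple illustrative rule)."""
    try:
        float(element)
        return "number"
    except ValueError:
        return "text"

def add_segment_embedding(elements, embedded):
    """Add the segment factor of each element's data type to its embedding."""
    result = []
    for element, vector in zip(elements, embedded):
        factor = SEGMENT_FACTORS[segment_type(element)]
        result.append([v + f for v, f in zip(vector, factor)])
    return result

elements = ["viscosity", "12.5", "mPas"]
embedded = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]  # placeholder token embeddings
with_segments = add_segment_embedding(elements, embedded)
```

Elements of the same data type receive the same segment factor, so the two text elements end up with identical vectors here while the numerical element differs.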
[0158] Applying the token embedding, the positional embedding, the segment embedding, the column embedding, the row embedding or a combination thereof may result in embedded input data and / or may be the output of any one of the encoder input 278, 288 or decoder input 284, 294. The data obtained by applying the token embedding, the positional embedding, the segment embedding, the column embedding, the row embedding or a combination thereof may be processed by the encoder block 274, 286, decoder block 280, 290, encoder output 276, decoder output 292, 282.
[0159] FIG. 9 illustrates a further embodiment of input embedding.
[0160] Input data to the data-driven model, in particular to the encoder input and / or the decoder input as described in the context of FIG. 4 - FIG. 6, may comprise image data. Also the task instructions considered herein may comprise image data, such as in the form of an image associated with the measured and / or historical experimental data. The data-driven model may be parametrized to receive image data. For processing image data as input data, the data-driven model may comprise one or more encoder blocks and / or one or more decoder blocks and / or one or more encoder outputs and / or one or more decoder outputs as described within the context of FIG. 4 - FIG. 6. FIG. 9 may show an embodiment of an encoder input and / or a decoder input. When processing image data, the encoder input and / or the decoder input of the data-driven model may be as described within the context of FIG. 9. The encoder input and / or decoder input may comprise one or more linear projection layers 514 for a linear projection of one or more images, preferably one or more partial images, more preferably a sequence of two or more partial images. The one or more linear projection layers 514 may be suitable for changing the dimension of the one or more received images, preferably one or more partial images. Preferably, passing the one or more images, preferably partial images, through the one or more linear projection layers 514 may result in applying image embedding, preferably partial image embedding, to the one or more images and / or partial images.
[0161] Furthermore, when a sequence of two or more images and / or partial images may be received, positional embedding may be applied to the sequence, preferably by passing the sequence of one or more images and / or partial images through the one or more linear projection layers 514. Applying positional embedding may refer to adding a positional factor. The positional factor may be different depending on the position of the image and / or the partial image within the sequence. In particular, the positional factor added to a first element of the sequence may be different to the positional factor added to a second element of the sequence. The first element of the sequence may be a first image and / or first partial image. The second element of the sequence may be a second image and / or a second partial image.
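The linear projection of partial images and the addition of a positional factor described for FIG. 9 can be sketched as follows. The sketch assumes a single-channel image, a square partial-image size and small hand-picked matrices; the names `PROJECTION`, `CLASS_EMBEDDING` and `POSITIONAL` are hypothetical, and the class embedding is prepended as the first element of the sequence.

```python
def split_into_patches(image, patch):
    """Cut an H x W single-channel image into flattened patch x patch partial images."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([image[top + i][left + j]
                            for i in range(patch) for j in range(patch)])
    return patches

def embed_image(image, patch, projection, class_embedding, positional):
    """Project each partial image, prepend the class embedding, add positional factors."""
    projected = [[sum(p * projection[k][d] for k, p in enumerate(flat))
                  for d in range(len(class_embedding))]
                 for flat in split_into_patches(image, patch)]
    sequence = [class_embedding] + projected
    return [[v + e for v, e in zip(vec, pos)] for vec, pos in zip(sequence, positional)]

# Hypothetical toy values: 2 x 4 image, partial-image size 2, embedding dimension 2.
IMAGE = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0]]
PROJECTION = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]  # (patch * patch) x D
CLASS_EMBEDDING = [0.0, 0.0]
POSITIONAL = [[0.0, 0.0], [10.0, 10.0], [20.0, 20.0]]          # (N + 1) x D, N = 2

representation = embed_image(IMAGE, 2, PROJECTION, CLASS_EMBEDDING, POSITIONAL)
```

With these toy values the two partial images are projected to `[6.0, 8.0]` and `[10.0, 12.0]`, and each sequence element then receives a different positional factor.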
[0162] The representation of the one or more images, preferably one or more partial images, may be obtained based on the following equation:
[0163] $$z_0 = [x_{class};\ x_p^1 E;\ x_p^2 E;\ \ldots;\ x_p^N E] + E_{pos},$$
[0165] $$E \in \mathbb{R}^{(P^2 \cdot C) \times D}, \quad E_{pos} \in \mathbb{R}^{(N+1) \times D}$$
[0166] where $x_{class}$ is the image class embedding 528, $x_p^n$ is the n-th image, in particular partial image, in the sequence, $z_0$ is the representation of the one or more images, preferably one or more partial images, (H, W) is the resolution of the image, in particular the image the partial images are generated from, C is the number of channels associated with the one or more images, in particular the one or more partial images, and D is the dimension of the representation of the one or more images, preferably one or more partial images. Applying the partial image embedding may refer to forming the product of $x_p^n$ with $E$ according to the above-described equation. Applying the positional embedding may refer to adding the factor $E_{pos}$ according to the above-described equation. By doing so, text-based data, numerical data, tabular data, image data or the like may be processed by one data-driven model.
The present disclosure has been described in conjunction with preferred embodiments and examples as well. However, other variations can be understood and effected by those persons skilled in the art and practicing the claimed invention, from a study of the present disclosure, including the drawings, the above description and the claims.
[0167] In particular, it will be understood that the present disclosure relates to a co-pilot for analyzing analytical data of a predefined material by combining insights from a variety of different, mutually complementary analytical measurements. This can be thought of as a service for users of the co-pilot. Possible users include parties aiming to obtain starting materials such as monomers or starting materials for a foam. The generated chemical data can assist users in finding the right starting materials, for instance.
[0168] It should also be noted that any steps presented can be performed in any order, i.e. the present invention is not limited to a specific order of these steps. Moreover, it is also not required that the different steps are performed at a certain place or at one node of a distributed system, i.e. each of the steps may be performed at different nodes using different equipment / data processing.
[0169] In particular, procedures like the receiving of measured or historical experimental data, the providing of task instructions to a data-driven model, the receiving of a response to the task instructions from the data-driven model, the providing of an input to an analysis engine, the receiving of an output of the analysis engine, etc., performed by one or several units or devices, can be performed by any other number of units or devices. These procedures can be implemented as program code means of a computer program and / or as dedicated hardware. A computer program product may be stored / distributed on a suitable medium, such as an optical storage medium or a solid-state medium, supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
[0170] As used herein, "determining" also includes "initiating or causing to determine", "generating" also includes "initiating and / or causing to generate" and "providing" also includes "initiating or causing to determine, generate, select, send and / or receive". "Initiating or causing to perform an action" includes any processing signal that triggers a computing node or device to perform the respective action.
[0171] In the claims as well as in the description the word "comprising" does not exclude other elements or steps. The indefinite article "a" or "an" and the definite article "the" do not exclude a plurality. In particular, the indefinite article "a" or "an" may be replaced with "one or more" and the definite article "the" may be replaced with "the one or more". A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.
[0173] Any disclosure and embodiments described herein relate to the methods, the systems, the devices and any computer program element outlined above, and vice versa. Advantageously, the benefits provided by any of the embodiments and examples equally apply to all other embodiments and examples and vice versa.
[0174] Any reference signs in the claims should not be construed as limiting the scope.
[0175] Disclosed is a method for generating chemical data associated with a target material, including a) receiving measured experimental data associated with a target material and indicative of a property of the target material, b) receiving historical experimental data associated with the target material and indicative of a historical property of the target material, and c) providing task instructions for determining chemical data to a data-driven model configured to follow task instructions. The task instructions provided to the data-driven model comprise the measured experimental data and the historical experimental data. The method also includes d) providing the chemical data. By this method, more insightful chemical data associated with a target material can be provided in a resource-efficient manner.
Claims
1. A, in particular computer-implemented, method for generating chemical data associated with a target material, the method comprising:
receiving (10) measured experimental data associated with a target material and indicative of a property of the target material,
receiving (20) historical experimental data associated with the target material and indicative of a historical property of the target material, wherein the historical experimental data is obtained before the measured experimental data,
providing (30) task instructions for determining chemical data to a data-driven model configured to follow task instructions, the task instructions provided to the data-driven model comprising the measured experimental data and the historical experimental data, wherein the chemical data is determined based on the property of the target material indicated by the measured experimental data, and / or on the historical experimental data, and
providing (60) the chemical data.
2. The method as defined in any preceding claim, wherein the measured experimental data are retrieved from a database upon receiving an indication of the measured experimental data via, for instance, a user interface.
3. The method as defined in any preceding claim, wherein the measured experimental data are received via an interface to a measurement device.
4. The method as defined in any preceding claim, wherein the historical experimental data are received from a database comprising historical experimental data associated with a plurality of materials.
5. The method as defined in any preceding claim, wherein the historical experimental data satisfy a predefined relation with respect to the measured experimental data.
6. The method as defined in any preceding claim, wherein the historical experimental data include data indicating a functional relationship satisfied by the historical property of the target material.
7. The method as defined in any preceding claim, wherein the chemical data are determined by
determining (30, 40a, 41a, 42a; 30, 40b) at least one of a) the property of the target material of which the measured experimental data are indicative and b) the historical property, and
determining (50) the chemical data based on the determined at least one property.
8. The method as defined in claim 7, wherein the task instructions further comprise selection task instructions for selecting an analysis engine configured to determine the at least one property based on the respective experimental data, which are indicative of the at least one property, wherein the selection task instructions comprise one or more indications of a) a plurality of candidate engines including the analysis engine and b) one or more functions associated with the plurality of candidate engines and the respective experimental data, wherein the method further comprises:
providing (30) the task instructions to the data-driven model for selecting (40a) the analysis engine, and
using (41a, 42a) the analysis engine for determining the at least one property.
9. The method as defined in claim 8, wherein the analysis engine is used for determining the at least one property by:
providing (41a) the respective experimental data to the analysis engine, and
receiving (42a) the at least one property in response.
10. The method as defined in claim 9, wherein the method further includes:
structuring the respective experimental data according to an input structure required by the analysis engine, wherein the respective structured experimental data are provided to the analysis engine.
11. The method as defined in any of claims 7 to 10, wherein the task instructions further comprise analysis instructions instructing the data-driven model to determine (30, 40b) the at least one property based on the respective experimental data, which are indicative of the at least one property.
12. The method as defined in any of claims 7 to 11, wherein the at least one property is used for determining the chemical data by:
instructing (50) the data-driven model to determine the chemical data based on the at least one property.
13. The method as defined in any of the preceding claims, wherein the chemical data are provided for monitoring and / or controlling a production and / or a processing of the target material.
14. A system for generating chemical data associated with a target material, the system comprising:
a measured experimental data receiving unit (1) for receiving measured experimental data associated with a target material and indicative of a property of the target material,
a historical experimental data receiving unit (2) for receiving historical experimental data associated with the target material and indicative of a historical property of the target material,
a model instructor (3) for providing task instructions for determining chemical data to a data-driven model configured to follow task instructions, the task instructions provided to the data-driven model comprising the measured experimental data and the historical experimental data, and
a data providing unit (4) for providing the chemical data.
15. Use of chemical data associated with a target material for monitoring and / or controlling a production of materials, wherein the chemical data have been generated according to a method as defined in any of claims 1 to 13 and / or with a system as defined in claim 14.