Apparatus and method for determining characteristics of input information for a machine learning model

A first machine learning model determines the input characteristics of a second machine learning model, with backpropagation-based training and a neural network that encodes contextual information. This addresses the trade-off between prediction accuracy and operating cost of machine learning models, so that the second model can be optimized efficiently across different contexts.

CN122242810A · Pending · Publication Date: 2026-06-19 · NOKIA NETWORKS OY

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
NOKIA NETWORKS OY
Filing Date
2025-12-17
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing machine learning models struggle to balance prediction accuracy and operational costs when processing input information, especially in different contexts, which limits the model's performance and efficiency.

Method used

A first machine learning model determines the characteristics of the input information of a second machine learning model. The input characteristics are adjusted dynamically to optimize prediction accuracy and operating cost. The models are trained using the backpropagation algorithm, and a neural network encodes and processes the contextual information.

Benefits of technology

The approach improves the prediction accuracy of machine learning models while reducing their operating cost in different contexts. Input characteristics are adjusted dynamically to optimize model performance and reduce energy consumption.

✦ Generated by Eureka AI based on patent content.


Abstract

Example embodiments of this disclosure relate to apparatus and methods for determining characteristics of input information for a machine learning model. One apparatus includes: at least one processor and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus to at least: use a first machine learning model to determine one or more characteristics of input information for a second machine learning model based on first information characterizing a context associated with the second machine learning model; and provide the second machine learning model with input information having the determined one or more characteristics.

Description

Technical Field

[0001] This disclosure relates to an apparatus for machine learning models.

[0002] This disclosure also relates to a method for machine learning models.

Background Technology

[0003] Machine learning (ML) is a field of computer science that relates to algorithms and models that can be used to process input information, such as making predictions based on that input. Predictions provided by machine learning models can, for example, be used to control the operation of technological systems and / or components of those systems.

Summary of the Invention

[0004] Various exemplary embodiments of this disclosure are defined by the independent claims. Exemplary embodiments and features (if any) described in this specification that are not within the scope of the independent claims should be interpreted as examples that aid in understanding the various exemplary embodiments of this disclosure.

[0005] Some examples relate to an apparatus comprising: at least one processor and at least one memory storing instructions that, when executed by the at least one processor, cause the apparatus to at least: use a first machine learning model to determine one or more features of input information for a second machine learning model based on first information characterizing the context associated with the second machine learning model, and provide the second machine learning model with input information having the determined one or more features. In some examples, this may enable improvements to at least one of the following of the second machine learning model: a) prediction accuracy; or b) operating cost (e.g., reduced operating cost), for example, depending on the context; or c) a combination of prediction accuracy and operating cost, for example, depending on the context associated with the second machine learning model.

[0006] As an example, a second machine learning model can be configured to perform one or more predictions based on input information, and the operation of the second machine learning model can be optimized, for example, depending on the context, by adapting one or more features to the input information. In some examples, the context can characterize the context in which the prediction will be made by the second machine learning model.

[0007] In some examples, when executed by at least one processor, the instructions cause the device to: determine second information based on input information having one or more determined characteristics by processing it with a second machine learning model, the second information representing at least one of the following: a) the predictive accuracy of the second machine learning model with respect to the input information, or b) the cost associated with the execution of the second machine learning model using the input information.

[0008] In some examples, the second information may also combine aspects a) and b) above. As an example, the second information may, for instance, characterize a combination of the predictive accuracy of the second machine learning model with respect to the input information and the associated cost of executing the second machine learning model using that input information. In some examples, a predetermined function may be used to combine predictive accuracy with cost, wherein the predetermined function may, for instance, include or represent at least one of the following: a ratio, or a sum, such as a weighted sum.

[0009] In some examples, when executed by at least one processor, the instructions cause the device to: train at least one of a first machine learning model or a second machine learning model, based at least on second information. As an example, conventional training techniques based on the backpropagation algorithm can be used to perform training, for example, based on the second information and (optionally) on reference information.

[0010] In some examples, training at least one of the first or second machine learning models may include at least one of the following: a) training the first machine learning model to reduce the operational cost (e.g., execution cost) of the second machine learning model, for example, for a given context; or b) training the second machine learning model to achieve a predetermined prediction accuracy; or c) training (e.g., joint training) both the first and second machine learning models to reduce operational costs and / or achieve a predetermined prediction accuracy.

[0011] In some examples, the training may include: providing training data with multiple different characteristics to a second machine learning model, and optionally, adapting one or more parameters of at least one of the first or second machine learning models based on second information obtained by processing the training data with multiple different characteristics using the second machine learning model.

[0012] In some examples, one or more characteristics of the input information characterize at least one of the following: a) the dimension of the input information; or b) the resolution of the input information; or c) the modality of the input information; or d) the numerical range of the input information; or e) the noise of the input information; or f) context metadata, such as context-dependent metadata. In some examples, providing context metadata can, for example, enable the provision of context-dependent information to a second machine learning model.

[0013] In some examples, when executed by at least one processor, the instruction causes the device to perform at least one of the following: a) adapting the input of the second machine learning model to one or more determined features; or b) using the second machine learning model to process the input information having one or more determined features, for example, using the second machine learning model to perform inference based on the input information having one or more determined features.

[0014] In some examples, where the second machine learning model includes at least one input layer, adapting the input of the second machine learning model to one or more determined features may include adapting at least one input layer (e.g., multiple processing units, such as artificial neurons, and / or any other aspect of at least one input layer) to one or more determined features.

[0015] As an example, the second machine learning model can be an artificial neural network (NN), such as a deep neural network (DNN) of the type convolutional neural network (CNN). In some other examples, other types of neural networks or topologies used for machine learning can be used to implement the second machine learning model.

[0016] Similarly, in some examples, the first machine learning model can be an artificial neural network (NN), such as a deep neural network (DNN) of the type convolutional neural network (CNN). In other examples, other types of neural networks or topologies used for machine learning can be used to implement the first machine learning model.

[0017] In some examples, when executed by at least one processor, the instruction causes the device to perform at least one of the following: a) process context information representing the context associated with the second machine learning model; or b) encode the context information to obtain a simplified representation of the context. In some examples, the processing of the context information may include at least one of the following: a1) determining the context information; or a2) receiving the context information (e.g., from another entity); or a3) modifying the context information (e.g., at least based on the second information).

[0018] In some examples, encoding context information to obtain a simplified representation of the context can be performed using an encoder (e.g., another machine learning model of the encoder type, such as a neural network, such as a DNN), which can be configured to receive and encode context information to provide a simplified representation of the context.

[0019] In some examples, the first information may include contextual information or a simplified representation of the context.

[0020] In some examples, the context represents at least one of the following: a) the strategy used for the second machine learning model; or b) temporal context; or c) environmental context; or d) information related to at least one data source configured to provide input information; or e) the type of input information; or f) human-centered context; or g) traffic-related context information; h) energy cost-related context information; or i) safety-related context information. In some examples, the context is not limited to aspects a) through i) above. Rather, in some examples, the context can be represented by any internal or external data related to the ML model that is relevant to, and / or can influence, certain behavioral changes in the second machine learning model itself. In some examples, the information related to or representing the context can be obtained from a combination of multiple information sources.

[0021] In some examples, when executed by at least one processor, the instruction causes the device to: manage data for a second machine learning model based on at least one of: a) first information; or b) one or more characteristics determined for the input information.

[0022] In some examples, managing data includes at least one of the following: a) collecting data; or b) processing data; or c) providing at least a portion of the data as input information having one or more defined characteristics to a second machine learning model; or d) modifying at least one aspect of at least one data source to provide data. As an example, modifying at least one aspect of at least one data source may, for example, include modifying the frequency at which the at least one data source used to provide data reports data.

[0023] In some examples, when executed by at least one processor, the instructions cause the device to perform at least one of the following: a) provide at least one of the following: a1) a first machine learning model, or a2) a second machine learning model; or b) operate at least one of the following: b1) the first machine learning model, or b2) the second machine learning model, for example, for at least one of training or inference.

[0024] Therefore, in other words, in some embodiments, the apparatus according to this disclosure can perform at least one of providing or operating a first machine learning model or a second machine learning model (and / or an optional encoder model for encoding context information). However, it should be noted that in some other embodiments, the apparatus according to this disclosure does not necessarily provide or operate either the first machine learning model or the second machine learning model. Instead, in some embodiments, the first machine learning model or the second machine learning model may be provided and / or operated, for example, by at least one additional device or apparatus.

[0025] In some examples, when executed by at least one processor, the instructions cause the device to perform at least one of the following: a) dynamically (e.g., during operation, such as during at least one of training or inference) change one or more characteristics of the input information for the second machine learning model, based at least on first information (and optionally also on second information); or b) dynamically modify at least one aspect of the management data for the second machine learning model, based at least on second information (and optionally also on first information).

[0026] In some examples, dynamically changing one or more characteristics of the input information for the second machine learning model, based at least on the first information, can improve at least one of: a) the prediction accuracy of the second machine learning model; or b) the operating cost of the second machine learning model; or c) a combination of the prediction accuracy and operating cost of the second machine learning model (e.g., depending on the context). In some examples, aspects affecting the prediction accuracy or operating cost of the second machine learning model can thus be influenced during its runtime (e.g., during inference), which enables dynamic optimization of the operating cost of the second machine learning model and thereby saves, for example, energy (e.g., for performing inference). In some examples, significant energy savings can be achieved in this way as the context changes.

[0027] In some examples, the management of data for a second machine learning model based at least on second information may include at least one of the following: a) collecting data; or b) processing data; or c) storing data; or d) processing at least one data source configured to provide data.

[0028] Some examples relate to an apparatus that includes components for: using a first machine learning model, determining one or more characteristics of input information for a second machine learning model based on first information representing the context associated with the second machine learning model, and providing the second machine learning model with input information having the determined one or more characteristics.

[0029] In some examples, the components for determining one or more characteristics and for providing input information having the determined one or more characteristics to a second machine learning model may include: at least one processor and at least one memory storing instructions that, when executed by the at least one processor, cause the means to perform at least one of the aspects of determining one or more characteristics and providing input information having the determined one or more characteristics to a second machine learning model.

[0030] In some examples, the components for determining one or more characteristics and for providing input information having the determined one or more characteristics to a second machine learning model may, for example, include a circuit system configured to perform at least one of the aspects of determining one or more characteristics and providing input information having the determined one or more characteristics to a second machine learning model.

[0031] In some examples, as used in this application, the term "circuit system" may refer to one or more or all of the following: (a) hardware-only circuit implementations (such as implementations in only analog and / or digital circuit systems); or (b) combinations of hardware circuits and software, such as: (b1) a combination of analog and / or digital hardware circuit(s) with software / firmware, and (b2) any portions of hardware processor(s) (including digital signal processor(s)), software, and memory(ies) that work together to cause the apparatus to perform various aspects of this disclosure; or (c) hardware circuit(s) and / or processor(s), such as microprocessor(s) or a portion of microprocessor(s), that require software (e.g., firmware) for operation, although the software may be absent when it is not needed for operation.

[0032] In some examples, this definition of "circuit system" applies to all uses of the term in this application, including in any claim. As another example, as used in this application, the term "circuit system" also covers an implementation of merely a hardware circuit or processor (or multiple processors), or a portion of a hardware circuit or processor, together with its accompanying software and / or firmware.

[0033] Some examples relate to a method that includes: using a first machine learning model, determining one or more features of input information for a second machine learning model based on first information representing the context associated with the second machine learning model, and providing the second machine learning model with input information having the determined one or more features.

[0034] Some examples relate to a computer program that includes instructions that, when executed by a device, cause the device to perform a method according to the present disclosure.

[0035] Some examples relate to a computer-readable storage medium, such as a non-transitory computer-readable storage medium, including a computer program according to this disclosure.

[0036] Some examples involve a data carrier signal that carries and / or represents a computer program according to this disclosure.

Attached Figure Description

[0037] Figure 1A shows a simplified block diagram according to some examples.

[0038] Figure 1B shows a simplified block diagram according to some examples.

[0039] Figure 2 shows a simplified block diagram according to some examples.

[0040] Figure 3 shows a simplified flowchart according to some examples.

[0041] Figure 4 shows a simplified flowchart according to some examples.

[0042] Figure 5 shows a simplified block diagram according to some examples.

[0043] Figure 6 shows a simplified flowchart according to some examples.

[0044] Figure 7 shows a simplified flowchart according to some examples.

[0045] Figure 8 shows a simplified flowchart according to some examples.

[0046] Figure 9 shows a simplified flowchart according to some examples.

[0047] Figure 10 shows a simplified flowchart according to some examples.

[0048] Figure 11 shows aspects of an architecture according to some examples.

[0049] Figure 12 shows a simplified flowchart according to some examples.

[0050] Figure 13 shows a simplified flowchart according to some examples.

[0051] Figure 14 shows a simplified flowchart according to some examples.

[0052] Figure 15 shows a simplified block diagram according to some examples.

Detailed Implementation

[0053] Some examples (see, e.g., Figures 1A, 2 and 3) relate to a device 100 comprising: at least one processor 102 and at least one memory 104 storing instructions 106 that, when executed by the at least one processor 102, cause the device 100 to at least: determine 200 (Figure 3), using a first machine learning model ML-1 and based on first information I-1 representing the context CTX associated with a second machine learning model ML-2, one or more features ML-2-IN-char of the input information ML-2-IN for the second machine learning model ML-2; and provide 202 the second machine learning model ML-2 with input information ML-2-IN having the determined one or more features ML-2-IN-char. Optional block 204 in Figure 3 represents the determination (e.g., prediction) of the output information ML-2-OUT of the second machine learning model ML-2 based on the input information ML-2-IN having the one or more determined characteristics.
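
The flow of blocks 200, 202 and 204 can be sketched as follows. This is a minimal illustration only: the rule-based `ml1_determine_characteristics`, the averaging `ml2_predict`, the context key `energy_cost` and the single characteristic `resolution_step` are hypothetical stand-ins chosen for the example, not models or parameters described in this disclosure.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for ML-1: maps first information I-1 (the context)
# to one characteristic of ML-2's input, here a downsampling step.
def ml1_determine_characteristics(i1: Dict[str, float]) -> Dict[str, int]:
    # Assumption for illustration: expensive energy -> coarser input.
    step = 4 if i1["energy_cost"] > 0.5 else 1
    return {"resolution_step": step}

# Derive ML-2-IN with the determined characteristics from raw data.
def prepare_input(raw: List[float], chars: Dict[str, int]) -> List[float]:
    return raw[:: chars["resolution_step"]]

# Hypothetical stand-in for ML-2: "predicts" the mean of its input.
def ml2_predict(ml2_in: List[float]) -> float:
    return sum(ml2_in) / len(ml2_in)

def run(raw: List[float], i1: Dict[str, float]) -> Tuple[Dict[str, int], List[float], float]:
    chars = ml1_determine_characteristics(i1)   # block 200
    ml2_in = prepare_input(raw, chars)          # block 202
    ml2_out = ml2_predict(ml2_in)               # optional block 204
    return chars, ml2_in, ml2_out
```

In this sketch, a context with a high `energy_cost` leads to a shorter input for the second model, so the same raw data is processed at lower cost.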

[0054] In some examples, the aspects of blocks 200 and 202 of Figure 3 may enable improvements to the second machine learning model ML-2 in at least one of the following: a) prediction accuracy; or b) operational cost (e.g., reduced operating costs), for example, depending on the context CTX; or c) a combination of prediction accuracy and operational cost, for example, depending on the context CTX associated with the second machine learning model.

[0055] As an example, the second machine learning model ML-2 (Figure 2) can be configured, for example, to perform one or more predictions based on the input information ML-2-IN, such as to obtain the output information ML-2-OUT. The operation of the second machine learning model ML-2 can be optimized, for example, depending on the context CTX, by adapting one or more features ML-2-IN-char to the input information ML-2-IN. In some examples, the context CTX can characterize the context of the predictions to be made by the second machine learning model ML-2. Further details and aspects of the context CTX according to other examples are explained below.

[0056] It should be noted that the potential input information or input data (e.g., “raw input data”), for example at least for the second machine learning model ML-2, may be unrelated to one or more features subsequently provided to the second machine learning model ML-2. In other words, in some examples, the collection of potential input data for at least the second machine learning model ML-2 may be performed independently of the context CTX or the first information I-1, for example, using one or more data sources DS. In some examples, the input information ML-2-IN for the second machine learning model ML-2, which has one or more determined features ML-2-IN-char, may be determined based on the raw input data thus collected, for example, by processing the raw input data, or by using only a portion of the collected raw input data as the input information ML-2-IN for the second machine learning model ML-2.

[0057] In some examples (Figure 4), the instructions 106, when executed by the at least one processor 102, cause the device 100 to: determine 205 second information I-2 based on processing the input information ML-2-IN having the one or more determined characteristics using the second machine learning model ML-2, the second information representing at least one of the following: a) the prediction accuracy of the second machine learning model ML-2 with respect to the input information ML-2-IN, or b) the cost associated with the execution of the second machine learning model ML-2 using the input information ML-2-IN.

[0058] In some examples, the second information I-2 may also combine aspects a) and b) above. As an example, the second information I-2 may characterize, for instance, a combination of the predictive accuracy of the second machine learning model ML-2 with respect to the input information and the cost associated with executing the second machine learning model ML-2 using that input information. In some examples, a predetermined function may be used to combine the predictive accuracy with the cost, wherein the predetermined function may include or represent at least one of the following: a ratio or a sum, such as a weighted sum.
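
The two kinds of predetermined function named above, a weighted sum and a ratio, can be illustrated as follows. This is a sketch under assumptions: the function names, the default weights and the sign convention (cost is subtracted so that a larger value is better) are hypothetical and not specified in this disclosure.

```python
def combine_weighted_sum(accuracy: float, cost: float,
                         w_acc: float = 1.0, w_cost: float = 1.0) -> float:
    """Weighted sum of accuracy and cost.

    Assumption: cost enters with a negative sign, so a higher score is better.
    """
    return w_acc * accuracy - w_cost * cost


def combine_ratio(accuracy: float, cost: float) -> float:
    """Ratio of accuracy to cost (accuracy per unit of cost)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return accuracy / cost
```

For instance, `combine_weighted_sum(0.75, 0.5, w_cost=0.5)` weights cost half as heavily as accuracy, whereas `combine_ratio(0.75, 0.5)` rewards inputs that yield more accuracy per unit of cost.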

[0059] In some examples (Figure 4), the instructions 106, when executed by the at least one processor 102, cause the device 100 to: train 207 at least one of the first machine learning model ML-1 or the second machine learning model ML-2, based at least on the second information I-2. As an example, conventional training techniques based on the backpropagation algorithm can be used to perform the training 207, for example, based on the second information I-2, and optionally based on reference information, such as training data and associated ground-truth data.

[0060] In some examples, the training 207 (Figure 4) of at least one of the first machine learning model ML-1 or the second machine learning model ML-2 may include, for example, at least one of the following: a) training the first machine learning model ML-1 to reduce the operational cost (e.g., execution cost) of the second machine learning model ML-2, for example, for a given context CTX; or b) training the second machine learning model ML-2 to achieve a predetermined prediction accuracy; or c) training (e.g., jointly training) both the first machine learning model ML-1 and the second machine learning model ML-2 to reduce operational costs and / or achieve a predetermined prediction accuracy.

[0061] In some examples, the training 207 (Figure 4) may include: providing training data TD with multiple different characteristics to the second machine learning model ML-2, and (optionally) adapting one or more parameters (not shown) of at least one of the first machine learning model ML-1 or the second machine learning model ML-2 based on second information I-2 obtained by processing the training data TD with multiple different characteristics using the second machine learning model ML-2.
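
The first half of this step, obtaining second information I-2 for training data presented with several different characteristics, can be sketched as below. The sketch deliberately stops before any parameter update and does not implement backpropagation. The names `sweep_characteristics` and `mean_model`, the use of a resolution step as the only characteristic, and the use of absolute error and input length as proxies for accuracy and cost are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def sweep_characteristics(
    training_data: Sequence[float],
    reference: float,
    steps: Sequence[int],
    model: Callable[[List[float]], float],
) -> Dict[int, Tuple[float, int]]:
    """Return {step: (absolute error, number of input values)} per candidate."""
    results: Dict[int, Tuple[float, int]] = {}
    for step in steps:
        ml2_in = list(training_data[::step])      # TD with this characteristic
        error = abs(model(ml2_in) - reference)    # accuracy proxy (lower is better)
        results[step] = (error, len(ml2_in))      # input length as a cost proxy
    return results

def mean_model(values: List[float]) -> float:
    # Hypothetical stand-in for ML-2: predicts the mean of its input.
    return sum(values) / len(values)
```

Calling `sweep_characteristics(data, reference, [1, 2, 4], mean_model)` yields one (error, cost) pair per candidate, which a training procedure could then use when adapting parameters of either model.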

[0062] In some examples (Figure 5), the one or more characteristics ML-2-IN-char of the input information ML-2-IN represent at least one of the following: a) the dimension DIM of the input information; or b) the resolution RES of the input information (e.g., temporal, spatial, or other type of resolution); or c) the modality MOD of the input information (e.g., the type of input data, such as image data, text data, time-series data, or graphical data); or d) the numerical range RNG of the input information; or e) the noise NOI of the input information; or f) the context metadata CTX-META, such as metadata related to the context CTX. In some examples, providing context metadata can, for example, enable the provision of context-related information to the second machine learning model ML-2.
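
A subset of these characteristics (DIM, RES and RNG) could be represented and applied to raw data as in the following sketch. The class `InputCharacteristics`, its field names and the function `apply_characteristics` are hypothetical; modality, noise and context metadata are omitted to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class InputCharacteristics:
    # Hypothetical field names mirroring DIM, RES and RNG.
    dim: Optional[int] = None                           # DIM: values kept per sample
    res_step: int = 1                                   # RES: keep every res_step-th sample
    value_range: Optional[Tuple[float, float]] = None   # RNG: clip values to [lo, hi]

def apply_characteristics(raw: List[List[float]],
                          chars: InputCharacteristics) -> List[List[float]]:
    """Derive ML-2-IN from raw samples (one row per sample)."""
    rows = raw[:: chars.res_step]                        # reduce temporal resolution
    if chars.dim is not None:
        rows = [row[: chars.dim] for row in rows]        # reduce dimension
    if chars.value_range is not None:
        lo, hi = chars.value_range
        rows = [[min(max(v, lo), hi) for v in row] for row in rows]  # limit range
    return [list(row) for row in rows]                   # return fresh row copies
```

With default characteristics the raw data passes through unchanged; tightening any field yields a smaller or more constrained input for the second model.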

[0063] In some examples (Figure 6), the instructions 106, when executed by the at least one processor 102, cause the device 100 to perform at least one of the following: a) adapting 210 the input of the second machine learning model ML-2 to the one or more determined features ML-2-IN-char; or b) processing the input information ML-2-IN having the one or more determined features using the second machine learning model ML-2, for example, performing inference using the second machine learning model ML-2 based on the input information ML-2-IN having the one or more determined features.

[0064] In some examples, where the second machine learning model ML-2 (Figure 2) includes at least one input layer (not shown), adapting 210 (Figure 6) the input of the second machine learning model ML-2 to the one or more determined characteristics may include: adapting the at least one input layer (e.g., multiple processing elements, such as artificial neurons, and / or any other aspect of the at least one input layer) to the one or more determined characteristics.

[0065] As an example, the second machine learning model ML-2 (Figure 2) can be an artificial neural network (NN or ANN), such as a convolutional neural network (CNN) or a deep neural network (DNN). In some other examples, other types of neural networks or machine learning topologies can be used to implement the second machine learning model ML-2.

[0066] Similarly, in some examples, the first machine learning model ML-1 (Figure 2) can be an artificial neural network (NN), such as a convolutional neural network (CNN) or a deep neural network (DNN). In some other examples, other types of neural networks or machine learning topologies can be used to implement the first machine learning model ML-1.

[0067] In some examples (Figure 7), the instructions 106, when executed by the at least one processor 102, cause the device 100 to perform at least one of the following: a) process 220 context information I-CTX, which characterizes the context CTX (Figure 2) associated with the second machine learning model ML-2; or b) encode 222 the context information I-CTX to obtain a simplified representation I-CTX' of the context CTX. In some examples, the processing 220 of the context information may include at least one of the following: a1) determining the context information I-CTX locally, for example, by the device 100; or a2) receiving the context information, for example, from another entity (not shown); or a3) modifying the context information I-CTX, for example, at least based on the second information I-2.

[0068] In some examples, encoding 222 (Figure 7) the context information I-CTX to obtain a simplified representation I-CTX' of the context CTX can be performed using an encoder ENC (Figure 2), for example another machine learning model of the encoder type, such as a neural network (e.g., a DNN). The encoder ENC can be configured to receive and encode the context information I-CTX, thereby providing the simplified representation I-CTX' of the context.

[0069] In some examples, the first information I-1 (Figure 2) may include the context information I-CTX or the simplified representation I-CTX' of the context CTX.

[0070] In some examples, the context CTX (Figure 2) represents at least one of the following: a) the policy used for the second machine learning model ML-2; or b) the temporal context; or c) the environmental context; or d) information related to at least one data source DS (Figure 2) configured to provide the input information; or e) the type of input information ML-2-IN; or f) human-centric context; or g) traffic-related context; or h) energy cost-related context; or i) safety-related context.

[0071] In some examples, the context CTX (Figure 2) is not limited to aspects a) through i) above. Rather, in some examples, the context CTX can be characterized by any internal or external data that is relevant to and / or may affect certain behavioral changes in the second machine learning model ML-2. In some examples, information relevant to or characterizing the context CTX can be obtained from a combination of multiple information sources (not shown).

[0072] In some examples, Figure 8 When executed by at least one processor 102, instruction 106 causes device 100 to manage data 230 for a second machine learning model ML-2 based on at least one of the following (e.g., the data may potentially be used as input information ML-2-IN for the second machine learning model ML-2): a) first information I-1, or b) one or more characteristics ML-2-IN-char for the input information.

[0073] In some examples, Figure 8 Managing data 230 includes at least one of the following: a) collecting data 230a; or b) processing data 230b; or c) providing at least a portion of the data as input information ML-2-IN having one or more defined characteristics to a second machine learning model ML-2; or d) modifying data 230d for providing at least one data source DS ( Figure 2 At least one aspect of ). As an example, Figure 8 Modifying at least one aspect of at least one data source DS in 230d may include, for example, modifying the frequency at which at least one data source DS reports data for providing data.
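
Modifying the reporting frequency of a data source can be illustrated with the following sketch. It assumes, purely for illustration, that the second model consumes only every `res_step`-th sample, so the source may report correspondingly less often. The `DataSource` class, its fields and the function `adapt_reporting` are hypothetical names, not interfaces defined in this disclosure.

```python
from dataclasses import dataclass

@dataclass
class DataSource:
    # Hypothetical model of a data source DS with a configurable reporting period.
    name: str
    report_period_s: float

def adapt_reporting(ds: DataSource, res_step: int, base_period_s: float) -> DataSource:
    """Return a copy of ds that reports res_step times less often than the base period.

    Assumption: ML-2 only uses every res_step-th sample, so intermediate
    reports would be discarded anyway.
    """
    if res_step < 1:
        raise ValueError("res_step must be >= 1")
    return DataSource(ds.name, base_period_s * res_step)
```

A source that reports once per second and feeds a model using every fourth sample would, under this assumption, be reconfigured to report every four seconds.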

[0074] Optional block 232 of Figure 8 indicates an optional use of the data managed according to block 230, for example as potential input information for the second machine learning model ML-2.

[0075] In some examples (Figure 9), the instructions 106, when executed by the at least one processor 102, cause the device 100 to perform at least one of the following: a) providing 240 at least one of: a1) the first machine learning model ML-1 (Figure 2), or a2) the second machine learning model ML-2 (or the optional encoder ENC, Figure 2); or b) operating 242 at least one of: b1) the first machine learning model ML-1, or b2) the second machine learning model ML-2, for example for at least one of training or inference.

[0076] In other words, in some embodiments, the device 100 according to this disclosure (Figures 1A, 2) can perform at least one of providing 240 (Figure 9) or operating 242 at least one of the first machine learning model ML-1 or the second machine learning model ML-2 (and / or the optional encoder ENC for encoding the context information I-CTX), for example in addition to performing aspects 200 and 202 according to Figure 3. However, it should be noted that in some other examples, the apparatus 100 according to this disclosure does not necessarily provide or operate either the first machine learning model ML-1 or the second machine learning model ML-2. Rather, in some embodiments, the first machine learning model ML-1 or the second machine learning model ML-2 (and, optionally, the encoder ENC) may be provided and / or operated by at least one additional device or apparatus (not shown).

[0077] In some examples (Figure 10), the instructions 106, when executed by the at least one processor 102, cause the device 100 to perform at least one of the following: a) dynamically changing 250 (e.g., during operation, such as during training or inference of the second machine learning model ML-2) one or more characteristics ML-2-IN-char of the input information ML-2-IN for the second machine learning model ML-2, based at least on the first information I-1 (and optionally also on the second information I-2); or b) dynamically modifying 252 at least one aspect of managing data for the second machine learning model ML-2, based at least on the second information I-2 (and optionally also on the first information I-1).

[0078] In some examples (Figure 10), one or more characteristics of the input information for the second machine learning model are dynamically changed, based at least on the first information, to improve (e.g., dynamically improve) at least one of the following for the second machine learning model: a) prediction accuracy; b) operating cost; or c) a combination of prediction accuracy and operating cost, for example depending on the context CTX. Thus, in some examples, aspects affecting at least one of the prediction accuracy or the operating cost of the second machine learning model ML-2 can be changed dynamically, for example at runtime (e.g., during inference or training of the second machine learning model ML-2). In some examples, this enables dynamic optimization of the operating cost of the second machine learning model ML-2 and thereby saves, for example, energy (e.g., for performing inference). In some examples, significant energy savings can therefore be achieved by dynamically adapting, as the context CTX changes, one or more characteristics of the input information for the second machine learning model based at least on the first information.

[0079] In some examples (Figure 10), managing data for the second machine learning model ML-2 based at least on the second information I-2 (see also block 252) may include at least one of the following: a) collecting data; or b) processing data; or c) storing data; or d) processing at least one data source DS (Figure 2) configured to provide data.

[0080] Some examples (Figure 1B) relate to an apparatus 100' that includes a component 102' for: determining 200 (Figure 3), using a first machine learning model ML-1 and based on first information characterizing a context associated with a second machine learning model ML-2, one or more characteristics of input information for the second machine learning model ML-2; and providing 202 the input information having the determined one or more characteristics to the second machine learning model ML-2.

[0081] In some examples (Figure 1B), the component 102' for determining the one or more characteristics and for providing the input information having the determined one or more characteristics to the second machine learning model may, for example, include at least one processor 102 (see also, for example, Figure 1A) and at least one memory 104 storing instructions 106 that, when executed by the at least one processor 102, cause the device 100' (Figure 1B) to perform at least one of the aforementioned aspects of determining 200 the one or more characteristics and providing 202 the input information having the one or more characteristics to the second machine learning model.

[0082] In some examples (Figure 1B), the component 102' for determining 200 the one or more characteristics and for providing 202 the input information having the determined one or more characteristics to the second machine learning model may include a circuit system 104' configured to perform at least one of the above aspects of determining 200 the one or more characteristics and providing 202 the input information having the one or more characteristics to the second machine learning model.

[0083] In some examples, as used in this application, the term "circuit system" 104' may refer to one or more or all of the following: (a) hardware-only circuit implementations (such as implementations using only analog and / or digital circuit systems); or (b) combinations of hardware circuits and software, such as: (b1) a combination of analog and / or digital hardware circuit(s) with software / firmware, and (b2) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause the apparatus to perform various aspects of this disclosure; or (c) hardware circuit(s) and / or processor(s), such as microprocessor(s) or a portion of microprocessor(s), that require software (e.g., firmware) for operation, although the software may be absent when it is not needed for operation.

[0084] In some examples, this definition of circuit system applies to all uses of the term in this application, including in any claims. As a further example, as used in this application, the term "circuit system" also covers an implementation of merely a hardware circuit or processor (or multiple processors), or a portion of a hardware circuit or processor, together with its (or their) accompanying software and / or firmware.

[0085] Some examples (Figure 3) relate to a method comprising: determining 200, using a first machine learning model and based on first information characterizing a context associated with a second machine learning model, one or more characteristics of input information for the second machine learning model; and providing 202 the input information having the determined one or more characteristics to the second machine learning model.

[0086] In the following, additional aspects and examples are disclosed, and in some examples, these aspects and examples may be combined with at least one of the aspects and / or examples disclosed above.

[0087] Figure 11 schematically depicts an architecture including a first machine learning model ML-1 and a second machine learning model ML-2, as described above.

[0088] The dashed rectangle 10 represents, according to some examples, aspects related to the context CTX (Figure 2) associated with the second machine learning model ML-2. Element 11 represents context information, element 12 represents the optional encoder (see also block ENC in Figure 2), and element 13 represents the encoded context information that may be obtained by the encoder 12.

[0089] The dashed rectangle 20 represents aspects of the first machine learning model ML-1 and different input information 21, 22 for the second machine learning model ML-2, which have different characteristics, such as different dimensions. As an example, input information 21 represents time-series information organized as a matrix with T columns and M rows, while input information 22 represents time-series information organized as a matrix with T' columns and M' rows, where, for example, T' is less than or equal to T and M' is less than or equal to M. According to the principles of this disclosure, the first machine learning model ML-1 can determine one or more characteristics of the input information to be provided to the second machine learning model ML-2 based on the encoded context information 13. In the example of Figure 11, this determination decides whether input information 21 or input information 22 is provided to the second machine learning model ML-2.
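The selection between input information 21 and 22 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the first machine learning model is a single linear scoring layer over the encoded context; the names `InputShape`, `select_input_shape`, the two-dimensional context vector, and the weight values are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class InputShape:
    """Candidate input characteristics: M rows (features) by T columns (time steps)."""
    rows: int
    cols: int


def select_input_shape(encoded_context: Sequence[float],
                       weights: Sequence[Sequence[float]],
                       candidates: Sequence[InputShape]) -> InputShape:
    """Score every candidate with one linear layer and return the best-scoring one."""
    if len(weights) != len(candidates):
        raise ValueError("one weight row per candidate shape is required")
    scores = [sum(w * x for w, x in zip(row, encoded_context)) for row in weights]
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]


# Hypothetical encoded context: (load level, energy price), both in [0, 1].
full = InputShape(rows=8, cols=60)      # stands in for input information 21 (M x T)
reduced = InputShape(rows=4, cols=15)   # stands in for input information 22 (M' x T')
W = [[1.0, -1.0],    # favours the full input when load is high and energy is cheap
     [-1.0, 1.0]]    # favours the reduced input otherwise

chosen_busy = select_input_shape([0.9, 0.1], W, [full, reduced])
chosen_costly = select_input_shape([0.2, 0.8], W, [full, reduced])
```

In a trained system the weight rows would be learned rather than written by hand; the sketch only shows the mapping from encoded context to a choice of input characteristics.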

[0090] The dashed rectangle 30 represents aspects of the second machine learning model ML-2, where element 31 represents the output (e.g., a prediction) that the second machine learning model ML-2 provides based on the input information supplied to it in accordance with the determination made using the first machine learning model ML-1.

[0091] The dashed rectangle 40 represents aspects of data processing, for example for providing input information ML-2-IN having the one or more characteristics determined using the first machine learning model ML-1. Element 41 represents aspects of data collection, which in some examples may be influenced by the first machine learning model ML-1, for example by or based on the one or more characteristics ML-2-IN-char for the input information ML-2-IN of the second machine learning model ML-2, as determined by the first machine learning model ML-1. It should be noted that in some other embodiments, data collection 41 may be performed independently of the one or more characteristics ML-2-IN-char. Element 42 represents aspects of data processing, which in some examples may likewise be influenced by or based on the one or more characteristics ML-2-IN-char for the input information ML-2-IN of the second machine learning model ML-2, as determined by the first machine learning model ML-1. Element 43 represents a database for at least temporarily storing data that may be provided as input information ML-2-IN to the second machine learning model ML-2. In this respect, arrow a1 of Figure 11 generally indicates that at least one aspect of the data processing 40 may be performed, at least temporarily, based on the operation of the first machine learning model ML-1, for example based on the one or more characteristics ML-2-IN-char of the input information ML-2-IN for the second machine learning model ML-2 that may be determined by the first machine learning model ML-1.

[0092] Element 50 of Figure 11 represents an aggregated cost function, which can represent at least one of the following: a) the loss of the second machine learning model ML-2, for example characterizing the prediction accuracy of the second machine learning model ML-2; or b) the operating cost of the second machine learning model ML-2 (for example characterizing training time or energy consumption), which may depend, for example, on one or more characteristics of the input information ML-2-IN. Arrow a2 represents backpropagation of at least one of the loss or the operating cost, which in some examples may also be combined, as represented by the aggregated cost function 50. This backpropagation may be used, for example, at least to train the second machine learning model ML-2, for example to adapt one or more of its parameters, such as weights. Similarly, arrow a3 represents backpropagation of the loss and / or the operating cost (or the aggregated cost), which may be used, for example, at least to train the first machine learning model ML-1, for example to adapt one or more of its parameters, such as weights. In some examples, the two machine learning models ML-1 and ML-2 can be trained jointly based on the aggregated cost function 50, with the parameters of both the first machine learning model ML-1 and the second machine learning model ML-2 being adapted.
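As a rough illustration of how a loss and an operating cost might be combined into one aggregated cost, the following Python sketch uses a convex combination. The disclosure does not specify the form of the aggregated cost function 50; the weighting parameter `cost_weight`, the function name, and the numeric values are assumptions made only for this example.

```python
def aggregated_cost(prediction_loss: float,
                    operating_cost: float,
                    cost_weight: float = 0.5) -> float:
    """Convex combination of prediction loss and operating cost (assumed form).

    cost_weight = 0 considers accuracy only; cost_weight = 1 considers cost only.
    Both terms are assumed to be normalised to comparable ranges beforehand.
    """
    if not 0.0 <= cost_weight <= 1.0:
        raise ValueError("cost_weight must lie in [0, 1]")
    return (1.0 - cost_weight) * prediction_loss + cost_weight * operating_cost


# Hypothetical measurements for two input configurations.
full_input = {"loss": 0.10, "cost": 0.80}      # accurate but expensive to run
reduced_input = {"loss": 0.15, "cost": 0.20}   # slightly less accurate, much cheaper

balanced_full = aggregated_cost(full_input["loss"], full_input["cost"])
balanced_reduced = aggregated_cost(reduced_input["loss"], reduced_input["cost"])
accuracy_only_full = aggregated_cost(full_input["loss"], full_input["cost"], 0.0)
accuracy_only_reduced = aggregated_cost(reduced_input["loss"], reduced_input["cost"], 0.0)
```

With equal weighting the reduced input has the lower aggregated cost, while with accuracy-only weighting the full input does, which is the kind of trade-off the aggregated cost is meant to expose.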

[0093] In some examples, information related to the context CTX (Figure 2) can be retrieved repeatedly, for example periodically or continuously. In some examples, the collected context information can be used for training or inference of at least one of the machine learning models ML-1 and ML-2. In some examples, the context information can also be used to determine which strategy should be adopted or applied, for example at each time step, for example by the second machine learning model ML-2. In some examples, the strategy can determine, for example, whether the values of a time series should be predicted using measurements with higher granularity (e.g., at 1-minute intervals) or measurements with lower granularity (e.g., at 1-hour intervals).

[0094] In some examples, the data types that can be used as context information may include at least one of the following: a) temporal context (e.g., hour of the day, day of the week, month, etc.); or b) environmental context (e.g., temperature, weather conditions, air pressure, humidity, etc.); or c) data source context (e.g., the device used to retrieve / generate the data, the nature of the data (e.g., synthetic / real), etc.); or d) human-centric context, such as work context (e.g., bank holidays, factory / office opening hours, etc.); or e) external context (e.g., traffic in a city, gasoline prices, city / country crime rate index, etc.), to name just a few. In some examples, the context information may be of different kinds and may vary considerably in size, for example depending on the use case. In some examples, where the context information is relatively complex, encoding techniques may be used to derive a compact representation of the context (i.e., an encoded context); see the optional encoder ENC of Figure 2.
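The idea of deriving a compact representation from heterogeneous context data can be sketched as follows. This is a hand-written feature encoding used as a stand-in; the disclosure does not specify how the encoder ENC is built, and the field names (`hour`, `weekday`, `weather`, `bank_holiday`) and the weather categories are hypothetical.

```python
import math
from typing import Any, List, Mapping

WEATHER_CATEGORIES = ("clear", "rain", "snow")  # hypothetical category set


def encode_context(ctx: Mapping[str, Any]) -> List[float]:
    """Turn raw context fields into a fixed-length numeric vector.

    Cyclic quantities (hour, weekday) become sine/cosine pairs so that
    23:00 and 00:00 end up close together; categories become one-hot entries.
    """
    hour_angle = 2.0 * math.pi * float(ctx["hour"]) / 24.0
    day_angle = 2.0 * math.pi * float(ctx["weekday"]) / 7.0
    weather = [1.0 if ctx["weather"] == name else 0.0 for name in WEATHER_CATEGORIES]
    holiday = 1.0 if ctx.get("bank_holiday", False) else 0.0
    return [math.sin(hour_angle), math.cos(hour_angle),
            math.sin(day_angle), math.cos(day_angle),
            *weather, holiday]


encoded = encode_context({"hour": 6, "weekday": 0, "weather": "rain", "bank_holiday": True})
```

The resulting eight-element vector could serve as the first information supplied to the first machine learning model; a learned encoder would replace this function where the context is too complex for manual feature design.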

[0095] In some examples, one or more characteristics of the input information for the second machine learning model can be determined based on context information, such as based on the first information I-1, as described above. In some examples, the first machine learning model ML-1 determines (e.g., decides) the one or more characteristics ML-2-IN-char of the input information ML-2-IN for the second machine learning model ML-2, such as during inference (and / or training). In some examples, the first machine learning model ML-1 may also determine information controlling how data (e.g., data representing potential input information for the second machine learning model ML-2) is managed; see, for example, arrow a1 of Figure 11.

[0096] In some training-related examples, as described above, a first machine learning model ML-1 and a second machine learning model ML-2 can be trained together, for example, through joint training, where, for example, encoded contextual information (e.g., first information I-1) can be used as input to the first machine learning model ML-1, while additional training data is provided to the second machine learning model ML-2.

[0097] In some examples, the first machine learning model ML-1 can also be combined with the second machine learning model ML-2, for example, to obtain an aggregated machine learning model (not shown). In some examples, the aspects described above related to the training and / or inference of machine learning models ML-1 and ML-2 may also be applied accordingly to such an aggregated machine learning model.

[0098] In some examples, one or more characteristics of the input information ML-2-IN (Figure 2) can characterize its shape. As an example, for time-series input information, a first input shape can be characterized by M feature values taken over T measurements, while a second input shape (different from the first) can have M' features (where M' ≠ M) and / or include T' measurements (where T' ≠ T), or any other shape under consideration. Similar to these examples related to the shape of the input information, one or more additional characteristics of the input information can be selected or changed depending on the context CTX (Figure 2), for example as indicated by the first information I-1.
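A change of input shape from M x T to M' x T' can be sketched in a few lines of Python. The selection rule used here (keep chosen feature rows, keep every `stride`-th measurement) is only one possible way to reduce the shape and is an assumption of this example, as are the names `adapt_input`, `keep_rows`, and `stride`.

```python
from typing import List

Matrix = List[List[float]]


def adapt_input(full: Matrix, keep_rows: List[int], stride: int) -> Matrix:
    """Return an M' x T' matrix built from an M x T matrix.

    keep_rows selects the feature rows to retain (M' = len(keep_rows));
    stride keeps every stride-th time step (T' = ceil(T / stride)).
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return [full[r][::stride] for r in keep_rows]


# M = 3 features, T = 6 measurements; entry value encodes (feature, time step).
full_input = [[float(10 * m + t) for t in range(6)] for m in range(3)]

# Reduced shape: M' = 2 features, T' = 3 measurements.
reduced_input = adapt_input(full_input, keep_rows=[0, 2], stride=2)
```

Other reductions, such as averaging over time windows instead of subsampling, would fit the same interface.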

[0099] In some examples, once one or more features for the input information of the second machine learning model ML-2 are determined, the input information ML-2-IN with the determined one or more features can be provided to the second machine learning model ML-2, for example, for processing by the second machine learning model ML-2. In some examples, the second machine learning model ML-2 can be a DNN.

[0100] As described above, during training, for example after output information ML-2-OUT (e.g., predictions) has been generated, the costs associated with generating the predictions (e.g., both in terms of loss (e.g., prediction accuracy) and cost (e.g., operating cost)) can be backpropagated, for example to adapt one or more weights of the second machine learning model ML-2. In some examples, backpropagation can also be used to adapt one or more weights of the first machine learning model ML-1. In some examples, in order to train at least the second machine learning model ML-2, training data TD (see also block 207 of Figure 4) that includes multiple different characteristics can be presented to the second machine learning model ML-2.

[0101] In some examples (Figure 2), during the inference phase, the first machine learning model ML-1 determines, based on the context (e.g., based on the first information I-1), one or more characteristics (e.g., characterizing, but not limited to, the shape) of the input information ML-2-IN for the second machine learning model ML-2, and thereby triggers one or more mechanisms associated with at least one of: a) data collection, or b) data processing, or c) data storage, such as modifying the rate of data processing and / or any other mechanism. In some examples, this enables context-based data processing in which, for example, the frequency of data collection can be adapted to a specific context, which in some examples can save energy and / or storage space.

[0102] Figure 12 schematically depicts aspects of training related to the example architecture of Figure 11. Element 60 represents the initialization of at least one of the machine learning models ML-1 and ML-2, for example along with training parameters (e.g., number of training epochs, data partitions to be used, etc.). Element 61 represents retrieving training data and context information from one or more databases 61a, for example for feeding into at least one of the machine learning models ML-1 and ML-2. Element 62 represents iteratively selecting different input information and / or one or more characteristics for the input information, for example to train at least the second machine learning model ML-2. Element 63 represents an optional adaptation of the input (e.g., the input layer) of the second machine learning model ML-2, for example based on the one or more characteristics for the input information determined by element 62. Element 64 represents performing forward propagation using the second machine learning model ML-2, for example to obtain a predicted output from the second machine learning model ML-2. In some examples, element 64 may also include performing forward propagation using the first machine learning model ML-1, for example using the context information obtained by element 61. Element 65 represents determining the prediction loss and the cost of running the second machine learning model ML-2; in some examples, the prediction loss and the cost can be aggregated, for example using a given cost function, to obtain an aggregated cost. In some examples, the aggregated cost can be used to update the parameters (e.g., weights) of at least one of the machine learning models ML-1 and ML-2. Element 66 represents determining whether a predetermined amount of training data and / or characteristics for the input information has been used for training. If so, the process proceeds to element 67, for example completing the training. If not, for example if additional training data and / or characteristics for the input information are available for training, the process can return to element 61, for example to retrieve the additional training data and / or characteristics for the input information.
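The control flow of elements 60 to 67 can be traced with a deliberately small Python sketch. It makes several simplifying assumptions that are not part of the disclosure: the second model is a plain linear regressor trained by stochastic gradient descent, the input characteristic is the number of leading features kept (the rest are zeroed), the operating cost is approximated by the fraction of features used, and the first model is omitted, so the aggregated cost per input width is only recorded, not backpropagated into a selector.

```python
from typing import Dict, List, Tuple

Sample = Tuple[List[float], float]


def train(samples: List[Sample],
          candidate_widths: List[int],
          epochs: int = 200,
          lr: float = 0.05,
          cost_weight: float = 0.1) -> Tuple[List[float], Dict[int, float]]:
    n_features = len(samples[0][0])
    weights = [0.0] * n_features                    # element 60: initialise the model
    last_cost: Dict[int, float] = {}
    for _ in range(epochs):                         # element 66: stop after a fixed budget
        for x, y in samples:                        # element 61: retrieve training data
            for width in candidate_widths:          # element 62: pick input characteristics
                # element 63: adapt the input by zeroing the features that are not used
                x_in = x[:width] + [0.0] * (n_features - width)
                # element 64: forward propagation
                pred = sum(w * v for w, v in zip(weights, x_in))
                # element 65: prediction loss, cost proxy, aggregated cost
                loss = (pred - y) ** 2
                op_cost = width / n_features
                last_cost[width] = loss + cost_weight * op_cost
                # parameter update from the gradient of the squared error
                grad = [2.0 * (pred - y) * v for v in x_in]
                weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights, last_cost                       # element 67: training complete


# Toy data in which only the first feature matters: y = 2 * x[0].
toy_samples = [([1.0, 0.5], 2.0), ([0.5, -1.0], 1.0), ([-1.0, 0.3], -2.0)]
learned_weights, cost_by_width = train(toy_samples, candidate_widths=[1, 2])
```

On this toy data both widths reach a near-zero loss, so the narrower input ends up with the lower aggregated cost; a selector model trained on such costs would learn to prefer it.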

[0103] In some examples, at least one additional criterion for determining whether training is complete can be used, such as a criterion based on an early stopping procedure. In some examples, early stopping refers to a regularization technique that can be used during the training of a deep neural network to prevent overfitting by stopping training when the model's performance on the validation set stops improving.
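Early stopping as described above can be sketched as a patience counter over validation losses. The parameter names `patience` and `min_delta` follow common usage in training libraries but are assumptions here, as is the list of example loss values.

```python
from typing import List


def early_stopping_epoch(val_losses: List[float],
                         patience: int = 3,
                         min_delta: float = 0.0) -> int:
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs; if that never happens,
    the index of the last epoch is returned.
    """
    if not val_losses:
        raise ValueError("at least one validation loss is required")
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical validation curve: improves for three epochs, then stalls.
stop_at = early_stopping_epoch([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5], patience=3)
never_stalls = early_stopping_epoch([0.5, 0.4, 0.3], patience=3)
```

Note that the later improvement to 0.5 in the first curve is never seen because training has already stopped, which is the usual trade-off when choosing the patience value.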

[0104] Figure 13 schematically depicts aspects of inference related to the example architecture of Figure 11.

[0105] Element 70 represents one or more data sources that, for example, provide data continuously, such as in the form of at least one data stream a4. Element 71 represents retrieving, from the one or more data sources 70, data for inference together with context information. The context information is provided to element 72 (see also arrow a5), which represents the first machine learning model ML-1 (Figure 2). The first machine learning model 72 determines the input shape a6 for the second machine learning model 73, for example based on the context information a5. In other words, in some examples, the input (e.g., the input layer) of the second machine learning model ML-2 can be reconfigured, for example based on the context information a5. Arrow a7 represents data forwarded from the retrieval block 71 to the second machine learning model 73 for inference, for example having the one or more characteristics determined by the first machine learning model 72. Element 73a represents monitoring of the second machine learning model 73, for example to determine the cost (e.g., energy consumption) of running the second machine learning model 73. Element 74 represents determining an aggregated cost based on the prediction accuracy of the second machine learning model 73 and the cost determined by the model monitoring block 73a. In some examples, the aggregated cost determined by element 74 can be used to update at least one aspect of the data sources 70 (see arrow a8). In some examples, the context information can be modified or updated based on the aggregated cost determined by element 74. Element 75 represents detecting changes in the one or more characteristics of the input information for the second machine learning model 73, for example as determined by the first machine learning model 72. If such a change is detected, a corresponding change may be applied to at least one of the following: a) data collection 76; or b) data processing; or c) data storage, thereby, for example, modifying the behavior of at least one of the data sources 70 (see also dashed arrow a9).
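The change detection of element 75 and the resulting adjustment of a data source can be sketched as follows. The classes `DataSource` and `CharacteristicsMonitor`, the function `reconfigure_source`, the one-hour window, and the rule that maps the number of time steps to a reporting interval are all hypothetical and serve only to illustrate the mechanism.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Shape = Tuple[int, int]  # (rows M, columns T)


@dataclass
class DataSource:
    """Stand-in for a data source whose reporting interval can be changed."""
    report_interval_s: int


class CharacteristicsMonitor:
    """Remembers the last input shape and reports when it changes."""

    def __init__(self) -> None:
        self._last: Optional[Shape] = None

    def changed(self, shape: Shape) -> bool:
        has_changed = self._last is not None and shape != self._last
        self._last = shape
        return has_changed


def reconfigure_source(source: DataSource, shape: Shape, window_s: int = 3600) -> None:
    """Spread the T requested measurements evenly over the observation window."""
    _, cols = shape
    source.report_interval_s = window_s // cols


source = DataSource(report_interval_s=60)
monitor = CharacteristicsMonitor()
interval_history: List[int] = []

# Hypothetical sequence of shapes decided by the first model over four time steps.
for shape in [(8, 60), (8, 60), (4, 12), (4, 12)]:
    if monitor.changed(shape):
        reconfigure_source(source, shape)
    interval_history.append(source.report_interval_s)
```

When the requested shape drops from 60 to 12 time steps per hour, the source is told to report every 300 seconds instead of every 60, which is where savings in collection, processing, and storage would come from.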

[0106] Figure 14 schematically depicts aspects of an integrated machine learning pipeline based on the principles of this disclosure, the pipeline having data processing as well as machine learning model processing and deployment capabilities. Element 80 represents a machine learning model in a first variant, for example the second machine learning model ML-2 (Figure 2). The machine learning model 80 of Figure 14 can be characterized, for example, by a first function F(I; W; B; A), where I represents the input information, such as input features, W represents the weights, for example organized as a weight matrix, B represents the bias, and A represents the activation function(s) associated with the machine learning model 80. Element 81 of Figure 14 represents a context-based transformation, which in some examples can be performed based on the context represented by the first information I-1 (Figure 2). The context-based transformation 81 transforms the function F into a modified function F' (see block 82 of Figure 14), which can be represented, for example, as F'(I'; W; B; A). The element I' of the modified function F' represents modified input information, in which, for example, at least one characteristic is changed with respect to the input information I of the function F (see block 80). In some examples, the modified input information I' can be provided by, for example, block 83, which represents at least one of the following: a) data acquisition, b) data processing, or c) data storage.
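The relationship between F(I; W; B; A) and F'(I'; W; B; A) can be made concrete with a single dense layer in Python. The disclosure does not say how I' is derived from I; this sketch assumes, for illustration only, that the context-based transformation masks selected input features to zero so that the same W, B, and A remain usable. The names `dense_layer`, `context_transform`, and `relu` are hypothetical.

```python
from typing import Callable, List, Sequence


def relu(z: float) -> float:
    return max(0.0, z)


def dense_layer(I: Sequence[float],
                W: Sequence[Sequence[float]],
                B: Sequence[float],
                A: Callable[[float], float]) -> List[float]:
    """F(I; W; B; A): apply activation A to W·I + B, one output per weight row."""
    return [A(sum(w * x for w, x in zip(row, I)) + b) for row, b in zip(W, B)]


def context_transform(I: Sequence[float], keep: Sequence[bool]) -> List[float]:
    """Produce I' from I by zeroing the features the context marks as unused."""
    return [x if k else 0.0 for x, k in zip(I, keep)]


W = [[1.0, 2.0, 3.0]]
B = [0.5]
I = [1.0, 1.0, 1.0]

output_full = dense_layer(I, W, B, relu)                # F(I; W; B; A)
I_prime = context_transform(I, keep=[True, False, True])
output_modified = dense_layer(I_prime, W, B, relu)      # F'(I'; W; B; A)
```

Masking keeps the weight matrix shape unchanged; an implementation that truly shrinks the input dimension would also need to adapt the input layer, as element 63 of Figure 12 suggests.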

[0107] In some examples, the configuration of Figure 14 represents a structure that can, for example, automatically (e.g., without human interaction) select one or more characteristics (e.g., but not limited to, the shape) of the input information I, I' for the second machine learning model ML-2. In some examples, the structure can also adjust, for example, the rate at which data is collected, processed, and / or stored.

[0108] As mentioned above, the context CTX (Figure 2) associated with the predictions of the second machine learning model ML-2 can be used to modify at least one aspect of the input information and / or the data processing for at least one of the machine learning models ML-1 and ML-2. In some examples, the context CTX can also characterize at least one temporal aspect (e.g., the hour of the day, the day of the week, etc.) and / or potential constraints, such as deployment costs associated at least with the second machine learning model ML-2.

[0109] According to the principles of this disclosure, the characteristics of the input information used for at least the second machine learning model ML-2 can be dynamically influenced (e.g., modified), for example, during at least one of a) training or b) inference, based on at least one of performance (e.g., prediction performance) or cost (e.g., operating cost).

[0110] In some examples, the intensity of data being collected, processed, and / or stored for inference using at least a second machine learning model ML-2 can be controlled (e.g., adjusted) according to the principles of this disclosure.

[0111] In some examples, the principles of this disclosure enable cost-effective machine-learning-based solutions that, for example, provide relatively high performance while adapting themselves to different situations and environments (e.g., as characterized by the context CTX), reducing their operating cost and / or improving or maintaining prediction accuracy even when operating at reduced cost.

[0112] Some examples (Figure 15) relate to a computer program PRG comprising instructions INSTR that, when executed by the apparatus 100, 100', cause the apparatus 100, 100' to perform the method according to this disclosure.

[0113] Some examples (Figure 15) relate to a computer-readable storage medium ST-M, such as a non-transitory computer-readable storage medium, comprising the computer program PRG according to this disclosure.

[0114] Some examples (Figure 15) relate to a data carrier signal DCS that carries and / or characterizes the computer program PRG according to this disclosure.

Claims

1. A communication apparatus (100), comprising: at least one processor (102) and at least one memory (104), the at least one memory (104) storing instructions (106) that, when executed by the at least one processor (102), cause the apparatus (100) to at least: determine (200), using a first machine learning model (ML-1) and based on first information (I-1) characterizing a context (CTX) associated with a second machine learning model (ML-2), one or more characteristics (ML-2-IN-char) of input information (ML-2-IN) for the second machine learning model (ML-2); and provide (202) the input information (ML-2-IN) having the determined one or more characteristics (ML-2-IN-char) to the second machine learning model (ML-2).

2. The apparatus (100) of claim 1, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to: determine (205) second information (I-2) based on processing the input information (ML-2-IN) having the determined one or more characteristics (ML-2-IN-char) using the second machine learning model (ML-2), the second information (I-2) characterizing at least one of: a) the prediction accuracy of the second machine learning model (ML-2) with respect to the input information (ML-2-IN), or b) the cost associated with the execution of the second machine learning model (ML-2) using the input information (ML-2-IN).

3. The apparatus (100) according to claim 2, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to: train (207) at least one of the first machine learning model (ML-1) or the second machine learning model (ML-2) based at least on the second information (I-2).

4. The apparatus (100) according to claim 3, wherein the training (207) comprises: providing the second machine learning model (ML-2) with training data (TD) having multiple different characteristics.

5. The apparatus (100) according to any one of the preceding claims, wherein the one or more characteristics (ML-2-IN-char) for the input information (ML-2-IN) characterize at least one of the following: a) the dimension (DIM) of the input information (ML-2-IN); or b) the resolution (RES) of the input information (ML-2-IN); or c) the modality (MOD) of the input information (ML-2-IN); or d) the numerical range (RNG) of the input information (ML-2-IN); or e) the noise (NOI) of the input information (ML-2-IN); or f) context metadata (CTX-META).

6. The apparatus (100) according to any one of the preceding claims, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to perform at least one of the following: a) adapting (210) the input of the second machine learning model (ML-2) to the determined one or more characteristics (ML-2-IN-char); or b) processing (212) the input information (ML-2-IN) having the determined one or more characteristics (ML-2-IN-char) using the second machine learning model (ML-2).

7. The apparatus (100) according to any one of the preceding claims, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to perform at least one of the following: a) processing (220) context information (I-CTX) characterizing the context associated with the second machine learning model (ML-2); or b) encoding (222) the context information (I-CTX) to obtain a simplified representation (I-CTX') of the context.

8. The apparatus (100) according to any one of the preceding claims, wherein the context (CTX) characterizes at least one of: a) a strategy for the second machine learning model (ML-2); or b) a temporal context; or c) an environmental context; or d) information related to at least one data source configured to provide the input information (ML-2-IN); or e) the type of the input information (ML-2-IN); or f) a human-centered context; or g) traffic-related context information; or h) energy cost-related context information; or i) safety-related context information.

9. The apparatus (100) according to any one of the preceding claims, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to: manage (230) data for the second machine learning model (ML-2) based on at least one of: a) the first information (I-1); or b) the determined one or more characteristics (ML-2-IN-char) for the input information (ML-2-IN).

10. The apparatus (100) of claim 9, wherein managing (230) the data comprises at least one of: a) collecting (230a) the data; or b) processing (230b) the data; or c) providing (230c) at least a portion of the data as input information (ML-2-IN) having the determined one or more characteristics (ML-2-IN-char) to the second machine learning model (ML-2); or d) modifying (230d) at least one aspect of at least one data source (DS) for providing the data.
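
By way of non-limiting illustration only, the four options of claim 10 can be sketched in a single routine. The class `DataSource`, its `period_s` attribute, the function `manage_data`, and the rule that slows the source when fewer values are needed are all hypothetical choices for this example, not details of the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataSource:
    """Hypothetical data source (DS) with an adjustable reporting period."""
    period_s: float = 1.0
    buffer: List[float] = field(default_factory=list)


def manage_data(source: DataSource, samples: List[float], dimension: int) -> List[float]:
    """Sketch of managing (230) data for ML-2 given a determined input dimension."""
    source.buffer.extend(samples)                     # 230a: collect the data
    recent = source.buffer[-dimension:]               # 230b: process (keep newest values)
    # 230d: modify an aspect of the data source. Assumed rule: a small
    # input dimension needs fewer samples, so the source may report less often.
    source.period_s = 1.0 if dimension >= 8 else 2.0
    return recent                                     # 230c: provide as ML-2-IN


src = DataSource()
ml2_in = manage_data(src, [1.0, 2.0, 3.0, 4.0, 5.0], dimension=3)
```

Adjusting the source itself, rather than only discarding surplus samples, is one way the determined characteristics could reduce data collection effort.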

11. The apparatus (100) according to any one of the preceding claims, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to perform at least one of the following: a) providing (240) at least one of the following: a1) the first machine learning model (ML-1), or a2) the second machine learning model (ML-2); or b) operating (242) at least one of the following: b1) the first machine learning model (ML-1), or b2) the second machine learning model (ML-2).

12. The apparatus (100) according to any one of the preceding claims, wherein the instructions (106), when executed by the at least one processor (102), cause the apparatus (100) to perform at least one of the following: a) dynamically changing (250) one or more characteristics (ML-2-IN-char) of the input information (ML-2-IN) for the second machine learning model (ML-2) based at least on the first information (I-1); or b) dynamically modifying (252) at least one aspect of the management of data for the second machine learning model (ML-2) based at least on the second information (I-2).

13. An apparatus (100') for communication, comprising means (102') for: using a first machine learning model (ML-1), determining (200) one or more characteristics (ML-2-IN-char) of input information (ML-2-IN) for a second machine learning model (ML-2) based on first information (I-1) characterizing a context associated with the second machine learning model (ML-2); and providing (202) the input information (ML-2-IN) having the determined one or more characteristics (ML-2-IN-char) to the second machine learning model (ML-2).

14. A method for communication, comprising: using a first machine learning model (ML-1), determining (200) one or more characteristics (ML-2-IN-char) of input information (ML-2-IN) for a second machine learning model (ML-2) based on first information (I-1) characterizing a context associated with the second machine learning model (ML-2); and providing (202) the input information (ML-2-IN) having the determined one or more characteristics (ML-2-IN-char) to the second machine learning model (ML-2).
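
By way of non-limiting illustration only, the two steps of the method of claim 14 can be sketched end to end with stand-in models. Everything named here is an assumption for the example: `ml1_determine` is a fixed rule standing in for a trained first model, `ml2_predict` is a simple average standing in for the second model, and the `energy_cost` key and the subsampling in `adapt_input` are illustrative choices.

```python
from typing import Dict, List, Sequence, Tuple


def ml1_determine(first_info: Dict[str, float]) -> Dict[str, object]:
    """Stand-in for ML-1, step 200: choose input characteristics from context."""
    # Assumed rule: when energy is costly, request a smaller input dimension.
    dimension = 4 if first_info["energy_cost"] > 0.5 else 8
    return {"dimension": dimension, "range": (0.0, 1.0)}


def adapt_input(raw: Sequence[float], chars: Dict[str, object]) -> List[float]:
    """Shape raw data so that it has the determined characteristics."""
    dimension = int(chars["dimension"])
    low, high = chars["range"]
    step = max(1, len(raw) // dimension)
    sampled = list(raw[::step])[:dimension]              # reduce the dimension
    return [min(high, max(low, x)) for x in sampled]     # enforce the numerical range


def ml2_predict(ml2_in: Sequence[float]) -> float:
    """Stand-in for ML-2: here simply the mean of its input."""
    return sum(ml2_in) / len(ml2_in)


def run_method(first_info: Dict[str, float], raw: Sequence[float]) -> Tuple[List[float], float]:
    chars = ml1_determine(first_info)       # step 200: determine characteristics
    ml2_in = adapt_input(raw, chars)
    return ml2_in, ml2_predict(ml2_in)      # step 202: provide input to ML-2


raw_samples = [i / 8 for i in range(8)]
costly_in, costly_out = run_method({"energy_cost": 0.9}, raw_samples)
cheap_in, cheap_out = run_method({"energy_cost": 0.1}, raw_samples)
```

Calling `run_method` with different context values yields inputs of different sizes for the same raw data, which is the trade-off between prediction quality and operating cost that the determination step is meant to steer.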

15. A computer program (PRG) product comprising instructions (INSTR) that, when executed by a device (100; 100'), cause the device (100; 100') to perform the method according to claim 14.

16. A computer-readable storage medium (ST-M), such as a non-transitory computer-readable storage medium, comprising a computer program (PRG) product according to claim 15.

17. A data carrier signal (DCS) carrying and/or characterizing a computer program (PRG) product according to claim 15.