Method and system for providing current or predicted data from a data source

The system addresses sensor network inefficiencies by using generative and decay models to dynamically assess data accuracy, providing timely and reliable data while conserving resources.

WO2026125404A1 · PCT designated stage · Publication Date: 2026-06-18 · KONINK KPN NV +1

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
KONINK KPN NV
Filing Date
2025-12-09
Publication Date
2026-06-18

AI Technical Summary

Technical Problem

Existing systems struggle to efficiently manage data freshness and resource consumption in sensor networks, leading to overloaded sensors and unresponsive applications due to insufficient sensor capabilities.

Method used

A method and system that utilize a generative model to generate predicted data and a decay model to estimate its accuracy, allowing the selection of actual or predicted data based on reliability, thereby reducing resource consumption and ensuring timely responses.

🎯Benefits of technology

The system balances resource usage by leveraging generative and decay models to provide high-fidelity predictions, keeping data accuracy within acceptable thresholds while reducing bandwidth and computational pressure.

✦ Generated by Eureka AI based on patent content.

Smart Images

  • Figure EP2025086179_18062026_PF_FP_ABST
Patent Text Reader

Abstract

Some embodiments are directed to a method (400) for providing current or predicted data from a data source. The method includes obtaining (440) from a decay model an estimate of the quantitative measure of the accuracy of a prediction by a generative model of the current data of a data source, and, based on the estimate, selecting (450) to obtain actual current data from the data source or to obtain predicted current data from the generative model.

Description

[0001] METHOD AND SYSTEM FOR PROVIDING CURRENT OR PREDICTED DATA

[0002] FROM A DATA SOURCE

[0003] TECHNICAL FIELD

[0004] The presently disclosed subject matter relates to a method for providing current or predicted data from a data source, a system for providing current or predicted data from a data source, and a computer storage medium.

[0005] BACKGROUND

[0006] The article “Age of Information-Aware Scheduling for Timely and Scalable Internet of Things Applications”, by Lorenzo Corneo, Christian Rohner, and Per Gunningberg (included herein by reference), describes a known age of information-aware scheduling policy for a physical device to push sensor updates.

[0007] The known system has applications, an edge device, and physical sensing devices. The physical sensing devices comprise a sensor network that provides sensor readings. The applications specify an interest in particular sensors but have requirements on data freshness.

[0008] The edge device is located close to the sensor network. It aggregates all the requirements of the applications and combines them together, creating a schedule. The schedule instructs the sensors on when to push updates toward the edge server. Such a schedule aims to minimize the number of sensor update transmissions while providing the required level of information freshness.

[0009] However, even with the smart schedule, the capabilities of the sensors may not be enough to satisfy the demands of all applications. This can result in overloaded, and possibly unresponsive, sensors. The applications may not receive the updates that they requested with the desired timeliness.

[0010] SUMMARY

[0011] A method and system for providing current or predicted data from a data source are described in the accompanying claims. Specific embodiments of the invention are set forth in the dependent claims. For example, the method may comprise maintaining a generative model for modeling of the data of the data source, the generative model being configured to generate predicted data for the data source at a given time point.

[0012] Typically, the generative model is configured for forward modeling of the data of the data source; the generative model being configured to generate predicted data for the data source at a given future time point. In this context, the future time point is relative to the most recent time point actual current data was obtained from the data source.

[0013] For example, the method may comprise maintaining a decay model, the decay model being configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model.

[0014] For example, the method may comprise receiving a request for current data from the data source. In response, one may obtain from the decay model an estimate of the quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source, and based on the estimate, select to obtain actual current data from the data source or to obtain predicted current data from the generative model.

[0015] For example, the method may comprise returning actual current data or predicted current data.

[0016] Various advantages of such a method are set out herein. For example, advantages include reducing resource consumption and ensuring timely responses. By leveraging a generative model, the system can provide high-fidelity predictions when direct access to the data source is limited or resource-intensive. Additionally, incorporating a decay model allows dynamic assessment of the predicted data's reliability, ensuring that applications receive data within acceptable accuracy thresholds.

[0017] Some embodiments are directed to a method for providing current or predicted data from a data source. The method includes obtaining from a decay model an estimate of the quantitative measure of the accuracy of a prediction by a generative model of the current data of a data source, and based on the estimate, selecting to obtain actual current data from the data source or to obtain predicted current data from the generative model.
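By way of illustration only, the selection step of such a method might be sketched in Python as follows. The function name `provide_current_data`, the callables standing in for the decay model, the generative model, and the data source, and the `min_accuracy` threshold are assumptions made for this sketch. The sketch further assumes that a higher value of the quantitative measure indicates a more accurate prediction, which the disclosure does not prescribe.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Response:
    value: float
    predicted: bool  # True if the value came from the generative model


def provide_current_data(
    estimate_accuracy: Callable[[float], float],  # hypothetical decay-model interface
    predict: Callable[[float], float],            # hypothetical generative-model interface
    read_source: Callable[[], float],             # hypothetical access to the data source
    now: float,
    min_accuracy: float,
) -> Response:
    """Select predicted or actual current data based on the decay model's estimate."""
    if estimate_accuracy(now) >= min_accuracy:
        # The prediction is estimated to be accurate enough: spare the data source.
        return Response(predict(now), predicted=True)
    # Otherwise obtain actual current data from the data source.
    return Response(read_source(), predicted=False)
```

In this sketch the data source is only accessed when the estimated accuracy falls below the threshold, which is where the reduction in resource consumption would come from.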

[0018] The method may be computer implemented. The system may be an electronic system, e.g., an electronic device, e.g., comprising one or more computers.

[0019] An embodiment of the method may be implemented on a computer as a computer implemented method, or in dedicated hardware, or in a combination of both. Executable code for an embodiment of the method may be stored on a computer program product. Examples of computer program products include memory devices, optical storage devices, integrated circuits, servers, online software, etc. Preferably, the computer program product comprises non-transitory program code stored on a computer readable medium for performing an embodiment of the method when said program product is executed on a computer.

[0020] In an embodiment, the computer program comprises computer program code adapted to perform all or part of the steps of an embodiment of the method when the computer program is run on a computer. Preferably, the computer program is embodied on a computer readable medium.

[0021] BRIEF DESCRIPTION OF DRAWINGS

[0022] Further details, aspects, and embodiments will be described, by way of example only, with reference to the drawings. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. In the figures, elements which correspond to elements already described may have the same reference numerals. In the drawings,

[0023] Figure 1a schematically shows an example of an embodiment of a data system,

[0024] Figure 1b schematically shows an example of an embodiment of a data system,

[0025] Figure 2a schematically shows an example of an embodiment of a system for providing current or predicted data from a data source,

[0026] Figure 2b schematically shows an example of an embodiment of a system for providing current or predicted data from a data source,

[0027] Figure 3 schematically shows an example of a graph schematically showing model accuracy versus time,

[0028] Figure 4 schematically shows an example of an embodiment of a method for providing current or predicted data from a data source,

[0029] Figure 5 schematically shows an example of an embodiment of a method for providing current or predicted data from a data source,

[0030] Figure 6a schematically shows a computer readable medium having a writable part comprising a computer program according to an embodiment,

[0031] Figure 6b schematically shows a representation of a processor system according to an embodiment.

Reference signs list

[0032] The following list of references and abbreviations corresponds to Figures 1a-3, and is provided for facilitating the interpretation of the drawings. It shall not be construed as limiting the claims.

[0033] 100, 102 a data system

[0034] 110, 110.1, 110.2 an application device

[0035] 120 a scheduling device

[0036] 130, 130.1, 130.2 a model management device

[0037] 140.1-140.3 a data source

[0038] 111 a processor system

[0039] 112 storage

[0040] 113 a communication interface

[0041] 121 a processor system

[0042] 122 a storage

[0043] 123 a communication interface

[0044] 131 a processor system

[0045] 132 a storage

[0046] 133 a communication interface

[0047] 172 a network

[0048] 200, 201 a data system

[0049] 210 an application

[0050] 220 a scheduler

[0051] 231 a generative model

[0052] 232 a decay model

[0053] 240 one or more data sources

[0054] 211 a request for current data from a data source

[0055] 212 an accuracy query

[0056] 213 a return of the actual current data or predicted current data

[0057] 241 a data source polling

[0058] 230 model manager

[0059] 221 providing actual current data

[0060] 222 retrieving predicted current data

223 retrieving an estimate of a quantitative measure of the accuracy of the predicted data

[0061] 300 a graph schematically showing accuracy versus time

[0062] 311 a time axis

[0063] 312 an accuracy axis

[0064] 1000, 1001 a computer readable medium

[0065] 1010 a writable part

[0066] 1020 a computer program

[0067] 1110 one or more integrated circuits

[0068] 1120 a processing unit

[0069] 1122 a memory

[0070] 1124 a dedicated integrated circuit

[0071] 1126 a communication element

[0072] 1130 an interconnect

[0073] 1140 a processor system

[0074] DESCRIPTION OF EMBODIMENTS

[0075] While the presently disclosed subject matter is susceptible of embodiment in many different forms, there are shown in the drawings and will herein be described in detail one or more specific embodiments, with the understanding that the present disclosure is to be considered as exemplary of the principles of the presently disclosed subject matter and not intended to limit it to the specific embodiments shown and described.

[0076] In the following, for the sake of understanding, elements of embodiments are described in operation. However, it will be apparent that the respective elements are arranged to perform the functions being described as performed by them.

[0077] Further, the subject matter that is presently disclosed is not limited to the embodiments only, but also includes every other combination of features described herein or recited in mutually different dependent claims.

Figure 1a schematically shows an example of a data system 100, including an embodiment of an application device 110, an embodiment of a scheduling device 120, and an embodiment of a model management device 130.

[0078] Application device 110 is configured to request scheduling device 120 for current data from a data source. Application device 110 receives from scheduling device 120 either actual current data from the data source, or predicted current data from a generative model.

[0079] Scheduling device 120 is configured to receive the request for current data, and is configured to decide whether to return to application device 110 actual current data from the data source or predicted current data from the generative model. The request for current data means data which is more recent than the last data received by application device 110 previously. Current data may refer to data that is at the data source at the time of the request, or at a time after the request, e.g., when actual current data is obtained, or predicted current data is generated.

[0080] To provide actual current data from the data source, scheduling device 120 may have to schedule polling the data source, and / or wait for the data source to push the current data. Actual current data is also referred to as raw data. Predicted current data is also referred to as modeled data. Instead of polling from a data source, the data source may be configured to push data, e.g., to scheduling device 120; the push schedule of the data source may be configured by the scheduling device as needed.

[0081] Scheduling device 120 may obtain from a decay model an estimate of a quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source. Based on the estimate scheduling device 120 may then decide to use the actual current data or the predicted current data.

[0082] Model management device 130 is configured to generate predicted data for the data source at a given time point. Model management device 130 may also be configured to maintain the generative model for modeling, e.g., forward modeling, of the data of the data source. Model management device 130 may be configured to generate predicted data for the data source at a given future time point, with respect to the last update of the generative model.

[0083] Scheduling device 120 and / or model management device 130 is configured to maintain a decay model. The decay model is configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model. Having the decay model in the scheduling device 120 has the advantage that scheduling device 120 can use the decay model to determine whether or not to obtain a prediction from the generative model. On the other hand, efficiency benefits are available by having the generative model and decay model both in model management device 130.
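Purely as an illustrative sketch, a decay model could estimate accuracy as a function of the time elapsed since the generative model was last updated with actual data. The exponential shape, the class name `ExponentialDecayModel`, and the parameters `initial_accuracy` and `half_life_s` are assumptions chosen for the example; the disclosure does not fix the form of the decay model.

```python
from dataclasses import dataclass


@dataclass
class ExponentialDecayModel:
    """Hypothetical decay model: estimated accuracy halves every `half_life_s` seconds."""

    initial_accuracy: float      # estimated accuracy right after a model update (0..1)
    half_life_s: float           # assumed time for the estimate to drop by half
    last_update_s: float = 0.0   # time at which actual data last updated the model

    def estimate(self, t_s: float) -> float:
        """Estimated accuracy of a prediction made for time `t_s`."""
        elapsed = max(0.0, t_s - self.last_update_s)
        return self.initial_accuracy * 0.5 ** (elapsed / self.half_life_s)

    def record_update(self, t_s: float) -> None:
        """Reset the decay when actual current data has been obtained at `t_s`."""
        self.last_update_s = t_s
```

A scheduling device holding such a model could call `estimate` before deciding whether to request a prediction at all, and call `record_update` whenever actual data is obtained.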

[0084] Scheduling device 120 may be split into multiple systems, e.g., a cloud instance to export predicted or actual data to the applications, and an edge device located close to the data sources. The edge device does the actual interfacing with the data sources.

[0085] For example, data system 100 may be used in computer networks to reduce pressure on data sources, e.g., sensors, video sources, and the like. For example, embodiments reduce bandwidth pressure, computation pressure at the data source, etc. Embodiments may be applied to data collection scheduling or application data analytics. In particular, embodiments are useful for so-called digital twins in which a digital twin is trained to simulate the behavior of a physical entity, system, or process based on real-time or historical data from its corresponding physical counterpart. A digital twin may be a virtual representation of a physical object or system that is updated continuously using data from its physical counterpart to reflect its current state and behavior. Digital twins are used in a variety of industries to predict outcomes, optimize performance, and plan maintenance by analyzing both real-world and simulated data.

[0086] In this context, the digital twin relies on accurate and timely data from the data source or sources to maintain fidelity with the physical counterpart. However, querying the data source for real-time updates can impose significant resource demands, especially when multiple digital twins rely on the same data source. This creates potential bottlenecks in bandwidth, computational resources, and system efficiency.

[0087] Data system 100 addresses these challenges by leveraging scheduling device 120 and model management device 130 to balance the use of actual and predicted data. By incorporating a decay model, the system dynamically evaluates the accuracy of predicted data generated by the generative model. If the decay model indicates that the predicted data is likely to be accurate within an acceptable threshold, the system can avoid querying the data source, thereby reducing resource pressure.

[0088] In an embodiment, a system is provided that integrates generative or predictive modeling of data likely to reside at a given data source, e.g., an endpoint, with the scheduling of data retrieval from that data source. The model-generated data may be used not only for scheduling but also to influence decisions such as the selection of data sources, the polling order of multiple data sources, and the number of data sources polled within a specified timeframe.

Application device 110 may comprise a processor system 111, a storage 112, and a communication interface 113. Scheduling device 120 may comprise a processor system 121, a storage 122, and a communication interface 123. Model management device 130 may comprise a processor system 131, a storage 132, and a communication interface 133.
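As a non-limiting sketch of how model-derived estimates could influence the selection of data sources, the polling order, and the number of data sources polled within a timeframe, the following Python function polls the data sources whose predictions are estimated to be least accurate first, up to a polling budget. The function name, the `min_accuracy` threshold, and the `max_polls` budget are hypothetical and serve only to illustrate one possible policy.

```python
from typing import Dict, List


def select_sources_to_poll(
    estimated_accuracy: Dict[str, float],  # per-source estimate from a decay model
    min_accuracy: float,                   # assumed acceptable accuracy threshold
    max_polls: int,                        # assumed polling budget for this timeframe
) -> List[str]:
    """Return the sources to poll, least accurate prediction first."""
    # Only sources whose predictions are no longer accurate enough need polling.
    needs_polling = [s for s, acc in estimated_accuracy.items() if acc < min_accuracy]
    # Poll the sources with the lowest estimated accuracy first.
    needs_polling.sort(key=lambda s: estimated_accuracy[s])
    # Respect the budget on the number of sources polled.
    return needs_polling[:max_polls]
```

Sources whose predictions remain above the threshold are not polled at all in this sketch; their requests would be served from the generative model.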

[0089] In the various embodiments of communication interfaces 113, 123, and / or 133, the communication interfaces may be selected from various alternatives. For example, the interface may be a network interface to a local or wide area network, e.g., the Internet, a storage interface to an internal or external data storage, or an application programming interface (API), etc.

[0090] Application device 110, scheduling device 120, and model management device 130 are represented here as single devices. However, any or all of them could just as well be implemented as systems, e.g., geographically distributed systems, e.g., cloud computing systems, e.g., a system comprising multiple computers. Furthermore, any one of application device 110, scheduling device 120, and model management device 130 could be implemented as a process running in a computer, e.g., a cloud computing system.

[0091] Storage 112, 122 and 132 may comprise, e.g., electronic storage, magnetic storage, etc. The storage may comprise local storage, e.g., a local hard drive or electronic memory. Storage 112, 122 and 132 may comprise non-local storage, e.g., cloud storage. In the latter case, storage 112, 122 and 132 may comprise a storage interface to the non-local storage. Storage may comprise multiple discrete sub-storages that together form storage 112, 122, and 132.

[0092] Storage 112, 122, and / or 132 may comprise, e.g., non-transitory storage. For example, storage 112, 122, and / or 132 may store data only in the presence of power, such as a volatile memory device, e.g., a Random Access Memory (RAM). For example, storage 112, 122, and / or 132 may store data in the presence of power as well as outside the presence of power, such as a non-volatile memory device, e.g., Flash memory. Storage may comprise a volatile writable part, say a RAM, and / or a non-volatile writable part, e.g., Flash. Storage may comprise a non-volatile non-writable part, e.g., ROM, e.g., storing part of the software. Scheduling device 120 may have access to a database, e.g., for storing actual data obtained from the data source, which may be used for responding to a request for current data, e.g., if the stored data is recent enough.
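The use of stored actual data that is recent enough may, for example, be sketched as follows. The record type `StoredReading`, the function `fresh_stored_value`, and the `max_age_s` parameter are hypothetical names introduced for this illustration; what counts as recent enough would depend on the requesting application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredReading:
    value: float
    obtained_at_s: float  # time at which the actual data was obtained from the source


def fresh_stored_value(
    stored: Optional[StoredReading], now_s: float, max_age_s: float
) -> Optional[float]:
    """Return the stored actual value if it is recent enough, else None."""
    if stored is not None and now_s - stored.obtained_at_s <= max_age_s:
        return stored.value
    # Nothing stored, or the stored data is too old to answer the request.
    return None
```

When `None` is returned, the scheduling device would fall back to choosing between polling the data source and requesting a prediction.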

[0093] Devices 110, 120, and / or 130 may communicate internally, with each other, with other devices, external storage, input devices, output devices, and / or one or more sensors over a computer network. The computer network may be an internet, an intranet, a LAN, a WLAN, a WAN, etc. The computer network may be the Internet. Devices 110, 120, and / or 130 may comprise a connection interface arranged to communicate within data system 100 or outside of data system 100 as needed. For example, the connection interface may comprise a connector, e.g., a wired connector, e.g., an Ethernet connector, an optical connector, etc., or a wireless connector, e.g., an antenna, e.g., a Wi-Fi, 4G, or 5G antenna.

[0094] The communication interface 113 may be used to send or receive digital data, e.g., application data requests, responses from scheduling devices, or notifications regarding data updates. Communication interface 123 may be used to send or receive digital data, e.g., requests for generative model predictions, decay model accuracy estimates, or scheduling decisions regarding data polling. Communication interface 133 may be used to send or receive digital data, e.g., current data to update the generative models, model updates, e.g., generative or decay model updates, e.g., updated parameters, or predicted data.

[0095] Application device 110, scheduling device 120, and model management device 130 may have a user interface, which may include well-known elements such as one or more buttons, a keyboard, a display, a touchscreen, etc. The user interface may be arranged to accommodate user interaction for performing configuration of model parameters, monitoring system performance, or issuing manual overrides to automated scheduling processes.

[0096] The communication interface 123 may be used to communicate with other scheduling devices, e.g., for coordination of data polling schedules, exchange of shared model parameters, or balancing workloads across distributed systems, etc.

[0097] The execution of devices 110, 120, and / or 130 may be implemented in a processor system. Devices 110, 120, and / or 130 may comprise functional units to implement aspects of embodiments. The functional units may be part of the processor system. For example, functional units shown herein may be wholly or partially implemented in computer instructions that are stored in a storage of the device and executable by the processor system.

[0098] The processor system may comprise one or more processor circuits, e.g., microprocessors, CPUs, GPUs, etc. Devices 110, 120, and / or 130 may comprise multiple processors. A processor circuit may be implemented in a distributed fashion, e.g., as multiple sub-processor circuits. For example, devices 110, 120, and / or 130 may use cloud computing. Typically, application device 110, scheduling device 120, and model management device 130 each comprise one or more microprocessors which execute appropriate software stored at the device; for example, that software may have been downloaded and / or stored in a corresponding memory, e.g., a volatile memory such as RAM or a non-volatile memory such as Flash.

[0099] Instead of using software to implement a function, devices 110, 120 and / or 130 may, in whole or in part, be implemented in programmable logic, e.g., as field-programmable gate array (FPGA). The devices may be implemented, in whole or in part, as a so-called application-specific integrated circuit (ASIC), e.g., an integrated circuit (IC) customized for their particular use. For example, the circuits may be implemented in CMOS, e.g., using a hardware description language such as Verilog, VHDL, etc. In particular, application device 110, scheduling device 120 and model management device 130 may comprise circuits, e.g., for cryptographic processing, and / or arithmetic processing.

[0100] In hybrid embodiments, functional units are implemented partially in hardware, e.g., as coprocessors, e.g., cryptographic, arithmetic, or networking coprocessors, and partially in software stored and executed on the device.

[0101] Figure 1b schematically shows an example of an embodiment of data system 102. Data system 102 may comprise multiple application devices; shown are application devices 110.1 and 110.2. Data system 102 may comprise multiple scheduling devices; only scheduling device 120 is shown. Data system 102 may comprise multiple model management devices; shown are model management devices 130.1 and 130.2. Data system 102 may comprise multiple data sources; shown are data sources 140.1, 140.2, and 140.3. The devices are connected through a computer network 172, e.g., the Internet.

[0102] For example, application devices 110.1 and 110.2 may request scheduling device 120 for current data from data sources 140.1, 140.2, and / or 140.3. Scheduling device 120 may either request predicted current data, obtained from model management device 130.1 and / or 130.2, or actual current data, obtained from data sources 140.1, 140.2, and / or 140.3.

[0103] In data systems 100 and 102, various computation elements and / or computation states are distributed over a scheduling device and a model management device. Different choices could, however, be made. For example, a generative model could be run directly in the scheduling device, or a scheduling function could be integrated with a model management device. The decay model could be integrated in a scheduling device, a model management device, in both, or run as a separate entity, e.g., in a decay model device.

[0104] Although reference is made to devices, e.g., scheduling device and model management device, this could be implemented as a system, e.g., a distributed system, or as a process in a distributed system, e.g., a cloud computing platform.

[0105] Figure 2a schematically shows an example of an embodiment of a data system 200 for providing current or predicted data from a data source. Data system 200 comprises an application 210, a scheduler 220, and one or more data sources 240. In this embodiment, scheduler 220 comprises a generative model 231 and a decay model 232. Generative model 231 could be implemented external to scheduler 220, e.g., through an external function call, e.g., an API or the like. Decay model 232 could be implemented external to scheduler 220, e.g., through an external function call, e.g., an API or the like.

[0106] Application 210 and scheduler 220 may be implemented, e.g., in a device, e.g., implemented as a process in a computer, e.g., a cloud computing platform. Application 210 and / or scheduler 220 may be implemented as application device 110 and a scheduling device 120, respectively. Typically, the one or more data sources 240 are external to both application 210 and scheduler 220, though this is not strictly necessary. For example, application 210, scheduler 220, and data sources 240 may all or in part be implemented on the same computing platform.

[0107] Scheduler 220 is configured for providing current or predicted data from a data source.

[0108] Data Sources

[0109] Data sources 140 produce data. Data sources 140 can be repeatedly accessed to obtain data. The data that is produced by a data source may be static or mostly static, e.g., stored at data source 140. In that case, requesting the same data again would result in the same data, unless the data was updated in the meantime. For example, the data source may comprise a file server, or a streaming data server, e.g., a video server. A generative model for this kind of data may predict a next element, e.g., a next frame of the file or stream.

[0110] The data that is produced by a data source may be dynamic, e.g., fresh, e.g., depending on external circumstances. For example, this may be the case for a sensor, e.g., configured to sense a physical parameter depending on the physical external world. For example, this may be the case for parameters of a computer network, a telecommunication network, and the like. A generative model for this kind of data may predict a next value of the physical parameter or computer network parameter. A computer network may be a telecommunication network.

[0111] Data sources 140 may comprise a so-called endpoint, e.g., a device within a communication network capable of sending, receiving, and / or processing data.

[0112] The data produced by the data source and generated by the generative model may comprise any one of the following:

a. an image, a video, or a 3D model;

[0113] For example, the data source may comprise a sensor for sensing the image, video, or 3D model, e.g., a camera. For example, the data source may comprise storage for storing the image, video, or 3D model; the data source may be configured to produce a next part of the image, video, or 3D model as requested. The generative model may be configured to generate a predicted next part. This is particularly useful for streaming services such as video streaming services. If the streaming service is temporarily unable to provide requested data, generated predicted data may be used instead temporarily. This may significantly improve the user experience: instead of a stalled stream, the missing parts are replaced with predicted frames, and at worst the user may experience some artifacts due to imperfect prediction. Especially for interruptions of short duration, this may be hardly noticeable to a user.

[0114] Application device 110 requests a video frame from scheduling device 120. Scheduling device 120 evaluates whether to retrieve the frame directly from a data source, e.g., a camera or video storage, or use a predicted frame generated by model management device 130. If the predicted frame meets the required fidelity based on the decay model, the system avoids querying the data source, thereby conserving bandwidth.

b. scheduling data or application data;

[0115] For example, there may be an interest in monitoring applications or other computer processes and / or networks, likely different applications than application 210. For example, the monitored applications may include web servers or the like.

c. communications network data;

In particular, the data source may be comprised in the communications network. For example, the data source may be an element in the communications network, e.g., telecommunications network, computer network, etc., configured to report on the status of the monitored application or process, etc.

d. sensor data, wherein the data source comprises a sensor;

[0116] For example, the sensor may be an external sensor configured to measure some external parameter, e.g., temperature, pressure, etc. The data source may comprise the sensor used to obtain the current data.

e. parameter data;

[0117] For example, parameter data may include metrics or configurations used for optimizing operations in a system, such as threshold values for process control, coefficients in predictive algorithms, or calibration settings for machinery.

[0118] Application device 110 requests parameter data from scheduling device 120 to fine-tune system behavior in real-time. If querying the actual parameter data source is not immediately feasible, scheduling device 120 retrieves predicted parameter data generated by model management device 130. The decay model evaluates whether the predicted data meets accuracy thresholds, ensuring the system operates within acceptable parameters without requiring immediate access to the actual data source.

f. status updates.

[0119] For example, status updates may include operational states of devices, system health metrics, or progress indicators for ongoing processes. These updates may originate from dynamic systems such as manufacturing equipment, cloud computing services, or telecommunication infrastructure.

[0120] Application device 110 queries scheduling device 120 for real-time status updates of monitored devices. If the data source providing the updates is temporarily unavailable, scheduling device 120 utilizes predicted status updates generated by model management device 130. The decay model ensures these predictions are sufficiently reliable to act upon until actual updates are accessible, allowing uninterrupted system monitoring and decision-making.
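The fallback behavior described for a temporarily unavailable data source might, as one hedged example, be sketched as follows. The function `status_with_fallback`, the use of `ConnectionError` to signal unavailability, and the `min_accuracy` threshold are assumptions for this sketch and are not taken from the disclosure.

```python
from typing import Callable, Optional, Tuple


def status_with_fallback(
    read_source: Callable[[], str],          # hypothetical call to the data source
    predict: Callable[[], str],              # hypothetical generative-model prediction
    estimate_accuracy: Callable[[], float],  # hypothetical decay-model estimate
    min_accuracy: float,
) -> Tuple[Optional[str], str]:
    """Return (status, origin), where origin is 'actual', 'predicted' or 'unavailable'."""
    try:
        return read_source(), "actual"
    except ConnectionError:
        # The data source is unavailable: use a prediction only if it is
        # estimated to be sufficiently reliable to act upon.
        if estimate_accuracy() >= min_accuracy:
            return predict(), "predicted"
        return None, "unavailable"
```

Returning the origin alongside the value lets the requesting application distinguish actual updates from predicted ones, should it need to.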

[0121] It is not necessary that the data source comprises the sensor used to produce current data, or that the data source is embedded in the network or process being monitored.

[0122] Applications

[0123] There may be one or more applications 210 that need data from data source(s) 240. Application 210 may be configured to request 211 current data from a data source 240, which may be actual current data, e.g., raw data, or predicted current data, e.g., modeled data. The request is received by scheduler 220. Figure 2a shows a data request from application 210 to scheduler 220. For example, an application may be a data streaming application, e.g., a video streaming application. Other examples are given herein.

[0124] In an embodiment, application 210 comprises a digital twin, e.g., implements a digital twin. Data system 200 may be used for data collection and scheduling for the digital twin.

[0125] Digital twins are synchronized with physical systems through sensing or other forms of data collection from data sources situated near the physical entity or system being twinned.

[0126] An important parameter in maintaining digital twins is the age of information, which refers to the maximum permissible age, or equivalently, the minimum required update frequency, of updates from the physical system needed to ensure acceptable fidelity in the digital twin. The maximum allowable age of information varies depending on the digital twin's use case and may even differ within a single digital twin model.

[0127] For digital twins that receive updates from a large number of data sources, such as sensors transmitting data via a network, stringent age of information requirements may result in significant spikes in bandwidth demands and the associated network processing required to provide these updates. To address this, scheduler 220 may manage sensor updates while adhering to age of information constraints.

[0128] For example, the paper “Age of Information-Aware Scheduling for Timely and Scalable Internet of Things Applications” by Lorenzo Corneo, Christian Rohner, and Per Gunningberg, which is incorporated herein by reference, describes such scheduling systems. However, the systems discussed therein do not incorporate model-generated data into the scheduling process. Nevertheless, existing scheduling algorithms may be combined with generating predicted data and / or using decay modeling data for selecting between scheduling a poll of actual data and using generated predicted data.

[0129] Generative Models

[0130] The generative model 231 may be configured for forward modeling of the data of the data source, e.g., configured to generate predicted data for the data source at a given future time point. The given future time point may be provided by scheduler 220. The given future time point may be fixed, e.g., it may be the current time or a fixed amount of time in the future compared to when the last actual data was obtained that was used to update the generative model.

[0131] For example, if data is obtained from the data source at time t0, which is also used to update the generative model, then at a later time point t1, which is a future time point with respect to time point t0, the generative model may be used to generate predicted current data, which is t1 − t0 time in the future. In an embodiment, the decay model may be queried to obtain an estimate of the quantitative measure of the accuracy of a prediction based at least on the fact that the last update of the generative model was t1 − t0 time ago. Generative model 231 may comprise a generative AI model, which is capable of producing realistic outputs for the data source. Generative AI models are available for various data types, including images, videos, and, more recently, 3D models. In certain implementations, these models generate data that is conditioned on the recent history of a true dataset, enabling them to extrapolate into the future. How far into the future generative model 231 predicts, or can reliably predict, may be limited, e.g., to at most a minute in the future. It may even be severely limited, e.g., at most 5 seconds. Nevertheless, having the ability to relay traffic even for 5 seconds can reduce pressure on the data source.

[0132] Nevertheless, scheduler 220 will repeatedly obtain data from data source 240, since the ability of generative model 231 to predict data will be limited. Figure 2a shows a polling 241 of data source 240 by scheduler 220, e.g., if generative model 231 cannot provide sufficiently accurate data.

[0133] The data obtained from data source 240 is used to maintain generative model 231 for forward modeling of the data of the data source. For example, maintaining the generative model may comprise updating parameters of the generative model to reflect data obtained from the data source.

[0134] Various types of models may be used for the generative model, depending on factors such as the type of data, desired accuracy, and available computational capacity.

[0135] A generative model configured for temporal prediction tasks — such as continuous sensor readings, image or video prediction, application data, or scheduling data — may include recurrent neural networks, such as Recurrent Neural Networks (RNNs), Long Short-Term Memory Networks (LSTMs), or Gated Recurrent Units (GRUs), as well as time-series models like ARIMA or Prophet.

[0136] For predicting image or video data, the generative model may comprise Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), or Transformer-based models, such as Vision Transformers (ViTs). These models are also suitable for 3D data, although Neural Radiance Fields (NeRFs) are a particularly effective option for 3D modeling tasks.

[0137] For network-related data, such as computer or telecommunication network data, the generative model may comprise Graph Neural Networks (GNNs). Other options include Kalman Filters, useful for time-series forecasting in sensor data, and autoencoders, which are effective for anomaly detection and prediction when sensors produce multidimensional data.
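As an illustration of the Kalman filter option, the following is a minimal sketch of a scalar filter under a random-walk assumption. The class name, the noise parameters q and r, and the choice of a random-walk state model are assumptions made for this example and are not taken from the disclosure. The sketch shows that the predicted variance grows with the time since the last update, which is the kind of quantity a decay model could report.

```python
class RandomWalkKalman:
    """Scalar Kalman filter assuming the measured value follows a random walk."""

    def __init__(self, x0: float, p0: float, q: float, r: float):
        self.x = x0  # current state estimate
        self.p = p0  # variance of the state estimate
        self.q = q   # assumed process noise variance per second
        self.r = r   # assumed measurement noise variance

    def predict(self, dt: float):
        """Return (mean, variance) dt seconds after the last update, without changing state."""
        return self.x, self.p + self.q * dt

    def update(self, z: float, dt: float) -> None:
        """Fold in an actual measurement z taken dt seconds after the last update."""
        _, p_prior = self.predict(dt)
        gain = p_prior / (p_prior + self.r)
        self.x = self.x + gain * (z - self.x)
        self.p = (1.0 - gain) * p_prior


kf = RandomWalkKalman(x0=20.0, p0=1.0, q=0.5, r=1.0)
mean_short, var_short = kf.predict(dt=1.0)
mean_long, var_long = kf.predict(dt=5.0)
kf.update(z=22.0, dt=2.0)
```

In this sketch the predicted mean stays constant between updates while the variance increases linearly, so a longer gap since the last actual data yields a less certain prediction.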

[0138] For small datasets, particularly those requiring uncertainty modeling of parameters, Gaussian Processes (GPs) may be used.

[0139] A generative model may model only one particular data source; however, in an embodiment, the generative model models multiple data sources. If the multiple data sources are correlated, even if the correlation is weak, joint modeling of the multiple data sources can significantly improve predictive accuracy and / or reduce the model size, e.g., the number of parameters of the model. For example, multiple sensors may obtain measurements of multiple physical parameters, e.g., multiple temperature sensors, multiple pressure sensors, and so on, where the sensor data is related, e.g., because the sensors are close to each other, or are measuring the same physical process.

[0140] Joint modeling has the advantage that the model can be updated when receiving data from any modeled data source. For example, past sensor readings may be represented as a sequence of data, e.g., including sensor IDs to distinguish between different sensors. A model capable of reading sequence data, e.g., a type of recurrent neural network, or a transformer neural network, may read the sequence and predict any desired sensor, e.g., by including the desired sensor ID in the sequence.
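The sequence representation described above might be sketched as follows. The Reading type, its field names, and the use of None to mark the value to be predicted are hypothetical choices for this example; the disclosure does not prescribe a particular encoding, and the sequence model that would consume the result is omitted.

```python
from typing import List, NamedTuple, Optional


class Reading(NamedTuple):
    sensor_id: str
    time_s: float
    value: Optional[float]  # None marks the entry the model is asked to predict


def build_query_sequence(history: List[Reading], target_sensor: str,
                         target_time_s: float) -> List[Reading]:
    """Order past readings from all sensors by time and append a query entry."""
    ordered = sorted(history, key=lambda reading: reading.time_s)
    return ordered + [Reading(target_sensor, target_time_s, None)]


history = [
    Reading("temp-2", 3.0, 21.4),
    Reading("temp-1", 1.0, 21.0),
    Reading("pressure-1", 2.0, 101.3),
]
sequence = build_query_sequence(history, target_sensor="temp-1", target_time_s=4.0)
```

Because readings from every sensor share one sequence, a new reading from any of them extends the context used to predict the others.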

[0141] Decay Models

[0142] Data system 200 may comprise a decay model 232. The decay model is configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model.

[0143] When generating forward-looking data, quantitative predictions may be made regarding the degradation of realism in the generated content as a function of time. An example of a decay model and a corresponding generative model is demonstrated, for example, in the paper “Video Prediction by Efficient Transformers” by Xi Ye and Guillaume-Alexandre Bilodeau, which is incorporated herein by reference for its description of a frame predictor. In this context, realism refers to the degree of accuracy with which the AI-generated content corresponds to the true, ground-truth content or event.

[0144] When new actual data of a data source is available, it can be used to update the generative model but also to update the decay model. For example, when new actual data is available it can be compared to predicted current data that was predicted for the same time point. In this way, a pair of values is obtained: an age of the last update of the generative model, e.g., how old the data was from which the model was working, and an accuracy value. Such a tuple may be used to update the decay model.
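The update with (age, accuracy) pairs described above could be sketched as a simple table of mean observed error per age bin. The class name, the binning approach, and the use of an error value (lower is better) as the accuracy measure are assumptions for this example; a learned decay model could be used instead.

```python
from collections import defaultdict
from typing import Optional


class BinnedDecayModel:
    """Keeps the mean observed prediction error for each age bin."""

    def __init__(self, bin_width_s: float = 1.0):
        self.bin_width_s = bin_width_s
        self._error_sum = defaultdict(float)  # bin index -> summed error
        self._count = defaultdict(int)        # bin index -> number of samples

    def _bin(self, age_s: float) -> int:
        return int(age_s // self.bin_width_s)

    def update(self, age_s: float, error: float) -> None:
        """Record one pair: age of the generative model's last update, and the
        error found when comparing its prediction with new actual data."""
        index = self._bin(age_s)
        self._error_sum[index] += error
        self._count[index] += 1

    def estimate(self, age_s: float) -> Optional[float]:
        """Return the mean error observed at this age, or None if unseen."""
        index = self._bin(age_s)
        count = self._count.get(index, 0)
        if count == 0:
            return None
        return self._error_sum[index] / count


decay = BinnedDecayModel(bin_width_s=1.0)
decay.update(age_s=0.5, error=0.1)
decay.update(age_s=0.7, error=0.3)
decay.update(age_s=4.2, error=2.0)
```

Returning None for unseen ages leaves it to the caller to decide how to proceed, e.g., by obtaining actual data until enough pairs have been collected.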

[0145] The decay model will typically use as input how long ago the last update or updates occurred, e.g., the last k updates, where k is a positive integer, e.g., 4 or more. The decay model may use additional inputs. For example, the decay model may receive as input the last data or last multiple data obtained from the data source. This is helpful since the data may indicate that the data source is currently in a state of flux, e.g., in turbulence, so that estimates of accuracy need to be lowered.

[0146] Video prediction example

[0147] An embodiment of the generative model comprises a system configured to predict future data from a data source, where the predicted data may include, e.g., video frames. The generative model may be capable of predicting, for example, 10 future frames given 2 past frames, or 10 future frames given 10 past frames.

[0148] The generative model may utilize neural network architectures for video frame prediction, e.g., Transformer-based models incorporating attention mechanisms. These models may predict a number of future frames (N) based on a set number of past frames (L). The neural network may include components such as Transformer attention blocks, which may feature local spatial multi-head self-attention and temporal multi-head self-attention, as well as convolutional feed-forward networks.

[0149] Variants of the Transformer-based generative model may include: a fully autoregressive model that predicts one frame at a time based on a sequence of previous frames; a partially autoregressive model combining a Transformer encoder for past frames with an autoregressive decoder for future frames; and a non-autoregressive model that may predict multiple frames simultaneously using learned frame queries.

[0150] Training and usage of the generative model may involve specific techniques. For example, the attention mechanism may include spatial attention applied to localized spatial patches to reduce complexity and temporal attention to model dependencies between video frames. The model’s architecture may incorporate an autoencoder, where the encoder extracts latent features from video frames, and the decoder reconstructs video frames from these latent features.

[0151] The training process may proceed in stages. In a first stage, the encoder and decoder may be trained as a standalone autoencoder. In a second stage, the autoencoder may be frozen, and the Transformer models may be trained using specialized loss functions. During inference, non-autoregressive models may predict multiple future frames in parallel, offering speed advantages and avoiding cumulative errors. In contrast, autoregressive models may predict frames sequentially over latent features or reconstructed pixels but could accumulate errors.

[0152] The implementation of the generative model may prioritize efficiency, employing layer normalization, multi-head self-attention, and depth-wise convolution. Such optimizations may result in significantly faster training and inference compared to alternative models commonly used in the art, e.g., convolutional LSTM-based architectures.

[0153] The parameters of the generative model may be updated continuously as new video frame data becomes available from the data source, allowing the model to adapt to evolving patterns in the video stream. This maintenance process may involve retraining specific components or fine-tuning the entire model architecture based on the newly acquired data.

[0154] An embodiment of the generative model may further include or be associated with a decay model configured to estimate a quantitative measure of the accuracy of the predicted video frames produced by the generative model. The decay model may analyze the temporal and spatial coherence of the predicted frames relative to expected patterns learned during training. For example, the decay model may calculate a frame-level error metric, such as mean squared error between latent representations of predicted frames and reference frames, or analyze the degradation of structural similarity between consecutive frames over time.

[0155] The decay model may be implemented as an auxiliary neural network or as a statistical estimator integrated with the generative model's architecture. It may leverage features extracted by the encoder to compare predicted frames against prior predictions or feedback from observed data. Temporal attention mechanisms within the decay model may assess long-term dependencies, identifying deviations in the continuity of predicted motion or objects across frames. The decay model's accuracy estimates may also depend on factors such as pixel-wise differences between predicted frames and observed frames once they become available, or metrics like temporal consistency and blur or distortion artifacts that increase with prediction distance.

[0156] The quantitative accuracy estimates produced by the decay model may be used to assign confidence scores to predictions, trigger selective retraining of specific components or fine-tuning of the entire generative model architecture, or adjust the weight given to different predicted frames in downstream applications. For example, when the decay model indicates that prediction reliability is falling below a predefined threshold, the generative model may prioritize updating parameters for specific types of video sequences or adjust its prediction strategy (e.g., shifting between autoregressive and non-autoregressive modes) to improve accuracy. This allows the system to dynamically adapt to changing input patterns and improve reliability over extended sequences while optimizing its performance in practical applications.

[0157] Figure 3 schematically shows an example of a graph 300 of model accuracy versus time. Shown in Figure 3 is a time axis 311. Further to the right on axis 311 corresponds with predictions further into the future, e.g., predicting data of a data source based on actual data that lies further in the past. Shown in Figure 3 is an accuracy axis 312. Graph 300 shows that as predictions need to be made further into the future, the accuracy decreases. Accuracy may be measured in various well-known units, e.g., squared error or the like.

[0158] The accuracy measure may quantify the overlap between predicted and actual data. For example, the accuracy measure may indicate the percentage of the prediction that was correct. For example, the accuracy measure may comprise a mean squared error, peak signal to noise ratio, etc. For numeric data, e.g., sensor values and the like, the accuracy measure may comprise the absolute error, or the relative error. Generally speaking, the accuracy measure may be a function of a difference between the predicted and actual data, and possibly the type of data.
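The measures named above can be computed as in the following minimal sketch. The function names and the default peak value of 255 (typical for 8-bit image data) are choices made for this example only.

```python
import math
from typing import Sequence


def mean_squared_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean of the squared element-wise differences."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def peak_signal_to_noise_ratio(predicted: Sequence[float], actual: Sequence[float],
                               max_value: float = 255.0) -> float:
    """PSNR in decibels; infinite when prediction and actual data are identical."""
    mse = mean_squared_error(predicted, actual)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


def absolute_error(predicted: float, actual: float) -> float:
    return abs(predicted - actual)


def relative_error(predicted: float, actual: float) -> float:
    """Absolute error divided by the magnitude of the actual value."""
    if actual == 0.0:
        raise ValueError("relative error is undefined for an actual value of zero")
    return abs(predicted - actual) / abs(actual)


mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
rel = relative_error(predicted=21.0, actual=20.0)
```

Mean squared error and the absolute and relative errors are lower for better predictions, whereas PSNR is higher for better predictions, so a threshold comparison needs to account for the direction of the chosen measure.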

[0159] The schematic graph shows that a prediction for a moment in time shortly after receiving the last actual data, e.g., shortly after the last update of the generative model, has a high accuracy. In fact the accuracy of predictive data is typically higher than that of cached data.

[0160] Returning to Figure 2a.

[0161] In Figure 2a, decay model 232 is external to generative model 231. This is not necessary though. Decay model 232 may be combined with generative model 231. For example, generative model 231 may be configured to produce two outputs: predicted current data for a data source, and a quantitative measure of the accuracy of a prediction. For example, this may be used by querying the model and discarding the predicted current data if the accuracy is not good enough, but using it, e.g., returning it to a caller, if the accuracy is good enough.

Handling Data Requests

[0162] Scheduler 220 receives from application 210 a request for current data from the data source. In response, scheduler 220 may obtain from decay model 232 an estimate of the quantitative measure of the accuracy of a prediction by generative model 231 of the current data of the data source. Based on the estimate, scheduler 220 selects to obtain actual current data from the data source, or to obtain predicted current data from the generative model. In some cases, the scheduler 220 may always obtain predicted current data but decide on whether to also obtain actual current data from the data source based on the estimate. Obtaining actual current data from a data source may comprise polling the data source for the data; it may comprise waiting for the data source to push the data. Optionally, scheduler 220 may modify a push schedule of the data source, e.g., to better fit current data requirements.

[0163] Depending on the selection, scheduler 220 returns the actual current data or predicted current data. Figure 2a shows returning of data 213. The actual current data is also referred to as raw data. The predicted current data is also referred to as modeled data.

[0164] For example, it may be determined whether the estimate obtained from decay model 232 meets a minimum accuracy requirement — if so, the predicted current data is used, if not actual current data is obtained from data source 240.

[0165] The minimum accuracy requirement may be fixed or may be determined by scheduler 220. For example, the minimum accuracy requirement may be set by scheduler 220 based on load, e.g., bandwidth used, computation load, e.g., load of scheduler 220 and / or of data source 240. If load is low, scheduler 220 may set the minimum accuracy requirement to more accurate, while if load is high, the minimum accuracy requirement may be reduced to less accurate.
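The selection and the load-dependent requirement described above might look as follows. This is a sketch only: it assumes an accuracy score between 0 and 1 where higher is better, a load normalized to the range 0 to 1, and a linear interpolation between two hypothetical limits, none of which is specified by the disclosure.

```python
def minimum_accuracy_for_load(load: float, strict: float = 0.95,
                              relaxed: float = 0.80) -> float:
    """Interpolate from the strict requirement (no load) to the relaxed one (full load)."""
    load = min(max(load, 0.0), 1.0)  # clamp to the assumed range
    return strict + (relaxed - strict) * load


def select_source(estimated_accuracy: float, minimum_accuracy: float) -> str:
    """Use predicted data when the decay model's estimate meets the requirement."""
    return "predicted" if estimated_accuracy >= minimum_accuracy else "actual"


# The same estimate leads to different selections under low and high load.
choice_low_load = select_source(0.90, minimum_accuracy_for_load(0.0))
choice_high_load = select_source(0.90, minimum_accuracy_for_load(1.0))
```

With these example values, an estimate of 0.90 is not sufficient when load is low, so actual data is obtained, but it is sufficient when load is high, so predicted data is used.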

[0166] The request of application 210 may contain the minimum accuracy requirement. For example, request 211 may comprise one or more of the following: a maximum age for the last data obtained from the data source and used to maintain the generative model, a minimum accuracy of the decay model, etc. If the request of application 210 contains a minimum accuracy requirement, then this may be taken as a condition for returning predicted current data.

[0167] Scheduler 220 may query application 210 for the required accuracy. Figure 2a shows such an optional accuracy query 212.

[0168] In an embodiment, multiple data sources are used with a generative model that jointly models the multiple data sources. Likewise, a decay model may be used that jointly models accuracy for each of the multiple data sources. In an embodiment, a joint model is used for both, but note that it is not necessary to use a joint model for generation if a joint model is used for decay modeling, or vice versa.

[0169] For example, at time t0, application 210 may be provided with current data for data source 240: either actual current data, e.g., data actually obtained from data source 240, or predicted current data, e.g., data which is predicted to be present at data source 240, e.g., present at time t0. Later, at time t1, scheduler 220 receives from application 210 a request for current data from data source 240. At this point, scheduler 220 may decide to retrieve actual current data from data source 240, though this does not necessarily have to happen at time t1, but may happen at a later time t2. In an embodiment, current data that is returned to application 210 is combined with a timestamp reflecting the moment the returned data was present at data source 240, or was predicted to be present.

[0170] Furthermore, the request received at time t1 may contain a request for data current at a particular point in time treq in the future, e.g., 1 second from now. In this case, scheduler 220 may retrieve the data from the data source at treq, or possibly at a later time. Interestingly, using the generative model, predicted current data may immediately be generated, for a time treq − t0, provided the decay model considers the accuracy of such a prediction sufficient.

[0171] The request may be a request for current data from one or more of the multiple data sources. If scheduler 220 decides to obtain actual data, e.g., due to low accuracy, then one option would be to obtain it from the requested one or more of the multiple data sources. Interestingly, in this situation the scheduler has the option to retrieve data from a different data source or sources, e.g., one that is not requested, and / or from fewer data sources than were requested. The retrieved data is used to update the generative model, which in turn is used to generate a prediction for the data source(s) for which data was not retrieved — possibly also checking the decay model for its accuracy. This allows for reducing bandwidth even when data needs to be obtained. For example, scheduler 220 may select the data source from which to obtain current data.

[0172] Yet another option is for the scheduler to obtain current data, e.g., in case of lacking accuracy, but at a lower resolution or sampling rate than requested. A generative model may then be used to upscale the lower resolution or lower sampling rate to the requested resolution or requested sampling rate. The generative model and / or decay model may also be updated using the lower resolution data. Considering factors other than timeliness, e.g., sample rate or resolution, gives the scheduler more control; for example, to reduce bandwidth by reducing the sampling resolution but still maintain desired realism. For example, the generative model may take as input, or as a further input or as an optional input, data at a lower resolution and / or sampling rate, in which case the generative model may upscale data from the data source from a lower to a higher resolution and / or sampling rate. Even with lower resolution or lower sampling rate data as an input, the generative model may still predict data for a different time point. Also the lower resolution or lower sampling rate data may be used for updating the model.

[0173] For example, the generative model may not be configured for forward modelling, but may be configured for upscaling data obtained from the data source. In an embodiment, the scheduler always retrieves data from the data source to satisfy a request, but decides based on the decay model at which resolution and / or sampling rate. For example, the decay model may receive as input the resolution and / or sampling rate of the data that will be obtained from the data source, and use it to determine expected accuracy.
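To make the upscaling of a lower sampling rate concrete, the sketch below uses plain linear interpolation as a stand-in. Linear interpolation is not a generative model and is not the method of the disclosure; it only illustrates the input and output shapes such an upscaling step would have. The function name and the integer factor are assumptions.

```python
from typing import List, Sequence


def upsample_linear(samples: Sequence[float], factor: int) -> List[float]:
    """Raise the sampling rate by an integer factor using linear interpolation."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if len(samples) < 2:
        return list(samples)
    result: List[float] = []
    for left, right in zip(samples, samples[1:]):
        # Emit the left sample and factor - 1 evenly spaced points before the right one.
        for step in range(factor):
            result.append(left + (right - left) * step / factor)
    result.append(samples[-1])
    return result


low_rate = [0.0, 2.0, 4.0]           # data obtained at the reduced sampling rate
high_rate = upsample_linear(low_rate, factor=2)
```

A generative model in the same position could add plausible detail between samples instead of straight-line values, with the decay model estimating how accurate the result is for a given input rate.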

[0174] 3GPP Data Scheduling and Sharing

[0175] In the 3GPP standards, the Application Data Analytics Enablement (ADAE) framework enables the collection of data from sources including the core network, with delivery via various APIs. See, for example, any of the following documents, which are incorporated herein by reference:

[0176] 1. 3GPP TS 23.482 V0.3.0 (2024-10). "Functional architecture and information flows for AIML Enablement Service; (Release 19)." Technical Specification Group Services and System Aspects, 3rd Generation Partnership Project (3GPP), October 2024.

[0177] 2. 3GPP TR 23.700-36 V18.1.0 (2022-12). "Study on Application Data Analytics Enablement Service; (Release 18)." Technical Report, Technical Specification Group Services and System Aspects, 3rd Generation Partnership Project (3GPP), December 2022.

[0178] These frameworks encompass data types such as scheduling data, application data, and others. In scenarios where generative models can be applied to these data types, the framework provides valid use cases for the present disclosure.

[0179] Scheduling

[0180] In data system 200, current data is repeatedly obtained from data source 240, e.g., by scheduler 220. For example, scheduler 220 may repeatedly poll data source 240 for new current data. Data source 240 may optionally also push new current data to scheduler 220. In particular, polling of new current data may be done based on an estimate of the quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source. For example, if the estimate is below a threshold, e.g., below a required accuracy, e.g., as requested by application 210, then scheduler 220 may decide to obtain actual current data from data source 240 instead of relying on predicted data. Note, in some cases predicted data may nevertheless be generated, but it may then not be returned to application 210.

[0181] Should scheduler 220 decide to poll data source 240, this does not have to be done immediately. In an embodiment, obtaining actual current data from the data source is scheduled for a future time. This may mean keeping application 210 waiting for its requested data as well, and returning the actual current data from the data source after it has been obtained at the scheduled future time.

[0182] The future time could be determined in a variety of ways. For example, the future time may be a fixed duration starting with the receipt of the request. For example, the future time may be a fixed duration starting with the last time actual current data was obtained.

[0183] In an embodiment, after deciding to obtain actual data, the scheduler may start a waiting interval. During the waiting interval, the scheduler waits for further requests for actual current data, typically from other applications or even other devices. Once a sufficient number of further requests has been received, which number may be 1, the scheduler proceeds with immediately obtaining the actual data. If the waiting interval elapses without the number of further requests being reached, the scheduler may immediately proceed with obtaining the actual data anyway.

[0184] The two approaches may be combined, e.g., waiting for a number of further requests, but also imposing a minimum fixed duration between subsequent polls.
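The combination of a waiting interval with a minimum duration between polls could be sketched as a function that computes the poll time. The function name, its parameters, and the representation of times as seconds are assumptions for this example.

```python
from typing import Optional, Sequence


def poll_time(decision_time: float, further_request_times: Sequence[float],
              required_requests: int, max_wait: float,
              last_poll_time: Optional[float] = None,
              min_spacing: float = 0.0) -> float:
    """Return the time at which the data source would be polled."""
    if required_requests < 1:
        raise ValueError("required_requests must be at least 1")
    deadline = decision_time + max_wait
    # Only requests arriving inside the waiting interval count.
    arrivals = sorted(t for t in further_request_times
                      if decision_time <= t <= deadline)
    if len(arrivals) >= required_requests:
        candidate = arrivals[required_requests - 1]  # enough requests collected
    else:
        candidate = deadline                         # interval elapsed, poll anyway
    if last_poll_time is not None:
        # Enforce the minimum duration between subsequent polls.
        candidate = max(candidate, last_poll_time + min_spacing)
    return candidate


enough = poll_time(10.0, [10.5, 11.0, 12.5], required_requests=2, max_wait=2.0)
timed_out = poll_time(10.0, [10.5], required_requests=2, max_wait=2.0)
spaced = poll_time(10.0, [10.5, 11.0], required_requests=2, max_wait=2.0,
                   last_poll_time=9.0, min_spacing=5.0)
```

In the first call the second further request arrives at 11.0 and triggers the poll; in the second the interval runs out at 12.0; in the third the minimum spacing pushes the poll to 14.0.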

[0185] In an embodiment, scheduler 220 caches the most recent data from a data source together with a timestamp. After receiving a request for the data source, scheduler 220 determines the age of the cached data. If the age is less than a first threshold, scheduler 220 returns the cached data. If the age is above the first threshold, scheduler 220 obtains an estimate of the accuracy of generated data and uses generated data if the accuracy is high enough, otherwise scheduling a poll of the data source.

[0186] In an embodiment, scheduler 220 caches the most recent data from a data source together with a timestamp. After receiving a request for the data source, scheduler 220 determines the age of the cached data. If the age is less than a first threshold, scheduler 220 returns the cached data. If the age is above the first threshold and less than a second threshold, scheduler 220 obtains generated data and returns it to application 210. If the age is above the second threshold, scheduler 220 obtains an estimate of the accuracy of generated data and uses generated data if the accuracy is high enough, otherwise scheduling a poll of the data source.
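The threshold logic of the two cache embodiments above can be sketched in one function. The function name, the returned labels, and the accuracy score where higher is better are assumptions for this example; the decay model is passed in as a callable so that it is only queried when needed.

```python
from typing import Callable


def handle_cached_request(cache_age_s: float, first_threshold_s: float,
                          second_threshold_s: float,
                          estimate_accuracy: Callable[[float], float],
                          minimum_accuracy: float) -> str:
    """Decide how to serve a request given the age of the cached data."""
    if cache_age_s < first_threshold_s:
        return "cached"             # cached data is still fresh enough
    if cache_age_s < second_threshold_s:
        return "generated"          # generated data is used without a decay check
    if estimate_accuracy(cache_age_s) >= minimum_accuracy:
        return "generated"          # decay model considers the prediction sufficient
    return "schedule_poll"          # otherwise obtain actual data


def toy_decay(age_s: float) -> float:
    """Hypothetical decay model: accuracy drops by 0.1 per second of age."""
    return max(0.0, 1.0 - 0.1 * age_s)


decisions = [handle_cached_request(age, 1.0, 3.0, toy_decay, 0.6)
             for age in (0.5, 2.0, 3.5, 5.0)]
```

Setting the second threshold equal to the first reduces this to the single-threshold embodiment, in which the decay model is consulted for every request older than the first threshold.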

[0187] When receiving or requesting updates across a network (e.g., sensor updates, parameter updates, or status updates via APIs), it is advantageous to minimize the frequency of polling the data source where the data resides or is measured. This optimization reduces network bandwidth consumption and minimizes the load on the sensor.

[0188] In many use cases, particularly those supporting real-time digital twins, it is also beneficial to avoid situations where multiple simultaneous or near-simultaneous updates are requested. Such situations can lead to repeated polling of data sources in quick succession or the unnecessary polling of multiple sensors measuring the same underlying event, resulting in so-called signal storms. For example, if multiple sensors capture data related to a single event of interest to the digital twin system, inefficiencies may arise.

[0189] To address these issues a scheduling mechanism may be used to schedule sensor updates and similar processes.

[0190] Scheduling of updates, such as those from sensors, may be managed by a generative digital twin. In this approach, sensor data may be modeled alongside its expected decay time, which represents the loss of realism or accuracy as a function of time. This decay is compared with a minimum realism requirement specified by a given application. A scheduler may be configured to attempt to provide data that meets both realism and timeliness requirements while minimizing polling the data source. This may be achieved by leveraging the generative digital twin wherever feasible. The behavior of the scheduler can optionally be modified by a policy tailored to specific downstream applications.

[0191] Using this method, the age of information for downstream digital twins may be improved while reducing network overhead. This reduction is accomplished by minimizing the number of times a sensor is polled or by controlling peak bandwidth usage.

[0192] Note that using a scheduler may be avoided, e.g., by combining the relevant feature with the generative model. In this approach, a request for data would directly go to the generative model. The generative model in turn decides when and / or how often to poll the data source in order to maintain a minimum accuracy as indicated by the decay model. In this approach the decay model may be external to the generative model or may be integrated with it.

[0193] Scheduler 220 may be implemented as a scheduling application and / or a scheduling system.

[0194] Figure 2b schematically shows an example of an embodiment of a data system 201 for providing current or predicted data from a data source. In data system 201, both generative model 231 and decay model 232 have been installed in a model manager 230. It is also possible to install only one of the generative model 231 and decay model 232 in the model manager 230, e.g., only generative model 231 or only decay model 232.

[0195] Model manager 230 may be implemented as a model manager application and / or a model manager system.

[0196] For example, if scheduler 220 has obtained data from data source 240, e.g., as a result of pull or push, that is actual current data, then scheduler 220 may provide 221 the actual current data to model manager 230. Model manager 230 in turn will use the data to update generative model 231 and / or decay model 232. Scheduler 220 may still schedule polling of data source 240.

[0197] If scheduler 220 needs predicted current data, the scheduler 220 may retrieve 222 predicted current data, e.g., through an API. If scheduler 220 needs predicted decay data, e.g., an estimate of quantitative measure of the accuracy of the predicted data, then this may be retrieved 223 by scheduler 220, e.g., through an API.

[0198] Data systems 201, and 202 may be implemented in the context of the Application Data Analytics Enablement (ADAE) framework. For example, an Application Data Analytics Enablement (ADAE) server may comprise scheduler 220. For example, an ADAE Client may comprise application 210.

[0199] Figure 4 schematically shows an example of an embodiment of a method 400 for providing current or predicted data from a data source. Method 400 may be computer implemented. Method 400 comprises: repeatedly obtaining (410) data from a data source; providing (420) the obtained data for maintaining a generative model for modeling of the data of the data source, the generative model being configured to generate predicted data for the data source at a given time point, and for maintaining a decay model, the decay model being configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model; receiving (430) a request for current data from the data source, and in response obtaining (440) from the decay model an estimate of the quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source, based on the estimate selecting (450) to obtain actual current data from the data source, or to obtain predicted current data from the generative model, and returning (460) actual current data or predicted current data.

[0200] In a further example of an embodiment of the method, the method comprises

[0201] - repeatedly obtaining (410) data from a data source,

[0202] - providing the obtained data for maintaining a generative model for forward modeling of the data of the data source, the generative model being configured to generate predicted data for the data source at a given future time point, and for maintaining a decay model, the decay model being configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model,

[0203] - receiving (430) a request for current data from the data source, and in response obtaining (440) from the decay model an estimate of the quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source, based on the estimate selecting (450) to obtain actual current data from the data source, or to obtain predicted current data from the generative model, returning (460) actual current data or predicted current data.
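The selection logic of steps 440 to 460 can be illustrated with a minimal Python sketch. It is not taken from the disclosure: the function name `provide_current_data`, the three callables, and the convention that accuracy is a number where higher means more accurate are all assumptions made for illustration.

```python
from typing import Callable, Tuple


def provide_current_data(
    now_s: float,
    last_update_s: float,
    min_accuracy: float,
    estimate_accuracy: Callable[[float], float],  # hypothetical decay model: age (s) -> accuracy
    predict_current: Callable[[float], float],    # hypothetical generative model: time (s) -> value
    poll_source: Callable[[], float],             # hypothetical call to the real data source
) -> Tuple[float, str]:
    """Return (data, origin), where origin is 'predicted' or 'actual'."""
    # Step 440: obtain from the decay model an estimate of how accurate
    # a prediction of the current data would be.
    estimate = estimate_accuracy(now_s - last_update_s)
    # Step 450: select predicted data only if the estimate is good enough.
    if estimate >= min_accuracy:
        # Step 460: return predicted current data without contacting the source.
        return predict_current(now_s), "predicted"
    # Step 460: otherwise obtain and return actual current data.
    return poll_source(), "actual"
```

In this sketch the data source is contacted only when the estimate falls below the assumed threshold `min_accuracy`, which is the resource saving the method aims at.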

[0204] Below, several further optional refinements, details, and embodiments are illustrated.

[0205] A system is described for making scheduling decisions, such as requesting updates from sensors in a network. This is achieved by calculating the expected decay of realism over time for a simulation of the requested data, such as sensor data. Simulated data, e.g., predicted current data from the generative model, may be provided when the calculated realism satisfies the requirements of an end application. Additionally, the system may consider other factors than scheduling. These factors may include selecting endpoints or determining the characteristics of the data to retrieve, such as resolution. The system may be configured to produce modeled data that meets the desired accuracy, and thus further increase performance.

[0206] The system may comprise several components working together to manage data collection, modeling, and distribution.

[0207] Endpoints represent locations where raw data is generated. These can include sensors generating sensing data, cameras capturing video, or logical processes generating network-related data. Endpoints communicate externally through a communications network, such as a 5G network. In the context of ADAE, as discussed in TR 23.700-36, endpoints may correspond to data sources, which could include 5GS data sources (e.g., 5GC, OAM), enablement layer data sources (e.g., SEAL, EEL), or external data sources located on the DN side, such as VAL servers or VAL UEs.

[0208] Downstream applications consume data provided by the endpoints. An example of such an application is a digital twin service that uses sensor data to create a real-time model of a physical asset. These applications communicate with endpoints and other system components through the communications network. In ADAE contexts, a downstream application may comprise an ADAE client.

[0209] One or more generative models are used within the system to generate simulated data that approximates what would be measured or held at a target endpoint. Examples include models capable of producing simulated video, 3D data, or other data types.

[0210] One or more decay models are configured to estimate the accuracy degradation of the simulated outputs produced by the generative models as a function of time. These decay models may be integrated within the generative models, where the generative model itself provides decay statistics for each output. Alternatively, decay models may operate as standalone processes that take inputs from the generative models to produce accuracy decay estimates.
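As an illustration only, a standalone decay model could be sketched as follows. The disclosure does not prescribe a functional form; the exponential shape, the class name `ExponentialDecayModel`, its parameters, and the smoothing-based update rule are assumptions chosen to keep the example short.

```python
import math
from dataclasses import dataclass


@dataclass
class ExponentialDecayModel:
    """Assumed form: accuracy(age) = initial_accuracy * exp(-age / time_constant_s)."""

    initial_accuracy: float = 1.0  # assumed accuracy right after a raw data update
    time_constant_s: float = 60.0  # assumed; in practice fitted from observed errors

    def accuracy_at(self, age_s: float) -> float:
        """Estimated accuracy of a prediction made age_s seconds after the last raw update."""
        return self.initial_accuracy * math.exp(-max(age_s, 0.0) / self.time_constant_s)

    def update(self, age_s: float, observed_accuracy: float, rate: float = 0.2) -> None:
        """Nudge the time constant toward the value implied by one observed accuracy.

        observed_accuracy is the measured accuracy of a prediction compared with
        actual data that arrived age_s seconds after the previous raw update.
        """
        if age_s <= 0.0 or not (0.0 < observed_accuracy < self.initial_accuracy):
            return  # observation carries no usable information under this form
        implied = -age_s / math.log(observed_accuracy / self.initial_accuracy)
        self.time_constant_s += rate * (implied - self.time_constant_s)
```

A decay model integrated into the generative model would expose the same kind of estimate, only produced alongside each generated output.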

[0211] A model management system is responsible for managing generative and decay models by providing inputs (e.g., data and control signals) and processing their outputs. This system may be implemented as a dedicated server, such as an edge server that directly runs the models, or as a logical process within a network that connects to remotely hosted models.

[0212] A scheduler is configured to schedule the timing of updates from endpoints and may also control additional update parameters, such as datatype, sample size, or sampling fidelity. The scheduler can be implemented as software hosted on an edge server, within a core network, or accessible through other networked configurations. It may communicate with all system components, including endpoints and downstream applications, to ensure efficient data collection and distribution. In ADAE implementations, the scheduler may form part of or be hosted on an ADAE server.

[0213] Below, further embodiments of a method are provided. The method has two parts: an ongoing background model update process, and a main process. In the updating process the Model Management System may obtain, in an ongoing manner, any available updated data from the Endpoints, without necessarily polling them directly. For example, each time the Endpoints provide raw updated data to the Scheduler for forwarding to the Downstream Application, this data may be stored by the Model Management System. The Model Management System uses this data to keep its Generative and Decay Models updated.

[0214] The Model Management System may be configured to run its Generative and Decay Models continuously in the background, thereby minimizing response time if a request for modeled data is received. Alternatively, the Model Management System may store the most recently received raw data update and execute the models on-demand, which may reduce resource consumption at the expense of increased response time.

[0215] If the updates received in this passive manner are not sufficient, the Model Management System may request new raw data from the Endpoints via the Scheduler. For example, the Model Management System may act as a Downstream Application, but specify that only raw data is acceptable.

[0216] If the Scheduler receives a request for raw data from the Model Management System for an Endpoint for which it has had no other requests for a time period, and has no predicted upcoming requests, it may inform the Model Management System that the number of requests was low, and that serving the present request may generate a new, unique poll of the Endpoint (which may use unnecessary resources at the Endpoint).

[0217] Using this information, the Model Management System can decide whether to go ahead with the request (in which case it informs the Scheduler of this decision) or to wait for a time period (which might be a pre-set period, or wait until the next time a polling request is generated by a true Downstream Application).

[0218] For example, the Model Management System might decide to wait for data coming from Endpoints which have limited resources, to avoid creating undue burden on them. On the other hand, it might decide to proceed immediately in the case of Endpoints whose data is more challenging to model, to avoid excessive lag caused by updating the Generative Model if several new requests from Downstream Applications should come in (e.g., keeping the model up to date may be considered worth the additional burden for those types of Endpoints). This is to avoid the updates to the Model Management System creating an artificial burden on the Endpoints, especially those which are not frequently used.
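The proceed-or-wait decision described in the preceding paragraphs can be sketched as a small policy function. The names `EndpointProfile`, `Decision`, and `decide_raw_request`, the tie-break flag, and the default of waiting when neither property applies are assumptions for illustration; the disclosure leaves these choices to an application-specific policy.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PROCEED = "proceed"
    WAIT = "wait"


@dataclass
class EndpointProfile:
    limited_resources: bool  # polling this Endpoint is costly for it
    hard_to_model: bool      # its data is challenging for the Generative Model


def decide_raw_request(
    profile: EndpointProfile,
    would_trigger_unique_poll: bool,
    prefer_model_freshness: bool = False,  # assumed policy knob for conflicting properties
) -> Decision:
    """Decide whether the Model Management System goes ahead with a raw data request."""
    # If the Scheduler polls the Endpoint anyway, the request adds no burden.
    if not would_trigger_unique_poll:
        return Decision.PROCEED
    # Hard-to-model Endpoints justify the extra poll, unless the Endpoint is
    # also resource-limited and the policy does not favor model freshness.
    if profile.hard_to_model and (not profile.limited_resources or prefer_model_freshness):
        return Decision.PROCEED
    # Assumed default: wait, to avoid an artificial burden on the Endpoint.
    return Decision.WAIT
```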

[0219] The time period and types of difficult to model Endpoints above will be application-specific, and permissions related to this behavior by the Model Management System might be controlled by a policy.

[0220] The Model Management System preferably keeps under review the accuracy of its models (e.g., it examines the output of the Decay Model), and may attempt to match the frequency of the updates it receives and / or requests from the Scheduler to the current accuracy of the models.

[0221] For example, if the Scheduler provides raw data updates to the Model Management System which were not required by it, e.g., because the Generative Model was already very accurate without these updates, the Model Management System may choose to inform the Scheduler to reduce the frequency of raw data updates. The Scheduler can then save networking resources by not transmitting these updates to the Model Management System.

[0222] The Main Process may be as follows. The Scheduler receives a request for an update from a Downstream Application, which would normally require that the Scheduler return data from at least one Endpoint. This could be, for example, generated by an edge server which schedules updates from a sensor network; or it could be a request made for network-specific data, e.g., via an API call.

[0223] The Scheduler requires, for each request, a list of relevant Endpoints, and information about the required timeliness and accuracy of the returned data. The Scheduler has several options for how to acquire the list of Endpoints and timeliness data, e.g., as below.

[0224] 1. List of Endpoints:
   a. The Downstream Application may provide the list of Endpoints, e.g., request data from specific sensors.
   b. The Downstream Application may request a particular type of data, and the Scheduler may choose from a list of possible Endpoints the most suitable ones to poll.
      i. In this case, the choice of Endpoints may be affected by the output from the Generative Model (see later steps). In that case, the method can iterate to come to a preferred group of Endpoints.

[0225] 2. Required timeliness and accuracy:
   a. The Scheduler may request from the Downstream Applications the required timeliness and accuracy directly, possibly in a standardized format.
   b. The Scheduler might request a desired accuracy from the Downstream Application, and use the Decay Model to calculate the required timeliness.
   c. The Scheduler might store the required timeliness and / or accuracy in a list to be applied to a specific Downstream Application, or a broad class of Downstream Applications.
   d. Alternatively, the Scheduler might apply a minimum timeliness value according to a policy, even if that is more stringent than what might otherwise be achievable based on the Decay Model.
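Options 2.b and 2.d above can be illustrated by inverting a decay curve to obtain the maximum tolerable age of the last raw update. The sketch assumes, purely for illustration, an exponential decay accuracy(age) = initial_accuracy * exp(-age / time_constant_s); the function name and all parameters are hypothetical.

```python
import math
from typing import Optional


def max_age_for_accuracy(
    required_accuracy: float,
    initial_accuracy: float = 1.0,             # assumed accuracy right after a raw update
    time_constant_s: float = 60.0,             # assumed decay time constant
    policy_max_age_s: Optional[float] = None,  # optional, possibly stricter, policy limit
) -> float:
    """Return the largest age (s) at which the assumed decay curve still meets the accuracy."""
    if not 0.0 < required_accuracy <= initial_accuracy:
        raise ValueError("required accuracy must lie in (0, initial_accuracy]")
    # Solve initial_accuracy * exp(-age / tau) = required_accuracy for age.
    age_s = -time_constant_s * math.log(required_accuracy / initial_accuracy)
    # Option 2.d: a policy may impose a more stringent timeliness value.
    if policy_max_age_s is not None:
        age_s = min(age_s, policy_max_age_s)
    return age_s
```

For a decay model without a closed-form inverse, the same value could be found numerically, e.g., by bisection over a monotonically decreasing accuracy estimate.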

[0226] In some embodiments, the Scheduler can control parameters of the data collection other than time-based scheduling. In these cases, the Scheduler requests or calculates the relevant required parameters from the Downstream Applications similar to the steps above.

[0227] For example, the Scheduler might decide on how many Endpoints from a group to poll directly and how many to rely on a modeled version to reach a given accuracy; in this case it may request or calculate the required accuracy for the Downstream Application.

[0228] In a second example, the Scheduler might poll the same Endpoints but use different settings, e.g., a lower or higher sampling resolution, depending on the required accuracy. The scheduler might upgrade the lower sampling resolution data by providing it to a Generative Model. In this case, it requires a minimum required accuracy value from the Downstream Application.

[0229] Preferably, the Downstream Application also informs the Scheduler whether it intends to make a repeat request for the same data and if so, the planned timing of this request. Optionally, the Downstream Applications may inform the Scheduler that no modeled data is to be used (permission to use this behavior may be controlled by a policy). In this case the method terminates, and raw data is requested from the Endpoints.

[0230] Given the list of Endpoints and the required timeliness / accuracy of their data, the Scheduler accesses the Model Management System to identify the presence of suitable Generative Models and Decay Models.

[0231] Where a suitable Generative Model for all required Endpoints is found, the method can proceed to the next step. However, where a model for only some but not all Endpoints is found, the Scheduler may take a multi-step process. First, poll the Endpoints for which no Generative Model is found, and return their raw data both to the Downstream Application and the Model Management System. Receive an updated request from the Downstream Applications for only the data they still require in light of the raw data they already received. The Scheduler can then attempt to provide this reduced set of data. An advantage is that the raw data from the Endpoints that had to be polled in any case, e.g., because there was no relevant model, could in some cases reduce the requirements for or required accuracy of the remaining required data, which could save resources.

[0232] For any data that is still required by the Downstream Applications, the Scheduler proceeds to the remainder of the method.

[0233] Where no suitable model is found, the process exits and the Scheduler contacts the Endpoints directly and returns their raw data to the Downstream Applications (and optionally, the Model Management system which may attempt to create a new model of that Endpoint).
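The multi-step handling of partially available models can be sketched as follows. The function `serve_with_partial_models` and its callables are hypothetical: `poll` stands in for contacting an Endpoint, and `still_required` stands in for the updated request received from the Downstream Applications.

```python
from typing import Callable, Dict, List, Set, Tuple


def serve_with_partial_models(
    endpoints: List[str],
    modeled: Set[str],                # Endpoints for which a suitable Generative Model exists
    poll: Callable[[str], float],     # hypothetical raw poll of one Endpoint
    still_required: Callable[[Dict[str, float], List[str]], List[str]],
) -> Tuple[Dict[str, float], List[str]]:
    """Return (raw data already delivered, Endpoints still to be served by the main process)."""
    unmodeled = [e for e in endpoints if e not in modeled]
    candidates = [e for e in endpoints if e in modeled]
    # First, poll every Endpoint that has no model; this data is needed in any case.
    raw = {e: poll(e) for e in unmodeled}
    if not candidates:
        # No suitable model at all: everything was served as raw data.
        return raw, []
    # Then let the application reduce its request in light of the raw data received.
    remaining = still_required(raw, candidates) if raw else candidates
    return raw, remaining
```

The returned raw data would also be forwarded to the Model Management System, which may use it to create models for the previously unmodeled Endpoints.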

[0234] The Model Management System returns to the Scheduler, for the selected Endpoints, the current output of the Decay Model. It also returns a time since the last raw data update was received from the Endpoint.

[0235] The Scheduler uses the time since a last raw update received to estimate the current accuracy level of the Generative Model using the Decay Model. It compares this with the accuracy required by the Downstream Application(s).

[0236] Where this accuracy level is higher than the minimum required by the Downstream Applications, the Scheduler accesses the Generative Model to retrieve its current, modeled version of the data likely to be held at the Endpoint, and returns this to the Downstream Application.

[0237] Preferably, the Scheduler appends a label to this data indicating that it is simulated rather than raw data; this label may include the estimated accuracy given by the Decay Model. Where the accuracy level is lower than the minimum required, the Scheduler polls the Endpoint directly to retrieve its raw data and forwards this to the Downstream Application (for use) and the Model Management System (to update its models for that Endpoint).

[0238] Where the Downstream Application has provided information about any scheduled future updates, the Scheduler may choose a good time to poll the Endpoint to ensure that the Generative Model is expected to be outputting data of at least the required accuracy, given the rate of decay of the accuracy expected via the Decay Model.

[0239] In some cases this might involve polling the Endpoint before the accuracy would drop to a level that strictly requires it - for example, the Scheduler might choose to poll the Endpoint early if there is a quiet time of low resource utilization in the network.
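One possible way to pick such a polling time is sketched below. It assumes the Scheduler already knows the maximum age `max_age_s` at which the modeled data still meets the required accuracy, and a list of forecast quiet periods; both inputs, and the function name `choose_poll_time`, are hypothetical.

```python
from typing import List, Tuple


def choose_poll_time(
    now_s: float,
    last_update_s: float,
    max_age_s: float,                          # age at which accuracy reaches the required minimum
    quiet_windows: List[Tuple[float, float]],  # assumed forecast of low-utilization periods (start, end)
) -> float:
    """Return the time at which to poll the Endpoint next."""
    deadline_s = last_update_s + max_age_s
    if deadline_s <= now_s:
        # Accuracy is already at or below the required level: poll immediately.
        return now_s
    # Prefer the earliest moment of a quiet window that starts before the deadline.
    for start_s, end_s in sorted(quiet_windows):
        candidate_s = max(start_s, now_s)
        if candidate_s <= min(end_s, deadline_s):
            return candidate_s
    # No usable quiet window: poll as late as the accuracy requirement allows.
    return deadline_s
```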

[0240] The scheduler may also look at any other requests for data for that particular Endpoint coming from different Downstream Applications within a short time period of each other, preferably when deciding whether or not to use modelled data or raw data.

[0241] For example, the multiple Downstream Applications might be several external digital twin systems which are using data from a single sensor, but are not otherwise communicating with each other (for example, the sensor may be attached to a machine which has its own digital twin, but is also part of a digital twin of a factory, as well as a digital twin of all similar machines from the same manufacturer).

[0242] If it has several requests for data from a single Endpoint, the Scheduler may choose to poll the Endpoint with a timing that is set by that Downstream Application which requires the highest accuracy, e.g., most frequent raw data updates from the Endpoint, and use the Generative Model to provide modeled data to other requests which come in for some period afterwards.

[0243] The benefit of this is that the Endpoint is polled only according to the frequency required by one Downstream Application, but all Downstream Applications receive a data output instantly upon request which matches their accuracy needs. In the absence of this system, the scheduler would either have to poll the Endpoint more frequently (wasting resources), or could not provide updates instantly (e.g., it would have to batch the requests).
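The shared-polling behavior for several Downstream Applications can be sketched as follows. Each application's accuracy need is represented here by an assumed maximum tolerable data age; the application names in the usage note and both function names are hypothetical.

```python
from typing import Dict, List, Tuple


def strictest_app(max_age_by_app: Dict[str, float]) -> Tuple[str, float]:
    """Return the application with the tightest requirement and its maximum age (s)."""
    app = min(max_age_by_app, key=lambda a: max_age_by_app[a])
    return app, max_age_by_app[app]


def serve_requests(
    requests: List[Tuple[float, str]],  # (request time in s, application name)
    max_age_by_app: Dict[str, float],
) -> List[Tuple[float, str, str]]:
    """Answer every request at once, polling only as often as the strictest application needs."""
    _, interval_s = strictest_app(max_age_by_app)
    last_poll_s = None
    served = []
    for t_s, app in sorted(requests):
        if last_poll_s is None or t_s - last_poll_s >= interval_s:
            # Raw poll; the result would also be forwarded to the Model Management System.
            last_poll_s = t_s
            served.append((t_s, app, "raw"))
        else:
            # The model is younger than the strictest limit, so it satisfies every application.
            served.append((t_s, app, "modeled"))
    return served
```

For example, with limits of 10 s, 60 s, and 300 s for a machine twin, a factory twin, and a fleet twin, four requests arriving at 0 s, 3 s, 5 s, and 12 s would lead to two polls of the Endpoint rather than four.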

[0244] The Scheduler returns the data, either raw or modeled, to the Downstream Application(s). In case of raw data, the scheduler may also forward the raw data to the Model Management System, e.g., to be used in its Background Model Update Process of the generative model and / or decay model. As mentioned, where modeled data is returned to a Downstream Application the Scheduler may preferably add metadata indicating that it is modeled rather than raw data, and giving the estimated accuracy, e.g., as derived from the Decay Model.

[0245] The above method has mainly focused on scheduling of updates from Endpoints (e.g., deciding on the timing of data collection). The system may also be used to affect other properties of the data collection, including:

[0246] Sampling resolution. For example, the Scheduler may choose to request raw data from Endpoints at a lower sampling resolution if it can use the Generative Model to upscale this to a suitable accuracy.

[0247] The choice of Endpoint. For example, the Scheduler may choose to poll an Endpoint which provides lower resolution data of the same scene or event, if that has some other advantage like lower power consumption or data rate.

[0248] The number of Endpoints polled for a given request. For example, the Scheduler might choose to only poll a fraction of the number of Endpoints that would naively be used to fulfil a given request, if it can use their data to provide modeled data of suitable accuracy.

[0249] In all cases the method may proceed based on the same logic. For example, the Scheduler compares an accuracy level estimated by the Decay Model to the required accuracy level known or estimated for the Downstream Applications.
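The same comparison can be applied, for example, to the choice of sampling resolution. In the sketch below, `accuracy_by_resolution` is a hypothetical mapping from each available sampling resolution to the accuracy the Decay Model estimates after upscaling by the Generative Model; the function name is likewise an assumption.

```python
from typing import Dict, Optional


def pick_sampling_resolution(
    accuracy_by_resolution: Dict[int, float],  # resolution -> estimated accuracy after upscaling
    required_accuracy: float,
) -> Optional[int]:
    """Return the lowest resolution whose estimated accuracy suffices, or None if none does."""
    sufficient = [
        resolution
        for resolution, accuracy in accuracy_by_resolution.items()
        if accuracy >= required_accuracy
    ]
    # None signals that no listed setting is estimated to be accurate enough,
    # so the Scheduler would have to fall back to another option.
    return min(sufficient) if sufficient else None
```

Selecting between Endpoints, or deciding how many Endpoints of a group to poll, could follow the same pattern with a different set of candidates.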

[0250] An advantage is the ability to provide instant updates to Downstream Applications using modeled data while respecting their accuracy requirements, but without always having to poll the Endpoints for every request. This contrasts to known systems which would either have to poll the Endpoints for each new request (which uses a lot of resources), or can reduce the polling of the Endpoints but cannot provide instant updates, e.g., because they need to batch several requests for updates in an attempt to optimize the scheduling, which leads to a delay for many or most requests.

[0251] A further advantage is that the scheduler is able to use the time period during which accuracy is expected to remain above the required threshold to optimize other networking tasks in light of any other requests or tasks it has to perform - for example, by choosing to poll an Endpoint at an earlier or later time to match with a quiet period of low resource usage, and provide the modeled data in the interim.

Figure 5 schematically shows an example of an embodiment of a method 500 for providing current or predicted data from a data source. Method 500 may be computer implemented and comprises the following stages.

[0252] Ongoing / Background Model Update Process

[0253] 510: The Model Management System continuously obtains any available updated data from the Endpoints. The Model Management System uses this data to keep its Generative and Decay Models up-to-date.

[0254] Main Process

[0255] 520: The Scheduler receives a request for an update from a Downstream Application, which would normally require that the Scheduler return data from at least one Endpoint.

[0256] 530: The Scheduler obtains, for each request, a list of relevant Endpoints, and information about the required timeliness and accuracy of the returned data. (This could be directly communicated by the Downstream Application, or calculated by the Scheduler based on parameters known about the Downstream Application).

[0257] 540: Given the list of Endpoints and the required timeliness / accuracy of their data, the Scheduler accesses the Model Management System to identify the presence of suitable Generative Models and Decay Models.

[0258] 550: The Model Management System returns, for the selected Endpoints, the current output of the Decay Model. It also returns a time since the last raw data update was received from the Endpoint.

[0259] 560: The Scheduler uses the time since a last raw update received to estimate the current accuracy level of the Generative Model using the Decay Model. It compares this with the accuracy required by the Downstream Application(s).

[0260] 570: Depending on the outcome of the comparison, the Scheduler may provide either modeled or raw data to the Downstream Application(s). Where raw data is to be provided, the Scheduler may further choose when to request this from the Endpoints considering the data available from the Generative or Decay Models.

[0261] 580: The Scheduler returns the data (either raw or modeled) to the Downstream Application(s), and where necessary, forwards raw data to the Model Management System (to assist with its Background Model Update Process).

Many different ways of executing embodiments of the method are possible, as will be apparent to a person skilled in the art. For example, the steps can be performed in the shown order, but the order of the steps can also be varied, or some steps may be executed, at least partially, in parallel. Moreover, in between steps other method steps may be inserted. The inserted steps may represent refinements of the method such as described herein, or may be unrelated to the method. Moreover, a given step may not have finished completely before a next step is started.

[0262] Embodiments of the method may be executed using software, which comprises instructions for causing a processor system to perform an embodiment of a method, e.g., method 400 or 500. Software may only include those steps taken by a particular sub-entity of the system. The software and / or other data according to an embodiment may be stored in a non-transitory storage medium, such as a hard disk, a floppy, a memory, an optical disc, read only memory, random access memory, CD-ROMs, magnetic tape, optical data storage devices, etc. Transitory signals and carrier waves are excluded from non-transitory media.

[0263] The software may be sent as a transitory signal along a wire, or wireless, e.g., sent as a transitory signal over a data network, e.g., the Internet. For example, signals and / or carrier waves may serve as a transitory medium for carrying information. For example, a modulated electromagnetic wave may carry a signal bearing the software and / or other data according to an embodiment.

[0264] The software may be made available for download and / or for remote usage on a server. Embodiments of the method may be executed using a bitstream arranged to configure programmable logic, e.g., a field-programmable gate array (FPGA), to perform an embodiment of the method.

[0265] It will be appreciated that the presently disclosed subject matter also extends to computer programs, particularly computer programs on or in a carrier, adapted for putting the presently disclosed subject matter into practice. The program may be in the form of source code, object code, a code intermediate source and object code, such as partially compiled form, or in any other form suitable for use in the implementation of an embodiment of the method. An embodiment relating to a computer program product comprises computer executable instructions corresponding to each of the processing steps of at least one of the methods set forth. These instructions may be subdivided into subroutines and / or be stored in one or more files that may be linked statically or dynamically. Another embodiment relating to a computer program product comprises computer executable instructions corresponding to each of the devices, units and / or parts of at least one of the systems and / or products set forth.

[0266] Figure 6a shows a computer readable medium 1000 having a writable part 1010, and a computer readable medium 1001 also having a writable part. Computer readable medium 1000 is shown in the form of an optically readable medium. Computer readable medium 1001 is shown in the form of an electronic memory, in this case a memory card. Computer readable medium 1000 and 1001 may store data 1020 wherein the data may indicate instructions, which when executed by a processor system, cause a processor system to perform an embodiment of a method for providing current or predicted data from a data source, according to an embodiment. The computer program 1020 may be embodied on the computer readable medium 1000 as physical marks or by magnetization of the computer readable medium 1000. However, any other suitable embodiment is conceivable as well. Furthermore, it will be appreciated that, although the computer readable medium 1000 is shown here as an optical disc, the computer readable medium 1000 may be any suitable computer readable medium, such as a hard disk, solid state memory, flash memory, etc., and may be non-recordable or recordable. The computer program 1020 comprises instructions for causing a processor system to perform an embodiment of said method for providing current or predicted data from a data source.

[0267] In an embodiment, a transitory or non-transitory computer storage medium is provided encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform the method according to an embodiment.

[0268] Figure 6b shows a schematic representation of a processor system 1140 according to an embodiment. The processor system comprises one or more integrated circuits 1110. The architecture of the one or more integrated circuits 1110 is schematically shown in Figure 6b. Circuit 1110 comprises a processing unit 1120, e.g., a CPU, for running computer program components to execute a method according to an embodiment and / or implement its modules or units. Circuit 1110 comprises a memory 1122 for storing programming code, data, etc. Part of memory 1122 may be read-only. Circuit 1110 may comprise a communication element 1126, e.g., an antenna, connectors or both, and the like. Circuit 1110 may comprise a dedicated integrated circuit 1124 for performing part or all of the processing defined in the method. Processor 1120, memory 1122, dedicated IC 1124 and communication element 1126 may be connected to each other via an interconnect 1130, say a bus. The processor system 1140 may be arranged for contact and / or contact-less communication, using an antenna and / or connectors, respectively.

[0269] For example, in an embodiment, processor system 1140, e.g., a system for providing current or predicted data from a data source, may comprise a processor circuit and a memory circuit, the processor being arranged to execute software stored in the memory circuit. For example, the processor circuit may be an Intel Core i7 processor, ARM Cortex-R8, etc. The memory circuit may be a ROM circuit, or a non-volatile memory, e.g., a flash memory. The memory circuit may be a volatile memory, e.g., an SRAM memory. In the latter case, the device may comprise a non-volatile software interface, e.g., a hard drive, a network interface, etc., arranged for providing the software.

[0270] While system 1140 is shown as including one of each described component, the various components may be duplicated in various embodiments. For example, the processing unit 1120 may include multiple microprocessors that are configured to independently execute the methods described herein or are configured to perform elements or subroutines of the methods described herein such that the multiple processors cooperate to achieve the functionality described herein. Further, where the system 1140 is implemented in a cloud computing system, the various hardware components may belong to separate physical systems. For example, the processor 1120 may include a first processor in a first server and a second processor in a second server.

[0271] It should be noted that the above-mentioned embodiments illustrate rather than limit the presently disclosed subject matter, and that those skilled in the art will be able to design many alternative embodiments.

[0272] In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. Use of the verb ‘comprise’ and its conjugations does not exclude the presence of elements or steps other than those stated in a claim. The article ‘a’ or ‘an’ preceding an element does not exclude the presence of a plurality of such elements. Expressions such as “at least one of” when preceding a list of elements represent a selection of all or of any subset of elements from the list. For example, the expression, “at least one of A, B, and C” should be understood as including only A, only B, only C, both A and B, both A and C, both B and C, or all of A, B, and C. The presently disclosed subject matter may be implemented by hardware comprising several distinct elements, and by a suitably programmed computer. In the device claim enumerating several parts, several of these parts may be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

[0273] In the claims references in parentheses refer to reference signs in drawings of exemplifying embodiments or to formulas of embodiments, thus increasing the intelligibility of the claim. These references shall not be construed as limiting the claim.

Claims

CLAIMS

Claim 1. A method (400) for providing current or predicted data from a data source, the method comprising
repeatedly obtaining (410) data from a data source,
providing (420) the obtained data for maintaining a generative model for modeling of the data of the data source, the generative model being configured to generate predicted data for the data source at a given time point, and for maintaining a decay model, the decay model being configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model,
receiving (430) a request for current data from the data source, and in response
obtaining (440) from the decay model an estimate of the quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source,
based on the estimate, selecting (450) to obtain actual current data from the data source, or to obtain predicted current data from the generative model,
returning (460) actual current data or predicted current data.

Claim 2. The method of Claim 1, further comprising maintaining the generative model, maintaining the generative model comprising updating parameters of the generative model to reflect data obtained from the data source, and maintaining the decay model.

Claim 3. A method as in any one of the preceding claims, wherein the data of the data source and generated by the generative model comprises one or more of:
a. an image, a video, or a 3D model;
b. scheduling data or application data;
c. communications network data, wherein the data source is comprised in the communications network;
d. sensor data, wherein the data source comprises a sensor;
e. parameter data;
f. status updates.

Claim 4. A method as in any one of the preceding claims, wherein the selecting depends on the estimate meeting a minimum accuracy requirement.

Claim 5. A method as in any one of the preceding claims, wherein the request comprises one or more of the following conditions: a maximum age for the last data obtained from the data source and used to maintain the generative model, a minimum accuracy of the decay model; the method comprising selecting to obtain actual current data from the data source if either condition is not met.

Claim 6. A method as in any one of the preceding claims, wherein obtaining actual current data from the data source is scheduled for a future time, the method comprising obtaining and returning actual current data from the data source at the scheduled future time.

Claim 7. A method as in any one of the preceding claims, comprising, before the selecting, waiting at most a waiting interval for a further request for current data from the data source, and in response to the further request performing the selecting, the selecting depending on the further request.

Claim 8. A method as in any one of the preceding claims, wherein data is obtained from multiple data sources, and the generative model is configured for modeling of the data of the multiple data sources, the request being a request for current data from one or more of the multiple data sources, the method further comprising selecting a data source from which to obtain the current data.

Claim 9. A method as in any one of the preceding claims, wherein the generative model for a data source is updated using data obtained from another data source.

Claim 10. A method as in any one of the preceding claims, comprising obtaining actual current data at a first resolution, upscaling the actual current data at the first resolution to upscaled current data at a second resolution, wherein the second resolution is higher than the first resolution, and returning the upscaled current data.

Claim 11. A method as in any one of the preceding claims, wherein the decay model is part of the generative model or wherein the decay model is external to the generative model.

Claim 12. A method as in any one of the preceding claims, wherein maintaining the decay model comprises determining the quantitative measure of the accuracy of predicted current data compared to actual current data and updating parameters of the decay model based on the quantitative measure.

Claim 13. A method as in any one of the preceding claims, wherein the request is received by a scheduling application configured to schedule the timing of updates from the data source, data obtained by the scheduling application from the data source is provided to a model management application, the model management application storing the data, maintaining the generative model and/or decay model, generating predicted current data and/or estimating the quantitative measure of accuracy.

Claim 14. A method as in any one of the preceding claims, wherein the request is received by an Application Data Analytics Enablement (ADAE) server, the ADAE server being configured to obtain the data from the data source, the request being received from an ADAE Client.

Claim 15. A system for providing current or predicted data from a data source, comprising: one or more processors; and one or more storage devices storing instructions that, when executed by the one or more processors, cause the one or more processors to perform
repeatedly obtaining data from a data source,
providing the obtained data for maintaining a generative model for modeling of the data of the data source, the generative model being configured to generate predicted data for the data source at a given time point, and for maintaining a decay model, the decay model being configured to estimate a quantitative measure of the accuracy of the predicted data produced by the generative model,
receiving a request for current data from the data source, and in response
obtaining from the decay model an estimate of the quantitative measure of the accuracy of a prediction by the generative model of the current data of the data source,
based on the estimate, selecting to obtain actual current data from the data source, or to obtain predicted current data from the generative model,
returning actual current data or predicted current data.

Claim 16. A system as in Claim 15, further configured for maintaining the generative model, maintaining the generative model comprising updating parameters of the generative model to reflect data obtained from the data source, and maintaining the decay model.

Claim 17. A transitory or non-transitory computer storage medium encoded with instructions that, when executed by one or more computers, cause the one or more computers to perform the method according to any one of Claims 1-14.
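
By way of illustration only, the request handling recited in Claim 1, together with the minimum accuracy requirement of Claim 4 and the decay model update of Claim 12, might be sketched as follows. The claims do not prescribe any particular model. The linear extrapolation used as the generative model, the exponential form of the decay model, the accuracy measure 1 / (1 + |error|), and all class, method, and parameter names below are assumptions made for this sketch and are not taken from the application.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class LinearGenerativeModel:
    """Hypothetical stand-in for the generative model: linear extrapolation."""
    last_time: Optional[float] = None
    last_value: float = 0.0
    slope: float = 0.0

    def update(self, t: float, value: float) -> None:
        # Re-estimate the slope from the two most recent observations.
        if self.last_time is not None and t > self.last_time:
            self.slope = (value - self.last_value) / (t - self.last_time)
        self.last_time, self.last_value = t, value

    def predict(self, t: float) -> float:
        return self.last_value + self.slope * (t - self.last_time)


@dataclass
class ExponentialDecayModel:
    """Hypothetical decay model: estimated accuracy is exp(-rate * age)."""
    rate: float = 0.1
    learning_rate: float = 0.5

    def estimate(self, age: float) -> float:
        return math.exp(-self.rate * age)

    def update(self, age: float, observed_accuracy: float) -> None:
        # Move the rate toward the value implied by the observed accuracy.
        if age > 0 and 0 < observed_accuracy <= 1:
            implied_rate = -math.log(observed_accuracy) / age
            self.rate += self.learning_rate * (implied_rate - self.rate)


class DataProvider:
    """Returns actual or predicted current data, depending on the decay estimate."""

    def __init__(self, read_source: Callable[[float], float], min_accuracy: float):
        self.read_source = read_source      # assumed callable: time -> measurement
        self.min_accuracy = min_accuracy    # minimum accuracy requirement
        self.generative = LinearGenerativeModel()
        self.decay = ExponentialDecayModel()

    def refresh(self, t: float) -> float:
        """Obtain actual data and use it to maintain both models."""
        value = self.read_source(t)
        if self.generative.last_time is not None:
            age = t - self.generative.last_time
            error = abs(self.generative.predict(t) - value)
            # Assumed accuracy measure in (0, 1]; 1.0 means a perfect prediction.
            self.decay.update(age, 1.0 / (1.0 + error))
        self.generative.update(t, value)
        return value

    def request(self, t: float) -> Tuple[float, str]:
        """Handle a request for current data at time t."""
        if self.generative.last_time is None:
            return self.refresh(t), "actual"
        age = t - self.generative.last_time
        if self.decay.estimate(age) >= self.min_accuracy:
            return self.generative.predict(t), "predicted"
        return self.refresh(t), "actual"


# Usage: a simulated source whose value grows linearly with time.
provider = DataProvider(read_source=lambda t: 2.0 * t, min_accuracy=0.8)
provider.refresh(0.0)
provider.refresh(1.0)
soon = provider.request(1.1)    # short age: answered from the generative model
later = provider.request(5.0)   # long age: estimate too low, source is read
```

In this sketch a request shortly after the last update is answered from the generative model without contacting the data source, while a request after a longer interval falls below the assumed accuracy threshold and triggers an actual read, which in turn updates both models.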