First node, fourth node and methods performed thereby for handling data

By profiling and validating data using node-specific models, the method addresses data poisoning issues in machine learning models, ensuring accurate and reliable training data for improved model performance in communications systems.

WO2026124797A1 · PCT designated stage · Publication Date: 2026-06-18 · TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)

Patent Information

Authority / Receiving Office
WO · WO
Patent Type
Applications
Current Assignee / Owner
TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
Filing Date
2025-05-26
Publication Date
2026-06-18

[Figure: EP2025064419_18062026_PF_FP_ABST]

Abstract

A computer-implemented method, performed by a first node (111). The first node (111) obtains (201, 202) a set of data from, and a profile of, a second node (112) based on one or more characteristics. The first node (111) obtains (203), based on the profile, at least one of one or more models capable of estimating a respective probability value distribution of data. The first node (111) then obtains (206), based on the models, a respective result of whether or not the set of data falls within the respective distributions. The models correspond to the characteristics. The first node (111) determines (207) whether or not the set of data is to be used as input to train an ML model (MLM), based on the respective results. The first node (111) then provides (208), based on a result of the determining (207), the set of data as input to train the MLM.

Description

[0001] FIRST NODE, FOURTH NODE AND METHODS PERFORMED THEREBY FOR HANDLING DATA

[0003] TECHNICAL FIELD

[0004] The present disclosure relates generally to a first node and methods performed thereby for handling data. The present disclosure also relates generally to a fourth node and methods performed thereby for handling data. The present disclosure also relates generally to computer programs and computer-readable storage mediums, having stored thereon the computer programs to carry out these methods.

[0005] BACKGROUND

[0006] Computer systems in a communications network or communications system may comprise one or more nodes. A node may comprise a processing circuitry which, together with computer program code may perform different functions and actions, a memory, a receiving port, and a sending port. A node may be, for example, a server. Nodes may perform their functions entirely on the cloud.

[0007] The communications system may cover a geographical area which may be divided into cell areas, each cell area being served by a type of node, a network node in the Radio Access Network (RAN), radio network node or Transmission Point (TP), for example, an access node such as a Base Station (BS), e.g., a Radio Base Station (RBS), which sometimes may be referred to as e.g., gNB, evolved Node B (“eNB”), “eNodeB”, “NodeB”, “B node”, or Base Transceiver Station (BTS), depending on the technology and terminology used. The base stations may be of different classes such as e.g., Wide Area Base Stations, Medium Range Base Stations, Local Area Base Stations, and Home Base Stations, based on transmission power and thereby also cell size. A cell may be understood to be the geographical area where radio coverage may be provided by the base station at a base station site. One base station, situated on the base station site, may serve one or several cells. Further, each base station may support one or several communication technologies. The telecommunications network may also comprise network nodes which may serve receiving nodes, such as user equipments, with serving beams.

[0008] The standardization organization Third Generation Partnership Project (3GPP) is currently in the process of specifying a New Radio Interface called Next Generation Radio or New Radio (NR) or 5G-Universal Terrestrial Radio Access (UTRA), as well as a Fifth Generation (5G) Packet Core Network, which may be referred to as 5G Core Network, abbreviated as 5GC.

[0009] Machine Learning

Machine learning (ML) may be understood as the study of computer algorithms that may improve automatically through experience. It is seen as a part of Artificial Intelligence (AI). ML algorithms may build a model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to do so. ML algorithms may be used in a wide variety of applications, such as email filtering and computer vision, where it may be difficult or unfeasible to develop conventional algorithms to perform the needed tasks.

[0010] In ML, there may be basically three types of algorithms for training of models: Supervised Learning, Unsupervised Learning, and Reinforcement Learning (RL).

[0011] Supervised Learning algorithms may comprise a target / outcome variable, or dependent variable, which may have to be predicted from a given set of predictors, that is, independent variables. Using this set of variables, a function may be generated that may map inputs to desired outputs. The training process may continue until the model may achieve a desired level of accuracy on the training data. Once an ML model may have been trained, an inference process may begin, whereby new data may be run through the ML model to calculate an output. Examples of Supervised Learning may be Regression, Decision Tree, Random Forest, KNN, Logistic Regression etc.

[0012] In Unsupervised Learning algorithms, there may be no target or outcome variable to predict / estimate. It may be used for clustering a population into different groups, which may be widely used for segmenting customers in different groups for specific intervention. Examples of Unsupervised Learning may be K-means, mean-shift clustering, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Expectation-Maximization (EM) Clustering using Gaussian Mixture Models (GMM), Agglomerative Hierarchical Clustering, etc.

[0013] Cluster analysis or clustering may be understood as an ML technique which may comprise grouping a set of objects in such a way that objects in the same group, which may be called a cluster, may be understood to be more similar, in some sense, to each other than to those in other groups, that is, other clusters. It may be understood as a main task of exploratory data mining, and a common technique for statistical data analysis, used in many fields, including pattern recognition, image analysis, information retrieval, bioinformatics, data compression, computer graphics and ML.

[0014] Reinforcement learning (RL) may be understood to be a type of ML where an agent may learn to make decisions by taking actions in an environment to achieve some goal. The agent may receive feedback in the form of rewards, which it may use to learn the best strategy, or policy, to accumulate the most reward over time.

[0015] In ML, online learning algorithms may be used throughout the RAN for resource allocation, interference management, predictive maintenance, traffic prediction, anomaly detection etc. As mobile network UE traffic and mobility patterns may be different in different geographical areas and points in time it may be advantageous for models to be trained upon arrival of new data, as they may potentially be more accurate than models that may have been trained once, on a static dataset.

[0016] SUMMARY

[0017] As part of the development of embodiments herein, one or more problems with the existing technology will first be identified and discussed.

[0018] Online learning has associated security risks. A malicious user may poison a dataset used for the learning by injecting data that is erroneous, thus leading to models making incorrect predictions. Poisoning the data may not be exclusively a malicious act. For example, in case models are time-sensitive, e.g., sequence to sequence models such as those based on recurrent neural networks (RNNs), a pattern that indicates a trend in data may be misleading. This may be due, for example, to rare, critical events that may not happen that often. Such events, assuming a model in a mobile network, may be, for example, natural disasters, large gatherings of people / User Equipments (UE), such as in demonstrations, protests, etc. These events may generate real time data that may not be representative of normal behavior and may result in biasing of the model.

[0019] Certain aspects of the present disclosure and their embodiments address one or more of the challenges identified with the existing methods and provide solutions to the challenges discussed.

[0020] According to a first aspect of embodiments herein, the object is achieved by a computer-implemented method, performed by a first node. The method is for handling data. The first node operates in a communications system. The first node obtains a first set of data from a second node operating in the communications system. The first node obtains a first profile of the second node based on one or more first characteristics of the second node. The first node obtains, based on the obtained first profile, at least one of one or more first models capable of estimating a respective probability value distribution of data. The first node obtains, based on the one or more first models capable of estimating the respective probability value distribution of data, a respective first result. The respective first result is of whether or not the first set of data falls within the respective probability value distributions respectively estimated with the one or more first models. The one or more first models correspond to the one or more first characteristics of the second node. The first node determines whether or not the first set of data is to be used as input to train a first machine learning model (MLM). The determining is based on the obtained respective first results. The first node also provides, based on a result of the determining, the first set of data as input to train the first MLM.
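As an illustration only, the steps of the first aspect can be sketched as follows. The sketch assumes that each first model is a one-dimensional Gaussian summarised by a mean and a standard deviation, that a set of data "falls within" a distribution when every sample lies within a fixed number of standard deviations of the mean, and that the data is used only when every matching model accepts it. None of these choices is specified by the disclosure, and all names (`GaussianModel`, `validate`, `Z_MAX`, the characteristic labels) are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev

Z_MAX = 3.0  # assumed acceptance threshold, in standard deviations


@dataclass(frozen=True)
class GaussianModel:
    """Hypothetical 'first model': a Gaussian estimate of one feature's distribution."""
    mu: float
    sigma: float

    @classmethod
    def fit(cls, samples):
        # Estimate the distribution from previously observed samples.
        return cls(mean(samples), stdev(samples))

    def contains(self, samples):
        # True when every sample lies within Z_MAX standard deviations of the mean.
        return all(abs(x - self.mu) <= Z_MAX * self.sigma for x in samples)


def validate(data, profile, models_by_characteristic):
    """Return True if `data` should be provided as training input.

    `profile` is a tuple of characteristics of the data source. One model is
    looked up per characteristic, and every model found must accept the data.
    """
    models = [models_by_characteristic[c] for c in profile
              if c in models_by_characteristic]
    if not models:
        return False  # assumption: with no matching model, the data is held back
    results = [m.contains(data) for m in models]  # one result per model
    return all(results)


# Models previously learned for two hypothetical characteristics.
models = {
    "urban": GaussianModel.fit([48.0, 52.0, 50.0, 49.0, 51.0]),
    "macro_cell": GaussianModel.fit([45.0, 55.0, 50.0, 47.0, 53.0]),
}
accepted = validate([49.5, 50.5], ("urban", "macro_cell"), models)
rejected = validate([49.5, 500.0], ("urban", "macro_cell"), models)
```

In this sketch `accepted` is true and `rejected` is false, because the value 500.0 lies far outside both assumed distributions. A real deployment could use any density estimator and any acceptance rule in place of the Gaussian and the fixed threshold.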

[0021] According to a second aspect of embodiments herein, the object is achieved by a computer-implemented method, performed by a fourth node. The method is for handling data. The fourth node operates in the communications system. The fourth node obtains a respective plurality of second sets of data from a plurality of fifth nodes operating in the communications system. The fourth node obtains a respective third profile of the fifth nodes in the plurality of fifth nodes based on respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes. The fourth node determines a respective first model capable of estimating a probability value distribution of the respective plurality of second sets of data. The fourth node creates a third indication of a correspondence between the obtained respective third profile and the determined respective first model for every respective fifth node of the plurality of fifth nodes. The fourth node also provides the determined respective first models, the obtained respective third profiles and the third indication of the correspondence to the first node operating in the communications system.
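A minimal sketch of the second aspect is given below, under stated assumptions: each respective first model is reduced to a mean and a sample standard deviation, a profile is a hashable tuple of characteristics, and the indication of correspondence is a mapping from profile to the identifiers of the models learned from nodes with that profile. The function name `build_registry`, the node identifiers and the profile labels are hypothetical and are not taken from the disclosure.

```python
from statistics import mean, stdev


def build_registry(reports):
    """Fit one model per reporting node and index the models by profile.

    `reports` maps a node identifier to (profile, samples).
    Returns (models, profiles, correspondence), i.e. the three items that
    would be provided to the validating node.
    """
    models, profiles, correspondence = {}, {}, {}
    for node_id, (profile, samples) in reports.items():
        # Assumed model form: mean and sample standard deviation of the data.
        models[node_id] = (mean(samples), stdev(samples))
        profiles[node_id] = profile
        # Record which models were learned from nodes sharing this profile.
        correspondence.setdefault(profile, []).append(node_id)
    return models, profiles, correspondence


reports = {
    "node_a": (("urban", "gNB"), [10.0, 12.0, 11.0]),
    "node_b": (("urban", "gNB"), [9.0, 11.0, 10.0]),
    "node_c": (("rural", "eNB"), [2.0, 4.0, 3.0]),
}
models, profiles, correspondence = build_registry(reports)
```

With this structure, a node that later receives data from a previously unseen source only needs the source's profile to look up the applicable models through `correspondence`, without training anything itself.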

[0022] According to a third aspect of embodiments herein, the object is achieved by the first node, configured to perform the method. The first node is for handling data. The first node is configured to operate in the communications system. The first node is configured to obtain the first set of data from the second node configured to operate in the communications system. The first node is also configured to obtain the first profile of the second node based on the one or more first characteristics of the second node. The first node is further configured to obtain, based on the first profile configured to be obtained, the at least one of the one or more first models configured to be capable of estimating the respective probability value distribution of data. The first node is also configured to obtain, based on the one or more first models capable of estimating the respective probability value distribution of data, the respective first result of whether or not the first set of data falls within the respective probability value distributions configured to be respectively estimated with the one or more first models. The one or more first models are configured to correspond to the one or more first characteristics of the second node. The first node is further configured to determine whether or not the first set of data is to be used as input to train the first MLM. The determining is configured to be based on the respective first results configured to be obtained. The first node is also configured to provide, based on the result of the determining, the first set of data as input to train the first MLM.

[0023] According to a fourth aspect of embodiments herein, the object is achieved by the fourth node, configured to perform the method. The fourth node is for handling data. The fourth node is configured to operate in the communications system. The fourth node is configured to obtain the respective plurality of second sets of data from the plurality of fifth nodes configured to operate in the communications system. The fourth node is also configured to obtain the respective third profile of the fifth nodes in the plurality of fifth nodes based on the respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes. The fourth node is further configured to determine the respective first model capable of estimating the probability value distribution of the respective plurality of second sets of data. The fourth node is also configured to create the third indication of the correspondence between the respective third profile configured to be obtained and the respective first model configured to be determined for every respective fifth node of the plurality of fifth nodes. The fourth node is further configured to provide the respective first models configured to be determined, the respective third profiles configured to be obtained and the third indication of the correspondence to the first node configured to operate in the communications system.

[0024] According to a fifth aspect of embodiments herein, the object is achieved by a computer program, comprising instructions which, when executed on at least one processing circuitry, cause the at least one processing circuitry to carry out the method performed by the first node.

[0025] According to a sixth aspect of embodiments herein, the object is achieved by a computer-readable storage medium, having stored thereon the computer program, comprising instructions which, when executed on at least one processing circuitry, cause the at least one processing circuitry to carry out the method performed by the first node.

[0026] According to a seventh aspect of embodiments herein, the object is achieved by a computer program, comprising instructions which, when executed on at least one processing circuitry, cause the at least one processing circuitry to carry out the method performed by the fourth node.

[0027] According to an eighth aspect of embodiments herein, the object is achieved by a computer-readable storage medium, having stored thereon the computer program, comprising instructions which, when executed on at least one processing circuitry, cause the at least one processing circuitry to carry out the method performed by the fourth node.

[0028] By obtaining the first set of data, the first node may be enabled to then determine if the first set of data may be used to train the first MLM or not, that is, to verify the data, and if so, initiate its use to train the first MLM. Otherwise, the first set of data may be rejected or discarded.

[0029] By obtaining the first profile of the second node, the first node may be enabled to then obtain the at least one of one or more first models capable of estimating a respective probability value distribution of data that may match the first profile.

[0030] By the first node obtaining the at least one of one or more first models based on the obtained first profile, the first node may be enabled to consult the probability value distributions it may have learned from other input sources that may match the first profile of the second node to check if the first set of data may be outside the area of probability value distribution for the second node, based on the one or more first characteristics of the second node. The one or more first characteristics of the second node, that is, the source, contained in the first profile may be later used, in conjunction with the trained one or more first models obtained, to check for the legitimacy of the first set of data obtained from the second node.

[0031] By obtaining the respective first results of whether or not the first set of data falls within the respective probability value distributions respectively estimated with the one or more first models, the first node may be enabled to determine whether the first set of data may be used to train the first MLM or not.

[0032] By the first node 111 determining whether or not the first set of data is to be used as input to train the first MLM based on the obtained respective first results, the first node may advantageously enable efficient and accurate validation of model training data for the first MLM, since the determination may be based on the obtained one or more first models having been obtained based on the one or more first characteristics of the second node. The determination may be understood to be efficient since the first node may use previously trained one or more first models that may just need to be retrieved whenever the first set of data may arrive, even if the second node may be a new source of data, by matching the one or more first models to the obtained first profile of the second node, that is, based on the one or more first characteristics of the second node. The first node may therefore be confronted with any set of data from any new source, and based on the profile of the source, use one or more corresponding first models, to decide whether the obtained data behaves as expected or deviates from that expectation, indicating that the first set of data may be fraudulent and / or the second node may be malicious.

[0033] By then the first node providing the first set of data as input to the first MLM based on the result of the determining, the first node may enable that the first MLM may be trained with verified data, and therefore that the first MLM may provide more accurate predictions.

[0034] In some examples wherein there may be more than one first model matching the obtained first profile of the second node, the determining may further enable that the verification of the first set of data may be crowdsourced, either based on local sources, and / or based on remotely located sources, such as other nodes.

[0035] By the fourth node obtaining the respective plurality of second sets of data from the plurality of fifth nodes, the fourth node may be enabled to determine, e.g., train, the respective first model capable of estimating a probability value distribution of the respective plurality of second sets of data.

[0036] By the fourth node obtaining the respective third profile of the fifth nodes in the plurality of fifth nodes based on respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes, the fourth node may then be enabled to determine the correspondence between the obtained respective third profile and the determined respective first model for every respective fifth node of the plurality of fifth nodes.

[0037] By the fourth node determining the respective first model capable of estimating the probability value distribution of the respective plurality of second sets of data, the fourth node may then be enabled to provide the determined respective first models to the first node, thereby enabling the first node to determine whether or not the first set of data is to be used as input to train the first MLM.

[0038] By the fourth node creating the third indication of the correspondence between the obtained respective third profile and the determined respective first model for every respective fifth node of the plurality of fifth nodes, the fourth node may then be enabled to provide the determined correspondence to the first node, thereby enabling the first node to obtain at least one of one or more first models capable of estimating a respective probability value distribution of data based on the correspondence of the at least one of one or more first models to the obtained first profile.

[0039] By the fourth node providing the determined respective first models, the obtained respective third profiles and the third indication of the correspondence to the first node, the fourth node may enable the first node to determine whether or not the first set of data is to be used as input to train the first MLM, faster and more effectively, without having to determine, e.g., train, the respective first models itself.

[0040] BRIEF DESCRIPTION OF THE DRAWINGS

[0041] Examples of embodiments herein are described in more detail with reference to the accompanying drawings, according to the following description.

[0042] Figure 1 is a schematic diagram illustrating two non-limiting examples, in panel a) and panel b), of a communications system, according to embodiments herein.

[0043] Figure 2 is a flowchart depicting embodiments of a method in a first node, according to embodiments herein.

[0044] Figure 3 is a flowchart depicting embodiments of a method in a fourth node, according to embodiments herein.

[0045] Figure 4 is a schematic diagram illustrating aspects of the method in the first node, or in the fourth node, according to embodiments herein.

[0046] Figure 5 is a schematic diagram illustrating aspects of the method in the first node, according to embodiments herein.

[0047] Figure 6 is a schematic block diagram illustrating a first node, according to embodiments herein.

[0048] Figure 7 is a schematic block diagram illustrating a fourth node, according to embodiments herein.

[0049] DETAILED DESCRIPTION

[0050] Embodiments herein may be understood to relate to verification of training data for models, e.g., ML models. Some embodiments herein may relate to a distributed input validation using input source profiles. An input source may be understood as the entity that may generate the input. More particularly, some embodiments herein may relate to crowdsourcing verification of training data. Further particularly, embodiments herein may relate to a system for distributed verification of model input data. Topologies may be considered such as a radio access network, where input sources may be distributed. Such input sources may, for example, be User Equipment (UE) and Radio Base Stations (RBSs). In some examples of embodiments herein, both sources and verification may be distributed. According to embodiments herein, a node, which may be also referred to herein as a control point (CP), may act as an intermediary between the infrastructure that may be used for training an ML model and the input sources. The CP may learn the probability value distribution of input features from different input sources. When input data arrives from a given input source that may be outside the area of probability value distribution for the given input source, the CP may consult probability value distributions it may have learned from other input sources that match the profile of the current input source. If the consensus is that the output is outside of the probability value distribution of the other sources, then it may be blocked, meaning that it may not be forwarded to train the model. In any other case, the input value may be used for training.
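The control point behaviour described above can be sketched as a two-stage check. This is a hedged illustration, not the disclosed implementation: it assumes each learned distribution is a (mean, standard deviation) pair, that "outside the area" means more than three standard deviations from the mean, and that "consensus" means a strict majority of the matching peer distributions. The function names and the numeric values are hypothetical.

```python
def within(model, value, z_max=3.0):
    """True if `value` lies within z_max standard deviations of the model mean."""
    mu, sigma = model
    return abs(value - mu) <= z_max * sigma


def control_point_decision(value, own_model, peer_models):
    """Return "forward" or "block" for one input value.

    own_model:   (mean, std) learned for the current input source.
    peer_models: (mean, std) pairs learned from other input sources whose
                 profile matches that of the current input source.
    """
    # Stage 1: the value is consistent with the source's own distribution.
    if within(own_model, value):
        return "forward"
    # Stage 2: consult the distributions of sources with a matching profile.
    votes_outside = sum(not within(m, value) for m in peer_models)
    # Assumption: consensus is a strict majority of the matching peers.
    if peer_models and votes_outside > len(peer_models) / 2:
        return "block"
    # In any other case the value is used for training.
    return "forward"


own = (50.0, 2.0)
peers = [(60.0, 5.0), (58.0, 4.0), (51.0, 1.0)]
typical = control_point_decision(51.0, own, peers)   # inside own distribution
unusual = control_point_decision(62.0, own, peers)   # outside own, accepted by most peers
extreme = control_point_decision(200.0, own, peers)  # outside every distribution
```

Here only `extreme` is blocked. The value 62.0 is unusual for the source itself but is accepted by two of the three matching peers, so it is still forwarded, which reflects the idea that an uncommon but legitimate event should not be rejected on the evidence of a single distribution.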

[0051] The embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which examples are shown. In this section, embodiments herein are illustrated by exemplary embodiments. It should be noted that these embodiments are not mutually exclusive. Components from one embodiment or example may be tacitly assumed to be present in another embodiment or example and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. All possible combinations are not described to simplify the description.

[0052] Figure 1 depicts two non-limiting examples, in panels “a” and “b” respectively, of a communications system 100, in which embodiments herein may be implemented. In some example implementations, such as that depicted in the non-limiting example of Figure 1 a), the communications system 100 may be a computer network. In other example implementations, such as that depicted in the non-limiting example of Figure 1 b), the communications system 100 may be implemented in a telecommunications system, sometimes also referred to as a telecommunications network, cellular radio system, cellular network, or wireless communications system. In some examples, the telecommunications system may comprise network nodes which may serve receiving nodes, such as wireless devices. The communications system 100 may for example be a network such as a 5G system, or a newer system supporting similar functionality. The telecommunications system may alternatively or additionally support other technologies such as, for example, Long-Term Evolution (LTE), e.g., LTE Frequency Division Duplex (FDD), LTE Time Division Duplex (TDD), LTE Half-Duplex Frequency Division Duplex (HD-FDD), or LTE operating in an unlicensed band. The telecommunications system may also support yet other technologies, such as Wideband Code Division Multiple Access (WCDMA), Universal Mobile Telecommunications System Terrestrial Radio Access (UTRA) TDD, Global System for Mobile communications (GSM) network, GSM / Enhanced Data Rate for GSM Evolution (EDGE) Radio Access Network (GERAN) network, Ultra-Mobile Broadband (UMB), EDGE, any combination of Radio Access Technologies (RATs) such as, e.g., Multi-Standard Radio (MSR) base stations, multi-RAT base stations etc., any 3rd Generation Partnership Project (3GPP) cellular network, Wireless Local Area Network/s (WLAN) or WiFi network/s, Worldwide Interoperability for Microwave Access (WiMax), IEEE 802.15.4-based low-power short-range networks such as IPv6 over Low-Power Wireless Personal Area Networks (6LowPAN), Zigbee, Z-Wave, Bluetooth Low Energy (BLE), or any cellular network or system. The telecommunications system may for example support a Low Power Wide Area Network (LPWAN). LPWAN technologies may comprise Long Range physical layer protocol (LoRa), Haystack, SigFox, LTE for Machines (LTE-M), and Narrow-Band IoT (NB-IoT).

[0053] The communications system 100 may comprise a plurality of nodes, whereof a first node 111, a second node 112, one or more third nodes 113, a fourth node 114 and a plurality of fifth nodes 115 are depicted in Figure 1.

[0054] Any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be understood, respectively, as a first computer system, a second computer system, one or more third computer systems, a fourth computer system and a plurality of computer systems. Any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be implemented as a standalone server in e.g., a host computer in the cloud 115, as depicted in the non-limiting example of Figure 1 b), for the first node 111 and the fourth node 114.

[0055] In other examples, any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be a distributed node or distributed server, such as a virtual node in the cloud 115, and may perform some of its respective functions locally, e.g., by a client manager, and some of its functions in the cloud 115, by e.g., a server manager.

[0056] In other examples, any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be a distributed node or distributed server, such as a virtual node in the cloud 115, and may perform its functions entirely on the cloud 115, or partially, in collaboration or collocated with a radio network node. Yet in other examples, any of the first node 111, the second node 112, the third node 113 and the fourth node 114 may also be implemented as processing resources in a server farm. Yet in other examples, any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may also be implemented as virtual network functions, e.g., according to a Network Functions Virtualization (NFV) Architecture.

[0057] Any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be under the ownership or control of a service provider or may be operated by the service provider, or on behalf of the service provider.

[0058] Any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may have a capability to perform machine-implemented learning procedures, which may be also referred to as “machine learning” (ML).

[0059] In some examples, the first node 111 and the fourth node 114 may be co-localized or be the same node. In other examples the first node 111 and the fourth node 114 may be different nodes. In the non-limiting examples depicted in Figure 1a), the second node 112 is one of the plurality of fifth nodes 115.

[0060] In some examples, one or more of the one or more third nodes 113 may be the same as one or more of the fifth nodes in the plurality of fifth nodes 115, or may be co-localized. In other examples the one or more third nodes 113 and the plurality of fifth nodes 115 may be different nodes.

[0061] In some examples, the second node 112 may be the same node as, or be co-localized with, one or more of the one or more third nodes 113 or one or more of the fifth nodes in the plurality of fifth nodes 115. In other examples, the second node 112 may be different than the one or more third nodes 113 and the plurality of fifth nodes 115.

[0062] In some examples, the first node 111 may be the same node as, or be co-localized with, the second node 112, one or more of the one or more third nodes 113 or one or more of the fifth nodes in the plurality of fifth nodes 115. In other examples, the first node 111 may be different than the second node 112, the one or more third nodes 113 and the plurality of fifth nodes 115.

[0063] In some examples, the first node 111 and the fourth node 114 may be core network nodes, such as an operations support system (OSS).

[0064] In other examples, any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be radio network nodes, such as depicted in Figure 1 b) for the second node 112, the one or more third nodes 113 and the plurality of fifth nodes 115. Any of the radio network nodes may be, e.g., comprised in a Radio Access Network of the telecommunications system. That is, any of the radio network nodes may be a transmission point such as a radio base station, for example a gNB, an eNB, or any other network node with similar features capable of serving a wireless device, such as a user equipment or a machine type communication device, in the communications system 100. In typical examples, any of the radio network nodes may be a base station, such as a gNB or an eNB. In other examples, any of the radio network nodes may be a distributed node, such as a virtual node in the cloud 115, and may perform its functions entirely on the cloud 115, or partially, in collaboration with a radio network node.

[0065] The telecommunications system may cover a geographical area, which in some embodiments may be divided into cell areas, wherein each cell area may be served by a radio network node, although one radio network node may serve one or several cells. In the example of Figure 1b), the cells are not depicted in order to simplify the figure. Any of the radio network nodes may be of different classes, such as, e.g., macro eNodeB, home eNodeB or pico base station, based on transmission power and thereby also cell size. In some examples, any of the radio network nodes may serve receiving nodes with serving beams. Any of the radio network nodes may be directly connected to one or more core networks.

[0066] Any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115, and / or any of the other nodes comprised in the communications system 100 may support one or several communication technologies, and its name may depend on the technology and terminology used.

[0067] In some examples, any of the first node 111, the second node 112, the one or more third nodes 113, the fourth node 114 and the plurality of fifth nodes 115 may be wireless devices, such as a first wireless device 131 and a second wireless device 132 depicted in Figure 1.

[0068] Any of the wireless devices comprised in the telecommunications system may be a wireless communication device such as a 5G UE, or a UE, which may also be known as e.g., mobile terminal, wireless terminal and / or mobile station, a Customer Premises Equipment (CPE), a mobile telephone, cellular telephone, or laptop with wireless capability, just to mention some further examples. Any of the wireless devices comprised in the communications system 100 may be, for example, portable, pocket-storable, hand-held, computer-comprised, or a vehicle-mounted mobile device, enabled to communicate voice and / or data, via the RAN, with another entity, such as a server, a laptop, a Personal Digital Assistant (PDA), or a tablet, Machine-to-Machine (M2M) device, device equipped with a wireless interface, such as a printer or a file storage device, modem, sensor, IoT device, or any other radio network unit capable of communicating over a radio link in a communications system. Any of the wireless devices comprised in the telecommunications system, such as the first wireless device 131 and the second wireless device 132, may be enabled to communicate wirelessly in the telecommunications system. The communication may be performed e.g., via a RAN, and possibly the one or more core networks, which may be comprised within the telecommunications system. It may be understood that the telecommunications system may comprise additional, or fewer, nodes, radio network nodes and / or wireless devices than those depicted in Figure 1. In the non-limiting example of Figure 1b), the first wireless device 131 is served by the second node 112, and the second wireless device 132 is served by one of the fifth nodes in the plurality of fifth nodes 115. This may be understood to be non-limiting and for illustrative purposes only. Each of the second node 112, the third nodes of the one or more third nodes 113, as well as each of the fifth nodes in the plurality of fifth nodes 115 may serve one or more wireless devices, or be wireless devices themselves.

[0069] The first node 111 may be configured to communicate within the communications system 100 with the fourth node 114 over a first link 141, e.g., a radio link, or a wired link. The first node 111 may be configured to communicate within the communications system 100 with the second node 112 over a second link 142, e.g., a radio link, or a wired link. The fourth node 114 may be configured to communicate within the communications system 100 with the plurality of fifth nodes 115 over a respective third link 143, e.g., a radio link, or a wired link. The first node 111 may be configured to communicate within the communications system 100 with any of the third nodes of the one or more third nodes 113 over a respective fourth link 144, e.g., a radio link. The second node 112 may be configured to communicate within the communications system 100 with the first wireless device 131 over a fifth link 145, e.g., a radio link. One of the fifth nodes 115 may be configured to communicate within the communications system 100 with the fourth node 114 over a respective sixth link 146, e.g., a radio link.

[0070] Any of the first link 141, the second link 142, the respective third link 143, the respective fourth link 144 and the fifth link 145 may be a direct link or may be comprised of a plurality of individual links, wherein it may go via one or more computer systems or one or more core networks in the communications system 100, which are not depicted in Figure 1, or it may go via an optional intermediate network. The intermediate network may be one of, or a combination of more than one of, a public, private or hosted network; the intermediate network, if any, may be a backbone network or the Internet; in particular, the intermediate network may comprise two or more sub-networks, which is not shown in Figure 1.

[0071] In general, the usage of “first”, “second”, “third”, “fourth” and / or “fifth” herein may be understood to be an arbitrary way to denote different elements or entities, and may be understood to not confer a cumulative or chronological character to the nouns they modify.

[0072] Some of the embodiments contemplated herein will now be described more fully with reference to the accompanying drawings. Other embodiments, however, are contained within the scope of the subject matter disclosed herein. The disclosed subject matter should not be construed as limited to only the embodiments set forth herein; rather, these embodiments are provided by way of example to convey the scope of the subject matter to those skilled in the art.

[0073] Some embodiments herein may relate to a method performed by the first node 111. Some embodiments herein may relate to a method performed by the fourth node 114.

[0074] Embodiments of a computer-implemented method, performed by the first node 111, will now be described with reference to the flowchart depicted in Figure 2. The method may be understood to be for handling data. The first node 111 operates in the communications system 100.

[0075] In some examples, the communications system 100 may be a 5G network.

[0076] Several embodiments are comprised herein. In some embodiments, all the actions may be performed. In some embodiments, some of the actions may be performed. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. A non-limiting example of the method performed by the first node 111 is depicted in Figure 2.

[0077] In some examples, the method may be performed by the first node 111 in real-time training.

[0078] In Figure 2, optional actions are represented with dashed lines.

[0079] Action 201

[0080] In this Action 201, the first node 111 obtains a first set of data from the second node 112 operating in the communications system 100. The second node 112 may be understood to be a source node (S) of the first set of data, that is, an input source, as described earlier. In the context of a mobile network, these sources may, for example, be different base stations (BS) in a RAN providing information to a centralized location such as an operations support system (OSS), or a User Equipment (UE).

[0081] Obtaining in this Action 201 may be understood as receiving, collecting or retrieving from a storage.

[0082] In one example, the data in the first set of data may comprise multiple independent observations, which may be referred to herein as "datapoints", that may be arranged in a specific sequence in case this data may be time-sensitive, e.g., based on a timestamp value. Each datapoint may comprise multiple “features”. In another example, the data may comprise one observation aggregating multiple independent observations using statistical tools, for example, an average value and standard deviation value. The period of observing the input data may differ, e.g., it may range from milliseconds, in case an input is one datapoint, to seconds or even minutes, in case an input comprises multiple datapoints.
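
As a non-limiting illustration, the two input shapes described above, namely a time-ordered sequence of datapoints each comprising multiple features, and a single observation aggregating several datapoints, may be sketched as follows. The names `Datapoint` and `aggregate`, as well as the choice of mean and population standard deviation as the statistical tools, are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List


@dataclass
class Datapoint:
    timestamp: float            # used only to order time-sensitive data
    features: Dict[str, float]  # feature name -> observed value


def aggregate(datapoints: List[Datapoint], feature: str) -> Dict[str, float]:
    """Collapse several observations of one feature into one aggregated observation."""
    ordered = sorted(datapoints, key=lambda d: d.timestamp)
    values = [d.features[feature] for d in ordered]
    return {"mean": mean(values), "std": pstdev(values), "count": len(values)}
```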

[0083] In some examples of embodiments herein, the obtaining of the first set of data may be via an endpoint (EP) of the first node 111.

[0084] The first set of data may be real-time data. The first node 111 may store the data of the first set of data, as it may arrive from the second node 112, in a temporary buffer (TB).

[0085] The first set of data may be for training a model, referred to herein as a first ML model (MLM), which may be, e.g., a discriminative model, such as a model performing classification.

[0086] According to embodiments herein, the first node 111, which may be also referred to herein as a control point (CP), may act as an intermediary between the infrastructure that may be used for training the first MLM and the input sources, such as the second node 112. By obtaining the first set of data in this Action 201, the first node 111 may be enabled to then determine if the first set of data may be used to train the first MLM or not, that is, to verify the data, as will be described later, and if so, initiate its use to train the first MLM. Otherwise, the first set of data may be rejected or discarded. Other actions, e.g., flagging the second node 112 as a potentially fraudulent source, may be taken with respect to the second node 112, as will also be described later.

[0087] Action 202

[0088] One task of the first node 111 may be understood to be to profile each source of data, such as the second node 112. In this Action 202, the first node 111 obtains a first profile of the second node 112 based on one or more first characteristics of the second node 112.

[0089] Obtaining in this Action 202 may be understood as calculating, receiving, or retrieving from a storage.

[0090] The profile may contain several characteristics, e.g., the one or more first characteristics, of the source that may be later used to check for legitimacy of the first set of data. In case the input source, that is, the second node 112, may be a UE, the one or more first characteristics may be for example those present in UE-Context, e.g., security, location and Quality of Service (QoS) information, in combination with measured values, such as traffic and mobility patterns. In addition, other features such as information about the manufacturer and model of UE may be part of the profile.

[0091] In case the input source, that is, the second node 112, may be a BS, the one or more first characteristics may be aggregate Performance Monitoring (PM) counters, such as physical resource block (PRB) utilization, average downlink / uplink throughput, average number of active mobile subscribers, etc.

[0092] The obtained first profile may be one of a plurality of first profiles attributable to the second node 112. The plurality of first profiles may be multiple and / or dynamic profiles. An input source such as the second node 112 may be a logical entity that may be attributed to a physical entity such as an RBS or a UE. To this end, it may also be possible to have multiple profiles for such an entity, as it may engage with various applications and services concurrently, and consequently have multiple input sources.

[0093] The first profile may also be updated periodically or on an event basis, e.g., after X amount of input from the second node 112, that is, the input source, may be received, X being a number set by design.

[0094] Action 203

[0095] In this Action 203, the first node 111 obtains, based on the obtained first profile, at least one of one or more first models capable of estimating a respective probability value distribution of data. That is, each of the one or more first models may be capable of estimating its respective, that is, its own, probability value distribution.

[0096] Some embodiments herein may relate to a method wherein multiple input sources may be considered, wherein the multiple input sources may provide real-time data for training a discriminative model, such as a model performing classification. The first node 111 may learn the probability value distribution of input features from different input sources (S), either sources locally available to the first node 111 and / or sources available via the one or more third nodes 113.

[0097] The one or more first models may be understood to be the models that may be used to check, either locally, by the first node 111 itself in Action 204, or elsewhere, by the one or more third nodes 113, as will be explained later, whether or not the first set of data may be outside the area of probability value distribution respectively estimated by the one or more first models.

[0098] That the first node 111 obtains at least one of the one or more first models may be understood to mean that, in some examples, the first node 111 may obtain all of the one or more first models in this Action 203. In other examples, the first node 111 may obtain one or some of the one or more first models in this Action 203. The first node 111 may then receive the remaining models of the one or more first models in Action 206.

[0099] The one or more first models, each referred to as theta (θx), may be understood to be the models trained to capture the probability value distribution of every input feature in input data such as the first set of data, provided by the sources, such as the second node 112.

[0100] The one or more first models may not necessarily be ML models. The one or more first models may be statistical models. In a simplest form, each of the one or more first models may be a regression type of model. In a more complex form, each of the one or more first models may be an artificial neural network, that may calculate the probability value distribution function of every input feature of the respective source, e.g., the second node 112. Any of the one or more first models may then be able to validate whether an input such as the first set of data may belong to the probability value distribution or may be out of range, indicating a malicious and / or erroneous and / or spontaneous type of datum that may need to be disregarded.
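
Purely as an illustrative sketch of the simplest, statistical form of a first model, a per-feature mean and standard deviation may be learned from historical input of one source, and a new datapoint may be treated as out of range when any feature deviates by more than a chosen number of standard deviations. The Gaussian assumption, the class name `GaussianFeatureModel` and the threshold `max_z` are assumptions of this sketch and are not prescribed by the embodiments herein.

```python
from statistics import mean, pstdev
from typing import Dict, List


class GaussianFeatureModel:
    """Per-feature mean/std learned from one source's history (illustrative only)."""

    def __init__(self, history: List[Dict[str, float]]):
        self.stats = {}
        for name in history[0]:
            values = [row[name] for row in history]
            self.stats[name] = (mean(values), pstdev(values))

    def within_distribution(self, datapoint: Dict[str, float], max_z: float = 3.0) -> bool:
        for name, (mu, sigma) in self.stats.items():
            if name not in datapoint:
                continue  # missing features are left to a separate completeness check
            deviation = abs(datapoint[name] - mu)
            if sigma == 0:
                if deviation > 0:
                    return False  # a constant feature admits no deviation
                continue
            if deviation / sigma > max_z:
                return False
        return True
```

In this sketch, a result of `False` corresponds to the input being out of range, that is, a candidate for being disregarded.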

[0101] The obtaining in this Action 203 may comprise retrieving the at least one of the one or more first models from a storage. That the first node 111 may obtain, e.g., retrieve, the at least one of the one or more first models based on the obtained first profile may be understood as that the first node 111 may retrieve the at least one of the one or more first models corresponding or matching the obtained first profile of the second node 112. For example, the obtaining in this Action 203 may comprise that the first node 111 , e.g., via the EP, may first retrieve the respective first profile of all input sources (S), from a source profile database. The first node 111 may then, as part of this Action 203, compare the first profile of the input source that provided the input data, that is, of the second node 112, to profiles of all other input sources. One way to do this comparison may be to vectorize the profile features and use a similarity measure such as cosine similarity. Eventually, the first node 111 , e.g., the EP, may downselect a small number of other input sources based on their matching to the first profile of the input source, that is, of the second node 112. The first node 111 may then, in this Action 203 obtain, e.g., retrieve, the respective first models corresponding to the downselected input sources.
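
The comparison and downselection described above may, as one non-limiting sketch, be expressed with vectorized profiles and cosine similarity. The function names, the number `k` of downselected sources and the example profile vectors are hypothetical; how profile features are encoded as numbers is left open here.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def downselect(target: List[float], others: Dict[str, List[float]], k: int = 2) -> List[str]:
    """Return the identifiers of the k sources whose profiles best match the target."""
    ranked = sorted(others, key=lambda s: cosine_similarity(target, others[s]), reverse=True)
    return ranked[:k]
```

The first models corresponding to the returned identifiers would then be the ones retrieved from the storage.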

[0102] In some embodiments, the first node 111 may retrieve the at least one of the one or more first models that it may have trained itself. In some embodiments, the obtaining in this Action 203 of the at least one of the one or more first models may comprise one of: a) training the at least one of the one or more first models by the first node 111 , and c) receiving the trained at least one of the one or more first models from the fourth node 114 operating in the communications system 100.

[0103] For every second node 112, or source node, in the communications system 100, a node such as the first node 111 may act as the CP. The task of the CP may be twofold. One task of the CP, as described in Action 202, may be understood to be the task of profiling each source. Another task, performed in this Action 203, may be understood to be to train models, the at least one of the one or more first models, one per source of input, to learn the probability value distribution of input data. The training of the one or more first models may be distributed, e.g., one or more of the one or more first models may be trained by the first node 111, while other models of the one or more first models may be trained by the one or more third nodes 113, and / or delegated to the fourth node 114, as will be described in relation to Figure 3.

[0104] In examples wherein the obtaining of the at least one of the one or more first models may comprise training the at least one of the one or more first models, the first node 111 may store the data of the first set of data, as it may arrive from the second node 112, in the TB.

[0105] The first node 111 may then periodically pull data from the TB to train a respective first model of the at least one of the one or more first models.
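
As a minimal sketch of the store-then-pull pattern described above, the TB may be modeled as a bounded first-in, first-out queue from which training batches are drained. The class name `TemporaryBuffer`, the capacity and the batch size are assumptions of this sketch; the scheduling of the periodic pull is not shown.

```python
from collections import deque
from typing import Deque, Dict, List


class TemporaryBuffer:
    """Bounded FIFO holding datapoints until they are pulled for training."""

    def __init__(self, maxlen: int = 10_000):
        # When full, the oldest datapoints are dropped first (a sketch-level choice).
        self._items: Deque[Dict[str, float]] = deque(maxlen=maxlen)

    def push(self, datapoint: Dict[str, float]) -> None:
        self._items.append(datapoint)

    def pull(self, batch_size: int) -> List[Dict[str, float]]:
        batch = []
        while self._items and len(batch) < batch_size:
            batch.append(self._items.popleft())
        return batch
```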

[0106] The training of the at least one of the one or more first models by the first node 111 may be performed by statistical methods or by ML methods.

[0107] In some examples, the first node 111 may retrieve the at least one of the one or more first models from the storage instead of calculating, e.g., training, a first model solely on the first set of data obtained from the second node 112. This may be understood to advantageously save time and resources.

[0108] In some examples, the communications system 100 may be bootstrapped for a newly-joining node. As previously explained, input sources may be distributed and may be available at different times. For example, if input sources such as the second node 112 are gNBs, then a newly commissioned gNB may need some time to train its own first model capturing the probability value distribution, and may rely on older, more seasoned gNBs for assessment. The source profile database may include a “seasonality” parameter as part of the input source profile. The seasonality parameter may be related to one or a combination of: time of operation, amount of input data and / or recent rate of input. The time of operation may be understood as the time from which the input source may have been operating. The amount of input data may be understood as the amount of data the input source may provide in terms of e.g., datapoints. The recent rate of input may be understood as a historical rate of input in terms of datapoints / unit of time, e.g., datapoints / hour, that may indicate how active the input source may have been, e.g., in the past X amount of time, e.g., in the last 3 days. The seasonality parameter may be used as a criterion for selection. For example, the more active an input source may have been recently, the more likely its probability value distribution model may be to be selected for evaluation.
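
One hypothetical way to combine the three ingredients of the seasonality parameter into a single selection score is sketched below. The equal weighting, the reference values used for normalization and the names `Seasonality` and `seasonality_score` are assumptions of this sketch only; the embodiments herein do not prescribe a particular formula.

```python
from dataclasses import dataclass


@dataclass
class Seasonality:
    hours_in_operation: float
    total_datapoints: int
    datapoints_per_hour: float  # measured over a recent window, e.g., the last 3 days


def seasonality_score(s: Seasonality,
                      ref_hours: float = 720.0,
                      ref_datapoints: int = 100_000,
                      ref_rate: float = 1_000.0) -> float:
    """Return a score in [0, 1]; higher means a more seasoned, more active source."""
    parts = (
        min(s.hours_in_operation / ref_hours, 1.0),
        min(s.total_datapoints / ref_datapoints, 1.0),
        min(s.datapoints_per_hour / ref_rate, 1.0),
    )
    return sum(parts) / len(parts)
```

Under these assumptions, a newly commissioned source scores close to 0 and would rarely be selected for evaluation, whereas a long-running, recently active source scores close to 1.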

[0109] By the first node 111 obtaining the at least one of one or more first models based on the obtained first profile, the first node 111 may be enabled to consult the probability value distributions it may have learned from other input sources that may match the first profile of the second node 112 to check if the first set of data may be outside the area of probability value distribution for the second node 112, based on the one or more first characteristics of the second node 112. The one or more first characteristics of the second node 112, that is, the source, contained in the first profile may be later used in Action 207, in conjunction with the trained one or more first models obtained in this Action 203, to check for the legitimacy of the first set of data obtained from the second node 112.

[0110] Action 204

[0111] In order to check for the legitimacy of the first set of data obtained from the second node 112, the first node 111 may consult the probability value distributions that other nodes, the one or more third nodes 113, may have learned from other input sources that may match the first profile of the second node 112.

[0112] For that purpose, in some embodiments, in this Action 204, the first node 111 may send a first indication of the first set of data to the one or more third nodes 113 operating in the communications system 100. The one or more third nodes 113 may be understood to be other nodes having a similar functionality to the first node 111. The one or more third nodes 113 may be other core network nodes, or radio network nodes, or wireless devices.

[0113] The first indication may be a message comprising the first set of data.

[0114] The sending of the first indication may be performed, e.g., via the EP, by forwarding the first set of data to the one or more third nodes.

[0115] By sending the first indication in this Action 204, the first node 111 may ask for the opinion of other nodes, that is, of the one or more third nodes 113, on the first set of data. This may then enable the first node 111 to determine what in the context of embodiments herein may be referred to as a consensus criterion or “consensus” among the first node 111 and the one or more third nodes 113 to decide whether or not the first set of data may be legitimate. It may be noted that a consensus criterion may not necessarily require unanimity. In one example, the consensus may be the majority vote. In other words, by performing Action 204, the first node 111 may enable crowdsourcing the verification of the first set of data.

[0116] In some examples, the sending of the first indication in this Action 204 may be performed whenever new data may be available at the second node 112, and, for example, based on the obtained first profile of the second node 112 the first set of data may have been flagged as potentially poisonous by the first node 111.

[0117] Action 205

[0118] To achieve the consensus, the other nodes, that is, the one or more third nodes 113, may also use their respective discriminators, to provide an opinion on the legitimacy of the first set of data.

[0119] In this Action 205, the first node 111 may obtain a respective second indication from the one or more third nodes 113. The respective second indication may indicate a respective result, referred to herein as a second result, of a respective determination of a respective third node of the one or more third nodes 113, using at least a respective subset of the one or more first models, of whether or not the first set of data falls within respective probability value distributions respectively estimated by the respective subset of the one or more first models.

[0120] In other words, in this Action 205 the one or more third nodes 113 may provide their own indication of whether the input provided from the source may be legitimate or not.

[0121] Each of the one or more third nodes 113 may use a respective subset of the one or more first models to evaluate the legitimacy of the first set of data. In other words, not all third nodes may use all of the first models, and not all of the third nodes may use the same first models, but each may use a respective subset of the one or more first models.

[0122] By obtaining the respective second indication in this Action 205, the first node 111 may obtain the opinion of other nodes, that is, of the one or more third nodes 113, on the first set of data. This may then enable the first node 111 to determine the consensus criterion or “consensus” among the first node 111 and the one or more third nodes 113 to decide whether or not the first set of data may be legitimate. In other words, by performing Action 205, the first node 111 may enable crowdsourcing the verification of the first set of data.

[0123] Action 206

[0124] In this Action 206, the first node 111 obtains, based on the one or more first models capable of estimating the respective probability value distribution of data, a respective first result of whether or not the first set of data falls within the respective probability value distributions respectively estimated with the one or more first models. The one or more first models correspond to the one or more first characteristics of the second node 112.

[0125] Obtaining in this Action 206 may be understood as determining.

[0126] In some examples, the obtaining in this Action 206 may comprise the first node 111 calculating whether or not the first set of data falls within the respective probability value distributions respectively estimated with all of the one or more first models.

[0127] In other examples, the obtaining in this Action 206 may comprise the first node 111 calculating whether or not the first set of data falls within the respective probability value distributions respectively estimated with the at least one of the one or more first models retrieved from the storage in Action 203. In some examples, the obtaining in this Action 206 may be based on locally trained first models. In such examples, the first node 111 may, e.g., via the EP, forward the first set of data to the models of the input sources downselected in Action 203.

[0128] In other examples, the obtaining in this Action 206 may comprise the first node 111 calculating whether or not the first set of data falls within the respective probability value distributions respectively estimated with the at least one of the one or more first models retrieved from the storage in Action 203 and determining the respective second results respectively reached by the one or more third nodes 113, and indicated by the respective second indications obtained in Action 205. By obtaining the respective first results of whether or not the first set of data falls within the respective probability value distributions respectively estimated with the one or more first models in this Action 206, the first node 111 may be enabled to determine in Action 207 whether the first set of data may be used to train the first MLM or not.

[0129] Action 207

[0130] In this Action 207, the first node 111 determines whether or not the first set of data is to be used as input to train the first MLM. The first MLM may be a discriminative model, such as a model performing classification. This Action 207 may be performed by a filtering function of the first node 111. The first node 111 , via the filtering function, may decide, in this Action 207, whether the first set of data may be forwarded for training of the use-case model or rejected.

[0131] The one or more first models may be used as discriminators in a sense in this Action 207, to be able to classify whether the first set of data may be considered during training of the first MLM or not.

[0132] Determining may be understood as calculating, deriving, estimating or similar.

[0133] The determining in this Action 207 is based on the obtained respective first results in Action 206.

[0134] In some embodiments, the determining in this Action 207 may be further based on the obtained respective second indications. That is, the determining may be based not only on local results obtained by the first node 111, but also on determinations respectively performed by the one or more third nodes 113.

[0135] In some embodiments, the determining in this Action 207 may be based on a consensus criterion among at least one of: the obtained respective first results and the obtained respective second results. In some examples wherein Action 205 may have been performed, the respective first results may also comprise the respective second results. In other examples, the respective first results may only comprise local results obtained by the first node 111.

[0136] In the example wherein the consensus may be the majority vote, if a majority of the respective second indications from the one or more third nodes 113 indicate “flags” of the first set of data being "potentially poisonous", then the first node 111 may determine in this Action 207 that the first set of data may not be used for training the first MLM, e.g., a predictive model. If not, then the first set of data may be added to the dataset to be used to train the first MLM.
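
The majority-vote example may be sketched, in a non-limiting manner, as follows. The function name `majority_accepts` is hypothetical. Treating a tie, or the absence of any indication, as acceptance follows from reading "majority" strictly, and is an assumption of this sketch.

```python
from typing import Iterable


def majority_accepts(flags: Iterable[bool]) -> bool:
    """flags: one entry per opinion, True meaning 'potentially poisonous'.

    Returns True when the data may be added to the training dataset,
    i.e., when the flags do not form a strict majority.
    """
    flags = list(flags)
    return sum(flags) * 2 <= len(flags)
```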

[0137] In some embodiments, the determining in this Action 207 may comprise applying a weighted factor to at least one of: the obtained respective first results and the obtained respective second results. In an example, a weighted average may be used. This may be the case, for example, wherein the opinion of one input source, e.g., one third node, may have greater weight than the opinion of another input source, e.g., another third node.

[0138] The weighted factor may be based on a closeness of a respective profile of the respective third node of the one or more third nodes 113 and the obtained first profile. The weight in this case, e.g., weighted average, may be assigned based on how close the source profile may match the input source that provided the input.
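
A weighted variant of the consensus may be sketched as below, with each opinion weighted by how closely the profile behind it matches the obtained first profile, e.g., a similarity value between 0 and 1. The function name `weighted_accepts`, the acceptance threshold of 0.5 and the conservative rejection when no weight is available are assumptions of this sketch.

```python
from typing import List, Tuple


def weighted_accepts(opinions: List[Tuple[bool, float]], threshold: float = 0.5) -> bool:
    """opinions: (within_distribution, weight) pairs, one per result.

    Returns True when the weighted share of 'within distribution' results
    reaches the threshold.
    """
    total = sum(weight for _, weight in opinions)
    if total == 0:
        return False  # no usable opinion: reject conservatively (sketch-level choice)
    support = sum(weight for ok, weight in opinions if ok)
    return support / total >= threshold
```

For instance, a single closely matching source that considers the data legitimate may outweigh two loosely matching sources that flag it, which a plain majority vote would not allow.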

[0139] In the process of validating model input, other information may be used by the first node 111 in addition to learning the probability value distribution and mapping the input values to this probability value distribution. In some embodiments, the determining in this Action 207 may be further based on one or more of: a time of arrival of the first set of data, a completeness of the first set of data, the obtained first profile, and an accuracy of the first set of data. One criterion may be the interarrival time of received input datapoints from the input source, e.g., the second node 112, compared to historical input interarrival time. Another criterion may be the completeness of the input, e.g., the percentage of input features reported in a single input datapoint.
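
The two additional criteria named above, completeness and interarrival time, may be illustrated with the following sketch. The function names, the representation of an unreported feature as a missing key or `None`, and the relative tolerance of 50% around the historical mean interarrival time are assumptions of this sketch.

```python
from statistics import mean
from typing import Dict, List, Optional


def completeness(datapoint: Dict[str, Optional[float]], expected: List[str]) -> float:
    """Fraction of the expected input features actually reported in one datapoint."""
    reported = sum(1 for name in expected if datapoint.get(name) is not None)
    return reported / len(expected)


def interarrival_ok(timestamps: List[float], historical_mean: float,
                    tolerance: float = 0.5) -> bool:
    """Compare the mean gap between received datapoints with the historical mean gap."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if not gaps:
        return True  # a single datapoint carries no interarrival information
    return abs(mean(gaps) - historical_mean) <= tolerance * historical_mean
```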

[0140] The decision in Action 207 may be based on comparing one or more aspects of the input, that is, of the first set of data, such as e.g., probability value distribution, datapoint interarrival time and / or completeness, with other input sources, the one or more third nodes 113, or nodes whose information may be analyzed by the one or more third nodes. The comparison may be based on a consensus algorithm that may itself vary, see previous description. In some examples, the decision may be forwarded to the EP that may act accordingly.

[0141] By the first node 111 determining in this Action 207 whether or not the first set of data is to be used as input to train the first MLM based on the obtained respective first results, the first node 111 may advantageously enable efficient and accurate validation of model training data for the first MLM, since the determination may be based on the obtained one or more first models having been obtained based on the one or more first characteristics of the second node 112. The determination in this Action 207 may be understood to be efficient since the first node 111 may use previously trained one or more first models that may just need to be retrieved whenever the first set of data may arrive, even if the second node 112 may be a new source of data, by matching the one or more first models to the obtained first profile of the second node 112, that is, based on the one or more first characteristics of the second node 112. The first node 111 may therefore be confronted with any set of data from any new source, and based on the profile of the source, use one or more corresponding first models, to decide whether the obtained data behaves as expected or deviates from that expectation, indicating that the first set of data may be fraudulent and / or the second node 112 may be malicious. The first node 111 may therefore enable the first MLM to be trained with verified data, and therefore enable the first MLM to provide more accurate predictions.

[0142] In some examples wherein there may be more than one first model matching the obtained first profile of the second node 112, the determining in this Action 207 may further enable that the verification of the first set of data may be crowdsourced, either based on local sources, and / or based on remotely located sources via the one or more third nodes 113.

[0143] Action 208

[0144] In this Action 208, the first node 111 provides, based on a result of the determining in Action 207, the first set of data as input to train the first MLM.

[0145] With the proviso that a result of the determination in Action 207 is that the first set of data falls, e.g., according to the consensus criterion, within the respective probability value distributions respectively estimated with the one or more first models, the first node 111, in this Action 208, may provide the first set of data as input to train the first MLM.

[0146] With the proviso that a result of the determination in Action 207 is that the first set of data does not fall, e.g., according to the consensus criterion, within the respective probability value distributions respectively estimated with the one or more first models, the first node 111, in this Action 208, may refrain from providing the first set of data as input to train the first MLM. The first node 111 may instead discard the first set of data.

[0147] Action 209

[0148] The result of the determining of Action 207 may be understood to be a third result. In some embodiments, with the proviso that the third result is that the first set of data is not to be used as input to train the first MLM, and that the first node 111 has previously determined the same third result for previous sets of data obtained from the second node 112, in this Action 209, the first node 111 may refrain from accepting further data from the second node 112 for one or more of: a period of time and a set of observations.

[0149] The refraining in this Action 209 may be understood as quarantining the second node 112, that is, the input source. In some examples, in case of multiple REJECT messages, the first node 111, e.g., via the filtering function, may decide to quarantine the input source. In examples wherein the EP may be implemented, the filtering function may signal the EP.
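The quarantining of Action 209 could be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the class name `QuarantineTracker`, the threshold `max_rejects`, the duration `quarantine_seconds`, and the choice to reset the counter after an accepted set are all hypothetical, since the text does not specify how many REJECT results trigger quarantine or how long it lasts.

```python
from dataclasses import dataclass, field


@dataclass
class QuarantineTracker:
    """Counts REJECT results per input source and quarantines repeat offenders."""
    max_rejects: int = 3                # hypothetical number of REJECTs before quarantine
    quarantine_seconds: float = 3600.0  # hypothetical "period of time"
    _rejects: dict = field(default_factory=dict, init=False, repr=False)
    _blocked_until: dict = field(default_factory=dict, init=False, repr=False)

    def is_quarantined(self, source_id: str, now: float) -> bool:
        # A source is refused while the current time is before its release time.
        return now < self._blocked_until.get(source_id, 0.0)

    def record(self, source_id: str, accepted: bool, now: float) -> None:
        if accepted:
            # Assumption: an accepted set clears the running REJECT count.
            self._rejects[source_id] = 0
            return
        count = self._rejects.get(source_id, 0) + 1
        self._rejects[source_id] = count
        if count >= self.max_rejects:
            self._blocked_until[source_id] = now + self.quarantine_seconds
            self._rejects[source_id] = 0
```

In such a sketch, the EP might call `is_quarantined` before accepting a set of data and `record` after each decision of the filtering function. A quarantine expressed as a set of observations rather than a period of time would need a counter instead of a timestamp.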

[0150] The first node 111, e.g., via the EP comprised in the first node 111, may reject input for at least some time, and / or for the set of observations, from the second node 112.

Embodiments of a computer-implemented method performed by the fourth node 114 will now be described with reference to the flowchart depicted in Figure 3. The method may be understood to be for handling data. The fourth node 114 operates in the communications system 100.

[0151] The method may comprise the following actions. Several embodiments are comprised herein. In some embodiments, the method may comprise all the actions. In other embodiments, the method may comprise some of the actions. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. It should be noted that the examples herein are not mutually exclusive. Components from one example may be tacitly assumed to be present in another example and it will be obvious to a person skilled in the art how those components may be used in the other examples. In Figure 3, optional actions are depicted with dashed lines.

[0152] The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the first node 111 and will thus not be repeated here to simplify the description. For example, any of the first models may be ML models or statistical models.

[0153] The fourth node 114 may be understood to be a node that may train the one or more first models, if they are not trained by the first node 111 itself.

[0154] Action 301

[0155] In this Action 301, the fourth node 114 obtains a respective plurality of second sets of data from the plurality of fifth nodes 115 operating in the communications system 100. That is, each fifth node 115 may provide its own plurality of second sets of data. What may constitute a second “set” of data may be configurable, e.g., how much data may need to be comprised in the set to be considered as such. For example, a second set may be throughput data over 1 hr, and a plurality of second sets may be the throughput data for every hour over a one-day period. However, the skilled person may understand that the amount of data obtained from every fifth node in the plurality of fifth nodes 115 may be understood to be enough to estimate a probability value distribution.
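The configurable notion of a "set" in the throughput example could be illustrated as follows. This is a minimal sketch, assuming that samples arrive as `(timestamp_seconds, value)` pairs; the function name `split_into_sets` and the parameter `window_seconds` are hypothetical and do not appear in the disclosure.

```python
from collections import defaultdict


def split_into_sets(samples, window_seconds=3600):
    """Group (timestamp_seconds, value) samples into consecutive fixed-length windows.

    With the default window of 3600 s, one day of throughput samples
    yields up to 24 second sets of data, one per hour that has data.
    """
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[int(timestamp // window_seconds)].append(value)
    # Return the sets in chronological order; empty windows are simply absent.
    return [buckets[key] for key in sorted(buckets)]
```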

[0156] Each of the fifth nodes in the plurality of fifth nodes 115 may be understood to be a source node (S).

[0157] In some examples, at least one of the fifth nodes 115 in the plurality of fifth nodes 115 may be the second node 112.

[0158] The obtaining in this Action 301 may comprise collecting, receiving, retrieving or similar.

[0159] This Action 301 may be understood to be performed in a similar manner as Action 201 as described for the first node 111.

Action 302

[0160] In this Action 302, the fourth node 114 obtains a respective third profile of the fifth nodes in the plurality of fifth nodes 115 based on respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes 115. That is, in this Action 302, input sources S may be profiled.

[0161] This Action 302 may be understood to be similar to Action 202 as described for the first node 111 in relation to the first profile of the second node 112.

[0162] Every third profile may be understood as a profile of a respective fifth node in the plurality of fifth nodes 115.

[0163] In some embodiments, the obtained respective third profiles may comprise a plurality of respective third profiles for at least one of the fifth nodes in the plurality of fifth nodes 115 attributable to the at least one of the fifth nodes in the plurality of fifth nodes 115 based on one or more contexts.

[0164] Action 303

[0165] In this Action 303, the fourth node 114 determines a respective first model capable of estimating a probability value distribution of the respective plurality of second sets of data. That is, in this Action 303, the input data obtained from every source may be used for learning a respective probability value distribution (θ).

[0166] In some embodiments, the determining in this Action 303 of the respective first models may be performed by training a second MLM. That is, in some examples the second MLM may be trained to estimate the respective first model capable of estimating the probability value distribution of the respective plurality of second sets of data of any fifth node 115.
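Since the first models may be ML models or statistical models, one simple statistical instance can be sketched as follows. This is an illustration only and not the disclosed method: it assumes a single numeric feature, a Gaussian fitted to the pooled samples, and a hypothetical acceptance rule (`z_max`, `min_fraction`) for deciding whether a set of data falls within the estimated distribution. All names are hypothetical.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianModel:
    """A toy first model: a univariate Gaussian summarised by mean and stdev."""
    mean: float
    stdev: float

    def contains(self, data_set, z_max=3.0, min_fraction=0.95):
        """True if enough points of data_set lie within z_max stdevs of the mean."""
        inside = sum(abs(x - self.mean) <= z_max * self.stdev for x in data_set)
        return inside / len(data_set) >= min_fraction


def fit_model(second_sets):
    """Fit one Gaussian to all samples pooled from a source's second sets of data."""
    samples = [x for data_set in second_sets for x in data_set]
    # statistics.stdev needs at least two samples.
    return GaussianModel(statistics.fmean(samples), statistics.stdev(samples))
```

A model trained this way for one fifth node could later be asked, for a set of data from a node with a matching profile, whether that set behaves as expected. Real deployments would likely need multivariate or learned density models.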

[0167] This Action 303 may be understood to correspond to Action 403 described in relation to Figure 3.

[0168] Action 304

[0169] In this Action 304, the fourth node 114 creates a third indication of a correspondence between the obtained respective third profile and the determined respective first model for every respective fifth node of the plurality of fifth nodes 115. The third indication may be a link between the respective third profile of a given fifth node with its corresponding determined respective first model.

[0170] The created third indication may be stored in a storage or memory, e.g., a database.
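The third indication may be pictured as a record linking a profile to a model. The following is a minimal in-memory sketch, assuming that profiles are stored as numeric tuples and that models are referenced by a key; the names `Correspondence`, `SourceProfileDB`, `link`, and `model_for` are hypothetical, and an actual embodiment may use any storage or database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One third indication: a source's profile linked to its first model."""
    source_id: str
    profile: tuple   # hypothetical vectorized encoding of the third profile
    model_key: str   # hypothetical key under which the first model is stored


class SourceProfileDB:
    """In-memory stand-in for the source profile database."""

    def __init__(self):
        self._rows = {}

    def link(self, source_id, profile, model_key):
        # Creating the indication overwrites any earlier link for the same source.
        self._rows[source_id] = Correspondence(source_id, tuple(profile), model_key)

    def model_for(self, source_id):
        return self._rows[source_id].model_key

    def all_links(self):
        return list(self._rows.values())
```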

[0171] Action 305

[0172] In this Action 305, the fourth node 114 provides the determined respective first models, the obtained respective third profiles and the third indication of the correspondence to the first node 111 operating in the communications system 100. As stated earlier, the first node 111 and the fourth node 114 may be the same node in some examples.

[0173] By providing the determined respective first models to the first node 111, the first node 111 may then be enabled to determine in Action 207, whether or not the first set of data may be used as input to train a first MLM.

[0174] It may be noted that the method performed by the first node 111 and the method performed by the fourth node 114, as described in relation to Figure 2 and Figure 3, respectively, may be executed in parallel, upon incoming data traffic.

[0175] Some embodiments herein will now be further described with some non-limiting examples, which may be combined with the embodiments just described.

[0176] Figure 4 is a schematic diagram illustrating some aspects of embodiments herein as performed by the fourth node 114, which may, in some examples, be the same node as the first node 111, and also a CP. As stated earlier, some embodiments herein may relate to a system and method wherein multiple input sources may be considered, wherein the multiple input sources may provide real-time data for training a discriminative model, such as a model performing classification. In the non-limiting example depicted in Figure 4, the fourth node 114 is a CP that may learn the probability value distribution of input features from different input sources S1 ... Sx. S may be understood to be a source. In the context of a mobile network, the sources S may, for example, be different base stations (BS) in a RAN providing information to a centralized location such as an operations support system (OSS), or different User Equipment (UE). Particularly, Figure 4 is a schematic diagram illustrating the part of the background process where the plurality of fifth nodes 115, that is, a plurality of input sources S1 ... Sx, may be profiled according to Action 302 and the respective plurality of second sets of data, that is, their input data, may be used for learning a respective probability value distribution according to Action 303. In Figure 4, TB (TB1 ... TBx) may be understood to be a respective temporary buffer 401 and theta (θ1 ... θx) may be understood to be the respective first models for every respective fifth node of the plurality of fifth nodes 115, trained in Action 303 to capture the probability value distribution of every input feature in the respective input data provided by the sources S. The third indication of the correspondence between the obtained respective third profile and the determined respective first model for every respective fifth node of the plurality of fifth nodes 115 may be created according to Action 304 and stored in a source profile database of the fourth node 114.

[0177] Figure 5 is a schematic diagram illustrating some aspects of embodiments herein as performed by the first node 111. Particularly, Figure 5 is a schematic diagram illustrating a non-limiting example of the first node 111 in operation. Particularly, Figure 5 illustrates the block components of the disclosed system in operation. In Step 1, according to Action 201, the input data may be sent by the second node 112, that is, an input source Si, to an endpoint (EP) of the first node 111. When the input data arrives, the first node 111 may consult probability value distributions it may have learned from other input sources that may match the profile of the current input source. In Step 2, the first node 111, via the endpoint, may first retrieve the respective profile of all input sources from a source profile database 501, and, according to Action 203, may compare the profile of the input source that provided the input data, as obtained according to Action 202, to profiles of all other input sources. One way to do this comparison may be to vectorize the profile features and use a similarity measure such as cosine similarity. Eventually, the first node 111, via the EP, may downselect a small number of other input sources based on their matching to the profile of the input source Si. In Step 3, the first node 111, via the EP, may then forward the first set of data to the models (θ1 ... θx) of the downselected input sources, that is, the one or more first models, which, according to Action 206, may provide their own indication of whether the input provided from the second node 112 may be legitimate or not. A filtering function may then, according to Action 207, decide whether the first set of data may be forwarded for training of the first MLM, that is, the use case model, or rejected. The decision may be based on comparing one or more aspects of the first set of data, e.g., probability value distribution, datapoint interarrival time and / or completeness, with other input sources. The comparison may be based on a consensus algorithm that may itself vary, see previous description. The decision may be forwarded to the EP that may act accordingly. If the consensus is that the output is outside of the probability value distribution of the other sources, then the first set of data may be blocked according to Action 208, meaning that it may not be forwarded to train the first MLM. The first node 111 may decide to signal to the EP to accept or reject the first set of data, depicted as “input data” in Figure 5. In any other case, the input value may be used for training, and the first node 111 may then forward the first set of data to a model trainer 502 for training of the first MLM. It may be noted that both the background part illustrated in Figure 4 and the real-time training part illustrated in Figure 5 may be executed in parallel, upon incoming data traffic. The EP and Filtering functions may be logical entities and may be independently hosted in physical infrastructure.
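The profile comparison of Step 2, vectorizing the profile features and ranking stored sources by cosine similarity, could be sketched as follows. The numeric encoding of a profile, the function names, and the number `k` of downselected sources are assumptions made for illustration; the text leaves them open.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A zero vector has no direction; treat it as matching nothing.
    return dot / norm if norm else 0.0


def downselect(query_profile, stored_profiles, k=3):
    """Return the ids of the k stored sources most similar to query_profile.

    stored_profiles maps a source id to its vectorized profile.
    """
    ranked = sorted(
        stored_profiles.items(),
        key=lambda item: cosine_similarity(query_profile, item[1]),
        reverse=True,
    )
    return [source_id for source_id, _ in ranked[:k]]
```

The returned ids would then be used to look up the corresponding first models, to which the first set of data is forwarded in Step 3.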

[0178] Figure 6 depicts an example of the arrangement that the first node 111 may comprise to perform the method actions described above in relation to Figure 2 and / or any of Figures 4-5. The first node 111 may be understood to be for handling data. The first node 111 is configured to operate in the communications system 100.

[0179] Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description.

[0180] Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In Figure 6, optional components are depicted with dashed lines. The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the first node 111 and will thus not be repeated here. For example, any of the first models may be configured to be ML models or statistical models.

[0181] The first node 111 is configured to obtain the first set of data from the second node 112 configured to operate in the communications system 100.

[0182] The first node 111 is also configured to obtain the first profile of the second node 112 based on the one or more first characteristics of the second node 112.

[0183] The first node 111 is further configured to obtain, based on the first profile configured to be obtained, the at least one of the one or more first models configured to be capable of estimating the respective probability value distribution of data.

[0184] The first node 111 is also configured to obtain, based on the one or more first models capable of estimating the respective probability value distribution of data, the respective first result of whether or not the first set of data falls within the respective probability value distributions configured to be respectively estimated with the one or more first models. The one or more first models are configured to correspond to the one or more first characteristics of the second node 112.

[0185] The first node 111 is further configured to determine whether or not the first set of data is to be used as input to train the first MLM. The determining is configured to be based on the respective first results configured to be obtained.

[0186] The first node 111 is also configured to provide, based on the result of the determining, the first set of data as input to train the first MLM.

[0187] In some embodiments, the first node 111 may be further configured to send the first indication of the first set of data to the one or more third nodes 113 configured to operate in the communications system 100.

[0188] In some embodiments, the first node 111 may be further configured to obtain the respective second indication from the one or more third nodes 113. The respective second indication may be configured to indicate the respective second result of the respective determination of the respective third node of the one or more third nodes 113. The respective determination may be, using at least the respective subset of the one or more first models, of whether or not the first set of data falls within the respective probability value distributions configured to be respectively estimated by the respective subset of the one or more first models. The determining may be configured to be further based on the respective second indications configured to be obtained.

[0189] In some embodiments, the determining may be configured to be based on the consensus criterion among at least one of: the respective first results configured to be obtained and the respective second results configured to be obtained.

[0190] In some embodiments, the determining may be configured to comprise applying the weighted factor to at least one of: the respective first results configured to be obtained and the respective second results configured to be obtained.

[0191] In some embodiments, the weighted factor may be configured to be based on the closeness of the respective profile of the respective third node of the one or more third nodes 113 and the first profile configured to be obtained.
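A consensus criterion with a weighted factor could, for example, take the form of a weighted vote. The sketch below is one possible reading and not the disclosed algorithm: it assumes each result is a boolean "falls within the distribution" paired with a weight, such as a profile similarity score, and that the set is accepted when the weighted share of positive results exceeds a hypothetical `threshold`.

```python
def weighted_consensus(results, threshold=0.5):
    """Decide acceptance from a list of (within_distribution, weight) pairs.

    Weights might reflect how close each contributing node's profile is
    to the first profile (an assumption). Returns True when the weighted
    share of positive results is strictly greater than threshold.
    """
    total = sum(weight for _, weight in results)
    if total <= 0:
        # No usable votes: conservatively reject.
        return False
    accepted = sum(weight for ok, weight in results if ok)
    return accepted / total > threshold
```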

[0192] In some embodiments, the first profile configured to be obtained may be configured to be one of the plurality of first profiles configured to be attributable to the second node 112.

[0193] In some embodiments, the determining may be further configured to be based on one or more of: the time of arrival of the first set of data, the completeness of the first set of data, the first profile configured to be obtained, and the accuracy of the first set of data.

[0194] In some embodiments, the first node 111 may be further configured to send

[0195] In some embodiments, the result of the determining may be configured to be the third result. With the proviso the third result may be that the first set of data is not to be used as input to train the first MLM, and the first node 111 may have previously determined the same third result for previous sets of data obtained from the second node 112, in some embodiments, the first node 111 may be further configured to refrain from accepting further data from the second node 112 for one or more of: the period of time and the set of observations.

[0196] In some embodiments, the obtaining of the at least one of the one or more first models may be configured to comprise one of: a) training the at least one of the one or more first models by the first node 111, and b) receiving the trained at least one of the one or more first models from the fourth node 114 configured to operate in the communications system 100.

[0197] The embodiments herein in the first node 111 may be implemented through one or more processors, such as a processing circuitry 601 in the first node 111 depicted in Figure 6, together with computer program code for performing the functions and actions of the embodiments herein. A processor, as used herein, may be understood to be a hardware component. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the first node 111. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the first node 111.

[0198] The first node 111 may further comprise a memory 602 comprising one or more memory units. The memory 602 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the first node 111.

[0199] In some embodiments, the first node 111 may receive information from, e.g., the second node 112, the one or more third nodes 113, the fourth node 114, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and / or another structure in the communications system 100, through a receiving port 603. In some embodiments, the receiving port 603 may be, for example, connected to one or more antennas in the first node 111. In other embodiments, the first node 111 may receive information from another structure in the communications system 100 through the receiving port 603. Since the receiving port 603 may be in communication with the processing circuitry 601, the receiving port 603 may then send the received information to the processing circuitry 601. The receiving port 603 may also be configured to receive other information.

[0200] The processing circuitry 601 in the first node 111 may be further configured to transmit or send information to, e.g., the second node 112, the one or more third nodes 113, the fourth node 114, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and / or another structure in the communications system 100, through a sending port 604, which may be in communication with the processing circuitry 601, and the memory 602.

[0201] Those skilled in the art will also appreciate that the units comprised within the first node 111 described above as being configured to perform different actions, may refer to a combination of analog and digital circuits, and / or one or more processors configured with software and / or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processing circuitry 601, perform as described herein. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).

[0202] Also, in some embodiments, the first node 111 may be configured to perform the actions of Figure 2 and / or any of Figures 4-5 with respective units that may be implemented as one or more applications running on one or more processors such as the processing circuitry 601. The first node 111 may be configured to perform any of the Actions described in relation to Figure 2 and / or any of Figures 4-5, e.g., by means of the processing circuitry 601 within the first node 111, configured to perform any of such actions.

[0203] Thus, the methods according to the embodiments described herein for the first node 111 may be respectively implemented by means of a computer program 605 product, comprising instructions, i.e., software code portions, which, when executed on at least one processing circuitry 601, cause the at least one processing circuitry 601 to carry out the actions described herein, as performed by the first node 111. The computer program 605 product may be stored on a computer-readable storage medium 606. The computer-readable storage medium 606, having stored thereon the computer program 605, may comprise instructions which, when executed on at least one processing circuitry 601, cause the at least one processing circuitry 601 to carry out the actions described herein, as performed by the first node 111. In some embodiments, the computer-readable storage medium 606 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 605 product may be stored on a carrier containing the computer program 605 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 606, as described above.

[0204] The first node 111 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the first node 111 and other nodes or devices, e.g., the second node 112, the one or more third nodes 113, the fourth node 114, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and / or another structure in the communications system 100. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.

[0205] In other embodiments, the first node 111 may comprise a radio circuitry 607, which may comprise e.g., the receiving port 603 and the sending port 604.

[0206] The radio circuitry 607 may be configured to set up and maintain at least a wireless connection with the second node 112, the one or more third nodes 113, the fourth node 114, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and / or another structure in the communications system 100. Circuitry may be understood herein as a hardware component.

[0207] Hence, embodiments herein also relate to the first node 111 operative to operate in the communications system 100. The first node 111 may comprise the processing circuitry 601 and the memory 602, said memory 602 containing instructions executable by said processing circuitry 601, whereby the first node 111 is further operative to perform the actions described herein in relation to the first node, e.g., in Figure 2 and / or any of Figures 4-5.

[0208] Figure 7 depicts an example of the arrangement that the fourth node 114 may comprise to perform the method actions described above in relation to Figure 3 and / or Figures 4-5. The fourth node 114 may be understood to be for handling data. The fourth node 114 is configured to operate in the communications system 100.

[0209] Several embodiments are comprised herein. It should be noted that the examples herein are not mutually exclusive. One or more embodiments may be combined, where applicable. All possible combinations are not described to simplify the description. Components from one embodiment may be tacitly assumed to be present in another embodiment and it will be obvious to a person skilled in the art how those components may be used in the other exemplary embodiments. In Figure 7, optional components are depicted with dashed lines. The detailed description of some of the following corresponds to the same references provided above, in relation to the actions described for the fourth node 114 and will thus not be repeated here. For example, any of the first models may be configured to be ML models or statistical models.

[0210] The fourth node 114 is configured to obtain the respective plurality of second sets of data from the plurality of fifth nodes 115 configured to operate in the communications system 100.

[0211] The fourth node 114 is also configured to obtain the respective third profile of the fifth nodes in the plurality of fifth nodes 115 based on the respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes 115.

[0212] The fourth node 114 is further configured to determine the respective first model capable of estimating the probability value distribution of the respective plurality of second sets of data.

[0213] The fourth node 114 is also configured to create the third indication of the correspondence between the respective third profile configured to be obtained and the respective first model configured to be determined for every respective fifth node of the plurality of fifth nodes 115.

[0214] The fourth node 114 is further configured to provide the respective first models configured to be determined, the respective third profiles configured to be obtained and the third indication of the correspondence to the first node 111 configured to operate in the communications system 100.

[0215] In some embodiments, the respective third profiles configured to be obtained may be configured to comprise the plurality of respective third profiles for at least one of the fifth nodes in the plurality of fifth nodes 115 configured to be attributable to the at least one of the fifth nodes in the plurality of fifth nodes 115 based on the one or more contexts. In some embodiments, the determining of the respective first models may be configured to be performed by training the second MLM.

[0216] The embodiments herein in the fourth node 114 may be implemented through one or more processors, such as a processing circuitry 701 in the fourth node 114 depicted in Figure 7, together with computer program code for performing the functions and actions of the embodiments herein. A processor, as used herein, may be understood to be a hardware component. The program code mentioned above may also be provided as a computer program product, for instance in the form of a data carrier carrying computer program code for performing the embodiments herein when being loaded into the fourth node 114. One such carrier may be in the form of a CD ROM disc. It is however feasible with other data carriers such as a memory stick. The computer program code may furthermore be provided as pure program code on a server and downloaded to the fourth node 114.

[0217] The fourth node 114 may further comprise a memory 702 comprising one or more memory units. The memory 702 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the fourth node 114.

[0218] In some embodiments, the fourth node 114 may receive information from, e.g., the first node 111, the second node 112, the one or more third nodes 113, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and / or another structure in the communications system 100, through a receiving port 703. In some embodiments, the receiving port 703 may be, for example, connected to one or more antennas in the fourth node 114. In other embodiments, the fourth node 114 may receive information from another structure in the communications system 100 through the receiving port 703. Since the receiving port 703 may be in communication with the processing circuitry 701, the receiving port 703 may then send the received information to the processing circuitry 701. The receiving port 703 may also be configured to receive other information.

[0219] The processing circuitry 701 in the fourth node 114 may be further configured to transmit or send information to, e.g., the first node 111, the second node 112, the one or more third nodes 113, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and / or another structure in the communications system 100, through a sending port 704, which may be in communication with the processing circuitry 701, and the memory 702.

[0220] Those skilled in the art will also appreciate that the units comprised within the fourth node 114 described above as being configured to perform different actions, may refer to a combination of analog and digital circuits, and / or one or more processors configured with software and / or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processing circuitry 701, perform as described herein. One or more of these processors, as well as the other digital hardware, may be included in a single Application-Specific Integrated Circuit (ASIC), or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).

[0221] Also, in some embodiments, the fourth node 114 may be configured to perform the actions of Figure 3 and/or any of Figures 4-5 with respective units that may be implemented as one or more applications running on one or more processors such as the processing circuitry 701.

[0222] The fourth node 114 may be configured to perform any of the Actions described in relation to Figure 3 and/or any of Figures 4-5, e.g., by means of the processing circuitry 701 within the fourth node 114, configured to perform any of such actions.

[0223] Thus, the methods according to the embodiments described herein for the fourth node 114 may be respectively implemented by means of a computer program 705 product, comprising instructions, i.e., software code portions, which, when executed on at least one processing circuitry 701, cause the at least one processing circuitry 701 to carry out the actions described herein, as performed by the fourth node 114. The computer program 705 product may be stored on a computer-readable storage medium 706. The computer-readable storage medium 706, having stored thereon the computer program 705, may comprise instructions which, when executed on at least one processing circuitry 701, cause the at least one processing circuitry 701 to carry out the actions described herein, as performed by the fourth node 114. In some embodiments, the computer-readable storage medium 706 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. In other embodiments, the computer program 705 product may be stored on a carrier containing the computer program 705 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the computer-readable storage medium 706, as described above.
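As a purely illustrative, non-limiting sketch, the software code portions of such a computer program may resemble the following for the fourth node 114, loosely following Actions 301-305 as recited in the claims. The names `GaussianModel`, `build_models`, `Profile` and the choice of a univariate Gaussian with a three-sigma bound are assumptions made for illustration only; the disclosure does not prescribe a particular model family, data layout or threshold.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Dict, List, Tuple

# Hypothetical representation: a profile is a tuple of characteristic labels.
Profile = Tuple[str, ...]


@dataclass(frozen=True)
class GaussianModel:
    """Assumed stand-in for a 'first model': a univariate Gaussian fit."""
    mu: float
    sigma: float

    def within(self, values: List[float], k: float = 3.0) -> bool:
        """Return True if every value lies within k standard deviations of mu."""
        return all(abs(v - self.mu) <= k * self.sigma for v in values)


def build_models(
    data_by_node: Dict[str, List[List[float]]],
    profile_by_node: Dict[str, Profile],
) -> Dict[str, Tuple[Profile, GaussianModel]]:
    """Fit one model per fifth node and record the profile-to-model correspondence.

    data_by_node:    second sets of data obtained per fifth node (cf. Action 301)
    profile_by_node: third profile obtained per fifth node (cf. Action 302)
    Returns, per fifth node, the (profile, model) pair that could be provided
    to a first node (cf. Actions 303-305).
    """
    correspondence: Dict[str, Tuple[Profile, GaussianModel]] = {}
    for node_id, data_sets in data_by_node.items():
        # Pool all samples of this node; stdev() needs at least two samples.
        samples = [v for data_set in data_sets for v in data_set]
        model = GaussianModel(mu=mean(samples), sigma=stdev(samples))
        correspondence[node_id] = (profile_by_node[node_id], model)
    return correspondence


if __name__ == "__main__":
    result = build_models(
        {"fifth-1": [[1.0, 2.0, 3.0], [2.0]]},
        {"fifth-1": ("urban", "macro-cell")},
    )
    profile, model = result["fifth-1"]
    print(profile, model.within([2.5]), model.within([100.0]))
```

In this sketch the correspondence is keyed by fifth node so that several nodes sharing the same profile do not overwrite each other's models; other layouts may be equally suitable.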

[0224] The fourth node 114 may comprise a communication interface configured to facilitate, or an interface unit to facilitate, communications between the fourth node 114 and other nodes or devices, e.g., the first node 111, the second node 112, the one or more third nodes 113, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and/or another structure in the communications system 100. The interface may, for example, include a transceiver configured to transmit and receive radio signals over an air interface in accordance with a suitable standard.

[0225] In other embodiments, the fourth node 114 may comprise a radio circuitry 707, which may comprise e.g., the receiving port 703 and the sending port 704.

[0226] The radio circuitry 707 may be configured to set up and maintain at least a wireless connection with the first node 111, the second node 112, the one or more third nodes 113, the plurality of fifth nodes 115, one or more radio network nodes, wireless devices, such as the first wireless device 131 and the second wireless device 132, another node or user equipment, and/or another structure in the communications system 100. Circuitry may be understood herein as a hardware component.

[0227] Hence, embodiments herein also relate to the fourth node 114 operative to operate in the communications system 100. The fourth node 114 may comprise the processing circuitry 701 and the memory 702, said memory 702 containing instructions executable by said processing circuitry 701, whereby the fourth node 114 is further operative to perform the actions described herein in relation to the fourth node 114, e.g., in Figure 3 and/or any of Figures 4-5.

[0228] When using the word “comprise” or “comprising”, it shall be interpreted as non-limiting, i.e., meaning “consist at least of”.

[0229] The embodiments herein are not limited to the above-described preferred embodiments. Various alternatives, modifications and equivalents may be used. Therefore, the above embodiments should not be taken as limiting the scope of the invention.

[0230] Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step. Any feature of any of the embodiments disclosed herein may be applied to any other embodiment, wherever appropriate. Likewise, any advantage of any of the embodiments may apply to any other embodiments, and vice versa. Other objectives, features and advantages of the enclosed embodiments will be apparent from the following description.

[0231] As used herein, the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “and” term, may be understood to mean that only one of the list of alternatives may apply, more than one of the list of alternatives may apply or all of the list of alternatives may apply. This expression may be understood to be equivalent to the expression “at least one of:” followed by a list of alternatives separated by commas, and wherein the last alternative is preceded by the “or” term.

[0232] Any of the terms processor and circuitry may be understood herein as a hardware component. As used herein, the expression “in some embodiments” has been used to indicate that the features of the embodiment described may be combined with any other embodiment or example disclosed herein.

[0233] As used herein, the expression “in some examples” has been used to indicate that the features of the example described may be combined with any other embodiment or example disclosed herein.

Claims

1. A computer-implemented method performed by a first node (111), the method being for handling data, the first node (111) operating in a communications system (100), the method comprising:
- obtaining (201) a first set of data from a second node (112) operating in the communications system (100),
- obtaining (202) a first profile of the second node (112) based on one or more first characteristics of the second node (112),
- obtaining (203), based on the obtained first profile, at least one of one or more first models capable of estimating a respective probability value distribution of data,
- sending (204) a first indication of the first set of data to one or more third nodes (113) operating in the communications system (100),
- obtaining (205) a respective second indication from the one or more third nodes (113), the respective second indication indicating a respective second result of a respective determination of a respective third node of the one or more third nodes (113), using at least a respective subset of the one or more first models, of whether or not the first set of data falls within respective probability value distributions respectively estimated by the respective subset of the one or more first models,
- obtaining (206), based on the one or more first models capable of estimating the respective probability value distribution of data, a respective first result of whether or not the first set of data falls within the respective probability value distributions respectively estimated with the one or more first models, wherein the one or more first models correspond to the one or more first characteristics of the second node (112),
- determining (207) whether or not the first set of data is to be used as input to train a first Machine Learning model, MLM, the determining (207) being based on the obtained respective first results, and
- providing (208), based on a result of the determining (207), the first set of data as input to train the first MLM.

2. The method according to claim 1 , wherein determining (207) is further based on the obtained respective second indications.

3. The method according to claim 2, wherein the determining (207) is based on a consensus criterion among at least one of: the obtained respective first results and the obtained respective second results.

4. The method according to any one of claims 2-3, wherein the determining (207) comprises applying a weighted factor to at least one of: the obtained respective first results and the obtained respective second results.

5. The method according to claim 4, wherein the weighted factor is based on a closeness of a respective profile of the respective third node of the one or more third nodes (113) and the obtained first profile.

6. The method according to any one of claims 1-5, wherein the obtained first profile is one of a plurality of first profiles attributable to the second node (112).

7. The method according to any one of claims 1-6, wherein the determining (207) is further based on one or more of: a time of arrival of the first set of data, a completeness of the first set of data, the obtained first profile, and an accuracy of the first set of data.

8. The method according to any of claims 1-7, wherein the result of the determining (207) is a third result, wherein, with the proviso that the third result is that the first set of data is not to be used as input to train the first MLM, and the first node (111) has previously determined the same third result for previous sets of data obtained from the second node (112), the method further comprises:
- refraining (209) from accepting further data from the second node (112) for one or more of: a period of time and a set of observations.

9. The method according to any of claims 1-8, wherein the obtaining (203) of the at least one of the one or more first models comprises one of:
- training the at least one of the one or more first models by the first node (111), and
- receiving the trained at least one of the one or more first models from a fourth node (114) operating in the communications system (100).

10. A computer-implemented method performed by a fourth node (114), the method being for handling data, the fourth node (114) operating in a communications system (100), the method comprising:
- obtaining (301) a respective plurality of second sets of data from a plurality of fifth nodes (115) operating in the communications system (100),
- obtaining (302) a respective third profile of the fifth nodes in the plurality of fifth nodes (115) based on respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes (115),
- determining (303) a respective first model capable of estimating a probability value distribution of the respective plurality of second sets of data,
- creating (304) a third indication of a correspondence between the obtained respective third profile and the determined respective first model for every respective fifth node of the plurality of fifth nodes (115), and
- providing (305) the determined respective first models, the obtained respective third profiles and the third indication of the correspondence to a first node (111) operating in the communications system (100).

11. The method according to claim 10, wherein the obtained respective third profiles comprise a plurality of respective third profiles for at least one of the fifth nodes in the plurality of fifth nodes (115) attributable to the at least one of the fifth nodes in the plurality of fifth nodes (115) based on one or more contexts.

12. The method according to any one of claims 10-11 , wherein the determining (303) of the respective first models is performed by training a second Machine Learning Model, MLM.

13. A first node (111), for handling data, the first node (111) being configured to operate in a communications system (100), the first node (111) being further configured to:
- obtain a first set of data from a second node (112) configured to operate in the communications system (100),
- obtain a first profile of the second node (112) based on one or more first characteristics of the second node (112),
- obtain, based on the first profile configured to be obtained, at least one of one or more first models configured to be capable of estimating a respective probability value distribution of data,
- send a first indication of the first set of data to one or more third nodes (113) configured to operate in the communications system (100),
- obtain a respective second indication from the one or more third nodes (113), the respective second indication being configured to indicate a respective second result of a respective determination of a respective third node of the one or more third nodes (113), using at least a respective subset of the one or more first models, of whether or not the first set of data falls within respective probability value distributions configured to be respectively estimated by the respective subset of the one or more first models,
- obtain, based on the one or more first models capable of estimating the respective probability value distribution of data, a respective first result of whether or not the first set of data falls within the respective probability value distributions configured to be respectively estimated with the one or more first models, wherein the one or more first models are configured to correspond to the one or more first characteristics of the second node (112),
- determine whether or not the first set of data is to be used as input to train a first Machine Learning model, MLM, the determining being configured to be based on the respective first results configured to be obtained, and
- provide, based on a result of the determining, the first set of data as input to train the first MLM.

14. The first node (111) according to claim 13, wherein determining is configured to be further based on the respective second indications configured to be obtained.

15. The first node (111) according to claim 14, wherein the determining is configured to be based on a consensus criterion among at least one of: the respective first results configured to be obtained and the respective second results configured to be obtained.

16. The first node (111) according to any one of claims 14-15, wherein the determining is configured to comprise applying a weighted factor to at least one of: the respective first results configured to be obtained and the respective second results configured to be obtained.

17. The first node (111) according to claim 16, wherein the weighted factor is configured to be based on a closeness of a respective profile of the respective third node of the one or more third nodes (113) and the first profile configured to be obtained.

18. The first node (111) according to any one of claims 13-17, wherein the first profile configured to be obtained is configured to be one of a plurality of first profiles configured to be attributable to the second node (112).

19. The first node (111) according to any one of claims 13-18, wherein the determining is further configured to be based on one or more of: a time of arrival of the first set of data, a completeness of the first set of data, the first profile configured to be obtained, and an accuracy of the first set of data.

20. The first node (111) according to any of claims 13-19, wherein the result of the determining is configured to be a third result, wherein, with the proviso that the third result is that the first set of data is not to be used as input to train the first MLM, and the first node (111) has previously determined the same third result for previous sets of data obtained from the second node (112), the first node (111) is further configured to:
- refrain from accepting further data from the second node (112) for one or more of: a period of time and a set of observations.

21. The first node (111) according to any of claims 13-20, wherein the obtaining of the at least one of the one or more first models is configured to comprise one of:
- training the at least one of the one or more first models by the first node (111), and
- receiving the trained at least one of the one or more first models from a fourth node (114) configured to operate in the communications system (100).

22. A fourth node (114), for handling data, the fourth node (114) being configured to operate in a communications system (100), the fourth node (114) being further configured to:
- obtain a respective plurality of second sets of data from a plurality of fifth nodes (115) configured to operate in the communications system (100),
- obtain a respective third profile of the fifth nodes in the plurality of fifth nodes (115) based on respective one or more second characteristics of the fifth nodes in the plurality of fifth nodes (115),
- determine a respective first model capable of estimating a probability value distribution of the respective plurality of second sets of data,
- create a third indication of a correspondence between the respective third profile configured to be obtained and the respective first model configured to be determined for every respective fifth node of the plurality of fifth nodes (115), and
- provide the respective first models configured to be determined, the respective third profiles configured to be obtained and the third indication of the correspondence to a first node (111) configured to operate in the communications system (100).

23. The fourth node (114) according to claim 22, wherein the respective third profiles configured to be obtained are configured to comprise a plurality of respective third profiles for at least one of the fifth nodes in the plurality of fifth nodes (115) configured to be attributable to the at least one of the fifth nodes in the plurality of fifth nodes (115) based on one or more contexts.

24. The fourth node (114) according to any one of claims 22-23, wherein the determining of the respective first models is configured to be performed by training a second Machine Learning Model, MLM.

25. A computer program (605), comprising instructions which, when executed on at least one processing circuitry (601), cause the at least one processing circuitry (601) to carry out the method according to any of claims 1-9.

26. A computer-readable storage medium (606), having stored thereon a computer program (605), comprising instructions which, when executed on at least one processing circuitry (601), cause the at least one processing circuitry (601) to carry out the method according to any of claims 1-9.

27. A computer program (705), comprising instructions which, when executed on at least one processing circuitry (701), cause the at least one processing circuitry (701) to carry out the method according to any of claims 10-12.

28. A computer-readable storage medium (706), having stored thereon a computer program (705), comprising instructions which, when executed on at least one processing circuitry (701), cause the at least one processing circuitry (701) to carry out the method according to any of claims 10-12.
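By way of a non-limiting illustration only, the determining (207) with a consensus criterion and weighted factors, and the refraining (209), as recited in the method claims above, may be sketched as follows. The Jaccard similarity used for profile closeness, the 0.5 consensus threshold, the unit weight given to the first node's own results, and the names `Vote`, `profile_closeness`, `accept_for_training` and `SourceGate` are all assumptions introduced for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Vote:
    in_distribution: bool  # result of one distribution check
    weight: float          # assumed weight, e.g. profile closeness in [0, 1]


def profile_closeness(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two characteristic sets (an assumed closeness metric)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def accept_for_training(
    first_results: List[bool],
    second_votes: List[Vote],
    threshold: float = 0.5,
) -> bool:
    """Weighted consensus over the first node's results and third-node votes."""
    # The first node's own results are given unit weight (an assumption).
    votes = [Vote(r, 1.0) for r in first_results] + second_votes
    total = sum(v.weight for v in votes)
    if total == 0:
        return False  # no usable evidence: do not train on the data
    score = sum(v.weight for v in votes if v.in_distribution) / total
    return score >= threshold


@dataclass
class SourceGate:
    """Skips data from a source after repeated rejections (cf. refraining 209)."""
    max_rejections: int = 3  # assumed number of consecutive rejections tolerated
    cooldown: int = 5        # assumed number of observations to skip
    rejections: int = 0
    skip_remaining: int = 0

    def should_skip(self) -> bool:
        """Consume one observation of an active cooldown, if any."""
        if self.skip_remaining > 0:
            self.skip_remaining -= 1
            return True
        return False

    def record(self, accepted: bool) -> None:
        """Track consecutive rejections and start a cooldown when the limit is hit."""
        self.rejections = 0 if accepted else self.rejections + 1
        if self.rejections >= self.max_rejections:
            self.skip_remaining = self.cooldown
            self.rejections = 0


if __name__ == "__main__":
    w = profile_closeness({"urban", "macro-cell"}, {"urban"})
    print(accept_for_training([True, True], [Vote(False, w), Vote(True, 1.0)]))
```

Weighting each third node's vote by the closeness of its profile to the first profile lets nodes that resemble the second node (112) influence the outcome more than dissimilar ones; the cooldown here is counted in observations, and a time-based period could be substituted.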