Target task prediction method, data processing method and system of task prediction model

By adding a data adjustment layer to the natural language pre-trained model, a task prediction model adapted to power load forecasting is generated, which solves the problem of high data requirements for power load forecasting models and achieves efficient power load forecasting.

CN116050639B · Active · Publication Date: 2026-06-19 · ALIBABA (CHINA) CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
ALIBABA (CHINA) CO LTD
Filing Date
2023-01-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing power load forecasting models have high data requirements and struggle to maintain efficient forecasting in the absence of high-quality labeled data.

Method used

By adding a data adjustment layer to the natural language pre-trained model, a task prediction model is generated, which adapts to the processing requirements of the target task's time-series data and reduces the dependence on sample labeled data.

Benefits of technology

It enables efficient power load forecasting even in the absence of high-quality sample labeled data, improving forecasting efficiency and accuracy.

✦ Generated by Eureka AI based on patent content.


Abstract

This specification provides a target task prediction method, a data processing method for a task prediction model, and a system. The target task prediction method includes: acquiring target task time-series data; inputting the target task time-series data into the embedding layer of a task prediction model to obtain initial time-series feature data; inputting the initial time-series feature data into the splitting mapping layer of the task prediction model to obtain multiple split sub-data feature vectors; inputting the multiple split sub-data feature vectors into the decoder of the task prediction model to obtain multiple predicted split sub-data feature vectors; and inputting the multiple predicted split sub-data feature vectors into the mapping and restoration layer of the task prediction model to obtain target task prediction time-series data. Because the target task time-series data passes in sequence through the embedding layer and the splitting mapping layer to yield multiple split sub-data feature vectors, the decoder can perform prediction directly on those vectors, which improves prediction efficiency.
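The four stages named in the abstract can be sketched as a simple data flow. This is only an illustrative sketch: the internals of every layer below (min-max scaling plus a position feature, fixed-length segments, an identity stand-in for the decoder) are assumptions made for the example and are not taken from the patent text.

```python
from typing import List, Tuple

Vector = List[float]

def embedding_layer(series: List[float]) -> Tuple[List[Vector], float, float]:
    """Hypothetical embedding: scale each value to [0, 1] and attach a position feature."""
    lo = min(series)
    span = (max(series) - lo) or 1.0
    features = [[(v - lo) / span, i / len(series)] for i, v in enumerate(series)]
    return features, lo, span

def splitting_mapping_layer(features: List[Vector], segment_len: int) -> List[Vector]:
    """Hypothetical split: cut the features into consecutive segments, one flat vector each."""
    return [
        [x for step in features[start:start + segment_len] for x in step]
        for start in range(0, len(features) - segment_len + 1, segment_len)
    ]

def decoder(segments: List[Vector]) -> List[Vector]:
    """Stand-in for the pre-trained decoder: returns each segment unchanged."""
    return [list(seg) for seg in segments]

def mapping_restoration_layer(predicted: List[Vector], lo: float, span: float) -> List[float]:
    """Hypothetical restoration: read the value feature of each step and undo the scaling."""
    return [seg[i] * span + lo for seg in predicted for i in range(0, len(seg), 2)]

series = [10.0, 20.0, 30.0, 40.0]
features, lo, span = embedding_layer(series)
segments = splitting_mapping_layer(features, segment_len=2)
restored = mapping_restoration_layer(decoder(segments), lo, span)
```

With the identity decoder the restored series simply reproduces the input, which makes it easy to check that the split and restoration stages are inverses of each other before a real decoder is plugged in.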

Description

Technical Field

[0001] This specification relates to the field of machine learning, and in particular to a method for predicting a target task. One or more embodiments of this specification also relate to a power load prediction method, a data processing method for a task prediction model, a data processing system for a task prediction model, a target task prediction device, a power load prediction device, a data processing device for a task prediction model, a computing device, a computer-readable storage medium, and a computer program.

Background Technology

[0002] With the development of computer technology, machine learning has been widely used in fields such as natural language processing, computer vision, and speech recognition. For example, in predicting future events, the usual practice is to select a suitable machine learning model and train the model using event-related data to continuously optimize the reliability of the model's prediction results.

[0003] When forecasting power load, most current power load forecasting models require massive amounts of labeled data to maintain the accuracy of the model's forecast output. In practice, however, it is often difficult to obtain large volumes of high-quality labeled sample data for training, and without such data the forecasting model becomes less efficient. Therefore, there is an urgent need for an efficient solution for forecasting target tasks.

Summary of the Invention

[0004] In view of the above, embodiments of this specification provide a target task prediction method. One or more embodiments of this specification also relate to a power load prediction method, a data processing method for a task prediction model, a data processing system for a task prediction model, a target task prediction device, a power load prediction device, a data processing device for a task prediction model, a computing device, a computer-readable storage medium, and a computer program, to address the technical deficiencies existing in the prior art.

[0005] According to a first aspect of the embodiments of this specification, a target task prediction method is provided, applied to an edge device, comprising:

[0006] Obtain the time-series data of the target task;

[0007] The target task time series data is input into the task prediction model. After processing by the task prediction model, the target task prediction time series data is obtained. The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the target task time series data meets the data processing requirements of the natural language pre-training model.

[0008] According to a second aspect of the embodiments of this specification, a power load forecasting method is provided, applied to an edge device, comprising:

[0009] Obtain the power load time-series data input by the user on the front end;

[0010] The power load time series data is input into the task prediction model. After processing by the task prediction model, power load prediction time series data is obtained. The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the power load time series data meets the data processing requirements of the natural language pre-training model.

[0011] The power load forecast time series data is fed back to the front end for display.

[0012] According to a third aspect of the embodiments of this specification, a data processing method for a task prediction model is provided, applied to a cloud-side device, comprising:

[0013] Obtain a sample set, wherein the sample set includes multiple sample data, and the multiple sample data carry sample labels;

[0014] The sample data is input into the initial task prediction model, and the initial task prediction model processes the data to obtain prediction data. The initial task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the sample data meets the data processing requirements of the natural language pre-training model.

[0015] Compare the predicted data with the sample labels, and calculate the loss value;

[0016] Based on the loss value, adjust the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model, and return to the step of inputting the sample data into the initial task prediction model. If the training stopping condition is met, obtain the model parameters of the trained task prediction model.

[0017] The model parameters of the trained task prediction model are sent to the edge device.
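The training steps of the third aspect (forward pass, loss, update of only the layers outside the pre-trained model, repeat until a stopping condition) can be sketched as follows. This is a toy illustration under stated assumptions: the "pre-trained model" is a single frozen `tanh` unit, the trainable layers are a scalar input adjustment (`a`, `b`) and a scalar output restoration (`c`, `d`), and the loss is mean squared error. None of these choices comes from the patent text.

```python
import math

FROZEN_W = 0.8  # stand-in for pre-trained weights; never updated below

def frozen_backbone(z: float) -> float:
    """Toy replacement for the natural language pre-trained model."""
    return math.tanh(FROZEN_W * z)

def predict(params, x: float) -> float:
    a, b, c, d = params  # (a, b): input adjustment; (c, d): output restoration
    return c * frozen_backbone(a * x + b) + d

def mean_loss(params, samples) -> float:
    return sum((predict(params, x) - y) ** 2 for x, y in samples) / len(samples)

def train(samples, lr=0.02, max_steps=500, tol=1e-4):
    params = [1.0, 0.0, 1.0, 0.0]
    for _ in range(max_steps):
        if mean_loss(params, samples) < tol:  # training stopping condition
            break
        a, b, c, d = params
        grads = [0.0, 0.0, 0.0, 0.0]
        for x, y in samples:
            h = frozen_backbone(a * x + b)
            dp = 2.0 * (c * h + d - y) / len(samples)   # d(loss)/d(prediction)
            dz = dp * c * (1.0 - h * h) * FROZEN_W      # back through the frozen unit
            grads[0] += dz * x
            grads[1] += dz
            grads[2] += dp * h
            grads[3] += dp
        # Only the four adapter parameters are updated; FROZEN_W is left untouched.
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

# Sample data with labels y = 2x + 1 (made up for the example).
samples = [(x, 2.0 * x + 1.0) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
initial_loss = mean_loss([1.0, 0.0, 1.0, 0.0], samples)
trained = train(samples)
final_loss = mean_loss(trained, samples)
```

Gradients still flow through the frozen unit to reach the input adjustment parameters; freezing only means that the backbone's own weights receive no update.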

[0018] According to a fourth aspect of the embodiments of this specification, a data processing system for a task prediction model is provided, comprising:

[0019] An edge device is used to construct a sample set and send the sample set to a cloud-side device, wherein the sample set includes multiple sample data, and the multiple sample data carry sample labels;

[0020] A cloud-side device is used to input sample data from the sample set into an initial task prediction model. The initial task prediction model processes the data to obtain prediction data. This initial task prediction model is generated by adjusting a natural language pre-trained model. This adjustment involves adding a data adjustment layer to the natural language pre-trained model to make the sample data meet the data processing requirements of the natural language pre-trained model. The cloud-side device then compares the prediction data with the sample labels and calculates a loss value. Based on the loss value, it adjusts the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model and returns to the step of inputting the sample data into the initial task prediction model. When the training stopping condition is met, the cloud-side device obtains the model parameters of the trained task prediction model and sends these model parameters to the edge device.

[0021] According to a fifth aspect of the embodiments of this specification, a target task prediction apparatus is provided, comprising:

[0022] The first acquisition module is configured to acquire the target task time-series data.

[0023] The first obtaining module is configured to input the target task time series data into the task prediction model, and obtain the target task prediction time series data through the processing of the task prediction model. The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the target task time series data meets the data processing requirements of the natural language pre-training model.

[0024] According to a sixth aspect of the embodiments of this specification, a power load forecasting device is provided, applied to an edge device, comprising:

[0025] The second acquisition module is configured to acquire the power load time-series data input by the user at the front end;

[0026] The second obtaining module is configured to input the power load time series data into the task prediction model, and obtain power load prediction time series data through the processing of the task prediction model. The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the power load time series data meets the data processing requirements of the natural language pre-training model.

[0027] The feedback module is configured to feed back the power load forecast time series data to the front end for display.

[0028] According to a seventh aspect of the embodiments of this specification, a data processing apparatus for a task prediction model is provided, applied to a cloud-side device, comprising:

[0029] The sample set acquisition module is configured to acquire a sample set, wherein the sample set includes multiple sample data, and the multiple sample data carry sample labels;

[0030] The third acquisition module is configured to input the sample data into the initial task prediction model, and obtain prediction data through the processing of the initial task prediction model. The initial task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the sample data meets the data processing requirements of the natural language pre-training model.

[0031] The calculation module is configured to compare the predicted data and the sample labels to calculate the loss value;

[0032] The adjustment module is configured to adjust the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model according to the loss value, and return to the step of inputting the sample data into the embedding layer of the initial task prediction model to obtain the initial temporal feature data. If the training stopping condition is reached, the model parameters of the trained task prediction model are obtained.

[0033] The sending module is configured to send the model parameters of the trained task prediction model to the edge device, wherein the model parameters include the parameters of the data adjustment layer and the data restoration layer.

[0034] According to an eighth aspect of the embodiments of this specification, a computing device is provided, comprising:

[0035] Memory and processor;

[0036] The memory is used to store computer-executable instructions, and the processor is used to execute the computer-executable instructions. When the computer-executable instructions are executed by the processor, they implement the steps of the above-mentioned target task prediction method, power load prediction method, or data processing method for a task prediction model.

[0037] According to a ninth aspect of the embodiments of this specification, a computer-readable storage medium is provided that stores computer-executable instructions, which, when executed by a processor, implement the steps of the target task prediction method, the power load prediction method, or the data processing method for a task prediction model described above.

[0038] According to a tenth aspect of the embodiments of this specification, a computer program is provided, wherein when the computer program is executed in a computer, it causes the computer to perform the steps of the target task prediction method, the power load prediction method, or the data processing method for a task prediction model described above.

[0039] An embodiment of this specification provides a target task prediction method comprising: acquiring target task time-series data; inputting the target task time-series data into a task prediction model, and processing the data through the task prediction model to obtain target task prediction time-series data. The task prediction model is generated by adjusting a natural language pre-training model. This adjustment involves adding a data adjustment layer to the natural language pre-training model to ensure that the target task time-series data meets the data processing requirements of the natural language pre-training model. By adding a data adjustment layer to the natural language pre-training model, the target task time-series data meets the data processing requirements of the natural language pre-training model when making predictions based on the acquired target task time-series data. Furthermore, by adjusting the natural language pre-training model to generate the task prediction model, the task prediction model can efficiently obtain prediction time-series data without requiring a large amount of labeled sample data for training, thus achieving high efficiency in prediction based on target task time-series data.

Attached Figure Description

[0040] Figure 1 is a model structure diagram of a GPT2 model;

[0041] Figure 2 is a framework diagram of a data processing system for a task prediction model provided in one embodiment of this specification;

[0042] Figure 3 is a framework diagram of another data processing system for a task prediction model provided in one embodiment of this specification;

[0043] Figure 4 is a flowchart of a target task prediction method provided in one embodiment of this specification;

[0044] Figure 5 is a flowchart of a power load forecasting method provided in one embodiment of this specification;

[0045] Figure 6 is a flowchart of a data processing method for a task prediction model provided in one embodiment of this specification;

[0046] Figure 7 is a model structure diagram of a task prediction model for a target task prediction method provided in one embodiment of this specification;

[0047] Figure 8 is a flowchart of the processing procedure of a target task prediction method provided in one embodiment of this specification;

[0048] Figure 9 is a schematic diagram of the structure of a target task prediction device provided in one embodiment of this specification;

[0049] Figure 10 is a schematic diagram of the structure of a power load forecasting device provided in one embodiment of this specification;

[0050] Figure 11 is a schematic diagram of the structure of a data processing device for a task prediction model provided in one embodiment of this specification;

[0051] Figure 12 is a structural block diagram of a computing device provided in one embodiment of this specification.

Detailed Implementation

[0052] Many specific details are set forth in the following description to provide a full understanding of this specification. However, this specification can be implemented in many other ways than those described herein, and those skilled in the art can make similar extensions without departing from the spirit of this specification. Therefore, this specification is not limited to the specific implementations disclosed below.

[0053] The terminology used in one or more embodiments of this specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of this specification. The singular forms “a,” “said,” and “the” as used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term “and / or” as used in one or more embodiments of this specification refers to and includes any or all possible combinations of one or more associated listed items.

[0054] It should be understood that although the terms first, second, etc., may be used to describe various information in one or more embodiments of this specification, such information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, first may also be referred to as second without departing from the scope of one or more embodiments of this specification, and similarly, second may also be referred to as first. Depending on the context, the word "if" as used herein may be interpreted as "upon," "when," or "in response to a determination."

[0055] First, the terms and concepts used in one or more embodiments of this specification will be explained.

[0056] Natural Language Processing (NLP) is an important field within computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers using natural language.

[0057] Pretrained Language Model (PRM): a large-scale transformer-based language model trained on very large datasets. Its structure is similar to a transformer that contains only a decoder. One example is GPT2 (Generative Pre-trained Transformer 2), an open-source pre-trained natural language model used in NLP applications.

[0058] See Figure 1, which shows a model structure diagram of a GPT2 model. As shown in Figure 1, the input data is first processed by an embedding layer, which includes an input embedding and a positional embedding. Specifically, the input data is passed through the input embedding to obtain an embedding vector, and the positional embedding is then applied to that vector. The output of the embedding layer is fed into the decoder, which includes an attention layer and a fully connected layer. The attention layer includes a multi-head attention layer followed by a residual connection and layer normalization (Add & Layer Norm); the output of the embedding layer passes through these in sequence. The output of the attention layer is then fed into the fully connected layer, which includes a feed-forward layer followed by a residual connection and layer normalization (Add & Layer Norm). Finally, the output of the fully connected layer is fed into the output layer to obtain the prediction result.
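The block structure described in this paragraph can be sketched in plain Python. The sketch follows the ordering given in the text (attention, then Add & Layer Norm, then feed-forward, then Add & Layer Norm) rather than any particular released implementation. The sizes, the random weights, the ReLU activation, and the omission of learned layer-norm parameters and of the output layer are all simplifying assumptions.

```python
import math
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def layer_norm(x, eps=1e-5):
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention_head(x, wq, wk, wv):
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    scale = math.sqrt(len(q[0]))
    weights = []
    for i, qi in enumerate(q):
        # Position i attends only to positions 0..i (decoder-style masking).
        scores = [sum(a * b for a, b in zip(qi, k[j])) / scale for j in range(i + 1)]
        weights.append(softmax(scores) + [0.0] * (len(q) - i - 1))
    return matmul(weights, v)

def multi_head_attention(x, heads, wo):
    per_head = [causal_attention_head(x, *h) for h in heads]
    concat = [sum((head[i] for head in per_head), []) for i in range(len(x))]
    return matmul(concat, wo)

def feed_forward(x, w1, w2):
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, w1)]  # ReLU for brevity
    return matmul(hidden, w2)

def decoder_block(x, heads, wo, w1, w2):
    x = layer_norm(add(x, multi_head_attention(x, heads, wo)))  # Add & Layer Norm
    return layer_norm(add(x, feed_forward(x, w1, w2)))          # Add & Layer Norm

def embed(token_ids, token_table, pos_table):
    # Input embedding plus positional embedding.
    return [[t + p for t, p in zip(token_table[tok], pos_table[i])]
            for i, tok in enumerate(token_ids)]

def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model, n_heads, d_ff, vocab, max_len = 8, 2, 16, 10, 6
token_table = rand_matrix(rng, vocab, d_model)
pos_table = rand_matrix(rng, max_len, d_model)
heads = [tuple(rand_matrix(rng, d_model, d_model // n_heads) for _ in range(3))
         for _ in range(n_heads)]
wo = rand_matrix(rng, d_model, d_model)
w1 = rand_matrix(rng, d_model, d_ff)
w2 = rand_matrix(rng, d_ff, d_model)

hidden = decoder_block(embed([1, 4, 2, 7], token_table, pos_table), heads, wo, w1, w2)
```

Because of the causal mask, changing a later input token leaves the hidden states of earlier positions unchanged, which is the property that lets a decoder-only model predict the next element of a sequence.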

[0059] N-gram model: an algorithm based on statistical language models. Its basic idea is to slide a window of size N over the text content, byte by byte, producing a sequence of byte segments each of length N.
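As a minimal sketch of the sliding-window idea, the helper below (the name `byte_ngrams` and the UTF-8 encoding are assumptions for the example) returns every byte segment of length N in order.

```python
from typing import List

def byte_ngrams(text: str, n: int, encoding: str = "utf-8") -> List[bytes]:
    """Slide a window of n bytes over the encoded text, one byte at a time."""
    if n <= 0:
        raise ValueError("n must be positive")
    data = text.encode(encoding)
    # range() is empty when the text is shorter than n, so the result is [].
    return [data[i:i + n] for i in range(len(data) - n + 1)]

grams = byte_ngrams("load", 2)
# grams == [b"lo", b"oa", b"ad"]
```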

[0060] Transformer is a model that uses attention mechanisms to improve the speed of model training.

[0061] A multilayer perceptron (MLP) is a type of artificial neural network (ANN) that can have multiple hidden layers in addition to the input and output layers. The simplest MLP contains only one hidden layer, i.e., a three-layer structure.
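The three-layer case (input, one hidden layer, output) can be written as a short forward pass. The weights, the `tanh` activation, and the function names below are illustrative assumptions.

```python
import math
from typing import List

def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer; weights[j] holds the input weights of output unit j."""
    return [sum(xi * w for xi, w in zip(x, unit)) + b for unit, b in zip(weights, bias)]

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    hidden = [math.tanh(v) for v in dense(x, hidden_w, hidden_b)]  # the single hidden layer
    return dense(hidden, out_w, out_b)                             # linear output layer

# Two inputs, two hidden units, one output (values made up for the example).
output = mlp_forward(
    [1.0, 2.0],
    hidden_w=[[0.5, -0.5], [1.0, 0.0]], hidden_b=[0.0, 0.0],
    out_w=[[1.0, 1.0]], out_b=[0.1],
)
```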

[0062] Knowledge transfer is the influence of one learning process on another. In the continuous process of learning, all learning is based on the learner's existing knowledge, experience, cognitive structure, acquired motor skills, and learned attitudes.

[0063] Time series data refers to data collected over time. It is a data series recorded in chronological order for a single, consistently defined indicator. All data points within the same series must be measured on the same statistical basis so that they are comparable. Time series data can be period data or point-in-time data.

[0064] Time series forecasting is a regression forecasting method, belonging to quantitative forecasting. Its basic principle is: on the one hand, it acknowledges the continuity of the development of things and uses past time series data for statistical analysis to infer the development trend of things; on the other hand, it fully considers the randomness caused by accidental factors. In order to eliminate the impact of random fluctuations, it uses historical data for statistical analysis and appropriately processes the data to make trend predictions.
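The two ideas in this paragraph, smoothing out random fluctuation and then extrapolating the trend, can be illustrated with a moving average followed by a least-squares line. These are classical techniques chosen only for illustration; they are not the method of this specification, and the data is made up.

```python
from typing import List

def moving_average(series: List[float], window: int) -> List[float]:
    """Average each run of `window` consecutive points to damp random fluctuation."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def linear_trend_forecast(series: List[float], steps: int) -> List[float]:
    """Fit y = intercept + slope * t by least squares and extrapolate `steps` points."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + k) for k in range(steps)]

history = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 13.0, 15.0]  # noisy upward trend
smoothed = moving_average(history, window=2)   # 11.0, 11.5, ..., 14.0
forecast = linear_trend_forecast(smoothed, steps=2)
```

In this example the two-point average removes the alternating fluctuation entirely, so the smoothed series rises by 0.5 per step and the extrapolation continues that line.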

[0065] The integration of a large amount of distributed renewable energy into the power grid poses significant challenges to tasks such as power generation planning and scheduling, impacting the safe and stable operation of the grid. Therefore, accurate load forecasting capabilities have become a crucial technological foundation. Load forecasting refers to determining load data at a specific future moment based on various factors such as system operating characteristics, capacity expansion decisions, natural conditions, and social impacts, while meeting certain accuracy requirements. It is a prerequisite for power system dispatching, real-time control, operation planning, and development planning, and is essential information for power grid dispatching and planning departments. Accurate load forecasting allows for the economical and rational scheduling of generator start-up and shutdown within the power grid, maintaining the safe and stable operation of the grid, reducing unnecessary spinning reserve capacity, rationally scheduling generator maintenance, ensuring normal social production and life, effectively reducing power generation costs, and improving economic and social benefits.

[0066] Traditional electricity load forecasting relies on manual point mapping by personnel. With the integration of new energy sources, especially distributed photovoltaic (PV) systems, the difficulty and complexity of manual point mapping for load forecasting have increased. Traditional load forecasting, dependent on human experience, requires personnel to spend a significant amount of time analyzing and manually mapping data in the face of various external inputs, posing a substantial challenge and burden on their work. Under the new dual-carbon environment, there is an urgent need to utilize big data and artificial intelligence to make accurate electricity load forecasts, ensuring both speed and accuracy. This would effectively reduce the workload of grid power generation planning and scheduling, and improve the efficiency of load forecasting personnel.

[0067] Most existing deep learning algorithms for long-range time series prediction are transformer-based, and such models have excessively high data requirements. Furthermore, existing time series prediction algorithms such as FEDformer (a deep model based on Fourier transform and wavelet transform for feature extraction), Autoformer (a long-term series prediction model based on deep decomposition architecture and autocorrelation mechanism), Informer (a novel Transformer for long-term time series prediction), LogTrans (a Transformer with a tightly coupled convolutional time series prediction architecture), and Reformer (requiring less memory and effectively handling very long time series data) have relatively low accuracy in multivariate and univariate prediction.

[0068] To address the aforementioned technical problems, this specification provides a time series prediction algorithm, specifically a target task prediction method. The method involves acquiring target task time series data; inputting the target task time series data into a task prediction model; and processing the data to obtain predicted target task time series data. The task prediction model is generated by adjusting a natural language pre-trained model. This adjustment involves adding a data adjustment layer to the natural language pre-trained model to ensure the target task time series data meets the data processing requirements of the natural language pre-trained model. By adding a data adjustment layer to the natural language pre-trained model, the target task time series data meets the data processing requirements of the natural language pre-trained model when making predictions based on the acquired target task time series data. Furthermore, by adjusting the natural language pre-trained model to generate the task prediction model, the task prediction model can efficiently obtain predicted time series data without requiring a large amount of labeled sample data for training, thus achieving high efficiency in prediction based on target task time series data.

[0069] This specification provides a target task prediction method, and also relates to an electricity load prediction method, a data processing method for a task prediction model, a data processing system for a task prediction model, a target task prediction device, an electricity load prediction device, a data processing device for a task prediction model, a computing device, and a computer-readable storage medium, which will be described in detail in the following embodiments.

[0070] See Figure 2, which shows a framework diagram of a data processing system for a task prediction model according to an embodiment of this specification. The data processing system for the task prediction model includes a cloud-side device and an edge device.

[0071] The edge device is used to build a sample set and send the sample set to the cloud-side device. The sample set includes multiple sample data, and the multiple sample data carry sample labels.

[0072] The cloud-side device is used to input sample data from the sample set into the initial task prediction model. After processing by the initial task prediction model, prediction data is obtained. The initial task prediction model is generated by adjusting the natural language pre-trained model. The adjustment refers to adding a data adjustment layer to the natural language pre-trained model to make the sample data meet the data processing requirements of the natural language pre-trained model. The prediction data and sample labels are compared to calculate the loss value. Based on the loss value, the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model are adjusted, and the process returns to the step of inputting sample data into the initial task prediction model. When the training stopping condition is met, the model parameters of the trained task prediction model are obtained.

[0073] Furthermore, the cloud-side device is also used to send the model parameters of the trained task prediction model to the edge-side device.
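The division of work between the two sides (the edge device builds the sample set, the cloud-side device trains and returns parameters, the edge device then predicts locally) can be sketched as below. All class and method names are hypothetical, the sample label is assumed to be the next reading in the series, and the closed-form line fit is only a placeholder for the training loop described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (sample data, sample label)

class CloudSideDevice:
    def train(self, sample_set: List[Sample]) -> Dict[str, float]:
        """Placeholder training: least-squares fit of two trainable parameters."""
        n = len(sample_set)
        x_mean = sum(x for x, _ in sample_set) / n
        y_mean = sum(y for _, y in sample_set) / n
        scale = (sum((x - x_mean) * (y - y_mean) for x, y in sample_set)
                 / sum((x - x_mean) ** 2 for x, _ in sample_set))
        return {"scale": scale, "offset": y_mean - scale * x_mean}

@dataclass
class EdgeDevice:
    params: Dict[str, float] = field(default_factory=dict)

    def build_sample_set(self, readings: List[float]) -> List[Sample]:
        # Assumption: each reading is labeled with the reading that follows it.
        return list(zip(readings[:-1], readings[1:]))

    def load_params(self, params: Dict[str, float]) -> None:
        self.params = dict(params)

    def predict(self, x: float) -> float:
        return self.params["scale"] * x + self.params["offset"]

edge, cloud = EdgeDevice(), CloudSideDevice()
sample_set = edge.build_sample_set([1.0, 3.0, 7.0, 15.0])  # edge -> cloud
edge.load_params(cloud.train(sample_set))                   # cloud -> edge
next_value = edge.predict(15.0)
```

Only the small parameter dictionary travels back to the edge device, which mirrors the idea that the large pre-trained part does not change during training.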

[0074] Applying the scheme of the embodiments in this specification, the edge device constructs a sample set and sends the sample set to the cloud device. The sample set includes multiple sample data points, each carrying a sample label. The cloud device inputs the sample data from the sample set into an initial task prediction model. After processing by the initial task prediction model, prediction data is obtained. The initial task prediction model is generated by adjusting a natural language pre-trained model. Adjustment refers to adding a data adjustment layer to the natural language pre-trained model to make the target task time-series data conform to the data processing requirements of the natural language pre-trained model. The prediction data and sample labels are compared, and a loss value is calculated. Based on the loss value, the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model are adjusted, and the process returns to the step of inputting sample data into the initial task prediction model. When the training stopping condition is met, the model parameters of the trained task prediction model are obtained. The cloud device sends the model parameters of the trained task prediction model to the edge device. By adding a data adjustment layer to the natural language pre-trained model, the target task time series data meets the data processing requirements of the natural language pre-trained model when making predictions based on the acquired target task time series data. Furthermore, by adjusting the natural language pre-trained model to generate a task prediction model, the task prediction model can efficiently obtain prediction time series data without using a large amount of sample labeled data for training, thus achieving high efficiency in prediction based on target task time series data.

[0075] See Figure 3, which shows a framework diagram of another data processing system for a task prediction model provided in one embodiment of this specification. The system includes a cloud-side device and multiple edge devices. Communication connections can be established between the multiple edge devices via the cloud-side device. In the task prediction model data processing scenario, the cloud-side device provides data processing services for the task prediction model between the multiple edge devices. Each edge device can act as a sender or receiver, achieving real-time communication through the cloud-side device.

[0076] Users can interact with cloud devices via edge devices to receive data sent by other edge devices or send data to other edge devices. In the data processing scenario of task prediction models, users can publish data streams to cloud devices via edge devices. The cloud devices can then process the data of the task prediction model based on the data streams and push the processed task prediction model to other edge devices that have established communication.

[0077] In this process, the edge device and the cloud-side device establish a connection through a network, which provides the medium for communication between them. The network can include various connection types, such as wired links, wireless communication links, or fiber optic cables. Data transmitted by the edge device may need to undergo encoding, transcoding, compression, and other processing before being published to the cloud-side device.
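One possible form of the encoding and compression step is shown below, using JSON for encoding and zlib for compression from the Python standard library. The payload layout (`series` and `label` keys) and the function names are assumptions for the example; the specification does not prescribe a format.

```python
import json
import zlib
from typing import Any, List

def pack_sample_set(samples: List[Any]) -> bytes:
    """Edge side: encode the sample set as compact JSON, then compress it."""
    raw = json.dumps(samples, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)

def unpack_sample_set(payload: bytes) -> List[Any]:
    """Cloud side: reverse the two steps in the opposite order."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))

samples = [{"series": [0.5] * 96, "label": [0.5] * 24}]  # made-up sample data
payload = pack_sample_set(samples)
received = unpack_sample_set(payload)
```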

[0078] End-device devices can be browsers, apps (applications), web applications such as H5 (HyperText Markup Language 5) applications, lightweight applications (also known as mini-programs), or cloud applications. End-device devices can be developed using software development kits (SDKs) provided by cloud-based devices, such as real-time communication (RTC) SDKs. End-device devices can be deployed in electronic devices and rely on the device's operation or certain apps within the device to run. Electronic devices may have displays and support information browsing, such as personal mobile terminals like smartphones, tablets, and personal computers. Various other types of applications can also be configured in electronic devices, such as human-computer interaction applications, model training applications, text processing applications, web browser applications, shopping applications, search applications, instant messaging tools, email end-devices, and social media platform software.

[0079] Cloud-side devices can include servers providing various services, such as servers providing communication services for multiple edge devices, servers supporting backend training of models used on edge devices, and servers processing data sent by edge devices. It should be noted that a cloud-side device can be implemented as a distributed server cluster composed of multiple servers, or as a single server. A server can also be a server in a distributed system, or a server incorporating blockchain technology. A server can also be a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (CDNs), and big data and artificial intelligence platforms, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology.

[0080] It is worth noting that the data processing method for the task prediction model provided in the embodiments of this specification can be executed by a cloud-side device. In other embodiments of this specification, the edge device may also have similar functions to the cloud-side device, thereby executing the data processing method for the task prediction model provided in the embodiments of this specification. In other embodiments, the data processing method for the task prediction model provided in the embodiments of this specification may also be executed jointly by the cloud-side device and the edge device.

[0081] Referring to Figure 4, Figure 4 shows a flowchart of a target task prediction method provided in one embodiment of this specification. The method is applied to an edge device and specifically includes the following steps.

[0082] Step 402: Obtain the target task time series data.

[0083] When there is a need for prediction of target task time series data, the edge device will acquire the target task time series data. The target task time series data can be input by the user at the front end, or it can be obtained by the edge device from the database that stores time series data.

[0084] Specifically, target task time series data refers to the time series data corresponding to the target task. This data is continuous data. For example, when the target task is power load forecasting, the power load time series data can be represented by a continuous waveform.

[0085] In one possible implementation of this specification, in response to a user's click, the edge device can open a channel (e.g., via Bluetooth or a network) for uploading target task time series data. The user uploads the target task time series data through this channel, and the edge device makes predictions based on the uploaded data.

[0086] In another possible implementation of this specification, the edge device may already store a large amount of time series data. The user can select which time series data to upload, and the selected time series data serves as the target task time series data. The edge device can then make predictions based on the target task time series data.

[0087] Optionally, the target task time series data may carry the time series length of the time series data to be predicted. For example, if the target task time series data carries a time series length of 1 hour, then the time series length of the time series data predicted by the task prediction model will be 1 hour.

[0088] Step 404: Input the target task time series data into the task prediction model and, after processing by the task prediction model, obtain target task prediction time series data. The task prediction model is generated by adjusting a natural language pre-trained model. The adjustment consists of adding a data adjustment layer to the natural language pre-trained model so that the target task time series data meets the data processing requirements of the natural language pre-trained model.

[0089] Specifically, a task prediction model refers to a model that predicts time-series data. For example, inputting target task time-series data into a task prediction model yields target task predicted time-series data corresponding to the target task time-series data. Target task predicted time-series data refers to the time-series data obtained after the target task time-series data has been processed by the task prediction model. This target task predicted time-series data is continuous time-series data, predicted based on the target task time-series data and the task prediction model. A data adjustment layer refers to a network layer that adjusts the format and form of the data. For example, inputting time-series data into a data adjustment layer produces output data that has been adjusted according to the data adjustment rules of that layer. The data processing requirements of a natural language pre-trained model refer to the requirements that the natural language pre-trained model places on its input data. For example, the data processing requirements might be that the data consists of discretized data patches and that the data dimensionality matches what the processing network layer requires.

[0090] In one or more embodiments of this specification, after the edge device obtains the target task time series data, it inputs the target task time series data into the task prediction model to obtain the target task prediction time series data corresponding to the target task time series data.

[0091] In practical applications, adjusting a natural language pre-trained model to generate a task prediction model can be achieved by adding a data adjustment layer to the natural language pre-trained model. This ensures that the data processed by the natural language pre-trained model is data that has been processed by the data adjustment layer, and thus the data processed by the natural language pre-trained model is obtained by adjusting the time-series data of the target task through the data adjustment layer.

[0092] Optionally, the natural language pre-trained model can be a GPT2 model.

[0093] Optionally, when a data adjustment layer is added to the natural language pre-trained model to process the target task time series data, an inverse adjustment network layer corresponding to the data adjustment layer can also be added to the natural language pre-trained model. Once the predicted data corresponding to the target task time series data is obtained, this inverse layer applies the corresponding reverse adjustment to produce the target task prediction time series data.

[0094] In one optional embodiment of this specification, the task prediction model includes an embedding layer, a data adjustment layer, a decoder, and a data restoration layer.

[0095] Specifically, the embedding layer refers to the network layer that performs embedding processing on the input target task time-series data to generate corresponding embedded data. The decoder refers to a decoder composed of multiple network layers that perform predictive processing on the data to generate predicted data corresponding to the target task time-series data. For example, the decoder can be the part of the GPT2 model excluding the embedding and output layers. The data restoration layer restores the data's form and format. The data restoration layer's restoration is based on the data adjustment layer. For example, if the data adjustment layer splits the data, the data restoration layer merges the data; if the data adjustment layer increases the data's dimensionality, the data restoration layer reduces the data's dimensionality.

[0096] Optionally, the order of network layers in the task prediction model can be an embedding layer, a data adjustment layer, a decoder, and a data restoration layer. Specifically, the acquired target task time series data is input into the embedding layer, the data processed by the embedding layer is input into the data adjustment layer for processing, the processed data is input into the decoder for prediction to obtain the corresponding prediction data, and the prediction data is input into the data restoration layer to obtain the target task prediction time series data corresponding to the target task time series data.

[0097] The solution implemented in the embodiments of this specification includes an embedding layer, a data adjustment layer, a decoder, and a data restoration layer in the task prediction model. This allows the acquired target task time series data to be processed by the embedding layer, the data adjustment layer, the decoder, and the data restoration layer to obtain target task prediction time series data corresponding to the target task time series data. By transferring data between the network layers, the prediction efficiency of the target task time series data is improved.

[0098] In one optional embodiment of this specification, step 404 above, which involves inputting the target task time series data into the task prediction model and processing it to obtain the target task prediction time series data, includes the following specific steps:

[0099] The target task time series data is input into the embedding layer to obtain initial time series feature data;

[0100] The initial time-series feature data is input into the data adjustment layer to obtain multiple split sub-data feature vectors;

[0101] The multiple split sub-data feature vectors are input into the decoder to obtain multiple predicted split sub-data feature vectors;

[0102] The feature vectors of the multiple predicted sub-data are input into the data restoration layer to obtain the time series data for the target task prediction.
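
The four steps above can be traced as a chain of stages. The following is a minimal sketch under loud assumptions, not the implementation described in this specification: every function is a hypothetical stand-in (the embedding only attaches position indices, the "mapping" is zero-padding to a toy width of 6, and the decoder returns its input unchanged). It only illustrates how data moves from one layer to the next.

```python
def embedding_layer(series):
    # Stand-in for the embedding layer: attach a position index to each value.
    return [(pos, float(value)) for pos, value in enumerate(series)]


def data_adjustment_layer(features, patch_len, width):
    # Split the values into fixed-length patches, then "map" each patch to
    # `width` numbers by zero-padding (a placeholder for a learned mapping).
    values = [value for _, value in features]
    patches = [values[i:i + patch_len] for i in range(0, len(values), patch_len)]
    return [patch + [0.0] * (width - len(patch)) for patch in patches]


def decoder(vectors):
    # Stand-in for the pre-trained decoder: returns the vectors unchanged.
    return [list(vector) for vector in vectors]


def data_restoration_layer(vectors, patch_len):
    # Undo the mapping (drop the padding) and merge patches into one series.
    merged = []
    for vector in vectors:
        merged.extend(vector[:patch_len])
    return merged


def predict(series, patch_len=4, width=6):
    if len(series) % patch_len != 0 or width < patch_len:
        raise ValueError("series length must be a multiple of patch_len <= width")
    features = embedding_layer(series)                                   # [0099]
    split_vectors = data_adjustment_layer(features, patch_len, width)    # [0100]
    predicted_vectors = decoder(split_vectors)                           # [0101]
    return data_restoration_layer(predicted_vectors, patch_len)          # [0102]
```

Because the decoder stub is the identity, `predict` returns its input as floats; with a real decoder the same call chain would return predicted values instead.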

[0103] In one or more embodiments of this specification, after the edge device obtains the target task time series data, it can further input the target task time series data into the task prediction model for prediction processing. The target task time series data is processed by each neural network layer of the task prediction model to obtain the target task prediction time series data.

[0104] Specifically, the initial time series feature data refers to the data generated by embedding the target task time series data. This initial time series feature data is continuous and contains position vectors. The embedding layer includes an input embedding unit and a position embedding unit. The split sub-data feature vectors are generated by passing the initial time series feature data through the data adjustment layer of the task prediction model; they are high-dimensional feature vectors obtained by mapping the low-dimensional results of splitting. The predicted split sub-data feature vectors are obtained by passing the multiple split sub-data feature vectors through the decoder of the task prediction model. The decoder is a set of neural network layers that produces multiple predicted split sub-data feature vectors from the multiple split sub-data feature vectors. The decoder includes an attention layer and a fully connected layer. The attention layer includes a multi-head attention layer, residual connections, and layer normalization; the fully connected layer includes a feedforward layer, residual connections, and layer normalization. The target task prediction time series data is obtained by inputting the multiple predicted split sub-data feature vectors into the data restoration layer of the task prediction model. This target task prediction time series data is continuous time series data, predicted based on the target task time series data and the task prediction model.
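
The decoder structure described above (attention with a residual connection and layer normalization, followed by a feedforward layer with a residual connection and layer normalization) can be sketched as a single block. This is a simplified illustration with assumptions stated plainly: it uses one attention head instead of several, a ReLU activation, normalization without learned parameters, causal masking, and randomly initialized weights from a hypothetical `random_params` helper. Real GPT2 blocks differ in these details.

```python
import math
import random


def layer_norm(vector, eps=1e-5):
    mean = sum(vector) / len(vector)
    var = sum((a - mean) ** 2 for a in vector) / len(vector)
    return [(a - mean) / math.sqrt(var + eps) for a in vector]


def matvec(matrix, vector):
    # matrix is a list of rows; returns matrix @ vector.
    return [sum(w * a for w, a in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(xs, wq, wk, wv):
    qs = [matvec(wq, x) for x in xs]
    ks = [matvec(wk, x) for x in xs]
    vs = [matvec(wv, x) for x in xs]
    scale = math.sqrt(len(qs[0]))
    out = []
    for i, q in enumerate(qs):
        # Position i attends only to positions 0..i (causal mask).
        scores = [sum(a * b for a, b in zip(q, ks[j])) / scale for j in range(i + 1)]
        weights = softmax(scores)
        out.append([sum(w * vs[j][d] for j, w in enumerate(weights))
                    for d in range(len(q))])
    return out


def decoder_block(xs, p):
    # Attention sub-layer: attention, residual connection, layer normalization.
    attended = causal_attention(xs, p["wq"], p["wk"], p["wv"])
    h = [layer_norm([a + b for a, b in zip(x, y)]) for x, y in zip(xs, attended)]
    # Fully connected sub-layer: feedforward, residual, layer normalization.
    out = []
    for v in h:
        hidden = [max(0.0, a) for a in matvec(p["w1"], v)]
        ff = matvec(p["w2"], hidden)
        out.append(layer_norm([a + b for a, b in zip(v, ff)]))
    return out


def random_params(dim, hidden_dim, seed=0):
    # Hypothetical helper: random weights in place of pre-trained ones.
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, cols ** -0.5) for _ in range(cols)] for _ in range(rows)]

    return {"wq": mat(dim, dim), "wk": mat(dim, dim), "wv": mat(dim, dim),
            "w1": mat(hidden_dim, dim), "w2": mat(dim, hidden_dim)}
```

The block maps a sequence of feature vectors to a sequence of the same shape, which is why the split sub-data feature vectors must already have the decoder's feature dimension before they are passed in.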

[0105] The target task time series data is input into the embedding layer. Specifically, the target task time series data is embedded to obtain an embedding vector. At the same time, position embedding vector information is added to each vector in the embedding vector to obtain the initial time series feature data.

[0106] For example, adding a positional embedding to each embedding vector works as follows: if the embedding vectors correspond to the consecutive sequence "I love learning", then after the positional embedding each element is paired with its position, generating "I: 1", "love: 2", and "learning: 3".
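
A minimal sketch of an embedding layer with an input embedding unit and a position embedding unit is given below. It assumes, for illustration only, that each scalar time series value is scaled by a weight vector and that a position vector from a lookup table is added to it; the function name `make_embedding_layer`, the random initialization, and the dimensions are hypothetical and stand in for learned parameters.

```python
import random


def make_embedding_layer(max_len, dim, seed=0):
    rng = random.Random(seed)
    # Hypothetical learned parameters, randomly initialized here.
    value_weights = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    position_table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                      for _ in range(max_len)]

    def embed(series):
        if len(series) > max_len:
            raise ValueError("series is longer than the position table")
        embedded = []
        for pos, value in enumerate(series):
            # Input embedding unit: turn the scalar value into a vector.
            input_vec = [value * w for w in value_weights]
            # Position embedding unit: add the vector for this position.
            embedded.append([a + b for a, b in zip(input_vec, position_table[pos])])
        return embedded

    return embed
```

With this construction, equal values at different time steps receive different embeddings, so the order of the series remains recoverable by later layers.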

[0107] The target task time series data is input into the embedding layer of the task prediction model because the target task time series data is continuous time series data. If the data is directly input into the task prediction model for processing, it may disrupt the original order of the data, resulting in incorrect output results from the task prediction model. Therefore, it is necessary to first input the target task time series data into the embedding layer of the task prediction model for processing to obtain initial time series feature data.

[0108] By inputting the target task time series data into the embedding layer of the task prediction model, initial time series feature data is obtained, which enables subsequent prediction processing based on the obtained initial time series feature data.

[0109] Optionally, inputting the initial time-series feature data into the data adjustment layer of the task prediction model can involve splitting the initial time-series feature data and mapping the split results to ensure that the results obtained after the data adjustment layer meet the subsequent input requirements. Alternatively, the initial time-series feature data can be normalized, then the normalized results can be split, the split results can be mapped, and then the mapped results can be input...

[0110] The initial time-series feature data is input into the data adjustment layer of the task prediction model for splitting and mapping because the next layer in the task prediction model has strict requirements on the input content. The input content needs to be processed in advance so that the task prediction model can be used to predict the target task. Therefore, the initial time-series feature data needs to be input into the data adjustment layer for corresponding processing.

[0111] Multiple sub-data feature vectors are input into the decoder of the task prediction model. Specifically, the decoder obtains multiple predicted sub-data feature vectors based on the input sub-data feature vectors, so that subsequent processing can be performed based on the obtained predicted sub-data feature vectors, thereby efficiently completing the prediction processing based on the target task time series data.

[0112] The reason for inputting the multiple predicted split sub-data feature vectors into the data restoration layer of the task prediction model for mapping and restoration is as follows. Before the decoder processes the data, the initial time series feature data is input into the data adjustment layer, where splitting and dimensionality-increasing operations are performed, resulting in high-dimensional discrete data. The data generated by the decoder is therefore also high-dimensional and discrete. Because the required output is continuous time series data, the high-dimensional discrete data must be processed to obtain low-dimensional continuous data. Mapping and restoration processing is therefore required to obtain the target task prediction time series data.

[0113] In one optional embodiment of this specification, the data adjustment layer includes a splitting unit and a mapping unit; the above step of inputting the initial time-series feature data into the data adjustment layer to obtain multiple split sub-data feature vectors includes the following specific steps:

[0114] The initial time-series feature data is input into the splitting unit to obtain multiple split sub-data feature information;

[0115] The multiple sub-data feature information is input into the mapping unit to obtain the sub-data feature vector corresponding to each sub-data feature information.

[0116] Specifically, the splitting unit is a unit included in the data adjustment layer of the task prediction model. It splits the input initial time series feature data to obtain multiple split sub-data feature information. For example, the splitting unit can break a long string of time series data into individual pieces of data feature information. The splitting unit can be a patchwise unit or an N-gram word model. The split sub-data feature information is the non-continuous feature information obtained after splitting. The mapping unit is a unit included in the data adjustment layer of the task prediction model. It maps the input data to obtain the split sub-data feature vector corresponding to each split sub-data feature information. For example, the mapping unit can change the dimension of the split sub-data feature information to obtain the corresponding split sub-data feature vector. The mapping unit can be an MLP unit.

[0117] Optionally, the initial time-series feature data is input into the splitting unit for splitting processing to obtain multiple split sub-data feature information. The splitting can be performed according to certain rules to obtain splitting results that conform to the rules. There can be many splitting rules, and this embodiment of the specification does not limit them in any way. The specific rules can be selected according to the actual situation.

[0118] The initial time-series feature data is input into the splitting unit for splitting because the decoder, which follows the embedding layer in the pre-trained model, requires discrete data rather than continuous data as input. The continuous initial time-series feature data therefore needs to be split into multiple groups of data, with each group serving as one split sub-data feature information.

[0119] For example, if the initial time series feature data has shape (128,1), then after it is input into the splitting unit the output data can have shape (16,8): the initial time series feature data is divided into 8 groups of 16 feature values each, that is, each split sub-data feature information includes 16 feature values.

[0120] The multiple split sub-data feature information is input into the mapping unit for mapping because the decoder, which follows the embedding layer in the pre-trained model, also requires each sub-data to have a fixed dimension. The mapping unit is therefore needed to map the multiple split sub-data feature information and obtain the split sub-data feature vector corresponding to each one.

[0121] The data adjustment layer, according to the embodiments of this specification, includes a splitting unit and a mapping unit. Initial time-series feature data is input to the splitting unit to obtain multiple split sub-data feature information, and the multiple split sub-data feature information is input to the mapping unit to obtain a split sub-data feature vector corresponding to each split sub-data feature information. This results in the generated split sub-data feature vector having both the features processed by the splitting unit and the features processed by the mapping unit, which meets the requirements of subsequent processing and provides further convenience for prediction based on the time-series data of the target task.

[0122] In one optional embodiment of this specification, the data adjustment layer further includes a first normalization unit; the above step of inputting the initial time-series feature data into the splitting unit to obtain multiple split sub-data feature information includes the following specific steps:

[0123] The initial time series feature data is input into the first normalization unit to obtain the first time series feature data;

[0124] The first time-series feature data is input into the splitting unit to obtain feature information of multiple split sub-data.

[0125] Specifically, the first normalization unit refers to a unit included in the data adjustment layer of the task prediction model, used to normalize the input initial time-series feature data to obtain the first time-series feature data. The first normalization unit can be a reversible normalization unit, that is, a normalization suited to the inductive bias of time series. The first time-series feature data refers to the data obtained by normalizing the initial time-series feature data through the first normalization unit. The normalization process does not affect the data type; it reduces the data range of the initial time-series feature data to between 0 and 1, so that the values in the first time-series feature data lie between 0 and 1.

[0126] Optionally, there are many ways to normalize the initial time-series feature data. One way is to calculate the variance of the initial time-series feature data and use the calculated variance for normalization. Another way is to calculate the mean of the initial time-series feature data and use the calculated mean for normalization. Yet another way is to calculate the variance and mean of the initial time-series feature data and use the mean and variance for normalization. Any method that can normalize continuous data can implement the solution of this embodiment. The continuous data can be time-series data. That is, there are many methods for normalization. This embodiment does not limit the methods in this specification. The specific method can be determined according to the actual situation.

[0127] The initial time-series feature data is input into the first normalization unit for the following reason. The initial time-series feature data is obtained from the target task time-series data through the embedding layer, which does not change the magnitude of the data. Prediction based on the initial time-series feature data, however, involves further processing. To avoid large errors in that processing, the initial time-series feature data is normalized by the first normalization unit so that all of its values fall within a relatively small range.

[0128] For example, if the initial time series feature data are X1, X2, X3, X4, and X5, with calculated mean x and variance y, then each of X1, X2, X3, X4, and X5 can be normalized using the mean and the variance (for instance, by subtracting x and dividing by the square root of y) to obtain x1, x2, x3, x4, and x5. Here, x1, x2, x3, x4, and x5 are the first time series feature data, whose values lie in a small range, described in this embodiment as between 0 and 1.
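
The three normalization options mentioned above (using the mean, using the variance, or using both) can be sketched as follows. The function names are hypothetical, and dividing by the square root of the variance is an assumption about how the variance is applied; the specification does not fix the exact formula.

```python
import math


def center(values):
    # Mean-based normalization: subtract the mean from every value.
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def scale_by_std(values, eps=1e-8):
    # Variance-based normalization: divide by the standard deviation.
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [v / math.sqrt(var + eps) for v in values]


def standardize(values, eps=1e-8):
    # Mean and variance together: center first, then scale.
    return scale_by_std(center(values), eps)
```

Note that standardizing with the mean and variance yields values with zero mean and unit variance, which are not confined to the interval from 0 to 1; a min-max rescaling would be one way to obtain exactly that interval.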

[0129] The data adjustment layer, according to the embodiments of this specification, further includes a first normalization unit. Initial time-series feature data is input into the first normalization unit to obtain first time-series feature data. The first time-series feature data is then input into the splitting unit to obtain multiple split sub-data feature information. By normalizing the initial time-series feature data, the first time-series feature data is obtained, ensuring that the information used subsequently is the normalized first time-series feature data. This reduces errors generated during subsequent processing and further improves the accuracy of prediction based on the target task time-series data.

[0130] In one optional embodiment of this specification, the above step of inputting the initial time-series feature data into the splitting unit to obtain multiple split sub-data feature information includes the following specific steps:

[0131] The initial time-series feature data is input into the splitting unit, and the initial time-series feature data is split according to the preset splitting parameters to obtain multiple split sub-data feature information.

[0132] Specifically, preset splitting parameters refer to the pre-set splitting parameters for splitting the initial time-series feature data. Preset splitting parameters may include preset splitting length and preset splitting quantity.

[0133] Optionally, the preset splitting parameters include a preset splitting length and a preset splitting number. When splitting the initial time-series feature data, the splitting can be based on the preset splitting length or the preset splitting number. The selection of the preset splitting length or the preset splitting number can be determined based on the processing rules of the task prediction model.

[0134] The initial time-series feature data is input into the splitting unit and split according to the preset splitting parameters. If the preset splitting parameter is a preset splitting length, every split sub-data feature information has the same length; if the preset splitting parameter is a preset splitting number, the number of split sub-data feature information obtained is fixed. Because the results are produced according to the preset splitting length or preset splitting number, subsequent processing of the multiple split sub-data feature information can rely on the same preset splitting parameters, improving processing efficiency.

[0135] For example, if the initial time series feature data is (128,1) and the preset split length is 16, then the data is split according to the preset split parameter 16, and the resulting multiple split sub-data feature information is (16,8), where each split sub-data feature information is a set of 16 feature information.
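
The splitting by preset splitting length or preset splitting number can be sketched as below. The function `split_series` and its parameter names are hypothetical, and rejecting lengths that do not divide evenly is an assumption; the specification does not say how a remainder would be handled.

```python
def split_series(values, split_length=None, split_count=None):
    # Exactly one of the two preset splitting parameters must be given.
    if (split_length is None) == (split_count is None):
        raise ValueError("specify exactly one of split_length or split_count")
    n = len(values)
    if split_length is None:
        if split_count <= 0 or n % split_count != 0:
            raise ValueError("series length must be divisible by split_count")
        split_length = n // split_count
    elif split_length <= 0 or n % split_length != 0:
        raise ValueError("series length must be divisible by split_length")
    # Each group becomes one split sub-data feature information.
    return [values[i:i + split_length] for i in range(0, n, split_length)]


series = [float(i) for i in range(128)]
by_length = split_series(series, split_length=16)
by_count = split_series(series, split_count=8)
```

For the 128-value series in the example, a preset splitting length of 16 and a preset splitting number of 8 produce the same 8 groups of 16 values.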

[0136] By applying the scheme of the embodiments of this specification, the initial time series feature data is input into the splitting unit, and the initial time series feature data is split according to the preset splitting parameters to obtain multiple split sub-data feature information. The split sub-data feature information is obtained by splitting according to certain rules. In subsequent processing, the multiple split sub-data feature information can be directly processed according to the set rules, which improves the efficiency of subsequent processing and further improves the efficiency of prediction based on the target task time series data.

[0137] In one optional embodiment of this specification, the above steps of inputting the plurality of sub-data feature information into the mapping unit to obtain the sub-data feature vector corresponding to each sub-data feature information include the following specific steps:

[0138] Multiple sub-data feature information is input into the mapping unit. Based on the feature dimension of the decoder, the dimension of each sub-data feature information is mapped to obtain the corresponding sub-data feature vector.

[0139] Specifically, the feature dimension of a decoder refers to the feature dimension required for the data input to the decoder to be processed.

[0140] Optionally, the multiple split sub-data feature information is input into the mapping unit, and the feature dimension of each split sub-data feature information is mapped according to the feature dimension of the decoder. The feature dimension of the decoder is determined by the actual processing requirements of the decoder. When the decoder is a GPT2 model, the feature dimension of the decoder is 768.

[0141] Multiple sub-data feature information is input into the mapping unit. The dimension of each sub-data feature information is mapped according to the feature dimension of the decoder. This is because the output of the data adjustment layer needs to be input into the decoder for corresponding processing. In order to meet the decoder's requirements for feature dimension, each sub-data feature information needs to be mapped according to the decoder's feature dimension to obtain the sub-data feature vector corresponding to each sub-data feature information.

[0142] For example, if multiple sub-data features are (16, 8) and the feature dimension of the decoder is 768, then the multiple sub-data features are input to the mapping unit and mapped to obtain the sub-data feature vector corresponding to each sub-data feature as (8, 768). Here, every 16 features are transformed into a 768-dimensional feature vector. This embodiment does not limit the feature dimension to 768; 768 is only an example and is determined according to the feature dimension of the decoder used.
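
The dimension mapping in this example can be sketched with a single linear layer standing in for the MLP unit. The helper `make_mapping_unit`, the random weights, and the zero bias are all hypothetical; in practice the weights would be learned.

```python
import random


def make_mapping_unit(in_dim, out_dim, seed=0):
    rng = random.Random(seed)
    # One weight row per output dimension; random stand-ins for learned weights.
    weights = [[rng.gauss(0.0, in_dim ** -0.5) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim

    def mapping_unit(patches):
        # Map every patch of in_dim features to an out_dim feature vector.
        return [[sum(w * x for w, x in zip(row, patch)) + b
                 for row, b in zip(weights, bias)]
                for patch in patches]

    return mapping_unit


# 8 groups of 16 features each, as in the (16, 8) example.
patches = [[float(16 * g + i) for i in range(16)] for g in range(8)]
mapping_unit = make_mapping_unit(in_dim=16, out_dim=768)
vectors = mapping_unit(patches)
```

Here `vectors` holds 8 feature vectors of 768 dimensions each, matching the (8, 768) shape in the example. Changing `out_dim` adapts the same unit to a decoder with a different feature dimension.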

[0143] The scheme implemented in this specification involves inputting multiple sub-data feature information into a mapping unit. Based on the feature dimensions of the decoder, the dimension of each sub-data feature information is mapped to obtain a corresponding sub-data feature vector. This ensures that the obtained sub-data feature vector is mapped based on the feature dimensions of the decoder, improving the efficiency of subsequent decoder processing.

[0144] In one optional embodiment of this specification, the data restoration layer includes a mapping restoration unit and a merging unit; the above step of inputting the multiple predicted split sub-data feature vectors into the data restoration layer to obtain the target task prediction time series data includes the following specific steps:

[0145] The multiple predicted split sub-data feature vectors are input into the mapping and restoration unit to obtain the predicted split sub-data feature information corresponding to each predicted split sub-data feature vector.

[0146] The feature information of multiple predicted sub-data is input into the merging unit to obtain the target task prediction time series data.

[0147] Specifically, the mapping and restoration unit is a unit included in the data restoration layer of the task prediction model. It maps and restores the multiple input predicted split sub-data feature vectors to obtain the predicted split sub-data feature information corresponding to each feature vector. For example, the mapping and restoration unit can change the dimension of the predicted split sub-data feature vectors to obtain the corresponding predicted split sub-data feature information. The mapping and restoration unit can be an MLP unit. The predicted split sub-data feature information refers to non-continuous, low-dimensional feature information. The merging unit is a unit included in the data restoration layer of the task prediction model. It merges the multiple input predicted split sub-data feature information to obtain the target task prediction time series data. For example, the merging unit merges individual pieces of feature information into a long string of time series data.

[0148] Multiple predicted sub-data feature vectors are input into the mapping and restoration unit for mapping and restoration processing. This is because, in the data adjustment layer, in order to adapt to the decoder's requirements for the input data, the low-dimensional sub-data feature information is mapped to obtain high-dimensional sub-data feature vectors. Furthermore, in order to output the data predicted by the decoder, multiple predicted sub-data feature vectors need to be input into the mapping and restoration unit for mapping to obtain multiple predicted sub-data feature information.

[0149] The feature information of multiple prediction sub-data is input into the merging unit for merging because the input target task time series data is continuous time series data, and the prediction data generated based on the target task time series data should also be continuous time series data. Therefore, it is necessary to input the feature information of multiple prediction sub-data into the merging unit for merging to obtain the target task prediction time series data.

[0150] When the multiple predicted split sub-data feature information is input into the merging unit for merging, the merging must be based on the position vector of each predicted split sub-data feature information, to ensure that the order of the resulting target task prediction time series data is correct.
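
Position-based merging can be sketched as follows. The sketch assumes each predicted group arrives tagged with an integer position index starting at 0; this representation and the function name `merge_patches` are hypothetical simplifications of the position vectors described above.

```python
def merge_patches(indexed_patches):
    # indexed_patches: list of (position, list_of_values) pairs in any order.
    ordered = sorted(indexed_patches, key=lambda item: item[0])
    positions = [pos for pos, _ in ordered]
    if positions != list(range(len(ordered))):
        raise ValueError("positions must be 0..n-1 with no gaps or duplicates")
    merged = []
    for _, patch in ordered:
        # Concatenate groups in position order to rebuild one continuous series.
        merged.extend(patch)
    return merged


shuffled = [(1, [3.0, 4.0]), (0, [1.0, 2.0]), (2, [5.0, 6.0])]
restored = merge_patches(shuffled)
```

Sorting by position before concatenating means the merged series keeps the correct time order even when the groups are supplied out of order.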

[0151] With the scheme of the embodiments of this specification, the data restoration layer includes a mapping restoration unit and a merging unit. Multiple predicted sub-data feature vectors are input to the mapping restoration unit to obtain the predicted sub-data feature information corresponding to each predicted sub-data feature vector. The multiple pieces of predicted sub-data feature information are then input to the merging unit to obtain the target task prediction time series data. This ensures that the generated target task prediction time series data is time series data of the same data type as the target task time series data, forming a correspondence between the input data and the output data, and improving the prediction function of the task prediction model.

[0152] In one optional embodiment of this specification, the data restoration layer further includes a second normalization unit; the above step of inputting the multiple pieces of predicted sub-data feature information to the merging unit to obtain the target task prediction time series data includes the following specific steps:

[0153] The feature information of multiple predicted sub-data is input into the merging unit to obtain the initial task prediction time series data;

[0154] The initial task prediction time series data is input into the second normalization unit to obtain the target task prediction time series data.

[0155] Specifically, the second normalization unit is a unit in the data restoration layer of the task prediction model. It inversely normalizes the input initial task prediction time-series data to obtain the target task prediction time-series data. The normalization unit can be a reversible normalization unit: the first normalization unit normalizes the original data to obtain data ranging from 0 to 1, while the second normalization unit inversely normalizes data ranging from 0 to 1 to recover data on the original scale. In other words, the first and second normalization units perform inverse processes of each other. The initial task prediction time-series data refers to the continuous data obtained by merging the multiple pieces of predicted sub-data feature information.

[0156] Optionally, there are many ways to normalize the initial time-series feature data. One way is to calculate the variance of the initial time-series feature data and use the calculated variance for normalization. Another way is to calculate the mean of the initial time-series feature data and use the calculated mean for normalization. Yet another way is to calculate both the variance and the mean of the initial time-series feature data and use them together for normalization. Any method that can normalize continuous data, which may be time-series data, can implement the solution of this embodiment. The embodiments of this specification do not limit the normalization method; the specific method can be determined according to the actual situation.

[0157] In the embodiments of this specification, the method used by the second normalization unit is the inverse of the normalization method selected at the first normalization unit. For example, the first normalization unit subtracts the mean from the initial time-series feature data X1, X2, X3, X4, and X5 respectively and then divides by the variance to obtain x1, x2, x3, x4, and x5, while the second normalization unit multiplies the initial task prediction time-series data y1, y2, y3, y4, and y5 respectively by the variance and then adds the mean to obtain Y1, Y2, Y3, Y4, and Y5.
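
The paired normalization and inverse normalization in the example above can be sketched as follows. The sketch follows the text literally (subtract the mean, divide by the variance, then multiply by the variance and add the mean); dividing by the standard deviation is a more common convention, and the function names and sample values are assumptions made only for illustration.

```python
from statistics import fmean, pvariance
from typing import List, Tuple


def normalize(xs: List[float]) -> Tuple[List[float], float, float]:
    """First normalization unit: (X - mean) / variance, as in the example."""
    mean = fmean(xs)
    var = pvariance(xs, mu=mean)
    if var == 0:
        raise ValueError("constant series cannot be normalized by its variance")
    return [(x - mean) / var for x in xs], mean, var


def denormalize(ys: List[float], mean: float, var: float) -> List[float]:
    """Second normalization unit: y * variance + mean (the inverse step)."""
    return [y * var + mean for y in ys]


# Hypothetical values standing in for X1..X5.
X = [10.0, 12.0, 14.0, 16.0, 18.0]
x, mean, var = normalize(X)

# In the model, the decoder output y1..y5 would be passed here; feeding the
# normalized input back simply checks that the two units are inverses.
Y = denormalize(x, mean, var)
```

The statistics computed by the first unit are kept and reused by the second unit, which is what makes the normalization reversible.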

[0158] The initial task prediction time series data is input into the second normalization unit because the input target task time series data is continuous time series data. Therefore, it is necessary to input the initial task prediction time series data into the second normalization unit to obtain continuous target task prediction time series data.

[0159] The data restoration layer further includes a second normalization unit. The multiple pieces of predicted sub-data feature information are input to the merging unit to obtain the initial task prediction time series data, and the initial task prediction time series data is input to the second normalization unit to obtain the target task prediction time series data, so that the data type of the target task prediction time series data output by the task prediction model is consistent with that of the input target task time series data.

[0160] In one optional embodiment of this specification, the above steps of inputting the plurality of predicted sub-data feature vectors into the mapping and restoration unit to obtain the predicted sub-data feature information corresponding to each predicted sub-data feature vector include the following specific steps:

[0161] Multiple predicted sub-data feature vectors are input into the mapping and restoration unit. Based on the dimension of the sub-data feature information, the dimension of each predicted sub-data feature vector is mapped and restored to obtain the corresponding predicted sub-data feature information.

[0162] Specifically, the mapping and restoration unit and the mapping unit are inverses of each other. The mapping unit maps low-dimensional feature information into high-dimensional feature vectors, while the mapping and restoration unit maps high-dimensional feature vectors back into low-dimensional feature information.

[0163] For example, the multiple predicted split sub-data feature vectors are (8, 768) data. After they are input into the mapping and restoration unit, mapping and restoration yields the predicted split sub-data feature information corresponding to each predicted split sub-data feature vector as (16, 8) data, where each 768-dimensional feature vector is transformed into 16 pieces of feature information.

[0164] The mapping and restoration unit changes dimensions in the opposite direction to the mapping unit of the above embodiment. The mapping unit transforms the multiple pieces of split sub-data feature information into high-dimensional feature vectors, so the mapping and restoration unit needs to map the high-dimensional feature vectors back, according to the dimension of the split sub-data feature information, to predicted split sub-data feature information with that same dimension.

[0165] By applying the scheme of the embodiments of this specification, multiple predicted sub-data feature vectors are input to the mapping and restoration unit. Based on the dimension of the sub-data feature information, the dimension of each predicted sub-data feature vector is mapped and restored to obtain the corresponding predicted sub-data feature information. This allows subsequent merging to be performed on the multiple pieces of predicted sub-data feature information, thereby generating the prediction result of the task prediction model.
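
The dimension change in this example can be traced with a small sketch. It assumes that each unit is a single linear projection with random, untrained weights, so only the shapes are meaningful; the specification does not state the internal form of either unit. The sketch stores the restored data as 8 rows of 16 values, which the text lists as (16, 8) data.

```python
import random
from typing import List

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Build a rows x cols matrix of small random values."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(batch: Matrix, weight: Matrix) -> Matrix:
    """Multiply an (n, d_in) batch by a (d_in, d_out) weight matrix."""
    columns = list(zip(*weight))  # d_out columns, each of length d_in
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in batch]


rng = random.Random(0)
N_PIECES, D_LOW, D_HIGH = 8, 16, 768  # sizes taken from the example

w_map = random_matrix(D_LOW, D_HIGH, rng)      # assumed mapping unit
w_restore = random_matrix(D_HIGH, D_LOW, rng)  # assumed mapping and restoration unit

info = random_matrix(N_PIECES, D_LOW, rng)  # low-dimensional feature information
vectors = linear(info, w_map)               # 8 vectors of 768 dimensions
restored = linear(vectors, w_restore)       # back to 8 rows of 16 values
```

In a trained model both weight matrices would be learned; here they only show that the restoration unit's output dimension matches the mapping unit's input dimension.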

[0166] In one optional embodiment of this specification, the above step of inputting the multiple pieces of predicted sub-data feature information into the merging unit to obtain the target task prediction time series data includes the following specific steps:

[0167] Multiple prediction sub-data feature information are input into the merging unit, and the target task prediction time series data is obtained based on the data parameters of the initial time series feature data.

[0168] Specifically, data parameters refer to the data length and data dimension of the initial time series feature data. The merging unit and the splitting unit are inverses of each other. The splitting unit splits the continuous time series data to obtain multiple sets of data, while the merging unit merges the multiple sets of data into continuous time series data.

[0169] For example, if the splitting unit splits the initial time-series feature data (128,1) into data (16,8), then the merging unit merges the data (16,8) into data (128,1).
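
The inverse relationship between the splitting unit and the merging unit can be sketched as below. Reading (16, 8) as 16 pieces of 8 values each is an assumption about the layout, and the function names `split` and `merge` are hypothetical.

```python
from typing import List


def split(series: List[float], piece_len: int) -> List[List[float]]:
    """Splitting unit: cut a continuous series into equal-length pieces."""
    if len(series) % piece_len != 0:
        raise ValueError("series length must be a multiple of piece_len")
    return [series[i:i + piece_len] for i in range(0, len(series), piece_len)]


def merge(pieces: List[List[float]], data_length: int) -> List[float]:
    """Merging unit: rebuild the series, checked against the data length
    of the data that was originally split (the 'data parameter')."""
    merged = [value for piece in pieces for value in piece]
    if len(merged) != data_length:
        raise ValueError("merged length does not match the original data length")
    return merged


series = [float(t) for t in range(128)]  # stands in for (128, 1) data
pieces = split(series, 8)                # 16 pieces of 8 values
merged = merge(pieces, len(series))      # back to 128 values
```

Passing the original data length to `merge` mirrors the text's point that the merging unit works from the data parameters of the data given to the splitting unit.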

[0170] Both the data adjustment layer, which contains the splitting unit, and the data restoration layer, which contains the merging unit, are neural network layers in the task prediction model. Therefore, when the splitting unit and the merging unit perform mutually inverse processing, the merging unit needs to merge according to the input of the splitting unit, that is, the data parameters of the initial time series feature data.

[0171] By applying the scheme of the embodiments of this specification, the multiple pieces of predicted sub-data feature information are input into the merging unit, and the target task prediction time series data is obtained based on the data parameters of the initial time series feature data. The target task prediction time series data generated by the merging unit is therefore continuous time series data, so that the data type of the output target task prediction time series data is the same as that of the target task time series data.

[0172] In one optional embodiment of this specification, the training method of the task prediction model includes the following specific steps:

[0173] Obtain a sample set, which includes multiple sample data points, each carrying a sample label;

[0174] The sample data is input into the embedding layer of the initial task prediction model to obtain the initial time series feature data;

[0175] The initial time series feature data is input into the initial task prediction model. After processing by the initial task prediction model, the prediction data is obtained. The initial task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the target task time series data meets the data processing requirements of the natural language pre-training model.

[0176] Compare the predicted data with the sample labels and calculate the loss value;

[0177] Based on the loss value, adjust the parameters of the network layers in the initial task prediction model, excluding the natural language pre-trained model, and return to the step of inputting sample data into the initial task prediction model. If the training stopping condition is met, the task prediction model is obtained.
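
The training steps above, in which only the layers outside the pre-trained model are updated, can be illustrated with a deliberately tiny sketch. Every part of it is an assumption made for illustration: the pre-trained model is reduced to one frozen scalar weight, the data adjustment and data restoration layers to one trainable scalar each, the loss is mean squared error, and the stopping condition is a loss threshold.

```python
from typing import Dict, List, Tuple


def loss_and_grads(xs: List[float], ys: List[float],
                   frozen: Dict[str, float],
                   trainable: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Forward pass y_hat = w_out * (backbone * (w_in * x)) with MSE loss.

    Gradients are returned only for the trainable weights.
    """
    b, w_in, w_out = frozen["backbone"], trainable["w_in"], trainable["w_out"]
    n = len(xs)
    loss = 0.0
    grads = {"w_in": 0.0, "w_out": 0.0}
    for x, y in zip(xs, ys):
        z = b * (w_in * x)   # adjustment layer, then frozen backbone
        err = w_out * z - y  # restoration layer output minus label
        loss += err * err / n
        grads["w_out"] += 2.0 * err * z / n
        grads["w_in"] += 2.0 * err * w_out * b * x / n
    return loss, grads


def train(xs: List[float], ys: List[float], lr: float = 0.1,
          loss_threshold: float = 1e-6, max_steps: int = 1000):
    frozen = {"backbone": 0.5}                # stands in for pre-trained weights
    trainable = {"w_in": 1.0, "w_out": 1.0}   # stands in for the added layers
    history: List[float] = []
    for _ in range(max_steps):
        loss, grads = loss_and_grads(xs, ys, frozen, trainable)
        history.append(loss)
        if loss < loss_threshold:             # training stopping condition
            break
        for name in trainable:                # the frozen weight is never touched
            trainable[name] -= lr * grads[name]
    return frozen, trainable, history


xs = [0.0, 0.5, 1.0]
ys = [3.0 * x for x in xs]  # hypothetical labels
frozen, trainable, history = train(xs, ys)
```

Keeping the pre-trained weights in a separate dictionary that the update loop never visits is one simple way to express "adjust the parameters of the network layers excluding the natural language pre-trained model".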

[0178] In one or more embodiments of this specification, in order to improve the predictive efficiency of the trained task prediction model, a sample set may be obtained when starting model training.

[0179] Specifically, the sample set refers to the set of samples used when adjusting the parameters of the model; it is the collection of training samples and typically includes a training set and a test set. A sample label refers to the category to which the sample data belongs and is carried in the sample data in the form of a label. The training stopping condition refers to the condition under which training of the model stops; for example, training stops when the total loss value reaches a preset loss threshold. The input layer, decoder, and output layer in the task prediction model are pre-trained and can be taken from a GPT2 model. Knowledge transfer techniques are therefore used to transfer knowledge from the GPT2 model to the initial task prediction model. Based on the obtained sample set, the initial task prediction model is trained to obtain the task prediction model.

[0180] By acquiring a sample set containing multiple sample data points, and with each sample data point carrying a category label, a classification model can be trained based on the acquired sample set.

[0181] In practical applications, there are many functions for calculating text loss values, such as cross-entropy loss function, L1 norm loss function, maximum loss function, mean squared error loss function, log loss function, etc. The specific function to be selected depends on the actual situation, and the embodiments in this specification do not impose any limitations on this.
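
Two of the loss functions named above can be written out as a minimal sketch. Averaging over the elements is an assumption (summing is an equally valid convention), and the function names are illustrative.

```python
from typing import Sequence


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predictions and labels."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """L1 norm loss, averaged over elements (mean absolute error)."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


pred = [1.0, 2.0, 3.0]
label = [1.0, 2.0, 5.0]
mse = mse_loss(pred, label)
l1 = l1_loss(pred, label)
```

For continuous time-series predictions, mean squared error and L1 loss are natural candidates from this list, since they compare real-valued outputs directly.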

[0182] Optionally, the initial task prediction model can be trained based on each sample data and its sample label. This can be done by training the initial task prediction model sequentially based on multiple sample data and sample labels. Alternatively, each sample data can be divided into two parts, one part for training the initial task prediction model and the other part for testing the trained task prediction model.

[0183] In practical applications, the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model are adjusted, specifically the network parameters in the data adjustment layer and the data restoration layer.

[0184] Optionally, any one of the multiple sample data is extracted as the target sample data, together with the sample label it carries. The target sample data is input into the initial task prediction model to obtain the prediction result output by the initial task prediction model. The total loss value is then calculated from the prediction result and the sample label carried by the target sample data.

[0185] Optionally, after calculating the total loss value, the system determines whether the total loss value meets the preset loss threshold. If it does, training stops; otherwise, the system adjusts the network parameters of the initial task prediction model based on the total loss value. After adjustment, the system returns to the point where it extracts one sample from multiple sample data as the target sample data and continues training the initial task prediction model with adjusted network parameters until the total loss value meets the preset loss threshold.

[0186] The scheme implemented in this specification involves sequentially inputting multiple sample data from the sample set into the embedding layer, data adjustment layer, decoder, and data restoration layer of the initial task prediction model to obtain prediction data. The prediction data and sample labels are compared to calculate the loss value, and the network parameters of the initial task prediction model are adjusted according to the loss value until the training stopping condition is met, thereby obtaining the task prediction model. The initial task prediction model is trained using the sample data in the sample set, resulting in a task prediction model with high prediction efficiency.

[0187] See Figure 5, which shows a flowchart of a power load forecasting method according to an embodiment of this specification, applied to end-side equipment. The method specifically includes the following steps:

[0188] Step 502: Obtain the power load time-series data input by the user at the front end.

[0189] In one or more embodiments of this specification, the end-side device can acquire power load timing data input by the user at the front end.

[0190] Specifically, power load time series data refers to continuous power consumption data. The data format of power load time series data is diverse, and it can be waveform or coordinate axis. The specific choice depends on the actual situation, and the embodiments in this specification do not limit it in any way.

[0191] It should be noted that there are many ways for end-side devices to obtain the power load timing data input by the user at the front end. The specific method should be selected according to the actual situation. The embodiments in this specification do not limit this in any way.

[0192] In one possible implementation of this specification, the user may have pre-uploaded or stored power load timing data in the end-side device, and the user can directly select the data at the front end and input the data based on the selected content.

[0193] In another possible implementation of this specification, when the user browses the front-end webpage, a portion of the power load time series data is randomly selected as the power load time series data, which is then predicted by the end-side device.

[0194] Step 504: Input the power load time series data into the task prediction model, and obtain the power load prediction time series data through the processing of the task prediction model.

[0195] The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the power load time series data meets the data processing requirements of the natural language pre-training model.

[0196] In one or more embodiments of this specification, after the end-side device obtains the power load time-series data input by the user at the front end, it inputs the power load time-series data into the task prediction model to obtain the corresponding power load prediction time-series data.

[0197] The specific implementation of steps 502 to 504 is the same as that of the target task prediction method shown in Figure 4 above and will not be described in detail here.

[0198] Step 506: Feed back the power load forecast time series data to the front-end display.

[0199] In one or more embodiments of this specification, after the end-side device performs prediction based on the power load time-series data input by the user and generates corresponding power load prediction time-series data, it feeds back the power load prediction time-series data to the front end for display to the user, so that the user can obtain the power load prediction time-series data corresponding to the power load time-series data through the front end display.

[0200] Optionally, after receiving the power load forecast time series data through the front end, the user can use it directly, or compare the power load forecast time series data with the actual time series data. If they are different, the user can feed the actual time series data back to the end-side device for corresponding processing.

[0201] In one optional embodiment of this specification, after step 506 above, the following specific steps are further included:

[0202] Receive the target power load forecast time series data fed back by the user;

[0203] Based on the target power load forecast time series data, the model parameters in the task forecast model are adjusted to obtain the adjusted task forecast model.

[0204] In one or more embodiments of this specification, to further improve the accuracy of the task prediction model, the user can obtain historical power load time-series data, divide the historical power load time-series data into two parts, use the first part of the historical power load time-series data as the input power load time-series data, and use the second part as the target power load prediction time-series data. When the task prediction model of the end-side device outputs the predicted power load time-series data based on the power load time-series data, the user can compare the target power load prediction time-series data with the power load prediction time-series data, so that the end-side device can adjust the model parameters based on the comparison results.
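
The division of historical data into an input part and a target part can be sketched as follows. The split point, the function name `split_history`, and the sample values are assumptions for illustration only.

```python
from typing import List, Tuple


def split_history(history: List[float],
                  input_len: int) -> Tuple[List[float], List[float]]:
    """Split a historical series into model input and target forecast.

    The first `input_len` points play the role of the input power load
    time-series data; the rest play the role of the target power load
    prediction time-series data.
    """
    if not 0 < input_len < len(history):
        raise ValueError("input_len must leave at least one point on each side")
    return history[:input_len], history[input_len:]


# Hypothetical historical load readings.
history = [float(v) for v in range(10)]
model_input, target = split_history(history, 7)
```

The target part can then be compared with the model's forecast for the same period, for example with a loss function, to drive the parameter adjustment described above.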

[0205] Specifically, the target power load forecast time series data refers to the forecast time series data that corresponds to the power load time series data input by the user at the front end; it is supplied through the front end so that it can be used to adjust the parameters of the task forecast model. For example, the target power load forecast time series data can be the latter part of the historical power load time series data, or it can be the comparison result obtained by the user comparing the target power load forecast time series data with the power load forecast time series data.

[0206] Based on the target power load forecast time-series data, the model parameters in the task prediction model are adjusted to obtain the adjusted task prediction model. Specifically, based on the user-input target power load forecast time-series data and the power load forecast time-series data generated by the task prediction model in the end-side equipment, the loss value is calculated. Based on the loss value, the network layer parameters in the data adjustment layer and data restoration layer of the task prediction model are adjusted. Finally, based on the adjusted data adjustment layer and data restoration layer, combined with other network layers in the task prediction model, the final task prediction model is generated.

[0207] In practical applications, there are many functions for calculating pre-training loss values, such as cross-entropy loss function, L1 norm loss function, maximum loss function, mean squared error loss function, log loss function, etc. The specific function to be selected depends on the actual situation, and the embodiments in this specification do not impose any limitations on this.

[0208] After the end-side device feeds back the power load forecast time series data to the front end and displays it to the user, it can also receive the target power load forecast time series data fed back by the user based on the displayed power load forecast time series data, and adjust the model parameters in the task forecast model according to the fed-back target power load forecast time series data.

[0209] The scheme implemented in this specification receives the target power load prediction time series data fed back by the user and adjusts the model parameters in the task prediction model based on that data, so that the prediction efficiency of the finally trained task prediction model is higher.

[0210] See Figure 6, which shows a flowchart of a data processing method for a task prediction model provided in one embodiment of this specification, applied to a cloud-side device. The method specifically includes the following steps:

[0211] Step 602: Obtain the sample set.

[0212] The sample set includes multiple sample data, each carrying a sample label.

[0213] Step 604: Input the sample data into the initial task prediction model, and obtain the prediction data through the processing of the initial task prediction model.

[0214] The initial task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the sample data meets the data processing requirements of the natural language pre-training model.

[0215] Step 606: Compare the predicted data and the sample labels to calculate the loss value.

[0216] Step 608: Based on the loss value, adjust the parameters of the network layers other than the natural language pre-trained model in the initial task prediction model, and return to the step of inputting the sample data into the initial task prediction model. If the training stopping condition is met, obtain the model parameters of the trained task prediction model.

[0217] Step 610: Send the model parameters of the trained task prediction model to the edge device.

[0218] In one possible implementation of this specification, the cloud-side device can obtain multiple sample data from a cloud database.

[0219] In another possible implementation of this specification, the cloud-side device can receive the sample set constructed and uploaded by the end-side device.

[0220] It should be noted that the specific implementation of steps 602 to 610 is the same as the training method for the task prediction model in the target task prediction method shown in Figure 4 above, so it will not be described again in this specification.

[0221] Furthermore, after the cloud-side device trains and obtains the model parameters of the task prediction model, it can send the model parameters of the task prediction model to multiple end-side devices that have established connections with the cloud-side device, so that the end-side devices can use the task prediction model to perform time series data prediction.
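
The hand-off of trained parameters from the cloud-side device to several end-side devices can be sketched as below. The JSON payload, the parameter names, and the `EndSideDevice` class are all hypothetical; the specification does not state a transport format or which parameters are included.

```python
import json
from typing import Dict, List


def export_parameters(params: Dict[str, List[float]]) -> str:
    """Cloud side: serialize trained model parameters into a text payload."""
    return json.dumps(params, sort_keys=True)


class EndSideDevice:
    """End side: holds whatever parameters the cloud last sent."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.params: Dict[str, List[float]] = {}

    def receive(self, payload: str) -> None:
        self.params = json.loads(payload)


# Hypothetical trained parameters for the two added layers.
trained = {
    "data_adjustment.weight": [0.12, -0.03],
    "data_restoration.weight": [1.5, 0.25],
}
payload = export_parameters(trained)

devices = [EndSideDevice("device-a"), EndSideDevice("device-b")]
for device in devices:
    device.receive(payload)  # every connected device gets the same parameters
```

Serializing once and sending the same payload to each connected device keeps all end-side copies of the task prediction model identical.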

[0222] See Figure 7, which shows a model structure diagram of a task prediction model for a target task prediction method according to an embodiment of this specification. As shown in Figure 7, the input time-series data is processed through an embedding layer, which includes input embeddings and positional embeddings. Specifically, the input data is processed through the embedding layer to obtain embedding vectors, and positional embeddings are applied to these vectors. The result from the embedding layer is then fed into a data adjustment layer, which includes a reversible normalization unit, a patchwise splitting unit, and a mapping unit (MLP). The output from the data adjustment layer is then fed into a decoder, which includes an attention layer and a fully connected layer (feedforward layer). The attention layer includes a multi-head attention layer, residual connections, and layer normalization (add & layer normalization). Specifically, the data output from the embedding layer is sequentially fed into the multi-head attention layer, residual connections, and layer normalization. The output from the attention layer is then fed into the fully connected layer, which includes a feedforward layer, residual connections, and layer normalization (add & layer normalization). Specifically, the output of the attention layer is sequentially fed into the feedforward layer, residual connection, and layer normalization. The output of the fully connected layer is fed into the data restoration layer, which includes a mapping and restoration unit, a merging unit, and a second normalization unit. The output of the data restoration layer is then fed into the output layer to obtain the prediction result.

[0223] The following, in conjunction with Figure 8, takes the application of the target task prediction method provided in this specification to power load prediction as an example to further explain the target task prediction method. Figure 8 shows a flowchart of a target task prediction method according to an embodiment of this specification, which specifically includes the following steps:

[0224] The edge device acquires the sample data input by the user at the front end, constructs a sample set based on the sample data, and sends the sample set to the cloud-side device. The sample data carries sample labels.

[0225] After acquiring the sample set, the cloud-side device trains the initial task prediction model based on the sample data and the sample labels it carries, obtains the trained task prediction model, and sends the model parameters of the task prediction model to the edge device.

[0226] The end-side device acquires the power load time-series data input by the user at the front end, inputs the power load time-series data into the task prediction model, obtains the power load prediction time-series data, and displays it to the user through the front end.

[0227] The scheme implemented in this specification involves acquiring sample data input by the user at the front end, feeding the sample data into an initial task prediction model for training, obtaining a trained task prediction model, sending the model parameters of the task prediction model to the edge device, and using the model parameters sent by the cloud device to predict the power load time series data input by the user at the front end, thereby obtaining the power load prediction time series data. This data is then displayed to the user at the front end. Through the interaction between the edge device and the cloud device, the training of the task prediction model and the prediction of the power load time series data input by the user based on the task prediction model are realized, thus obtaining the power load prediction time series data.

[0228] Corresponding to the target task prediction method embodiments shown in Figure 4 above, this specification also provides target task prediction device embodiments. Figure 9 shows a schematic diagram of the structure of a target task prediction device according to one embodiment of this specification. As shown in Figure 9, the device includes:

[0229] The first acquisition module 902 is configured to acquire the timing data of the target task.

[0230] The first obtaining module 904 is configured to input the target task time series data into the task prediction model, and obtain the target task prediction time series data through the processing of the task prediction model. The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the target task time series data meets the data processing requirements of the natural language pre-training model.

[0231] Optionally, the target task prediction device includes a task prediction model, which includes an embedding layer, a data adjustment layer, a decoder, and a data restoration layer.

[0232] Optionally, the first obtaining module 904 is further configured to input the target task time series data into the embedding layer to obtain initial time series feature data; input the initial time series feature data into the data adjustment layer to obtain multiple sub-data feature vectors; input the multiple sub-data feature vectors into the decoder to obtain multiple predicted sub-data feature vectors; and input the multiple predicted sub-data feature vectors into the data restoration layer to obtain target task predicted time series data.

[0233] Optionally, the data adjustment layer includes a splitting unit and a mapping unit; the first obtaining module 906 is further configured to input initial time-series feature data into the splitting unit to obtain multiple split sub-data feature information; and input the multiple split sub-data feature information into the mapping unit to obtain the split sub-data feature vector corresponding to each split sub-data feature information.

[0234] Optionally, the first obtaining module 906 is further configured to input the initial time series feature data into the splitting unit, split the initial time series feature data according to preset splitting parameters, and obtain multiple split sub-data feature information.

[0235] Optionally, the data restoration layer includes a mapping restoration unit and a merging unit; the first obtaining module 910 is further configured to input multiple predicted sub-data feature vectors into the mapping restoration unit to obtain the predicted sub-data feature information corresponding to each predicted sub-data feature vector; and input the multiple predicted sub-data feature information into the merging unit to obtain the target task prediction time series data.

[0236] The above is an illustrative scheme of a target task prediction device according to this embodiment. It should be noted that the technical solution of this target task prediction device belongs to the same concept as the technical solution of the target task prediction method shown in Figure 4 above. For details not described in the technical solution of the target task prediction device, please refer to the description of the technical solution of the target task prediction method shown in Figure 4 above.

[0237] Corresponding to the power load forecasting method embodiments shown in Figure 5 above, this specification also provides power load forecasting device embodiments. Figure 10 shows a schematic diagram of the structure of a power load forecasting device according to one embodiment of this specification. As shown in Figure 10, the device includes:

[0238] The second acquisition module 1002 is configured to acquire the power load time-series data input by the user at the front end;

[0239] The second acquisition module 1004 is configured to input the power load time series data into the task prediction model, and obtain power load prediction time series data through the processing of the task prediction model. The task prediction model is generated by adjusting the natural language pre-training model. The adjustment refers to adding a data adjustment layer to the natural language pre-training model so that the power load time series data meets the data processing requirements of the natural language pre-training model.

[0240] Feedback module 1006 is configured to feed back the power load forecast time series data to the front-end display.

[0241] Optionally, the device further includes: a user feedback module, configured to receive target power load forecast time series data fed back by users; and to adjust the model parameters in the task forecast model based on the target power load forecast time series data to obtain an adjusted task forecast model.

[0242] The above is a schematic scheme of a power load forecasting device according to this embodiment. It should be noted that the technical solution of this power load forecasting device belongs to the same concept as the technical solution of the power load forecasting method shown in Figure 5 above. For details not described in the technical solution of the power load forecasting device, refer to the description of the technical solution of the power load forecasting method shown in Figure 5 above.
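As a rough illustration of what the data adjustment layer and the data restoration layer do around the decoder, the following Python sketch splits a load series into fixed-length sub-series, passes them through a stand-in decoder, and merges the result back into one series. It is only a sketch under stated assumptions: the names `split_series`, `merge_patches`, `persistence_decoder`, and `predict`, the fixed patch length, and the persistence-style decoder are hypothetical and are not taken from this specification. The embedding layer and the mapping and mapping restoration units are omitted (treated as identity) to keep the example short.

```python
from typing import Callable, List

Patch = List[float]


def split_series(series: List[float], patch_len: int) -> List[Patch]:
    """Splitting step: cut the series into consecutive patches of patch_len values."""
    if patch_len <= 0:
        raise ValueError("patch_len must be positive")
    if len(series) % patch_len != 0:
        raise ValueError("series length must be a multiple of patch_len")
    return [series[i:i + patch_len] for i in range(0, len(series), patch_len)]


def merge_patches(patches: List[Patch]) -> List[float]:
    """Merging step: concatenate the patches back into a single series."""
    return [value for patch in patches for value in patch]


def persistence_decoder(patches: List[Patch]) -> List[Patch]:
    """Hypothetical stand-in for the pre-trained decoder.

    It simply repeats the last observed patch once per input patch.
    A real decoder would be the pre-trained language model.
    """
    return [list(patches[-1]) for _ in patches]


def predict(
    series: List[float],
    patch_len: int,
    decoder: Callable[[List[Patch]], List[Patch]] = persistence_decoder,
) -> List[float]:
    """Split the input, run the decoder on the patches, and merge the predictions."""
    patches = split_series(series, patch_len)
    predicted_patches = decoder(patches)
    return merge_patches(predicted_patches)


if __name__ == "__main__":
    hourly_load = [310.0, 305.0, 420.0, 455.0]
    print(predict(hourly_load, patch_len=2))
```

Passing the decoder as a parameter keeps the splitting and merging logic independent of whichever pre-trained model is plugged in.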

[0243] Corresponding to the embodiment of the data processing method of the task prediction model shown in Figure 6 above, this specification also provides an embodiment of a data processing apparatus for the task prediction model. Figure 11 shows a schematic structural diagram of a data processing apparatus for a task prediction model according to one embodiment of this specification. As shown in Figure 11, the apparatus includes:

[0244] The sample set acquisition module 1102 is configured to acquire a sample set, wherein the sample set includes multiple sample data, and the multiple sample data carry sample labels;

[0245] The third acquisition module 1104 is configured to input the sample data into the initial task prediction model and to obtain prediction data through the processing of the initial task prediction model. The initial task prediction model is generated by adjusting a natural language pre-trained model, where the adjustment refers to adding a data adjustment layer to the natural language pre-trained model so that the sample data meets the data processing requirements of the natural language pre-trained model.

[0246] The calculation module 1106 is configured to compare the prediction data with the sample labels and calculate a loss value;

[0247] The adjustment module 1108 is configured to adjust, according to the loss value, the parameters of the network layers in the initial task prediction model other than the natural language pre-trained model, and to return to the step of inputting the sample data into the embedding layer of the initial task prediction model to obtain the initial time-series feature data; when the training stop condition is reached, the model parameters of the trained task prediction model are obtained.

[0248] The sending module 1110 is configured to send the model parameters of the trained task prediction model to the end-side device.

[0249] The above is a schematic scheme of a data processing apparatus for a task prediction model according to this embodiment. It should be noted that the technical solution of this data processing apparatus belongs to the same concept as the technical solution of the data processing method of the task prediction model shown in Figure 6 above. For details not described in the technical solution of the data processing apparatus for the task prediction model, refer to the description of the technical solution of the data processing method of the task prediction model shown in Figure 6 above.
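The training behaviour of modules 1102 to 1110 (compute a loss against the sample labels, update only the layers outside the pre-trained model, repeat until a stop condition, then hand over the parameters) can be sketched with a toy scalar model. This is a minimal sketch, not the patented implementation: the function name `train_adjustment_layers`, the scalar weights `w_in` and `w_out`, the constant `backbone_gain` standing in for the frozen pre-trained model, the mean squared error loss, and the learning rate are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def train_adjustment_layers(
    samples: List[float],
    labels: List[float],
    lr: float = 0.05,
    max_epochs: int = 500,
    tol: float = 1e-8,
) -> Tuple[Dict[str, float], Dict[str, float], float]:
    """Gradient descent that updates only the added layers.

    Toy model: prediction = w_out * (backbone_gain * (w_in * x)).
    backbone_gain stands in for the pre-trained model and is never updated.
    """
    frozen = {"backbone_gain": 2.0}          # pre-trained part, kept fixed
    trainable = {"w_in": 0.1, "w_out": 0.1}  # added layers, updated by training
    n = len(samples)
    loss = float("inf")

    for _ in range(max_epochs):
        grad_in = 0.0
        grad_out = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            hidden = trainable["w_in"] * x                # input-side added layer
            backbone_out = frozen["backbone_gain"] * hidden  # frozen model
            pred = trainable["w_out"] * backbone_out      # output-side added layer
            err = pred - y
            loss += err * err
            grad_out += 2.0 * err * backbone_out
            grad_in += 2.0 * err * trainable["w_out"] * frozen["backbone_gain"] * x
        loss /= n

        # Training stop condition (assumed): mean squared error below tol.
        if loss < tol:
            break

        # Only the trainable parameters receive an update.
        trainable["w_in"] -= lr * grad_in / n
        trainable["w_out"] -= lr * grad_out / n

    # The returned parameters are what would be sent to the end-side device.
    return frozen, trainable, loss


if __name__ == "__main__":
    xs = [0.5, 1.0, 1.5]
    ys = [3.0 * x for x in xs]
    frozen, trainable, final_loss = train_adjustment_layers(xs, ys)
    print(frozen, trainable, final_loss)
```

Keeping the frozen and trainable parameters in separate dictionaries makes it explicit that the update step never touches the pre-trained part.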

[0250] Figure 12 shows a structural block diagram of a computing device according to one embodiment of this specification. The components of the computing device 1200 include, but are not limited to, a memory 1210 and a processor 1220. The processor 1220 is connected to the memory 1210 via a bus 1230, and a database 1250 is used to store data.

[0251] The computing device 1200 also includes an access device 1240, which enables the computing device 1200 to communicate via one or more networks 1260. Examples of these networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 1240 may include one or more of any type of wired or wireless network interface (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) interface, a WiMAX (Worldwide Interoperability for Microwave Access) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so on.

[0252] In one embodiment of this specification, the aforementioned components of the computing device 1200, as well as other components not shown in Figure 12, can also be connected to each other, for example via a bus. It should be understood that the structural block diagram of the computing device shown in Figure 12 is for illustrative purposes only and is not intended to limit the scope of this specification. Those skilled in the art can add or replace other components as needed.

[0253] The computing device 1200 can be any type of stationary or mobile computing device, including mobile computers or mobile computing devices (e.g., tablet computers, personal digital assistants, laptop computers, notebook computers, netbooks, etc.), mobile phones (e.g., smartphones), wearable computing devices (e.g., smartwatches, smart glasses, etc.) or other types of mobile devices, or stationary computing devices such as desktop computers or personal computers (PCs). The computing device 1200 can also be a mobile or stationary server.

[0254] The processor 1220 is configured to execute computer-executable instructions which, when executed by the processor, implement the steps of the above-mentioned target task prediction method, power load forecasting method, or data processing method of the task prediction model.

[0255] The above is an illustrative scheme of a computing device according to this embodiment. It should be noted that the technical solution of this computing device belongs to the same concept as the technical solution of the target task prediction method, power load forecasting method, or data processing method of the task prediction model described above. For details not described in the technical solution of the computing device, refer to the description of the technical solution of the target task prediction method, power load forecasting method, or data processing method of the task prediction model described above.

[0256] An embodiment of this specification also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the aforementioned target task prediction method, power load forecasting method, or data processing method of the task prediction model.

[0257] The above is an illustrative scheme of a computer-readable storage medium according to this embodiment. It should be noted that the technical solution of this storage medium belongs to the same concept as the technical solution of the target task prediction method, power load forecasting method, or data processing method of the task prediction model described above. For details not described in the technical solution of the storage medium, refer to the description of the technical solution of the target task prediction method, power load forecasting method, or data processing method of the task prediction model described above.

[0258] An embodiment of this specification also provides a computer program which, when executed in a computer, causes the computer to perform the steps of the target task prediction method, the power load forecasting method, or the data processing method of the task prediction model described above.

[0259] The above is an illustrative scheme of a computer program according to this embodiment. It should be noted that the technical solution of this computer program belongs to the same concept as the technical solution of the target task prediction method, power load forecasting method, or data processing method of the task prediction model described above. For details not described in the technical solution of the computer program, refer to the description of the technical solution of the target task prediction method, power load forecasting method, or data processing method of the task prediction model described above.

[0260] The foregoing has described specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in a different order than that shown in the embodiments and may still achieve the desired result. Furthermore, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired result. In some embodiments, multitasking and parallel processing are possible or may be advantageous.

[0261] The computer instructions include computer program code, which may be in the form of source code, object code, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, such as a recording medium, a USB flash drive, a portable hard drive, a magnetic disk, an optical disk, computer memory, read-only memory (ROM), random access memory (RAM), an electrical carrier signal, a telecommunication signal, or a software distribution medium.

[0262] It should be noted that, for the sake of simplicity, the foregoing method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments in this specification are not limited to the described order of actions, because according to the embodiments in this specification, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily essential to the embodiments in this specification.

[0263] In the above embodiments, the descriptions of each embodiment have different focuses. For parts not described in detail in a certain embodiment, please refer to the relevant descriptions of other embodiments.

[0264] The preferred embodiments disclosed above are merely illustrative of this specification. The optional embodiments do not exhaustively describe all details, nor do they limit the invention to the specific implementations described. Clearly, many modifications and variations can be made based on the embodiments described herein. These embodiments are selected and specifically described in this specification to better explain the principles and practical applications of the embodiments, thereby enabling those skilled in the art to better understand and utilize this specification. This specification is limited only by the claims and their full scope and equivalents.

Claims

1. A target task prediction method, applied to an end-side device, comprising: acquiring target task time-series data, wherein the target task time-series data includes power load time-series data; and inputting the target task time-series data into a task prediction model and obtaining target task prediction time-series data through the processing of the task prediction model, wherein the target task prediction time-series data includes power load prediction time-series data; wherein the task prediction model is generated by adjusting a natural language pre-trained model, the adjustment referring to adding an embedding layer, a data adjustment layer, a decoder, and a data restoration layer to the natural language pre-trained model so that the target task time-series data meets the data processing requirements of the natural language pre-trained model; and wherein the adjustment includes: inputting the target task time-series data into the embedding layer to obtain initial time-series feature data; inputting the initial time-series feature data into the data adjustment layer to obtain multiple split sub-data feature vectors; inputting the multiple split sub-data feature vectors into the decoder to obtain multiple predicted split sub-data feature vectors; and inputting the multiple predicted split sub-data feature vectors into the data restoration layer to obtain the target task prediction time-series data.

2. The method according to claim 1, wherein the data adjustment layer comprises a splitting unit and a mapping unit; and inputting the initial time-series feature data into the data adjustment layer to obtain multiple split sub-data feature vectors includes: inputting the initial time-series feature data into the splitting unit to obtain multiple pieces of split sub-data feature information; and inputting the multiple pieces of split sub-data feature information into the mapping unit to obtain the split sub-data feature vector corresponding to each piece of split sub-data feature information.

3. The method according to claim 2, wherein inputting the initial time-series feature data into the splitting unit to obtain multiple pieces of split sub-data feature information includes: inputting the initial time-series feature data into the splitting unit, and splitting the initial time-series feature data according to preset splitting parameters to obtain the multiple pieces of split sub-data feature information.

4. The method according to claim 1, wherein the data restoration layer comprises a mapping restoration unit and a merging unit; and inputting the multiple predicted split sub-data feature vectors into the data restoration layer to obtain the target task prediction time-series data includes: inputting the multiple predicted split sub-data feature vectors into the mapping restoration unit to obtain the predicted split sub-data feature information corresponding to each predicted split sub-data feature vector; and inputting the multiple pieces of predicted split sub-data feature information into the merging unit to obtain the target task prediction time-series data.

5. A power load forecasting method, applied to an end-side device, comprising: obtaining power load time-series data input by a user at the front end; inputting the power load time-series data into a task prediction model and obtaining power load prediction time-series data through the processing of the task prediction model; and feeding back the power load prediction time-series data to the front end for display; wherein the task prediction model is generated by adjusting a natural language pre-trained model, the adjustment referring to adding an embedding layer, a data adjustment layer, a decoder, and a data restoration layer to the natural language pre-trained model so that the power load time-series data meets the data processing requirements of the natural language pre-trained model; and wherein the adjustment includes: inputting the power load time-series data into the embedding layer to obtain initial time-series feature data; inputting the initial time-series feature data into the data adjustment layer to obtain multiple sub-data feature vectors; inputting the multiple sub-data feature vectors into the decoder to obtain multiple predicted sub-data feature vectors; and inputting the multiple predicted sub-data feature vectors into the data restoration layer to obtain the power load prediction time-series data.

6. The method according to claim 5, further comprising, after feeding back the power load prediction time-series data to the front end for display: receiving target power load prediction time-series data fed back by the user; and adjusting, based on the target power load prediction time-series data, the model parameters of the task prediction model to obtain an adjusted task prediction model.

7. A data processing method for a task prediction model, applied to a cloud-side device, comprising: obtaining a sample set, wherein the sample set includes multiple pieces of sample data, the sample data carry sample labels, and the sample data includes power load time-series data; inputting the sample data into an initial task prediction model and obtaining prediction data through the processing of the initial task prediction model, wherein the initial task prediction model is generated by adjusting a natural language pre-trained model, the adjustment referring to adding an embedding layer, a data adjustment layer, a decoder, and a data restoration layer to the natural language pre-trained model so that the sample data meets the data processing requirements of the natural language pre-trained model, and the adjustment includes: inputting the sample data into the embedding layer to obtain initial time-series feature data, inputting the initial time-series feature data into the data adjustment layer to obtain multiple split sub-data feature vectors, inputting the multiple split sub-data feature vectors into the decoder to obtain multiple predicted split sub-data feature vectors, and inputting the multiple predicted split sub-data feature vectors into the data restoration layer to obtain the prediction data; comparing the prediction data with the sample labels to calculate a loss value; adjusting, based on the loss value, the parameters of the network layers in the initial task prediction model other than the natural language pre-trained model, and returning to the step of inputting the sample data into the initial task prediction model until a training stop condition is met, whereupon the model parameters of the trained task prediction model are obtained; and sending the model parameters of the trained task prediction model to an end-side device.

8. A data processing system for a task prediction model, comprising: an end-side device configured to construct a sample set and send the sample set to a cloud-side device, wherein the sample set includes multiple pieces of sample data, the sample data carry sample labels, and the sample data includes power load time-series data; and the cloud-side device, configured to: input the sample data in the sample set into an initial task prediction model and obtain prediction data through the processing of the initial task prediction model, wherein the initial task prediction model is generated by adjusting a natural language pre-trained model, the adjustment referring to adding an embedding layer, a data adjustment layer, a decoder, and a data restoration layer to the natural language pre-trained model so that the sample data meets the data processing requirements of the natural language pre-trained model, and the adjustment includes: inputting the sample data into the embedding layer to obtain initial time-series feature data, inputting the initial time-series feature data into the data adjustment layer to obtain multiple sub-data feature vectors, inputting the multiple sub-data feature vectors into the decoder to obtain multiple predicted sub-data feature vectors, and inputting the multiple predicted sub-data feature vectors into the data restoration layer to obtain the prediction data; compare the prediction data with the sample labels to calculate a loss value; adjust, based on the loss value, the parameters of the network layers in the initial task prediction model other than the natural language pre-trained model, and return to the step of inputting the sample data into the initial task prediction model until a training stop condition is met, whereupon the model parameters of the trained task prediction model are obtained; and send the model parameters of the trained task prediction model to the end-side device.

9. A computing device, comprising a memory and a processor, wherein the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions, which, when executed by the processor, implement the steps of the method according to any one of claims 1 to 4, 5 to 6, or 7.

10. A computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the method according to any one of claims 1 to 4, 5 to 6, or 7.