A supply chain demand prediction method based on Transformer

The method introduces lead time parameters and promotional events into supply chain demand forecasting, constructs a business timeline through time warping and resampling, and combines an improved Pyraformer model for cross-scale dependency modeling with stockout correction. This addresses the inconsistent time dependencies and improper stockout handling of existing technologies and yields more stable and reliable demand forecasts.

CN122199042A · Pending · Publication Date: 2026-06-12 · ZHENGXING (TIANJIN) SUPPLY CHAIN TECHNOLOGY CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Application (China)
Current Assignee / Owner: ZHENGXING (TIANJIN) SUPPLY CHAIN TECHNOLOGY CO LTD
Filing Date: 2026-03-12
Publication Date: 2026-06-12

Abstract

The application discloses a supply chain demand prediction method based on a Transformer, comprising the following steps: S1, collecting historical sales volume, inventory, arrival events, promotion events and lead time parameters of a target commodity at a target sales node to construct a multivariate time series; S2, generating an availability label sequence; S3, performing time warping and resampling on the time series to generate a business time axis; S4, performing multi-scale segmentation and weighted aggregation in combination with the availability label to form a pyramid level window Token sequence; S5, constructing a cross-scale dependency modeling structure based on the lead time and event lag constraint in an improved Pyraformer model to generate a cross-scale context representation; S6, performing truncation correction on the demand representation; and S7, outputting a predicted window cumulative demand sequence corresponding to the lead time. The application improves the consistency and reliability of the demand prediction result in replenishment decision-making.

Description

Technical Field

[0001] This invention relates to the fields of deep learning and supply chain demand forecasting technology, and in particular to a Transformer-based supply chain demand forecasting method.

Background Technology

[0002] With the diversification of retail channels and the increasing complexity of supply chain networks, demand forecasting technologies for inventory control, replenishment decisions, and resource allocation have become an important research direction in the field of supply chain management. Existing demand forecasting methods typically build forecasting models based on historical sales data, combined with inventory levels, promotional information, and time characteristics. Some solutions introduce statistical analysis methods or machine learning models to improve forecasting accuracy. In recent years, with the improvement of computing power and the expansion of data scale, time series forecasting methods based on deep learning have been gradually applied to supply chain demand forecasting scenarios. Among them, models represented by recurrent neural networks, attention mechanisms, and Transformer structures are used to characterize the complex dependencies of demand changes over time.

[0003] However, in real-world supply chain operations, demand data is often influenced by a combination of business factors, including inventory constraints, replenishment lead times, and promotional events. Existing technologies still have significant shortcomings in the modeling process. On the one hand, most methods directly model sales sequences on the natural timeline without structuring business factors such as replenishment lead times and arrival events that cause shifts in demand response time. This can easily lead to inconsistencies between the time dependencies learned by the model and the actual business rhythm, thus affecting the usability of the prediction results in replenishment decision-making scenarios. On the other hand, while existing multi-scale time series modeling methods can extract features across different time spans, the scale division is usually fixed or empirically set, lacking a clear correlation with key business parameters such as lead times. This makes it difficult for multi-scale features to reflect the true time structure of supply chain operations.

[0004] Furthermore, under conditions of limited inventory or stockouts, historical sales sequences often fail to accurately reflect market demand levels. Existing methods often treat zero sales as zero demand, lacking effective mechanisms to distinguish and correct for stockout-related disruptions, which can easily introduce systematic biases during model training and prediction. Simultaneously, most existing Transformer-based models employ common attention-based computation methods to model the correlation between features at different time steps or scales, failing to constrain the cross-scale dependency establishment process by incorporating lead time and event lag relationships. This can easily introduce dependency connections that do not conform to business causal relationships, affecting model stability and predictive reliability.

[0005] Therefore, how to provide a Transformer-based supply chain demand forecasting method is a problem that urgently needs to be solved by those skilled in the art.

Summary of the Invention

[0006] One objective of this invention is to propose a Transformer-based supply chain demand forecasting method. The invention introduces a lead time parameter and incorporates arrival and promotional events to time-warp and resample the demand data, constructing a business timeline consistent with the supply chain's operational rhythm. On this basis, a pyramid-level feature representation is generated using multi-scale windows associated with the lead time. Furthermore, by introducing a cross-scale time interval constraint structure into the improved Pyraformer model, the dependencies between multi-scale features are constrained, and availability tags are used to perform truncation correction on demand at stockout points. Finally, the cumulative demand forecast corresponding to the lead time is output, improving the consistency and reliability of demand forecasting in replenishment decisions.

[0007] A supply chain demand forecasting method based on Transformer according to an embodiment of the present invention includes the following steps:

[0008] S1. Collect the historical sales volume sequence, inventory sequence, arrival event sequence, promotion event sequence, and lead time parameters of the target product at the target sales node, and align them according to the preset time granularity to form a multivariate time series;

[0009] S2. Generate an availability tag sequence based on the inventory sequence and the arrival event sequence;

[0010] S3. Generate a time warp sequence based on the lead time parameter, the arrival event sequence, and the promotion event sequence, and resample the multivariate time series according to the time warp sequence to generate a business timeline sequence;

[0011] S4. Perform segmentation processing on the business timeline sequence according to multiple scale windows, and perform weighted aggregation processing on the multivariate data in each scale window based on the availability tag sequence to generate a pyramid-level window token sequence.

[0012] S5. Using the improved Pyraformer model, based on the lead time parameter and the preset event lag interval, a cross-scale dependency modeling structure is constructed between the pyramid-level window token sequences to limit the information interaction direction and information interaction timing between window tokens of different scales, and to generate cross-scale context representations.

[0013] S6. Perform truncation correction processing based on the cross-scale context representation and the availability label sequence to generate a de-truncated demand representation sequence, and generate a prediction window demand representation based on the de-truncated demand representation sequence.

[0014] S7. Perform output calculation on the forecast window demand representation and output the forecast window cumulative demand sequence corresponding to the lead time parameter.

[0015] Optionally, S1 includes:

[0016] S11. Obtain the historical sales records of the target product within a continuous historical period at the target sales node;

[0017] S12. Obtain the inventory records corresponding to the historical sales records, where each inventory record comprises the beginning inventory and ending inventory at a time point, and form an inventory change sequence from the differences between the beginning inventory and ending inventory at adjacent time points;

[0018] S13. Obtain the arrival event record corresponding to the target product. The arrival event record includes the arrival time and the quantity of goods received. Map the arrival time to the time point corresponding to the preset time granularity to form an arrival event sequence.

[0019] S14. Obtain the promotional event record corresponding to the target product. The promotional event record includes the promotion start time and the promotion end time. Map the promotion start time and the promotion end time to the time interval corresponding to the preset time granularity to form a promotional event sequence.

[0020] S15. Obtain the lead time parameter of the target product at the target sales node, wherein the lead time parameter is the time interval from the issuance of the replenishment instruction to the arrival of the product at the target sales node;

[0021] S16. Based on the preset time granularity, perform time alignment processing on the historical sales records, the inventory sequence, the arrival event sequence, the promotion event sequence, and the lead time parameter to generate a multivariate time series arranged by a unified time index.

[0022] Optionally, S2 includes:

[0023] S21. Based on the inventory sequence, obtain the beginning inventory and ending inventory of the target product at each time point;

[0024] S22. At each point in time, the beginning inventory and the ending inventory are correlated with the historical sales sequence to determine whether the point in time meets the condition that the inventory is zero and there is sales demand.

[0025] S23. When the time point meets the condition that the inventory is zero and there is sales demand, the time point is marked as unsellable, and the time points other than the time points that meet the condition are marked as sellable.

[0026] S24. According to the time index order, encode the sellable and unsellable statuses corresponding to each time point into binary tags to form an availability tag sequence.

[0027] Optionally, S3 includes:

[0028] S31. Based on the lead time parameter, determine the demand response offset corresponding to each time point, wherein the demand response offset is the discretization result of the lead time parameter in the time dimension.

[0029] S32. Based on the arrival event sequence, identify the arrival time point corresponding to each arrival event, and use the arrival time point as a key time anchor point on the business timeline;

[0030] S33. Based on the promotion event sequence, determine the promotion start time and promotion end time corresponding to each promotion event, and use the promotion start time and promotion end time as event boundaries on the business timeline;

[0031] S34. Combining the demand response offset, the key time anchor point and the event boundary, perform non-uniform mapping processing on the original time index to generate a time warp sequence that corresponds one-to-one with the natural time index.

[0032] S35. Based on the time warp sequence, rearrange the multivariate time series according to the time index order, and perform data alignment processing on the multivariate time series at adjacent indices after time warp to generate the business time axis sequence.

[0033] Optionally, S4 includes:

[0034] S41. On the business timeline sequence, a set of scale windows is determined based on the lead time parameter. The scale window set includes a first scale window and one or more second scale windows. The time span of the first scale window is equal to the time span corresponding to the lead time parameter. The time span of each second scale window is an integer multiple or an integer fraction of the time span corresponding to the lead time parameter. Each scale window is arranged in time index order on the business timeline sequence.

[0035] S42. For each scale window in the business timeline sequence, extract multivariate time series samples within the time interval covered by the scale window to form a data subsequence that corresponds one-to-one with the scale window.

[0036] S43. Within each scale window, based on the availability label sequence, assign sample weights to each time point sample in the data subsequence. The time point sample in the available state corresponds to the first weight, and the time point sample in the unavailable state corresponds to the second weight, and the first weight is greater than the second weight.

[0037] S44. According to the time index order, perform weighted aggregation processing on the data subsequence after allocating sample weights to generate a window feature representation corresponding to the scale window, and calculate the ratio of the number of time points in the saleable state within the scale window to the total number of time points covered by the scale window, as the numerical window attribute of the window feature representation.

[0038] S45. Organize the window feature representations and their numerical window attributes corresponding to each scale window according to the scale hierarchy to form the pyramid-level window token sequence.

[0039] Optionally, S5 includes:

[0040] S51. In the improved Pyraformer model, based on the pyramid-level window token sequence, a corresponding time coverage interval and scale level identifier are determined for each window token, forming interval identifier information that corresponds one-to-one with the window token.

[0041] S52. Based on the lead time parameter and the preset event lag interval, a cross-scale time interval constraint table is generated in the improved Pyraformer model. The cross-scale time interval constraint table includes the upper and lower limits of the interval offset between the time coverage intervals of window tokens at different scale levels.

[0042] S53. Based on the cross-scale time interval constraint table, filter the time coverage intervals of window tokens at different scale levels to form a cross-scale interval connection set that satisfies the upper and lower limits of the interval offset constraint.

[0043] S54. Within the connection relationships defined by the cross-scale interval connection set, cross-scale information fusion processing is performed on the window tokens participating in the connection in the improved Pyraformer model according to the time index order to generate cross-scale context features corresponding to each window token.

[0044] S55. Combine the cross-scale context features with the scale level identifier of the corresponding window token and the numerical window attributes to form a cross-scale context representation.

[0045] Optionally, S6 includes:

[0046] S61. Based on the cross-scale context representation, calculate the corresponding intermediate feature sequence of demand for each time point in the order of time index;

[0047] S62. Based on the availability tag sequence, the time points in the intermediate feature sequence of demand are distinguished by state to form a set of sellable time points and a set of unsellable time points.

[0048] S63. Perform truncation correction processing on the intermediate demand features corresponding to the set of unsellable time points. The truncation correction processing includes using the intermediate demand features corresponding to adjacent sellable time points as a reference to replace the values of the intermediate demand features corresponding to the unsellable time points.

[0049] S64. Merge the intermediate demand features after the truncation and correction process with the intermediate demand features corresponding to the set of available time points according to the time index order to form a continuous demand feature sequence.

[0050] S65. Based on the continuous demand feature sequence, the demand features are aggregated according to the future continuous time interval determined by the lead time parameter to form a forecast window demand representation.
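
As an illustrative sketch only, the truncation correction of S63 and the merge of S64 could be approximated as follows. The disclosure does not specify how the "adjacent" sellable time point is chosen or how its features are used; this sketch assumes scalar demand features, copies the value from the nearest sellable time point, and breaks ties toward the earlier point. The function name `correct_truncation` is hypothetical.

```python
def correct_truncation(features, tags):
    """Replace features at unsellable points (tag 0) with the feature of the
    nearest sellable point (tag 1). Assumption: scalar features, nearest
    neighbour by index distance, ties resolved toward the earlier point."""
    n = len(features)
    sellable = [t for t in range(n) if tags[t] == 1]
    if not sellable:
        # No reference point exists, so nothing can be corrected.
        return list(features)
    corrected = list(features)
    for t in range(n):
        if tags[t] == 0:
            nearest = min(sellable, key=lambda s: (abs(s - t), s))
            corrected[t] = features[nearest]
    return corrected


# Two stockout points between sellable points with features 5.0 and 8.0.
print(correct_truncation([5.0, 0.0, 0.0, 8.0, 7.0], [1, 0, 0, 1, 1]))
```

Because corrected and untouched values are written back at their original indices, the returned list is already the continuous demand feature sequence in time index order.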

[0051] Optionally, S7 includes:

[0052] S71. Based on the forecast window demand representation, determine the future continuous time interval determined by the lead time parameter. The future continuous time interval starts from the current time point, and its time span is equal to the time span corresponding to the lead time parameter.

[0053] S72. Within the future continuous time interval, extract the demand representation values corresponding to each time point in order of time index.

[0054] S73. The demand representation values within the future continuous time interval are accumulated according to the time index order to form a cumulative demand value sequence corresponding to the lead time parameter.

[0055] S74. Output the cumulative demand value sequence in time index order to form the cumulative demand sequence of the prediction window.
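
The accumulation in S72 to S74 amounts to a running sum over the lead-time window. A minimal sketch, assuming the per-time-point demand values for the future interval are already available as a list and the lead time is expressed as a number of time steps (the function name `cumulative_demand` is hypothetical):

```python
from itertools import accumulate


def cumulative_demand(predicted, lead_time_steps):
    """Running sum of predicted demand over the first `lead_time_steps`
    future time points, in time index order."""
    window = predicted[:lead_time_steps]
    return list(accumulate(window))


# Lead time of 3 steps: the last element is the total demand to cover
# until a replenishment order placed now would arrive.
print(cumulative_demand([3, 4, 2, 5], 3))
```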

[0056] The beneficial effects of this invention are:

[0057] This invention introduces lead time parameters, arrival events, and promotional events into the model's structural layer, performing time warping and resampling on the original multivariate time series to construct a business timeline consistent with the supply chain's operational rhythm. Based on this, demand data is segmented and weighted according to multi-scale windows associated with the lead time parameter, forming a pyramid-level window token sequence that carries the sellable-ratio attribute. This allows the multi-scale feature representation to reflect both demand variation characteristics and inventory constraint information. In the modeling stage, this invention introduces into the improved Pyraformer model a cross-scale time interval constraint mechanism based on the lead time parameter and the event lag interval. The constraint mechanism uses the time coverage interval of each window token as the dependency modeling object to explicitly limit the scope and temporal relationship of dependencies between features at different scales, avoiding cross-scale connections that do not conform to the causal logic of the supply chain. In the demand correction stage, the demand features corresponding to stockout time points are truncation-corrected using the availability tag sequence, and the continuous demand feature sequence is restored by replacing them with features from temporally adjacent sellable time points, reducing the systematic impact of inventory constraints on demand forecast results. In the output stage, demand features are aggregated and accumulated based on the future continuous time interval determined by the lead time parameter, and the cumulative demand sequence of the forecast window corresponding to the replenishment decision cycle is directly output. Through the above technical means, this invention achieves consistency of demand forecast results in time structure, business constraints, and output form, and improves the stability and usability of forecast results in actual supply chain replenishment and inventory decision scenarios.

Attached Figure Description

[0058] The accompanying drawings are provided to further illustrate the invention and form part of the specification. They are used in conjunction with embodiments of the invention to explain the invention and do not constitute a limitation thereof. In the drawings:

[0059] Figure 1 is a schematic diagram of the processing flow for constructing a business timeline based on the lead time parameter and generating a multi-scale window token sequence in this invention;

[0060] Figure 2 is a schematic diagram of the structure of the improved Pyraformer model in this invention, which performs cross-scale dependency modeling and truncation correction and outputs the cumulative demand sequence of the prediction window.

Detailed Implementation

[0061] The present invention will now be described in further detail with reference to the accompanying drawings. These drawings are simplified schematic diagrams, illustrating only the basic structure of the invention, and therefore only show the components relevant to the invention.

[0062] Referring to Figures 1-2, a Transformer-based supply chain demand forecasting method includes the following steps:

[0063] S1. Collect the historical sales volume sequence, inventory sequence, arrival event sequence, promotion event sequence, and lead time parameters of the target product at the target sales node, and align them according to the preset time granularity to form a multivariate time series;

[0064] S2. Generate an availability tag sequence based on the inventory sequence and the arrival event sequence;

[0065] S3. Generate a time warp sequence based on the lead time parameter, the arrival event sequence, and the promotion event sequence, and resample the multivariate time series according to the time warp sequence to generate a business timeline sequence;

[0066] S4. Perform segmentation processing on the business timeline sequence according to multiple scale windows, and perform weighted aggregation processing on the multivariate data in each scale window based on the availability tag sequence to generate a pyramid-level window token sequence.

[0067] S5. Using the improved Pyraformer model, based on the lead time parameter and the preset event lag interval, a cross-scale dependency modeling structure is constructed between the pyramid-level window token sequences to limit the information interaction direction and information interaction timing between window tokens of different scales, and to generate cross-scale context representations.

[0068] S6. Perform truncation correction processing based on the cross-scale context representation and the availability label sequence to generate a de-truncated demand representation sequence, and generate a prediction window demand representation based on the de-truncated demand representation sequence.

[0069] S7. Perform output calculation on the forecast window demand representation and output the forecast window cumulative demand sequence corresponding to the lead time parameter.

[0070] In this invention, the improved Pyraformer model is not a simple replacement of the original model structure, but rather, while maintaining its hierarchical temporal modeling framework, it introduces several structural improvements tailored to supply chain business scenarios. First, the model uses pyramid-level window tokens as basic input units, explicitly introducing time coverage intervals and scale hierarchy identifiers for each token. This expands the modeling objects within the model from single time points to interval-level representations with clear temporal semantics, structurally enhancing the distinguishability between multi-scale features. Second, by combining replenishment lead time parameters and event lag intervals, a cross-scale time interval constraint mechanism is constructed within the model. This imposes interval offset restrictions on the dependency relationships between window tokens of different scales, thereby structurally defining the direction and timing of cross-scale information interaction and avoiding dependency connections that do not conform to supply chain business logic. Furthermore, in the cross-scale information fusion stage, the model combines cross-scale context features with the scale hierarchy identifiers and numerical window attributes of the window tokens to form a unified contextual representation that includes demand change features and inventory constraint information. The above improvements enable the model to better align with the actual business characteristics of supply chain demand forecasting, such as lead time constraints, event-driven processes, and multi-scale coupling, while maintaining the original computational efficiency.
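
The cross-scale time interval constraint mechanism described above can be pictured as a filter over candidate token pairs. The sketch below is one possible reading, not the disclosed implementation: it assumes the interval offset is the querying token's start index minus the referenced token's end index, and that the bounds are `event_lag` (lower) and `event_lag + lead_time * (1 + level gap)` (upper). Both the offset definition and the bound formula are hypothetical, as are the names `build_constraint_table` and `connection_set`.

```python
def build_constraint_table(levels, lead_time, event_lag):
    """Map (query_level, key_level) -> (min_offset, max_offset).
    The bound formula is an assumption made for illustration."""
    table = {}
    for q in levels:
        for k in levels:
            if q != k:  # only cross-scale pairs are constrained here
                upper = event_lag + lead_time * (1 + abs(q - k))
                table[(q, k)] = (event_lag, upper)
    return table


def connection_set(tokens, table):
    """Return index pairs (i, j) where token i may attend to token j."""
    edges = set()
    for i, q in enumerate(tokens):
        for j, k in enumerate(tokens):
            bounds = table.get((q["level"], k["level"]))
            if bounds is None:
                continue
            offset = q["start"] - k["end"]  # assumed offset definition
            if bounds[0] <= offset <= bounds[1]:
                edges.add((i, j))
    return edges


tokens = [
    {"level": 0, "start": 0, "end": 3},
    {"level": 0, "start": 4, "end": 7},
    {"level": 1, "start": 0, "end": 7},
    {"level": 1, "start": 8, "end": 15},
]
table = build_constraint_table([0, 1], lead_time=4, event_lag=1)
print(sorted(connection_set(tokens, table)))
```

With a positive lower bound, a token can only draw on tokens whose coverage interval ended before its own began, which is one way to restrict the direction of information flow. In an attention-based model the resulting edge set would typically be converted into an attention mask.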

[0071] In this embodiment, S1 includes:

[0072] S11. Obtain the historical sales records of the target product within a continuous historical period at the target sales node;

[0073] S12. Obtain the inventory records corresponding to the historical sales records, where each inventory record comprises the beginning inventory and ending inventory at a time point, and form an inventory change sequence from the differences between the beginning inventory and ending inventory at adjacent time points;

[0074] S13. Obtain the arrival event record corresponding to the target product. The arrival event record includes the arrival time and the quantity of goods received. Map the arrival time to the time point corresponding to the preset time granularity to form an arrival event sequence.

[0075] S14. Obtain the promotional event record corresponding to the target product. The promotional event record includes the promotion start time and the promotion end time. Map the promotion start time and the promotion end time to the time interval corresponding to the preset time granularity to form a promotional event sequence.

[0076] S15. Obtain the lead time parameter of the target product at the target sales node, wherein the lead time parameter is the time interval from the issuance of the replenishment instruction to the arrival of the product at the target sales node;

[0077] S16. Based on the preset time granularity, perform time alignment processing on the historical sales records, the inventory sequence, the arrival event sequence, the promotion event sequence, and the lead time parameter to generate a multivariate time series arranged by a unified time index.
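
For illustration, the alignment in S16 can be sketched as building one row per time step on a shared index. The sketch assumes a daily granularity, dictionary inputs keyed by date, and promotions given as (start, end) date pairs; these input shapes and the function name `build_multivariate_series` are assumptions, not part of the disclosure.

```python
from datetime import date, timedelta


def build_multivariate_series(start, end, sales, inventory, arrivals,
                              promotions, lead_time_days):
    """Align all records on a daily index from `start` to `end` inclusive.
    sales/arrivals: {date: quantity}; inventory: {date: (begin, end)};
    promotions: list of (start_date, end_date) pairs."""
    rows = []
    day = start
    while day <= end:
        # Missing inventory stays None so it is not mistaken for a stockout.
        begin_inv, end_inv = inventory.get(day, (None, None))
        rows.append({
            "date": day,
            "sales": sales.get(day, 0),
            "begin_inventory": begin_inv,
            "end_inventory": end_inv,
            "arrival_qty": arrivals.get(day, 0),
            "promo": int(any(s <= day <= e for s, e in promotions)),
            "lead_time": lead_time_days,
        })
        day += timedelta(days=1)
    return rows


rows = build_multivariate_series(
    date(2024, 1, 1), date(2024, 1, 3),
    sales={date(2024, 1, 1): 5},
    inventory={date(2024, 1, 1): (10, 5)},
    arrivals={date(2024, 1, 2): 20},
    promotions=[(date(2024, 1, 2), date(2024, 1, 3))],
    lead_time_days=2,
)
print(rows[1])
```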

[0078] In this embodiment, S2 includes:

[0079] S21. Based on the inventory sequence, obtain the beginning inventory and ending inventory of the target product at each time point;

[0080] S22. At each point in time, the beginning inventory and the ending inventory are correlated with the historical sales sequence to determine whether the point in time meets the condition that the inventory is zero and there is sales demand.

[0081] S23. When the time point meets the condition that the inventory is zero and there is sales demand, the time point is marked as unsellable, and the time points other than the time points that meet the condition are marked as sellable.

[0082] S24. According to the time index order, encode the sellable and unsellable statuses corresponding to each time point into binary tags to form an availability tag sequence.

[0083] In this invention, the availability tag sequence reflects the actual salability status of goods at various points in time. This sequence is generated through a joint determination of inventory data and historical sales data, where inventory data reflects supply constraints and historical sales data reflects demand occurrences. By determining the combination relationship between inventory status and demand status point by point along the time dimension, points in time where inventory is zero and demand exists can be distinguished from other points in time, thus forming a stable and reproducible availability tag result. The generated availability tag sequence serves as the basic input for subsequent time series modeling and demand correction processing, enabling the model to distinguish whether zero sales are caused by insufficient demand or limited supply, ensuring that the predicted results are consistent with actual demand changes.
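
A minimal sketch of the point-by-point determination in S21 to S24 follows. The disclosure does not state how the presence of sales demand is detected at a time point, so the sketch takes it as a separate boolean input `demand_signal`, and it interprets "inventory is zero" as the ending inventory being zero. Both choices, and the function name `availability_tags`, are assumptions.

```python
def availability_tags(end_inventory, demand_signal):
    """Return 1 (sellable) or 0 (unsellable) per time point.
    A point is unsellable when ending inventory is zero AND demand exists."""
    tags = []
    for end_inv, has_demand in zip(end_inventory, demand_signal):
        stocked_out = (end_inv == 0)
        tags.append(0 if (stocked_out and has_demand) else 1)
    return tags


# Only the second point has zero inventory together with demand.
print(availability_tags([2, 0, 0, 10], [True, True, False, True]))
```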

[0084] In this embodiment, S3 includes:

[0085] S31. Based on the lead time parameter, determine the demand response offset corresponding to each time point, wherein the demand response offset is the discretization result of the lead time parameter in the time dimension.

[0086] S32. Based on the arrival event sequence, identify the arrival time point corresponding to each arrival event, and use the arrival time point as a key time anchor point on the business timeline;

[0087] S33. Based on the promotion event sequence, determine the promotion start time and promotion end time corresponding to each promotion event, and use the promotion start time and promotion end time as event boundaries on the business timeline;

[0088] S34. Combining the demand response offset, the key time anchor point and the event boundary, perform non-uniform mapping processing on the original time index to generate a time warp sequence that corresponds one-to-one with the natural time index.

[0089] S35. Based on the time warp sequence, rearrange the multivariate time series according to the time index order, and perform data alignment processing on the multivariate time series at adjacent indices after time warp to generate the business time axis sequence.

[0090] In this invention, the generation of the time-warped sequence uses lead time parameters, arrival events, and promotional events as unified constraints to structurally reconstruct the natural time index. Specifically, by transforming the lead time parameter into a discrete time offset relationship and combining it with the key time anchors of arrival events and the time boundaries of promotional events, a non-uniform mapping is performed on the original time index. This changes the relative intervals of different time segments on the business time axis while maintaining consistency in time sequence. Based on the time-warped sequence, the multivariate time series is rearranged and sample aligned, ensuring that subsequent multi-scale segmentation and dependency modeling are performed within a unified business time coordinate system, guaranteeing consistency between the model input data and the supply chain business rhythm.
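
One way to picture the non-uniform mapping is to give each natural time step a positive "business duration" and take the running total as its warped position, which keeps the order of time points unchanged. The sketch below assumes that steps inside a promotion are stretched by `promo_stretch` and steps within `lead_offset` steps after an arrival by `arrival_stretch`, then resamples a single variable onto a uniform business-time grid by linear interpolation. The stretch factors, the interpolation choice, and the names `time_warp` and `resample` are all hypothetical.

```python
def time_warp(n, arrival_idx, promo_intervals, lead_offset,
              promo_stretch=2.0, arrival_stretch=1.5):
    """Return a strictly increasing warped position for each of n steps."""
    warped, pos = [], 0.0
    for t in range(n):
        warped.append(pos)
        width = 1.0
        if any(s <= t <= e for s, e in promo_intervals):
            width *= promo_stretch  # promotion interval (event boundary)
        if any(a <= t < a + lead_offset for a in arrival_idx):
            width *= arrival_stretch  # window after an arrival anchor
        pos += width
    return warped


def resample(values, warped, step=1.0):
    """Linearly interpolate values onto a uniform grid over the warped axis.
    Requires at least two points."""
    out, j, k = [], 0, 0
    while True:
        x = warped[0] + k * step
        if x > warped[-1]:
            break
        while j < len(warped) - 2 and warped[j + 1] <= x:
            j += 1
        x0, x1 = warped[j], warped[j + 1]
        frac = (x - x0) / (x1 - x0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
        k += 1
    return out


warped = time_warp(4, arrival_idx=[], promo_intervals=[(1, 2)], lead_offset=2)
print(warped)
print(resample([0, 10, 30, 50], warped))
```

In this reading, stretched segments receive more samples on the business timeline, so promotion periods and post-arrival periods are represented at a finer resolution than ordinary periods.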

[0091] In this embodiment, S4 includes:

[0092] S41. On the business timeline sequence, a set of scale windows is determined based on the lead time parameter. The scale window set includes a first scale window and one or more second scale windows. The time span of the first scale window is equal to the time span corresponding to the lead time parameter. The time span of each second scale window is an integer multiple or an integer fraction of the time span corresponding to the lead time parameter. Each scale window is arranged in time index order on the business timeline sequence.

[0093] S42. For each scale window in the business timeline sequence, extract multivariate time series samples within the time interval covered by the scale window to form a data subsequence that corresponds one-to-one with the scale window.

[0094] S43. Within each scale window, based on the availability label sequence, assign sample weights to each time point sample in the data subsequence. The time point sample in the available state corresponds to the first weight, and the time point sample in the unavailable state corresponds to the second weight, and the first weight is greater than the second weight.

[0095] S44. According to the time index order, perform weighted aggregation processing on the data subsequence after allocating sample weights to generate a window feature representation corresponding to the scale window, and calculate the ratio of the number of time points in the saleable state within the scale window to the total number of time points covered by the scale window, as the numerical window attribute of the window feature representation.

[0096] S45. Organize the window feature representations and their numerical window attributes corresponding to each scale window according to the scale hierarchy to form the pyramid-level window token sequence.

[0097] In this invention, the multi-scale windows are constructed on the business timeline. Introducing the lead time parameter into the scale setting makes windows of different time spans consistent with the replenishment rhythm in terms of business semantics. Within each scale window, data samples are assigned different weights according to the availability label sequence, so that the window representation distinguishes inventory-constrained time points from normal sellable time points. By aggregating the weighted samples and also calculating the proportion of sellable time within the window as an independent numerical attribute, each generated window token contains both demand change characteristics and supply constraint information, providing a multi-level, business-consistent input representation for subsequent cross-scale dependency modeling.
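Steps S41 to S45 can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the invention: windows are taken as non-overlapping, the weighted aggregation is a weighted mean, and the weights `w_sell` and `w_unsell` are hypothetical values chosen only to satisfy the requirement that the first weight exceed the second. The default multiples (1, 2, 4) correspond to the 7, 14, and 28 day windows of Example 1 when the lead time is 7.

```python
import numpy as np

def window_tokens(values, sellable, lead_time, multiples=(1, 2, 4),
                  w_sell=1.0, w_unsell=0.2):
    """Build pyramid-level window tokens on the business timeline.

    values:   (T, F) array of multivariate samples.
    sellable: (T,) boolean availability labels (True = sellable).
    Returns one list of token dicts per scale level.
    """
    values = np.asarray(values, dtype=float)
    sellable = np.asarray(sellable, dtype=bool)
    weights = np.where(sellable, w_sell, w_unsell)  # S43: per-sample weights
    pyramid = []
    for level, multiple in enumerate(multiples):    # S41: spans tied to lead time
        span = lead_time * multiple
        tokens = []
        for start in range(0, len(values) - span + 1, span):
            sl = slice(start, start + span)         # S42: data subsequence
            feature = np.average(values[sl], axis=0, weights=weights[sl])  # S44
            tokens.append({
                "level": level,
                "start": start,
                "end": start + span,
                "feature": feature,
                "sellable_ratio": float(sellable[sl].mean()),  # window attribute
            })
        pyramid.append(tokens)                      # S45: organize by scale level
    return pyramid
```

Keeping `w_unsell` above zero means that a window consisting entirely of unsellable days still yields a defined feature, while its `sellable_ratio` of 0 signals that the feature is unreliable.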

[0098] In this embodiment, S5 includes:

[0099] S51. In the improved Pyraformer model, based on the pyramid-level window token sequence, a corresponding time coverage interval and scale level identifier are determined for each window token, forming interval identifier information that corresponds one-to-one with the window token.

[0100] S52. Based on the lead time parameter and the preset event lag interval, a cross-scale time interval constraint table is generated in the improved Pyraformer model. The cross-scale time interval constraint table includes the upper and lower limits of the interval offset between the time coverage intervals of window tokens at different scale levels.

[0101] S53. Based on the cross-scale time interval constraint table, filter the time coverage intervals of window tokens at different scale levels to form a cross-scale interval connection set that satisfies the upper and lower limits of the interval offset constraint.

[0102] S54. Within the connection relationships defined by the cross-scale interval connection set, cross-scale information fusion processing is performed on the window tokens participating in the connection in the improved Pyraformer model according to the time index order to generate cross-scale context features corresponding to each window token.

[0103] S55. Combine the cross-scale context features with the scale level identifier of the corresponding window token and the numerical window attributes to form a cross-scale context representation.

[0104] In this invention, cross-scale dependency modeling is not based on a single point-in-time correspondence, but rather on the time coverage interval of window tokens on the business timeline as the modeling foundation. By establishing interval-level identifiers for each window token within the model and generating interval offset constraints based on lead time parameters and event lag intervals, dependencies between windows of different scales are established only when the interval offset condition is met. The event lag interval refers to the time offset range used to characterize the impact of arrival events or promotional events on demand changes. This event lag interval is obtained statistically from historical business data and is a continuous time interval extending forward or backward relative to the event occurrence time. This interval-level dependency is transformed into a defined set of connections during model calculation, and cross-scale information fusion is performed on the window tokens based on this set, thereby forming a contextual representation that simultaneously includes time interval information, scale hierarchy information, and window attributes, providing a stable and consistent structured input for subsequent demand representation generation.
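The constraint table and connection set of S52 and S53 can be sketched as below. The source does not define how the interval offset is measured or how its limits are derived, so both are assumptions here: the offset is taken as the difference between the start indices of two window tokens, and the limits are taken as the lower end of the event lag interval and the lead time plus the upper end of the event lag interval. All names are hypothetical.

```python
from itertools import product

def build_constraint_table(levels, lead_time, event_lag):
    """Map (query_level, key_level) to (lower, upper) limits on the interval offset.

    event_lag: (lag_low, lag_high), a hypothetical event lag interval in time steps.
    """
    lag_low, lag_high = event_lag
    return {
        (q, k): (lag_low, lead_time + lag_high)
        for q, k in product(levels, levels)
        if q != k  # only cross-scale pairs are constrained here
    }

def connection_set(tokens, table):
    """Return the (query_index, key_index) pairs whose offset lies within the limits.

    tokens: list of (level, start, end) interval identifiers, one per window token.
    """
    links = set()
    for qi, (q_level, q_start, _) in enumerate(tokens):
        for ki, (k_level, k_start, _) in enumerate(tokens):
            limits = table.get((q_level, k_level))
            if limits is None:
                continue
            lower, upper = limits
            if lower <= q_start - k_start <= upper:
                links.add((qi, ki))
    return links
```

In an attention-based model, a set like this would typically be converted into a boolean mask so that cross-scale fusion (S54) only mixes tokens that are connected.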

[0105] In this embodiment, S6 includes:

[0106] S61. Based on the cross-scale context representation, calculate the intermediate demand feature sequence for each time point in time index order.

[0107] S62. Based on the availability label sequence, divide the time points in the intermediate demand feature sequence by state into a set of sellable time points and a set of unsellable time points.

[0108] S63. Perform truncation correction on the intermediate demand features corresponding to the set of unsellable time points. The truncation correction uses the intermediate demand features of adjacent sellable time points as a reference to replace the values of the intermediate demand features of the unsellable time points.

[0109] S64. Merge the corrected intermediate demand features with the intermediate demand features corresponding to the set of sellable time points in time index order to form a continuous demand feature sequence.

[0110] S65. Based on the continuous demand feature sequence, the demand features are aggregated according to the future continuous time interval determined by the lead time parameter to form a forecast window demand representation.

[0111] In this invention, the truncation correction is performed under the joint constraints of the cross-scale context representation and the availability label sequence. A continuous intermediate demand feature representation is first formed along the time dimension, and the time points are then classified according to the availability labels so that the demand features of unsellable time points can be replaced in a consistent manner. The replacement follows the principle of temporal proximity and preserves the time index order, ensuring that the corrected demand feature sequence is continuous and aligned along the time axis. The demand features are then aggregated over the future continuous time interval determined by the lead time parameter, completing the demand representation and ensuring that the output matches the supply chain decision cycle in time scale.
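The replacement step of S62 to S64 can be sketched for a single demand feature as follows. The source states only that adjacent sellable time points serve as the reference; the choice made here, linear interpolation between the nearest sellable neighbours with the nearest sellable value used at the sequence edges, is an assumption for illustration.

```python
import numpy as np

def truncation_correct(demand, sellable):
    """Replace demand features at unsellable time points using sellable neighbours.

    demand:   (T,) intermediate demand feature sequence.
    sellable: (T,) boolean availability labels (True = sellable).
    """
    demand = np.asarray(demand, dtype=float)
    sellable = np.asarray(sellable, dtype=bool)
    if not sellable.any():
        raise ValueError("at least one sellable time point is required")
    t = np.arange(len(demand))
    corrected = demand.copy()
    # np.interp interpolates between neighbouring sellable points and holds the
    # nearest sellable value for gaps at the start or end of the sequence.
    corrected[~sellable] = np.interp(t[~sellable], t[sellable], demand[sellable])
    return corrected
```

Values at sellable time points are left untouched, so the merged sequence keeps its original time index order.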

[0112] In this embodiment, S7 includes:

[0113] S71. Based on the forecast window demand representation, determine the future continuous time interval defined by the lead time parameter. The future continuous time interval starts at the current time point, and its time span is equal to the time span corresponding to the lead time parameter.

[0114] S72. Within the future continuous time interval, extract the demand representation value corresponding to each time point in time index order.

[0115] S73. Accumulate the demand representation values within the future continuous time interval in time index order to form a cumulative demand value sequence corresponding to the lead time parameter.

[0116] S74. Output the cumulative demand value sequence in time index order to form the cumulative demand sequence of the prediction window.

[0117] In this invention, the output stage does not directly rely on the intermediate calculation state within the model. Instead, based on the already formed demand representation within the prediction window, the output time interval is explicitly limited by the lead time parameter. Specifically, the current time point is used as the prediction starting point. A continuous time interval corresponding to the lead time parameter is selected on the business timeline, and the demand representation data is processed in time index order within this interval. By sequentially accumulating the demand representation values within the interval while maintaining time index consistency, the final output cumulative demand result directly corresponds to the supply chain replenishment decision cycle in the time dimension, thereby ensuring the consistency of the prediction result in both numerical form and temporal semantics.
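The accumulation in S71 to S74 amounts to a running sum over the first lead-time steps of the forecast. A minimal sketch, assuming the forecast window demand representation is available as one value per future time point (the function name is hypothetical):

```python
import numpy as np

def cumulative_window_demand(demand_values, lead_time):
    """Accumulate per-time-point demand over the lead-time interval.

    demand_values: demand representation values, starting at the current time point.
    lead_time:     number of time steps in the future continuous time interval.
    """
    window = np.asarray(demand_values, dtype=float)[:lead_time]
    if len(window) < lead_time:
        raise ValueError("forecast is shorter than the lead time")
    return np.cumsum(window)  # element i is total demand over steps 0..i

# Illustrative values only, not data from the source.
print(cumulative_window_demand([10, 12, 11, 15, 14, 13, 12, 99], lead_time=7))
```

The last element of the returned sequence is the total demand expected before a replenishment order placed now would arrive, which is the quantity a replenishment decision compares against on-hand inventory.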

[0118] Example 1:

[0119] To verify the feasibility and effectiveness of this invention in a real-world supply chain scenario, it was applied to the daily replenishment demand forecasting business of a nationwide chain retailer. This company operates hundreds of stores across the country, primarily selling fast-moving consumer goods (FMCG), characterized by high turnover and frequent replenishment. Taking a mid-sized store in a first-tier city as an example, a popular packaged food product was selected as the target item. Demand for this product fluctuates significantly during promotional periods. Furthermore, due to fixed replenishment cycles and unstable delivery conditions, stockouts and inventory backlogs have historically occurred simultaneously. Traditional forecasting methods struggle to balance forecasting accuracy and business availability.

[0120] In this application scenario, the company's original demand forecasting methods relied primarily on historical sales sequences along the natural timeline, using simple time series models or general-purpose deep learning models. Because these methods lacked structured modeling of replenishment lead times, arrival events, and promotional rhythms, their forecasts were frequently misaligned in time with actual replenishment execution. For example, a predicted demand peak might fall after the actual delivery, making the forecast unsuitable for direct use in replenishment decisions. Furthermore, sales were recorded as zero during stockout periods, and the original method fed these zero sales into the model as zero demand, causing the model to systematically underestimate the true demand level and further amplifying the forecast error.

[0121] In this embodiment, firstly, according to the technical solutions of claims 1 and 2 of this invention, historical sales records, inventory records, arrival event records, promotion event records, and corresponding lead time parameters of the target product are collected from the store for 180 consecutive days. Historical sales records are in daily units, representing the actual sales quantity completed each day; inventory records include the initial and ending inventory quantities for each day; arrival event records include the arrival date and quantity; promotion event records include the start and end dates of promotions; the lead time parameter is the average time interval from the issuance of a replenishment order to the actual arrival of the product at the store, with a statistical result of 7 days. All data are time-aligned at a daily granularity to form a multivariate time series as model input.

[0122] In the data preprocessing stage, an availability label sequence is generated from the inventory sequence and the historical sales records according to the steps of claim 3. When both the beginning and ending inventory levels on a given date are zero and the historical records indicate unmet sales demand, that date is marked as unsellable; all remaining dates are marked as sellable. This distinguishes dates with genuinely low demand from dates on which sales were cut off by stockouts, providing a basis for subsequent demand correction.
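The labeling rule can be sketched as below. The source does not say how unmet sales demand is detected from the historical records, so the sketch takes it as a precomputed boolean input, `unmet_demand`, which is a hypothetical name. The binary encoding (1 for sellable, 0 for unsellable) is also an assumption.

```python
import numpy as np

def availability_labels(begin_inventory, end_inventory, unmet_demand):
    """Return binary availability labels: 1 = sellable, 0 = unsellable.

    A date is unsellable when beginning and ending inventory are both zero
    and unmet sales demand is flagged for that date.
    """
    begin_inventory = np.asarray(begin_inventory)
    end_inventory = np.asarray(end_inventory)
    unmet_demand = np.asarray(unmet_demand, dtype=bool)
    unsellable = (begin_inventory == 0) & (end_inventory == 0) & unmet_demand
    return (~unsellable).astype(int)
```

A date with zero inventory but no flagged demand stays sellable under this rule, which keeps genuinely low-demand dates out of the correction step.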

[0123] Subsequently, the lead time parameter, the arrival event sequence, and the promotion event sequence are introduced, and a non-uniform mapping is applied to the natural time index to generate a time-warped sequence. The multivariate time series is then resampled according to this sequence to construct the business timeline sequence. The business timeline aligns structurally with the replenishment rhythm and promotional cycles, so that demand changes match the actual business process more closely in the time dimension.

[0124] On the business timeline, multiple scale windows are defined based on the lead time parameter. The first scale window spans 7 days, and the second scale windows span 14 and 28 days. For each scale window, the multivariate data subsequence within the corresponding time interval is extracted, sellable and unsellable time points are assigned different weights, and weighted aggregation is performed on the data subsequence to generate a pyramid-level window token sequence containing numerical window attributes. These window attributes reflect the proportion of sellable time points within the window, characterizing the impact of inventory constraints on demand observation.

[0125] In the modeling phase, the improved Pyraformer model is employed. Internally, the model determines the corresponding time coverage interval and scale hierarchy identifier for each window token, and generates a cross-scale time interval constraint table based on the lead time parameter and preset event lag intervals. Dependency connections are only allowed between window tokens that satisfy the interval offset condition. Under this constraint, cross-scale information fusion is performed to generate cross-scale context representations, avoiding cross-scale dependencies that do not conform to business causality.

[0126] During the demand correction phase, an intermediate demand feature sequence is calculated from the cross-scale context representation, and truncation correction is performed on the intermediate demand features of unsellable time points using the availability label sequence. Specifically, the intermediate demand features of adjacent sellable time points are used as references to replace the demand features of unsellable time points, thereby restoring a continuous demand feature sequence. The demand features are then aggregated over the future continuous time interval determined by the lead time parameter to form a forecast window demand representation.

[0127] In the output phase, the current time point is used as the starting position of the forecast, and a future continuous time interval spanning 7 days is determined. The demand representation of the forecast window is accumulated and the cumulative demand sequence of the forecast window is output as the direct input for replenishment decision.

[0128] To verify the beneficial effects of the present invention, a comparative experiment was conducted between the method of the present invention and the traditional method over a continuous 60-day test interval. The evaluation indicators included the mean absolute percentage error, the percentage of demand underestimated during stockout periods, and the replenishment fulfillment rate achieved when the prediction results were used in replenishment execution. The experimental results are shown in Table 1.

[0129] Table 1. Comparison of the effectiveness of different demand forecasting methods in actual store scenarios.

[0130]

| Comparison indicator | Method of the present invention | Traditional method |
| --- | --- | --- |
| Test interval (days) | 60 | 60 |
| Average daily sales (units) | 128 | 128 |
| Mean absolute percentage error (%) | 12.3 | 21.6 |
| Stockout days (days) | 9 | 9 |
| Demand underestimation during stockout periods (%) | 8.7 | 34.8 |
| Average cumulative demand error within the forecast window (units) | 41 | 96 |
| Replenishment fulfillment rate (%) | 94.1 | 82.5 |
| Forecast deviation during promotion periods (%) | 13.6 | 28.2 |

[0131] As shown in Table 1, the test interval, average daily sales, and number of stockout days are identical for both methods, so the results are directly comparable, and the method of this invention outperforms the traditional method on every performance indicator. Regarding overall prediction accuracy, the mean absolute percentage error of this invention is 12.3%, compared with 21.6% for the traditional method, indicating a more accurate portrayal of demand changes. In stockout scenarios, both methods face the same number of stockout days, but this invention, through its truncation correction mechanism, reduces the underestimation of demand during stockout periods from 34.8% to 8.7%, restoring much of the true demand masked by inventory constraints. Among the replenishment-related indicators, the average cumulative demand error within the forecast window decreases from 96 units to 41 units, and the replenishment fulfillment rate increases from 82.5% to 94.1%, strengthening the support that the prediction results provide for replenishment decisions. Furthermore, when demand fluctuates significantly during promotional periods, this invention reduces the forecast deviation from 28.2% to 13.6%, demonstrating stronger adaptability to promotion-driven demand changes.
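The relative size of these changes can be derived from the Table 1 values with simple arithmetic. The derived percentages below are not stated in the source; they follow directly from the reported figures, and the dictionary keys are hypothetical labels.

```python
# (method of the present invention, traditional method), taken from Table 1
table1 = {
    "mape_pct": (12.3, 21.6),
    "stockout_underestimation_pct": (8.7, 34.8),
    "cumulative_demand_error_units": (41, 96),
    "promotion_deviation_pct": (13.6, 28.2),
}

def relative_reduction(new, old):
    """Percentage by which `new` is lower than `old`."""
    return (old - new) / old * 100

reductions = {
    name: round(relative_reduction(new, old), 1)
    for name, (new, old) in table1.items()
}
# Fulfillment rate is a "higher is better" metric, so report the point gain.
fulfillment_gain_points = round(94.1 - 82.5, 1)

print(reductions)
print(fulfillment_gain_points)
```

Under these figures the error-type indicators fall by roughly 43% to 75% relative to the traditional method, and the replenishment fulfillment rate rises by 11.6 percentage points.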

[0132] The above analysis shows that, under consistent data conditions, the method of the present invention outperforms the traditional method on key indicators covering prediction accuracy, suppression of stockout effects, and usability for replenishment decisions. This verifies the technical advantages and application value of the present invention in actual supply chain demand forecasting applications.

[0133] The above description is only a preferred embodiment of the present invention, but the scope of protection of the present invention is not limited thereto. Any equivalent substitutions or modifications made by those skilled in the art within the scope of the technology disclosed in the present invention, based on the technical solution and inventive concept of the present invention, should be covered within the scope of protection of the present invention.

Claims

1. A supply chain demand forecasting method based on Transformer, characterized in that it includes the following steps: S1. Collect the historical sales volume sequence, inventory sequence, arrival event sequence, promotion event sequence, and lead time parameters of the target product at the target sales node, and align them according to the preset time granularity to form a multivariate time series; S2. Generate an availability label sequence based on the inventory sequence and the arrival event sequence; S3. Generate a time-warped sequence based on the lead time parameter, the arrival event sequence, and the promotion event sequence, and resample the multivariate time series according to the time-warped sequence to generate a business timeline sequence; S4. Segment the business timeline sequence according to multiple scale windows, and perform weighted aggregation on the multivariate data in each scale window based on the availability label sequence to generate a pyramid-level window token sequence; S5. Using the improved Pyraformer model, based on the lead time parameter and the preset event lag interval, construct a cross-scale dependency modeling structure over the pyramid-level window token sequence to limit the direction and timing of information interaction between window tokens of different scales, and generate a cross-scale context representation; S6. Perform truncation correction based on the cross-scale context representation and the availability label sequence to generate a de-truncated demand representation sequence, and generate a forecast window demand representation based on the de-truncated demand representation sequence; S7. Perform output calculation on the forecast window demand representation and output the forecast window cumulative demand sequence corresponding to the lead time parameter.

2. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S1 includes: S11. Obtain the historical sales records of the target product within a continuous historical period at the target sales node; S12. Obtain the inventory record corresponding to the historical sales record. The inventory record is the beginning inventory and ending inventory at each time point, and form an inventory change sequence based on the difference between the beginning inventory and ending inventory at adjacent time points. S13. Obtain the arrival event record corresponding to the target product. The arrival event record includes the arrival time and the quantity of goods received. Map the arrival time to the time point corresponding to the preset time granularity to form an arrival event sequence. S14. Obtain the promotional event record corresponding to the target product. The promotional event record includes the promotion start time and the promotion end time. Map the promotion start time and the promotion end time to the time interval corresponding to the preset time granularity to form a promotional event sequence. S15. Obtain the lead time parameter of the target product at the target sales node, wherein the lead time parameter is the time interval from the issuance of the replenishment instruction to the arrival of the product at the target sales node; S16. Based on the preset time granularity, perform time alignment processing on the historical sales records, the inventory sequence, the arrival event sequence, the promotion event sequence, and the lead time parameter to generate a multivariate time series arranged by a unified time index.

3. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S2 includes: S21. Based on the inventory sequence, obtain the beginning inventory and ending inventory of the target product at each time point; S22. At each point in time, the beginning inventory and the ending inventory are correlated with the historical sales sequence to determine whether the point in time meets the condition that the inventory is zero and there is sales demand. S23. When the time point meets the condition that the inventory is zero and there is sales demand, the time point is marked as unsellable, and the time points other than the time points that meet the condition are marked as sellable. S24. According to the time index order, encode the available and unavailable statuses corresponding to each time point into binary tags to form an availability tag sequence.

4. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S3 includes: S31. Based on the lead time parameter, determine the demand response offset corresponding to each time point, wherein the demand response offset is the discretization result of the lead time parameter in the time dimension. S32. Based on the arrival event sequence, identify the arrival time point corresponding to each time point, and use the arrival time point as a key time anchor point on the business timeline; S33. Based on the promotion event sequence, determine the promotion start time and promotion end time corresponding to each promotion event, and use the promotion start time and promotion end time as event boundaries on the business timeline; S34. Combining the demand response offset, the key time anchor point and the event boundary, perform non-uniform mapping processing on the original time index to generate a time warp sequence that corresponds one-to-one with the natural time index. S35. Based on the time warp sequence, rearrange the multivariate time series according to the time index order, and perform data alignment processing on the multivariate time series at adjacent indices after time warp to generate the business time axis sequence.

5. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S4 includes: S41. On the business timeline sequence, multiple scale window sets are determined based on the lead time parameter. The scale window set includes a first scale window and one or more second scale windows. The time span of the first scale window is equal to the time span corresponding to the lead time parameter. The time span of the second scale window is an integer multiple or an integer fraction multiple of the time span corresponding to the lead time parameter. Each scale window is arranged in time index order on the business timeline sequence. S42. For each scale window in the business timeline sequence, extract multivariate time series samples within the time interval covered by the scale window to form a data subsequence that corresponds one-to-one with the scale window. S43. Within each scale window, based on the availability label sequence, assign sample weights to each time point sample in the data subsequence. The time point sample in the available state corresponds to the first weight, and the time point sample in the unavailable state corresponds to the second weight, and the first weight is greater than the second weight. S44. According to the time index order, perform weighted aggregation processing on the data subsequence after allocating sample weights to generate a window feature representation corresponding to the scale window, and calculate the ratio of the number of time points in the saleable state within the scale window to the total number of time points covered by the scale window, as the numerical window attribute of the window feature representation. S45. Organize the window feature representations and their numerical window attributes corresponding to each scale window according to the scale hierarchy to form the pyramid-level window token sequence.

6. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S5 includes: S51. In the improved Pyraformer model, based on the pyramid-level window token sequence, a corresponding time coverage interval and scale level identifier are determined for each window token, forming interval identifier information that corresponds one-to-one with the window token. S52. Based on the lead time parameter and the preset event lag interval, a cross-scale time interval constraint table is generated in the improved Pyraformer model. The cross-scale time interval constraint table includes the upper and lower limits of the interval offset between the time coverage intervals of window tokens at different scale levels. S53. Based on the cross-scale time interval constraint table, filter the time coverage intervals of window tokens at different scale levels to form a cross-scale interval connection set that satisfies the upper and lower limits of the interval offset constraint. S54. Within the connection relationships defined by the cross-scale interval connection set, cross-scale information fusion processing is performed on the window tokens participating in the connection in the improved Pyraformer model according to the time index order to generate cross-scale context features corresponding to each window token. S55. Combine the cross-scale context features with the scale level identifier of the corresponding window token and the numerical window attributes to form a cross-scale context representation.

7. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S6 includes: S61. Based on the cross-scale context representation, calculate the intermediate demand feature sequence for each time point in time index order; S62. Based on the availability label sequence, divide the time points in the intermediate demand feature sequence by state into a set of sellable time points and a set of unsellable time points; S63. Perform truncation correction on the intermediate demand features corresponding to the set of unsellable time points, wherein the truncation correction uses the intermediate demand features of adjacent sellable time points as a reference to replace the values of the intermediate demand features of the unsellable time points; S64. Merge the corrected intermediate demand features with the intermediate demand features corresponding to the set of sellable time points in time index order to form a continuous demand feature sequence; S65. Based on the continuous demand feature sequence, aggregate the demand features over the future continuous time interval determined by the lead time parameter to form a forecast window demand representation.

8. The supply chain demand forecasting method based on Transformer according to claim 1, characterized in that, S7 includes: S71. Based on the forecast window demand representation, determine the future continuous time interval defined by the lead time parameter, wherein the future continuous time interval starts at the current time point and its time span is equal to the time span corresponding to the lead time parameter; S72. Within the future continuous time interval, extract the demand representation value corresponding to each time point in time index order; S73. Accumulate the demand representation values within the future continuous time interval in time index order to form a cumulative demand value sequence corresponding to the lead time parameter; S74. Output the cumulative demand value sequence in time index order to form the cumulative demand sequence of the forecast window.