Intelligent calculation method and device for refined oil demand

By combining clustering algorithms and neural network models, the problem of difficulty in considering the interaction of factors in the analysis of refined oil demand has been solved, enabling accurate demand calculation and resource allocation, and improving the accuracy and efficiency of the analysis.

CN122288751A · Pending · Publication Date: 2026-06-26 · RICHFIT INFORMATION TECH +1

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Applications(China)
Current Assignee / Owner: RICHFIT INFORMATION TECH
Filing Date: 2024-12-24
Publication Date: 2026-06-26

Application Information

IPC: G06Q30/0202; G06Q50/02; G06F18/23; G06F18/243; G06N20/20

AI Tagging

Technology Topics

Analytic model; Cluster algorithm

Technical Efficacy Phrases

Avoid the one-size-fits-all problem; improve accuracy

Abstract

This invention discloses an intelligent method and apparatus for calculating refined oil demand. The method includes: acquiring collected seasonal and historical demand information for refined oil; performing cluster analysis on the collected information based on a clustering algorithm to obtain data information at different regional category levels; performing feature engineering processing on the data information at each regional category level, and determining the feature importance value of each feature among the seasonal characteristics, trend characteristics, and cross-features of refined oil at the corresponding regional category level based on a decision tree model ensemble learning algorithm; determining the target features for different regional category levels and establishing a second training dataset; training a neural network model to obtain a refined oil demand analysis model; and calculating the refined oil demand data of different merchants at different regional category levels. This invention aims to improve the accuracy and efficiency of refined oil demand calculation.

Description

Technical Field

[0001] This invention relates to the field of artificial intelligence technology, and in particular to a method and apparatus for intelligent calculation of refined oil demand.

Background Technology

[0002] This section is intended to provide background or context for the embodiments of the invention described herein. The description herein is not an admission that it is prior art simply because it is included in this section.

[0003] Analyzing refined oil demand is a complex and crucial business activity. It involves not only simple estimations of market demand but also comprehensive considerations of technological advancements, relevant policy orientations, and the global energy market. Existing methods for analyzing refined oil demand often fail to fully account for the influence of these multiple factors, resulting in insufficient accuracy and an inability to provide a scientific basis for companies to adjust their refined oil demand control strategies and allocate resources. Traditional analytical models typically consider only one or a few factors, failing to fully capture the complex multi-factor interactions behind changes in refined oil demand, leading to inaccurate results. Current technologies struggle to provide refined management and customized marketing based on the market characteristics of different regions, and cannot effectively address significant regional differences. Due to an inaccurate grasp of market demand trends, companies may face inventory backlogs or shortages, resulting in resource waste or lost opportunities.

Summary of the Invention

[0004] This invention provides an intelligent method for calculating refined oil demand, which improves the accuracy and efficiency of refined oil demand calculation and enables targeted analysis of refined oil demand at different regional categories. The method includes:

[0005] The system collects seasonal and historical demand information for refined oil products; it then performs cluster analysis on the collected information based on a clustering algorithm to obtain data information at different regional category levels. These regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical scope.

[0006] For the data information under each regional category level, feature engineering is performed on the data information under the regional category level to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level; the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products.

[0007] Based on the decision tree model ensemble learning algorithm, the feature importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level is determined; multiple features whose feature importance values exceed a preset value are taken as the target features at the corresponding regional category level.

[0008] Based on the refined oil demand data at different regional category levels and the data of multiple target features corresponding to different regional category levels, a second training dataset is established; based on the neural network algorithm, the preset neural network model is trained with the second training dataset to obtain the refined oil demand analysis model.

[0009] Based on the refined oil demand analysis model, the refined oil demand data of different merchants at different regional category levels are calculated.

[0010] This invention also provides an intelligent calculation device for refined oil demand, used to improve the accuracy and efficiency of refined oil demand calculation and to achieve targeted analysis of refined oil demand at different regional categories. The device includes:

[0011] The information acquisition and cluster analysis module is used to acquire seasonal and historical demand information of refined oil products; based on the clustering algorithm, the collected information is clustered to obtain data information under different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants in the same city, arranged from largest to smallest geographical range.

[0012] The feature engineering processing module is used to perform feature engineering processing on the data information under each regional category level to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level; the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products.

[0013] The target feature determination module is used to determine, based on the decision tree model ensemble learning algorithm, the feature importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level, and to select multiple features whose feature importance values exceed a preset value as target features at the corresponding regional category level.

[0014] The refined oil demand analysis modeling module is used to establish a second training dataset based on the refined oil demand data at different regional category levels and the data of multiple target features corresponding to different regional category levels; based on the neural network algorithm, the preset neural network model is trained with the second training dataset to obtain the refined oil demand analysis model.

[0015] The refined oil demand analysis module is used to calculate the refined oil demand data of different merchants at different regional category levels based on the refined oil demand analysis model.

[0016] This invention also provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the above-mentioned intelligent calculation method for refined oil demand.

[0017] This invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described intelligent calculation method for refined oil demand.

[0018] This invention also provides a computer program product, which includes a computer program that, when executed by a processor, implements the above-mentioned intelligent calculation method for refined oil demand.

[0019] In this embodiment of the invention, seasonal information and historical demand information of refined oil products are collected; based on a clustering algorithm, the collected information is clustered to obtain data information at different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical scope; for the data information at each regional category level, feature engineering processing is performed to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products corresponding to that regional category level; the cross characteristics are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products. An ensemble learning algorithm based on a decision tree model is used to determine the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level. Multiple features with importance values exceeding a preset value are selected as target features for the corresponding regional category level. A second training dataset is established based on the obtained refined oil demand data at different regional category levels and the data of multiple target features corresponding to different regional category levels. A preset neural network model is trained on the second training dataset using a neural network algorithm to obtain a refined oil demand analysis model. Based on the refined oil demand analysis model, refined oil demand data for different merchants at different regional category levels is calculated.
This invention employs clustering algorithms to perform multi-level clustering analysis on data (across provinces, provinces, cities within the same province, and different merchants within the same city), enabling more accurate capture of differences between regions and avoiding the one-size-fits-all problem of traditional methods. Feature engineering is performed on the data information at each level to extract various features, including seasonal features, trend features, and cross-features, allowing the model to comprehensively consider various influencing factors. An ensemble learning algorithm based on a decision tree model is used to determine the importance values of key features and select target features, ensuring that the dataset used to train the neural network model contains the most influential variables. A neural network algorithm is used to train the refined oil demand analysis model; this deep learning model can automatically learn useful feature representations from raw data and capture complex patterns and long-term dependencies, thereby improving the accuracy of refined oil demand calculation. Furthermore, by using clustering algorithms to obtain data information at different regional category levels, management at all levels can provide personalized services based on the characteristics of specific regions. Through accurate analysis of future demand using the refined oil demand analysis model, enterprises can allocate resources more rationally, avoiding inventory backlogs or shortages. This invention addresses the problems of insufficient analytical precision, lack of targeted management, unreasonable resource allocation, and slow response speed in existing technologies, improving the accuracy and efficiency of refined oil demand calculation.

Attached Figure Description

[0020] To more clearly illustrate the technical solutions in the embodiments of the present invention or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention. For those skilled in the art, other drawings can be obtained based on these drawings without creative effort. In the drawings:

[0021] Figure 1 is a flowchart illustrating an intelligent calculation method for refined oil demand in an embodiment of the present invention;

[0022] Figure 2 is a specific example diagram of an intelligent calculation method for refined oil demand in an embodiment of the present invention;

[0023] Figure 3 is a specific example diagram of an intelligent calculation method for refined oil demand in an embodiment of the present invention;

[0024] Figure 4 is a specific example diagram of an intelligent calculation method for refined oil demand in an embodiment of the present invention;

[0025] Figure 5 is a schematic diagram of the structure of an intelligent calculation device for refined oil demand in an embodiment of the present invention;

[0026] Figure 6 is a specific example diagram of an intelligent calculation device for refined oil demand in an embodiment of the present invention;

[0027] Figure 7 is a schematic diagram of a computer device used for intelligent calculation of refined oil demand in an embodiment of the present invention.

Detailed Implementation

[0028] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments of the present invention will be further described in detail below with reference to the accompanying drawings. Here, the illustrative embodiments of the present invention and their descriptions are used to explain the present invention, but are not intended to limit the present invention.

[0029] In this document, the term "and/or" merely describes an association relationship, indicating that three relationships can exist. For example, A and/or B can represent three cases: A alone, A and B simultaneously, and B alone. Furthermore, the term "at least one" in this document means any one of multiple elements or any combination of at least two of them. For example, including at least one of A, B, and C can mean including any one or more elements selected from the set consisting of A, B, and C.

[0030] In the description of this specification, the terms "comprising," "including," "having," and "containing" are open-ended terms, meaning that they include but are not limited to. The terms "an embodiment," "a specific embodiment," "some embodiments," and "for example," etc., refer to specific features, structures, or characteristics described in connection with that embodiment or example that are included in at least one embodiment or example of this application. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the specific features, structures, or characteristics described can be combined in any suitable manner in one or more embodiments or examples. The order of steps involved in the various embodiments is used to illustrate the implementation of this application, and the order of steps is not limited and can be adjusted appropriately as needed.

[0031] The acquisition, storage, use, and processing of data in this application comply with relevant laws and regulations. The information collected in this application is authorized by the user or fully authorized by all parties. Furthermore, the collection, storage, use, processing, transmission, provision, disclosure, and application of this data all comply with relevant laws, regulations, and standards, and necessary confidentiality measures have been taken. This does not violate public order and good morals, and corresponding operation entry points are provided for users to choose to authorize or refuse. In addition, this application provides users with corresponding operation entry points to choose to agree to or refuse automated decision-making results. If the user chooses to refuse, the process proceeds to the expert decision-making stage.

[0032] It should be noted that in the embodiments of this application, certain existing solutions in the industry, such as software, components, and models, may be mentioned. For example, some existing software tools, components, algorithm models, or solutions well-known in other technical fields may be cited. These should be considered exemplary, and their purpose is only to illustrate the feasibility of implementing the technical solution of this application. These mentions should be understood as typical examples, and their core purpose is to illustrate and verify the rationality and feasibility of implementing the technical solution proposed in this application. However, this does not mean that the applicant has already used or necessarily used the solution. Such citations do not imply that the applicant has actually adopted these existing solutions, or that it will necessarily adopt these methods in its technical implementation process in the future. In other words, these mentions are only illustrative in nature, helping to understand the connection and transcendence of the innovation points of this application with the prior art, and do not constitute an endorsement or reliance statement on a specific prior art product.

[0033] When exploring the complex business context of refined oil demand analysis, it is necessary to delve into multiple interrelated and dynamically changing factors. This analysis process is not merely a simple estimation of market demand, but a comprehensive consideration of multiple factors such as technological advancements, relevant policy orientations, and the global energy market.

[0034] The complex background of refined oil demand analysis involves multiple aspects, including technological advancements, relevant policy orientations, and the global energy market. These factors are intertwined and mutually influential, collectively contributing to the complexity and uncertainty of refined oil demand analysis. Therefore, when conducting refined oil demand analysis, it is necessary to comprehensively consider the changing trends and interrelationships of various factors to improve the accuracy and reliability of the analysis.

[0035] To address the aforementioned problems, this invention provides an intelligent method for calculating refined oil demand, thereby improving the accuracy and efficiency of refined oil demand calculation and enabling targeted analysis of refined oil demand at different regional category levels. Figure 1 is a flowchart illustrating an intelligent calculation method for refined oil demand in an embodiment of the present invention. As shown in Figure 1, the method may include:

[0036] Step 101: Obtain the collected seasonal information and historical demand information of refined oil products; based on the clustering algorithm, perform cluster analysis on the collected information to obtain data information under different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants in the same city, arranged from largest to smallest geographical range.

[0037] Step 102: For the data information under each regional category level, perform feature engineering processing on the data information under the regional category level to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level; the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products.

[0038] Step 103: Based on the decision tree model ensemble learning algorithm, determine the feature importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level; select multiple features whose feature importance values exceed a preset value as target features at the corresponding regional category level.

[0039] Step 104: Based on the acquired refined oil demand data at different regional category levels and the data of multiple target features corresponding to different regional category levels, establish a second training dataset; based on the neural network algorithm, train the preset neural network model with the second training dataset to obtain the refined oil demand analysis model.

[0040] Step 105: Based on the refined oil demand analysis model, calculate the refined oil demand data for different merchants at different regional category levels.
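
The data flow of Steps 102 to 104 can be sketched roughly as follows. This is a minimal illustration and not the implementation described in this application: the function names, the seasonal index, the first-difference trend, and the product used as the cross feature are all assumptions chosen for brevity. The importance values are hypothetical placeholders; in practice they would be produced by a decision-tree ensemble. Training of the neural network itself is omitted.

```python
def engineer_features(demand, period=12):
    """Build seasonal, trend, and cross features for one demand series.

    Assumed definitions (not taken from the application):
      seasonal = mean demand of the same season / overall mean demand
      trend    = change in demand versus the previous period
      cross    = seasonal * trend, a simple interaction term
    """
    overall = sum(demand) / len(demand)
    rows = []
    # Stop one step early so every feature row has a next-period target.
    for t in range(1, len(demand) - 1):
        same_season = demand[t % period::period]
        seasonal = (sum(same_season) / len(same_season)) / overall
        trend = demand[t] - demand[t - 1]
        rows.append({
            "seasonal": seasonal,
            "trend": trend,
            "seasonal_x_trend": seasonal * trend,
        })
    return rows


def select_target_features(importance, threshold):
    """Keep the features whose importance value exceeds the preset value."""
    return [name for name, value in importance.items() if value > threshold]


def build_training_set(rows, targets, demand):
    """Pair the selected features at time t with the demand at time t + 1."""
    return [([row[name] for name in targets], label)
            for row, label in zip(rows, demand[2:])]


# Made-up demand series with a period of 2, for illustration only.
demand = [10, 20, 10, 20, 10, 20]
rows = engineer_features(demand, period=2)

# Hypothetical importance values standing in for a tree ensemble's output.
importance = {"seasonal": 0.5, "trend": 0.3, "seasonal_x_trend": 0.2}
targets = select_target_features(importance, threshold=0.25)

dataset = build_training_set(rows, targets, demand)
```

Each entry of `dataset` is a pair of a feature vector and the demand value to be learned, which is the shape a regression model would typically consume. One such dataset would be built per regional category level.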

[0041] In this embodiment of the invention, seasonal information and historical demand information of refined oil products are collected; based on a clustering algorithm, the collected information is clustered to obtain data information at different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical scope; for the data information at each regional category level, feature engineering processing is performed to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products corresponding to that regional category level; the cross characteristics are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products. An ensemble learning algorithm based on a decision tree model is used to determine the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level. Multiple features with importance values exceeding a preset value are selected as target features for the corresponding regional category level. A second training dataset is established based on the obtained refined oil demand data at different regional category levels and the data of multiple target features corresponding to different regional category levels. A preset neural network model is trained on the second training dataset using a neural network algorithm to obtain a refined oil demand analysis model. Based on the refined oil demand analysis model, refined oil demand data for different merchants at different regional category levels is calculated.
This invention employs clustering algorithms to perform multi-level clustering analysis on data (across provinces, provinces, cities within the same province, and different merchants within the same city), enabling more accurate capture of differences between regions and avoiding the one-size-fits-all problem of traditional methods. Feature engineering is performed on the data information at each level to extract various features, including seasonal features, trend features, and cross-features, allowing the model to comprehensively consider various influencing factors. An ensemble learning algorithm based on a decision tree model is used to determine the importance values of key features and select target features, ensuring that the dataset used to train the neural network model contains the most influential variables. A neural network algorithm is used to train the refined oil demand analysis model; this deep learning model can automatically learn useful feature representations from raw data and capture complex patterns and long-term dependencies, thereby improving the accuracy of refined oil demand calculation. Furthermore, by using clustering algorithms to obtain data information at different regional category levels, management at all levels can provide personalized services based on the characteristics of specific regions. Through accurate analysis of future demand using the refined oil demand analysis model, enterprises can allocate resources more rationally, avoiding inventory backlogs or shortages. This invention addresses the problems of insufficient analytical precision, lack of targeted management, unreasonable resource allocation, and slow response speed in existing technologies, improving the accuracy and efficiency of refined oil demand calculation.

[0042] In specific implementation, the first step is to obtain seasonal information and historical demand information of the collected refined oil products; based on the clustering algorithm, the collected information is clustered to obtain data information under different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants in the same city, arranged from largest to smallest geographical range.

[0043] In this embodiment, the collection of seasonal and historical demand information for refined oil products includes:

[0044] The time data of refined oil consumption is extracted from the fuel consumption record system of gas stations, including the date and specific time of consumption. This time data of demand forms the basis for analyzing seasonality-related information.

[0045] Referencing holiday information, clarify the specific date range and type of each holiday, and obtain demand trend data related to holidays, such as the impact of changes in travel patterns on refined oil demand during holidays.

[0046] Based on demand time data, we can accurately determine the consumption month and quarter information, and analyze the changing patterns of refined oil demand in different months and quarters. For example, increased air conditioning use in summer may lead to increased vehicle fuel consumption, thus affecting refined oil demand; increased heating demand in winter may affect industrial and residential oil consumption.

[0047] Calculate the relative distance between the time of increased demand for refined oil products and holidays, such as a few days before and a few days after the holiday. Analyze the fluctuations in demand for refined oil products during these time periods using historical refined oil consumption data to determine whether holidays have an early or delayed impact on demand.
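
As a rough illustration of the calendar-derived information described above (month, quarter, and relative distance to a holiday), the following sketch derives these values from a single consumption timestamp. The holiday list and the function name are hypothetical; a real system would load an authoritative holiday calendar.

```python
from datetime import date, datetime

# Hypothetical holiday calendar, for illustration only.
HOLIDAYS = [date(2024, 10, 1), date(2025, 1, 1)]


def calendar_features(ts, holidays=HOLIDAYS):
    """Derive month, quarter, and signed day offset to the nearest holiday."""
    day = ts.date()
    # Signed offset in days: negative before a holiday, positive after it.
    offsets = [(day - holiday).days for holiday in holidays]
    nearest = min(offsets, key=abs)
    return {
        "month": day.month,
        "quarter": (day.month - 1) // 3 + 1,
        "days_from_holiday": nearest,
    }


features = calendar_features(datetime(2024, 9, 28, 14, 30))
# features == {"month": 9, "quarter": 3, "days_from_holiday": -3}
```

A signed offset keeps "three days before" and "three days after" distinguishable, which is what allows early and delayed holiday effects to be analyzed separately.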

[0048] Non-oil consumption data is collected from the gas station's refined oil demand management system, including information such as the types, quantities, and transaction amounts of goods traded in convenience stores, as well as business data on services such as car washing and car repair provided by the gas station.

[0049] Collect data on non-oil and refined oil-related activities conducted by gas stations, such as the time, content, and participation rate of refined oil-related activities. Also record information on joint refined oil-related activities related to oil data, such as giving away convenience store goods or service coupons when refueling for a certain amount.

[0050] The impact of non-oil refined product related activities on oil demand is assessed by comparing oil demand before and after these activities. This determines whether such activities can increase oil demand and how the effect on oil demand differs between types of activity.
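
One simple way to make such a before-and-after comparison concrete is to compare mean daily demand in equal windows on either side of the activity start. The sketch below assumes a list of daily demand values and a window length; both the function name and the windowing choice are illustrative assumptions, not details from this application.

```python
from statistics import mean


def activity_uplift(daily_demand, activity_day, window=3):
    """Relative change in mean daily demand after an activity starts.

    Compares the `window` days before index `activity_day` with the
    `window` days starting at it.
    """
    before = daily_demand[max(0, activity_day - window):activity_day]
    after = daily_demand[activity_day:activity_day + window]
    if not before or not after:
        raise ValueError("not enough data around the activity")
    base = mean(before)
    if base == 0:
        raise ValueError("baseline demand is zero")
    return (mean(after) - base) / base


# Made-up daily demand; the activity starts at index 3.
uplift = activity_uplift([100, 100, 100, 110, 120, 130], activity_day=3)
```

A positive value suggests higher demand after the activity started. A comparison this simple does not control for seasonality or holidays, so it is only a first-pass indicator.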

[0051] After acquiring and integrating the above-mentioned information, cluster analysis is performed on the collected information based on clustering algorithms to reveal the inherent structure and patterns of the data at different regional category levels, providing support for targeted demand analysis.

[0052] The following is a detailed explanation of the different regional categories:

[0053] 1. Clustering at the cross-provincial address level

[0054] The main characteristic data are the total demand for refined oil products, the growth rate of demand, and the structure of oil product demand (such as the ratio of gasoline to diesel) for each province. At the same time, auxiliary characteristics are considered, such as the geographical location of the province, the level of regional development (such as regional GDP and industrial added value), and the development level of transportation network (such as highway mileage and railway freight volume).

[0055] These feature data are standardized to give them uniform dimensions and comparability, making them easier for clustering algorithms to calculate and analyze.
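
A common form of such standardization is the z-score, sketched below for a small feature table. The application does not state which standardization is used, so z-scoring is an assumption, and the two-row example is made up.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so features with different units are comparable."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; use 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows]


# Two made-up rows: [demand, some constant auxiliary feature].
scaled = standardize([[1.0, 10.0], [3.0, 10.0]])
# scaled == [[-1.0, 0.0], [1.0, 0.0]]
```

After scaling, each column has zero mean and, unless it is constant, unit variance, so no single feature dominates the distance computation in clustering.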

[0056] By employing appropriate clustering algorithms (such as K-means), different regions are divided into different cluster groups based on a predetermined number of clusters (which can be determined empirically or through data analysis). For example, provinces with heavy traffic may cluster in one group; these provinces have a large and rapidly growing demand for refined oil products, and their demand structure may be more inclined towards gasoline, mainly used for transportation and residential travel. Conversely, major industrial provinces may cluster in another group, with relatively high diesel consumption, used for industrial production and logistics. The clustering results help analyze the characteristics and demand trends of the refined oil market in different types of provinces at a macro level, providing a basis for cross-provincial resource allocation.
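
For illustration, a bare-bones K-means (Lloyd's algorithm) over standardized feature vectors might look like the sketch below. It is not the implementation of this application: initial centroids are simply the first k points, whereas practical systems usually rely on a library implementation with a better initialization. The sample points are made up.

```python
import math


def kmeans(points, k, iterations=100):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    # Simplification: seed the centroids with the first k points.
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Made-up standardized feature vectors forming two obvious groups.
points = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.1, 4.9]]
labels, centroids = kmeans(points, k=2)
# labels == [0, 1, 0, 1]
```

The number of clusters `k` corresponds to the predetermined number of clusters mentioned above. The same routine could be applied at each regional category level with that level's own features.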

[0057] 2. Clustering at the provincial address level

[0058] At the provincial level, data characteristics should be further refined. In addition to considering basic indicators such as total demand for refined oil products and growth rate, the focus should be on the consumption differences in different regions within the province (such as urban and rural areas), the impact of local pillar industries (such as manufacturing and tourism) on refined oil product demand, and the distribution of transportation hubs within the province (such as airports, ports, and railway hubs).

[0059] Collect and analyze provincial regulations related to refined oil products (such as environmental protection decisions on oil quality requirements, energy subsidy decisions, etc.) and use them as one of the clustering features to reflect the impact of relevant decision-making factors on the refined oil product market in the province.

[0060] Clustering algorithms are used to group cities and prefectures within a province. For example, the provincial capital and surrounding developed cities may cluster together due to similarities in refined oil consumption patterns and demand characteristics caused by frequent commercial activities; while cities and prefectures primarily engaged in agriculture or resource development may form another group due to different industrial structures. The clustering results at the provincial geographical level help provincial petroleum companies formulate differentiated market promotion and refined oil demand management plans based on the characteristics of each city and prefecture, rationally allocate resources, and improve operational efficiency.

[0061] 3. Clustering at the intra-provincial city address level

[0062] Within a city, the focus is on the impact of factors such as city size (e.g., urban area and population), urban functional zoning (e.g., distribution of commercial, industrial, and residential areas), public transportation development level (e.g., bus route coverage density and subway operating mileage), and the development of industrial parks around the city on the demand for refined oil products.

[0063] The relationship between urban residents' fuel consumption habits and travel patterns (such as private car ownership, public transportation usage, and shared bicycle usage) and the demand for refined oil products is analyzed, while the constraining effect of urban environmental decisions (such as vehicle restriction policies and environmental emission standards) on that demand is also considered.

[0064] Clustering algorithms are used to classify gas stations in different areas or business districts within a city. For example, because of concentrated commercial activity and residential fuel consumption, gas stations in commercial areas may have different peak consumption periods and consumption patterns from those in industrial areas, so the two are clustered into different categories. The clustering results at the intra-provincial city address level can provide decision support for optimizing the layout of gas stations, adjusting fuel delivery plans, and designing fuel-related activities for different areas, thereby improving the service quality and market competitiveness of gas stations.

[0065] 4. Clustering at the address level of different merchants within the same city

[0066] At the address level of different merchants within the same city, detailed micro-data are collected for each merchant (gas station), including geographical location (such as coordinates, surrounding road conditions, and distance from residential and commercial areas), service facilities (such as the number of fuel pumps, convenience store size, and whether additional services such as car washing are provided), and customer characteristics (such as customer origin, customer loyalty, customer fuel usage frequency, and spending distribution).

[0067] Detailed information on various refined oil-related activities carried out by merchants in the past (such as the frequency, type, intensity, and duration of refined oil-related activities) and the impact of refined oil-related activities on oil demand and customer traffic are collected and transformed into feature vectors that can be used for cluster analysis.

[0068] Clustering algorithms are used to group different merchants within the same city. For example, merchants with similar geographical locations and service facilities may cluster together, since they face similar customer groups and market environments, while merchants with distinctive services may form a separate group. Clustering results at the address level of different merchants within the same city help merchants understand their market positioning, learn from the successful experience of similar merchants, and develop personalized marketing strategies. Such strategies include promotions related to refined oil products that target specific customer groups in the surrounding area and optimization of service facilities to improve customer satisfaction, so that the merchant stands out in fierce market competition and increases both its refined oil demand and its market share.

[0069] In one embodiment, the method further includes:

[0070] The collected seasonal and historical demand information for refined oil products is cleaned and standardized to obtain standardized data.

[0071] Performing cluster analysis on the collected information based on a clustering algorithm includes: performing cluster analysis on the standardized data based on the K-means clustering algorithm.

[0072] In the above embodiments, data cleaning is a crucial step in ensuring data quality and analytical accuracy within the refined oil demand analysis system. Seasonal and historical demand information collected from multiple data sources may contain various noises and errors, such as data entry errors, missing data, and outliers. These issues can interfere with subsequent cluster analysis and model construction, reducing the reliability of the results. Therefore, data cleaning aims to remove these invalid or erroneous data, providing an accurate and consistent data foundation for subsequent analysis.

[0073] For collected seasonal information, check the accuracy of demand time data, ensuring consistent date formats and the absence of errors or duplicate records. For holiday information, verify its consistency with officially released information to avoid biases in subsequent analysis due to incorrect holiday information. If data is missing, such as some oil consumption records lacking periods of increased demand for refined oil products, appropriate interpolation can be used to supplement the data based on the temporal patterns of preceding and following data. If there is obviously erroneous demand time data (such as data that does not conform to actual consumption cycles), it should be corrected or deleted according to business logic.
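The duplicate removal and interpolation described above can be sketched as follows. This assumes daily records keyed by an ISO date string and linear interpolation between the nearest known neighbours; the text only says "appropriate interpolation", so the choice of method and both helper names are hypothetical.

```python
def drop_duplicates(records):
    """Keep the first record for each date, preserving order."""
    seen, unique = set(), []
    for date, value in records:
        if date not in seen:
            seen.add(date)
            unique.append((date, value))
    return unique

def interpolate_missing(values):
    """Fill interior None gaps linearly from the preceding and following values."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled  # leading or trailing gaps are left as None

# Hypothetical daily demand records with one duplicate and two missing days.
records = [
    ("2024-01-01", 10.0), ("2024-01-01", 10.0),
    ("2024-01-02", None), ("2024-01-03", None),
    ("2024-01-04", 16.0),
]
clean = drop_duplicates(records)
series = interpolate_missing([value for _, value in clean])
```

Gaps at the start or end of a series have no neighbour on one side, so this sketch leaves them unfilled rather than extrapolating.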

[0074] For non-oil related information, check the completeness of non-oil consumption data to ensure the accuracy of all commodity oil consumption records, without omissions or errors. When compiling data on refined oil-related activities, verify the accuracy of key information such as the start and end dates of these activities and promotional details to avoid information errors affecting the analysis of oil-non-oil interaction. If any anomalies are found in the correlation between non-oil consumption data and oil consumption data (such as data mismatch or logical contradictions), thoroughly investigate the data source and collection process, and make necessary corrections or adjustments.

[0075] The purpose of data standardization is to transform data with different characteristics into a form with uniform dimensions and comparable scales, so that each characteristic can be treated fairly in cluster analysis and subsequent model construction, and to avoid certain characteristics dominating the calculation due to differences in feature dimensions, thereby affecting the accuracy of the analysis results.

[0076] For categorical data such as consumption months and quarters related to seasonality, one-hot encoding or similar methods can be used to convert them into numerical features, ensuring that each category has an equal representation in the feature space. For numerical features such as time intervals with holidays, normalization can be performed, such as mapping them to specific intervals, to ensure that different time interval features have the same scale.
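A minimal sketch of the two transformations named above: one-hot encoding for a categorical quarter and min-max normalization for a numerical holiday-interval feature. The target interval [0, 1], the function names, and the `days_to_holiday` values are assumptions made for illustration.

```python
def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1 if value == category else 0 for category in categories]

def min_max(values, low=0.0, high=1.0):
    """Map values linearly onto [low, high]; a constant list maps to `low`."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [low for _ in values]
    scale = (high - low) / (largest - smallest)
    return [low + (v - smallest) * scale for v in values]

quarters = ["Q1", "Q2", "Q3", "Q4"]
encoded = one_hot("Q3", quarters)

# Hypothetical feature: number of days between a record and a holiday.
days_to_holiday = [0, 5, 10, 20]
scaled = min_max(days_to_holiday)
```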

[0077] For non-oil consumption data within oil-related information, standardization is performed based on characteristics such as commodity category and refined oil consumption amount. For example, the consumption of different types of non-oil products is normalized so that it can work synergistically with other data features in cluster analysis to jointly reflect the impact of oil-non-oil interaction on refined oil consumption. Through data cleaning and standardization, standardized data suitable for cluster analysis is finally obtained.

[0078] Figure 2 is a specific example diagram of an intelligent calculation method for refined oil demand in an embodiment of the present invention. In one embodiment, performing cluster analysis on the collected information based on a clustering algorithm to obtain data information at different regional category levels, as shown in Figure 2, includes:

[0079] Step 201: Determine the number of clusters of the collected information based on prior knowledge of the collected information;

[0080] Step 202: Randomly select multiple samples from the collected information as initial centroids; calculate the distance from each sample to each centroid in the collected information, and assign the sample to the cluster to which the nearest centroid belongs; recalculate the new centroid for each cluster; repeat the above steps of assignment and recalculation of new centroids until the centroids no longer move significantly or the preset maximum number of iterations is reached;

[0081] Step 203: The centroids obtained when they no longer move significantly, or when the preset maximum number of iterations is reached, are used as the clustering analysis result of the collected information; based on the clustering analysis result, data information at different regional category levels is obtained.

[0082] In the above embodiments, determining the appropriate number of clusters is a key step when performing cluster analysis on standardized data using the K-means clustering algorithm. Based on prior knowledge of the collected information, the number of clusters is determined by comprehensively considering the characteristics and business needs of the refined oil market at different regional category levels. For example, at the cross-provincial address level, a reasonable range for the number of clusters is initially determined based on differences in regional development levels, geographical location, and energy refined oil consumption structure. Then, by performing cluster analysis on a subset of sample data and evaluating the rationality of the clustering results (such as intra-cluster similarity and inter-cluster differences), the final number of clusters is further optimized and determined. At the provincial address level, based on factors such as industrial structure characteristics and transportation network layout, and referring to historical refined oil consumption data and market research information, a suitable number of clusters for provincial cluster analysis is determined. Similarly, at the intra-provincial city address level and the address level of different merchants within the same city, the corresponding number of clusters is determined based on prior knowledge such as differences in regional functions within the city, merchant operating models, and market competition, ensuring that the clustering results accurately reflect the data characteristics and market patterns at different regional category levels.
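One way to evaluate "intra-cluster similarity and inter-cluster differences" when comparing candidate cluster counts is the mean silhouette coefficient. The text does not name a metric, so the silhouette is an assumed choice here, and the points and label assignments are hypothetical.

```python
from math import dist

def silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two clusters."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items() if key != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = silhouette(points, [0, 0, 1, 1])  # compact, well separated clusters
bad = silhouette(points, [0, 1, 0, 1])   # each cluster spans both groups
```

A candidate cluster count whose labelling scores closer to 1 has tighter and better separated clusters; the score can be computed on a subset of samples, as the paragraph above suggests.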

[0083] Multiple samples are randomly selected as initial centroids from the preprocessed standardized data. To ensure the randomness and representativeness of the initial centroids, a random number generation algorithm is used to randomly sample samples from the dataset. During the selection process, it is important to ensure that the initial centroids are relatively evenly distributed in the data space to avoid excessive concentration of initial centroids, which could lead to biased clustering results. For example, when clustering across provincial address levels, initial centroids are randomly selected from provincial data in different geographical regions; at the provincial address level, they are selected from city data with different development levels and industrial structures within the province; at the city address level within the province, they are randomly determined from gas station data in different functional areas within the city; and at the address level of different merchants within the same city, they are randomly sampled from merchant data with different business scales and geographical locations.

[0084] The distance from each sample in the collected information to each centroid is calculated using the Euclidean distance formula, and the sample is assigned to the cluster of the nearest centroid. The distance is computed across all feature dimensions, so that the combined influence of features such as seasonal information and historical demand information is fully reflected in it. For example, the distance calculation for a gas station sample must take into account how the sample differs from the centroid in both its seasonal consumption characteristics and its oil-non-oil interaction characteristics.
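Written out, the distance and assignment rule take the following form. The symbols are notational assumptions, not taken from the original text: $x_i$ is the feature vector of sample $i$, $c_k$ is the centroid of cluster $k$, and $m$ is the number of feature dimensions.

```latex
\[
d(x_i, c_k) = \sqrt{\sum_{j=1}^{m} \left( x_{ij} - c_{kj} \right)^{2}},
\qquad
\operatorname{cluster}(x_i) = \arg\min_{k} \; d(x_i, c_k)
\]
```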

[0085] Recalculate the new centroid for each cluster. After sample allocation, for each cluster, calculate the average value of all samples within that cluster across all feature dimensions, using this as the new centroid coordinates. During the calculation process, accurately count the number of samples within each cluster to avoid centroid calculation errors caused by incorrect sample counts. For example, in cross-provincial address-level clustering, calculate the average value of features such as total demand and demand growth rate within each provincial cluster to update the centroid of that cluster; at the provincial address level, calculate the average value of relevant features within the city cluster to update the centroid; at the intra-provincial city address level and the address level of different merchants within the same city, calculate the average value of features within the corresponding clusters to update the centroid.

[0086] Repeat the above steps of assigning and recalculating new centroids until the centroids no longer move significantly or the preset maximum number of iterations is reached. To determine whether the centroids have moved significantly, a reasonable movement threshold is set (e.g., the movement distance of the centroids in each feature dimension is less than a certain value). After each iteration, the movement distance of the centroids is calculated and compared with the threshold. Simultaneously, the number of iterations is recorded. When the number of iterations reaches the preset maximum number of iterations, iteration stops even if the centroids have still moved to some extent. Through multiple iterations, the clustering results are continuously optimized, gradually increasing the sample similarity within each cluster and gradually increasing the differences between clusters, ultimately obtaining a stable and reasonable clustering result. The centroids obtained when they no longer move significantly or the preset maximum number of iterations is reached are used as the clustering analysis results of the collected information. Based on this clustering analysis result, data information at different regional category levels can be clearly obtained, that is, the regional categories represented by different clusters have similarities in refined oil consumption-related characteristics, providing a structured data foundation for further analysis.
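The iteration in steps 201 to 203 can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the applicant's code: the function name, the default threshold `tol`, the default `max_iter`, and the sample vectors are all hypothetical.

```python
import random
from math import dist

def kmeans(samples, k, max_iter=100, tol=1e-4, seed=0):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(samples, k)  # random initial centroids

    def assign(cents):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        return [min(range(k), key=lambda j: dist(s, cents[j])) for s in samples]

    for _ in range(max_iter):
        labels = assign(centroids)
        updated = []
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:
                # Update step: per-dimension mean of the cluster members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        # Largest centroid movement in this iteration, compared with the threshold.
        shift = max(dist(old, new) for old, new in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, assign(centroids)

# Hypothetical standardized feature vectors forming two obvious groups.
samples = [(0.0, 0.0), (0.2, 0.2), (0.4, 0.4), (5.0, 5.0), (5.2, 5.2), (5.4, 5.4)]
centroids, labels = kmeans(samples, k=2)
```

The loop stops on whichever condition is met first: the largest centroid movement falls below `tol`, or `max_iter` iterations have run.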

[0087] In specific implementation, after step 101 (acquiring the collected seasonal information and historical demand information of refined oil products, and performing cluster analysis on the collected information based on the clustering algorithm to obtain data information at different regional category levels), step 102 is performed: for the data information at each regional category level, feature engineering is applied to that data information to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level.

[0088] In this embodiment, after acquiring demand time data and determining consumption months and quarters for each regional category, the seasonal characteristics are further explored. The distribution of refined oil consumption at different times (e.g., weekdays, weekends, holidays) within each month or quarter is calculated, and the differences in demand at different times and their correlation with seasonal factors are analyzed. For example, during peak tourist seasons, it is observed whether the demand for refined oil at gas stations in popular tourist areas is significantly higher on weekends and holidays than on weekdays, and the changing patterns of this difference across different seasons. Simultaneously, the impact of seasonal climate change on refined oil demand in different regions is considered. For instance, in cold winter regions of northern China, increased heating demand leads to higher demand for industrial diesel and residential heating oil. The manifestation of this demand change at different regional categories, such as provinces and cities within provinces, is analyzed. By statistically analyzing the correlation between historical data on temperature, precipitation, and other climate data for different seasons and refined oil demand, climate-related seasonal characteristics are extracted, such as the growth rate of diesel demand during low-temperature periods and the fluctuation characteristics of gasoline demand during high-temperature periods. This comprehensively depicts the seasonal characteristics of refined oil in different regions, providing richer evidence for subsequent precise analysis.
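Two of the seasonal features described above can be sketched as follows: the share of demand by day type and the correlation between temperature and demand. The use of the Pearson coefficient is an assumption (the text only refers to "correlation"), and all figures are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def day_type_shares(records):
    """records: (day_type, demand) pairs -> share of total demand per day type."""
    totals = defaultdict(float)
    for day_type, demand in records:
        totals[day_type] += demand
    grand_total = sum(totals.values())
    return {day_type: total / grand_total for day_type, total in totals.items()}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly demand split by day type for one region.
records = [("weekday", 60.0), ("weekend", 30.0), ("holiday", 10.0)]
shares = day_type_shares(records)

# Hypothetical temperatures (degrees Celsius) and diesel demand for the same periods.
temperatures = [-10.0, -5.0, 0.0, 5.0]
diesel_demand = [40.0, 35.0, 30.0, 25.0]
r = pearson(temperatures, diesel_demand)
```

A strongly negative `r` for a northern region would be consistent with the heating-driven increase in diesel demand during low-temperature periods described above.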

[0089] When analyzing the trend characteristics of refined oil products, the long-term series of refined oil product consumption data is first decomposed using methods such as moving averages and exponential smoothing to separate the data into long-term trends, seasonal cycles, and random fluctuations. At the cross-provincial address level, the overall growth or decline trend of total refined oil product demand over multiple years is observed across large regions. This analysis examines how macroeconomic factors such as energy-related policy adjustments affect this long-term trend, and the commonalities and differences in long-term trends among different provincial clusters. At the provincial address level, the relationship between refined oil product consumption trends within each province and local industrial restructuring and transportation infrastructure construction is studied. For example, as a province's highway network continues to improve, the changes in gasoline demand over the long term are observed, as well as the changing characteristics of diesel demand trends in provinces undergoing industrial transformation. At the intra-provincial city address level, the impact of urban development planning and population flow changes on refined oil product demand trends is analyzed. This includes the upward trend in refined oil product demand in emerging cities due to rapid population growth and urban expansion, and the downward trend in refined oil product demand during industrial upgrading in old industrial cities. At the address level of different merchants within the same city, the focus is on the consumption trends of individual merchants under changes in the surrounding competitive environment and customer groups. For example, if a merchant's market share declines because a new gas station opens nearby, the short-term and long-term trends in its refined oil demand are observed. Through these analyses, the trend characteristics of refined oil products at different regional category levels can be accurately grasped.
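The two smoothing methods named in this paragraph can be sketched as follows. This is a simplified illustration: a centered moving average estimates the trend and the remainder is left as a combined seasonal-plus-random component, whereas a full decomposition would further split that remainder. The window size, the smoothing factor `alpha`, and the demand series are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; positions without a full window are None."""
    if window % 2 == 0:
        raise ValueError("window must be odd for a centered average")
    half = window // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        trend[i] = sum(series[i - half:i + half + 1]) / window
    return trend

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing with smoothing factor alpha in (0, 1]."""
    smoothed = [series[0]]
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

def detrend(series, window):
    """Return (trend, remainder); remainder = seasonal + random components."""
    trend = moving_average(series, window)
    remainder = [None if t is None else v - t for v, t in zip(series, trend)]
    return trend, remainder

# Hypothetical monthly demand for one cluster.
demand = [10.0, 12.0, 14.0, 13.0, 15.0, 17.0, 16.0, 18.0, 20.0]
trend, remainder = detrend(demand, window=3)
smooth = exponential_smoothing(demand[:3], alpha=0.5)
```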

[0090] When constructing the cross-features of refined oil products, the interrelationships between different types of information are fully considered. When considering the cross-features of non-oil product interactions and local development indicators, the relationship between non-oil product refined oil-related activities at gas stations and oil demand is studied in regions with different levels of development. In developed regions, the correlation between convenience store refined oil-related activities and gasoline demand is analyzed, as well as the impact of car-related services (such as car washing and repair) on customer refueling loyalty and oil demand. In less developed regions, the interaction between non-oil basic necessities and diesel demand (mainly used for agricultural production and logistics) is observed. Through this cross-analysis, refined oil product cross-features that reflect the characteristics of different regional categories are constructed, more comprehensively capturing the complex factors influencing refined oil product demand.

[0091] In specific implementation, after step 102 (for the data information at each regional category level, performing feature engineering on that data information to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level), step 103 is performed: based on the decision tree model ensemble learning algorithm, the refined oil feature importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics at the corresponding regional category level is determined, and the features whose refined oil feature importance values exceed a preset value are selected as the target features of the corresponding regional category level.

[0092] Figure 3 is a specific example diagram of an intelligent calculation method for refined oil demand in an embodiment of the present invention. In one embodiment, determining, based on a decision tree model ensemble learning algorithm, the refined oil feature importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level, as shown in Figure 3, includes:

[0093] Step 301: Establish the first training dataset based on the seasonal characteristics, trend characteristics, and cross-characteristics of refined oil products at the corresponding regional category level;

[0094] Step 302: Based on the decision tree model ensemble learning algorithm, train the pre-set decision tree model with the first training dataset to obtain the model for determining the importance of refined oil features at the category level of the corresponding region;

[0095] Step 303: Based on the refined oil feature importance determination model, determine, for each feature, the number of times it is selected as a splitting feature during decision tree splitting, the information gain it brings at the splitting nodes, and the number of samples it covers in the tree;

[0096] Step 304: Determine the feature importance value of each feature based on its number of selections, its information gain, and the number of samples it covers.

[0097] In the above embodiments, a first training dataset is established based on the seasonal characteristics, trend characteristics, and cross-characteristics of refined oil products at the corresponding regional category level. At the cross-provincial address level, historical refined oil consumption data covering many years are collected for each province. For example, for a specific province, the fluctuations in refined oil consumption during different seasons, such as the summer peak tourist season and the winter heating season, are recorded, along with the changes in consumption trends over time due to industrial restructuring. Data on the cross-influence between demand fluctuations and seasonal factors, and between regional development levels and oil-non-oil interaction, are also included. The data are organized by province, with each province's data serving as one sample. The sample's features include seasonal characteristics (such as the proportion of demand in each season and changes in demand under the influence of special seasonal events), trend characteristics (such as the slope of the multi-year demand trend curve and trend inflection points), and cross-characteristics (such as demand elasticity under the interaction of demand and seasonal factors, and changes in demand associated with regional development indicators and oil-non-oil interaction), thus constructing the first training dataset at the cross-provincial address level.

[0098] At the provincial-level geographic level, data is collected from each city within the province. Detailed records are kept of the refined oil consumption characteristics of each city in different seasons, such as the peak diesel demand characteristics of major agricultural cities during the busy farming season, and the surge in gasoline demand in tourist cities during the peak tourist season. Simultaneously, the long-term consumption trends of each city are analyzed in relation to local policy decisions and industrial development, such as the changing trends in refined oil demand structure in industrial cities during industrial upgrading. The data from each city are compiled into samples to construct the first training dataset at the provincial-level geographic level, with each sample encompassing refined oil-related characteristic information for that city.

[0099] At the city address level within the province, data was collected from gas stations in different areas or business districts. Data on refined oil consumption at each gas station was collected for different seasons, including consumption differences during weekdays, weekends, and holidays, analyzing how long-term consumption trends are affected by surrounding transportation development and population flow. The cross-influence of environmental factors surrounding gas stations and non-oil data was considered; for example, the impact of the correlation between non-oil and oil data on demand at gas stations located in commercial areas during holidays was examined. The data from each gas station was organized into samples to construct the first training dataset at the city address level within the province. The sample characteristics reflect the seasonality, trends, and cross-influence of refined oil consumption at the gas station level.

[0100] At the address level of different merchants within the same city, the focus is on the refined oil consumption data of individual merchants. Consumption fluctuations across seasons are recorded in detail, such as the impact of increased air conditioning use during summer high temperatures on gasoline demand at surrounding merchants and the impact of winter heating needs on diesel demand at merchants near residential areas. Individual merchant consumption trends are also analyzed, including increases or decreases caused by changes in customer groups or the emergence of competitors. The data of each merchant are compiled into a sample to construct the first training dataset at the address level of different merchants within the same city, so that each sample comprehensively reflects the refined oil consumption characteristics of an individual merchant.

[0101] An ensemble learning algorithm based on decision tree models (such as XGBoost) is used to train the pre-configured decision tree model on a pre-established training dataset. Before training begins, the initial parameters of the decision tree model are set appropriately. For the learning rate, a suitable value between 0.1 and 0.3 is chosen based on the size and complexity of the dataset. For larger datasets with more features, a smaller learning rate (e.g., 0.1) is chosen to ensure the stability and accuracy of model training; if the dataset is relatively simple, the learning rate can be appropriately increased (e.g., 0.2) to speed up training. The number of trees is determined, initially between 100 and 500, and adjusted based on the model's training performance. The maximum tree depth is set between 3 and 6 layers to avoid overfitting due to excessive model complexity, while still capturing the relationships between features effectively. The regularization parameter is estimated based on the feature distribution of the dataset. An appropriate regularization strength is initially determined through analysis of a subset of sample data to prevent the model from overfitting the training data.
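The initial parameter choices above can be collected into a small configuration helper, sketched below. The keys follow the parameter names used by XGBoost's scikit-learn style interface (`learning_rate`, `n_estimators`, `max_depth`, `reg_lambda`); the size thresholds, the depth of 4, and the regularization value are hypothetical starting points, since the text gives ranges rather than exact rules.

```python
def initial_params(n_samples, n_features):
    """Return starting hyperparameters within the ranges given in the text."""
    # Hypothetical cut-offs for what counts as a large, feature-rich dataset.
    is_large = n_samples > 10_000 or n_features > 50
    return {
        "learning_rate": 0.1 if is_large else 0.2,  # text: 0.1 to 0.3
        "n_estimators": 100,                        # text: 100 to 500, tuned later
        "max_depth": 4,                             # text: 3 to 6 layers
        "reg_lambda": 1.0,                          # assumed L2 regularization strength
    }

params_large = initial_params(n_samples=100_000, n_features=80)
params_small = initial_params(n_samples=500, n_features=10)
```

These values are only a starting point; the paragraph above describes adjusting the number of trees and the regularization strength based on training performance and on analysis of a subset of the data.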

[0102] During training, the gradient and second derivative of the loss function (e.g., the squared loss function) are calculated using the sample features of the first training dataset. For each sample, the gradient of the loss function is calculated from its true value and the current model output (analysis value), reflecting the rate of change of the model's analysis error; the second derivative of the loss with respect to the model output (the corresponding diagonal element of the Hessian matrix) is calculated to measure the curvature of the loss function. Together, the two quantities are used to determine the optimal split points and structure of the decision tree. When constructing the decision tree, starting from the root node, the optimal splitting feature and split point are selected according to the information gain they provide, dividing the dataset into two child nodes. This process is repeated recursively until a stopping condition is met (e.g., the maximum depth is reached, the number of samples in a node falls below a threshold, or the information gain falls below a threshold). By iteratively adding decision trees to the model, the objective function (comprising the loss function and the regularization term) is continuously optimized, allowing the model to gradually approach the optimal solution. In each iteration, the contribution of the newly added tree is scaled by the learning rate, while column-block parallel computing and sparsity-aware optimization strategies are used to improve computational efficiency. After multiple iterations of training, the refined oil feature importance determination model at the corresponding regional category level is obtained.
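For the squared loss, the gradient and second derivative have a simple closed form, and the gain of a candidate split can be computed from their sums on each side. The sketch below uses the split-gain formula from gradient-boosted tree methods such as XGBoost; `lam` and `gamma` stand for the regularization terms, and the function names and data are hypothetical.

```python
def grad_hess(y_true, y_pred):
    """Squared loss 0.5 * (pred - y)^2: gradient = pred - y, second derivative = 1."""
    gradients = [pred - y for y, pred in zip(y_true, y_pred)]
    hessians = [1.0] * len(y_true)
    return gradients, hessians

def split_gain(g, h, left, right, lam=1.0, gamma=0.0):
    """Gain of splitting a node into index lists `left` and `right`."""
    def score(indices):
        g_sum = sum(g[i] for i in indices)
        h_sum = sum(h[i] for i in indices)
        return g_sum * g_sum / (h_sum + lam)
    return 0.5 * (score(left) + score(right) - score(left + right)) - gamma

def best_split(x, g, h, lam=1.0):
    """Scan the sorted values of one feature; return (best gain, threshold)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = (float("-inf"), None)
    for pos in range(1, len(order)):
        low, high = x[order[pos - 1]], x[order[pos]]
        if low == high:
            continue  # no threshold can separate equal values
        gain = split_gain(g, h, order[:pos], order[pos:], lam)
        if gain > best[0]:
            best = (gain, (low + high) / 2)
    return best

# Hypothetical single feature, targets, and a constant initial prediction.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
g, h = grad_hess(y, [3.0] * len(y))
gain, threshold = best_split(x, g, h)
```

In this toy data the best threshold falls between the two groups of samples, which is where the gradient sums on the two sides differ most.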

[0103] Using a trained model to determine the importance of refined oil features, we determine the number of times each feature is selected as a splitting feature during decision tree splitting, the information gain it brings at splitting nodes, and the number of samples it covers in the tree. At the cross-provincial address level, for the feature of regional development level, we observe its frequency of selection during decision tree splitting. If it is preferentially selected in multiple splits, it indicates that this feature is significant in distinguishing the refined oil consumption of different provincial clusters, and its selection frequency is high. We calculate the information gain brought by this feature at splitting nodes, i.e., the degree to which the purity of the data within the child node is improved after splitting using this feature. The greater the information gain, the greater the contribution of the feature to classification. We count the number of samples covered by this feature in the tree. If it covers samples from most provinces, it indicates that it has a wide influence on the entire dataset.
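The three indicators can be gathered by walking every split node of every tree. The sketch below assumes a toy tree representation (nested dictionaries with `feature`, `gain`, `n_samples`, `left`, and `right` keys) invented for illustration; it is not the storage format of any particular library. XGBoost's Python API reports comparable per-feature measures through `Booster.get_score` with importance types such as `weight`, `gain`, and `cover`.

```python
from collections import defaultdict

def collect_split_stats(trees):
    """Accumulate selection count, total gain, and covered samples per feature."""
    stats = defaultdict(lambda: {"count": 0, "gain": 0.0, "cover": 0})

    def walk(node):
        if "feature" not in node:  # leaf node: nothing to record
            return
        entry = stats[node["feature"]]
        entry["count"] += 1
        entry["gain"] += node["gain"]
        entry["cover"] += node["n_samples"]
        walk(node["left"])
        walk(node["right"])

    for tree in trees:
        walk(tree)
    return dict(stats)

# Two hypothetical trees; empty dictionaries stand for leaves.
trees = [
    {
        "feature": "development_level", "gain": 9.0, "n_samples": 6,
        "left": {"feature": "season_share", "gain": 2.0, "n_samples": 3,
                 "left": {}, "right": {}},
        "right": {},
    },
    {"feature": "development_level", "gain": 4.0, "n_samples": 6,
     "left": {}, "right": {}},
]
stats = collect_split_stats(trees)
```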

[0104] At the provincial address level, taking industrial structure characteristics as an example, we analyze their role in classifying data from various cities. If this characteristic frequently appears in decision tree splitting, it indicates a significant impact on the differences in refined oil consumption among different cities within the province. We calculate its information gain to understand how this characteristic effectively distinguishes the consumption patterns of cities with different industrial structures. We determine the number of samples it covers to assess the representativeness of this characteristic at the provincial address level.

[0105] At the city address level within the province, taking the geographical location feature of gas stations as an example, its selection in decision tree splits is examined. Frequent use in splits indicates that it plays a significant role in differentiating consumption patterns among gas stations in different areas of the city. Its information gain is calculated to measure how well the feature explains local consumption differences within the city, and the number of samples it covers is counted to assess its influence at the city address level.

[0106] At different merchant address levels within the same city, taking merchant service facility characteristics as an example, we observe their performance in decision tree splitting. If this feature is selected multiple times, it indicates its significant role in distinguishing the refined oil consumption of different merchants. We calculate the information gain to determine its contribution to the classification of individual merchant consumption. We statistically analyze the number of covered samples to understand its degree of influence at the merchant level.

[0107] Based on the aforementioned indicators for each feature, the feature importance value is calculated comprehensively. A common calculation method is to use weighted summation, assigning reasonable weights (such as determining the weight ratios based on experience or data analysis) to the number of times selected, information gain, and number of samples covered, and then calculating the weighted sum as the feature importance value. For example, for a certain feature, its number of selections might have a weight of 0.4, information gain a weight of 0.3, and number of samples covered a weight of 0.3. The corresponding indicator values for this feature are multiplied by their respective weights and then summed to obtain its feature importance value. In this way, the importance value of each feature at different regional category levels is determined, providing a basis for subsequent target feature selection.
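The weighted summation can be sketched as follows, using the example weights 0.4, 0.3, and 0.3 given above. Because the three indicators are on different scales, this sketch first divides each indicator by its total across features; that normalization step, the function names, and the indicator values are assumptions made for illustration only.

```python
def normalize(values):
    """Scale non-negative values so that they sum to 1 (all zeros stay zero)."""
    total = sum(values)
    return [v / total if total else 0.0 for v in values]


def feature_importance(stats, weights=(0.4, 0.3, 0.3)):
    """Weighted sum of normalized (split count, information gain, samples covered).

    stats maps a feature name to a (split_count, info_gain, cover) tuple.
    """
    names = list(stats)
    columns = list(zip(*(stats[name] for name in names)))  # one column per indicator
    normalized = [normalize(column) for column in columns]
    return {
        name: sum(w * normalized[k][i] for k, w in enumerate(weights))
        for i, name in enumerate(names)
    }


# Hypothetical indicator values for two features at one regional category level.
stats = {
    "regional_development_level": (30, 6.0, 800),
    "highway_mileage": (10, 2.0, 200),
}
scores = feature_importance(stats)
```

With these hypothetical inputs the scores sum to 1, so they can be compared directly with a preset value expressed as a fraction.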

[0108] Preset values are established based on the historical data distribution and business needs of different regional categories. At the cross-provincial address level, due to the wide data coverage and significant feature differences, the preset values can be relatively lenient to select features that significantly impact refined oil consumption across large regions. At the address level of different merchants within the same city, due to the more micro-level data and more complex competitive environment, the preset values are relatively strict to focus on the features most critical to the consumption of individual merchants. The calculated importance values of refined oil features are compared with the preset values, and features exceeding the preset values are selected as target features for the corresponding regional category level. At the cross-provincial address level, macro-level features such as regional development level and transportation infrastructure construction may be selected as target features, as these features have a significant impact on the refined oil consumption patterns and trends across different areas. At the provincial address level, features such as industrial restructuring, regional energy consumption habits, and the status of provincial transportation hubs may become target features, contributing to the formulation of refined oil demand control strategies and resource allocation at the provincial level. At the intra-provincial city address level, features such as urban planning layout, public transportation development level, and regional commercial activity intensity may be selected as target features, providing direction for optimizing the refined oil market within cities. Within the same city, at different merchant address levels, characteristics such as geographical advantages, service facilities, and customer loyalty may become target features, guiding individual merchants' business decisions and market competition strategies. 
These target features accurately reflect the key factors influencing refined oil demand at different regional categories, providing a core feature set for subsequent demand analysis model construction and precise analysis. This improves the relevance and accuracy of the analysis model and helps to achieve scientific management and decision optimization of refined oil consumption at different regional categories.
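The comparison with preset values can be sketched in a few lines of Python. The level names and threshold numbers below are hypothetical; they only mirror the idea that the preset value is more lenient at the cross-provincial level and stricter at the merchant level.

```python
# Hypothetical preset values per regional category level (lenient to strict).
PRESET_VALUES = {
    "cross_provincial": 0.10,
    "provincial": 0.15,
    "city": 0.20,
    "merchant": 0.25,
}


def select_target_features(importance, level, preset_values=PRESET_VALUES):
    """Return features whose importance exceeds the level's preset value,
    ordered from most to least important."""
    threshold = preset_values[level]
    selected = [name for name, value in importance.items() if value > threshold]
    return sorted(selected, key=lambda name: importance[name], reverse=True)


# Hypothetical importance values for three features.
importance = {"feature_a": 0.30, "feature_b": 0.12, "feature_c": 0.05}
cross_provincial_targets = select_target_features(importance, "cross_provincial")
merchant_targets = select_target_features(importance, "merchant")
```

The same importance values give a larger target feature set under the lenient threshold than under the strict one.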

[0109] Figure 4 is a diagram of a specific example of the intelligent calculation method for refined oil demand in an embodiment of the present invention. In one embodiment, as shown in Figure 4, training a preset decision tree model with the first training dataset, based on the decision tree model ensemble learning algorithm, to obtain a model for determining the importance of refined oil features at the category level corresponding to the region includes:

[0110] Step 401: Select the XGBoost model, random forest model, or gradient boosting decision tree model as the decision tree model;

[0111] Step 402: Initialize the model hyperparameters of the decision tree model; the model hyperparameters include: learning rate, maximum tree depth, subsample ratio, column sampling ratio, and regularization parameter;

[0112] Step 403: Combining the model hyperparameters, the decision tree model is trained using the first training dataset based on the decision tree model ensemble learning algorithm to obtain a model for determining the importance of refined oil features at the category level for the corresponding region.

[0113] In the above embodiments, when determining the decision tree model, XGBoost, Random Forest, or Gradient Boosting Decision Tree models can be selected. These three models each have their own advantages in handling complex data relationships and analytical problems. XGBoost, through its optimized gradient boosting algorithm, can efficiently handle large-scale datasets and finely control the loss function and regularization term during model training, effectively preventing overfitting. It is suitable for scenarios involving complex data features and strong nonlinear relationships in refined oil demand analysis. Random Forest, by constructing multiple decision trees and randomly selecting features for training, can effectively reduce model variance and improve model stability and generalization ability. When the data contains some noise or some features are highly correlated, Random Forest can provide more robust analytical results. Gradient Boosting Decision Tree models construct decision trees iteratively, with each tree improving the analytical error based on the previous tree. It has strong learning capabilities, especially performing well when handling data with obvious trends and hierarchical structures, and can effectively capture long-term trends and seasonal variations in refined oil consumption data. Appropriate models can be flexibly selected based on the characteristics of refined oil consumption data at different regional levels and business needs. For example, at the cross-provincial address level, due to the large data scale and the involvement of various complex factors such as geography, the XGBoost model may be more suitable for mining deep feature relationships; while at the intra-provincial city address level, the random forest model can better handle the diversity and uncertainty of data in local areas.

[0114] For the selected decision tree model, proper initialization of hyperparameters is a crucial step in ensuring model performance. The learning rate determines the step size for updating parameters in each iteration. During initialization, if the relationships between data features are complex and the dataset is large, a smaller learning rate (e.g., 0.05-0.1) can be set to allow for more precise parameter adjustments and avoid missing the optimal solution, but training time may be longer. If the data is relatively simple and faster training is desired, the learning rate can be increased appropriately (e.g., 0.15-0.2). The maximum tree depth controls the growth scale of the decision tree, preventing overfitting due to excessive model complexity. For cases with many features and a relatively dispersed data distribution, a larger maximum tree depth (e.g., 5-8 layers) can be set to allow the model to fully learn the data features. If the data features are relatively simple and concentrated, a maximum tree depth of 3-5 layers may be more appropriate. The subsample ratio controls the proportion of samples used in each iteration. This parameter can introduce randomness and improve the model's generalization ability. It is generally set between 0.5 and 1; for example, setting it to 0.8 means randomly selecting 80% of the samples for training in each iteration. The column sampling ratio determines the proportion of features considered when constructing a decision tree each time, which can also increase the model's randomness and generalization ability. When the feature dimension is high, a lower column sampling ratio (e.g., 0.5-0.7) can be set to avoid the model over-relying on certain features; if the number of features is relatively small, the ratio can be appropriately increased (e.g., 0.8-1). Regularization parameters are used to control the complexity of the model and prevent overfitting. 
For the L1 regularization parameter, it can be set according to the sparsity of the features. If the features have many zero values or are sparse, the L1 regularization parameter can be appropriately increased (e.g., 0.1-0.5); for the L2 regularization parameter, it is generally set between 0.01 and 0.1, and the specific value needs to be adjusted according to the characteristics of the dataset and the model training effect.
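The initialization guidance above can be condensed into a small Python helper. This is a sketch only: the three boolean switches are a simplification introduced here, the specific values are picked from inside the ranges quoted in the text, and the key names follow those used by XGBoost's scikit-learn style interface (`learning_rate`, `max_depth`, `subsample`, `colsample_bytree`, `reg_alpha`, `reg_lambda`). Setting the L1 parameter to 0.0 for dense features is an assumption, since the text gives no value for that case.

```python
def init_hyperparameters(complex_data, high_dimensional, sparse_features):
    """Pick starting hyperparameters from the ranges described in the text."""
    return {
        # Smaller steps for complex, large data; larger steps for simple data.
        "learning_rate": 0.05 if complex_data else 0.15,
        # Deeper trees for many features and dispersed data.
        "max_depth": 6 if complex_data else 4,
        # Fraction of samples drawn for each iteration.
        "subsample": 0.8,
        # Fraction of features considered per tree.
        "colsample_bytree": 0.6 if high_dimensional else 0.9,
        # L1 regularization, raised when features are sparse (0.0 is an assumption).
        "reg_alpha": 0.3 if sparse_features else 0.0,
        # L2 regularization, within the 0.01 to 0.1 range.
        "reg_lambda": 0.05,
    }


# Hypothetical cross-provincial setting: complex, high-dimensional, dense data.
params = init_hyperparameters(complex_data=True, high_dimensional=True, sparse_features=False)
```

These are starting points; as the paragraph notes, the final values depend on the dataset and on the observed training effect.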

[0115] Based on the initialized model hyperparameters, an ensemble learning algorithm for decision tree models (using XGBoost as an example) is used to train the decision tree model on the first training dataset. At the start of training, the model is gradually adjusted from the initial parameters according to a pre-set learning rate. For each sample, the difference between its analytical value and the true value under the current model is calculated, and then the gradient and second derivative of the loss function (such as the squared loss function) are calculated. When constructing the decision tree, some features are randomly selected according to the column sampling ratio, and then the best splitting features and splitting points are selected according to the importance of the features and information gain (calculated by means of the Gini index or the reduction of information entropy), dividing the dataset into two child nodes. In this process, some samples are randomly sampled according to the subsample ratio to construct the decision tree, increasing the randomness and generalization ability of the model. At the same time, the constraint of regularization parameters on model complexity is considered to prevent the decision tree from overgrowing. After each decision tree is constructed, the model updates the parameters according to the learning rate, so that the model gradually approaches the optimal solution. As the iteration progresses, the above process is repeated continuously, adding new decision tree models and optimizing the objective function (including the loss function and regularization terms). During training, the model's training effect is monitored using built-in evaluation metrics (such as mean squared error and accuracy). 
When the model performance no longer improves on the validation set or reaches the preset number of iterations, training is stopped, and a model for determining the importance of refined oil features at the category level for the corresponding region is obtained.

[0116] If a random forest model is chosen, the training process differs slightly. First, based on the initialized maximum tree depth and other hyperparameters, the random forest model uses bootstrap sampling to draw multiple subsets with replacement from the first training dataset, each the same size as the original dataset. A decision tree is constructed for each subset. During construction, a random subset of features (determined by the column sampling ratio) is considered when splitting nodes, which reduces the correlation between the trees and improves the model's generalization ability. Each decision tree grows independently without pruning to make full use of data diversity. After all decision trees are constructed, the random forest model obtains the final analysis result by averaging the analysis results of all decision trees (or by voting, for classification problems). During training, model performance can also be monitored using evaluation metrics to ensure that the model can accurately determine the importance of refined oil features at different regional category levels.
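The sampling and aggregation steps of the random forest procedure can be sketched with the Python standard library alone. Tree construction itself is omitted; the helper names, the seed, and the feature names are hypothetical.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap subset)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def sample_feature_subset(feature_names, colsample, rng):
    """Pick a random subset of features according to the column sampling ratio."""
    k = max(1, round(colsample * len(feature_names)))
    return rng.sample(feature_names, k)


def average_predictions(per_tree_predictions):
    """Combine the analysis values of all trees by simple averaging (regression)."""
    return sum(per_tree_predictions) / len(per_tree_predictions)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
rows = bootstrap_indices(10, rng)
columns = sample_feature_subset(
    ["gdp_growth", "highway_mileage", "energy_subsidy", "season"], 0.5, rng
)
combined = average_predictions([10.0, 20.0, 30.0])
```

Because each tree sees a different bootstrap subset and a different feature subset, the trees are less correlated with one another, which is what makes the averaged result more stable.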

[0117] The training process for the gradient boosting decision tree model is similar to that of the XGBoost model, optimizing the objective function by iteratively adding decision trees. In each iteration, the negative gradient of the loss function is calculated, and a decision tree is constructed based on this negative gradient. This ensures that the analysis value of the decision tree fits the negative gradient to the greatest extent possible, thereby reducing the value of the loss function. When constructing the decision tree, hyperparameters such as subsample ratio and column sampling ratio are considered, and model complexity is controlled through regularization parameters. As iterations proceed, the model continuously adjusts the weights of the decision trees to make the final analysis results more accurate. Through multiple iterations of training, a model for determining the importance of refined oil features applicable to different regional category levels is obtained. This model can accurately assess the importance of each feature to refined oil consumption at different regional category levels, providing a reliable basis for subsequent target feature selection and demand analysis model construction.
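A minimal Python sketch of fitting each new tree to the negative gradient is given below. It assumes squared loss, a single input feature, and depth-1 trees (stumps), and it omits subsampling, column sampling, and regularization for brevity; all names and data values are hypothetical.

```python
def fit_stump(x, targets):
    """Find the single threshold on x that minimizes squared error on targets."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for xi, t in zip(x, targets) if xi <= threshold]
        right = [t for xi, t in zip(x, targets) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((t - left_mean) ** 2 for t in left)
        sse += sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left value, right value)


def predict_stump(stump, xi):
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def fit_gbdt(x, y, n_trees=50, learning_rate=0.1):
    """Boosting with squared loss: each stump fits the current negative gradient."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        # Negative gradient of 0.5 * (y - pred)**2 with respect to pred.
        negative_gradient = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, negative_gradient)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, xi) for p, xi in zip(preds, x)]
    return base, learning_rate, stumps


def predict_gbdt(model, xi):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)


# Hypothetical data: demand jumps between the third and fourth feature value.
x = [1, 2, 3, 4, 5, 6]
y = [10, 11, 10, 30, 31, 29]
model = fit_gbdt(x, y)
```

Each added stump moves the analysis values a fraction (the learning rate) of the way toward the remaining error, so the training loss decreases step by step.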

[0118] In specific implementation, step 103 determines, based on the decision tree model ensemble learning algorithm, the feature importance value of each of the seasonal features, trend features, and cross features of refined oil products at the corresponding regional category level, and selects the features whose importance values exceed the preset values as target features at that level. Step 104 then follows: a second training dataset is established from the obtained refined oil demand data at different regional category levels and the data of the multiple target features at those levels, and the preset neural network model is trained with the second training dataset, based on the neural network algorithm, to obtain the refined oil demand analysis model.

[0119] In this embodiment, to establish the second training dataset, it is first necessary to comprehensively collect refined oil demand data at different regional category levels. At the cross-provincial address level, total refined oil demand data for each province is obtained from channels such as energy reserve management departments and regional storage centers of large oil companies, including the demand for different oil products such as gasoline and diesel. Time series information on demand is also recorded, such as monthly, quarterly, or annual demand changes. This is combined with target feature data for the corresponding regions. These target features cover previously determined macro-level characteristics such as regional development level and transportation infrastructure construction. For example, data such as a province's GDP growth, highway mileage, and provincial energy subsidies are correlated with the province's refined oil demand data to ensure that each sample point contains complete demand information and relevant target feature information, thus constructing the foundation of the second training dataset at the cross-provincial address level.

[0120] At the provincial address level, detailed data on refined oil demand in the cities and prefectures of the province are collected from provincial energy management agencies and the storage departments of large petroleum enterprises within the province, including the demand of individual oil depots. Target feature data determined at the provincial level are incorporated at the same time, such as the industrial restructuring of each city and prefecture, regional energy consumption habits, and status as a provincial transportation hub. The demand data of each city and prefecture are combined with the corresponding target feature data to form the sample points of the second training dataset at the provincial address level, reflecting refined oil demand in each city and prefecture and its relationship with the relevant influencing factors.

[0121] At the city address level within the province, data on refined oil demand in different areas or business districts within the city are obtained through urban energy management departments, gas stations, and small storage facilities. This includes specific information such as the oil storage tank capacity and delivery frequency of each gas station, as well as target characteristic data such as urban planning layout, public transportation development level, and regional commercial activity intensity. The demand data for each area within the city are mapped one-to-one with the corresponding target characteristic data to construct a second training dataset at the city address level within the province. This allows for accurate analysis of the changing patterns of refined oil demand in different areas within the city and its correlation with regional characteristics.

[0122] Within the same city, at different merchant address levels, data on refined oil demand is collected from the inventory management systems of individual merchants (gas stations). This includes information such as daily inventory levels, replenishment cycles, and inventory turnover rates, as well as target characteristic data such as the merchant's geographical advantages, service facilities, and customer loyalty. By combining each merchant's demand data with its own target characteristic data, a second training dataset is constructed across different merchant address levels within the same city. This dataset provides data support for accurately analyzing changes in refined oil demand among individual merchants.

[0123] In one embodiment, after data collection and organization, the demand data and target feature data for different regional categories are constructed into a second training dataset according to a specific format. For each regional category, the order of samples in the dataset is ensured to be reasonable, such as arranged according to geographical location or time order, to facilitate subsequent model training and data analysis. During the construction process, the completeness and consistency of the data are checked to ensure that each sample contains the required demand data and target feature data, and that there are no missing or erroneous values. If data is found to be missing, it is appropriately imputed according to the data distribution pattern and business logic, such as using mean imputation, median imputation, or model-based imputation methods (e.g., using other relevant features to construct a regression model to analyze missing values). For outliers, they are identified through data distribution analysis or comparison with historical data. If determined to be outliers, they are corrected or deleted according to the specific circumstances to ensure the quality of the dataset.
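The completeness check and outlier handling can be sketched with Python's `statistics` module. Median imputation is one of the options named above; the interquartile-range rule used here to flag outliers is an assumption, since the text only refers to data distribution analysis. The demand values are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outlier_indices(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical daily demand records with one gap and one suspicious spike.
raw = [100, 102, None, 98, 101, 500]
filled = impute_median(raw)
suspects = iqr_outlier_indices(filled)
```

Flagged indices are only candidates; whether a value is corrected or deleted still depends on the business context, as the paragraph states.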

[0124] The constructed second training dataset is preprocessed to make it suitable for training with neural network algorithms. The demand data and target feature data are normalized or standardized. For example, min-max normalization maps the data to a specific interval, and Z-score standardization gives the data zero mean and unit variance. This improves the training efficiency and accuracy of the neural network model and avoids convergence difficulties or biased analysis results caused by large differences in data scale. In addition, based on data characteristics and business requirements, some features are encoded and transformed, such as converting categorical features to numerical features (e.g., using one-hot encoding or label encoding), so that the neural network model can process the data correctly. These preprocessing steps yield a high-quality second training dataset suitable for training, laying a foundation for subsequent model training based on neural network algorithms.
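The three transformations named above can be written compactly in Python. This is a sketch using only the standard library; the function names and the sample values are hypothetical.

```python
import statistics


def min_max_scale(values, low=0.0, high=1.0):
    """Map values linearly onto the interval [low, high]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [low for _ in values]  # constant column: avoid division by zero
    return [low + (v - v_min) * (high - low) / (v_max - v_min) for v in values]


def z_score(values):
    """Standardize values to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def one_hot(categories):
    """Encode categorical values as 0/1 vectors; also return the column order."""
    vocabulary = sorted(set(categories))
    rows = [[1 if c == v else 0 for v in vocabulary] for c in categories]
    return rows, vocabulary


scaled = min_max_scale([10, 20, 30])
standardized = z_score([10, 20, 30])
encoded, columns = one_hot(["gasoline", "diesel", "gasoline"])
```

In practice the scaling statistics would be computed on the training portion only and reused for validation and test data, so that no information leaks from the held-out sets.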

[0125] In this embodiment, when training the model based on the neural network algorithm, the first step is to select a suitable neural network architecture according to the characteristics of the refined oil demand analysis. For the cross-provincial and provincial address levels, where the data show obvious time series characteristics, recurrent neural networks (RNNs) and their variants, such as long short-term memory networks (LSTMs) or gated recurrent units (GRUs), can be considered. LSTMs handle long-term dependencies in long sequences through their gating mechanisms and are suitable for analyzing the complex dynamic relationship between refined oil demand and the target features over long periods. GRUs retain a certain ability to capture long-term dependencies while having a simpler structure and lower computational cost, which is advantageous when the data scale is large and real-time requirements are high. At the city address level within a province and at the address level of different merchants within the same city, the data may have spatial correlations (such as mutual influence between different areas within the city and competitive relationships between merchants), so the local perception and feature extraction capabilities of convolutional neural networks (CNNs) can be used to construct a hybrid architecture combining convolutional and recurrent neural networks. CNNs extract spatial features from the data, such as the impact of geographical proximity between urban areas on demand, and these are then combined with the time series features processed by RNNs or LSTMs to capture the changing patterns of refined oil demand more comprehensively.

[0126] Taking the combination of LSTM and CNN architecture as an example, when training a pre-configured neural network model using a second training dataset, the model parameters are first initialized. The weights and biases of the connections between neurons are randomly set, with initial weight values typically chosen within a small range (e.g., between -0.1 and 0.1) to avoid excessively large initial gradients that could lead to model instability. Then, the pre-processed second training dataset is input into the neural network model. For the LSTM part, the input data is processed sequentially according to time steps, using structural units such as forget gates, input gates, cell states, and output gates to learn the time-series dependency between demand data and target feature data, capturing long-term trends and seasonal variations. The CNN part extracts the spatial features of the data, using convolutional kernels that slide across the data to extract local correlation features between different regions or merchants, such as the mutual influence patterns of demand changes among adjacent merchants.
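For reference, the gate structure mentioned above corresponds to the standard LSTM update equations, reproduced here as a sketch. The embodiment does not give formulas, so the symbols are assumptions: $x_t$ is the input at time step $t$ (demand and target feature data), $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, $\odot$ element-wise multiplication, and $W_\ast$, $b_\ast$ the learned weights and biases.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

The additive cell state update is what allows information about long-term trends and seasonal variations to persist across many time steps.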

[0127] During training, the error between the model's analytical output and the actual demand data is calculated. Commonly used loss functions include mean squared error (MSE) and mean absolute error (MAE). MSE amplifies the impact of larger errors and therefore places higher demands on model accuracy; MAE is relatively insensitive to outliers and focuses more on the average level of the overall error. The gradient is calculated from the loss function, and an optimization algorithm (such as stochastic gradient descent (SGD), Adagrad, Adadelta, or Adam) is then used to update the model parameters, gradually bringing the model closer to the optimal solution. The Adam optimization algorithm combines the advantages of momentum and adaptive learning rates and performs well in practical applications: it dynamically adjusts the learning rate based on first and second moment estimates of the gradient during training, accelerating model convergence. An appropriate number of training epochs (e.g., 100-1000 epochs) is set, and the dataset is divided into training, validation, and test sets. After each training epoch, model performance is evaluated on the validation set. To prevent overfitting, training stops when the loss on the validation set no longer decreases or when a preset stopping condition is reached. After multiple training iterations, a refined oil demand analysis model applicable to different regional categories is obtained. This model can accurately analyze the changes in refined oil demand under different regional categories based on the input target feature data, providing a scientific basis for refined oil reserve management.
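The two loss functions and the validation-based stopping rule can be sketched in Python as follows. The `patience` and `min_delta` parameters are assumptions introduced for illustration (the text only says that training stops when the validation loss no longer decreases), and the loss values are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error: penalizes large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: less sensitive to outliers."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index at which training stops.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs; otherwise it runs to
    the last epoch supplied.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical validation losses recorded after each epoch.
stop_at = early_stopping_epoch([5.0, 4.0, 3.0, 3.1, 3.2, 3.3, 2.0], patience=3)
```

In this example training stops before the final, lower loss is ever seen, which illustrates the trade-off in choosing the patience value.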

[0128] In specific implementation, step 104 establishes the second training dataset from the acquired refined oil demand data under different regional categories and the data of the multiple target features corresponding to those categories, and trains the preset neural network model with the second training dataset, based on the neural network algorithm, to obtain the refined oil demand analysis model. Step 105 then follows: based on the refined oil demand analysis model, the refined oil demand data of different merchants under different regional categories are calculated.

[0129] The following specific example illustrates how the refined oil demand of different merchants under different regional categories is calculated:

[0130] 1. Application of cross-provincial address hierarchy analysis

[0131] At the cross-provincial geographical level, a refined oil demand analysis model is used to analyze the refined oil demand of different provincial clusters. First, the target characteristic data of each province (such as macro-level characteristics like regional development level and transportation infrastructure construction) is organized and preprocessed to meet the model's input requirements. Then, the processed target characteristic data is input into the trained refined oil demand analysis model. The model analyzes and calculates based on the complex relationships between these target characteristics and refined oil demand learned from previous studies of different provincial clusters. For example, for a rapidly developing provincial cluster with an expanding transportation network, the model comprehensively considers the impact of factors such as industrial production growth and increased transportation demand on refined oil demand, analyzing the future trend of refined oil demand changes in this provincial cluster. The analysis results can provide decision-making basis for energy management departments in regional energy allocation and planning. Based on the analysis results, energy management departments can arrange cross-regional oil allocation in advance to ensure that the refined oil demand of each provincial cluster remains at a reasonable level under different development stages and energy demand conditions, ensuring the stability and security of energy supply and avoiding energy shortages due to insufficient reserves or resource waste due to excessive reserves.

[0132] 2. Application of provincial address hierarchy analysis

[0133] At the provincial address level, the refined oil demand analysis model is used to analyze the refined oil demand of the different cities within each province. Relevant target feature data for each city are collected, including information on industrial restructuring, regional energy consumption habits, and the city's status as a provincial transportation hub, and are then preprocessed. The processed city feature data are input into the model, which analyzes changes in refined oil demand in each city based on the relationships learned during training between the features of each city within the province and refined oil demand. For example, for cities primarily engaged in manufacturing, the model considers the impact of seasonal fluctuations in industrial production and changes in logistics and transportation demand on refined oil demand; for tourist cities, the model analyzes refined oil demand by considering factors such as changes in tourist flow during peak and off-peak seasons and local traffic conditions. Provincial petroleum companies can optimize their provincial oil distribution plans and allocate oil resources rationally based on these analysis results. Before the peak tourist season, the refined oil supply allocated to popular tourist cities is increased in advance to ensure sufficient oil supply at gas stations; during the off-season for industrial production, the allocation to industrial cities is adjusted accordingly to reduce inventory costs, improve operational efficiency, and ensure a stable supply of refined oil to all cities in the province.

[0134] 3. Application of address hierarchy analysis in cities within the province

[0135] At the city-level location within a province, urban energy management departments and petroleum companies utilize refined oil demand analysis models to analyze the demand for refined oil in different areas or business districts within the city. Data on target characteristics such as urban planning layout, public transportation development level, and regional commercial activity intensity are collected and preprocessed. This regional characteristic data is then input into the model, which analyzes the differences between different areas within the city and the correlation between regional characteristics and refined oil demand, analyzing the changes in refined oil demand in each area. For example, in commercial areas, the model considers the impact of peak commercial activity periods and refined oil-related activities during holidays on refined oil demand, analyzing the demand for refined oil at gas stations in that area; in industrial areas, the model combines enterprise production plans and logistics patterns for analysis. Urban energy management departments can optimize the layout of refined oil storage facilities within the city based on the analysis results, rationally plan oil distribution routes, and improve the efficiency of the urban energy supply system. Petroleum companies can adjust the inventory management strategies of gas stations based on the analyzed demand in different areas, ensuring that market demand is met while reducing inventory costs and mitigating operational risks caused by insufficient or excessive reserves.

[0136] 4. Application of hierarchical analysis of different merchant addresses in the same city

[0137] Within the same city, across different merchant locations, individual gas stations utilize a refined oil demand analysis model to analyze their own refined oil demand changes. Data on target characteristics such as the merchant's geographical advantages, service facilities, and customer loyalty are collected, processed, and input into the model. Based on the merchant's unique characteristics and the relationship between individual merchant characteristics and refined oil demand learned during training, the model analyzes the merchant's future demand. For example, for merchants located near transportation hubs, the model considers high traffic volume and customer mobility to analyze changes in refined oil demand at different times. For merchants with unique service facilities (such as car washes and repairs), the model analyzes the impact of these services on customer refueling behavior, thereby analyzing their refined oil demand. Merchants can optimize their inventory management and rationally plan replenishment based on the analysis results. Before peak customer traffic, they can increase demand in advance to avoid stockouts affecting customer satisfaction; during off-peak periods, they can appropriately reduce demand to reduce inventory backlog, improve capital turnover, and enhance their profitability and service level in market competition. At the same time, merchants can also adjust their business strategies based on the analysis results, such as combining demand and customer loyalty to develop targeted refined oil-related activities to attract more customers and further increase market share.

[0138] The following is a specific embodiment to illustrate the application of the method of the present invention. This embodiment aims to comprehensively and deeply explore the many factors affecting refined oil consumption through diversified data sources. We carefully collected and integrated various types of data to construct a comprehensive and rich dataset, providing solid support for analyzing consumption trends, evaluating market effects, and formulating effective strategies. First, demand data was extracted from the daily oil consumption records of gas stations. This data, as the basis for analysis, directly reflects the actual situation of demand-related activities. Weather information, including temperature and weather conditions, matching the timestamps of the demand data, was obtained to reveal the potential impact of weather conditions on refined oil consumption.

[0139] This dataset provides strong support for in-depth analysis of various factors influencing refined oil consumption and lays a solid foundation for subsequent data modeling, strategy formulation, and market decision-making. Data preprocessing is an important step in data analysis, data mining, and machine learning processes, aiming to improve data quality and make it more suitable for subsequent analysis or modeling work.

[0140] The collected data undergoes data cleaning to remove outliers, errors, and duplicates, and missing values are handled. Furthermore, the data is standardized to convert it to a uniform scale or standard distribution for comparison and analysis.
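
The cleaning and standardization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: the record layout `(station, date, demand)`, the z-score outlier rule with the hypothetical threshold `z_max`, and mean imputation of missing values are all choices made here for brevity.

```python
from statistics import mean, pstdev

def clean_and_standardize(records, z_max=3.0):
    """records: list of (station, date, demand) tuples; demand may be None."""
    # Step 1: drop exact duplicate records, keeping the first occurrence.
    unique = list(dict.fromkeys(records))
    # Step 2: remove outliers, judged by z-score over the observed values.
    observed = [d for _, _, d in unique if d is not None]
    mu, sigma = mean(observed), pstdev(observed)

    def is_outlier(d):
        return d is not None and sigma > 0 and abs(d - mu) / sigma > z_max

    kept = [r for r in unique if not is_outlier(r[2])]
    # Step 3: fill missing demand with the mean of the remaining values
    # (an assumed imputation policy).
    fill = mean(d for _, _, d in kept if d is not None)
    filled = [(s, t, fill if d is None else d) for s, t, d in kept]
    # Step 4: standardize to zero mean and unit variance.
    values = [d for _, _, d in filled]
    mu2, sigma2 = mean(values), pstdev(values)
    return [(s, t, (d - mu2) / sigma2 if sigma2 > 0 else 0.0)
            for s, t, d in filled]
```

Note that with very few records a single extreme value cannot reach a z-score of 3, so a smaller `z_max` or a rank-based rule may suit short series better.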

[0141] In this embodiment, cluster analysis was performed on the collected information, and the results of seasonal characteristics, trend characteristics, and cross-characteristics of refined oil products at the corresponding category level for this region can be seen in Table 1:

[0142] Table 1

[0143]

[0144] Based on the K-means and XGBoost algorithms, and using internal data from PetroChina together with external data from Nielsen, a clustering algorithm such as K-means groups stores into provincial groups, prefecture-level city groups, same-city store groups, and cross-province store groups. These groupings support scenarios such as consumption management, product grouping, business analysis, and inventory assessment. Each management level assesses its subordinate units; to facilitate management, organizations of the same type are assessed by a unified method, and the units at each level are generally divided into groups.

[0145] K-means is a classic and widely used unsupervised clustering algorithm. It partitions an unlabeled dataset into a preset number of clusters, so that each data point is assigned to exactly one category.

[0146] In this embodiment, XGBoost (Extreme Gradient Boosting) is an ensemble learning method based on decision tree models, utilizing the gradient boosting framework. The core idea of XGBoost is to optimize a differentiable loss function by iteratively adding models (typically decision trees), which may include steps such as:

[0147] 1. Initialize the model: Start with a constant model, that is, the analysis value is a constant, usually the value that minimizes the loss function over the training data (for squared loss, the mean of the target values).

[0148] 2. Iterative Model Addition: In each iteration, a new decision tree model is added to fit the residuals (or gradients) of the previous model. This new decision tree attempts to correct the errors of the previous model.

[0149] 3. Optimize the loss function: XGBoost uses a differentiable loss function to evaluate the model's performance, such as squared loss or logistic loss. In each iteration, it attempts to minimize this loss function.

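[0150] 4. Regularization: To prevent overfitting, XGBoost adds regularization terms to the objective function, such as a penalty on the number of leaf nodes in each tree and on the magnitude of the leaf weights; the depth of the tree is limited separately through a hyperparameter.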

[0151] 5. Output the final model: After multiple iterations, combine all the added decision tree models to obtain the final analysis model.

[0152] XGBoost uses a second-order Taylor expansion to approximate the loss function and a greedy algorithm to construct each decision tree. This process involves steps such as sorting feature values and selecting the optimal split point.
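
The iterative procedure in steps 1 to 5 can be illustrated with a deliberately simplified sketch. It is plain gradient boosting for squared loss with single-split trees (stumps) on one feature, and it omits XGBoost's second-order approximation and regularization terms; the function names `fit_stump`, `boost`, and `predict` are hypothetical and are not part of the XGBoost library.

```python
def fit_stump(xs, residuals):
    """Greedy search for the one-feature split that minimizes squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    return best[1], best[2], best[3]

def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    # Step 1: constant model; the mean minimizes squared loss.
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # Step 2: fit a new tree to the residuals, which are the
        # negative gradient of squared loss (up to a constant factor).
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lval, rval = fit_stump(xs, residuals)
        stumps.append((t, lval, rval))
        # Step 3: move the predictions a shrunken step toward lower loss.
        preds = [p + learning_rate * (lval if x <= t else rval)
                 for x, p in zip(xs, preds)]
    # Step 5: the final model is the constant plus all added trees.
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        lval if x <= t else rval for t, lval, rval in stumps)
```

The learning rate shrinks each tree's contribution, so the training error decays gradually over the rounds rather than being removed by the first tree.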

[0153] Establish a demand analysis model based on artificial intelligence deep learning algorithms, considering multiple factors and features, to meet the accuracy requirements of analysis across different time periods (such as day, week, month, year). Employ feature engineering algorithms to extract meaningful features from the original demand data, such as seasonal features (month, quarter, holiday), trend features (time series decomposition), and cross features (interactions between different scenarios and demands), to capture the spatial or local features of the data and achieve accurate analysis of oil demand.
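
A minimal sketch of the three feature families named above is given here for illustration. The specific choices are assumptions rather than the embodiment's definitions: the trend is taken as a trailing moving average over `window` observations, and the cross feature is the product of a holiday flag and the trend. The function name `build_features` and the dictionary keys are hypothetical.

```python
from datetime import date

def build_features(dates, demand, holidays=frozenset(), window=3):
    """Return one feature dict per observation (dates and demand aligned)."""
    rows = []
    for i, day in enumerate(dates):
        # Seasonal features: month, quarter, holiday flag.
        month = day.month
        quarter = (month - 1) // 3 + 1
        is_holiday = int(day in holidays)
        # Trend feature: trailing moving average including the current value.
        start = max(0, i - window + 1)
        trend = sum(demand[start:i + 1]) / (i + 1 - start)
        # Cross feature: interaction between a seasonal flag and the trend.
        rows.append({
            "month": month,
            "quarter": quarter,
            "is_holiday": is_holiday,
            "trend": trend,
            "holiday_x_trend": is_holiday * trend,
        })
    return rows
```

When the features feed a forecasting model, the moving average would normally be restricted to past observations only, to avoid leaking the value being predicted.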

[0154] In the field of analytics, the advantages of deep learning models are mainly reflected in three aspects. First, deep learning models can automatically learn useful feature representations from raw data, eliminating the need for manual feature extraction and selection. This not only significantly reduces the workload of manual intervention but also improves the accuracy and robustness of the analytical model. Second, through the stacking of multiple hidden layers, deep learning models can learn multi-level abstract representations of data. This representation method can better capture complex patterns and long-term dependencies in the data, thereby improving the accuracy of the analysis. Third, deep learning models can handle various types of data, including time-series data, images, and videos. This makes the application of deep learning in the field of analytics more extensive, meeting the needs of different industries and scenarios.

[0155] The application of deep learning models in the field of analytics mainly focuses on the following aspects.

[0156] Convolutional Neural Networks (CNNs) are deep learning models specifically designed for processing image data. They extract local features from images through convolution operations and reduce data dimensionality and complexity through pooling operations. In analytics, CNNs are widely used in tasks such as image recognition and video analysis. In traffic flow analysis, CNNs can analyze information such as the number and speed of vehicles in traffic monitoring videos to predict traffic flow patterns over a future period.

[0157] Recurrent Neural Networks (RNNs) are deep learning models specifically designed for processing sequential data. By introducing recurrent connections, they enable the model to capture temporal dependencies within sequential data. In analytics, RNNs are widely used in tasks such as time series analysis and natural language processing.

[0158] Long Short-Term Memory (LSTM) networks are an improved version of Recurrent Neural Networks (RNNs). LSTMs address the vanishing and exploding gradient problems inherent in RNNs when processing long sequences by introducing a gating mechanism. LSTMs have wide applications in analytics, particularly excelling in tasks requiring the capture of long-term dependencies. In climate analysis, LSTMs can analyze historical climate data to predict future climate change trends.
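
The gating mechanism can be made concrete with one time step of a single-unit LSTM cell. This is the generic textbook formulation with scalar weights, shown only to illustrate how the forget, input, and output gates control the cell state; the `params` layout is an assumption, and nothing here is taken from the embodiment's network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM.

    params maps each gate name to a (w_x, w_h, bias) triple.
    """
    def gate(name, activation):
        w_x, w_h, bias = params[name]
        return activation(w_x * x + w_h * h_prev + bias)

    f = gate("forget", sigmoid)        # how much old cell state to keep
    i = gate("input", sigmoid)         # how much new information to admit
    o = gate("output", sigmoid)        # how much cell state to expose
    g = gate("candidate", math.tanh)   # proposed new information
    c = f * c_prev + i * g             # additive cell-state update
    h = o * math.tanh(c)
    return h, c
```

Because the cell state is updated additively and scaled by the forget gate, a gate value close to 1 lets information persist across many steps, which is what allows long-term dependencies to be retained.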

[0159] The first step is data preprocessing, including data cleaning: removing duplicate, missing, or outlier data to ensure the effectiveness of model training.

[0160] Data normalization or standardization: Processing raw data to make it have the same scale, which facilitates the learning of deep learning models.

[0161] Further feature selection: In addition to the seasonal, trend and cross features mentioned above, other factors that may affect oil demand can also be considered, such as regional development indicators (GDP growth rate, inflation rate, etc.).

[0162] Features that contribute significantly to the analysis objective are selected through correlation analysis or feature importance assessment.

[0163] In addition to directly extracted features, new features can be generated through feature combination, feature transformation (such as logarithmic transformation, Box-Cox transformation, etc.) or feature encoding (such as one-hot encoding, word embedding, etc.) to capture potential patterns in the data.
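
Two of the transformations mentioned above, a logarithmic transform and one-hot encoding, can be sketched in a few lines. The helper names `log_transform` and `one_hot` are hypothetical, and `log1p` is used as an assumed variant of the logarithm so that zero demand remains defined.

```python
import math

def log_transform(values):
    """Compress large values; log1p(v) = log(1 + v) keeps zero at zero."""
    return [math.log1p(v) for v in values]

def one_hot(categories):
    """Encode categorical values as 0/1 indicator rows.

    Returns the sorted vocabulary and one indicator row per input value.
    """
    vocab = sorted(set(categories))
    index = {c: j for j, c in enumerate(vocab)}
    rows = [[1 if index[c] == j else 0 for j in range(len(vocab))]
            for c in categories]
    return vocab, rows
```

For example, weather conditions recorded as text can be passed through `one_hot` before being combined with numeric features.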

[0164] Time series analysis techniques (such as Fourier transform and wavelet analysis) are used to further extract features from time series data.
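
As one illustration of Fourier-based feature extraction, the sketch below estimates the dominant period of a demand series from a direct discrete Fourier transform. It assumes an evenly sampled series, and the function name `dominant_period` is hypothetical. The direct sum costs O(n²) and is meant for clarity, not speed.

```python
import cmath
import math

def dominant_period(series):
    """Return the period (in samples) with the most spectral power.

    Returns None for a constant series, which has no periodic component.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]  # remove the zero-frequency term
    best_k, best_power = None, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return None if best_k is None else n / best_k
```

A daily series with a weekly cycle would be expected to return a value near 7, which could then be supplied to the model as a seasonal feature.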

[0165] Then, deep learning models are built: Suitable deep learning architectures are selected, such as recurrent neural networks (RNNs), long short-term memory networks (LSTMs), and gated recurrent units (GRUs), to capture long-term dependencies in time-series data. If spatial features exist in the data (for example, relationships between different geographical locations), convolutional neural networks (CNNs) or graph neural networks (GNNs) can be considered to capture these spatial features.

[0166] Build multi-task learning or ensemble learning models, combining the advantages of multiple deep learning models to improve analysis accuracy and robustness.

[0167] Use appropriate loss functions and optimization algorithms to train the model, such as mean squared error (MSE) or mean absolute error (MAE) as loss functions, and use gradient descent or its variants (such as Adam, RMSprop, etc.) for optimization. Introduce regularization techniques (such as L1 / L2 regularization, Dropout, etc.) to prevent the model from overfitting.
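
The combination of an MSE loss, gradient descent, and L2 regularization can be illustrated on the smallest possible model, a one-feature linear predictor. This is a sketch under stated assumptions: the linear model stands in for the neural network, plain batch gradient descent stands in for Adam or RMSprop, and the function name `train_linear` is hypothetical.

```python
def train_linear(xs, ys, lr=0.05, l2=0.0, epochs=2000):
    """Fit y ~ w * x + b by batch gradient descent.

    Minimizes mean squared error plus l2 * w**2 (bias is not penalized).
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the regularized MSE with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs)) + 2.0 * l2 * w
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Raising `l2` pulls the weight toward zero, trading a slightly worse fit on the training data for reduced sensitivity to noise.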

[0168] Techniques such as cross-validation and grid search are used to adjust model hyperparameters and find the optimal model configuration.
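
Cross-validation combined with a grid search can be sketched as follows. The example is illustrative only: the model is an assumed one-feature ridge regression without intercept, the hyperparameter grid is over its penalty `alpha`, and all function names are hypothetical. The folds are contiguous blocks; for time-ordered demand data, a forward-chaining split that only trains on earlier periods would typically be preferred.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_ridge(xs, ys, alpha):
    """Closed-form slope of y ~ w * x with penalty alpha * w**2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def cv_mse(xs, ys, alpha, k=3):
    """Mean squared error on held-out folds for one hyperparameter value."""
    total, count = 0.0, 0
    for fold in k_fold_indices(len(xs), k):
        held = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        w = fit_ridge(train_x, train_y, alpha)
        total += sum((w * xs[i] - ys[i]) ** 2 for i in fold)
        count += len(fold)
    return total / count

def grid_search(xs, ys, alphas, k=3):
    """Return the alpha with the lowest cross-validated error, and all scores."""
    scores = {a: cv_mse(xs, ys, a, k) for a in alphas}
    return min(scores, key=scores.get), scores
```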

[0169] Finally, model evaluation and deployment are carried out: the model is evaluated using an independent test set, and metrics such as precision, recall, and F1 score are calculated and analyzed to ensure that the model performance meets the expected requirements.
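
Precision, recall, and F1 score are defined for classification outputs. Where the model outputs a continuous demand quantity, error measures such as mean absolute error (MAE) and root mean squared error (RMSE) are commonly computed on the test set as well. The sketch below shows that computation; the function name `regression_metrics` is hypothetical, and its use here is an assumption rather than part of the embodiment.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE and RMSE for aligned lists of actual and predicted demand."""
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}
```

RMSE penalizes large misses more heavily than MAE, so comparing the two gives a rough indication of whether the error is dominated by a few large deviations.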

[0170] The model is regularly updated and retrained to adapt to changes in data distribution and new demand scenarios. It is then deployed to the production environment to enable real-time oil demand analysis and decision support.

[0171] Furthermore, visualization techniques (such as heatmaps and time series plots) are used to display analysis results and model performance, helping users better understand the model. Model interpretability techniques (such as SHAP values and LIME) are introduced to explain the model analysis results, enhancing user trust in the model. Specifically:

[0172] In this specific embodiment, the K-means and XGBoost algorithms are used, combined with internal data from PetroChina and external data from Nielsen, to achieve grouping at different levels through cluster analysis. This is then applied to scenarios such as commodity groups, business analysis, and inventory assessment, providing targeted assessment and evaluation criteria for each management level.

[0173] The following is a detailed description of this specific embodiment:

[0174] The categorization of data information under different regional categories may include:

[0175] Provincial Address Grouping: Provinces are divided into groups using the K-means clustering algorithm. Provinces within each group are similar in key indicators such as consumption, demand, and number of customers, which facilitates unified assessment and strategy formulation by provincial management.

[0176] Intra-provincial city grouping: Using the same K-means algorithm, the cities within a province are divided into groups. Cities within each group are similar in terms of population, refined oil consumption data, etc., which helps city-level management to carry out more targeted management and optimization.

[0177] Local store grouping: The K-means algorithm is used to cluster local stores and divide them into groups. Stores in each group are similar in terms of location, fuel consumption data, and customer traffic, which facilitates unified management and operational optimization by store management.

[0178] Cross-provincial store grouping: Through cluster analysis, cross-provincial stores are grouped. Stores in each group have similarities in cross-provincial transportation, customer traffic, etc., which helps cross-regional management to conduct unified assessments and formulate strategies.

[0179] Based on the grouping results, management levels can develop more targeted demand management plans, adjusting and optimizing refined oil demand control strategies according to the characteristics and needs of different groups. The product mix for different groups can also be adjusted and optimized to meet the needs of different regions and stores, thereby improving customer satisfaction.

[0180] Cluster analysis allows management levels to gain a clearer understanding of the operational status of different groups, including key indicators such as profit and customer traffic, providing strong support for business decisions. Inventory assessment: Based on the grouping results, inventory levels in different groups can be assessed and optimized to ensure that inventory levels match consumption demand, reduce inventory costs, and improve operational efficiency. Adopting a unified performance evaluation approach for similar organizations ensures fairness and consistency, stimulating the enthusiasm and creativity of management levels. Through cluster analysis, management levels can gain a clearer understanding of their own performance objectives and requirements, enabling more targeted work planning and execution.

[0181] Based on the grouping results and assessment criteria, each management level can more effectively improve and optimize performance, thereby enhancing overall operational efficiency. Using the K-means and XGBoost algorithms, combined with internal PetroChina data and external Nielsen data, for cluster analysis and evaluation can provide targeted assessment criteria for each management level. This can be applied to scenarios such as commodity groups, business analysis, and inventory assessment, thereby optimizing business processes and improving overall operational efficiency.

[0182] Of course, it is understood that there may be other variations of the above detailed process, and all such variations should fall within the protection scope of this invention.

[0183] In this embodiment of the invention, seasonal information and historical demand information of refined oil products are collected. Based on a clustering algorithm, the collected information is clustered to obtain data information at different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical scope. For the data information at each regional category level, feature engineering processing is performed to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products corresponding to that regional category level; the cross characteristics are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products. An ensemble learning algorithm based on a decision tree model is used to determine the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level. Multiple features with importance values exceeding a preset value are selected as target features for that regional category level. A second training dataset is established based on the obtained refined oil demand data at different regional category levels and the data of multiple target features corresponding to different regional category levels. A pre-set neural network model is trained with the second training dataset using a neural network algorithm to obtain a refined oil demand analysis model. Based on the refined oil demand analysis model, refined oil demand data for different merchants at different regional category levels is calculated.
This invention employs clustering algorithms to perform multi-level clustering analysis on data (cross-provincial, provincial, intra-provincial city, and different merchants within the same city), enabling more accurate capture of differences between regions and avoiding the one-size-fits-all problem of traditional methods. Feature engineering is performed on the data information at each level to extract various features, including seasonal features, trend features, and cross features, allowing the model to comprehensively consider various influencing factors. An ensemble learning algorithm based on a decision tree model is used to determine the importance values of features and select target features, ensuring that the dataset used to train the neural network model contains the most influential variables. A neural network algorithm is used to train the refined oil demand analysis model; this deep learning model can automatically learn useful feature representations from raw data and capture complex patterns and long-term dependencies, thereby improving the accuracy of refined oil demand calculation. Furthermore, by using clustering algorithms to obtain data information at different regional category levels, management at all levels can provide personalized services based on the characteristics of specific regions. Through accurate analysis of future demand using the refined oil demand analysis model, enterprises can allocate resources more rationally, avoiding inventory backlogs or shortages. This invention addresses the problems of insufficient analytical precision, lack of targeted management, unreasonable resource allocation, and slow response speed in existing technologies, improving the accuracy and efficiency of refined oil demand calculation.

[0184] As described above, the embodiments of the present invention have the following advantages:

[0185] 1. Improve the accuracy of market analysis

[0186] By conducting in-depth research on various characteristics affecting refined oil demand, including the development of new energy vehicles, regulatory and environmental requirements, and supply chain stability, a more accurate and comprehensive analytical model can be established. Such a model can better capture market dynamics, analyze future demand trends, and provide a scientific basis for enterprises to formulate refined oil demand control strategies and adjust production plans.

[0187] 2. Optimize the allocation of refined oil resources

[0188] Understanding the factors influencing refined oil demand helps companies allocate resources rationally based on market demand and changes. For example, when market demand is high, production input can be increased to boost capacity; when market demand is low, production can be appropriately reduced to avoid inventory buildup. By optimizing resource allocation, companies can reduce production costs, improve operational efficiency, and enhance market competitiveness.

[0189] 3. Responding to market challenges

[0190] With the rapid development of new energy vehicles and the continuous improvement of regulations, the traditional refined oil market is facing unprecedented challenges. Studying the characteristics affecting refined oil demand can help companies identify market changes in a timely manner and formulate response strategies. For example, in response to the widespread adoption of new energy vehicles, companies can develop more environmentally friendly and efficient refined oil products, or explore business areas related to new energy vehicles to address market challenges.

[0191] 4. Promote the sustainable development of the industry

[0192] Studying the characteristics of refined oil demand is also an important way to promote the sustainable development of the industry. By gaining a deeper understanding of market demand and regulatory factors, companies can better clarify their own positioning and development direction, actively adjust their product structure, enhance their technological innovation capabilities, and drive the industry towards a greener, lower-carbon, and more efficient direction. At the same time, companies can also enhance their competitiveness by strengthening cooperation and exchanges with international markets, introducing advanced technologies and management experience, and contributing to the sustainable development of the industry.

[0193] The purpose of studying the characteristics affecting refined oil demand is to improve the accuracy of market analysis, optimize resource allocation, address market challenges, promote sustainable industry development, and support relevant decision-making. Achieving these objectives will help enterprises better adapt to market changes, enhance competitiveness, and promote the healthy development of the entire industry.

[0194] This invention also provides an intelligent calculation device for refined oil demand, as described in the following embodiments. Since the principle by which this device solves the problem is similar to the intelligent calculation method for refined oil demand, the implementation of this device can refer to the implementation of the intelligent calculation method for refined oil demand; repeated details will not be elaborated further.

[0195] Figure 5 is a schematic diagram of the structure of an intelligent calculation device for refined oil demand in an embodiment of the present invention. The embodiment also provides an intelligent calculation device for refined oil demand to improve the accuracy and efficiency of refined oil demand calculation and to achieve targeted analysis of refined oil demand at different regional category levels. As shown in Figure 5, the device includes:

[0196] The information acquisition and cluster analysis module 501 is used to acquire the seasonal information and historical demand information of the collected refined oil products; based on the clustering algorithm, it performs cluster analysis on the collected information to obtain data information under different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants in the same city, arranged from largest to smallest geographical range.

[0197] The feature engineering processing module 502 is used to perform feature engineering processing on the data information under each regional category level to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level; the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products.

[0198] The target feature determination module 503 is used to determine the feature importance value of each feature in the seasonal features, trend features and cross features of refined oil products at the category level of the corresponding region based on the decision tree model ensemble learning algorithm; and to take multiple features whose feature importance values exceed the preset values as target features at the category level of the corresponding region.

[0199] The refined oil demand analysis modeling module 504 is used to establish a second training dataset based on the refined oil demand data under different regional categories and the data of multiple target features corresponding to different regional categories; and to train the pre-set neural network model using the second training dataset based on the neural network algorithm to obtain the refined oil demand analysis model.

[0200] The refined oil demand analysis module 505 is used to calculate the refined oil demand data of different merchants at different regional category levels based on the refined oil demand analysis model.

[0201] Figure 6 is a specific example diagram of an intelligent calculation device for refined oil demand according to an embodiment of the present invention. In one embodiment, as shown in Figure 6, the device also includes:

[0202] The data processing module 601 is used to perform data cleaning and standardization on the collected seasonal and historical demand information of refined oil products to obtain standardized data.

[0203] The information acquisition and cluster analysis module is specifically used for: performing cluster analysis on standardized data based on the K-means clustering algorithm.

[0204] In one embodiment, clustering analysis is performed on the collected information based on a clustering algorithm to obtain data information at different regional category levels, including:

[0205] Based on prior knowledge of the collected information, determine the number of clusters of the collected information;

[0206] Randomly select multiple samples from the collected information as initial centroids; calculate the distance from each sample to each centroid in the collected information, and assign the sample to the cluster to which the nearest centroid belongs; recalculate the new centroid for each cluster; repeat the above steps of assignment and recalculation of new centroids until the centroids no longer move significantly or the preset maximum number of iterations is reached;

[0207] The centroids obtained when they no longer move significantly or reach the preset maximum number of iterations are used as the clustering analysis results of the collected information; based on the clustering analysis results, data information at different regional category levels is obtained.
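
The clustering steps above (choose the number of clusters, pick random initial centroids, assign, recompute, repeat until the centroids stop moving or an iteration cap is reached) can be sketched in plain Python. The function name `kmeans` and the parameters `tol` and `seed` are assumptions introduced for this illustration; the embodiment does not specify a tolerance or a random seed.

```python
import math
import random

def kmeans(samples, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster a list of equal-length numeric tuples into k groups.

    Returns (centroids, labels); labels[i] is the cluster of samples[i].
    """
    rng = random.Random(seed)
    # Randomly select k samples as the initial centroids.
    centroids = [list(p) for p in rng.sample(samples, k)]
    labels = [0] * len(samples)
    for _ in range(max_iter):
        # Assignment: each sample joins the cluster of its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in samples]
        # Update: recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(samples, labels) if lab == j]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        # Stop once no centroid has moved by more than the tolerance.
        if shift < tol:
            break
    return centroids, labels
```

Because the result depends on the initial centroids, the procedure is commonly run with several seeds and the grouping with the smallest within-cluster distance is kept.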

[0208] In one embodiment, based on a decision tree model ensemble learning algorithm, the importance value of each feature among the seasonal characteristics, trend characteristics, and cross-characteristics of refined oil products at the category level corresponding to the region is determined, including:

[0209] Based on the seasonal characteristics, trend characteristics, and cross-characteristics of refined oil products at the corresponding regional category levels, a first training dataset is established.

[0210] Based on the decision tree model ensemble learning algorithm, the pre-set decision tree model is trained with the first training dataset to obtain the model for determining the importance of refined oil features at the category level in the corresponding region.

[0211] From the refined oil feature importance determination model, the following are obtained for each feature: the number of times the feature is selected as a splitting feature during the decision tree splitting process, the information gain it brings at the splitting nodes, and the number of samples it covers in the trees.

[0212] The importance value of each refined oil feature is determined based on that feature's selection frequency, information gain, and number of covered samples.
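
One way to turn the three statistics into a single importance value and then select target features is sketched below. The combination rule is an assumption made for illustration, since the text does not state how the statistics are weighted: each statistic is normalized to a share across features and the three shares are averaged. The function names and the `threshold` parameter are hypothetical.

```python
def importance_scores(stats):
    """stats maps feature name -> {"frequency": .., "gain": .., "cover": ..}.

    Returns feature name -> importance in [0, 1], averaging each feature's
    share of total split frequency, information gain, and sample cover.
    """
    keys = ("frequency", "gain", "cover")
    totals = {k: sum(s[k] for s in stats.values()) for k in keys}
    scores = {}
    for name, s in stats.items():
        shares = [s[k] / totals[k] for k in keys if totals[k] > 0]
        scores[name] = sum(shares) / len(shares) if shares else 0.0
    return scores

def select_target_features(scores, threshold):
    """Keep features whose importance exceeds the preset value, highest first."""
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [name for name, value in ranked if value > threshold]
```

The selected names would then determine which feature columns enter the second training dataset for the neural network model.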

[0213] In one embodiment, an ensemble learning algorithm based on a decision tree model is used to train a pre-set decision tree model on a first training dataset to obtain a model for determining the importance of refined oil features at the category level corresponding to the region, including:

[0214] Choose XGBoost, Random Forest, or Gradient Boosting Decision Tree as the decision tree model;

[0215] Initialize the model hyperparameters of the decision tree model; the model hyperparameters include: learning rate, maximum tree depth, subsample ratio, column sampling ratio, and regularization parameter;

[0216] Based on the hyperparameters of the model, and using the decision tree model ensemble learning algorithm, the decision tree model is trained on the first training dataset to obtain a model for determining the importance of refined oil features at the category level for the corresponding region.

[0217] This invention provides an embodiment of a computer device for implementing all or part of the above-described intelligent calculation method for refined oil demand. The computer device specifically includes the following components:

[0218] The computer device comprises a processor, memory, a communications interface, and a bus; wherein the processor, memory, and communications interface communicate with each other via the bus; the communications interface is used to realize information transmission between related devices; the computer device can be a desktop computer, tablet computer, or mobile terminal, etc., and this embodiment is not limited to these. In this embodiment, the computer device can be implemented with reference to the embodiments for implementing the intelligent calculation method for refined oil demand and the embodiments for implementing the intelligent calculation device for refined oil demand, the contents of which are incorporated herein by reference, and repeated details will not be described again.

[0219] Figure 7 is a schematic block diagram illustrating the system configuration of the computer device 1000 according to an embodiment of this application. As shown in Figure 7, the computer device 1000 may include a central processing unit 1001 and a memory 1002; the memory 1002 is coupled to the central processing unit 1001. It is worth noting that Figure 7 is an example; other types of structures can also be used to supplement or replace this structure to achieve telecommunications functions or other functions.

[0220] In one embodiment, the intelligent calculation function for refined oil demand can be integrated into the central processing unit 1001. The central processing unit 1001 can be configured to perform the following control:

[0221] The system collects seasonal and historical demand information for refined oil products; it then performs cluster analysis on the collected information based on a clustering algorithm to obtain data information at different regional category levels. These regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical scope.

[0222] For the data information under each regional category level, feature engineering is performed to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under that regional category level; the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products.
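As a rough illustration of this feature engineering step, the sketch below derives three features from a demand series. The concrete definitions are assumptions made for the example and are not specified in this application: the seasonal feature is a seasonal index (average demand at a position in the cycle divided by the overall average), the trend feature is a trailing moving average, and the cross feature is their product. The function names and sample series are hypothetical.

```python
from statistics import mean


def seasonal_index(demand, period=12):
    """Average demand at each position in the cycle, relative to the overall mean."""
    overall = mean(demand)
    return [mean(demand[pos::period]) / overall for pos in range(period)]


def build_features(demand, period=12, window=3):
    """Return one feature row per time step, starting once a full window exists."""
    index = seasonal_index(demand, period)
    rows = []
    for t in range(window, len(demand)):
        seasonal = index[t % period]
        trend = mean(demand[t - window:t])  # trailing average over past values only
        rows.append({
            "seasonal": seasonal,
            "trend": trend,
            "cross": seasonal * trend,      # interaction term (assumed form)
        })
    return rows


# Hypothetical demand series with a cycle of four periods.
demand = [10, 20, 30, 20, 12, 22, 32, 22]
rows = build_features(demand, period=4, window=3)
```

The trailing average deliberately excludes the current period so that the trend feature does not contain the value it would later help to predict.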

[0223] Based on an ensemble learning algorithm of decision tree models, the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level is determined; multiple features whose importance values exceed a preset value are taken as the target features under the corresponding regional category level.
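The selection of target features can be sketched as follows. Claim 4 states that the importance value depends on how often a feature is used for splitting, the information gain it brings, and the number of samples it covers, but it does not give a combining formula. The equal-weight average of normalized shares below is therefore an assumption, and the per-feature statistics, the threshold of 0.15, and the function names are hypothetical.

```python
def importance_values(stats):
    """Combine split count, information gain and sample cover into one score.

    Each statistic is normalised to sum to 1 across features, then the three
    shares are averaged. The equal weighting is an assumption of this sketch.
    """
    keys = ("splits", "gain", "cover")
    totals = {k: sum(s[k] for s in stats.values()) for k in keys}
    return {
        name: sum(s[k] / totals[k] for k in keys) / len(keys)
        for name, s in stats.items()
    }


def select_target_features(importances, threshold):
    """Keep features whose importance exceeds the preset value, highest first."""
    return sorted(
        (name for name, value in importances.items() if value > threshold),
        key=lambda name: -importances[name],
    )


# Hypothetical statistics, as a trained tree ensemble might report them.
stats = {
    "seasonal": {"splits": 40, "gain": 120.0, "cover": 5000},
    "trend":    {"splits": 30, "gain": 60.0,  "cover": 3000},
    "cross":    {"splits": 25, "gain": 15.0,  "cover": 1500},
    "noise":    {"splits": 5,  "gain": 5.0,   "cover": 500},
}
importances = importance_values(stats)
targets = select_target_features(importances, threshold=0.15)
```

With these made-up numbers the "noise" feature falls below the preset value and is dropped, while the other three features are kept as target features.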

[0224] Based on the refined oil demand data under different regional category levels and the data of the multiple target features corresponding to those levels, a second training dataset is established; based on a neural network algorithm, a pre-set neural network model is trained with the second training dataset to obtain the refined oil demand analysis model.
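To make the training step concrete, here is a minimal sketch of fitting a small neural network to rows of target features and demand values. The architecture (one hidden layer with tanh activation), the per-sample gradient descent, the hyperparameters, and the synthetic data are all assumptions chosen to keep the example self-contained. This application does not specify the network structure.

```python
import math
import random


def train_mlp(X, y, hidden=4, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer regression network with per-sample gradient descent."""
    rng = random.Random(seed)
    n_in = len(X[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(W1, b1)]
        return h, sum(w * v for w, v in zip(W2, h)) + b2

    def mse():
        return sum((forward(x)[1] - t) ** 2 for x, t in zip(X, y)) / len(X)

    history = [mse()]  # mean squared error before training and after each epoch
    for _ in range(epochs):
        for x, t in zip(X, y):
            h, out = forward(x)
            err = out - t  # gradient of 0.5 * (out - t)**2 with respect to out
            for j in range(hidden):
                grad_h = err * W2[j] * (1 - h[j] ** 2)  # backpropagate through tanh
                W2[j] -= lr * err * h[j]
                for i in range(n_in):
                    W1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(mse())

    def predict(x):
        return forward(x)[1]

    return predict, history


# Hypothetical scaled samples: (seasonal, trend, cross) -> demand
X = [[s, t, s * t] for s in (0.2, 0.5, 0.8, 1.0) for t in (0.1, 0.4, 0.7, 1.0)]
y = [0.5 * s + 0.3 * t + 0.2 * c for s, t, c in X]
predict, history = train_mlp(X, y)
```

The returned `history` list makes it possible to check that the training error falls over the epochs. A production model would normally be built with a dedicated deep learning library and evaluated on held-out data.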

[0225] Based on the refined oil demand analysis model, the refined oil demand data of different merchants under different regional category levels is calculated.

[0226] In another embodiment, the intelligent calculation device for refined oil demand can be configured separately from the central processing unit 1001. For example, the intelligent calculation device for refined oil demand can be configured as a chip connected to the central processing unit 1001, and the intelligent calculation function for refined oil demand can be realized through the control of the central processing unit.

[0227] As shown in Figure 7, the computer device 1000 may further include: a communication module 1003, an input unit 1004, an audio processor 1005, a display 1006, and a power supply 1007. It is worth noting that the computer device 1000 does not necessarily need to include all components shown in Figure 7; in addition, the computer device 1000 may also include components not shown in Figure 7, for which reference may be made to existing technologies.

[0228] As shown in Figure 7, the central processing unit 1001, sometimes also referred to as a controller or operation control, may include a microprocessor or other processor device and / or logic device. The central processing unit 1001 receives input and controls the operation of various components of the computer device 1000.

[0229] The memory 1002 may be, for example, one or more of a cache, flash memory, hard drive, removable medium, volatile memory, non-volatile memory, or other suitable device. It may store the aforementioned device-related information, and also store programs for executing that information. The central processing unit 1001 may execute the program stored in the memory 1002 to perform information storage or processing, etc.

[0230] Input unit 1004 provides input to central processing unit 1001. This input unit 1004 may be, for example, a keypad or touch input device. Power supply 1007 provides power to computer device 1000. Display 1006 displays images, text, and other display objects. This display may be, for example, an LCD display, but is not limited to this.

[0231] The memory 1002 can be a solid-state memory, such as a read-only memory (ROM), random access memory (RAM), a SIM card, etc. It can also be a memory that retains information even when power is off, can be selectively erased, and contains more data; examples of this type of memory are sometimes referred to as EPROMs, etc. The memory 1002 can also be some other type of device. The memory 1002 includes a buffer memory 1021 (sometimes referred to as a buffer). The memory 1002 may include an application / function storage unit 1022 for storing application programs and function programs or processes for executing operations of the computer device 1000 via the central processing unit 1001.

[0232] The memory 1002 may also include a data storage unit 1023 for storing data, such as contacts, digital data, pictures, sounds, and / or any other data used by the computer device. The driver storage unit 1024 of the memory 1002 may include various drivers for the computer device for communication functions and / or for performing other functions of the computer device (such as messaging applications, address book applications, etc.).

[0233] The communication module 1003 is a transmitter / receiver that transmits and receives signals via the antenna 1008. The communication module (transmitter / receiver) 1003 is coupled to the central processing unit 1001 to provide input signals and receive output signals, which is the same as in a conventional mobile communication terminal.

[0234] Based on different communication technologies, multiple communication modules 1003 can be configured in the same computer device, such as cellular network modules, Bluetooth modules, and / or wireless LAN modules. The communication module (transmitter / receiver) 1003 is also coupled to a speaker 1009 and a microphone 1010 via an audio processor 1005 to provide audio output via the speaker 1009 and receive audio input from the microphone 1010, thereby realizing typical telecommunications functions. The audio processor 1005 may include any suitable buffer, decoder, amplifier, etc. Furthermore, the audio processor 1005 is also coupled to a central processing unit 1001, enabling on-device recording via the microphone 1010 and on-device playback of stored sound via the speaker 1009.

[0235] This invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the above-described intelligent calculation method for refined oil demand.

[0236] This invention also provides a computer program product, which includes a computer program that, when executed by a processor, implements the above-mentioned intelligent calculation method for refined oil demand.

[0237] In this embodiment of the invention, seasonal information and historical demand information of refined oil products are collected. Based on a clustering algorithm, the collected information is clustered to obtain data information at different regional category levels; the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical scope. For the data information at each regional category level, feature engineering processing is performed to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at that regional category level; the cross characteristics are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products. Based on an ensemble learning algorithm of decision tree models, the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products at the corresponding regional category level is determined, and multiple features with importance values exceeding a preset value are selected as target features for that regional category level. A second training dataset is established based on the refined oil demand data at different regional category levels and the data of the multiple target features corresponding to those levels. A pre-set neural network model is trained with the second training dataset based on a neural network algorithm to obtain a refined oil demand analysis model. Based on the refined oil demand analysis model, refined oil demand data for different merchants at different regional category levels is calculated.
This invention employs clustering algorithms to perform multi-level clustering analysis on data (cross-provincial, provincial, cities within the same province, and different merchants within the same city), enabling more accurate capture of differences between regions and avoiding the one-size-fits-all problem of traditional methods. Feature engineering is performed on the data information at each level to extract seasonal features, trend features, and cross features, allowing the model to comprehensively consider various influencing factors. An ensemble learning algorithm based on decision tree models is used to determine feature importance values and select target features, ensuring that the dataset used to train the neural network model contains the most influential variables. A neural network algorithm is used to train the refined oil demand analysis model; this deep learning model can automatically learn useful feature representations from raw data and capture complex patterns and long-term dependencies, thereby improving the accuracy of refined oil demand calculation. Furthermore, because clustering yields data information at different regional category levels, management at all levels can provide personalized services based on the characteristics of specific regions. Through accurate analysis of future demand using the refined oil demand analysis model, enterprises can allocate resources more rationally, avoiding inventory backlogs or shortages. This invention addresses the problems of insufficient analytical precision, lack of targeted management, unreasonable resource allocation, and slow response speed in existing technologies, improving the accuracy and efficiency of refined oil demand calculation.

[0238] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.

[0239] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each process and / or box of the flowchart illustrations and / or block diagrams, and combinations of processes and / or boxes in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, produce a device for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0240] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing device to function in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means, which implement the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0241] These computer program instructions may also be loaded onto a computer or other programmable data processing equipment to cause a series of operational steps to be performed on the computer or other programmable equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable equipment provide steps for implementing the functions specified in one or more processes of the flowchart and / or one or more boxes of the block diagram.

[0242] The specific embodiments described above further illustrate the purpose, technical solution, and beneficial effects of the present invention. It should be understood that the above descriptions are merely specific embodiments of the present invention and are not intended to limit the scope of protection of the present invention. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

1. A method for intelligently calculating the demand for refined oil products, characterized in that it comprises:
obtaining collected seasonal information and historical demand information of refined oil products;
based on a clustering algorithm, performing cluster analysis on the collected information to obtain data information at different regional category levels, wherein the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical range;
for the data information under each regional category level, performing feature engineering processing on the data information to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level, wherein the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products;
based on an ensemble learning algorithm of decision tree models, determining the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level, and taking multiple features whose importance values exceed a preset value as the target features under the corresponding regional category level;
establishing a second training dataset based on the refined oil demand data under different regional category levels and the data of the multiple target features corresponding to the different regional category levels, and, based on a neural network algorithm, training a pre-set neural network model with the second training dataset to obtain a refined oil demand analysis model;
based on the refined oil demand analysis model, calculating the refined oil demand data of different merchants under different regional category levels.

2. The method as described in claim 1, characterized in that it further comprises: cleaning and standardizing the collected seasonal information and historical demand information of refined oil products to obtain standardized data; wherein performing cluster analysis on the collected information based on a clustering algorithm includes: performing cluster analysis on the standardized data based on the K-means clustering algorithm.

3. The method as described in claim 1, characterized in that performing cluster analysis on the collected information based on a clustering algorithm to obtain data information at different regional category levels includes: determining the number of clusters of the collected information based on prior knowledge of the collected information; randomly selecting multiple samples from the collected information as initial centroids; calculating the distance from each sample in the collected information to each centroid, and assigning the sample to the cluster to which the nearest centroid belongs; recalculating the new centroid of each cluster; repeating the above steps of assignment and recalculation of new centroids until the centroids no longer move significantly or a preset maximum number of iterations is reached; taking the centroids obtained when they no longer move significantly or when the preset maximum number of iterations is reached as the cluster analysis result of the collected information; and obtaining data information at different regional category levels based on the cluster analysis result.

4. The method as described in claim 1, characterized in that determining, based on an ensemble learning algorithm of decision tree models, the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level includes: establishing a first training dataset based on the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level; based on the ensemble learning algorithm of decision tree models, training a pre-set decision tree model with the first training dataset to obtain a refined oil feature importance determination model for the corresponding regional category level; using the refined oil feature importance determination model to obtain the number of times each feature is selected as a splitting feature during decision tree splitting, the information gain it brings at the splitting nodes, and the number of samples it covers in the tree; and determining the importance value of each feature of refined oil products based on the number of times, the information gain, and the number of samples.

5. The method as described in claim 4, characterized in that training, based on the ensemble learning algorithm of decision tree models, a pre-set decision tree model with the first training dataset to obtain a refined oil feature importance determination model for the corresponding regional category level includes: selecting XGBoost, Random Forest, or Gradient Boosting Decision Tree as the decision tree model; initializing the model hyperparameters of the decision tree model, the model hyperparameters including: learning rate, maximum tree depth, subsample ratio, column sampling ratio, and regularization parameter; and, based on the model hyperparameters and using the ensemble learning algorithm of decision tree models, training the decision tree model with the first training dataset to obtain the refined oil feature importance determination model for the corresponding regional category level.

6. An intelligent calculation device for refined oil demand, characterized in that it comprises:
an information acquisition and cluster analysis module, used to obtain collected seasonal information and historical demand information of refined oil products, and, based on a clustering algorithm, to perform cluster analysis on the collected information to obtain data information at different regional category levels, wherein the regional category levels include cross-provincial address levels, provincial address levels, intra-provincial city address levels, and address levels of different merchants within the same city, arranged from largest to smallest geographical range;
a feature engineering processing module, used to perform feature engineering processing on the data information under each regional category level to obtain the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level, wherein the cross characteristics of refined oil products are used to characterize the correlation between the seasonal characteristics and trend characteristics of refined oil products;
a target feature determination module, used to determine, based on an ensemble learning algorithm of decision tree models, the importance value of each feature among the seasonal characteristics, trend characteristics, and cross characteristics of refined oil products under the corresponding regional category level, and to select multiple features whose importance values exceed a preset value as target features under the corresponding regional category level;
a refined oil demand analysis modeling module, used to establish a second training dataset based on the refined oil demand data under different regional category levels and the data of the multiple target features corresponding to the different regional category levels, and, based on a neural network algorithm, to train a pre-set neural network model with the second training dataset to obtain a refined oil demand analysis model;
a refined oil demand analysis module, used to calculate the refined oil demand data of different merchants under different regional category levels based on the refined oil demand analysis model.

7. The device as claimed in claim 6, characterized in that it further comprises: a data processing module, used to clean and standardize the collected seasonal information and historical demand information of refined oil products to obtain standardized data; wherein the information acquisition and cluster analysis module is specifically used for: performing cluster analysis on the standardized data based on the K-means clustering algorithm.

8. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that, when the processor executes the computer program, it implements the method of any one of claims 1 to 5.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program that, when executed by a processor, implements the method of any one of claims 1 to 5.

10. A computer program product, characterized in that the computer program product includes a computer program that, when executed by a processor, implements the method of any one of claims 1 to 5.