A digital twin-based multi-objective decision method for edge computing offloading in the Internet of Vehicles

A digital twin-based multi-objective decision method for edge computing offloading in the Internet of Vehicles (IoV) combines a decomposition-based multi-objective evolutionary algorithm with proximal policy optimization to address the low data efficiency and mediocre stability of existing approaches. The method jointly optimizes latency, energy consumption, and cloud computing cost, thereby improving the computing efficiency and intelligence level of the IoV.

CN116782296B · Active · Publication Date: 2026-06-23 · NANJING UNIV OF SCI & TECH

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents (China)
Current Assignee / Owner: NANJING UNIV OF SCI & TECH
Filing Date: 2023-05-29
Publication Date: 2026-06-23

AI Technical Summary

Technical Problem

When digital twin technology is used in the Internet of Vehicles (IoV), existing techniques suffer from low data efficiency and mediocre stability and fail to fully exploit the real-time advantages of digital twins. Furthermore, they do not treat latency and energy consumption as conflicting objectives to be optimized jointly, which leads to inefficient offloading decisions for IoV edge computing tasks.

Method used

A multi-objective decision-making method for offloading edge computing in the Internet of Vehicles based on digital twins is adopted. This method combines a decomposition-based multi-objective evolutionary algorithm with proximal policy optimization. It fits the value function by mean squared error regression, estimates the advantage from the fitted value function and the discounted return, and constrains policy updates with the truncation (clipping) method. This achieves multi-objective optimization of latency, energy consumption, and cloud computing cost.

Benefits of technology

It achieves a trade-off among multiple objectives such as latency, energy consumption, and cloud computing cost, reduces the offloading cost of long-term tasks, improves the data computing efficiency and intelligence level of the Internet of Vehicles (IoV), alleviates the problems of high latency and high energy consumption, and improves the stability and convergence speed of the algorithm.

✦ Generated by Eureka AI based on patent content.


Abstract

The application discloses a digital twin-based multi-objective decision method for edge computing offloading in the Internet of Vehicles. Specifically: input the digital twin-based edge Internet of Vehicles environment, and initialize the actor-critic (executor-evaluator) network parameters and the parameters of the decomposition-based multi-objective evolutionary algorithm; perform the offloading action, obtain the reward vector and completion flag, and store them in the buffer; fit the value function to the buffer data, and compute the advantage estimate from the current value function and the discounted return; update the solution set and fitness values using the decomposition-based multi-objective evolutionary algorithm and the reward vector, and return the Pareto-optimal solutions as the learning parameters of the actor network; constrain the policy update with the truncation (clipping) method, compute the loss function, and update the policy; when the completion flag is true, reset the edge Internet of Vehicles environment and start the next round. The application is suited to digital twin-assisted intelligent edge offloading decisions in unknown, dynamic edge Internet of Vehicles environments, achieving a long-term trade-off among minimizing delay, energy consumption, and cloud computing cost.

Description

Technical Field

[0001] This invention belongs to the field of wireless communication technology and relates to a digital twin-based multi-objective decision method for edge computing offloading in the Internet of Vehicles.

Background Technology

[0002] As in-vehicle electronic devices become widespread, the level of intelligence in automobiles keeps rising, and the application scenarios of the Internet of Vehicles (IoV) have expanded from the traditional mobile Internet to intelligent transportation (Zhang J, Letaief K B. Mobile Edge Intelligence and Computing for the Internet of Vehicles[J]. Proceedings of the IEEE, 2019, 108(2): 246-261.). Applying Mobile Edge Computing (MEC) to the IoV can meet requirements for real-time operation, security, and intelligence, avoid much of the latency and energy consumption caused by long-distance data transmission, relieve the task computing pressure on individual intelligent vehicles, and help improve the data computing efficiency of the whole IoV system (Lin Y, Zhang Y, Li J, et al. Popularity-Aware Online Task Offloading for Heterogeneous Vehicular Edge Computing Using Contextual Clustering of Bandits[J]. IEEE Internet of Things Journal, 2022, 9(7): 5422-5433.).

[0003] Digital twins are a technology that integrates multi-physics, multi-scale, and multi-disciplinary attributes. They offer real-time synchronization, faithful mapping, and high fidelity, enabling interaction and fusion between the physical and information worlds (Tao Fei, Liu Weiran, Liu Jianhua. Exploration of Digital Twins and Their Applications [J]. Computer Integrated Manufacturing Systems, 2018, 24(01):1-18.). When digital twin technology assists edge computing offloading, IoV applications only need to upload data to the digital twin network on the edge device. This reduces latency between vehicles and between vehicles and edge servers and makes information transmission more timely and accurate, thereby improving system performance and availability.

[0004] Given the real-time and aggregated data that digital twins provide, researchers have considered combining data from digital twin networks with reinforcement learning to find optimal offloading strategies for edge computing tasks and to improve the efficiency and accuracy of strategy learning (Lu Y, Maharjan S, Zhang Y. Adaptive Edge Association for Wireless Digital Twin Networks in 6G[J]. IEEE Internet of Things Journal, 2021, 8(22):16219-16230). With digital twin technology, the IoV can not only supply data from the virtual space but also infer, from the data generated by the digital twins and the environment, information that assists task offloading decisions. This greatly reduces the burden of subsequent offloading decisions, helps reduce the amount of data required to train reinforcement learning algorithms, and improves the learning efficiency of resource management (Zhang K, Cao J, Zhang Y. Adaptive Digital Twin and Multiagent Deep Reinforcement Learning for Vehicular Edge Computing and Networks[J]. IEEE Transactions on Industrial Informatics, 2021, 18(2): 1405-1413.). While digital twins make data analysis efficient and convenient, there is a significant discrepancy between the data estimated by the digital twin network on the edge server and the actual data of the IoV in the physical world. To reduce the error between the digital twin and the physical entity, many scholars have conducted further research in recent years (Yuan X, Chen J, Zhang N, et al. Digital Twin-Driven Vehicular Task Offloading and IRS Configuration in the Internet of Vehicles[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(12): 24290-24304.). However, these studies mainly use relatively traditional reinforcement learning algorithms, such as DDQN (Double Deep Q-Network). These algorithms have low data efficiency and mediocre stability, and cannot fully exploit the real-time performance of digital twins. Moreover, although latency and energy consumption are two conflicting optimization objectives, some scholars have used a weighted-sum method that takes the total reward as the single optimization objective. This does not fully address the multi-objective nature of the problem and cannot fully optimize the offloading decisions for IoV edge computing tasks.

Summary of the Invention

[0005] The purpose of this invention is to provide a multi-objective decision-making method for offloading edge computing in the Internet of Vehicles based on digital twins, which minimizes the offloading cost of long-term tasks in the trade-off between multiple objectives such as time delay, energy consumption and cloud computing costs, and achieves low-cost and high-efficiency intelligent task offloading.

[0006] The technical solution to achieve the purpose of this invention is: a multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins, comprising the following steps:

[0007] Step 1: Input the digital twin-based edge IoV environment, initialize the actor-critic (executor-evaluator) network parameters, and initialize the parameters of the decomposition-based multi-objective evolutionary algorithm.

[0008] Step 2: In the current time slot, each vehicle user selects an offloading scheme according to the action generated in the previous time slot and obtains the reward vector and completion flag from the environment feedback, which are then stored in the buffer.

[0009] Step 3: Fit the value function to the buffer data using mean squared error regression, and calculate the advantage estimate from the current value function and the discounted return.

[0010] Step 4: Update the solution set and fitness values using the decomposition-based multi-objective evolutionary algorithm and the reward vector, and return the Pareto-optimal solution set with the best fitness values as the learning parameters of the actor (executor) network.

[0011] Step 5: Constrain the policy update using the truncation (clipping) method, calculate the loss function of the actor-critic network, and update the policy.

[0012] Step 6: When the completion flag is true, end the current round, start the next round, re-input the digital twin-based edge IoV environment, and repeat steps 2 to 5.
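
The control flow of steps 1 to 6 can be sketched in Python as follows. This is only a skeleton under stated assumptions: `ToyEnv`, `run_round`, the random placeholder rewards, and the random policy are all hypothetical stand-ins, not the environment or networks of the invention. It shows how one round collects transitions into a buffer until the completion flag becomes true.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for the digital twin-based edge IoV environment."""

    def __init__(self, horizon=5, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t  # the toy state is just the slot index

    def step(self, action):
        self.t += 1
        # Two-objective reward vector; the values are random placeholders.
        reward_vec = (-self.rng.random(), -self.rng.random())
        done = self.t >= self.horizon
        return self.t, reward_vec, done


def run_round(env, policy):
    """Collect one round of transitions (step 2) until done is true (step 6)."""
    buffer = []
    state, done = env.reset(), False
    while not done:
        action = policy(state)
        next_state, reward_vec, done = env.step(action)
        buffer.append((state, action, reward_vec, done))
        state = next_state
    # Steps 3 to 5 (value fitting, advantage estimation, evolutionary update,
    # clipped policy update) would consume `buffer` at this point.
    return buffer


policy_rng = random.Random(1)
buffer = run_round(ToyEnv(horizon=5), lambda s: policy_rng.choice([0, 1, 2]))
```

Keeping data collection separate from the update steps mirrors the order of the method: the buffer is filled first, and the value function, fitness values, and policy are updated from it afterwards.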

[0013] Further, the input in step 1 is a digital twin-based edge vehicle-to-everything (V2X) environment, wherein the digital twin-based edge V2X environment includes:

[0014] (1) Time slot model: The continuous training time is discretized into multiple time slots, each indexed by a positive integer. It is assumed that a vehicle user completes one edge offloading decision and one position update within a single time slot, and that environmental conditions such as transmit power and channel noise do not change within a time slot. A round ends when all vehicle users have arrived at the designated road endpoint.

[0017] (2) Network model: A two-layer digital twin IoV edge network model is established, consisting of a physical entity layer and a digital twin layer. The physical entity layer is assumed to contain a set of vehicle users and one base station equipped with an MEC server. In each time slot, the digital twin (DT) of each vehicle is represented by two quantities: the DT's estimate of the true computation frequency available for the vehicle's task in that time slot, and the error between the true computation frequency and the DT estimate. The digital twins of all vehicles, the base station, and the MEC server constitute the digital twin layer.

[0030] (3) Communication model: Traffic is assumed to be one-way. Vehicles communicate wirelessly with the base station and, via the base stations along the road, with the cloud server and the digital twin network. The road is assumed to be an open area without obstacles, and the impact of environmental factors on path loss is disregarded, so a simple free-space model is used for path loss. The channel gain between a vehicle and the base station is therefore determined by the Rayleigh fading of the channel between the vehicle user and the base station (with its corresponding scale parameter), the path loss exponent, and the distance between the vehicle and the base station in the given time slot.

[0040] Information transmission is subject to various kinds of interference, and Gaussian white noise, as a general noise model, can be used to describe the noise in the channel. The information transmission rate between a vehicle and the base station in a given time slot is therefore determined, for each vehicle index, by the bandwidth that the base station allocates to the vehicle, the transmit power between the vehicle and the base station in that time slot, the channel gain, and the power of the Gaussian white noise.
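
As a rough illustration of the communication model, the following Python sketch computes a channel gain from Rayleigh fading and distance-based path loss, and then a Shannon-type transmission rate under Gaussian white noise. The function names, the parameter names (`scale`, `alpha`, `bandwidth_hz`, `tx_power_w`, `noise_power_w`), the numeric values, and the exact functional forms are assumptions; the formulas of the patent are not reproduced in this text.

```python
import numpy as np


def channel_gain(distance_m, alpha=2.0, scale=1.0, rng=None):
    """Assumed form: squared Rayleigh amplitude times distance^(-alpha)."""
    rng = np.random.default_rng() if rng is None else rng
    g = rng.rayleigh(scale)  # Rayleigh fading amplitude with scale parameter
    return (g ** 2) * distance_m ** (-alpha)


def transmission_rate(bandwidth_hz, tx_power_w, gain, noise_power_w):
    """Shannon capacity under additive white Gaussian noise, in bits/s."""
    snr = tx_power_w * gain / noise_power_w
    return bandwidth_hz * np.log2(1.0 + snr)


rng = np.random.default_rng(0)
h = channel_gain(distance_m=200.0, alpha=2.0, scale=1.0, rng=rng)
rate = transmission_rate(bandwidth_hz=1e6, tx_power_w=0.5,
                         gain=h, noise_power_w=1e-13)
```

With `alpha=2.0` the path loss follows the free-space inverse-square law; a larger exponent would model a more lossy environment.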

[0052] (4) Computation model: A two-element tuple describes the task of each vehicle in each time slot. Its first element is the size of the task data sent by the vehicle in that time slot, and its second element is the number of computation cycles required per unit (bit) of task data. Real-time data exchange between the digital twin and the physical entity introduces a latency error.

[0062] For the task of each vehicle in each time slot, the model defines: the error between the actual computation delay and the delay estimated by the digital twin; the task computation time; the task transmission time; the energy consumed by task transmission; the computation energy consumed when the task is executed locally; and the computation energy consumed when the task is offloaded to the MEC server.

[0079] The cost of renting cloud computing cannot be ignored. Given a unit rental cost of cloud computing in dollars, the model further defines the cost of transmitting the task to the cloud computing platform and the cost of computing the task in the cloud, the latter scaled by a price factor that depends on the cloud service provider.

[0089] From these quantities, the total time cost, total energy cost, and total cloud computing cost of offloading the computing task of each vehicle in each time slot are obtained.
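
The delay and energy quantities of the computation model can be sketched in Python as below. Every formula here is an assumption based on common mobile edge computing models (cycles divided by frequency for delay, power times time for transmission energy, a `kappa * f^2` energy-per-cycle model), since the patent's own expressions are not reproduced in this text. The names `Task`, `dt_delay_gap`, and `kappa` are hypothetical, and the cloud rental cost is left out because its form cannot be inferred.

```python
from dataclasses import dataclass


@dataclass
class Task:
    data_bits: float       # size of the task data (bits)
    cycles_per_bit: float  # computation cycles required per bit


def compute_time(task, freq_hz):
    """Time to process the task at a given computation frequency (seconds)."""
    return task.data_bits * task.cycles_per_bit / freq_hz


def dt_delay_gap(task, f_est_hz, f_err_hz):
    """Actual minus DT-estimated delay, assuming true freq = estimate + error."""
    return compute_time(task, f_est_hz + f_err_hz) - compute_time(task, f_est_hz)


def transmit_time(task, rate_bps):
    return task.data_bits / rate_bps


def transmit_energy(task, rate_bps, tx_power_w):
    return tx_power_w * transmit_time(task, rate_bps)


def compute_energy(task, freq_hz, kappa=1e-27):
    """Assumed CMOS-style model: kappa * f^2 joules per cycle."""
    return kappa * freq_hz ** 2 * task.data_bits * task.cycles_per_bit


task = Task(data_bits=1e6, cycles_per_bit=500.0)
t_local = compute_time(task, 1e9)                            # local execution
t_mec = transmit_time(task, 1e7) + compute_time(task, 5e9)   # upload, then MEC
e_local = compute_energy(task, 1e9)
e_tx = transmit_energy(task, 1e7, 0.5)
gap = dt_delay_gap(task, f_est_hz=1e9, f_err_hz=-1e8)
```

In this toy setting offloading to the MEC server is faster than local execution, and a DT that overestimates the available frequency (negative error) underestimates the true delay, so `gap` is positive.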

[0093] Further, in step 2, each vehicle user selects an offloading scheme according to the action generated in the previous time slot, obtains the reward vector and completion flag from the environment feedback, and stores them in the buffer. Specifically:

[0094] (1) Actions of vehicle users

[0095] In each time slot, the action vector selected by the vehicle users consists of the actions of the individual vehicles. The action of a vehicle in a time slot takes the value 0 for local computing, 1 for offloading to MEC computing, and 2 for offloading to cloud computing.

[0103] (2) System rewards

[0104] The method has two optimization objectives: minimizing the latency of offloading the vehicle users' computing tasks, and minimizing energy consumption and cloud computing cost. Because the cost values can be large, a scaling parameter is used to reduce the reward values.

[0106] For each time slot, one system reward is defined for the objective of minimizing latency and another for the objective of minimizing energy consumption and cloud computing cost.

[0110] Combining the system rewards of the two objectives yields the reward vector returned by the environment.

[0112] (3) Completion flag

[0113] Vehicles are assumed to travel at a constant speed in one direction, and each vehicle user performs one movement and position update per time slot. When all vehicle users have reached the destination along the designated path from their respective starting points, the round ends and the returned completion flag (done) is true; otherwise, done is false.
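
The three ingredients of step 2 (action encoding, two-objective reward vector, completion flag) can be sketched in Python as follows. The action values 0, 1, and 2 come from the text. The negative-cost reward form, the `scale` parameter, the 1-second slot duration, and all names are assumptions. The road length, speed, vehicle count, and spacing are borrowed from the simulation settings of the embodiment.

```python
from enum import IntEnum

import numpy as np


class OffloadTarget(IntEnum):
    LOCAL = 0  # compute on the vehicle
    MEC = 1    # offload to the MEC server
    CLOUD = 2  # offload to the cloud


ROAD_LENGTH_M = 1000.0
SPEED_MPS = 50.0 / 3.6  # 50 km/h
SLOT_S = 1.0            # assumed slot duration


def reward_vector(total_time_s, total_energy_j, cloud_cost_usd, scale=0.1):
    """Assumed form: negative costs, shrunk by `scale` when costs are large."""
    return np.array([-scale * total_time_s,
                     -scale * (total_energy_j + cloud_cost_usd)])


def move_vehicles(positions_m):
    """Advance every vehicle by one slot and report whether the round is over."""
    new_pos = np.minimum(positions_m + SPEED_MPS * SLOT_S, ROAD_LENGTH_M)
    done = bool(np.all(new_pos >= ROAD_LENGTH_M))
    return new_pos, done


positions = 10.0 * np.arange(12)  # 12 vehicles starting 10 m apart
slots, done = 0, False
while not done:
    positions, done = move_vehicles(positions)
    slots += 1

r = reward_vector(total_time_s=2.0, total_energy_j=3.0, cloud_cost_usd=1.0)
```

The round ends only when the last vehicle reaches the end of the road, so its length is set by the vehicle that starts farthest from the endpoint.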

[0114] Further, in step 3, the value function is fitted to the buffer data by mean squared error regression, and the advantage estimate is calculated from the current value function and the discounted return. Specifically:

[0115] The value function is fitted by mean squared error regression. Given the current value function of the Markov decision process, the reward vector of each time slot, the discount factor, and the state following each time slot, the discounted return is computed, and the advantage estimate is obtained from the discounted return and the current value function.
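
A minimal Python sketch of step 3 follows, assuming the usual definitions: the discounted return is accumulated backward through the buffer, the advantage is the return minus the current value estimate, and the value function is fitted by minimizing the mean squared error against the returns. The function names and sample numbers are hypothetical, and the exact estimator used in the patent may differ.

```python
import numpy as np


def discounted_returns(rewards, gamma, bootstrap_value=0.0):
    """Backward recursion G_t = r_t + gamma * G_{t+1}."""
    rewards = np.asarray(rewards, dtype=float)
    returns = np.zeros_like(rewards)
    running = bootstrap_value
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def advantage_estimates(returns, values):
    """Advantage as discounted return minus the current value estimate."""
    return np.asarray(returns, dtype=float) - np.asarray(values, dtype=float)


def value_mse_loss(values, returns):
    """Regression target for fitting the value function."""
    diff = np.asarray(values, dtype=float) - np.asarray(returns, dtype=float)
    return float(np.mean(diff ** 2))


rewards = [1.0, 1.0, 1.0]
values = [1.5, 1.5, 1.5]
G = discounted_returns(rewards, gamma=0.5)
A = advantage_estimates(G, values)
loss = value_mse_loss(values, G)
```

Because the recursion works elementwise, passing a two-column reward array (one column per objective) yields per-objective returns without any change to the code.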

[0127] Further, in step 4, the solution set and fitness values are updated using the decomposition-based multi-objective evolutionary algorithm and the reward vector, and the Pareto-optimal solution set with the best fitness values is returned as the learning parameters of the actor (executor) network. Specifically:

[0128] Starting from the initial fitness vector of the decomposition-based multi-objective evolutionary algorithm, a fitness value that evaluates overall performance is obtained for the reward vector using the Chebyshev (Tchebycheff) method, in which each objective function has its own weight and its own reference point. The fitness values of the solutions are compared, Pareto optimality is determined, and the set of solutions achieving the best fitness value is returned as the learning parameters of the actor network.
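
The Chebyshev scalarization and the Pareto comparison of step 4 can be sketched in Python as follows. The sketch assumes the standard form used in decomposition-based evolutionary algorithms, where the fitness is the largest weighted distance to a reference point and a smaller value is better. Whether the patent uses exactly this sign convention cannot be confirmed from the text, and all names and sample values are hypothetical.

```python
import numpy as np


def tchebycheff(objectives, weights, reference):
    """Largest weighted absolute distance to the reference point."""
    objectives = np.asarray(objectives, dtype=float)
    weights = np.asarray(weights, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.max(weights * np.abs(objectives - reference)))


def pareto_front(points):
    """Indices of non-dominated points when every objective is maximized."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q >= p) and np.any(q > p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep


# Candidate reward vectors (latency reward, energy-and-cost reward).
candidates = [[-1.0, -4.0], [-2.0, -2.0], [-3.0, -3.0]]
front = pareto_front(candidates)
fitness = [tchebycheff(c, weights=[0.5, 0.5], reference=[0.0, 0.0])
           for c in candidates]
best = min(front, key=lambda i: fitness[i])
```

Changing the weight vector moves the preferred solution along the Pareto front, which is how a decomposition-based algorithm covers different trade-offs between the objectives.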

[0137] Further, in step 5, policy updates are constrained by the truncation (clipping) method, the loss function of the actor-critic (executor-evaluator) network is calculated, and the policy is updated. Specifically:

[0138] The proximal policy optimization (PPO) algorithm uses the clip method to limit the magnitude of each policy update. At each update, it calculates the probability ratio between the new and old policies and restricts this ratio to a range controlled by a hyperparameter. The loss function is defined in terms of this ratio, the advantage estimate, and the truncation function clip().

[0143] If the reward generated by the current action is greater than the expected reward of the baseline action (a positive advantage), the updated policy increases the probability of this action, but only up to a bounded multiple of its probability under the original policy. Conversely, if the advantage is negative, the updated policy reduces the probability of the action, but not below a bounded fraction of its probability under the original policy.

[0147] The policy is updated by maximizing this loss function.
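
The clipped objective of step 5 can be sketched in Python using the standard PPO-Clip form: the probability ratio between the new and old policies is clipped to the interval from `1 - eps` to `1 + eps`, and the smaller of the clipped and unclipped terms is kept. The hyperparameter name `eps`, its value 0.2, and the function name are assumptions; the patent text only states that a hyperparameter controls the range.

```python
import numpy as np


def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Mean clipped surrogate objective (to be maximized)."""
    new_logp = np.asarray(new_logp, dtype=float)
    old_logp = np.asarray(old_logp, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    ratio = np.exp(new_logp - old_logp)             # new policy / old policy
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)  # truncation function
    return float(np.mean(np.minimum(ratio * advantages, clipped * advantages)))


# Positive advantage with ratio 1.5: the gain is capped at 1 + eps.
obj_pos = ppo_clip_objective([np.log(1.5)], [0.0], [1.0])
# Negative advantage with ratio 0.5: the term is held at 1 - eps.
obj_neg = ppo_clip_objective([np.log(0.5)], [0.0], [-1.0])
```

Once the ratio leaves the clipping interval in the direction favored by the advantage, the objective stops changing with the ratio, so the gradient no longer pushes the policy further away from the old one.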

[0149] Compared with the prior art, the significant advantages of this invention are:

[0150] (1) Digital twin technology and mobile edge computing are used to improve the real-time performance and accuracy of data, raise the data computing efficiency and intelligence level of the Internet of Vehicles, and alleviate its high latency and high energy consumption.

(2) By using the truncation method to limit the range of policy updates, the method greatly improves the stability and convergence speed of the algorithm.

(3) The method handles the multi-objective optimization problem effectively: it minimizes the long-term task offloading cost in the trade-off among latency, energy consumption, and cloud computing cost, achieving low-cost, high-efficiency intelligent task offloading.

Attached Figure Description

[0151] Figure 1 is a flowchart of the multi-objective decision-making method for edge computing offloading in the Internet of Vehicles based on digital twins, as described in this invention.

[0152] Figure 2 is a schematic diagram of the vehicle edge network topology based on digital twins according to an embodiment of the present invention.

[0153] Figure 3 is a schematic diagram of the learning convergence of different schemes in the embodiments of the present invention.

[0154] Figure 4 is a comparison chart of the loss functions of different schemes in the embodiments of the present invention.

[0155] Figure 5 is a comparison chart of the reward values of different schemes under different task data volumes in the embodiments of the present invention.

[0156] Figure 6 is a comparison chart of the reward values of different schemes under different numbers of vehicle users in the embodiments of the present invention.

[0157] Figure 7 is a comparison chart of the reward values of different schemes under different numbers of MEC servers in the embodiments of the present invention.

Detailed Implementation

[0158] The purpose of this invention is to provide a multi-objective decision-making method for offloading edge computing in the Internet of Vehicles based on digital twins, which minimizes the offloading cost of long-term tasks in the trade-off between multiple objectives such as time delay, energy consumption and cloud computing costs, and achieves low-cost and high-efficiency intelligent task offloading.

[0159] The technical solution to achieve the purpose of this invention is a multi-objective decision-making method for edge computing offloading in the Internet of Vehicles based on digital twins which, with reference to Figures 1-2, comprises the following steps:

[0160] Step 1: Input the digital twin-based edge IoV environment, initialize the actor-critic (executor-evaluator) network parameters, and initialize the parameters of the decomposition-based multi-objective evolutionary algorithm.

[0161] Step 2: In the current time slot, each vehicle user selects an offloading scheme according to the action generated in the previous time slot and obtains the reward vector and completion flag from the environment feedback, which are then stored in the buffer.

[0162] Step 3: Fit the value function to the buffer data using mean squared error regression, and calculate the advantage estimate from the current value function and the discounted return.

[0163] Step 4: Update the solution set and fitness values using the decomposition-based multi-objective evolutionary algorithm and the reward vector, and return the Pareto-optimal solution set with the best fitness values as the learning parameters of the actor (executor) network.

[0164] Step 5: Constrain the policy update using the truncation (clipping) method, calculate the loss function of the actor-critic network, and update the policy.

[0165] Step 6: When the completion flag is true, end the current round, start the next round, re-input the digital twin-based edge IoV environment, and repeat steps 2 to 5.

[0166] Further, the input in step 1 is a digital twin-based edge vehicle-to-everything (V2X) environment, wherein the digital twin-based edge V2X environment includes:

[0167] (1) Time slot model: The continuous training time is discretized into multiple time slots, each indexed by a positive integer. It is assumed that a vehicle user completes one edge offloading decision and one position update within a single time slot, and that environmental conditions such as transmit power and channel noise do not change within a time slot. A round ends when all vehicle users have arrived at the designated road endpoint.

[0170] (2) Network model: A two-layer digital twin IoV edge network model is established, consisting of a physical entity layer and a digital twin layer. The physical entity layer is assumed to contain a set of vehicle users and one base station equipped with an MEC server. In each time slot, the digital twin (DT) of each vehicle is represented by two quantities: the DT's estimate of the true computation frequency available for the vehicle's task in that time slot, and the error between the true computation frequency and the DT estimate. The digital twins of all vehicles, the base station, and the MEC server constitute the digital twin layer.

[0183] (3) Communication model: Traffic is assumed to be one-way. Vehicles communicate wirelessly with the base station and, via the base stations along the road, with the cloud server and the digital twin network. The road is assumed to be an open area without obstacles, and the impact of environmental factors on path loss is disregarded, so a simple free-space model is used for path loss. The channel gain between a vehicle and the base station is therefore determined by the Rayleigh fading of the channel between the vehicle user and the base station (with its corresponding scale parameter), the path loss exponent, and the distance between the vehicle and the base station in the given time slot.

[0193] Information transmission is subject to various kinds of interference, and Gaussian white noise, as a general noise model, can be used to describe the noise in the channel. The information transmission rate between a vehicle and the base station in a given time slot is therefore determined, for each vehicle index, by the bandwidth that the base station allocates to the vehicle, the transmit power between the vehicle and the base station in that time slot, the channel gain, and the power of the Gaussian white noise.

[0205] (4) Computation model: A two-element tuple describes the task of each vehicle in each time slot. Its first element is the size of the task data sent by the vehicle in that time slot, and its second element is the number of computation cycles required per unit (bit) of task data. Real-time data exchange between the digital twin and the physical entity introduces a latency error.

[0215] For the task of each vehicle in each time slot, the model defines: the error between the actual computation delay and the delay estimated by the digital twin; the task computation time; the task transmission time; the energy consumed by task transmission; the computation energy consumed when the task is executed locally; and the computation energy consumed when the task is offloaded to the MEC server.

[0232] The cost of renting cloud computing cannot be ignored. Given a unit rental cost of cloud computing in dollars, the model further defines the cost of transmitting the task to the cloud computing platform and the cost of computing the task in the cloud, the latter scaled by a price factor that depends on the cloud service provider.

[0242] From these quantities, the total time cost, total energy cost, and total cloud computing cost of offloading the computing task of each vehicle in each time slot are obtained.

[0246] Further, in step 2, each vehicle user selects an offloading scheme according to the action generated in the previous time slot, obtains the reward vector and completion flag from the environment feedback, and stores them in the buffer. Specifically:

[0247] (1) Actions of vehicle users

[0248] In each time slot, the action vector selected by the vehicle users consists of the actions of the individual vehicles. The action of a vehicle in a time slot takes the value 0 for local computing, 1 for offloading to MEC computing, and 2 for offloading to cloud computing.

[0256] (2) System rewards

[0257] The method has two optimization objectives: minimizing the latency of offloading the vehicle users' computing tasks, and minimizing energy consumption and cloud computing cost. Because the cost values can be large, a scaling parameter is used to reduce the reward values.

[0259] For each time slot, one system reward is defined for the objective of minimizing latency and another for the objective of minimizing energy consumption and cloud computing cost.

[0263] Combining the system rewards of the two objectives yields the reward vector returned by the environment.

[0265] (3) Completion flag

[0266] Vehicles are assumed to travel at a constant speed in one direction, and each vehicle user performs one movement and position update per time slot. When all vehicle users have reached the destination along the designated path from their respective starting points, the round ends and the returned completion flag (done) is true; otherwise, done is false.

[0267] Further, in step 3, the value function is fitted to the buffer data by mean squared error regression, and the advantage estimate is calculated from the current value function and the discounted return. Specifically:

[0268] The value function is fitted by mean squared error regression. Given the current value function of the Markov decision process, the reward vector of each time slot, the discount factor, and the state following each time slot, the discounted return is computed, and the advantage estimate is obtained from the discounted return and the current value function.

[0280] Further, in step 4, the solution set and fitness values are updated using the decomposition-based multi-objective evolutionary algorithm and the reward vector, and the Pareto-optimal solution set with the best fitness values is returned as the learning parameters of the actor (executor) network. Specifically:

[0281] Starting from the initial fitness vector of the decomposition-based multi-objective evolutionary algorithm, a fitness value that evaluates overall performance is obtained for the reward vector using the Chebyshev (Tchebycheff) method, in which each objective function has its own weight and its own reference point. The fitness values of the solutions are compared, Pareto optimality is determined, and the set of solutions achieving the best fitness value is returned as the learning parameters of the actor network.

[0290] Further, step 5, in which policy updates are constrained by a truncation method, the loss function of the executor-evaluator network is calculated, and the policy is updated, proceeds as follows:

[0291]-[0292] The proximal policy optimization algorithm uses the clip truncation method to limit the magnitude of each policy update. At each update, it calculates the ratio between the new and old policies and then restricts this ratio to a range controlled by a hyperparameter. The loss function can therefore be defined as:

[0293]

[0294]-[0301] where clip() is the truncation function. If the reward generated by the current action is greater than the expected reward of the baseline action, the updated policy increases the probability of this action, but to no more than a bounded multiple of its probability under the original policy. Conversely, if the reward is smaller, the updated policy reduces the probability of the action, but to no less than a bounded multiple of its probability under the original policy. Finally, the policy is updated by maximizing the loss function.
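The clipped objective described above can be sketched as follows. This is the widely used clipped surrogate objective of proximal policy optimization, written with NumPy for illustration; the loss formula of this document is not reproduced in the text, so the exact form, the clipping range `eps`, and the sample probabilities are assumptions.

```python
import numpy as np

def ppo_clip_objective(new_logp, old_logp, adv, eps=0.2):
    """Mean clipped surrogate objective (to be maximized)."""
    ratio = np.exp(np.asarray(new_logp, dtype=float) - np.asarray(old_logp, dtype=float))
    adv = np.asarray(adv, dtype=float)
    unclipped = ratio * adv
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * adv
    # The element-wise minimum removes any incentive to move the ratio
    # outside [1 - eps, 1 + eps] in the direction the advantage favours.
    return float(np.mean(np.minimum(unclipped, clipped)))

old_logp = np.log([0.5, 0.5])
new_logp = np.log([0.75, 0.25])   # probability ratios 1.5 and 0.5
adv = [1.0, -1.0]                 # first action better, second worse than baseline

objective = ppo_clip_objective(new_logp, old_logp, adv, eps=0.2)
```

In the first sample the ratio of 1.5 is capped at 1.2, and in the second the ratio of 0.5 is floored at 0.8, which matches the upper and lower bounds on the probability change described in the text.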

[0302] The present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.

[0303] Example

[0304] An embodiment of the present invention is described in detail below. The simulation is implemented in Python, and the parameter settings do not affect generality. The method is compared with the following methods: (1) a digital twin-assisted vehicle network edge computing task offloading decision method based on single-objective proximal policy optimization; (2) a vehicle network edge computing task offloading decision method based on multi-objective proximal policy optimization; (3) a vehicle network edge computing task offloading decision method based on single-objective proximal policy optimization; and (4) a vehicle network edge computing task offloading decision method based on the executor-evaluator algorithm.

[0305] The vehicle edge network model is shown in Figure 2. There are 12 vehicle users and 1 MEC server, and the base station covers the entire road area. The road is 1 kilometer long, and the vehicles start at 10-meter intervals. Vehicles travel at a constant speed of 50 km/h in one direction. The task data volume, computation frequency, and computation density of each vehicle are random values within a set range. The main simulation parameters are shown in Table 1.
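The mobility part of this scenario can be sketched in a few lines. The road length, vehicle count, start spacing, and speed are taken from the paragraph above; the slot length of one second and the placement of the first vehicle at 0 m are assumptions, since neither is stated in the text.

```python
ROAD_LENGTH_M = 1000.0        # 1 km road
NUM_VEHICLES = 12
START_GAP_M = 10.0            # spacing between starting positions
SPEED_MPS = 50.0 / 3.6        # 50 km/h expressed in m/s
SLOT_SECONDS = 1.0            # assumed slot length (not given in the text)

def positions_at(slot):
    """Position of every vehicle after `slot` time slots, capped at the road end."""
    return [
        min(ROAD_LENGTH_M, i * START_GAP_M + SPEED_MPS * SLOT_SECONDS * slot)
        for i in range(NUM_VEHICLES)
    ]

def round_done(slot):
    """Completion flag: true once every vehicle has reached the road end."""
    return all(p >= ROAD_LENGTH_M for p in positions_at(slot))

def first_done_slot(max_slots=10_000):
    """Smallest slot index at which the round is complete."""
    for slot in range(max_slots):
        if round_done(slot):
            return slot
    raise RuntimeError("round did not finish within max_slots")

slots_in_round = first_done_slot()
```

Under these assumptions a round lasts roughly 72 slots, determined by the rearmost vehicle.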

[0306] Table 1 Main Simulation Parameters

[0307]

[0308] As shown in Figure 3, compared with the comparative schemes, the digital twin-based multi-objective decision-making method for vehicle-to-everything edge computing offloading has the fastest convergence speed, the highest reward value, the best stability, and the best performance after convergence. This verifies that the method can minimize the offloading cost of long-term tasks while trading off multiple objectives such as time delay, energy consumption, and cloud computing costs, achieving low-cost, high-efficiency intelligent task offloading.

[0309] The vehicular network edge computing task offloading decision method based on the executor-evaluator algorithm converges more slowly than the other methods and has the smallest reward value after convergence, because it struggles to handle high-dimensional state and action spaces. The method based on single-objective proximal policy optimization uses the clip method to limit the size of each policy update, which avoids excessively large policy updates and makes the algorithm converge faster and more stably. However, it does not consider multi-objective optimization, so its performance is inferior to that of the method based on multi-objective proximal policy optimization. One advantage of digital twins is their ability to help reduce errors in long-distance information transmission, so the methods based on digital twins perform slightly better than those without.

[0310] As shown in Figure 4, the loss of the digital twin-based multi-objective decision-making method for Internet of Vehicles edge computing offloading is smaller than that of the other methods, and its loss curve fluctuates less after convergence, indicating that the method is more stable and performs better. This is because the method combines the advantages of decomposition-based multi-objective evolutionary algorithms and proximal policy optimization, making it more efficient and stable and better able to achieve multi-objective collaborative optimization.

[0311] As shown in Figures 5-7, as the task data volume, the number of vehicle users, and the number of MECs increase, the average converged reward of each scheme gradually decreases, because the learning complexity increases and algorithm performance degrades. However, under each parameter change, the digital twin-based multi-objective decision-making method for vehicle network edge computing offloading shows the smallest decrease and retains the highest reward, further verifying its superiority under complex vehicle network conditions.

[0312] The foregoing has shown and described the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments. The embodiments and descriptions in the specification are merely illustrative of the principles of the invention. Various changes and modifications can be made to the invention without departing from its spirit and scope, and all such changes and modifications fall within the scope of the present invention as claimed. The scope of protection of the present invention is defined by the appended claims and their equivalents.

Claims

1. A multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins, characterized in that it includes the following steps:
Step 1: Input the edge vehicle-to-everything (V2X) environment based on digital twins, initialize the executor-evaluator network parameters, and initialize the parameters of the decomposition-based multi-objective evolutionary algorithm;
Step 2: In the current time slot, each vehicle user selects an offloading scheme based on the action generated in the previous time slot and obtains the reward vector and completion flag from the environmental feedback, which are then stored in the cache;
Step 3: Use mean squared error regression to fit the value function based on the cache data, and calculate the advantage estimate based on the current value function and discounted return;
Step 4: Update the solution set and fitness value using a decomposition-based multi-objective evolutionary algorithm and the reward vector, and return the Pareto optimal solution set that optimizes the fitness value as the learning parameters for the executor network;
Step 5: Constrain the policy update using the truncation method, calculate the loss function of the executor-evaluator network, and update the policy;
Step 6: When the completion flag is true, end the current round, start the next round, re-enter the edge vehicle networking environment based on digital twins, and repeat steps 2 to 5.
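The data-collection part of this loop (steps 2 and 6) can be illustrated with a toy stand-in environment. Everything in this sketch is hypothetical: `ToyV2XEnv`, its fixed round length, and its random placeholder rewards do not model the digital twin environment; they only show how reward vectors and completion flags would be stored in a cache and how a new round starts when the flag is true. The action coding 0, 1, 2 follows the action definition given in claim 3.

```python
import random

class ToyV2XEnv:
    """Hypothetical stand-in: a round lasts a fixed number of time slots."""
    def __init__(self, num_vehicles=3, slots_per_round=5, seed=0):
        self.num_vehicles = num_vehicles
        self.slots_per_round = slots_per_round
        self.rng = random.Random(seed)
        self.slot = 0

    def reset(self):
        self.slot = 0
        return self.slot

    def step(self, actions):
        # Placeholder reward vector: (latency objective, energy-and-cost objective).
        reward_vector = (-self.rng.random(), -self.rng.random())
        self.slot += 1
        done = self.slot >= self.slots_per_round
        return self.slot, reward_vector, done

def make_random_policy(seed=1):
    rng = random.Random(seed)
    def policy(state, num_vehicles):
        # 0 = local computing, 1 = MEC computing, 2 = cloud computing.
        return [rng.randrange(3) for _ in range(num_vehicles)]
    return policy

def collect_round(env, policy):
    """Step 2: act in every slot and cache (state, actions, reward vector, done)."""
    buffer = []
    state = env.reset()
    done = False
    while not done:
        actions = policy(state, env.num_vehicles)
        next_state, reward_vector, done = env.step(actions)
        buffer.append((state, actions, reward_vector, done))
        state = next_state
    return buffer

env = ToyV2XEnv()
policy = make_random_policy()
for round_index in range(2):             # step 6: new round once done is true
    buffer = collect_round(env, policy)
    # Steps 3 to 5 (value fit, evolutionary update, clipped policy update)
    # would consume `buffer` at this point.
```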

2. The multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins as described in claim 1, characterized in that the input of the digital twin-based edge vehicle networking environment in step 1 includes:
(1) Time slot model: The continuous training time is discretized into multiple time slots, each indexed by a positive integer. It is assumed that a vehicle user completes one edge buffer decision and one location movement within a single time slot, and that the environmental state, namely the transmit power and channel noise, does not change within a single time slot. The period in which all vehicle users arrive at the set road end point is called a round.
(2) Network model: A two-layer digital twin vehicle network edge network model is established, consisting of a physical entity layer and a digital twin layer. The physical entity layer is assumed to include a set of vehicle users and one base station equipped with an MEC server. The digital twin of each vehicle in each time slot comprises the DT estimate of the true task computation frequency of that vehicle in that time slot, together with the error between the actual task computation frequency and the DT estimate. The digital twins of all vehicles, the base station, and the MEC server constitute the digital twin layer.
(3) Communication model: One-way traffic is assumed. Vehicles can communicate with the base station wirelessly, or with cloud servers and the digital twin network via base stations along the road. The road is assumed to be an open area without obstacles, and the impact of environmental factors on path loss is disregarded, so a simple free-space model is adopted for path loss. The channel gain between a vehicle and the base station is therefore expressed in terms of the Rayleigh fading of the channel between the vehicle user and the base station with its corresponding scale parameter, the path loss factor, and the distance between the vehicle and the base station in the time slot. Information transmission is subject to various interferences, and Gaussian white noise, as a general noise model, is used to describe noise interference in the channel. The information transmission rate between a vehicle and the base station in a time slot is therefore expressed in terms of the vehicle index, the bandwidth that the base station allocates to the vehicle, the signal transmission power between the vehicle and the base station in that time slot, and the power of the Gaussian white noise.
(4) Computation model: A two-tuple describes the task of a vehicle in a time slot, consisting of the size of the task data sent by the vehicle in that time slot and the number of computation cycles required per unit (bit) of task data. A latency error exists when digital twins and physical entities interact in real time, namely the error between the actual computation delay in a time slot and the delay estimated by the digital twin. For each vehicle and time slot, the model defines the task computation time, the task transmission time, the task transmission energy consumption, the computation energy consumption when the task is offloaded to local computing, and the computation energy consumption when the task is offloaded to the MEC server. The cost of renting cloud computing cannot be ignored: given a unit rental cost in dollars, the model defines the cost of transferring the task to the cloud computing platform and the cost of computing the task in the cloud, the latter involving a price factor related to the cloud service provider. These quantities are combined into the total time cost, energy cost, and cloud computing cost of offloading the computing task of each vehicle in each time slot.

3. The multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins as described in claim 2, characterized in that step 2, in which each vehicle user in the current time slot selects an offloading scheme based on the action generated in the previous time slot, obtains the reward vector and completion flag from the environmental feedback, and stores them in the cache, specifically comprises:
(1) Actions of vehicle users: In each time slot, the action vector selected by the vehicle users collects the action of every vehicle in that time slot. An action value of 0 indicates that the vehicle uses local computing, a value of 1 indicates MEC computing, and a value of 2 indicates cloud computing.
(2) System rewards: The method has two optimization objectives: minimizing the latency of offloading the computing tasks of vehicle users, and minimizing energy consumption and cloud computing costs. When the cost value is large, a parameter is used to reduce the reward value. One system reward is set per time slot for the objective of minimizing latency, and another for the objective of minimizing energy consumption and cloud computing costs. Combining the system rewards of the two objectives yields the reward vector from the environmental feedback.
(3) Completion flag: Assuming vehicles travel at a constant speed in one direction, each vehicle user moves and updates its position once per time slot. If all vehicle users reach the end point of the specified path from their respective starting points, the round ends and the completion flag "done" is true; otherwise, "done" is false.
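The expressions of the communication model in claim 2 are not reproduced in this text. As a rough illustration only, the sketch below assumes a channel gain equal to a Rayleigh-faded power term multiplied by a distance-based path loss, the Shannon capacity formula for the transmission rate, and a transmission time equal to the task size divided by the rate. All function names, parameter names, and default values are hypothetical.

```python
import math
import random

def channel_gain(distance_m, path_loss_exponent=2.0, rayleigh_scale=1.0, rng=None):
    """Assumed gain: squared Rayleigh amplitude times distance^(-exponent)."""
    rng = rng or random.Random(0)
    # A Rayleigh amplitude is the norm of two independent zero-mean Gaussians.
    amplitude = math.hypot(rng.gauss(0.0, rayleigh_scale),
                           rng.gauss(0.0, rayleigh_scale))
    return (amplitude ** 2) * distance_m ** (-path_loss_exponent)

def transmission_rate(bandwidth_hz, tx_power_w, gain, noise_power_w):
    """Shannon capacity with additive white Gaussian noise, in bit/s."""
    return bandwidth_hz * math.log2(1.0 + tx_power_w * gain / noise_power_w)

def transmission_time(task_bits, rate_bps):
    """Time needed to upload the task data at the given rate, in seconds."""
    return task_bits / rate_bps

# Same fading draw at two distances: the farther vehicle sees a lower gain.
near = channel_gain(100.0, rng=random.Random(0))
far = channel_gain(200.0, rng=random.Random(0))

rate = transmission_rate(bandwidth_hz=1e6, tx_power_w=1.0, gain=3.0, noise_power_w=1.0)
upload_seconds = transmission_time(task_bits=2e6, rate_bps=rate)
```

A lower gain at larger distance lowers the rate and lengthens the upload, which is one way the vehicle position can enter the latency objective.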

4. The multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins according to claim 3, characterized in that step 3, in which a value function is fitted to the cache data by mean squared error regression and the advantage estimate is calculated from the current value function and the discounted return, specifically comprises: fitting a value function by mean squared error regression; taking the current value function in the Markov decision process together with the reward vector of each time slot, the discount factor, and the state that follows each time slot, and expressing the discounted return in terms of these quantities; and expressing the advantage estimation function in terms of the current value function and the discounted return.

5. The multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins according to claim 4, characterized in that step 4, in which the solution set and fitness values are updated using a decomposition-based multi-objective evolutionary algorithm and the reward vectors, and the Pareto optimal solution set that maximizes the fitness values is returned as the learning parameters for the executor network, specifically comprises: initializing the fitness vector of the decomposition-based multi-objective evolutionary algorithm; for each reward vector, obtaining a fitness value that evaluates overall performance using the Chebyshev method, which involves the weight of each objective function and the reference point of each objective function; and comparing the fitness values of the solutions to determine Pareto optimality, the set of solutions that achieves the optimal fitness value being returned as the learning parameters for the executor (Actor) network.

6. The multi-objective decision-making method for edge computing offloading in vehicle-to-everything (V2X) networks based on digital twins according to claim 5, characterized in that step 5, in which policy updates are constrained by a truncation method, the loss function of the executor-evaluator network is calculated, and the policy is updated, specifically comprises: the proximal policy optimization algorithm uses the clip truncation method to limit the magnitude of each policy update; at each update, it calculates the ratio between the new and old policies and then restricts this ratio to a range controlled by a hyperparameter; the loss function is defined accordingly, with clip() being the truncation function; if the reward generated by the current action is greater than the expected reward of the baseline action, the updated policy increases the probability of this action, but to no more than a bounded multiple of its probability under the original policy; conversely, if the reward is smaller, the updated policy reduces the probability of the action, but to no less than a bounded multiple of its probability under the original policy; finally, the policy is updated by maximizing the loss function.