A method for dynamic access optimization of a satellite-ground laser communication multi-optical ground station
By constructing site state vectors and graph convolutional feature transfer, and combining spatial correlation length and stability constraint scores, the selection of active sites is optimized, which solves the problems of frequent handover and insufficient link stability in dynamic access of multiple optical ground stations, and improves long-term link availability and access stability.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- CHANGCHUN UNIV OF SCI & TECH
- Filing Date
- 2026-05-14
- Publication Date
- 2026-06-19
Figure CN122247513A_ABST
Abstract
Description
Technical Field
[0001] This invention belongs to the field of satellite optical communication and intelligent scheduling technology, and in particular relates to a dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication.
Background Technology
[0002] In satellite-to-ground laser communication systems, the access capability of a single optical ground station is easily affected by local weather changes. In particular, cloud cover can cause link interruptions, and atmospheric turbulence can degrade link transmission quality. To improve the overall availability of satellite-to-ground laser communication systems, multiple optical ground stations are typically used for collaborative access, and the active station is dynamically selected based on changes in the link status of each station during communication.
[0003] Most existing dynamic access methods for multiple optical ground stations adopt reactive scheduling based on the current station status. For example, they select the station with the highest current link quality and switch to a backup station when the link at the current station deteriorates. Although this type of method can improve link quality at local moments, it does not fully utilize the spatial relationship between stations and the correlation information between changes in station status. Furthermore, it usually does not incorporate the interruption overhead, reacquisition overhead, and hold operations caused by handover into a unified decision-making process, which can easily lead to frequent handovers, decreased link stability, and low overall access efficiency.
[0004] Furthermore, while some existing methods employ graph structures and learning-based scheduling models, their graph construction typically relies solely on simple connections between stations based on geometric distances, lacking utilization of historical correlations in link states. This makes it difficult for the graph structure to accurately represent the spatial relationships between multiple optical ground stations. Although some existing methods introduce handover costs into the reward function, they still primarily rely on instantaneous link gains for decision-making, failing to explicitly consider the sustained stability of the target station within a short future timeframe. Consequently, they are prone to frequent handovers and short-stay handovers.
[0005] Therefore, there is an urgent need for a dynamic access optimization method for multiple optical ground stations that can comprehensively consider link status, spatial correlation structure, handover cost, and dwell stability under the constraint that only one optical ground station can be in active access state at any given time, so as to improve long-term link availability and access stability during multi-station collaborative access.
Summary of the Invention
[0006] To address the aforementioned issues, this invention proposes a dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication. This method comprehensively optimizes link quality, handover cost, and dwell stability under the constraint that only one optical ground station can be in an active access state at any given time, outputs the target active station, and generates corresponding control results.
[0007] The above objectives are achieved through the following technical solutions:
[0008] The present invention provides a dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication, comprising the following steps:
[0009] Step 1: Obtain link status data for the current time and historical time periods of multiple optical ground stations, and construct the historical status sequence of each station;
[0010] Step 2: Based on the historical state sequence constructed in Step 1, combined with the current activation state, the time since the last switch, and time characteristics, construct the site state vector of each optical ground station, and generate the dwell stability factor of each station.
[0011] Step 3: Generate the initial site representation of each optical ground station based on the site state vector obtained in Step 2; and introduce the spatial correlation length determined by the spatial distance between sites and the correlation of historical link states to construct a weighted site association graph.
[0012] Step 4: Using the initial site representations of each optical ground station generated in Step 3 as graph node features, and the weighted site association graph constructed in Step 3 as the edge weight relationship between nodes, graph convolution feature transfer is performed; after $K$ layers of graph convolution feature transfer, the final site representation matrix $H^{(K)}$ is obtained, in which the representation $h_i^{(K)}$ of the i-th station is a site representation that integrates the site's own status information with spatial relationship information between sites;
[0013] Step 5: Introduce a centralized action space containing multiple site switching actions and actions to maintain the current site, and use the final site representation obtained in Step 4 and the dwell stability factor generated in Step 2 to score the stability constraints of each action.
[0014] Step 6: Under the constraint that only one optical ground station is allowed to be in active access state at any given time, the target active station is determined based on the stability constraint action score generated in Step 5, and the corresponding control result is output.
[0015] Step 7: Based on the link quality status data corresponding to the target active site determined in Step 6, the site switching cost generated by switching from the current active site to the target active site, and the target active site dwell stability factor generated in Step 2, construct a reward function; use the reward function to update the trainable parameters included in the graph convolution feature transfer process in Step 4 and the stability constraint action scoring process in Step 5.
[0016] Step 8: Based on the stability constraint action score updated in Step 7, under the constraint that only one optical ground station is allowed to be in active access state at the same time, re-output the target active station and its corresponding control result as the dynamic access optimization result after joint optimization of link quality, handover cost and dwell stability at the current time.
[0017] Further, in step 1, the link status data includes cloud cover status data, atmospheric turbulence status data, link quality status data, and time information corresponding to each time step for each optical ground station at the current time and historical time periods; continuous status data is extracted according to a preset historical time window length to construct a historical state sequence for each station. For the i-th optical ground station, the historical state sequence expression is as follows:
$$X_i = \left[x_i^{t-T+1}, \ldots, x_i^{t-1}, x_i^{t}\right], \quad x_i^{t-k} \in \mathbb{R}^{F}, \quad k = 0, 1, \ldots, T-1$$
where $X_i$ denotes the historical state sequence of the i-th optical ground station; $x_i^{t-k}$ denotes the link state feature vector of the i-th optical ground station at the k-th time step before the current time $t$; $T$ denotes the length of the historical state window; and $F$ denotes the dimension of the link state features at a single time step.
[0018] Furthermore, for the i-th optical ground station, the station state vector described in step 2 is expressed as follows:
$$s_i = \left[\,g(X_i),\ m_i,\ \tau,\ c_t,\ \kappa_i\,\right]$$
where $s_i$ denotes the site state vector of the i-th optical ground station; $g(X_i)$ denotes the historical state representation obtained by aggregating the historical state sequence $X_i$; $m_i$ denotes the current activation status flag of the i-th optical ground station; $\tau$ denotes the number of time steps since the last switch for the currently active site; $c_t$ denotes the time characteristics corresponding to the current moment; and $\kappa_i$ denotes the dwell stability factor of the i-th optical ground station, which characterizes the ability of the station to maintain stable access over a short period of time. The dwell stability factor $\kappa_i$ is computed from $\mu_i$, the average link quality of the i-th optical ground station within the historical state window, and $\delta_i$, the link quality fluctuation of the i-th optical ground station within the historical state window (the expression itself is not reproduced in this text).
[0019] Further, in step 3, the initial site representation expression is:
$$h_i^{(0)} = f_{\mathrm{enc}}(s_i)$$
where $h_i^{(0)}$ denotes the initial site representation of the i-th optical ground station; $f_{\mathrm{enc}}(\cdot)$ denotes the site state encoding mapping function; and $s_i$ denotes the site state vector of the i-th optical ground station constructed in step 2.
[0020] In step 3, for the i-th optical ground station and the j-th optical ground station, the historical link state correlation expression is:
$$\rho_{ij} = \mathrm{corr}(\mathbf{q}_i, \mathbf{q}_j)$$
where $\rho_{ij}$ denotes the historical link state correlation between the i-th and j-th optical ground stations; $\mathrm{corr}(\cdot,\cdot)$ denotes the correlation calculation function; $\mathbf{q}_i$ denotes the link quality sequence of the i-th optical ground station within the historical state window; and $\mathbf{q}_j$ denotes the link quality sequence of the j-th optical ground station within the historical state window;
[0021] The spatial correlation length $L_c$ mentioned in step 3 is determined by the attenuation of the historical link state correlation with the spatial distance between sites. In the corresponding expression (not reproduced in this text), $\rho(d)$ denotes the historical link state correlation at spatial distance $d$; $\rho_0$ denotes the baseline value of the historical link state correlation when the spatial distance between sites is 0; $d$ denotes the spatial distance between stations; and $L_c$ denotes the spatial correlation length;
[0022] The weighted site association graph is used to characterize the spatial association strength between sites. For the i-th and j-th optical ground stations, the edge weight of the weighted site association graph is given by an expression (not reproduced in this text) in which $w_{ij}$ denotes the edge weight between the i-th and j-th optical ground stations in the weighted site association graph; $d_{ij}$ denotes the spatial distance between the i-th and j-th optical ground stations; $d_{\mathrm{th}}$ denotes the site association threshold distance; and $\max(\rho_{ij}, 0)$ denotes the retained non-negative value of the historical link state correlation.
[0023] Further, in step 4, the graph convolution feature transfer process uses the initial site representations of each optical ground station generated in step 3 as graph node features, and the weighted site association graph constructed in step 3 as the edge weight relationship between nodes, and performs graph convolution feature transfer on the weighted site association graph. After $K$ layers of graph convolution feature transfer, the final site representation matrix $H^{(K)}$ is obtained, in which the representation $h_i^{(K)}$ of the i-th station integrates the site's own state information with spatial association information between sites. The graph convolution feature transfer expression is as follows:
$$H^{(l+1)} = \sigma\!\left(\hat{A}\, H^{(l)}\, W^{(l)}\right)$$
where $H^{(l)}$ denotes the site representation matrix input to the l-th graph convolution layer; $H^{(l+1)}$ denotes the site representation matrix output by the l-th graph convolution layer; $\hat{A}$ denotes the matrix obtained by normalizing the weighted site association matrix $A$; $W^{(l)}$ denotes the trainable parameter matrix of the l-th graph convolution layer; and $\sigma(\cdot)$ denotes a non-linear activation function.
[0024] Furthermore, the stability constraint score mentioned in step 5 is used to improve the persistent dwell capability after target site handover, including handover action score and current site maintenance action score, specifically,
[0025] For the i-th optical ground station, the handover action scoring expression is:
$$Q_i^{\mathrm{sw}} = f_{\mathrm{sw}}\!\left(h_i^{(K)},\ \kappa_i\right)$$
where $Q_i^{\mathrm{sw}}$ denotes the action score of switching to the i-th optical ground station; $h_i^{(K)}$ denotes the site representation of the i-th optical ground station after graph convolution feature transfer, that is, the final site representation of the i-th optical ground station in the final site representation matrix $H^{(K)}$; $K$ denotes the number of graph convolution layers; $\kappa_i$ denotes the dwell stability factor of the i-th optical ground station generated in step 2; and $f_{\mathrm{sw}}(\cdot)$ denotes the switching action scoring mapping function.
[0026] The scoring expression for the action of maintaining the current site is as follows:
$$Q^{\mathrm{hold}} = f_{\mathrm{hold}}\!\left(h_{i_{t-1}}^{(K)},\ \kappa_{i_{t-1}},\ \tau_{i_{t-1}}\right)$$
where $Q^{\mathrm{hold}}$ denotes the action score of maintaining the currently active site; $i_{t-1}$ denotes the active site number at the previous moment; $h_{i_{t-1}}^{(K)}$ denotes the final site representation, in the final site representation matrix $H^{(K)}$, of the site that was active at the previous moment; $\kappa_{i_{t-1}}$ denotes the dwell stability factor of the site that was active at the previous moment; $\tau_{i_{t-1}}$ denotes the number of dwell time steps of the site that was active at the previous moment; and $f_{\mathrm{hold}}(\cdot)$ denotes the hold action scoring mapping function.
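[0027] Furthermore, in step 6, the target active station is determined from the stability constraint action scores generated in step 5 as follows: a hold control result is output when the score of the action of maintaining the current station is optimal, and a handover control result is output when the handover action score of a certain station is optimal. The action selection expression is as follows:
$$a_t = \arg\max\left\{Q_1^{\mathrm{sw}}, \ldots, Q_N^{\mathrm{sw}},\ Q^{\mathrm{hold}}\right\}$$
where $a_t$ denotes the target action at time $t$; $Q_1^{\mathrm{sw}}, \ldots, Q_N^{\mathrm{sw}}$ denote the action scores of switching to the 1st through N-th optical ground stations; $Q^{\mathrm{hold}}$ denotes the action score of maintaining the currently active site; and $N$ denotes the number of optical ground stations. The target action is determined by the stability constraint action scores.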
[0028] The correspondence between the target action and the target active site number is as follows: when the target action corresponds to a site switching action, the target active site number at the current moment is the number of the optical ground station to which the switch is made; when the target action corresponds to a "stay at current site" action, the target active site number at the current moment is the number of the active site at the previous moment. The link quality score and dwell stability factor in the reward function both correspond to the optical ground station indicated by the target active site number at the current moment.
[0029] Furthermore, the expression for the reward function in step 7 is:
$$r_t = q_t - \lambda_{\mathrm{sw}}\, I_t^{\mathrm{sw}} + \lambda_{\mathrm{st}}\, \kappa_{i_t} - \lambda_{\mathrm{osc}}\, I_t^{\mathrm{short}}$$
where $r_t$ denotes the reward value at time $t$; $q_t$ denotes the link quality score corresponding to the target active site at time $t$; $\lambda_{\mathrm{sw}}$ denotes the site switching cost coefficient; $I_t^{\mathrm{sw}}$ denotes the site switching indicator variable at time $t$; $\lambda_{\mathrm{st}}$ denotes the dwell stability benefit coefficient; $\kappa_{i_t}$ denotes the dwell stability factor corresponding to the target active site; $\lambda_{\mathrm{osc}}$ denotes the penalty coefficient for short-dwell oscillations; and $I_t^{\mathrm{short}}$ denotes the short-dwell indicator variable, which indicates that the dwell time of the site that was active at the previous moment is less than the minimum expected dwell time step threshold. The switching indicator variable is expressed as:
$$I_t^{\mathrm{sw}} = \begin{cases} 1, & i_t \neq i_{t-1} \\ 0, & i_t = i_{t-1} \end{cases}$$
where $i_t$ denotes the target active site number at the current moment and $i_{t-1}$ denotes the number of the site that was active at the previous moment. The short-dwell indicator variable is expressed as:
$$I_t^{\mathrm{short}} = \begin{cases} 1, & \tau_{i_{t-1}} < \tau_{\min} \\ 0, & \text{otherwise} \end{cases}$$
where $\tau_{i_{t-1}}$ denotes the number of dwell time steps of the site that was active at the previous moment and $\tau_{\min}$ denotes the minimum expected dwell time step threshold.
[0030] Compared with the prior art, the present invention has the following beneficial effects:
[0031] 1. By constructing a site state vector that includes the current link state, historical state, current active state, time since last switch, and dwell stability factor, the problem of reactive scheduling based solely on the current single site state in existing technologies is overcome, thereby improving the ability of scheduling decisions to represent changes in link state and dwell stability.
[0032] 2. By introducing a spatial correlation length determined by both spatial distance and historical link status correlation, and constructing a weighted site association graph accordingly, the problem of single graph structure construction basis and insufficient spatial association expression in existing technologies is overcome, thereby improving the accuracy of spatial modeling and the globality of scheduling in multi-site collaborative scenarios.
[0033] 3. By using the initial site representation of each optical ground station as the graph node feature and the weighted site association graph as the edge weight relationship between nodes to carry out graph convolution feature transfer, the final site representation that integrates the site's own state information and the spatial association information between sites is obtained. In addition, the stability constraint score of the handover action and the hold action is combined with the dwell stability factor, which overcomes the problem of the action decision based only on the instantaneous link benefit in the existing technology, thereby improving the stability and continuous availability of the target activation site selection process.
[0034] 4. By unifying link quality, site handover cost, dwell stability gain, and short dwell oscillation penalty into the reward function, the problems of frequent handover and short dwell handover in the existing technology are overcome, thereby improving the long-term link availability and access stability in the dynamic access process of multiple optical ground stations.
[0035] 5. By adopting a spatial-correlation-length-driven weighted graph construction mechanism, a switching-stability-driven action scoring mechanism, a reward function update mechanism, and a mechanism for re-outputting updated control results, a closed-loop optimization from multi-station link status perception to dynamic access optimization result output is achieved, which improves the engineering application value and long-term scheduling stability in multi-optical ground station collaborative access scenarios.
Attached Figure Description
[0036] Figure 1 is a flowchart illustrating the overall process of the method of the present invention.
[0037] Figure 2 is a schematic diagram illustrating the construction of the site state vector, spatial correlation length, and weighted site association graph in this invention.
[0038] Figure 3 is a schematic diagram illustrating the graph convolution feature transfer and stability constraint action score generation in this invention.
[0039] Figure 4 is a performance comparison chart of the method of this invention with other scheduling methods in multi-time-step scheduling tasks. In Figure 4, (a) shows a comparison of long-term returns for different scheduling methods, and (b) shows a comparison of the number of switching operations for different scheduling methods.
[0040] Figure 5 is a schematic diagram of the site switching trajectory during the continuous scheduling process of the method of the present invention.
Detailed Implementation
[0041] The present invention will now be described in detail with reference to a specific embodiment. This embodiment is used to illustrate the present invention.
[0042] This embodiment describes a dynamic access optimization method for a ground station network consisting of 45 optical ground stations. These 45 optical ground stations are distributed across three climate zones: inland Europe, the Mediterranean, and the Tibetan Plateau, with 15 stations in each climate zone. The method includes the following steps.
[0043] Step 1: Obtain link status data and construct historical status sequences.
[0044] As shown in Figure 1, this embodiment first acquires link status data from multiple optical ground stations and constructs historical state sequences.
[0045] First, link status data for the current time and historical time periods from multiple optical ground stations are acquired. This link status data includes cloud cover status data, atmospheric turbulence status data, link quality status data, and time information corresponding to each time step, arranged in chronological order.
[0046] In this embodiment, the link status data is constructed using ERA5 reanalysis data and link status evaluation results, with a time range from 2010 to 2023, a time resolution of 1 hour, and a spatial resolution of 0.25°.
[0047] Subsequently, continuous state data is extracted using a fixed historical window to construct the historical state sequence of each optical ground station.
[0048] In this embodiment, the length $T$ of the historical state window is given in hours (the value is not reproduced in this text).
[0049] For the i-th optical ground station, the historical state sequence expression is as follows:
$$X_i = \left[x_i^{t-T+1}, \ldots, x_i^{t-1}, x_i^{t}\right], \quad x_i^{t-k} \in \mathbb{R}^{F}, \quad k = 0, 1, \ldots, T-1$$
where $X_i$ denotes the historical state sequence of the i-th optical ground station; $x_i^{t-k}$ denotes the link state feature vector of the i-th optical ground station at the k-th time step before the current time $t$; $T$ denotes the length of the historical state window; and $F$ denotes the dimension of the link state features at a single time step.
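By way of a non-limiting illustration, the window extraction of step 1 can be sketched as follows. The function name `build_history_window`, the three-feature record layout, and the sample values are assumptions introduced only for this sketch; they are not specified by the embodiment.

```python
from typing import List, Sequence


def build_history_window(
    series: Sequence[Sequence[float]], t: int, window: int
) -> List[List[float]]:
    """Return the `window` most recent feature vectors ending at index t (inclusive)."""
    if window <= 0:
        raise ValueError("window must be positive")
    start = t - window + 1
    if start < 0 or t >= len(series):
        raise IndexError("not enough history for the requested window")
    # Oldest record first, current time step last.
    return [list(x) for x in series[start : t + 1]]


# Hypothetical hourly records: [cloud_cover, turbulence, link_quality]
records = [[0.1 * k, 0.5, 1.0 - 0.1 * k] for k in range(6)]
window_seq = build_history_window(records, t=5, window=3)
```

The sketch keeps the current time step as the last element of the window, which matches the reading that the sequence covers the current time and the preceding time steps.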
[0050] Step 2: Construct the site state vector and generate the dwell stability factor.
[0051] As shown in Figure 2, after the historical state sequence of each station is constructed in step 1, the site state vector of each optical ground station is further constructed, and the dwell stability factor is generated.
[0052] Based on the historical state sequence constructed in step 1, and combined with the current active state, the time since the last switch, and time characteristics, a site state vector for each optical ground station is constructed. Simultaneously, a dwell stability factor is generated based on the link quality changes within the historical state window.
[0053] For the i-th optical ground station, the site state vector expression is as follows:
$$s_i = \left[\,g(X_i),\ m_i,\ \tau,\ c_t,\ \kappa_i\,\right]$$
where $s_i$ denotes the site state vector of the i-th optical ground station; $g(X_i)$ denotes the historical state representation obtained by aggregating the historical state sequence $X_i$; $m_i$ denotes the current activation status flag of the i-th optical ground station; $\tau$ denotes the number of time steps since the last switch; $c_t$ denotes the time characteristics of the current moment; and $\kappa_i$ denotes the dwell stability factor of the i-th optical ground station.
[0054] For the i-th optical ground station, the dwell stability factor $\kappa_i$ is computed from $\mu_i$, the average link quality of the i-th optical ground station within the historical state window, and $\delta_i$, the link quality fluctuation of the i-th optical ground station within the historical state window (the expression itself is not reproduced in this text).
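The text states only that the dwell stability factor depends on the average link quality and the link quality fluctuation within the historical state window. The following sketch therefore uses an assumed functional form, mean divided by one plus the population standard deviation, purely to illustrate a factor that rises with average quality and falls with fluctuation. This form and the name `dwell_stability_factor` are hypothetical and are not taken from the embodiment.

```python
from statistics import fmean, pstdev
from typing import Sequence


def dwell_stability_factor(link_quality: Sequence[float]) -> float:
    """ASSUMED form: mean / (1 + population std) over the history window."""
    mu = fmean(link_quality)      # average link quality in the window
    delta = pstdev(link_quality)  # fluctuation, taken here as the population std
    return mu / (1.0 + delta)


steady = dwell_stability_factor([0.8, 0.8, 0.8, 0.8])
jittery = dwell_stability_factor([0.4, 1.2, 0.4, 1.2])
```

Both example windows have the same mean, so the difference between the two results comes only from the fluctuation term.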
[0055] In this embodiment, the station state vector has a dimension of 13.
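One possible way to assemble a site state vector from the listed components is sketched below. The per-feature mean used as the aggregation, the sine and cosine hour-of-day encoding used as the time characteristics, and all function names are assumptions made for illustration. The resulting length in this example is 8, not the 13 dimensions of the embodiment, whose component breakdown is not given in the text.

```python
import math
from typing import List, Sequence


def aggregate_history(window: Sequence[Sequence[float]]) -> List[float]:
    """ASSUMED aggregation: per-feature mean over the history window."""
    n = len(window)
    return [sum(column) / n for column in zip(*window)]


def time_features(hour_of_day: int) -> List[float]:
    """ASSUMED time characteristics: cyclic encoding of the hour of day."""
    angle = 2.0 * math.pi * hour_of_day / 24.0
    return [math.sin(angle), math.cos(angle)]


def site_state_vector(
    window: Sequence[Sequence[float]],
    is_active: bool,
    steps_since_switch: int,
    hour_of_day: int,
    stability: float,
) -> List[float]:
    """Concatenate history summary, activation flag, dwell time, time features, stability."""
    return (
        aggregate_history(window)
        + [1.0 if is_active else 0.0, float(steps_since_switch)]
        + time_features(hour_of_day)
        + [stability]
    )


vec = site_state_vector([[0.2, 0.4, 0.9], [0.4, 0.6, 0.7]], True, 5, 6, 0.75)
```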
[0056] Step 3: Generate initial site representations and construct a spatial-correlation-length-driven weighted site association graph.
[0057] As shown in Figure 2, after the state vectors of each site are obtained in step 2, an initial site representation of each optical ground station is generated from these site state vectors, and a weighted site association graph is constructed by introducing a spatial correlation length jointly determined by the spatial distance between sites and the correlation of historical link states. The weighted site association graph is used to characterize the spatial correlation strength between different optical ground stations.
[0058] For the i-th optical ground station, the initial site representation expression is:
$$h_i^{(0)} = f_{\mathrm{enc}}(s_i)$$
where $h_i^{(0)}$ denotes the initial site representation of the i-th optical ground station; $f_{\mathrm{enc}}(\cdot)$ denotes the site state encoding mapping function; and $s_i$ denotes the site state vector of the i-th optical ground station constructed in step 2.
[0059] In this embodiment, the initial site representation dimension is 64.
[0060] For the i-th and j-th optical ground stations, the historical link state correlation expression is as follows:
$$\rho_{ij} = \mathrm{corr}(\mathbf{q}_i, \mathbf{q}_j)$$
where $\rho_{ij}$ denotes the historical link state correlation between the i-th and j-th optical ground stations; $\mathrm{corr}(\cdot,\cdot)$ denotes the correlation calculation function; $\mathbf{q}_i$ denotes the link quality sequence of the i-th optical ground station within the historical state window; and $\mathbf{q}_j$ denotes the link quality sequence of the j-th optical ground station within the historical state window.
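The text does not name the correlation calculation function. As one plausible choice, the sketch below computes the Pearson correlation coefficient between two link quality sequences. The use of Pearson correlation, the zero return value for a constant sequence, and the name `pearson_correlation` are assumptions of this sketch.

```python
import math
from typing import Sequence


def pearson_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences (ASSUMED choice of corr)."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        # A constant sequence carries no co-variation; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_a * var_b)
```

For example, `pearson_correlation([1, 2, 3], [3, 2, 1])` is negative, which is the case that a non-negative truncation of the correlation would discard when edge weights are formed.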
[0061] The spatial correlation length $L_c$ is determined by the attenuation of the historical link state correlation with the spatial distance between sites. In the corresponding expression (not reproduced in this text), $\rho(d)$ denotes the historical link state correlation at spatial distance $d$; $\rho_0$ denotes the baseline value of the historical link state correlation when the spatial distance between sites is 0; $d$ denotes the spatial distance between stations; and $L_c$ denotes the spatial correlation length.
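The attenuation relationship is described here only through its quantities: a correlation that decays with distance, a baseline value at zero distance, and a correlation length. A common model with exactly these parameters is exponential decay, correlation = baseline × exp(−distance / length). The sketch below assumes that model, which is not confirmed by the text, and estimates the correlation length by a least-squares fit through the origin in log space. All names and numeric values are hypothetical.

```python
import math
from typing import Sequence


def exp_decay(distance_km: float, rho0: float, corr_length_km: float) -> float:
    """ASSUMED attenuation model: rho(d) = rho0 * exp(-d / L_c)."""
    return rho0 * math.exp(-distance_km / corr_length_km)


def fit_correlation_length(
    distances_km: Sequence[float], correlations: Sequence[float], rho0: float
) -> float:
    """Fit L_c from ln(rho / rho0) = -d / L_c by least squares through the origin."""
    sum_dd = 0.0
    sum_dy = 0.0
    for d, rho in zip(distances_km, correlations):
        if d <= 0.0 or rho <= 0.0:
            continue  # the log model only uses positive distances and correlations
        y = math.log(rho / rho0)
        sum_dd += d * d
        sum_dy += d * y
    if sum_dy >= 0.0:
        raise ValueError("correlations do not decay with distance")
    return -sum_dd / sum_dy
```

With noise-free samples generated by `exp_decay`, the fit recovers the generating length; with measured station pairs it returns the best-fitting single length under the assumed model.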
[0062] For the i-th and j-th optical ground stations, the edge weight of the weighted site association graph is given by an expression (not reproduced in this text) in which $w_{ij}$ denotes the edge weight between the i-th and j-th optical ground stations in the weighted site association graph; $d_{ij}$ denotes the spatial distance between the i-th and j-th optical ground stations; $L_c$ denotes the spatial correlation length; $d_{\mathrm{th}}$ denotes the site association threshold distance; and $\max(\rho_{ij}, 0)$ denotes the retained non-negative value of the historical link state correlation.
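Since the edge weight expression is identified only by its ingredients (non-negative correlation, pairwise distance, correlation length, and threshold distance), the following sketch combines them in one assumed way: the non-negative part of the correlation multiplied by an exponential distance decay, set to zero beyond the threshold and on the diagonal. This combination, the function names, and the sample values are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def edge_weight(
    corr: float, distance_km: float, corr_length_km: float, threshold_km: float
) -> float:
    """ASSUMED form: max(corr, 0) * exp(-d / L_c) within the threshold, else 0."""
    if distance_km > threshold_km:
        return 0.0
    return max(corr, 0.0) * math.exp(-distance_km / corr_length_km)


def build_weight_matrix(
    corr: Sequence[Sequence[float]],
    dist_km: Sequence[Sequence[float]],
    corr_length_km: float,
    threshold_km: float,
) -> Matrix:
    """Weighted site association matrix without self-loops."""
    n = len(dist_km)
    return [
        [
            0.0
            if i == j
            else edge_weight(corr[i][j], dist_km[i][j], corr_length_km, threshold_km)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical three-station example (distances in km).
dist = [[0.0, 100.0, 900.0], [100.0, 0.0, 400.0], [900.0, 400.0, 0.0]]
corr = [[1.0, 0.6, 0.3], [0.6, 1.0, -0.2], [0.3, -0.2, 1.0]]
W = build_weight_matrix(corr, dist, corr_length_km=200.0, threshold_km=500.0)
```

In this example the pair at 900 km is cut by the threshold, and the pair with negative correlation receives zero weight even though it lies within the threshold.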
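[0063] In this embodiment, the spatial correlation length $L_c$ is given in km and the site association threshold distance $d_{\mathrm{th}}$ is a preset value (neither value is reproduced in this text). The constructed weighted site association graph has 96 edges and a graph density of 4.8%.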
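The reported density can be checked arithmetically. For 45 stations there are 45 × 44 = 1980 ordered station pairs and 990 unordered pairs. The figure of 4.8% is reproduced by 96 / 1980, which suggests, as an inference rather than a statement of the embodiment, that the density is taken over ordered pairs. The helper name `graph_density` is hypothetical.

```python
def graph_density(num_edges: int, num_nodes: int, ordered_pairs: bool) -> float:
    """Edges divided by the number of possible station pairs (no self-loops)."""
    pairs = num_nodes * (num_nodes - 1)
    if not ordered_pairs:
        pairs //= 2
    return num_edges / pairs


density_ordered = graph_density(96, 45, ordered_pairs=True)     # 96 / 1980
density_unordered = graph_density(96, 45, ordered_pairs=False)  # 96 / 990
```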
[0064] Step 4: Perform graph convolution feature transfer.
[0065] As shown in Figure 3, after the initial site representations are generated and the weighted site association graph is constructed in step 3, the initial site representations of each optical ground station are used as graph node features, and the weighted site association graph is used as the edge weight relationship between nodes. Graph convolution feature transfer is then performed on the weighted site association graph. After $K$ layers of graph convolution feature transfer, the final site representation matrix $H^{(K)}$ is obtained, in which the representation $h_i^{(K)}$ of the i-th station is a site representation that integrates the site's own status information with the spatial relationship information between sites.
[0066] The expression for graph convolution feature transfer is:
$$H^{(l+1)} = \sigma\!\left(\hat{A}\, H^{(l)}\, W^{(l)}\right)$$
where $H^{(l)}$ denotes the site representation matrix input to the l-th graph convolution layer; $H^{(l+1)}$ denotes the site representation matrix output by the l-th graph convolution layer; $\hat{A}$ denotes the matrix obtained by normalizing the weighted site association matrix $A$; $W^{(l)}$ denotes the trainable parameter matrix of the l-th graph convolution layer; and $\sigma(\cdot)$ denotes a non-linear activation function.
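A single graph convolution layer of the kind described, namely a normalized association matrix multiplied by the site representation matrix and a trainable parameter matrix, followed by a non-linear activation, can be sketched in plain Python as follows. The symmetric normalization with added self-loops and the ReLU activation are common choices assumed here; the text does not specify either.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def normalize_adjacency(w: Matrix) -> Matrix:
    """ASSUMED normalization: D^(-1/2) (W + I) D^(-1/2) for non-negative weights."""
    n = len(w)
    a = [[w[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]  # each degree is at least 1 due to the self-loop
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_hat: Matrix, h: Matrix, weight: Matrix) -> Matrix:
    """One propagation step: ReLU(A_hat @ H @ W), with ReLU as the ASSUMED activation."""
    z = matmul(matmul(a_hat, h), weight)
    return [[max(v, 0.0) for v in row] for row in z]


a_hat = normalize_adjacency([[0.0, 1.0], [1.0, 0.0]])
h_next = gcn_layer(a_hat, [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
```

Stacking several such layers, each with its own parameter matrix, lets every station's representation absorb information from stations progressively further away in the association graph.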
[0067] In this embodiment, the number of graph convolution layers $K$ is a preset value (not reproduced in this text).
[0068] Step 5: Generate stability constraint action scores.
[0069] As shown in Figure 3, after the final site representation matrix $H^{(K)}$ is obtained in step 4, a centralized action space is introduced, which includes multiple site switching actions and an action to maintain the current site. The stability constraint action scores are then generated from the site representations in the final site representation matrix and the dwell stability factors.
[0070] For the i-th optical ground station, the handover action scoring expression is:
$$Q_i^{\mathrm{sw}} = f_{\mathrm{sw}}\!\left(h_i^{(K)},\ \kappa_i\right)$$
where $Q_i^{\mathrm{sw}}$ denotes the action score of switching to the i-th optical ground station; $h_i^{(K)}$ denotes the site representation of the i-th optical ground station after $K$ layers of graph convolution feature transfer, that is, the final site representation of the i-th optical ground station in the final site representation matrix $H^{(K)}$; $\kappa_i$ denotes the dwell stability factor of the i-th optical ground station; and $f_{\mathrm{sw}}(\cdot)$ denotes the switching action scoring mapping function.
[0071] The scoring expression for the action of maintaining the current site is as follows:
$$Q^{\mathrm{hold}} = f_{\mathrm{hold}}\!\left(h_{i_{t-1}}^{(K)},\ \kappa_{i_{t-1}},\ \tau_{i_{t-1}}\right)$$
where $Q^{\mathrm{hold}}$ denotes the action score of maintaining the currently active site; $i_{t-1}$ denotes the active site number at the previous moment; $h_{i_{t-1}}^{(K)}$ denotes the site representation, after graph convolution feature transfer, of the site that was active at the previous moment, that is, its final site representation in the final site representation matrix $H^{(K)}$; $\kappa_{i_{t-1}}$ denotes the dwell stability factor of the site that was active at the previous moment; $\tau_{i_{t-1}}$ denotes the number of dwell time steps of the site that was active at the previous moment; and $f_{\mathrm{hold}}(\cdot)$ denotes the hold action scoring mapping function.
[0072] In this embodiment, the action space dimension is 46, which includes 45 site switching actions and 1 action to keep the current site.
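The structure of the centralized action space, one switching score per station plus one hold score, can be illustrated with the following sketch. The scoring mapping functions are not detailed in the text, so simple linear maps are assumed here. The names `linear_score` and `action_scores` and all weights are hypothetical.

```python
from typing import List, Sequence


def linear_score(features: Sequence[float], weights: Sequence[float]) -> float:
    """ASSUMED scoring map: a plain dot product."""
    return sum(f * w for f, w in zip(features, weights))


def action_scores(
    final_repr: Sequence[List[float]],
    stability: Sequence[float],
    prev_active: int,
    prev_dwell_steps: int,
    w_switch: Sequence[float],
    w_hold: Sequence[float],
) -> List[float]:
    """Return N switching scores followed by one hold score (N + 1 entries)."""
    # Switching score of station i uses its final representation and stability factor.
    switch = [linear_score(h + [k], w_switch) for h, k in zip(final_repr, stability)]
    # Hold score additionally uses the dwell time of the previously active station.
    hold_features = final_repr[prev_active] + [
        stability[prev_active],
        float(prev_dwell_steps),
    ]
    return switch + [linear_score(hold_features, w_hold)]


scores = action_scores(
    final_repr=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    stability=[0.2, 0.8, 0.5],
    prev_active=1,
    prev_dwell_steps=3,
    w_switch=[1.0, 1.0, 1.0],
    w_hold=[1.0, 1.0, 1.0, 0.1],
)
```

With 45 stations the same construction yields 46 scores, matching the action space dimension of this embodiment.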
[0073] Step 6: Determine the target activation site and output the control results.
[0074] As shown in Figure 3, after the scores for each action are generated in step 5, the target active site is determined under the constraint that only one optical ground station is allowed to be in the active access state at the same time, and the corresponding control results are output.
[0075] The action selection expression is:
$$a_t = \arg\max\left\{Q_1^{\mathrm{sw}}, \ldots, Q_N^{\mathrm{sw}},\ Q^{\mathrm{hold}}\right\}$$
where $a_t$ denotes the target action at time $t$; $Q_1^{\mathrm{sw}}, \ldots, Q_N^{\mathrm{sw}}$ denote the action scores of switching to the 1st through N-th optical ground stations; $Q^{\mathrm{hold}}$ denotes the action score of maintaining the currently active site; and $N$ denotes the number of optical ground stations. The target action is determined by the stability constraint action scores.
[0076] When $a_t$ is the action of maintaining the current station, the hold control result is output; when $a_t$ is the action of switching to a specific station, the switching control result is output.
[0077] The correspondence between the target action and the target active site number is as follows: when the target action corresponds to a site switching action, the target active site number at the current moment is the number of the optical ground station to which the switch is made; when the target action corresponds to a "stay at current site" action, the target active site number at the current moment is the number of the active site at the previous moment. The link quality score and dwell stability factor in the reward function both correspond to the optical ground station indicated by the target active site number at the current moment.
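The mapping from the highest-scoring action to the target active site number and control result can be sketched as below. The score list is assumed to hold the N switching scores followed by the hold score. Treating a "switch" to the already active station as a hold is a design choice of this sketch, not something stated in the text.

```python
from typing import Sequence, Tuple


def select_target(scores: Sequence[float], prev_active: int) -> Tuple[int, str]:
    """Pick the best action and map it to (target site number, control result)."""
    hold_index = len(scores) - 1  # last entry is the hold action
    best = max(range(len(scores)), key=lambda idx: scores[idx])
    if best == hold_index or best == prev_active:
        # ASSUMPTION: selecting the already active station is treated as a hold.
        return prev_active, "hold"
    return best, "switch"
```

Because exactly one site number is returned, the constraint that only one optical ground station is active at any given time holds by construction.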
[0078] Step 7: Construct the reward function and update the trainable parameters.
[0079] like Figure 1 As shown, after determining the target active site in step 6, a reward function is constructed based on the link quality status data corresponding to the target active site determined in step 6, the site switching cost generated by switching from the current active site to the target active site, and the target active site residency stability factor generated in step 2.
[0080] The expression for the reward function is: in, Indicates time The return value, Indicates the target site is activated at time. The corresponding link quality score, This represents the site switching cost coefficient. Indicates time Indicator variables for site switching, This represents the residency stability benefit coefficient. This indicates the residency stability factor corresponding to the target activated site. This represents the penalty coefficient for short-term dwell oscillations. This indicates that the dwell time of the activated site in the previous moment is less than the minimum expected dwell time step threshold.
[0081] The switching indicator variable $I_t^{sw}$ is expressed as:
$$I_t^{sw} = \begin{cases} 1, & k_t \neq k_{t-1} \\ 0, & k_t = k_{t-1} \end{cases}$$
where $k_t$ denotes the target active site number at the current moment and $k_{t-1}$ denotes the number of the site that was active at the previous moment.
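[0082] The short-dwell indicator variable $I_t^{short}$ is expressed as:
$$I_t^{short} = \begin{cases} 1, & \tau_{t-1} < \tau_{\min} \\ 0, & \tau_{t-1} \geq \tau_{\min} \end{cases}$$
where $\tau_{t-1}$ denotes the number of dwell time steps of the site that was active at the previous moment and $\tau_{\min}$ denotes the minimum expected dwell time step threshold.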
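The reward of step 7 can be illustrated with a short sketch. It assumes the reward is a signed sum of the four described terms (link quality, minus switching cost, plus dwell stability benefit, minus short-dwell penalty) and follows the indicator definitions literally. The function name `step_reward`, the parameter names, and all numeric values in the usage lines are hypothetical and are not the coefficient values of this embodiment.

```python
def step_reward(link_quality: float, target_site: int, prev_site: int,
                stability_factor: float, prev_dwell_steps: int,
                lam_switch: float, lam_stay: float, lam_osc: float,
                min_dwell_steps: int) -> float:
    """Combine link quality, switching cost, dwell benefit and short-dwell penalty."""
    # Switching indicator: 1 when the target site differs from the previous site.
    switched = 1 if target_site != prev_site else 0
    # Short-dwell indicator: 1 when the previously active site was held for
    # fewer steps than the minimum expected dwell threshold.
    short_dwell = 1 if prev_dwell_steps < min_dwell_steps else 0
    return (link_quality
            - lam_switch * switched
            + lam_stay * stability_factor
            - lam_osc * short_dwell)


if __name__ == "__main__":
    # Illustrative numbers only: an early switch pays both penalties.
    early_switch = step_reward(0.8, target_site=1, prev_site=0,
                               stability_factor=0.5, prev_dwell_steps=1,
                               lam_switch=0.5, lam_stay=0.25, lam_osc=0.25,
                               min_dwell_steps=3)
    # Holding after a long dwell pays neither penalty.
    long_hold = step_reward(0.8, target_site=0, prev_site=0,
                            stability_factor=0.5, prev_dwell_steps=5,
                            lam_switch=0.5, lam_stay=0.25, lam_osc=0.25,
                            min_dwell_steps=3)
    print(early_switch, long_hold)
```

With equal link quality, the hold after a long dwell scores higher than the early switch, which is the behaviour the switching cost and oscillation penalty are meant to encourage.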
[0083] The trainable parameters included in the graph convolution feature transfer process in step 4 and the stability constraint action scoring process in step 5 are updated using the reward function.
[0084] In this embodiment, the site handover cost coefficient, the dwell stability benefit coefficient, the short-term dwell oscillation penalty coefficient, and the minimum expected dwell time step threshold are each set to fixed values.
[0085] In this embodiment, the trainable parameters included in the graph convolution feature transfer process in step 4 and the stability constraint action scoring process in step 5 are trained using the proximal policy optimization (PPO) algorithm. The discount factor is 0.99, the generalized advantage estimation parameter is 0.95, the clipping coefficient is 0.2, the value loss coefficient is 0.5, the entropy bonus coefficient is 0.01, the learning rate is 3×10⁻⁴, and the gradient clipping threshold is 0.5. During training, each episode spans 168 hours, the total number of training episodes is 500, 10 updates are performed per episode, and the mini-batch size is 32.
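The hyperparameters above can be collected in a small configuration object, shown here with the two standard PPO building blocks they parameterize: generalized advantage estimation and the clipped surrogate term. This is a sketch of the textbook formulas, not the training code of this embodiment. The class and field names are hypothetical, and the advantage routine assumes a single unterminated trajectory segment bootstrapped with `last_value`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PPOConfig:
    gamma: float = 0.99            # discount factor
    gae_lambda: float = 0.95       # generalized advantage estimation parameter
    clip_coef: float = 0.2         # clipping coefficient
    value_loss_coef: float = 0.5
    entropy_coef: float = 0.01
    learning_rate: float = 3e-4
    max_grad_norm: float = 0.5     # gradient clipping threshold
    episode_hours: int = 168
    num_episodes: int = 500
    updates_per_episode: int = 10
    minibatch_size: int = 32


def gae_advantages(rewards: List[float], values: List[float],
                   last_value: float, cfg: PPOConfig) -> List[float]:
    """Backward recursion A_t = delta_t + gamma * lambda * A_{t+1}."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + cfg.gamma * next_value - values[t]
        running = delta + cfg.gamma * cfg.gae_lambda * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def clipped_surrogate(ratio: float, advantage: float, cfg: PPOConfig) -> float:
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = min(max(ratio, 1.0 - cfg.clip_coef), 1.0 + cfg.clip_coef)
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    cfg = PPOConfig()
    print(gae_advantages([1.0, 1.0], [0.0, 0.0], 0.0, cfg))
    print(clipped_surrogate(1.5, 1.0, cfg))
```

The clipping coefficient of 0.2 bounds how far a single update can move the action-selection policy, which suits a scheduler whose decisions should change gradually.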
[0086] Step 8: Re-output the target active site and obtain the dynamic access optimization results.
[0087] As shown in Figure 1, after the trainable parameters are updated in step 7, the target active station and its corresponding control result are re-output based on the updated stability constraint action scores, under the constraint that only one optical ground station is allowed to be in the active access state at any given time.
[0088] The target active site and its corresponding control result output in step 8 are used as the dynamic access optimization result at the current moment after joint optimization of link quality, handover cost and dwell stability.
[0089] In the comparative verification, the method of this invention is compared with the static site method, the random switching method, the greedy scheduling method, the threshold switching method, and the distributed multi-agent scheduling method. As shown in Figure 4, the method of the present invention outperforms the static site method, the random switching method, the greedy scheduling method, and the distributed multi-agent scheduling method in long-term return while significantly reducing the number of handovers, indicating that it can effectively suppress frequent switching while maintaining high link quality.
[0090] As shown in Figure 5, the frequent switching trajectory indicated by the dashed line is the comparison trajectory generated by the greedy scheduling method. The greedy scheduling method selects the target active site based solely on the current link quality score at each time step, without incorporating site switching costs, short-term dwell oscillation penalties, and dwell stability benefits into a unified reward constraint. It is therefore prone to frequent switching between adjacent time steps in response to instantaneous link quality fluctuations. The trajectory of the method of the present invention, indicated by the solid line, is generated under the joint optimization constraints of link quality, switching cost, and dwell stability. The target active site maintains a longer dwell time and undergoes fewer handovers over consecutive time steps, demonstrating that the method of the present invention can improve scheduling stability and engineering feasibility under the constraint that only one optical ground station is allowed to be in the active access state at any given time.
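The two quantities compared in this paragraph, handover count and dwell time, can be computed from any active-site trajectory. The helper below is a minimal sketch. The function name `handover_stats` and the two example trajectories are hypothetical and are not data from Figure 5.

```python
from typing import List, Tuple


def handover_stats(trajectory: List[int]) -> Tuple[int, float]:
    """Return (number of handovers, mean dwell length in time steps).

    `trajectory` lists the active site number at each consecutive time step.
    """
    if not trajectory:
        raise ValueError("trajectory must contain at least one time step")
    # A handover occurs whenever two adjacent time steps have different active sites.
    handovers = sum(1 for a, b in zip(trajectory, trajectory[1:]) if a != b)
    # Each handover starts a new dwell segment, so there are handovers + 1 segments.
    mean_dwell = len(trajectory) / (handovers + 1)
    return handovers, mean_dwell


if __name__ == "__main__":
    oscillating = [0, 1, 0, 1, 0, 1]   # made-up trajectory that switches every step
    stable = [0, 0, 0, 1, 1, 1]        # made-up trajectory with a single handover
    print(handover_stats(oscillating))
    print(handover_stats(stable))
```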
[0091] Through the above steps, this embodiment realizes dynamic access optimization for multi-optical ground station scenarios in satellite-to-ground laser communication. Under the conditions of comprehensively considering link quality, site spatial relationship, handover cost and dwell stability, the target active site and its corresponding control results are re-output, thereby improving the long-term link availability and access stability in the multi-site collaborative access process.
Claims
1. A dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication, characterized in that the method includes the following steps:
Step 1: Obtain link status data of multiple optical ground stations for the current time and historical time periods, and construct the historical state sequence of each station;
Step 2: Based on the historical state sequence constructed in Step 1, combined with the current activation state, the time since the last switch, and time features, construct the site state vector of each optical ground station, and generate the dwell stability factor of each station;
Step 3: Generate the initial site representation of each optical ground station based on the site state vector obtained in Step 2; furthermore, construct a weighted site association graph by introducing a spatial correlation length determined jointly by the spatial distance between sites and the correlation of historical link status;
Step 4: Using the initial site representations of the optical ground stations generated in Step 3 as graph node features, and the weighted site association graph constructed in Step 3 as the edge weight relationship between nodes, perform graph convolution feature transfer; after $L$ layers of graph convolution feature transfer, the final site representation matrix $H^{(L)}$ is obtained, in which the final site representation $h_i^{(L)}$ of the $i$-th station is a site representation that integrates the site's own status information with spatial relationship information between sites;
Step 5: Introduce a centralized action space containing multiple site switching actions and an action to maintain the current site, and use the final site representations obtained in Step 4 and the dwell stability factors generated in Step 2 to compute a stability constraint score for each action;
Step 6: Under the constraint that only one optical ground station is allowed to be in the active access state at any given time, determine the target active station based on the stability constraint action scores generated in Step 5, and output the corresponding control result;
Step 7: Based on the link quality status data corresponding to the target active site determined in Step 6, the site switching cost incurred by switching from the current active site to the target active site, and the dwell stability factor of the target active site generated in Step 2, construct a reward function; use the reward function to update the trainable parameters included in the graph convolution feature transfer process in Step 4 and the stability constraint action scoring process in Step 5;
Step 8: Based on the stability constraint action scores updated in Step 7, under the constraint that only one optical ground station is allowed to be in the active access state at any given time, re-output the target active station and its corresponding control result as the dynamic access optimization result at the current time after joint optimization of link quality, handover cost, and dwell stability.
2. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 1, characterized in that the link status data in step 1 includes cloud cover status data, atmospheric turbulence status data, link quality status data, and time information corresponding to each time step for each optical ground station at the current time and in historical time periods. Continuous state data is extracted according to a preset historical time window length to construct a historical state sequence for each station; for the $i$-th optical ground station, the historical state sequence is expressed as:
$$X_i = [x_{i,t-T+1}, x_{i,t-T+2}, \dots, x_{i,t}]$$
where $X_i$ denotes the historical state sequence of the $i$-th optical ground station; $x_{i,t-\tau} \in \mathbb{R}^{d}$ denotes the link state feature vector of the $i$-th optical ground station at the $\tau$-th time step before the current moment $t$, with $\tau = 0, 1, \dots, T-1$; $T$ denotes the length of the historical state window; and $d$ denotes the dimension of the link state features at a single time step.
3. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 2, characterized in that, for the $i$-th optical ground station, the site state vector in step 2 is expressed as:
$$z_i = [\,g_i,\ m_i,\ \Delta_t,\ \phi_t,\ \rho_i\,]$$
where $z_i$ denotes the site state vector of the $i$-th optical ground station; $g_i$ denotes the historical state representation obtained by aggregating the historical state sequence $X_i$; $m_i$ denotes the current activation status flag of the $i$-th optical ground station; $\Delta_t$ denotes the number of time steps since the last switch for the currently active site; $\phi_t$ denotes the time features corresponding to the current moment; and $\rho_i$ denotes the dwell stability factor of the $i$-th optical ground station, which characterizes the ability of the station to maintain stable access over a short period of time. The dwell stability factor $\rho_i$ of the $i$-th optical ground station is computed from $\mu_i$ and $\sigma_i$, where $\mu_i$ denotes the average link quality of the $i$-th optical ground station within the historical state window and $\sigma_i$ denotes the link quality fluctuation of the $i$-th optical ground station within the historical state window.
4. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 3, characterized in that the initial site representation in step 3 is expressed as:
$$h_i^{(0)} = f_{enc}(z_i)$$
where $h_i^{(0)}$ denotes the initial site representation of the $i$-th optical ground station; $f_{enc}$ denotes the site state encoding mapping function; and $z_i$ denotes the site state vector of the $i$-th optical ground station constructed in step 2;
the historical link state correlation in step 3 is expressed as:
$$c_{ij} = \operatorname{corr}(\mathbf{q}_i, \mathbf{q}_j)$$
where $c_{ij}$ denotes the historical link state correlation between the $i$-th and the $j$-th optical ground stations; $\mathbf{q}_i$ denotes the link quality sequence of the $i$-th optical ground station within the historical state window; and $\mathbf{q}_j$ denotes the link quality sequence of the $j$-th optical ground station within the historical state window;
the spatial correlation length in step 3, determined jointly by the spatial distance between sites and the historical link state correlation, is obtained from the attenuation of the historical link state correlation with the spatial distance between sites:
$$c(d) = c_0 \exp\left(-\frac{d}{L_c}\right)$$
where $c(d)$ denotes the historical link state correlation at spatial distance $d$; $c_0$ denotes the baseline value of the historical link state correlation when the spatial distance between sites is 0; $d$ denotes the spatial distance between sites; and $L_c$ denotes the spatial correlation length;
the weighted site association graph in step 3 is used to characterize the spatial association strength between sites. For the $i$-th and the $j$-th optical ground stations, the edge weight of the weighted site association graph is expressed in terms of the following quantities: $w_{ij}$ denotes the edge weight between the $i$-th and the $j$-th optical ground stations in the weighted site association graph; $d_{ij}$ denotes the spatial distance between the $i$-th and the $j$-th optical ground stations; $d_{th}$ denotes the site association threshold distance; and $\max(0, c_{ij})$ denotes the retained non-negative value of the historical link state correlation.
5. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 4, characterized in that the graph convolution feature transfer in step 4 is expressed as:
$$H^{(l+1)} = \sigma\left(\hat{A} H^{(l)} W^{(l)}\right)$$
where $H^{(l)}$ denotes the site representation matrix input to the $l$-th graph convolution layer; $H^{(l+1)}$ denotes the site representation matrix output by the $l$-th graph convolution layer; $\hat{A}$ denotes the matrix obtained by normalizing the weighted site association matrix $A$; $W^{(l)}$ denotes the trainable parameter matrix of the $l$-th graph convolution layer; and $\sigma(\cdot)$ denotes a nonlinear activation function.
6. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 5, characterized in that the stability constraint score in step 5 is used to improve the sustained dwell capability after handover to the target site, and includes a handover action score and a current-site maintenance action score. Specifically, for the $i$-th optical ground station, the handover action score is expressed as:
$$s_i^{sw} = f_{sw}\left(h_i^{(L)}\right)$$
where $s_i^{sw}$ denotes the action score for switching to the $i$-th optical ground station; $h_i^{(L)}$ denotes the site representation of the $i$-th optical ground station after graph convolution feature transfer; $L$ denotes the number of graph convolution layers; and $f_{sw}$ denotes the switching action scoring mapping function; $h_i^{(L)}$ corresponds to the final site representation of the $i$-th optical ground station in the final site representation matrix $H^{(L)}$;
the current-site maintenance action score is expressed as:
$$s^{stay} = f_{stay}\left(h_{k_{t-1}}^{(L)},\ \rho_{k_{t-1}},\ \tau_{t-1}\right)$$
where $s^{stay}$ denotes the action score for maintaining the currently active site; $k_{t-1}$ denotes the number of the site that was active at the previous moment; $\rho_{k_{t-1}}$ denotes the dwell stability factor corresponding to the site that was active at the previous moment; $\tau_{t-1}$ denotes the number of dwell time steps of the site that was active at the previous moment; $f_{stay}$ denotes the maintenance action scoring mapping function; and $h_{k_{t-1}}^{(L)}$ corresponds to the final site representation, in the final site representation matrix $H^{(L)}$, of the site that was active at the previous moment.
7. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 6, characterized in that determining the target active station in step 6 based on the stability constraint action scores generated in step 5 specifically involves outputting a hold control result when the action score for maintaining the current station is optimal, and outputting a handover control result when the handover action score of a station is optimal. The action selection expression is:
$$a_t = \arg\max \{ s_1^{sw}, s_2^{sw}, \dots, s_N^{sw}, s^{stay} \}$$
where $a_t$ denotes the target action at time $t$ and $N$ denotes the number of optical ground stations; the target action is determined by the stability constraint action scores. The correspondence between the target action and the target active station number is as follows: when the target action corresponds to a station switching action, the target active station number at the current moment is the number of the optical ground station to which the switch is made; when the target action corresponds to the action of maintaining the current station, the target active station number at the current moment is the number of the station that was active at the previous moment.
8. The dynamic access optimization method for multiple optical ground stations in satellite-to-ground laser communication according to claim 7, characterized in that the reward function in step 7 is expressed as:
$$r_t = Q_t - \lambda_{sw} I_t^{sw} + \lambda_{stay} \rho_{k_t} - \lambda_{osc} I_t^{short}$$
where $r_t$ denotes the reward value at time $t$; $Q_t$ denotes the link quality score corresponding to the target active site at time $t$; $\lambda_{sw}$ denotes the site switching cost coefficient; $I_t^{sw}$ denotes the indicator variable for site switching at time $t$; $\lambda_{stay}$ denotes the dwell stability benefit coefficient; $\rho_{k_t}$ denotes the dwell stability factor corresponding to the target active site; $\lambda_{osc}$ denotes the penalty coefficient for short-term dwell oscillations; and $I_t^{short}$ denotes the short-dwell indicator variable, which indicates that the dwell time of the site active at the previous moment is less than the minimum expected dwell time step threshold;
the switching indicator variable $I_t^{sw}$ is expressed as:
$$I_t^{sw} = \begin{cases} 1, & k_t \neq k_{t-1} \\ 0, & k_t = k_{t-1} \end{cases}$$
where $k_t$ denotes the target active site number at the current moment and $k_{t-1}$ denotes the number of the site that was active at the previous moment;
the short-dwell indicator variable $I_t^{short}$ is expressed as:
$$I_t^{short} = \begin{cases} 1, & \tau_{t-1} < \tau_{\min} \\ 0, & \tau_{t-1} \geq \tau_{\min} \end{cases}$$
where $\tau_{t-1}$ denotes the number of dwell time steps of the site that was active at the previous moment and $\tau_{\min}$ denotes the minimum expected dwell time step threshold.