Ultra-high voltage converter station protection system panoramic monitoring image reconstruction and transmission method
By optimizing the panoramic monitoring image reconstruction and heterogeneous network topology of UHV converter stations using deep multi-scale residual networks and deep reinforcement learning, the problems of image blurring and data transmission congestion were solved, achieving efficient image reconstruction and network optimization.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- STATE GRID ANHUI ELECTRIC POWER CO LTD
- Filing Date
- 2021-12-22
- Publication Date
- 2026-06-26
AI Technical Summary
The panoramic monitoring images of the UHV converter station are blurry and have low resolution. Furthermore, the real-time performance and reliability of the reconstructed communication network are poor when the network fails, leading to congestion in the transmission of panoramic monitoring data.
A deep multi-scale residual network model is used for image super-resolution reconstruction, and the heterogeneous network topology is optimized by deep reinforcement learning. A Monte Carlo tree search is then combined to construct an efficient network topology.
The image reconstruction clarity and resolution have been improved to meet the monitoring needs of inspection personnel, and network transmission performance has been optimized to ensure the real-time and reliable transmission of data.
Figure CN114283062B_ABST
Description
Technical Field
[0001] This invention belongs to the field of panoramic monitoring technology for ultra-high voltage converter stations, and relates to a method for panoramic monitoring image reconstruction and transmission of ultra-high voltage converter station protection systems.
Background Technology
[0002] With the development of power grids, the scale of grid interconnection is constantly expanding, and electrical connections within the grid are becoming increasingly close. This has led to a growing prominence of the safety and stability issues facing large power grids, significantly increasing the technical difficulty and safety risks associated with operation and management. The safe and reliable operation of ultra-high-voltage (UHV) converter stations plays an undeniably crucial role in the safe and stable operation of the power grid. Therefore, manual inspections are necessary during the daily operation and maintenance of UHV systems to troubleshoot equipment faults and ensure system safety and stability. However, this manual inspection mode is labor-intensive, and its performance is easily affected by the experience and sense of responsibility of the personnel. To improve the efficiency of UHV converter station operation and maintenance management, panoramic monitoring systems are widely deployed within UHV converter stations to monitor the operating status of equipment at all stages.
[0003] The following status signal parameters, applicable to the core links of UHVDC protection, need to be monitored by the protection device of the UHV converter station: A. Monitoring of the status of the outlet pressure plate; B. Temperature measurement of the terminal blocks inside the cabinet; C. Monitoring of the front panel of the secondary equipment inside the cabinet; D. Operating temperature of the secondary equipment inside the cabinet; E. Operating voltage of the secondary equipment inside the cabinet; F. Fiber optic light intensity monitoring; G. Cable insulation detection; H. Outlet circuit detection; I. Auxiliary contact position; J. Cable status detection; K. Detection of environmental parameters, such as temperature and humidity; L. Corrosion status of the terminals.
[0004] However, the operating environment and long-term use of monitoring systems inevitably lead to vibrations and shaking, as well as interference such as dust accumulation and spider webs on the lenses, resulting in blurred video images and inaccurate panoramic monitoring data acquisition. Therefore, there is an urgent need for a super-resolution reconstruction method for panoramic monitoring images of UHV converter station protection systems, so that the reconstructed high-resolution images can meet the panoramic monitoring needs of inspection personnel.
[0005] Traditional image enhancement and reconstruction methods typically enhance image contrast to highlight target objects, including histogram equalization, logarithmic transform, sharpening, wavelet transform, and Retinex at different scales. These methods are computationally efficient and highly portable, but as general-purpose algorithms their enhancement effects are limited, and the processed images often fail to meet the needs of panoramic surveillance in specific scenarios. Image enhancement and reconstruction is a classic research topic in computer vision, and Single Image Super Resolution (SISR) is a crucial component. SISR utilizes a set of low-quality, low-resolution images to generate a single high-quality, high-resolution image, acquiring a region of interest with higher spatial resolution. This allows for focused analysis of the target object, transforming the image from detection level to recognition level, or even further to fine-resolution level, thereby improving the recognition capability and accuracy of panoramic surveillance images for converter stations.
[0006] Currently, SISR algorithms can be broadly categorized into three types: interpolation-based, reconstruction-based, and deep learning-based. Interpolation algorithms offer low computational cost and high real-time performance, but lack external information features, leading to the loss of high-frequency features after image degradation, resulting in images with noticeable blurring and ringing effects. Compared to interpolation algorithms, reconstruction-based algorithms show more significant improvements, but as the reconstruction magnification increases, high-frequency features become blurred. Deep learning-based methods have become mainstream in recent years, utilizing the mapping relationship between observed low-resolution (LR) images and original high-resolution (HR) images, along with a large number of training samples, to learn more high-frequency details in HR images. However, reconstructed images still suffer from detail distortion and high computational complexity. Convolutional neural networks (CNNs) are widely used in visual analysis due to their powerful image feature learning capabilities. In recent years, SISR algorithms based on CNNs have been proposed and have achieved significant performance improvements. The paper "Image Super-Resolution Using Deep Convolutional Networks" (C. Dong, IEEE Transactions on Pattern Analysis and Machine Intelligence, published in 2016) proposed a CNN model called SRCNN, which replaces dictionary modeling with automatic adjustment of hidden layer parameters, learning the nonlinear mapping relationship from low-resolution input to high-resolution output, improving reconstruction accuracy and reducing computation time. However, SRCNN also has some shortcomings. For example, bicubic interpolation can cause blurred and jagged edges in the image, and with the number of model parameters remaining constant, a larger super-resolution factor indicates a larger input resolution, resulting in higher computational cost. 
The paper "Accelerating the Super-Resolution Convolutional Neural Network" (Chao D, European Conference on Computer Vision, published in 2016) proposed an improved algorithm, FSRCNN, to address the slow training of SRCNN. It uses deconvolution for upsampling and 1×1 convolutions for dimensionality reduction, reducing the computational cost of the model and accelerating training. The core of ResNet is to add a skip connection between the output of the convolutional layer and the input of the previous convolutional layer to solve the gradient vanishing problem. H(x) represents the underlying mapping fitted by several stacked convolutional layers, where the input of the first convolutional layer is x, and x is connected to the output of the last convolutional layer. The stacked layers only need to learn the mapping F(x) = H(x) - x. If F(x) is zero, the residual unit can fit the identity mapping.
[0007] Because the concurrent processing of various network communications and service data in the panoramic monitoring of UHVDC converter stations is diverse, an unreasonable heterogeneous network topology can lead to dynamic imbalances in data flow access, resulting in poor overall data transmission performance. In severe cases, this can cause data congestion and affect network reliability. Therefore, it is necessary to optimize the heterogeneous network topology to improve its structure and transmission performance.
[0008] In recent years, the topology optimization problem of heterogeneous networks has received widespread attention. The 2015 publication "Logical Topology Design of Multi-Interface Multi-Channel Wireless Mesh Networks" (Bao Xuecai et al., Small and Microcomputer Systems) proposed a logical topology design method under topology reliability constraints. This method uses topology reliability and network path hop count as constraints, with maximum capacity and minimum interference as optimization objectives. It integrates shortest path and minimum spanning tree algorithms into the disjoint path calculation process to obtain an optimized logical topology, but this only improves network robustness. The 2019 publication "Adaptive Protection and Self-Healing Control Method for Distribution Networks Based on Dynamic Topology Analysis" (Zhang Anlong, Power System Protection and Control) proposed a new adaptive distributed topology control algorithm. By adjusting the transmission capacity of nodes under different states, it ensures network connectivity during node failures. Regarding data transmission, tree topology can transmit collected data better than other network topologies and has strong anti-interference capabilities. The 2019 publication, "Improving the Capacity of a Mesh LoRa Network by Spreading-Factor-Based Network Clustering" (Zhu G, Liao CH, Sakdejayont T, et al. IEEE Access), proposes a tree-based algorithm with a set of heuristic rules for constructing tree topologies in multi-hop wireless networks. The advantage of tree topologies lies in their efficient data transmission and aggregation through non-leaf nodes. Throughput in heterogeneous networks is a primary criterion for evaluating the merits of existing network models. Currently, many network topologies have been constructed for good data transmission, such as cluster-based and tree-based topologies. 
The performance of these topologies in heterogeneous networks demonstrates that topology quality significantly impacts data transmission. Therefore, network topology construction should be based on specific network requirements, accommodating various network types and transmission performance as much as possible.
[0009] The aforementioned network topology optimization methods consider the quality and security of data transmission, but the optimization process is time-consuming when the topology changes, making it difficult to meet the performance requirements of power industry data transmission networks. Finding the optimal reliability topology in the heterogeneous network of a UHVDC transmission system is essentially a combinatorial problem. The 2018 publication "Research on Minimum Spanning Tree Algorithm for Ventilation Networks Based on Weight Matrix" (Tu Peng et al., Journal of Railway Science and Engineering) proposes a minimum spanning tree topology optimization method. This method uses heuristic rules to reduce the number of candidate searches, thus obtaining a suboptimal solution to some extent. However, it still falls short of meeting the real-time and reliability requirements for rapid reconfiguration of the communication network when a UHVDC converter station experiences a network failure. Furthermore, because the search space for all possible topology configurations is extremely large, the complexity of achieving the optimal network configuration through exhaustive search increases exponentially.
Summary of the Invention
[0010] The purpose of this invention is to design a method for reconstructing and transmitting panoramic monitoring images of UHV converter station protection systems. The method addresses two problems: the current panoramic monitoring images of UHV converter station protection systems are blurry and have low resolution, and when a network failure occurs, the UHV converter station reconstructs its communication network with poor real-time performance and low reliability, which causes congestion in panoramic monitoring data transmission.
[0011] The present invention solves the above-mentioned technical problems through the following technical solutions:
[0012] A method for reconstructing and transmitting panoramic monitoring images of an ultra-high voltage converter station protection system includes the following steps:
[0013] S1. Perform super-resolution reconstruction on the panoramic surveillance image. The reconstruction method is as follows:
[0014] S11. Establish a deep multi-scale residual network model on the edge side; the deep multi-scale residual network model includes: an input convolutional layer, an output convolutional layer, and k multi-scale convolutional blocks; the input convolutional layer acts as an encoder to extract the original low-level features of the low-resolution image; the output convolutional layer is used to fuse multi-scale detail features to reconstruct a high-resolution image; the input and output convolutional layers are linked by a skip connection to establish an identity mapping from the low-resolution image to the high-resolution image for global residual learning; the k multi-scale convolutional blocks are stacked sequentially to give the network model its depth; the original low-level features are connected to the k multi-scale convolutional blocks through k corresponding paths, and the ability of the network model to learn complex features is enhanced through local residual learning.
[0015] S12. Input sample dataset and train the deep multi-scale residual network model;
[0016] S13. Test the peak signal-to-noise ratio and structural similarity index of the trained deep multi-scale residual network model using a standard dataset.
[0017] S14. Input the panoramic monitoring image of the UHV converter station into the trained deep multi-scale residual network model to complete the super-resolution reconstruction.
[0018] S2. Optimize the ubiquitous heterogeneous network transmission topology for panoramic monitoring of UHV converter stations. The optimization method is as follows:
[0019] S21. Model the heterogeneous network of the UHV converter station as a tree structure, wherein the tree structure has a master station $v_0$ and N-1 data transmission nodes $\{v_1, v_2, \ldots, v_{N-1}\}$; each data transmission node has a unique path to the master station $v_0$;
[0020] S22. Use the master station $v_0$ as the root node of the tree structure and, with the root node as the initial state, recursively search each state with the Monte Carlo tree to obtain the training dataset.
[0021] S23. Input the training dataset obtained from the search into the deep convolutional neural network for training to obtain the value function and policy function, which are used to guide the Monte Carlo tree to recursively search for states with expected rewards and in turn update the training dataset collected by the deep convolutional neural network.
[0022] S24. After training is complete, starting from the initial state $s_0 = 0$, sequentially select the action $a_t \sim \pi(s_t)$ from the strategy predicted by the deep convolutional neural network and update the state $s_{t+1} = T(s_t, a_t)$. This process continues until a complete tree is reached, thus obtaining the heterogeneous network topology.
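The sequential construction in S24 can be illustrated with a minimal sketch. It is not the patented implementation: the state encoding (a list of parent indices), the helper names, and the uniform random policy standing in for the trained network's π(s) are all assumptions made for illustration.

```python
import random


def valid_actions(parent):
    """All (child, parent) pairs that link an unattached node to an attached one."""
    attached = [i for i, p in enumerate(parent) if i == 0 or p is not None]
    free = [i for i, p in enumerate(parent) if i != 0 and p is None]
    return [(c, p) for c in free for p in attached]


def transition(parent, action):
    """T(s_t, a_t): attach the chosen child to the chosen parent, return s_{t+1}."""
    child, par = action
    new_parent = list(parent)
    new_parent[child] = par
    return new_parent


def uniform_policy(parent, rng):
    """Hypothetical stand-in for pi(s): sample uniformly among valid actions."""
    return rng.choice(valid_actions(parent))


def build_topology(n_nodes, policy, seed=0):
    """Grow a tree rooted at the master station v0 (index 0), one edge per step."""
    rng = random.Random(seed)
    state = [None] * n_nodes              # s_0: only v0 belongs to the tree
    for _ in range(n_nodes - 1):
        action = policy(state, rng)       # a_t ~ pi(s_t)
        state = transition(state, action) # s_{t+1} = T(s_t, a_t)
    return state


def path_to_root(parent, node):
    """Follow parent links from a node up to the master station."""
    path = [node]
    while node != 0:
        node = parent[node]
        path.append(node)
    return path
```

Because every action attaches a free node to a node already in the tree, the result after N-1 steps is a complete tree in which each node has exactly one path to the master station.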
[0023] Furthermore, both the input and output convolutional layers use convolutional kernels with a stride of 1, and the input convolutional layer uses ReLU activation. The multi-scale convolutional block extracts multi-level detailed features from the input image using convolutional kernels of four scales: 3×3, 3×2, 2×3, and 2×2. Then, the feature maps of the four scales are concatenated pairwise along a specified dimension through a cross-connection mechanism and fed into a 3×3 convolutional layer for feature mapping, generating a new feature map of the same size as the input and feeding it into the next multi-scale convolutional block.
[0024] Furthermore, the local residual learning is defined as follows: $H_k = G_k(H_{k-1}) + F$; where $G_k$ is the feature map learned by the k-th multi-scale convolutional block, $H_k$ is the output of the k-th multi-scale convolutional block, $H_{k-1}$ is the output of the (k-1)th multi-scale convolutional block, and F is the original low-order feature extracted by the input convolutional layer;
[0025] The mapping of k multi-scale convolutional blocks learned from global and local residuals is represented as follows: $I_{HR} = R(I_{LR}) = I_{LR} + F_{-1}(G_k(G_{k-1}(\cdots(G_1(F)+F)\cdots)+F)+F)$; where $F_0()$ is the mapping that the input convolutional layer needs to learn, with $F = F_0(I_{LR})$, $F_{-1}()$ represents the mapping that the output convolutional layer needs to learn, $I_{HR}$ and $I_{LR}$ represent the high-resolution and low-resolution images, respectively, $G_{k-1}$ is the feature map learned by the (k-1)th multi-scale convolutional block, $G_1$ is the feature map learned by the first multi-scale convolutional block, and R() is the mapping operation.
[0026] Furthermore, the loss function of the aforementioned deep multi-scale residual network model is: $L(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left\|R(X^{(i)};\theta) - Y^{(i)}\right\|^2$; where θ represents the parameters of the deep multi-scale residual network, and the loss function is minimized using the Adam optimizer; $X^{(i)}$ is the i-th sub-image in the sample dataset, $Y^{(i)}$ is the corresponding label, and N is a positive integer.
[0027] Furthermore, the panoramic monitoring images include images of secondary equipment, hard pressure plates, and terminal corrosion; the standard datasets include three basic datasets: Set5, Set14, and Urban100.
[0028] Furthermore, the formula for calculating the peak signal-to-noise ratio is as follows: $PSNR = 10\log_{10}\left(\frac{MAX_I^2}{MSE}\right)$; where MSE is the mean square error between the original image and the processed image, and $MAX_I$ is the maximum value of the image color. The structural similarity index is calculated by the following formulas: $L(X,Y) = \frac{2u_Xu_Y + C_1}{u_X^2 + u_Y^2 + C_1}$, $C(X,Y) = \frac{2\sigma_X\sigma_Y + C_2}{\sigma_X^2 + \sigma_Y^2 + C_2}$, $S(X,Y) = \frac{\sigma_{XY} + C_3}{\sigma_X\sigma_Y + C_3}$, and SSIM(X,Y)=L(X,Y)*C(X,Y)*S(X,Y); where $u_X$, $u_Y$ and $\sigma_X$, $\sigma_Y$ represent the mean and standard deviation of images X and Y, respectively, and $\sigma_{XY}$ represents the covariance of images X and Y. $C_1$, $C_2$, and $C_3$ are constants, typically taken as $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, $C_3 = C_2/2$, $K_1 = 0.01$, $K_2 = 0.03$, and L is the range of pixel values.
[0029] Furthermore, the method for modeling the heterogeneous network of the UHV converter station as a tree structure is as follows: In each round of data collection, node $v_i$, where i ∈ {1, 2, ..., N - 1}, transmits data to its parent node; the transmitted data is obtained by applying the aggregation function a() to the data generated by $v_i$ itself and the data set received from the child nodes of $v_i$. A transmission model is used to describe the transmission traffic, and the topology-related node transmission traffic in the transmission model consists of two parts: data processing time consumption and data transmission time consumption, namely the time consumed at node $v_i$ to process each bit of data and the time consumed to transmit each bit of data. The time consumed to transmit each bit of data depends on the Euclidean distance between node $v_i$ and its parent node $v_j$ and on ρ, the power amplification constant considering the shadow fading effect in the link budget.
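As a rough illustration of the per-round data flow described above, the sketch below computes how much data each node forwards to its parent. The parent-list encoding, the function name `outgoing_bits`, and the default choice of `sum` as the aggregation function a() are assumptions; the per-bit time formulas are not reproduced here.

```python
def outgoing_bits(parent, own_bits, aggregate=sum):
    """Bits each non-root node sends to its parent in one collection round.

    parent[i] is the parent index of node i (node 0 is the master station),
    own_bits[i] is the data generated by node i itself, and `aggregate`
    plays the role of a() over the node's own data and its children's data.
    """
    n = len(parent)
    children = {i: [] for i in range(n)}
    for node in range(1, n):
        children[parent[node]].append(node)

    sent = {}

    def visit(node):
        # Collect what every child forwards, then aggregate with own data.
        received = [visit(child) for child in children[node]]
        total = aggregate([own_bits[node]] + received)
        if node != 0:
            sent[node] = total
        return total

    visit(0)
    return sent
```

With `aggregate=sum` the data grows toward the root, whereas a compressing aggregation such as `max` keeps each link's load bounded, which is why the shape of the tree affects total transmission traffic.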
[0030] Furthermore, the method for Monte Carlo tree recursive search is as follows: Each node on the Monte Carlo tree represents a 5-tuple $(s, a, M(s, a), \pi(s), Q_\pi(s, a))$, where s is the state of the heterogeneous network; a is the action in this state; M(s, a) is the total number of times (s, a) is visited on the search tree; π(s) is the prior probability of the valid actions predicted by the deep convolutional neural network; and $Q_\pi(s, a)$ is the state-action value, representing the expected reward starting from state s and taking action a. At each search step t < N, the action that maximizes the upper confidence bound is selected. When the search reaches the termination state t = N, the reward is obtained and propagated back along the search path to the root state, through all visited states and the actions taken, and the $Q_\pi$ values on the path are correspondingly updated to the average value on each node. The calculation of the upper confidence bound uses the visit count of state s regardless of the action, and c is a hyperparameter that controls the search level.
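The selection and back-propagation steps can be sketched as follows. The exact upper-confidence-bound expression is not recoverable from this text, so the score used here (an AlphaZero-style PUCT rule) is an assumption, as are the `Edge` class and the function names.

```python
import math


class Edge:
    """Statistics kept for one (s, a) pair on the search tree."""

    def __init__(self, prior):
        self.prior = prior   # entry of pi(s) for this action
        self.visits = 0      # M(s, a)
        self.q = 0.0         # Q_pi(s, a), running average of rewards


def select_action(edges, c=1.0):
    """Pick the action with the largest (assumed) upper confidence bound."""
    total = sum(e.visits for e in edges.values())  # visits of s over all actions

    def ucb(e):
        # Assumed PUCT form: value plus a prior-weighted exploration bonus.
        return e.q + c * e.prior * math.sqrt(total + 1) / (1 + e.visits)

    return max(edges, key=lambda a: ucb(edges[a]))


def backup(path, reward):
    """Propagate a terminal reward along the visited edges, averaging Q."""
    for edge in path:
        edge.visits += 1
        edge.q += (reward - edge.q) / edge.visits
```

The incremental update in `backup` keeps `q` equal to the mean of all rewards seen through that edge, which matches the averaging described above without storing the individual rewards.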
[0031] Furthermore, the deep convolutional neural network comprises a deep VGG16 module, a fully connected layer with softmax for the policy, and a fully connected layer with ReLU activation for the value function; the deep VGG16 module consists of two convolutional layers with 64 convolutional filters, two convolutional layers with 128 convolutional filters, three convolutional layers with 256 convolutional filters, and six convolutional layers with 512 convolutional filters; each convolutional filter has a 3×3 kernel, and the module includes a max pooling layer.
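The convolutional stack listed above can be written down as a small configuration sketch. The names `VGG_STAGES` and `conv_layer_specs` are hypothetical, and the placement of pooling is left out because the text does not specify it.

```python
# (number of filters, number of convolutional layers) for each stage, as listed above
VGG_STAGES = [(64, 2), (128, 2), (256, 3), (512, 6)]


def conv_layer_specs(stages=VGG_STAGES, kernel=(3, 3)):
    """Expand the stage list into one spec per convolutional layer."""
    return [
        {"filters": filters, "kernel": kernel}
        for filters, count in stages
        for _ in range(count)
    ]
```

Expanding the stages gives 13 convolutional layers, which is the convolutional depth of the standard VGG16 design.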
[0032] Furthermore, the value function satisfies the Bellman equation, indicating that the value of the current state is the reward of that state plus the expected reward of the next state. The formula for the value function is: The formula for the strategy function is as follows:
[0033] The advantages of this invention are:
[0034] The panoramic monitoring image reconstruction and transmission method for ultra-high voltage converter station protection systems of this invention avoids incomplete image detail extraction by constructing low-order and high-order features of the image at multiple scales using multi-scale convolutional blocks in a deep multi-scale residual network model. A residual learning mechanism is employed in the network model to preserve low-order coarse features, reducing training difficulty and promoting feature reuse, thereby improving image reconstruction capabilities. The reconstructed image exhibits better structural similarity and peak signal-to-noise ratio performance. Experiments on image super-resolution reconstruction and target recognition were conducted using a standard dataset and an ultra-high voltage converter station panoramic monitoring image dataset. The results show that the high-resolution images reconstructed by the method of this invention can meet the panoramic monitoring needs of inspection personnel. A topology control algorithm based on deep reinforcement learning is used to sequentially construct the topology of a heterogeneous network. A framework combining deep reinforcement learning and Monte Carlo tree search is adopted to construct the network sequentially according to predefined topology rules. A deep convolutional neural network is trained to predict the transmission traffic of the partially constructed topology and guide the Monte Carlo tree to search in more promising regions in the search space. The search results of the Monte Carlo tree enhance the learning of the deep convolutional neural network so as to obtain more accurate predictions in the next iteration.
Attached Figure Description
[0035] Figure 1 is an architecture diagram of the deep multi-scale residual network model according to an embodiment of the present invention;
[0036] Figure 2 is a structural diagram of a multi-scale convolutional block according to an embodiment of the present invention;
[0037] Figure 3 shows PSNR performance curves for different network model depths according to embodiments of the present invention;
[0038] Figures 4, 5, and 6 are comparison images of the reconstruction effects of the method of this invention and other algorithms on secondary equipment monitoring images, hard pressure plate images, and terminal corrosion images;
[0039] Figure 7 is the heterogeneous network model of Embodiment 2 of the present invention;
[0040] Figure 8 is the heterogeneous network tree structure of Embodiment 2 of the present invention;
[0041] Figure 9 is a description of two steps in the finite-time domain Markov decision process of Embodiment 2 of the present invention;
[0042] Figure 10 is the Monte Carlo tree search process of Embodiment 2 of the present invention;
[0043] Figure 11 is the structure of a deep convolutional neural network according to an embodiment of the present invention;
[0044] Figure 12 describes the convergence and performance of the DRL-TC algorithm proposed in this embodiment of the invention;
[0045] Figure 13 shows the evolution of the training process in the embodiments of the present invention.
Detailed Implementation
[0046] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below in conjunction with the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, not all embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of the present invention.
[0047] The technical solution of the present invention will be further described below with reference to the accompanying drawings and specific embodiments:
[0048] Example 1
[0049] As shown in Figure 1, the method for super-resolution reconstruction of panoramic monitoring images of the UHV converter station protection system includes the following steps:
[0050] I. Super-resolution reconstruction of panoramic surveillance images
[0051] 1.1 Establish a deep multi-scale residual network model on the edge side.
[0052] 1.1.1 Deep Multi-scale Residual Network (DMRN)
[0053] As shown in Figure 1, this network employs a deep multi-scale residual network architecture, consisting of convolutional layers, k multi-scale convolutional blocks (MC blocks), and skip connections. Stacking the k MC blocks achieves greater depth, while the convolutional operations are improved by employing small kernels of different scales to extract and fuse detailed features at different scales in the image. This enhances the network's ability to reconstruct the microscopic texture and macroscopic geometric features of the input panoramic surveillance image, resulting in more realistic HR images. Residual structures are incorporated during network training to enable feature reuse, reduce network redundancy, accelerate convergence, and address the vanishing gradient problem.
[0054] 1.1.2 Multi-scale convolutional blocks
[0055] DMRN uses a multi-scale convolutional block architecture to perform super-resolution tasks. Convolutional layers with different scales form a multi-scale convolutional block, which can generate and combine detailed features at different levels.
[0056] Figure 2 is a structural diagram of a single multi-scale convolutional block, where x represents the input of the multi-scale convolutional block and y is the output of the convolutional block. Convolutional kernels of different scales can extract details at different frequencies. In each multi-scale convolutional block, the input image is processed using convolutional kernels of four scales (3×3, 3×2, 2×3, and 2×2) to extract multi-level detail features. Then, the feature maps of the four scales are concatenated pairwise along a specified dimension using a cross-connection mechanism and fed into a 3×3 convolutional layer for feature mapping, generating a new feature map of the same size as the input, which is then fed into the next multi-scale convolutional block. Multi-scale convolutional blocks better preserve the edge information of the image and increase the detail information of the reconstructed high-resolution image.
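For the four branches to be concatenated, each kernel must return a feature map of the input size even though three of the kernels have an even side length. The sketch below shows one way this could work with stride 1 and zero padding; the asymmetric padding rule, the averaging kernels, and the function names are assumptions, and the pairwise concatenation and 3×3 fusion are omitted.

```python
def conv2d_same(image, kernel):
    """Stride-1 2-D correlation with zero padding so output size equals input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    # Assumed rule: pad (k-1)//2 before; the remainder goes after (bottom/right).
    top, left = (kh - 1) // 2, (kw - 1) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - top, c + j - left
                    if 0 <= rr < h and 0 <= cc < w:  # outside pixels count as zero
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def box_kernel(kh, kw):
    """Placeholder averaging kernel; a trained block would use learned weights."""
    return [[1.0 / (kh * kw)] * kw for _ in range(kh)]


SCALES = [(3, 3), (3, 2), (2, 3), (2, 2)]


def multi_scale_features(image):
    """Run the four kernel scales over the same input, one feature map per scale."""
    return {scale: conv2d_same(image, box_kernel(*scale)) for scale in SCALES}
```

Since all four outputs share the input's height and width, they can be stacked along a channel dimension and passed to a following 3×3 layer without resizing.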
[0057] 1.1.3 Residual Learning Mechanism
[0058] The DMRN network architecture introduces global residual learning and local residual learning mechanisms for network training. Due to the similarity between low-resolution and high-resolution images, DMRN establishes an identity mapping from low-resolution to high-resolution images through skip connections between input and output to perform global residual learning.
[0059] There are two reasons for using local residual learning: First, the details needed in high-resolution reconstruction are the sum of high-frequency features and low-order features. The first convolutional layer in Figure 1 acts as an encoder, extracting the original low-order features of the low-resolution image. Local residual learning can preserve these low-order features. Second, there are multiple paths between low-order features and multi-scale convolutional blocks. Through local residual learning, the network's ability to learn more complex features can be enhanced.
[0060] Local residual learning is defined as follows:
[0061] $H_k = G_k(H_{k-1}) + F$ (1)
[0062] Here, $G_k$ is the feature map learned by the k-th multi-scale convolutional block, $H_k$ is the output of the k-th multi-scale convolutional block, and F is the original low-order feature extracted by the first convolutional layer.
[0063] Let $F_0$ be the mapping that the first convolutional layer (with ReLU) needs to learn and $F_{-1}$ be the mapping that the last convolutional layer (without ReLU) needs to learn. The mapping of the k multi-scale convolutional blocks learned based on the global and local residuals can then be expressed as follows:
[0064] $I_{HR} = R(I_{LR}) = I_{LR} + F_{-1}(G_k(G_{k-1}(\cdots(G_1(F)+F)\cdots)+F)+F)$ (2)
[0065] where $F = F_0(I_{LR})$ is the original low-order feature.
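The nesting of local and global residuals can be traced with a short sketch. It treats the layers as plain callables on scalar toy values so the arithmetic is easy to follow; the function name `dmrn_forward` and the toy mappings in the usage note are assumptions.

```python
def dmrn_forward(i_lr, f0, blocks, f_last):
    """Compose global and local residual learning.

    f0      : first-layer mapping F_0, giving the low-order feature F
    blocks  : list of block mappings G_1 ... G_k
    f_last  : last-layer mapping F_{-1}
    """
    feat = f0(i_lr)            # F = F_0(I_LR)
    h = feat                   # input to the first block is F itself
    for g in blocks:
        h = g(h) + feat        # local residual: H_k = G_k(H_{k-1}) + F
    return i_lr + f_last(h)    # global residual: I_HR = I_LR + F_{-1}(H_k)
```

For example, with `f0 = lambda x: 2 * x`, three blocks `lambda h: 0.5 * h`, and `f_last = lambda h: 0.1 * h`, an input of 1.0 gives H values 3.0, 3.5, 3.75 and an output of 1.375. If the last layer outputs zero, the network reduces to the identity mapping from the low-resolution input.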
[0066] 1.1.4 DMRN Network Details
[0067] As shown in Figure 1, the main structure of DMRN differs from ResNet. DMRN removes pooling layers and batch normalization layers. This is because SISR aims to achieve accurate pixel prediction, and removing pooling layers helps preserve more image details. Batch normalization layers, which normalize features, eliminate the network's range flexibility and are detrimental to image reconstruction; therefore, they are also removed. DMRN uses convolutional kernels with a stride of 1 and ReLU activation, thus accepting images of arbitrary size as input. Furthermore, DMRN uses two 5×5 convolutional layers in the first and last layers to extract coarse features and fuse multi-scale detail features to reconstruct the HR image.
[0068] 1.2. Input sample data and train the deep multi-scale residual network model.
[0069] Eight hundred monitoring images, each with a resolution of 1600*1200, were collected from the panoramic monitoring system of the UHV converter station. First, the high-resolution images were reduced to one-third of their original resolution using a bicubic interpolation algorithm, and then their dimensions were adjusted back to the original image size. From the adjusted images, 24,000 sub-images of size 32×32 were selected with a step size of 32 as the dataset, where N = 24000, $X^{(i)}$ is the i-th sub-image, and $Y^{(i)}$ is the corresponding label. 80% of the images are randomly selected as the training set, and the remaining 20% as the test set. Mean squared error (MSE) is used as the loss function for the network.
[0070] $L(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left\|R(X^{(i)};\theta) - Y^{(i)}\right\|^2$ (3)
[0071] Here, θ represents the parameters of DMRN, and the Adam optimizer is used to minimize the loss function.
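The data preparation and loss can be sketched as below. The helper names are hypothetical, and the exact normalization of the MSE loss (squared error summed per sample, averaged over the N samples) is an assumption consistent with a standard mean-squared-error objective.

```python
def patch_origins(height, width, size=32, step=32):
    """Top-left corners of all size x size sub-images on a regular grid."""
    return [
        (r, c)
        for r in range(0, height - size + 1, step)
        for c in range(0, width - size + 1, step)
    ]


def mse_loss(predictions, labels):
    """Average over samples of the squared error between prediction and label.

    Each sample is a flat list of pixel values.
    """
    n = len(predictions)
    total = 0.0
    for pred, label in zip(predictions, labels):
        total += sum((p - y) ** 2 for p, y in zip(pred, label))
    return total / n
```

On a 1600*1200 image, a 32×32 window with a step of 32 fits 50 positions across and 37 down, so the grid offers 1850 candidate positions per image from which sub-images can be selected.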
[0072] 1.3 The trained deep multi-scale residual network model was tested and analyzed using a standard dataset.
[0073] After the DMRN network was trained, it was first tested using three standard datasets: Set5, Set14, and Urban100. Since human vision is more sensitive to changes in brightness, the images were converted to the YCbCr space, and the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) on the Y channel were used to evaluate the performance of super-resolution reconstruction.
[0074] PSNR is defined as the ratio of the maximum power of a signal to the power of noise, measured in decibels (dB). It is commonly used to evaluate the quality of image compression; a higher value indicates a more realistic image. The formula for calculating PSNR is as follows:
[0075] $\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right)$ (4)
[0076] where MSE is the mean squared error between the original image and the processed image, and $\mathrm{MAX}_I$ is the maximum value of the image color.
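As a minimal sketch, PSNR can be computed from the standard definition PSNR = 10·log10(MAX_I² / MSE). The code assumes images flattened to sequences of pixel values and an 8-bit range (MAX_I = 255 by default); the function names are illustrative.

```python
import math
from typing import Sequence


def mse(original: Sequence[float], processed: Sequence[float]) -> float:
    """Mean squared error between two equally sized pixel sequences."""
    if len(original) != len(processed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)


def psnr(original: Sequence[float], processed: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, processed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

For example, an MSE of 1 on an 8-bit image gives 20·log10(255), about 48.13 dB.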
[0077] SSIM evaluates the similarity between the original image and the processed image, with values ranging from [0, 1]. A higher value indicates less image distortion. The formula for calculating SSIM is as follows:
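[0078] $L(X,Y) = \frac{2 u_X u_Y + C_1}{u_X^{2} + u_Y^{2} + C_1}$ (5)
[0079] $C(X,Y) = \frac{2 \sigma_X \sigma_Y + C_2}{\sigma_X^{2} + \sigma_Y^{2} + C_2}$ (6)
[0080] $S(X,Y) = \frac{\sigma_{XY} + C_3}{\sigma_X \sigma_Y + C_3}$ (7)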
[0081] SSIM(X,Y)=L(X,Y)*C(X,Y)*S(X,Y) (8)
[0082] where $u_X$, $u_Y$ and $\sigma_X$, $\sigma_Y$ are the means and standard deviations of images X and Y, respectively, and $\sigma_{XY}$ is the covariance of images X and Y. $C_1$, $C_2$, and $C_3$ are constants, typically taken as $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, and $C_3 = C_2 / 2$, with $K_1 = 0.01$, $K_2 = 0.03$, and L the range of pixel values.
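The SSIM product of formula (8) can be sketched with the standard luminance, contrast, and structure terms and the constants listed above. This is a simplified illustration: it treats the whole image as a single window (practical SSIM implementations average over local windows) and uses sample statistics with n-1 normalization; both choices are assumptions, as is the function name.

```python
import math
from typing import Sequence


def ssim_global(x: Sequence[float], y: Sequence[float],
                pixel_range: float = 255.0, k1: float = 0.01, k2: float = 0.03) -> float:
    """Single-window SSIM(X, Y) = L(X, Y) * C(X, Y) * S(X, Y)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    c1 = (k1 * pixel_range) ** 2
    c2 = (k2 * pixel_range) ** 2
    c3 = c2 / 2.0
    u_x = sum(x) / n
    u_y = sum(y) / n
    var_x = sum((a - u_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - u_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - u_x) * (b - u_y) for a, b in zip(x, y)) / (n - 1)
    s_x, s_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * u_x * u_y + c1) / (u_x ** 2 + u_y ** 2 + c1)
    contrast = (2 * s_x * s_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (s_x * s_y + c3)
    return luminance * contrast * structure
```

Comparing an image with itself gives a value of 1 up to rounding, and any distortion lowers the score.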
[0083] The number of multi-scale convolutional blocks determines the depth of DMRN. Models with different numbers of multi-scale convolutional blocks (k = {8, 10, 12, 14}) are compared here. Figure 3 shows the average PSNR and SSIM of 50 images randomly selected from the Set5, Set14, and Urban100 test datasets. As the number of multi-scale convolutional blocks increases, the PSNR of DMRN on Set5, Set14, and Urban100 steadily improves, indicating that the method of this invention achieves the expected goal of "the deeper the better." However, excessively deep networks also increase computational complexity. Because the performance gain of k = 14 over k = 12 is limited, k = 12 was used in subsequent experiments.
[0084] The SSIM and PSNR values for testing on the standard datasets Set5, Set14, and Urban100 are shown in Tables 1-2. The tables also compare these values with other methods, including Bicubic interpolation, SRCNN, and FSRCNN.
[0085] Table 1. Structural similarity indices for the Set5, Set14, and Urban100 datasets.
[0086]
[0087] Table 2 Peak Signal-to-Noise Ratio of Set5, Set14, and Urban100 Datasets
[0088]
[0089] Here, DMRN with k = 12 is selected as the comparison model. As shown in the tables, the average SSIM values of SRCNN, FSRCNN, and DMRN are 0.7784, 0.7827, and 0.8082, respectively, so the algorithm of this invention improves SSIM by 0.0298 over SRCNN and by 0.0255 over FSRCNN (FSRCNN itself improves on SRCNN by 0.0043). The average PSNR values of SRCNN, FSRCNN, and DMRN are 27.50 dB, 27.67 dB, and 28.33 dB, respectively, so the algorithm of this invention improves PSNR by 0.83 dB over SRCNN and by 0.66 dB over FSRCNN (FSRCNN itself improves on SRCNN by 0.17 dB). This result indicates that, by fusing low-order and high-order features and combining global and local residuals, the algorithm of this invention can establish a nonlinear mapping from LR to HR.
[0090] 1.4 Input the panoramic monitoring image of the UHV converter station into the trained deep multi-scale residual network model to complete super-resolution reconstruction.
[0091] Figures 4, 5, and 6 present super-resolution reconstructions of panoramic monitoring images of the UHV converter station, covering secondary equipment, hard pressure plates, and terminal corrosion. The method of this embodiment is compared with Bicubic, SRCNN, and FSRCNN, and Tables 3 and 4 show the quantitative experimental results. The images before and after reconstruction were input into the YOLOv3 recognition model used in the UHV converter station, and the recognition results are shown in Table 5. The experimental results show that, compared with the other methods, DMRN achieves better SSIM and PSNR and recovers clearer edges and more details, such as the indicator lights and the corresponding blurred text in the first image, the switch status and text display of the hard pressure plate in the second image, and the terminal corrosion status in the third image. This better supports inspection personnel in conducting panoramic monitoring.
[0092] Table 3. Structural Similarity Index of Monitoring Images of UHV Converter Stations
[0093]
[0094] Table 4 Peak Signal-to-Noise Ratio of Monitoring Images from UHV Converter Stations
[0095]
[0096] Table 5. Image Recognition Results of UHV Converter Station Monitoring
[0097]
[0098] This invention proposes a deep multi-scale residual network (DMRN) to achieve fast super-resolution reconstruction of panoramic monitoring images of UHV converter station protection systems, meeting the panoramic monitoring needs of inspection personnel. In DMRN, multi-scale convolutional blocks are used to construct low-order and high-order features of the image at multiple scales, solving the problem of incomplete image detail extraction. The network employs residual learning to preserve low-order coarse features, reducing training difficulty, promoting feature reuse, and thus improving image reconstruction capabilities. Experimental results show that compared with other methods, DMRN has better SSIM and PSNR performance, recovering clearer edges and more details from standard datasets and UHV panoramic monitoring image sets, improving the quality of high-resolution image reconstruction, and meeting the panoramic monitoring needs of inspection personnel for UHV converter station protection systems.
[0099] II. Optimization of the ubiquitous heterogeneous network transmission topology for panoramic monitoring of UHV converter stations
[0100] 2.1 Establishing a heterogeneous network model for the UHV converter station
[0101] To ensure the stable operation of ultra-high voltage converter stations, comprehensive monitoring of numerous devices within the station is necessary. However, different devices use different networks for data transmission, resulting in a heterogeneous network, as shown in Figure 1. To address the dynamic imbalance in data flow access caused by unreasonable heterogeneous network topology connections, the heterogeneous network topology must be optimized to meet network communication performance requirements.
[0102] Table 6. Symbols used in the network model
[0103]
[0104] Taking a ±1100 kV converter station as an example, this embodiment models the heterogeneous network shown in Figure 7 as a tree structure, as shown in Figure 8. This structure has one master station $v_0$ and N-1 data transmission nodes $\{v_1, v_2, \dots, v_{N-1}\}$, where each node has a unique path to the master station $v_0$. $V = \{v_0, v_1, \dots, v_{N-1}\}$ is the set of all vertices, and E is the set of directed edges.
[0105] In each round of data collection, node $v_i$ needs to forward a number of bits of data to its parent node, calculated using formula (9):
[0106]
[0107] Here, the data forwarded by $v_i$ combines the data generated by $v_i$ itself with the set of data received from the child nodes of $v_i$; a() is an aggregation function, and i ∈ {1, 2, ..., N-1}.
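The forwarding rule can be illustrated with a small sketch that takes a() to be summation, the choice this invention states it uses. The parent-map representation, the function name, and the example values are illustrative assumptions; the input is assumed to be a valid tree rooted at node 0 (the master station).

```python
from typing import Dict, List


def forwarded_bits(parent: Dict[int, int], own_bits: Dict[int, float]) -> Dict[int, float]:
    """Bits each node forwards to its parent when the aggregation a() is a sum.

    parent maps every data node to its parent; node 0 is the master station.
    """
    children: Dict[int, List[int]] = {}
    for node, par in parent.items():
        children.setdefault(par, []).append(node)

    totals: Dict[int, float] = {}

    def total(node: int) -> float:
        # Own data plus everything received from the subtree below this node.
        if node not in totals:
            totals[node] = own_bits[node] + sum(total(c) for c in children.get(node, []))
        return totals[node]

    for node in parent:
        total(node)
    return totals


# Example tree: master 0 <- node 1 <- nodes 2 and 3
bits = forwarded_bits({1: 0, 2: 1, 3: 1}, {1: 600.0, 2: 500.0, 3: 800.0})
```

In this example node 1 relays the data of nodes 2 and 3 in addition to its own, so it forwards 1900 bits per round while the leaves forward only what they generate.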
[0108] This embodiment models transmission traffic with formula (10), in which the topology-related transmission traffic of a node mainly consists of two parts: data processing (including data reception) and data transmission:
[0109]
[0110] Here, the two coefficients are the per-bit time consumption at node $v_i$ for data processing and for data transmission, respectively. The per-bit transmission time depends on the distance to the parent node and is further modeled as shown in formula (11):
[0111]
[0112] Here, the distance term is the Euclidean distance between node $v_i$ and its parent node (or the master station) $v_j$, and ρ is the power amplification constant in the link budget, which accounts for the effects of shadow fading.
[0113] To apply reinforcement learning in this embodiment, a valid tree structure is first constructed with the master station $v_0$ as the root node. In each step, a node that is not yet connected is selected and connected to a node in the tree or to the master station, until all nodes are connected. As shown in Figure 9, this process can be described by a finite-horizon, fully observable Markov decision process (MDP) given by the 4-tuple {S, A, T, R}. At each step t ∈ [0, N], the state $s_t \in S$ of the system is the current adjacency matrix of the network. The action $a_t \in A$ chooses which node to connect to the tree, that is, which node $v_i$ is connected to which node $v_j$ in the tree (or the master station). The system then evolves to the next state $s_{t+1}$ according to the deterministic transition T(s, a). Until the terminal state $s_N$ is reached, in which all nodes are connected to the tree, the reward at step t is undetermined. The lifetime of the heterogeneous network is then determined and returned as the reward for each action along the state trajectory.
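The tree-building MDP can be sketched as follows, with the state stored as an adjacency matrix and an action written as a pair (i, j) that connects the unconnected node v_i to the in-tree node v_j. The helper names and the convention that node 0 is the master station are illustrative assumptions; the reward at the terminal state is omitted.

```python
from typing import List, Tuple

State = List[List[int]]    # adjacency matrix; state[i][j] == 1 means edge v_i -> v_j
Action = Tuple[int, int]   # (i, j): connect unconnected v_i to in-tree v_j


def initial_state(n: int) -> State:
    """Empty network with n vertices; only the master station (node 0) is in the tree."""
    return [[0] * n for _ in range(n)]


def in_tree(state: State) -> List[int]:
    """Node 0 plus every node that already has a parent."""
    return [0] + [i for i in range(1, len(state)) if any(state[i])]


def valid_actions(state: State) -> List[Action]:
    connected = in_tree(state)
    free = [i for i in range(1, len(state)) if i not in connected]
    return [(i, j) for i in free for j in connected]


def transition(state: State, action: Action) -> State:
    """Deterministic transition T(s, a): add the directed edge v_i -> v_j."""
    if action not in valid_actions(state):
        raise ValueError(f"invalid action {action}")
    i, j = action
    nxt = [row[:] for row in state]
    nxt[i][j] = 1
    return nxt


def is_terminal(state: State) -> bool:
    """All nodes are connected once no valid action remains."""
    return not valid_actions(state)
```

Because every action attaches a free node to a node already in the tree, each reachable terminal state is a tree in which every node has a unique path to the master station.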
[0114] The energy efficiency topology optimization framework proposed in this invention follows the following general settings:
[0115] (1) The size of the data generated by node $v_i$ is a random number drawn from a specific distribution in the DRL-TC algorithm;
[0116] (2) The aggregation function a() can be any deterministic function. In this invention, the summation method is used.
[0117] (3) The designed topology control algorithm should be applicable to other network objectives, such as minimizing the overall network time consumption or maximizing the network throughput.
[0118] Consider the total traffic that can be transmitted at node $v_i$. Since the master station $v_0$ is assumed to have no restrictions, it imposes no limit. This invention defines the lifetime of a heterogeneous network as the minimum, over all nodes, of the transmission traffic measured in the total number of transmission rounds. Maximizing the lifetime of the heterogeneous network can then be expressed as:
[0119]
[0120]
[0121]
[0122]
[0123] Here, δ(S) is the set of edges, and the binary edge variable equals 1 if $v_i$ is a child of $v_j$ and 0 otherwise. Constraint (12b) ensures that all nodes are connected, and constraint (12c) ensures that each node can only transmit to one parent node at a time. To estimate the complexity of the problem, if the topology is considered as an undirected spanning tree, then according to Cayley's formula the number of all possible spanning trees in the network is $N^{N-2}$. While heuristic rules can reduce the number of search candidates, enumerating all possible solutions remains infeasible for reasonable values of N. This invention proposes a real-time DRL-TC algorithm that focuses on more promising regions of the search space when computational resources are limited and approaches the optimal solution as computational power increases.
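To make the size of the search space concrete, Cayley's formula can be evaluated directly. The function name below is illustrative; the value N = 13 corresponds to one master station plus 12 data nodes, the network size used in the simulation section.

```python
def spanning_tree_count(n: int) -> int:
    """Number of labelled spanning trees on n nodes, n ** (n - 2) by Cayley's formula."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 if n <= 2 else n ** (n - 2)


count_13 = spanning_tree_count(13)   # 13 ** 11 = 1_792_160_394_037
```

Even this small network has more than 1.7 × 10^12 candidate trees, which is why exhaustive enumeration is ruled out.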
[0124] 2.2 Topology Optimization Algorithm Based on Deep Reinforcement Learning
[0125] 2.2.1 Reinforcement Learning
[0126] Reinforcement learning teaches an agent to take actions in a dynamic environment so as to maximize a reward signal. In step t, the agent performs an action $a_t$ in the environment, receives an immediate reward $r_t$, and observes the environment state $s_t$. The action to be taken is determined by a policy. The policy can be deterministic, mapping $s_t$ to a specific action, or stochastic, assigning a probability to each action.
[0127] The reward $r_t$ tells the agent how well the current environment state meets the objective. It is given by a reward function, which may depend on $s_t$, $a_t$, and $s_{t+1}$, and which takes a high value when the goal is achieved and a low value otherwise. A sequence of states and actions is called a trajectory τ, and the discounted sum of all reward values collected along a trajectory is called the return, as shown in formula (13):
[0128]
[0129] Here, γ is a discount factor that reduces the value of future rewards. When γ < 1, a reward obtained now is more valuable than a reward obtained later. The return can be a finite-horizon return collected over a maximum number of time steps, in which case γ = 1 can be used if needed. Alternatively, it can be an infinite-horizon return, in which case γ < 1 is required.
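The discounted return described here, the sum of γ^t · r_t along a trajectory, can be computed with a short backward pass. This is a minimal sketch; the function name and the example rewards are illustrative.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Sum of gamma**t * rewards[t], accumulated from the last step backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


finite_horizon = discounted_return([0.0, 0.0, 5.0], gamma=1.0)   # terminal-only reward
discounted = discounted_return([1.0, 1.0, 1.0], gamma=0.5)       # 1 + 0.5 + 0.25
```

With γ = 1 and a reward that arrives only at the terminal state, as in the topology-construction process, every step along the trajectory sees the same return.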
[0130] The value function satisfies the Bellman equation, which states that the value of the current state is the reward of that state plus the expected reward of the next state. The value function and policy function are shown in equations (14) and (15):
[0131]
[0132]
[0133] The main problem of reinforcement learning is to find a policy that maximizes this expected return, and reinforcement learning algorithms usually rely on an approximate value function.
[0134] 2.2.2 Monte Carlo Tree Search
[0135] This embodiment uses the described deep convolutional neural network to approximate the policy function and the value function. Fitting the deep convolutional neural network as a function approximator requires training datasets of states, policies, and values. One method is to enumerate all states and their values as the training dataset. However, when the state space is large, this method overfits the deep convolutional neural network and becomes infeasible. Instead of using heuristic rules to reduce the number of search candidates, this embodiment uses MCTS to efficiently collect the training dataset in more promising regions of the search space. Each node on the search tree represents the 5-tuple (s, a, M(s, a), π(s), $Q^{\pi}(s, a)$), where s is the state of the heterogeneous network, a is the action in this state, M(s, a) is the total number of visits to (s, a) on the search tree, π(s) is the prior probability of the valid actions predicted by the deep convolutional neural network, and $Q^{\pi}(s, a)$ is the state-action value, defined as the expected reward starting from state s and taking action a and calculated using formula (14). At each search step t < N, the action that maximizes the upper confidence bound (UCB) is selected, as shown in formula (16):
[0136]
[0137] where M(s) is the visit count of state s regardless of action, and c is a hyperparameter that controls the level of exploration in the search. Intuitively, this selection policy initially favors actions with a higher prior probability π, but asymptotically favors actions with a higher state-action value $Q^{\pi}$. As Figure 10 shows, when the search reaches the terminal state (i.e., t = N), the reward is obtained and propagated back along the search path to the root state, through all visited states and the actions taken. The $Q^{\pi}$ values along the path are updated accordingly with the average value at each node. The details of MCTS are described in Algorithm 1, as shown in Table 7.
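Formula (16) is not legible in this text, so the sketch below assumes an AlphaZero-style PUCT score, Q(s, a) + c · π(s, a) · sqrt(1 + M(s)) / (1 + M(s, a)). This form matches the behaviour described (prior-driven at first, value-driven asymptotically) but may differ from the exact expression in the patent. The `Edge` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, Hashable, List


@dataclass
class Edge:
    prior: float       # pi(s)[a], predicted by the network
    visits: int = 0    # M(s, a)
    q: float = 0.0     # Q(s, a), running mean of backed-up rewards


def select_action(edges: Dict[Hashable, Edge], c: float = 1.0) -> Hashable:
    """Pick the action with the highest assumed PUCT score."""
    state_visits = sum(edge.visits for edge in edges.values())   # M(s)

    def score(edge: Edge) -> float:
        exploration = c * edge.prior * math.sqrt(1 + state_visits) / (1 + edge.visits)
        return edge.q + exploration

    return max(edges, key=lambda action: score(edges[action]))


def backup(path: List[Edge], reward: float) -> None:
    """Propagate the terminal reward along the search path, keeping a running mean."""
    for edge in path:
        edge.visits += 1
        edge.q += (reward - edge.q) / edge.visits
```

Before any visits, the action with the larger prior is chosen; after enough backups, the action with the larger average reward wins even if its prior is smaller.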
[0138] Table 7 MCTS Subroutine of the Proposed DRL-TC Algorithm
[0139]
[0140]
[0141] Each search begins in a specific state and recursively searches for the next state until a new leaf state or terminal state is reached. By performing multiple MCTSs in each state, the posterior visit counts M(s) are collected as part of the training dataset used to update the deep convolutional neural network in the next iteration.
[0142] 2.2.3 Deep Convolutional Neural Networks
[0143] A stochastic policy π(s) defines the distribution over valid actions in a state. Under this policy, the system moves from state $s_t$ to the terminal state $s_N$, generating the state and action trajectory $h(s_t) = \{s_t, a_t, \dots, s_{N-1}, a_{N-1}, s_N\}$. The value function $V^{\pi}(s)$ is defined as the expected reward over all possible trajectories starting from state s, and is calculated by formula (14).
[0144] This embodiment uses a deep convolutional neural network $f_{\Theta}(s)$, parameterized by Θ, to approximate the optimal value function $V^{*}(s) = \max_{\pi} V^{\pi}(s)$ and the optimal policy $\pi^{*}(s)$. As shown in Figure 11, the input to the deep convolutional neural network is the training dataset $\{(s, \pi(s), V^{\pi}(s))\}$. To keep the training of the multi-layer neural network feasible while significantly improving its representational power, this embodiment employs a deep VGG16 module. This module consists of two convolutional layers with 64 convolutional filters, two convolutional layers with 128 convolutional filters, three convolutional layers with 256 convolutional filters, and six convolutional layers with 512 convolutional filters. Each convolutional filter has a 3×3 kernel, followed by a max-pooling layer. The deep convolutional neural network is then divided into two branches of convolutional layers, followed by fully connected layers with softmax and ReLU activations for the policy and value functions, respectively. The policy and value predicted for each state by the deep convolutional neural network, $(\pi(s), V^{\pi}(s)) = f_{\Theta}(s)$, contain prior information that guides MCTS to search for states with high rewards; MCTS in turn collects the training dataset for the deep convolutional neural network.
[0145] Once the deep convolutional neural network $(\pi(s), V^{\pi}(s)) = f_{\Theta}(s)$ is trained, the tree topology of the heterogeneous network is obtained as follows. Starting from the root state $s_0 = 0$, this embodiment sequentially samples the action $a_t \sim \pi(s_t)$ from the policy predicted by the deep convolutional neural network and updates the state $s_{t+1} = T(s_t, a_t)$, until a complete tree is reached. Note that this topology construction is a stochastic process; once the deep convolutional neural network has been trained for a sufficient number of iterations, it converges to a solution.
[0146] 2.2.4 Self-configured DRL-TC algorithm
[0147] Self-configuration and self-optimization are the characteristics of a SON (Self-Organizing Network), which adapts better to flat and flexible network structures and has therefore attracted widespread attention. In short, the DRL-TC proposed in this embodiment alternates between training a deep convolutional neural network (DCNN) and running Monte Carlo tree search (MCTS): the DCNN provides a prior policy to guide the MCTS, and the MCTS returns posterior visit counts and state values to update the DCNN. In this way, with limited computational resources, the proposed DRL-TC algorithm focuses on more promising search regions and converges to a solution with higher rewards.
[0148] The proposed DRL-TC algorithm can also adapt to dynamic changes in the environment. For example, when nodes are suddenly added or removed, the topology rules may make some actions valid or invalid. In a new run of MCTS, the policy π returned by the deep convolutional neural network for this state is renormalized over all valid actions. Therefore, the new prior policy π(s) reflects the changes in the network while still drawing on historical data. MCTS then collects a new training dataset and uses it to update the deep convolutional neural network. Assuming that the network changes more slowly than the training time (which depends on available computational resources), the proposed DRL-TC algorithm can track the dynamic changes in the network and reconfigure the network topology accordingly. Algorithm 2 describes the complete algorithm of the proposed DRL-TC, as shown in Table 8.
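The renormalization step can be sketched as follows: probability mass on actions that are no longer valid is dropped and the remainder is rescaled to sum to 1. The function name, the dictionary representation of the policy, and the uniform fallback when no mass remains are assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable


def renormalize(policy: Dict[Hashable, float], valid: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Restrict a predicted policy to the currently valid actions and rescale it."""
    valid_actions = list(valid)
    mass = sum(policy.get(action, 0.0) for action in valid_actions)
    if mass <= 0.0:
        # Assumed fallback: no prior information survives, so use a uniform prior.
        return {action: 1.0 / len(valid_actions) for action in valid_actions}
    return {action: policy.get(action, 0.0) / mass for action in valid_actions}


# Action "c" became invalid, for example because its node was removed.
prior = renormalize({"a": 0.5, "b": 0.25, "c": 0.25}, valid=["a", "b"])
```

The relative preference between the surviving actions is unchanged, which is how the new prior stays related to historical data.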
[0149] Table 8. The proposed DRL-TC algorithm
[0150]
[0151]
[0152] 2.3 Simulation Results and Analysis
[0153] 2.3.1 Simulation Settings
[0154] To evaluate the performance of the DRL-TC algorithm, simulation tests were conducted on a heterogeneous network of a ±1100 kV converter station. This heterogeneous network consists of a master node and 12 nodes distributed within a circular area with a radius of 1000 m; each node generates between 500 and 1000 bits of sensing data, drawn uniformly, in each transmission round. This embodiment assumes that all nodes have sufficient time to transmit data in each round. The data transmission throughput of each unit of all nodes is set to... The power amplification constant is set to ρ = 1.
[0155] In each iteration of the algorithm, $N_e = 10$ training samples are collected from MCTS with $N_m = 100$ searches from each state. The batch size is B = 16 and the learning rate is $\alpha = 10^{-6}$. This embodiment uses the Adam optimizer to train the deep convolutional neural network. After each training iteration, 100 network topologies are constructed using the deep convolutional neural network, and the results are averaged to evaluate the performance of the algorithm.
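A simulation scenario of this kind can be generated with a few lines of Python. The sketch below makes several assumptions that the text does not state: the master station sits at the centre of the circle, node positions are uniform over the disc area, and data sizes are integer bit counts drawn uniformly from 500 to 1000. The function name and seed are illustrative.

```python
import math
import random
from typing import List, Tuple


def sample_network(n_nodes: int = 12, radius: float = 1000.0,
                   min_bits: int = 500, max_bits: int = 1000,
                   seed: int = 0) -> Tuple[List[Tuple[float, float]], List[int]]:
    """Return positions (index 0 is the master station) and per-round data sizes."""
    rng = random.Random(seed)
    positions = [(0.0, 0.0)]   # assumption: master station at the centre
    for _ in range(n_nodes):
        r = radius * math.sqrt(rng.random())   # sqrt gives uniform density over the area
        angle = 2.0 * math.pi * rng.random()
        positions.append((r * math.cos(angle), r * math.sin(angle)))
    bits = [rng.randint(min_bits, max_bits) for _ in range(n_nodes)]
    return positions, bits


positions, bits = sample_network()
```

Taking the square root of the uniform sample avoids clustering nodes near the centre, which sampling the radius directly would cause.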
[0156] 2.3.2 Results Analysis
[0157] First, this embodiment demonstrates the convergence and performance of the proposed DRL-TC algorithm. The solid line in Figure 12 shows the network latency achieved by the deep convolutional neural network after each training iteration; the algorithm converges after approximately 50 iterations. Table 9 compares the performance of the proposed DRL-TC algorithm with three heuristics: star topology (all nodes are connected to the master station), random topology (each node randomly selects a node to connect to), and minimum spanning tree (MST) topology, where the MST is weighted by the Euclidean distance between nodes. The star topology exhibits the longest network latency due to higher transmission traffic at edge nodes far from the master station. The random topology shows a shorter average network latency, but with large variation. The MST topology further reduces network latency by shortening the overall transmission distance. The DRL-TC algorithm proposed in this embodiment significantly outperforms these heuristics and has a short convergence time.
[0158] Table 9 Performance Comparison of DRL-TC Algorithm and Three Heuristic Methods
[0159]
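The three baseline heuristics can be sketched as parent arrays, where `parent[i]` is the node that v_i transmits to and node 0 is the master station. This is an illustrative reconstruction: the random baseline here attaches each node to an already connected node so that the result is always a tree, which is an assumption about how the random topology was generated, and the MST uses Prim's algorithm rooted at the master station.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def star_topology(points: Sequence[Point]) -> List[int]:
    """Every node connects directly to the master station (node 0)."""
    return [-1] + [0] * (len(points) - 1)


def random_topology(points: Sequence[Point], seed: int = 0) -> List[int]:
    """Each node, in random order, picks a random already-connected parent."""
    rng = random.Random(seed)
    parent = [-1] * len(points)
    connected = [0]
    order = list(range(1, len(points)))
    rng.shuffle(order)
    for node in order:
        parent[node] = rng.choice(connected)
        connected.append(node)
    return parent


def mst_topology(points: Sequence[Point]) -> List[int]:
    """Prim's algorithm with Euclidean edge weights, rooted at node 0."""
    n = len(points)
    parent = [-1] * n
    best = [math.inf] * n
    done = [False] * n
    best[0] = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not done[i]), key=lambda i: best[i])
        done[u] = True
        for v in range(n):
            d = math.dist(points[u], points[v])
            if not done[v] and d < best[v]:
                best[v], parent[v] = d, u
    return parent
```

For nodes placed along a line, for example, the MST forms a chain of nearest neighbours, while the star topology forces the farthest node to transmit over the full distance to the master station.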
[0160] Figure 13 demonstrates the ability of the proposed DRL-TC to adapt to sudden changes in the heterogeneous network, showing the average network latency after each training iteration. Points A to D on the curve in Figure 13 show the superposition of the 100 topologies given by the DRL-TC algorithm after the 1st, 62nd, 63rd, and 100th iterations. Point A indicates that in the first iteration DRL-TC explores the search space randomly; because the deep convolutional neural network has no prior information about the state values, the network latency is relatively high. Point B indicates that the network gradually converges after multiple iterations. Point C indicates that the algorithm adapts quickly when the heterogeneous network structure changes. Point D indicates that the algorithm converges to the optimal solution after 100 iterations.
[0161] This invention proposes a novel and unified heterogeneous network topology optimization algorithm based on deep reinforcement learning. The proposed DRL-TC algorithm can adapt to environmental changes and exhibits significantly better data transmission performance than other heuristic algorithms, demonstrating excellent adaptability to network topology changes and enhancing network reliability. The DRL-MCTS framework has great potential in heterogeneous networks, enabling online training without intervening in network services. Furthermore, with the continuous improvement of computing power, this invention anticipates that in the 5G era, DRL-MCTS will see other promising topology control applications in self-organizing and fully automated IoT networks.
[0162] The above embodiments are only used to illustrate the technical solutions of the present invention, and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some of the technical features. Such modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims
1. A method for reconstructing and transmitting panoramic monitoring images of an ultra-high voltage converter station protection system, characterized in that it includes the following steps: S1. Perform super-resolution reconstruction on the panoramic monitoring image. The reconstruction method is as follows: S11. Establish a deep multi-scale residual network model on the edge side; the deep multi-scale residual network model includes an input convolutional layer, an output convolutional layer, and k multi-scale convolutional blocks. The input convolutional layer acts as an encoder to extract the original low-level features of the low-resolution image; the output convolutional layer is used to fuse multi-scale detail features to reconstruct the high-resolution image; the input and output convolutional layers have a skip connection that establishes an identity mapping from the low-resolution image to the high-resolution image for global residual learning. The k multi-scale convolutional blocks are stacked and connected sequentially to obtain the network model depth; the original low-order features and the k multi-scale convolutional blocks are interconnected via k corresponding connection paths for local residual learning, which enhances the network model's ability to learn complex features. Both the input and output convolutional layers use convolutional kernels with a stride of 1, and the input convolutional layer uses ReLU activation. The multi-scale convolutional block extracts multi-level detail features from the input image using convolutional kernels of four scales: 3×3, 3×2, 2×3, and 2×2. Then, the feature maps of the four scales are concatenated pairwise along a specified dimension through a cross-connection mechanism and fed into a 3×3 convolutional layer for feature mapping, generating a new feature map of the same size as the input and feeding it into the next multi-scale convolutional block. S12. Input the sample dataset and train the deep multi-scale residual network model; S13. Test the peak signal-to-noise ratio and structural similarity index of the trained deep multi-scale residual network model using a standard dataset. S14. Input the panoramic monitoring image of the UHV converter station into the trained deep multi-scale residual network model to complete the super-resolution reconstruction. S2. Optimize the ubiquitous heterogeneous network transmission topology for panoramic monitoring of UHV converter stations. The optimization method is as follows: S21. Model the heterogeneous network of the UHV converter station as a tree structure, wherein the tree structure has a master station $v_0$ and N-1 data transmission nodes $\{v_1, v_2, \dots, v_{N-1}\}$, and each data transmission node has a unique path to the master station $v_0$; S22. With the master station $v_0$ as the root node of the tree structure and the root node as the initial state, obtain the training dataset by running a recursive Monte Carlo tree search from each state. S23. Input the training dataset obtained from the search into the deep convolutional neural network for training to obtain the value function and policy function, which are used to guide the Monte Carlo tree to recursively search for states with expected rewards and in turn update the training dataset collected for the deep convolutional neural network. S24. After training is complete, starting from the initial state $s_0$, sequentially select the action $a_t \sim \pi(s_t)$ from the policy predicted by the deep convolutional neural network and update the state $s_{t+1} = T(s_t, a_t)$ until a complete tree is reached, thus obtaining the heterogeneous network topology.
2. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 1, characterized in that, The local residual learning is defined as follows: ;in, For the first k The feature maps learned by multi-scale convolutional blocks For the first k The output of a multi-scale convolutional block -1 For the first k-1 The output of a multi-scale convolutional block F The original low-order features extracted by the input convolutional layer; Global residuals and local residuals learned k The mapping of a multi-scale convolutional block is represented as follows: ;in, () represents the mapping that the input convolutional layer needs to learn. F -1 () represents the mapping that the output convolutional layer needs to learn, where, , These represent high-resolution and low-resolution images, respectively. -1 For the first k-1 The feature maps learned by multi-scale convolutional blocks The feature map learned from the first multi-scale convolutional block. R () represents a mapping operation.
3. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 2, characterized in that the loss function of the deep multi-scale residual network model is: ; where θ denotes the parameters of the deep multi-scale residual network, and the loss function is minimized using the Adam optimizer; $X^{(i)}$ is the i-th sub-image in the sample dataset, $Y^{(i)}$ is the corresponding label, and N is a positive integer.
4. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 1, characterized in that, The panoramic monitoring images include images of secondary equipment, hard pressure plates, and terminal corrosion; the standard datasets include three basic datasets: Set5, Set14, and Urban100.
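5. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 3, characterized in that the formula for calculating the peak signal-to-noise ratio in the test is as follows: $\mathrm{PSNR} = 10 \log_{10}\left(\frac{\mathrm{MAX}_I^{2}}{\mathrm{MSE}}\right)$; where MSE is the mean squared error between the original image and the processed image, and $\mathrm{MAX}_I$ is the maximum value of the image color. The structural similarity index is calculated by the following formulas: $L(X,Y) = \frac{2 u_X u_Y + C_1}{u_X^{2} + u_Y^{2} + C_1}$; $C(X,Y) = \frac{2 \sigma_X \sigma_Y + C_2}{\sigma_X^{2} + \sigma_Y^{2} + C_2}$; $S(X,Y) = \frac{\sigma_{XY} + C_3}{\sigma_X \sigma_Y + C_3}$; $\mathrm{SSIM}(X,Y) = L(X,Y) \cdot C(X,Y) \cdot S(X,Y)$; where $u_X$, $u_Y$ and $\sigma_X$, $\sigma_Y$ are the means and standard deviations of images X and Y, respectively, $\sigma_{XY}$ is the covariance of images X and Y, and $C_1$, $C_2$, and $C_3$ are constants, usually taken as $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, $C_3 = C_2 / 2$, $K_1 = 0.01$, $K_2 = 0.03$, where L is the range of pixel values.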
6. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 1, characterized in that, The method for modeling the heterogeneous network of the UHV converter station as a tree structure is as follows: In each round of data acquisition, the nodes... Will The bit data is forwarded to its parent node, where, ; It is by Self-generated data, data set It comes from child nodes, The function is an aggregate function; a transfer model is used. Transmission traffic, in the transmission model, the transmission traffic of nodes related to the topology consists of two parts: data processing and transmission time. and They are at the nodes The time consumption for processing each bit of data and the time consumption for transmitting each bit of data are specified. The time consumption for transmitting each bit of data depends on the distance to the parent node, and its calculation formula is as follows: ,in, It is a node with its parent node The Euclidean distance between them It is the power amplification constant in the link budget that takes into account the effects of shadow fading.
7. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 6, characterized in that the Monte Carlo tree recursive search method is as follows: each node in the Monte Carlo tree represents the 5-tuple (s, a, M(s, a), π(s), $Q^{\pi}(s, a)$); in each search step t < N, the action that maximizes the upper confidence bound is selected; when the search reaches the terminal state (t = N), the reward is obtained and propagated back along the search path to the root state, through all visited states and the actions taken, and the $Q^{\pi}$ values along the path are updated accordingly with the average value at each node; where s is the state of the heterogeneous network; a is the action performed in that state; M(s, a) is the total number of visits to (s, a) on the search tree; π(s) is the prior probability of the valid actions predicted by the deep convolutional neural network; $Q^{\pi}(s, a)$ is the state-action value, representing the expected reward starting from state s and taking action a; the formula for calculating the action that maximizes the upper confidence bound is as follows: ; where M(s) is the visit count of state s regardless of action, and c is a hyperparameter that controls the search level.
8. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 7, characterized in that the deep convolutional neural network comprises a deep VGG16 module, a fully connected layer with softmax activation for the policy function, and a fully connected layer with ReLU activation for the value function. The deep VGG16 module consists of two convolutional layers with 64 convolutional filters, two convolutional layers with 128 convolutional filters, three convolutional layers with 256 convolutional filters, and six convolutional layers with 512 convolutional filters. Each convolutional filter has a 3×3 kernel, followed by a max-pooling layer.
9. The method for panoramic monitoring image reconstruction and transmission of the UHV converter station protection system according to claim 8, characterized in that, The value function satisfies the Bellman equation, meaning that the value of the current state is the reward of that state plus the expected reward of the next state. The formula for the value function is: The strategy function formula is as follows: .