A method and device for identifying a poor-quality cell
By constructing a poor-quality cell identification method based on anomaly detection and threshold model, and utilizing historical index data and degradation features, the problem of inaccurate identification of poor-quality cells in existing technologies is solved, and a more reasonable and efficient poor-quality cell identification method is achieved.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- FUJIAN FUNO MOBILE COMM TECH CO LTD
- Filing Date
- 2023-01-29
- Publication Date
- 2026-06-19
AI Technical Summary
In existing technologies, the methods for identifying poor-quality cells rely on fixed thresholds and manual analysis, resulting in inaccurate and subjective identification, as well as being time-consuming and incomplete.
By collecting historical indicator data of the cell's wireless network, using an anomaly detection model to remove abnormal data, a poor-quality cell identification model is constructed. Combining the indicator scenario threshold and cell threshold range, along with degradation characteristics and user complaint data, the LightGBM classification algorithm is used to identify poor-quality cells.
It improves the accuracy and comprehensiveness of identifying poor-quality cells, ensures that threshold calculation is reasonable and reliable, reduces manual intervention, and enhances the objectivity and efficiency of identification.
Figure CN116156531B_ABST (abstract drawing)
Description
Technical Field
[0001] This invention relates to the field of wireless communication technology, and in particular to a method and apparatus for identifying poor-quality cells.
Background Technology
[0002] As users' demands for wireless network quality increase, the timely identification of cells with poor wireless network quality is receiving more and more attention. Existing technologies mainly compare wireless network performance indicators such as LTE availability, RRU availability, call completion rate, and coverage with pre-set thresholds for each performance indicator. Maintenance personnel analyze the wireless network quality of the cells based on the comparison results and identify cells with poor wireless network quality. However, the threshold for each indicator in this method is set by maintenance personnel based on their business experience and is fixed. Because network environments vary, network quality also differs between them, so judging solely against uniform fixed thresholds cannot guarantee a reasonable and accurate result. Furthermore, manual analysis by maintenance personnel is not only time-consuming but also lacks objectivity, comprehensiveness, and accuracy.
Summary of the Invention
[0003] The technical problem to be solved by the present invention is: to provide a method and apparatus for identifying poor-quality cells, thereby improving the accuracy of identifying poor-quality cells.
[0004] To solve the above-mentioned technical problems, the technical solution adopted by the present invention is as follows:
[0005] In a first aspect, the present invention provides a method for identifying poor-quality cells, comprising:
[0006] Collect historical indicator data for each performance of the cell wireless network, input the historical indicator data into the anomaly detection model for anomaly detection, determine whether there is any abnormal data in the historical indicator data, and if so, remove the abnormal data from the historical indicator data to generate historical indicator data without anomalies;
[0007] The historical indicator data without anomalies is input into the threshold model to calculate the threshold range of each performance indicator scenario and the threshold range of the indicator cell;
[0008] A poor-quality cell identification model is constructed based on the threshold range of the performance indicators and the threshold range of the indicator cells, and poor-quality cells are identified through the poor-quality cell identification model.
[0009] The beneficial effects of this invention are that, based on historical network performance index data of a cell, an anomaly detection model is used to remove abnormal data from the performance index data, eliminating interference from sudden situations or adverse environments. This makes the calculation of the threshold values for each performance index more reasonable and reliable. Furthermore, the scenario index threshold values and cell index threshold values for each performance index are calculated separately at the scenario level and the cell level. This allows the network performance assessment to be compared not only with network scenarios of the same level but also with the historical performance of the cell itself. This makes the construction of the poor-quality cell identification model more comprehensive and reasonable, improving the accuracy and comprehensiveness of identifying poor-quality cells.
[0010] Optionally, the step of collecting historical performance data of the cell's wireless network, inputting the historical performance data into an anomaly detection model for anomaly detection, and determining whether there is abnormal data in the historical performance data includes:
[0011] Historical performance data of the cell wireless network are collected, and a random number of historical performance data points are randomly selected from the historical performance data as a sample dataset. The sample dataset is used as the first root node, and one historical performance data point is randomly selected from the first root node as a split point. Based on the split point, the first root node is split using the isolation forest anomaly detection algorithm to generate the first child node of the first root node.
[0012] The first child node is used as the second root node, and one historical indicator data point is randomly selected from the second root node as a split point. Based on the split point, the second root node is split using the isolation forest anomaly detection algorithm to generate the second child node of the second root node.
[0013] If there is only one historical indicator data in the second child node, the segmentation stops. Otherwise, it is determined whether all the historical indicator data in the second child node are the same. If so, the segmentation stops. Otherwise, it is determined whether the depth of the historical indicator data in the second child node has reached the depth threshold. If so, the segmentation stops. Otherwise, the segmentation continues to generate child nodes.
[0014] The first root node and all its child nodes form an anomaly detection model; the historical indicator data is input into the anomaly detection model for anomaly detection to determine whether there is any abnormal data in the historical indicator data.
[0015] As described above, the anomaly detection model is constructed from the historical performance data of the cell's wireless network, which improves the objectivity and rationality of the anomaly detection model. Furthermore, the anomaly detection model is constructed using the isolation forest anomaly detection algorithm, which enhances the noise resistance of the model and makes it easier to detect missing values, i.e., abnormal data.
[0016] Optionally, the step of inputting the historical indicator data into the anomaly detection model for anomaly detection, and determining whether there is abnormal data in the historical indicator data, includes:
[0017] The historical indicator data is input into the anomaly detection model, allowing each historical indicator data point to traverse every node of the anomaly detection model. The depth of each historical indicator data point at each node is calculated, and the anomaly value of the depth is calculated using the following anomaly score formula:
[0018] $$s(x) = 2^{-\frac{h(x)}{c(\psi)}}$$
[0019] Where: s(x) is the anomaly value; h(x) is the depth of the historical indicator data x in the node of the anomaly detection model; ψ is the average depth of historical indicator data; and c(ψ) is the average path length of nodes in the anomaly detection model constructed with ψ nodes, calculated using H(i) = ln(i) + ξ, where ξ is Euler's constant.
[0020] When the anomaly value exceeds the anomaly threshold, the historical indicator data is considered abnormal data.
[0021] As described above, anomaly detection involves traversing the historical indicator data through each node of the anomaly detection model, calculating the depth of the historical indicator data at each node, and then calculating the anomaly value of the depth using the anomaly score formula, thereby detecting abnormal data. This layered detection improves the accuracy and comprehensiveness of anomaly detection.
[0022] Optionally, the step of inputting the historical indicator data without anomalies into the threshold model to calculate the indicator thresholds for each performance scenario includes:
[0023] The anomaly-free historical indicator data is input into the threshold model and classified according to the preset scenario classification rules to generate classified historical indicator data. The standard deviation of the classified historical indicator data is calculated, and the parameters of the threshold model are set according to the standard deviation to generate a parameterized threshold model. The parameterized threshold model then clusters the classified historical indicator data using a clustering algorithm to generate clustered historical indicator data.
[0024] Based on the principle of having the most categories, the historical indicator data with the most categories is selected from the clustered historical indicator data, and the historical indicator data is used as the master data. The maximum and minimum values of the master data are obtained, and the threshold range of the indicator scenario for the current performance is generated based on the maximum and minimum values.
[0025] As described above, historical indicator data without anomalies will be classified according to preset scenario classification rules, and then the standard deviation of the classified historical indicator data will be calculated. The parameters of the threshold model will be set based on the calculated standard deviation, which will improve the rationality of the threshold model while ensuring its objectivity. At the same time, clustering will be performed through clustering algorithms to make the fluctuation of historical indicator data smaller, thereby improving the accuracy and stability of the threshold range of indicator scenarios.
[0026] Optionally, the step of inputting the historical indicator data without anomalies into the threshold model to calculate the indicator thresholds for each performance cell includes:
[0027] The anomaly-free historical indicator data is input into the threshold model, the standard deviation of the anomaly-free historical indicator data is calculated, and the parameters of the threshold model are set according to the standard deviation of the indicator to generate a parameterized threshold model. The parameterized threshold model then clusters the anomaly-free historical indicator data using a clustering algorithm to generate clustered historical indicator data.
[0028] Based on the principle of having the most categories, the historical indicator data with the most categories is selected from the clustered historical indicator data, and the historical indicator data is used as the master data. The maximum and minimum values of the master data are obtained, and the threshold range of the indicator cell for the current performance is generated based on the maximum and minimum values.
[0029] As described above, the calculation of the threshold range for indicator cells is similarly based on the standard deviation of historical indicator data without anomalies to set the threshold model, ensuring the rationality and objectivity of the threshold model. Furthermore, a clustering algorithm is used to cluster the historical indicator data, making the fluctuations of the historical indicator data smaller, thereby improving the accuracy and stability of the threshold range for indicator cells.
[0030] Optionally, constructing a poor-quality cell identification model based on the threshold range of the performance indicators and the threshold range of the cell indicators includes:
[0031] Degradation features are constructed by combining the threshold range of the performance indicators for each scenario with the threshold range of the indicator cells with preset degradation rules.
[0032] Obtain a list of poor-quality cells from user complaints and mark the poor-quality cells in the list to generate marked poor-quality complaint cells. At the same time, obtain a list of measured poor-quality cells and mark the poor-quality cells in the list to generate marked poor-quality measured cells.
[0033] Based on the degradation characteristics, the degradation characteristic data of the marked poor quality complaint cells and the marked poor quality measured cells are obtained, and the degradation characteristic data is normalized to generate normalized degradation characteristic data.
[0034] Based on the normalized degradation feature data, a poor-quality cell identification model is constructed using the LightGBM classification algorithm, and poor-quality cells are identified using the poor-quality cell identification model.
[0035] As described above, the construction of the poor-quality cell identification model is not merely based on the threshold range of various performance indicators and the threshold range of indicator cells. Instead, it combines these with preset degradation rules to construct degradation features. Based on the constructed degradation features, it obtains degradation feature data of poor-quality cells in the list of poor-quality cells complained about by users and degradation feature data of poor-quality cells in the actual measured list of poor-quality cells. This process constructs the poor-quality cell identification model and improves the accuracy and comprehensiveness of the poor-quality cell identification model.
[0036] Optionally, identifying poor-quality cells using the poor-quality cell identification model includes:
[0037] The poor-quality cell identification model obtains the existing indicator data of each performance of the cell wireless network in real time, and compares the existing indicator data with the indicator scenario threshold range and the indicator cell threshold range. When the existing indicator data of each performance of the cell wireless network is not within the indicator scenario threshold range or the indicator cell threshold range, the current performance deteriorates, and the degree of deterioration of the current performance is calculated by the deterioration formula.
[0038] The number of days of continuous performance degradation and the number of days of degradation within a preset time period are obtained based on the degradation characteristics.
[0039] Poor-quality cells are identified based on the degree of degradation, the number of days of continuous degradation, and the number of days of degradation within the preset time period.
[0040] As described above, the identification of poor-quality cells no longer simply involves judging whether the existing performance indicators of the wireless network are within the threshold of the indicator scenario or the threshold of the indicator cell. Instead, it also takes into account the degree of performance degradation, the number of days of continuous degradation, and the number of days of degradation within a preset time period to improve the rationality and comprehensiveness of the judgment of poor-quality cells.
[0041] In a second aspect, the present invention provides a device for identifying poor-quality cells, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the method for identifying poor-quality cells described in the first aspect.
[0042] For the technical effects of the poor-quality cell identification device provided in the second aspect, refer to the description of the poor-quality cell identification method provided in the first aspect.
Attached Figure Description
[0043] Figure 1 is a flowchart illustrating a method for identifying poor-quality cells provided in an embodiment of the present invention;
[0044] Figure 2 is a flowchart illustrating a method for identifying poor-quality cells provided in an embodiment of the present invention;
[0045] Figure 3 is a schematic diagram of the structure of a poor-quality cell identification device provided in an embodiment of the present invention.
[0046] [Explanation of Labels in the Attached Image]
[0047] 1. A device for identifying poor-quality cells;
[0048] 2. Processor;
[0049] 3. Memory.
Detailed Implementation
[0050] To better understand the above technical solutions, exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention can be implemented in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that the present invention can be understood more clearly and thoroughly, and that the scope of the present invention can be fully conveyed to those skilled in the art.
[0051] Example 1
[0052] Referring to Figures 1 and 2, this embodiment provides a method for identifying poor-quality cells, comprising the following steps:
[0053] S1. Collect historical performance data of the cell wireless network, input the historical performance data into the anomaly detection model for anomaly detection, determine whether there is abnormal data in the historical performance data, and if so, remove the abnormal data from the historical performance data to generate anomaly-free historical performance data.
[0054] In this embodiment, as shown in Figure 2, historical indicator data for each performance of the cell's wireless network is collected. The historical indicator data includes, but is not limited to: LTE wireless call success rate (%), LTE wireless call drop rate (%), LTE handover success rate (%), CSFB fallback success rate (%), LTE worst cell percentage (%), urban low CQI cell percentage (%), TCP wireless success rate (%), TCP wireless latency (ms), 4G high interference cell percentage (%), E-RAB (QCI=1) establishment success rate, E-RAB call drop rate (QCI=1), VoLTE user handover success rate, eSRVCC handover success rate (network management), and VoLTE two low and two high cell percentage (including swallowed characters). The collected historical indicator data is input into an anomaly detection model for anomaly detection. Historical indicator data recorded in harsh environments such as typhoons and rainstorms differs significantly from data recorded under normal conditions; such distorted data is not conducive to threshold calculation. Therefore, when abnormal data is found in the historical indicator data, it is removed to ensure the stability of the historical indicator data.
[0055] At this point, the step S1 of collecting historical performance data of the cell's wireless network, inputting the historical performance data into the anomaly detection model for anomaly detection, and determining whether there is abnormal data in the historical performance data includes:
[0056] S11. Collect historical performance data of the cell wireless network, and randomly select a random number of historical performance data points from the historical performance data as a sample dataset. Use the sample dataset as the first root node, and randomly select one historical performance data point from the first root node as a split point. Based on the split point, use the isolation forest anomaly detection algorithm to split the first root node and generate the first child node of the first root node.
[0057] In this embodiment, a random number of historical indicator data points are randomly selected from the historical indicator data as a sample dataset, and this sample dataset is used as the first root node. Then, one historical indicator data point is randomly selected from the first root node as a split point. Based on this split point, the first root node is split using the isolation forest anomaly detection algorithm. That is, the algorithm places each historical indicator data point in the first child node on the left or the first child node on the right, depending on whether it is larger than the split point. In other words, the historical indicator data of the first root node is divided into different subspaces, and each subspace stores the corresponding historical indicator data.
[0058] S12. Take the first child node as the second root node, and randomly select one historical indicator data point from the second root node as a split point. Based on the split point, split the second root node using the isolation forest anomaly detection algorithm to generate the second child node of the second root node.
[0059] In this embodiment, after the first root node is divided in step S11 to generate the first child node, the first child node will be used as the second root node, and the operation of step S11 will be repeated to divide the historical indicator data of the second root node into different subspaces, i.e., different child nodes, and store the corresponding historical indicator data in the corresponding child node.
[0060] S13. When there is only one historical indicator data in the second child node, the segmentation is stopped. Otherwise, it is determined whether the historical indicator data in the second child node are all the same. If so, the segmentation is stopped. Otherwise, it is determined whether the depth of the historical indicator data in the second child node has reached the depth threshold. If so, the segmentation is stopped. Otherwise, the segmentation is continuously performed to generate child nodes.
[0061] In this embodiment, there are three cases for stopping the segmentation: first, when there is only one historical indicator data point in the second child node, the segmentation stops; second, when all the historical indicator data in the second child node are the same, the segmentation stops; and third, when the depth of the historical indicator data in the second child node reaches the depth threshold, the segmentation stops. The depth threshold is calculated using the depth threshold formula log₂(ψ), where ψ is based on the number of historical indicator data. That is, the depth threshold is updated in real time based on the number of historical indicator data to set the conditions for stopping the segmentation.
[0062] In one specific embodiment, the number of historical indicator data is 90. According to the depth threshold formula log₂(ψ), the depth threshold = log₂(90) ≈ 6.49; that is, the depth threshold of the historical indicator data is 6.49.
[0063] S14. The first root node and all child nodes form an anomaly detection model; the historical indicator data is input into the anomaly detection model for anomaly detection to determine whether there is abnormal data in the historical indicator data.
[0064] In this embodiment, the anomaly detection model consists of a first root node and all child nodes, that is, the anomaly detection model is a binary tree model composed of multiple nodes.
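Steps S11 to S14 can be illustrated with a minimal Python sketch of a single tree. It is an illustration under stated assumptions, not the patented implementation: the data is one-dimensional, values smaller than the split point go to the left child, the split point is never the node minimum (so that both children are non-empty), and the names `build_tree`, `leaf_depth`, `count_points` and the sample values are hypothetical.

```python
import math
import random


def build_tree(values, depth, depth_limit, rng):
    """Recursively split 1-D indicator values into a binary tree of dict nodes."""
    # Stop conditions of step S13: a single data point, identical data points,
    # or the depth threshold has been reached.
    if len(values) <= 1 or min(values) == max(values) or depth >= depth_limit:
        return {"size": len(values), "depth": depth}
    # Split point drawn from the node's own data (steps S11 and S12).
    # Assumption: the minimum is excluded so that neither child is empty.
    lowest = min(values)
    split = rng.choice([v for v in values if v > lowest])
    left = [v for v in values if v < split]    # assumption: smaller values go left
    right = [v for v in values if v >= split]
    return {
        "split": split,
        "depth": depth,
        "left": build_tree(left, depth + 1, depth_limit, rng),
        "right": build_tree(right, depth + 1, depth_limit, rng),
    }


def leaf_depth(node, x):
    """Depth of the leaf that a data point x reaches when it traverses the tree."""
    while "split" in node:
        node = node["left"] if x < node["split"] else node["right"]
    return node["depth"]


def count_points(node):
    """Total number of sample points stored in the leaves below a node."""
    if "split" not in node:
        return node["size"]
    return count_points(node["left"]) + count_points(node["right"])


rng = random.Random(0)
history = [98.1, 98.4, 98.2, 98.3, 98.5, 98.0, 98.6, 71.0]  # hypothetical values
sample = rng.sample(history, rng.randint(2, len(history)))  # random-size sample dataset
depth_limit = math.log2(len(sample))                        # depth threshold log2(psi)
tree = build_tree(sample, 0, depth_limit, rng)
depths = {v: leaf_depth(tree, v) for v in history}
```

A data point that is isolated after only a few splits ends in a shallow leaf, which is what the anomaly value in the following steps measures.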
[0065] At this point, step S14, which involves inputting the historical indicator data into the anomaly detection model for anomaly detection to determine whether there is abnormal data in the historical indicator data, includes:
[0066] S141. Input the historical indicator data into the anomaly detection model, so that each historical indicator data traverses each node of the anomaly detection model, and calculate the depth of each historical indicator data at each node. Calculate the anomaly value of the depth using the following anomaly score formula:
[0067] $$s(x) = 2^{-\frac{h(x)}{c(\psi)}}$$
[0068] Where: s(x) is the anomaly value, and h(x) is the depth of the historical indicator data x in the node of the anomaly detection model. ψ is the average depth of historical indicator data, and c(ψ) is the average path length of nodes in the anomaly detection model constructed with ψ nodes.
[0069] $$c(\psi) = 2H(\psi - 1) - \frac{2(\psi - 1)}{n}$$
[0070] Where H(ψ-1) = ln(ψ-1) + ξ, ξ is Euler's constant, and n is the number of historical indicator data;
[0071] In this embodiment, as shown in Figure 2, after the historical indicator data is input into the anomaly detection model, each historical indicator data point traverses the nodes of the anomaly detection model, and the depth of each data point at its corresponding node, i.e., the layer in which it is located, is calculated. At the same time, the average path length of the nodes in the anomaly detection model is calculated; Euler's constant in the average path length formula is 0.5772156649. The anomaly value corresponding to the node depth is then calculated according to the anomaly score formula.
[0072] In one specific embodiment, the anomaly detection model is constructed from 5 nodes, with 8 historical indicator data points. The depth of the historical indicator data in the nodes of the anomaly detection model is 4, corresponding to an average depth of (4+3+2+1) / 4 = 2.5. c(ψ) = c(5) = 2H(5-1) - 2(5-1) / 8 = 2ln4 + 0.5772156649 - 1 = 2.35, and the anomaly value is $2^{-2.35} = 0.196$.
[0073] S142. When the abnormal value exceeds the abnormal threshold, the historical indicator data is abnormal data.
[0074] In this embodiment, as shown in Figure 2, the anomaly value is calculated using the anomaly score formula. As can be seen from step S141, the anomaly score formula is a power of 2 with a negative exponent, so the anomaly value lies in the interval (0, 1). The anomaly threshold is therefore set to 0.5; that is, when the anomaly value exceeds 0.5, the historical indicator data is considered abnormal data.
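The scoring in steps S141 and S142 can be sketched as follows. This is a hedged illustration: the form of c(ψ) is taken from the expression used in the worked example of paragraph [0072] (2H(ψ-1) - 2(ψ-1)/n), the function names are hypothetical, and the inputs (a model with 256 nodes and 256 data points) are made-up values rather than figures from this description.

```python
import math

EULER = 0.5772156649  # Euler's constant as given in the description


def harmonic(i):
    """H(i) = ln(i) + xi, the approximation used for the average path length."""
    return math.log(i) + EULER


def average_path_length(psi, n):
    """c(psi) = 2*H(psi - 1) - 2*(psi - 1)/n (form assumed from paragraph [0072])."""
    return 2.0 * harmonic(psi - 1) - 2.0 * (psi - 1) / n


def anomaly_value(depth, psi, n):
    """s(x) = 2 ** (-h(x) / c(psi)); shallower depths give values closer to 1."""
    return 2.0 ** (-depth / average_path_length(psi, n))


def is_abnormal(depth, psi, n, threshold=0.5):
    """Step S142: the data point is abnormal when its anomaly value exceeds the threshold."""
    return anomaly_value(depth, psi, n) > threshold


# Hypothetical inputs: a data point isolated at depth 2 versus one at depth 12.
shallow = anomaly_value(2, psi=256, n=256)
deep = anomaly_value(12, psi=256, n=256)
```

With the threshold at 0.5, a data point is flagged exactly when its depth is smaller than c(ψ), so the threshold separates data isolated earlier than average from data isolated later than average.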
[0075] S2. Input the historical indicator data without abnormalities into the threshold model to calculate the threshold range of each performance indicator scenario and the threshold range of the indicator cell;
[0076] In this embodiment, as shown in Figure 2, the historical indicator data with abnormal data removed is input into the threshold model to calculate the indicator scenario threshold range and the indicator cell threshold range for each performance.
[0077] At this point, step S2, which involves inputting the historical indicator data without anomalies into the threshold model to calculate the threshold values for each performance indicator scenario, includes:
[0078] S21. Input the anomaly-free historical indicator data into the threshold model, classify the anomaly-free historical indicator data according to the preset scenario classification rules to generate classified historical indicator data, calculate the standard deviation of the classified historical indicator data, set the parameters of the threshold model according to the standard deviation to generate a parameterized threshold model, and have the parameterized threshold model cluster the classified historical indicator data using a clustering algorithm to generate clustered historical indicator data.
[0079] In this embodiment, when calculating the threshold values for various performance metrics of historical metrics data, the historical metrics data are classified according to preset scenario classification rules. The preset scenario classification rules are based on the coverage range of the wireless network, classifying coverage ranges of 1-10m as personal area networks, 10m-1km as local area networks, 1km-50km as metropolitan area networks, and coverage ranges exceeding 50km as wide area networks. These scenario classification rules include, but are not limited to, districts / counties, coverage ranges, coverage scenarios, manufacturers, and frequency bands, and can be set according to actual conditions.
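The coverage-based classification rule above can be sketched in Python as follows. The boundary handling is an assumption, since the description does not say which class a range of exactly 10 m, 1 km, or 50 km belongs to; here each upper bound is treated as inclusive and ranges below 1 m are rejected. The function names and the sample records are hypothetical.

```python
from collections import defaultdict


def classify_coverage(range_m):
    """Map a coverage range in metres to the scenario class named in the description."""
    if range_m < 1:
        raise ValueError("coverage range below 1 m is not covered by the rule")
    if range_m <= 10:
        return "personal area network"
    if range_m <= 1_000:
        return "local area network"
    if range_m <= 50_000:
        return "metropolitan area network"
    return "wide area network"


def group_by_scenario(records):
    """Group indicator values by scenario; records are (coverage_m, value) pairs."""
    groups = defaultdict(list)
    for coverage_m, value in records:
        groups[classify_coverage(coverage_m)].append(value)
    return dict(groups)


# Hypothetical records: (coverage range in metres, indicator value in %).
records = [(5, 99.1), (800, 98.4), (900, 98.7), (20_000, 97.9), (60_000, 96.5)]
classified = group_by_scenario(records)
```

Each group can then be passed separately to the threshold model, so that a cell is compared only with cells of the same scenario.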
[0080] After classification is completed, the standard deviation of the classified historical indicator data is calculated, and the parameters of the threshold model are set based on the calculated standard deviation. The threshold model predefines the ε-neighborhood and the core object. The ε-neighborhood is defined as follows: for $x_j \in D$, its ε-neighborhood contains the subset of samples in the sample set D of the classified historical indicator data whose distance to $x_j$ is no greater than ε: $N_\varepsilon(x_j) = \{x_i \in D \mid \mathrm{distance}(x_i, x_j) \le \varepsilon\}$. The number of samples in this subset is denoted $|N_\varepsilon(x_j)|$. The core object is defined as follows: for any sample $x_j \in D$, if its ε-neighborhood $N_\varepsilon(x_j)$ contains at least min_points samples, i.e., $|N_\varepsilon(x_j)| \ge$ min_points, then $x_j$ is a core object, where min_points = n / 3 and n is the number of samples in the sample set D of the classified historical indicator data. The calculated standard deviation of the indicator is set as the neighborhood parameter of the threshold model, i.e., ε is set to the standard deviation of the indicator. At the same time, the number of samples in the corresponding sample set D of the classified historical indicator data is obtained, and min_points is set according to the formula n / 3. The core object set Ω = ∅, the number of clusters k = 0, the set of unvisited samples Γ = D, and the class partition C = ∅ are initialized, where ∅ represents the empty set, thereby realizing the parameterization of the threshold model and generating a parameterized threshold model. The parameterized threshold model clusters the classified historical indicator data through a clustering algorithm. The clustering algorithm is an existing algorithm, so it is not elaborated here.
[0081] S22. Based on the principle of having the most categories, select the historical indicator data with the most categories from the clustered historical indicator data, and use the historical indicator data as the master data. Obtain the maximum and minimum values of the master data, and generate the current performance indicator scenario threshold range based on the maximum and minimum values.
[0082] In this embodiment, after the classified historical indicator data is clustered in step S21, clustered historical indicator data is generated. The historical indicator data with the most categories is used as the master data, and the maximum and minimum values of the master data are obtained. Based on the maximum and minimum values, the threshold range of the indicator scenario for the current performance is generated.
[0083] S23. Input the historical indicator data without abnormalities into the threshold model, calculate the standard deviation of the historical indicator data without abnormalities, and set the parameters of the threshold model according to the standard deviation of the indicators to generate a parameterized threshold model. The parameterized threshold model clusters the historical indicator data without abnormalities using a clustering algorithm to generate clustered historical indicator data.
[0084] S24. Based on the principle of having the most categories, select the historical indicator data with the most categories from the clustered historical indicator data, and use the historical indicator data as the master data. Obtain the maximum and minimum values of the master data, and generate the current performance indicator cell threshold range based on the maximum and minimum values.
[0085] In this embodiment, not only the threshold range of the indicator scenario is calculated, but also the threshold range of the indicator cell is calculated. The method for calculating the threshold range of the indicator cell is similar to that for calculating the threshold range of the indicator scenario. That is, the standard deviation of the indicator data without abnormalities is calculated, and the parameters of the threshold model are set according to the standard deviation of the indicator. Then, clustering is performed by clustering algorithm, so as to select the master data according to the principle of the most categories, and the threshold range of the indicator cell for the current performance is generated according to the maximum and minimum values of the master data.
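The threshold calculation in steps S21 to S24 can be sketched as a small density-based clustering routine in Python. Several points are assumptions rather than statements of this description: the data is one-dimensional, the population standard deviation is used, a point counts as its own neighbor, and "the historical indicator data with the most categories" is read as the cluster containing the most data points. The names `cluster_1d` and `threshold_range` and the sample values are hypothetical.

```python
import statistics


def cluster_1d(values, eps, min_points):
    """Density-based clustering of 1-D values; returns a cluster id per value, -1 for noise."""
    n = len(values)
    # eps-neighborhood of each sample (the sample itself is included).
    neighbors = [[j for j in range(n) if abs(values[i] - values[j]) <= eps]
                 for i in range(n)]
    core = [len(nb) >= min_points for nb in neighbors]
    labels = [-1] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        # Grow a new cluster outward from an unvisited core object.
        labels[i] = cluster_id
        pending = [i]
        while pending:
            p = pending.pop()
            if not core[p]:
                continue  # border samples join the cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster_id
                    pending.append(q)
        cluster_id += 1
    return labels


def threshold_range(values):
    """Return (min, max) of the largest cluster, or None if no cluster forms."""
    eps = statistics.pstdev(values)   # standard deviation as the neighborhood parameter
    min_points = len(values) / 3      # min_points = n / 3
    labels = cluster_1d(values, eps, min_points)
    clustered = [label for label in labels if label != -1]
    if not clustered:
        return None
    main = max(set(clustered), key=clustered.count)  # assumed meaning of "master data"
    master = [v for v, label in zip(values, labels) if label == main]
    return min(master), max(master)


# Hypothetical daily success rates (%) for one indicator.
rates = [98.0, 98.1, 98.2, 98.3, 98.4, 98.5, 90.0]
scenario_threshold = threshold_range(rates)
```

The same routine would serve both levels: applied to all data of one scenario class it yields the indicator scenario threshold range, and applied to the history of a single cell it yields the indicator cell threshold range.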
[0086] S3. Construct a poor-quality cell identification model based on the indicator scenario threshold range and the indicator cell threshold range of each performance, and identify poor-quality cells through the poor-quality cell identification model.
[0087] In this embodiment, as shown in Figure 2, the poor-quality cell identification model is constructed from the indicator scenario threshold range and the indicator cell threshold range of each performance, and poor-quality cells are identified by the constructed model.
[0088] Specifically, in step S3, constructing a poor-quality cell identification model based on the indicator scenario threshold range and the indicator cell threshold range of each performance includes:
[0089] S31. Construct degradation features by combining the threshold range of the performance indicators for each scenario with the threshold range of the indicator cells according to preset degradation rules.
[0090] In this embodiment, as shown in Figure 2, the maintenance personnel set the degradation rules in advance. Once the indicator scenario threshold range and the indicator cell threshold range of each performance have been calculated, these ranges are used to determine, for each performance, whether it has degraded, the number of days of continuous degradation, and the number of days of degradation within the preset time. These three quantities form the degradation features. The preset degradation rules can be adjusted according to the actual situation.
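The three degradation features described above can be sketched as follows. This is an illustrative sketch only. It assumes that a day counts as degraded when the indicator value lies outside the indicator scenario threshold range or outside the indicator cell threshold range, and that the window covers the most recent days. The actual degradation rules are preset by maintenance personnel and are not given in the text. The function names, threshold ranges, and sample values are hypothetical.

```python
def is_degraded(value, scenario_range, cell_range):
    """A value is degraded if it lies outside the scenario range or the cell range."""
    def outside(v, bounds):
        low, high = bounds
        return v < low or v > high

    return outside(value, scenario_range) or outside(value, cell_range)


def degradation_features(daily_values, scenario_range, cell_range, window=7):
    """Build the three degradation features for one indicator of one cell.

    daily_values is ordered oldest to newest; the last entry is the current day.
    """
    flags = [is_degraded(v, scenario_range, cell_range) for v in daily_values]

    consecutive = 0
    for flag in reversed(flags):  # count back from the current day
        if not flag:
            break
        consecutive += 1

    return {
        "degraded": int(flags[-1]) if flags else 0,
        "consecutive_days": consecutive,
        "days_in_window": sum(flags[-window:]),
    }


# Hypothetical threshold ranges and ten days of values for one indicator.
features = degradation_features(
    [98.5, 97.0, 98.7, 98.8, 96.9, 98.6, 98.9, 97.2, 96.8, 96.5],
    scenario_range=(97.5, 100.0),
    cell_range=(98.0, 99.5),
)
print(features)
```

For an indicator where only low values indicate a problem, a one-sided check against the minimum would be a natural variant. The text does not specify which form the preset rules take.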
[0091] S32. Obtain a list of poor-quality cells from user complaints and mark the poor-quality cells in the list to generate marked poor-quality complaint cells. At the same time, obtain a list of measured poor-quality cells and mark the poor-quality cells in the list to generate marked poor-quality measured cells.
[0092] In this embodiment, a list of poor-quality cells reported by users is obtained and the poor-quality cells in the list of reported poor-quality cells are marked. At the same time, a list of poor-quality cells actually tested is obtained and the poor-quality cells in the list of actual poor-quality cells are marked. That is, both the poor-quality cells reported by users and the poor-quality cells actually tested are marked.
[0093] S33. Obtain the degradation feature data of the marked poor quality complaint cells and the marked poor quality measured cells according to the degradation features, and perform normalization processing on the degradation feature data to generate normalized degradation feature data.
[0094] In this embodiment, as shown in Figure 2, the degradation feature data corresponding to the marked poor-quality complaint cells and marked poor-quality measured cells from step S32 is obtained according to the degradation features constructed in step S31. The obtained degradation feature data is then normalized to (0,1): a performance that has degraded is marked as 1, and a performance that has not degraded is marked as 0.
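The marking in step S32 and the (0,1) normalization in step S33 can be sketched as follows. This is a sketch under assumptions: cells appearing in either the complaint list or the measured list are labeled 1, and all other cells are treated as label 0, which the text does not state explicitly. The function name `build_training_rows`, the cell identifiers, and the indicator names are hypothetical.

```python
def build_training_rows(degraded_by_cell, complaint_cells, measured_cells, indicators):
    """Return (X, y): 0/1 degradation rows and 0/1 poor-quality labels.

    degraded_by_cell maps a cell id to {indicator name: degraded or not}.
    Assumption: cells in neither list are used as negative examples.
    """
    poor = set(complaint_cells) | set(measured_cells)
    X, y = [], []
    for cell_id in sorted(degraded_by_cell):
        flags = degraded_by_cell[cell_id]
        # Normalization to (0,1): degraded performance -> 1, otherwise -> 0.
        X.append([1 if flags.get(name, False) else 0 for name in indicators])
        y.append(1 if cell_id in poor else 0)
    return X, y


indicators = ["lte_availability", "call_completion_rate", "coverage"]
degraded_by_cell = {
    "cell-A": {"lte_availability": True, "coverage": True},
    "cell-B": {},
    "cell-C": {"call_completion_rate": True},
}
X, y = build_training_rows(degraded_by_cell, ["cell-A"], ["cell-C"], indicators)
print(X, y)
```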
[0095] S34. Based on the normalized degradation feature data, construct a poor-quality cell identification model using the LightGBM classification algorithm, and identify poor-quality cells using the poor-quality cell identification model.
[0096] In this embodiment, as shown in Figure 2, a poor-quality cell identification model is constructed from the normalized degradation feature data of step S33 using the LightGBM classification algorithm, and this model is used to identify poor-quality cells.
[0097] Specifically, in step S34, identifying poor-quality cells using the poor-quality cell identification model includes:
[0098] S341. Through the poor-quality cell identification model, obtain the existing indicator data of each performance of the cell wireless network in real time, and compare the existing indicator data with the indicator scenario threshold range and the indicator cell threshold range. When the existing indicator data of a performance is not within the indicator scenario threshold range or the indicator cell threshold range, the current performance is considered degraded, and the degree of degradation of the current performance is calculated by the degradation formula.
[0099] In this embodiment, as shown in Figure 2, the poor-quality cell identification model acquires the existing indicator data of each performance of the cell wireless network in real time and compares it with the corresponding indicator scenario threshold range and indicator cell threshold range. If the existing indicator data does not fall within the corresponding threshold range, the current performance has degraded, and the degree of degradation is calculated using the following formula:
[0100]
[0101]
[0102] Where g(x) represents the degree of degradation of the current performance, max represents the maximum value of the scenario/cell threshold range, and min represents the minimum value of the scenario/cell threshold range.
[0103] S342. Obtain the number of days of continuous degradation of the current performance and the number of days of degradation within a preset time period based on the degradation characteristics.
[0104] In this embodiment, as shown in Figure 2, the number of days of continuous degradation of the current performance and the number of days of degradation within the preset time are obtained from the degradation features. Here the preset time is 7 days, so the number of days on which the current performance degraded within 7 days is obtained. The preset time can be adjusted according to the actual situation.
[0105] S343. Identify poor-quality cells based on the degradation degree, the number of days of continuous degradation, and the number of days of degradation within the preset time period.
[0106] In this embodiment, as shown in Figure 2, poor-quality cells are identified based on the degree of degradation of the current performance calculated in step S341, together with the number of days of continuous degradation and the number of days of degradation within the preset time obtained in step S342.
[0107] Example 2
[0108] Referring to Figure 3, a poor-quality cell identification device 1 includes a memory 3, a processor 2, and a computer program stored in the memory 3 and executable on the processor 2. When the processor 2 executes the computer program, it implements the steps in the above embodiment 1.
[0109] Since the systems/devices described in the above embodiments of the present invention are systems/devices used to implement the methods of the above embodiments of the present invention, those skilled in the art can understand the specific structure and modifications of the systems/devices based on the methods described in the above embodiments of the present invention, and therefore they will not be repeated here. All systems/devices used in the methods of the above embodiments of the present invention fall within the scope of protection of the present invention.
[0110] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, systems, or computer program products. Therefore, the present invention can take the form of a completely hardware embodiment, a completely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention can take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0111] This invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, as well as combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions.
[0112] It should be noted that any reference numerals placed between parentheses in the claims should not be construed as limiting the claims. The word "comprising" does not exclude the presence of components or steps not listed in the claims. The word "a" or "an" preceding a component does not exclude the presence of a plurality of such components. The invention can be implemented by means of hardware comprising several different components and by means of a suitably programmed computer. In claims that enumerate several means, several of these means may be embodied by the same hardware. The use of the terms first, second, third, etc., is merely for convenience of expression and does not indicate any order. These terms can be understood as part of the component names.
[0113] Furthermore, it should be noted that in the description of this specification, the terms "one embodiment," "some embodiments," "embodiment," "example," "specific example," or "some examples," etc., refer to specific features, structures, materials, or characteristics described in connection with that embodiment or example, which are included in at least one embodiment or example of the present invention. In this specification, the illustrative expressions of the above terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in one or more embodiments or examples. Furthermore, without contradiction, those skilled in the art can combine and integrate the different embodiments or examples described in this specification, as well as the features of different embodiments or examples.
[0114] Although preferred embodiments of the invention have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the claims should be interpreted to include both the preferred embodiments and all changes and modifications falling within the scope of the invention.
[0115] Obviously, those skilled in the art can make various modifications and variations to this invention without departing from its spirit and scope. Therefore, if these modifications and variations fall within the scope of the claims of this invention and their equivalents, then this invention should also include these modifications and variations.
Claims
1. A method for identifying poor-quality cells, characterized by comprising:
collecting historical indicator data of each performance of the cell wireless network, inputting the historical indicator data into an anomaly detection model for anomaly detection, and determining whether there is abnormal data in the historical indicator data; if so, removing the abnormal data from the historical indicator data to generate historical indicator data without anomalies;
inputting the historical indicator data without anomalies into a threshold model to calculate the indicator scenario threshold range and the indicator cell threshold range of each performance;
constructing a poor-quality cell identification model based on the indicator scenario threshold range and the indicator cell threshold range of each performance, and identifying poor-quality cells through the poor-quality cell identification model;
wherein inputting the historical indicator data without anomalies into the threshold model to calculate the indicator scenario threshold range of each performance comprises:
inputting the historical indicator data without anomalies into the threshold model, and classifying the historical indicator data without anomalies according to preset scenario classification rules to generate classified historical indicator data;
calculating the standard deviation of the classified historical indicator data, and setting the parameters of the threshold model according to the standard deviation to generate a parameterized threshold model;
clustering the classified historical indicator data with the parameterized threshold model using a clustering algorithm to generate clustered historical indicator data;
based on the principle of having the most categories, selecting the historical indicator data with the most categories from the clustered historical indicator data, and using that historical indicator data as the master data;
obtaining the maximum and minimum values of the master data, and generating the indicator scenario threshold range of the current performance based on the maximum and minimum values;
wherein inputting the historical indicator data without anomalies into the threshold model to calculate the indicator cell threshold range of each performance comprises:
inputting the historical indicator data without anomalies into the threshold model, calculating the standard deviation of the historical indicator data without anomalies, and setting the parameters of the threshold model according to the standard deviation to generate a parameterized threshold model;
clustering the historical indicator data without anomalies with the parameterized threshold model using a clustering algorithm to generate clustered historical indicator data;
based on the principle of having the most categories, selecting the historical indicator data with the most categories from the clustered historical indicator data, and using that historical indicator data as the master data;
obtaining the maximum and minimum values of the master data, and generating the indicator cell threshold range of the current performance based on the maximum and minimum values;
wherein constructing the poor-quality cell identification model based on the indicator scenario threshold range and the indicator cell threshold range of each performance comprises:
constructing degradation features by combining the indicator scenario threshold range and the indicator cell threshold range of each performance with preset degradation rules;
obtaining a list of poor-quality cells from user complaints and marking the poor-quality cells in the list to generate marked poor-quality complaint cells, and at the same time obtaining a list of measured poor-quality cells and marking the poor-quality cells in the list to generate marked poor-quality measured cells;
obtaining, according to the degradation features, the degradation feature data of the marked poor-quality complaint cells and the marked poor-quality measured cells, and normalizing the degradation feature data to generate normalized degradation feature data;
based on the normalized degradation feature data, constructing the poor-quality cell identification model using the LightGBM classification algorithm, and identifying poor-quality cells using the poor-quality cell identification model.
2. The method for identifying poor-quality cells as described in claim 1, characterized in that the step of collecting historical indicator data of each performance of the cell wireless network, inputting the historical indicator data into an anomaly detection model for anomaly detection, and determining whether there is abnormal data in the historical indicator data includes:
collecting historical indicator data of each performance of the cell wireless network, and randomly selecting a number of historical indicator data from the historical indicator data as a sample dataset;
using the sample dataset as the first root node, randomly selecting one historical indicator data from the first root node as a split point, and, based on the split point, splitting the first root node using the isolation forest outlier detection algorithm to generate the first child node of the first root node;
using the first child node as the second root node, randomly selecting one historical indicator data from the second root node as a split point, and, based on the split point, splitting the second root node using the isolation forest outlier detection algorithm to generate the second child node of the second root node;
if there is only one historical indicator data in the second child node, stopping the splitting; otherwise, determining whether all the historical indicator data in the second child node are the same, and if so, stopping the splitting; otherwise, determining whether the depth of the historical indicator data in the second child node has reached the depth threshold, and if so, stopping the splitting; otherwise, continuing the splitting to generate child nodes;
forming the anomaly detection model from the first root node and all its child nodes, and inputting the historical indicator data into the anomaly detection model for anomaly detection to determine whether there is any abnormal data in the historical indicator data.
3. The method for identifying poor-quality cells as described in claim 2, characterized in that the step of inputting the historical indicator data into the anomaly detection model for anomaly detection and determining whether there is abnormal data in the historical indicator data includes:
inputting the historical indicator data into the anomaly detection model so that each historical indicator data traverses every node of the anomaly detection model, calculating the depth of each historical indicator data at each node, and calculating the anomaly value from the depth using the following anomaly score formula:
where s(x) represents the anomaly value; h(x) = ln(x) + ξ, h(x) is the depth of the historical indicator data x in the node of the anomaly detection model, and ξ is Euler's constant; E[ ] is the average depth of the historical indicator data; and c(ψ) is the average path length of nodes in the anomaly detection model constructed with ψ nodes;
when the anomaly value exceeds the anomaly threshold, the historical indicator data is considered abnormal data.
4. The method for identifying poor-quality cells as described in claim 1, characterized in that the step of identifying poor-quality cells using the poor-quality cell identification model includes:
obtaining, through the poor-quality cell identification model, the existing indicator data of each performance of the cell wireless network in real time, and comparing the existing indicator data with the indicator scenario threshold range and the indicator cell threshold range; when the existing indicator data of a performance is not within the indicator scenario threshold range or the indicator cell threshold range, the current performance is considered degraded, and the degree of degradation of the current performance is calculated;
obtaining the number of days of continuous degradation of the current performance and the number of days of degradation within a preset time period based on the degradation features;
identifying poor-quality cells based on the degree of degradation, the number of days of continuous degradation, and the number of days of degradation within the preset time period.
5. A device for identifying poor-quality cells, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that when the processor executes the computer program, it implements the method for identifying poor-quality cells as described in any one of claims 1 to 4.