A data processing method, device, apparatus, and storage medium
By grouping and aggregating drive test data and matching it against landmark outlines with an improved cross-ray method, the method generates a landmark-data file association table. This addresses the storage burden and analysis bias caused by inadequate data preprocessing in existing technologies, and makes data processing more efficient and its results more accurate.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA TELECOM CORP LTD
- Filing Date
- 2023-08-16
- Publication Date
- 2026-06-26
AI Technical Summary
Existing technologies fail to effectively combine business and analysis algorithms in drive test data preprocessing, resulting in excessively large data volumes, increased storage burden, and biased analysis results.
By acquiring road test data files based on landmark facilities, grouping and aggregating them, adding tile numbers, and using an improved cross-ray method to match landmark outlines, an updated landmark-data file association table is generated.
It improves the accuracy of data results and the linearization of analysis results, reduces the computation time of data processing, and adapts to the actual configuration requirements of big data platforms.
Figure CN117194596B_ABST (abstract figure; image not reproduced)
Abstract
Description
Technical Field
[0001] This invention relates to the field of communication technology, and in particular to a data processing method, apparatus, device, and storage medium.
Background Technology
[0002] In wireless network optimization, drive testing (DT) is a fundamental task. By connecting terminals with service capabilities to drive testing equipment, coverage and signaling interaction data are collected from the existing network. This provides a relatively objective and comprehensive understanding of the 4G / 5G network operation and service capabilities. However, in current technologies, the preprocessing of drive test data is not well integrated with services and analysis algorithms, which not only affects the efficiency of the algorithms but also causes deviations in the analysis results.
Summary of the Invention
[0003] In view of the above problems, embodiments of the present invention are proposed to provide a data processing method, apparatus, device, and storage medium that overcome or at least partially solve the above problems.
[0004] To address the above problems, this invention discloses a data processing method, the method comprising:
[0005] Obtain drive test data files based on landmark facilities;
[0006] The road test data in the road test data file is grouped to obtain multiple groups of road test data;
[0007] Multiple sets of road test data are aggregated and calculated to obtain multiple aggregation results;
[0008] Tile numbers are added to the multiple aggregation results to obtain multiple target data points;
[0009] Based on the improved cross-ray method, the target data points are matched with the outlines of landmarks to obtain an updated landmark-data file association table.
[0010] Optionally, the step of grouping the road test data in the road test data file to obtain multiple groups of road test data includes:
[0011] Obtain the instantaneous and average speeds of the data points in the road test data;
[0012] The state of the test equipment is determined based on the instantaneous and average speeds of the road test data.
[0013] Based on the state of the test equipment, the road test data is stored in the corresponding array.
[0014] Optionally, determining the state of the testing equipment based on the instantaneous and average velocities of the data points in the road test data includes:
[0015] If the instantaneous velocity is less than the first instantaneous velocity threshold and the average velocity is less than the average velocity threshold, then the test equipment is in a stationary state.
[0016] The step of storing the road test data into a corresponding array according to the state of the test equipment includes:
[0017] If the test equipment is stationary, the road test data is stored in the first array.
[0018] Optionally, it also includes:
[0019] If the instantaneous velocity is greater than or equal to the second instantaneous velocity threshold, then the test device is in a fast fading state, wherein the first instantaneous velocity threshold is less than the second instantaneous velocity threshold;
[0020] The step of storing the road test data into a corresponding array according to the state of the test equipment includes:
[0021] If the test device is in a fast fading state, the road test data is stored in the second array.
[0022] Optionally, it also includes:
[0023] If the instantaneous velocity is greater than or equal to the first instantaneous velocity threshold and less than the second instantaneous velocity threshold, or if the average velocity is greater than the average velocity threshold, then the test device is in a slow fading state.
[0024] The step of storing the road test data into a corresponding array according to the state of the test equipment includes:
[0025] If the test equipment is in a slow fading state, the road test data is stored in the third array.
[0026] Optionally, the aggregation calculation of the multiple sets of road test data is performed to obtain multiple aggregation results, including:
[0027] For the multiple sets of road test data, an aggregation function is used to calculate the aggregation result corresponding to each set of road test data. The aggregation function includes one of the following: summation, maximum value, average value, and first valid value.
[0028] Optionally, adding tile numbers to the multiple aggregation results to obtain multiple target data points includes:
[0029] For each of the multiple aggregation results, obtain the longitude and latitude of each aggregation result;
[0030] Calculate the tile number corresponding to each aggregation result based on the longitude and latitude of each aggregation result;
[0031] Each aggregation result is assigned a corresponding tile number to obtain multiple target data points.
[0032] Optionally, the step of matching the plurality of target data points with the contours of landmarks according to the improved cross-ray method to obtain an updated landmark-data file association table includes:
[0033] Obtain the latitude and longitude range of landmark facilities;
[0034] The multiple target data points are filtered based on the latitude and longitude range of the landmark facilities;
[0035] The improved cross-ray method is used to determine whether the filtered data points are located within the polygon defined by the landmark. The polygon defined by the landmark is determined according to the contour information in the landmark information table. The contour information includes the longitude and latitude of multiple points on the polygon defined by the contour.
[0036] If the location is within the polygon defined by the landmark, the outline of each type of landmark facility information is matched to generate a new landmark and data file association table.
[0037] Optionally, determining whether the filtered data points are located within the polygon defined by the landmark using the improved cross-ray method includes:
[0038] Draw an eastward ray from the filtered data points and calculate the number of intersections between the eastward ray and the polygon;
[0039] If the number of intersection points is odd, then the filtered data points are determined to be located within the polygon defined by the landmark.
[0040] Optionally, it also includes:
[0041] If the number of intersection points is not odd, then draw south, west, and north rays from the filtered data points, and calculate the corresponding number of intersection points respectively;
[0042] If the northward and eastward rays emanating from the data point do not intersect with the polygon, but the southward and westward rays emanating from the data point both intersect with the polygon, then the data point is determined to be a point in the first quadrant.
[0043] Based on the first quadrant point, select the first target quadrant point located in the landmark facility auxiliary information table;
[0044] Remove the first target quadrant point from the landmark facility auxiliary information table, and add the first quadrant point to the landmark facility auxiliary information table.
[0045] Optionally, it also includes:
[0046] If neither the eastward ray nor the southward ray intersects the polygon, then the data point is determined to be a point in the second quadrant.
[0047] Based on the second quadrant points, select the second target quadrant points located in the landmark facility auxiliary information table;
[0048] Remove the second target quadrant point from the landmark facility auxiliary information table, and add the second quadrant point to the landmark facility auxiliary information table.
[0049] Optionally, it also includes:
[0050] If neither the southward ray nor the westward ray intersects the polygon, then the data point is determined to be a point in the third quadrant.
[0051] Based on the third quadrant points, select the third target quadrant points located in the landmark facility auxiliary information table;
[0052] The third target quadrant point is deleted from the landmark facility auxiliary information table, and the third quadrant point is added to the landmark facility auxiliary information table.
[0053] Optionally, it also includes:
[0054] If neither the westward ray nor the northward ray intersects the polygon, then the data point is determined to be a point in the fourth quadrant.
[0055] Based on the fourth quadrant point, select the fourth target quadrant point located in the landmark facility auxiliary information table;
[0056] The fourth target quadrant point is deleted from the landmark facility auxiliary information table, and the fourth quadrant point is added to the landmark facility auxiliary information table.
[0057] The present invention also discloses a data processing apparatus, the apparatus comprising:
[0058] The acquisition module is used to acquire drive test data files based on landmark facilities.
[0059] The grouping module is used to group the road test data in the road test data file to obtain multiple groups of road test data;
[0060] The calculation module is used to perform aggregation calculations on the multiple sets of road test data respectively to obtain multiple aggregation results;
[0061] The numbering module is used to add tile numbers to the multiple aggregation results to obtain multiple target data points;
[0062] The matching module is used to match the target data points with the outlines of landmarks using the improved cross-ray method, so as to obtain an updated landmark-data file association table.
[0063] The present invention also discloses an electronic device, comprising: a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the steps of the data processing method as described in any of the preceding claims.
[0064] The present invention also discloses a computer-readable storage medium storing a computer program, which, when executed by a processor, implements the steps of the data processing method as described in any one of the above descriptions.
[0065] The embodiments of the present invention have the following advantages:
[0066] This invention discloses a data processing method that involves acquiring a drive test data file based on landmark facilities; grouping the drive test data in the drive test data file to obtain multiple sets of drive test data; performing aggregation calculations on the multiple sets of drive test data to obtain multiple aggregation results; adding tile numbers to each of the multiple aggregation results to obtain multiple target data points; and matching the target data points with the outlines of landmarks using an improved cross-ray method to obtain an updated landmark-data file association table. This invention subdivides the data preprocessing work into two stages, flexibly adapting to the actual configuration requirements of the data platform. By using the improved cross-ray method to match landmark facilities, the analysis results are linearized and standardized, improving the accuracy of the data results.
Attached Figure Description
[0067] Figure 1 is a flowchart of the steps of a data processing method provided in an embodiment of the present invention;
[0068] Figure 2 is a flowchart of another data processing method provided in an embodiment of the present invention;
[0069] Figure 3 is a schematic diagram of a landmark outline provided in an embodiment of the present invention;
[0070] Figure 4 is a schematic diagram of a cross ray provided in an embodiment of the present invention;
[0071] Figure 5 is a schematic diagram of another landmark outline provided in an embodiment of the present invention;
[0072] Figure 6 is a flowchart of a method for determining the position of data points, provided by an embodiment of the present invention;
[0073] Figure 7 is a structural block diagram of a data processing device provided in an embodiment of the present invention.
Detailed Implementation
[0074] To make the above-mentioned objects, features and advantages of the present invention more apparent and understandable, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments.
[0075] In wireless network optimization, drive testing (DT) is a fundamental task. By connecting service-capable terminals to drive testing equipment, coverage and signaling interaction data are collected from the existing network. This provides a relatively objective and comprehensive understanding of the 4G / 5G network's operational status and service capabilities. Utilizing 4G / 5G drive test data can not only effectively guide network optimization but also provide multi-layered data support for services such as connected vehicles and the Internet of Things.
[0076] Road test software backends all provide road test data analysis functions, but these are limited to the analysis of a single data point or a few data points. To perform joint analysis on a large amount of test data collected over a period of time, a relational database or a NoSQL big data platform (data lake) can be built to store the road test data; data analysis scripts are then written to run the analysis periodically, and finally a B / S (browser / server) architecture program presents the analysis results to the user.
[0077] For the requirements of big data platforms (or data lakes), the data preprocessing process is the ETL (Extract, Transform, Load) process. Road test data preprocessing also follows this process. It mainly involves extracting data rows (or data documents) that are valuable for subsequent analysis based on the business characteristics of 4G / 5G road test data, removing data with no or very low information content, merging them according to certain rules, and finally loading (writing) them into the database or big data platform.
[0078] In existing technologies, data preprocessing focuses more on the E (extraction) and L (loading) stages of ETL, while paying less attention to T (transformation). This leads to two problems: First, the data volume is too large, increasing the storage burden on the database or data platform, which is detrimental to improving the efficiency of subsequent analysis and computation algorithms. Moreover, some data platforms (data lakes) have limits on the amount of data that can be imported at one time, which can cause data import failures. Second, the imported data is not well integrated with business and analysis algorithms, which not only affects the efficiency of the algorithms, but more importantly, inappropriate preprocessing algorithms can cause deviations in the analysis results.
[0079] One of the core concepts of this invention is to acquire a road test data file based on landmark facilities; group the road test data in the road test data file to obtain multiple sets of road test data; perform aggregation calculations on the multiple sets of road test data to obtain multiple aggregation results; add tile numbers to the multiple aggregation results to obtain multiple target data points; and match the target data points with the outlines of landmarks using an improved cross-ray method to obtain an updated landmark-data file association table. This invention subdivides the data preprocessing work into two stages, flexibly adapting to the actual configuration needs of the data platform. By matching landmark facilities using the improved cross-ray method, the analysis results are linearized and standardized, improving the accuracy of the data results.
[0080] Referring to Figure 1, a flowchart of the steps of a data processing method provided by an embodiment of the present invention is shown. The method may specifically include the following steps:
[0081] Step 101: Obtain the road test data file based on landmark facilities.
[0082] In this embodiment of the invention, a road test data file based on landmark facilities can be exported from the road test software. The data in the road test data file is stored in the form of a table. The fields of the data table in the road test data file fall into two parts: required fields and optional fields. The required fields include longitude, latitude, physical cell identifier, base station number, cell number, and fields indicating coverage (such as RSRP and SINR), as shown in Table 1. The optional fields are determined by the services tested in the road test data (including 4G, NSA, SA, NB-IoT, VoLTE, and VoNR, etc.), and include fields such as CQI, MCS, and uplink and downlink rates.
[0083]
Field name | Field type | Field category | Nullable
PC Time | Date and time | Timestamp | No
Longitude | Floating point | Location | No
Latitude | Floating point | Location | No
RSRP | Floating point | Coverage strength | Yes
SINR | Floating point | Coverage strength | Yes
PCI (Physical Cell Identifier) | Short integer | Cell information | Yes
Base station number | Integer | Cell information | Yes
Cell number | Short integer | Cell information | Yes
…… | …… | …… | ……
[0084] Table 1
[0085] Step 102: Group the road test data in the road test data file to obtain multiple groups of road test data.
[0086] In this embodiment of the invention, the road test data in a road test data file can be grouped based on the characteristics of the road test data to obtain multiple sets of road test data.
[0087] Step 103: Perform aggregation calculations on multiple sets of road test data to obtain multiple aggregation results.
[0088] In this embodiment of the invention, after obtaining multiple sets of road test data, each set of road test data can be aggregated and calculated using a preset algorithm to obtain multiple aggregated results. In one example, the average value of each set of road test data can be calculated to obtain the average value of multiple sets of road test data, i.e., multiple aggregated results.
[0089] Step 104: Add tile numbers to the multiple aggregation results to obtain multiple target data points.
[0090] In this embodiment of the invention, after performing aggregation calculations on multiple sets of road test data to obtain multiple aggregation results, tile numbers can be added to each set of aggregation results to obtain multiple target data points. In one example, multiple aggregation results can be numbered sequentially according to the size of different aggregation results to obtain multiple target data points.
[0091] Step 105: Based on the improved cross-ray method, match the target data points with the outlines of the landmarks to obtain the updated landmark-data file association table.
[0092] In this embodiment of the invention, the improved cross-ray method can be used to match multiple determined target data points with the contours of landmarks, thereby improving the stability of the matching process.
[0093] This invention acquires drive-test data files based on landmark facilities; groups the drive-test data in the files to obtain multiple sets of drive-test data; aggregates these multiple sets of data to obtain multiple aggregation results; adds tile numbers to each aggregation result to obtain multiple target data points; and matches these target data points with the outlines of landmarks using an improved cross-ray method to obtain an updated landmark-data file association table. This invention subdivides the data preprocessing work into two stages, flexibly adapting to the actual configuration requirements of the data platform. The improved cross-ray method for matching landmark facilities linearizes and standardizes the analysis results, improving the accuracy of the data results.
[0094] Referring to Figure 2, a flowchart of another data processing method provided by an embodiment of the present invention is shown. The method may specifically include the following steps:
[0095] Step 201: Obtain the road test data file based on landmark facilities.
[0096] Step 202: Obtain the instantaneous speed and average speed of the data points in the road test data.
[0097] In this embodiment of the invention, the instantaneous velocity $v_s$ and the average velocity $v_t$ of the data points in the road test data can be obtained.
[0098] Step 203: Determine the state of the test equipment based on the instantaneous speed and average speed of the road test data.
[0099] In this embodiment of the invention, the status of the test equipment can be determined based on the instantaneous velocity $v_s$ and the average velocity $v_t$ of the data point.
[0100] In one embodiment of the present invention, if the instantaneous speed is less than a first instantaneous speed threshold and the average speed is less than an average speed threshold, then the test device is in a stationary state.
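[0101] In this embodiment of the invention, a first instantaneous velocity threshold and an average velocity threshold can be set. If the instantaneous velocity $v_s$ is less than the first instantaneous velocity threshold and the average velocity $v_t$ is less than the average velocity threshold, the test equipment is determined to be in a stationary state. (The threshold symbols, their values, and the equivalent expressions given at this point are formulas that are not reproduced in the source text.)
[0102] The condition for determining the stationary state of the test equipment combines the two threshold comparisons (formulas not reproduced in the source text), where the subscripts j and k must satisfy j < k. When the user is in a quasi-static state, the timestamp is used as the basis for grouping (grouping expression not reproduced in the source text), and the array $[R_j, \ldots, R_{k-1}]$ is output.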
[0103] In one embodiment of the present invention, if the instantaneous velocity is greater than or equal to a second instantaneous velocity threshold, the test device is in a fast fading state, wherein the first instantaneous velocity threshold is less than the second instantaneous velocity threshold.
[0104] In this embodiment of the invention, the fast fading condition is given by a formula that is not reproduced in the source text. Based on this formula, the instantaneous velocity threshold for fast fading under various typical frequency configurations can be obtained, as shown in Table 2.
[0105]
[0106] Table 2
[0107] In this embodiment of the invention, the RSRP change threshold is set to $T_{RSRP} = 3\,\mathrm{dB} / 100\,\mathrm{ms}$, so the coverage condition is $\Delta_{RSRP}(k) > T_{RSRP}$. Combining the two conditions, the test device is determined to be in a fast fading state if the instantaneous velocity condition holds (formula not reproduced in the source text) or $\Delta_{RSRP}(k) > T_{RSRP}$. In the fast fading state, grouping is immediate: as soon as this criterion is met, the array $[R_j, \ldots, R_{k-1}]$ is output.
[0108] In one embodiment of the present invention, if the instantaneous velocity is greater than or equal to a first instantaneous velocity threshold and less than a second instantaneous velocity threshold, or if the average velocity is greater than the average velocity threshold, then the test device is in a slow fading state.
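[0109] In this embodiment of the invention, if either of these two conditions holds (the corresponding formulas are not reproduced in the source text), the test equipment is determined to be in a slow fading state.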
[0110] Step 204: Store the road test data into the corresponding array according to the status of the test equipment.
[0111] In this embodiment of the invention, if the test device is in a stationary state, the road test data is stored in the first array; if the test device is in a fast fading state, the road test data is stored in the second array; and if the test device is in a slow fading state, the road test data is stored in the third array, thereby enabling the detection and grouping of different road test data.
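The state decision and the three arrays can be sketched as follows. This is a minimal illustration, not the patented implementation: the threshold values are hypothetical (the text does not give them here), the fast fading check is evaluated first because the stated conditions can overlap, and any point not covered by the stationary or fast fading rules is placed in the slow fading array.

```python
from enum import Enum


class State(Enum):
    STATIONARY = "stationary"      # first array
    FAST_FADING = "fast_fading"    # second array
    SLOW_FADING = "slow_fading"    # third array


# Hypothetical thresholds in m/s (assumptions, not values from the text).
V_S1 = 0.5  # first instantaneous velocity threshold
V_S2 = 8.0  # second instantaneous velocity threshold, must exceed V_S1
V_T = 0.5   # average velocity threshold


def classify(v_s: float, v_t: float) -> State:
    """Decide the test equipment state from instantaneous and average velocity."""
    if v_s >= V_S2:
        return State.FAST_FADING
    if v_s < V_S1 and v_t < V_T:
        return State.STATIONARY
    # Remaining cases: V_S1 <= v_s < V_S2, or the average velocity is not
    # below its threshold (boundary handling here is an assumption).
    return State.SLOW_FADING


def group_by_state(points):
    """Store each data point in the array that matches its state."""
    arrays = {state: [] for state in State}
    for p in points:
        arrays[classify(p["v_s"], p["v_t"])].append(p)
    return arrays
```

Keeping the three arrays separate lets each state use its own grouping rule later, for example timestamp-based grouping for stationary points and immediate output for fast fading points.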
[0112] Step 205: For multiple sets of road test data, use aggregation functions to calculate the aggregation result corresponding to each set of road test data. Aggregation functions include: summation, maximum value, average value, and the first valid value.
[0113] In this embodiment of the invention, let the output array be $R_{j,k} = [R_j, \ldots, R_{k-1}]$, and let the relation for the data table being processed be $R = R(A, B, C, \ldots)$, where $A, B, C, \ldots$ are the fields of $R$. Before an aggregation operation is performed on field $A$, the non-null values of that field need to be selected, using the relational expression $\pi_{A \neq \mathrm{null}}(\sigma_A(R))$. Mathematical operations such as summation, maximum value, and average value are then performed on the resulting scalar array. Each field is processed once in this way, and finally a single record with the same fields as the array is output. Table 3 shows the aggregate functions applied to some typical fields in an embodiment of the present invention.
[0114]
[0115] Table 3
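The null filtering and per-field aggregation described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names and the field-to-function mapping in `AGGREGATES` are hypothetical (Table 3 is not reproduced in the text), and rows are assumed to be dictionaries in which a missing measurement is `None`.

```python
from statistics import mean


def non_null(values):
    """Keep only the non-null values of one field."""
    return [v for v in values if v is not None]


def first_valid(values):
    """Return the first non-null value, or None if there is none."""
    vals = non_null(values)
    return vals[0] if vals else None


def agg(func):
    """Wrap an aggregate so it ignores nulls and returns None for an empty field."""
    def wrapper(values):
        vals = non_null(values)
        return func(vals) if vals else None
    return wrapper


# Hypothetical field -> aggregate mapping (assumption, not taken from Table 3).
AGGREGATES = {
    "rsrp": agg(mean),
    "sinr": agg(mean),
    "pci": first_valid,
    "dl_bytes": agg(sum),
    "peak_rate": agg(max),
}


def aggregate_group(rows):
    """Collapse one group [R_j, ..., R_{k-1}] into a single record."""
    return {
        field: func([row.get(field) for row in rows])
        for field, func in AGGREGATES.items()
    }
```

Each field is visited once, so a group of any length yields exactly one output record with the same fields as its input rows.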
[0116] In one embodiment of the present invention, after aggregating and calculating each set of road test data, the total number of test points and the number of covered test points in the data file can be calculated. Specifically, the valid coverage records can be selected as $S = \pi_{RSRP \neq \mathrm{null} \wedge SINR \neq \mathrm{null}}(R)$; the total number of test points is then $n = \mathrm{count}(S)$, and the number of covered points is $n_c = \mathrm{count}(\pi_{RSRP > -105\,\mathrm{dBm} \wedge SINR > -3\,\mathrm{dB}}(S))$, where -105 dBm and -3 dB are the configured RSRP and SINR thresholds.
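As a small worked illustration of the two counts, the sketch below computes the total and covered test points using the thresholds stated in the text (-105 dBm and -3 dB). The row layout (dictionaries with `rsrp` and `sinr` keys, `None` for missing values) is an assumption made for the example.

```python
RSRP_THRESHOLD_DBM = -105.0  # threshold stated in the text
SINR_THRESHOLD_DB = -3.0     # threshold stated in the text


def coverage_counts(rows):
    """Return (n, n_c): valid test points and covered test points."""
    # S: records in which both RSRP and SINR are present.
    valid = [
        r for r in rows
        if r.get("rsrp") is not None and r.get("sinr") is not None
    ]
    # Covered points: both measurements strictly above their thresholds.
    covered = [
        r for r in valid
        if r["rsrp"] > RSRP_THRESHOLD_DBM and r["sinr"] > SINR_THRESHOLD_DB
    ]
    return len(valid), len(covered)
```

A coverage ratio, if needed, would then be `n_c / n` for any file with at least one valid test point.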
[0117] Step 206: For multiple aggregation results, obtain the longitude and latitude of each aggregation result.
[0118] In this embodiment of the invention, each aggregation result carries longitude and latitude information. In one example, the longitude and latitude coordinates of the obtained aggregation result are (x, y).
[0119] Step 207: Calculate the tile number corresponding to each aggregation result based on the longitude and latitude of each aggregation result.
[0120] In this embodiment of the invention, it is assumed that the longitude and latitude ranges of the study area are, respectively,
[0121] $x_0 \le x < x_0 + K \cdot \Delta_x$
[0122] $y \ge y_0$
[0123] The tile number corresponding to each aggregation result can be calculated using formula (1):
[0124]
[0125] Here, the length (longitude span) and width (latitude span) of the rectangular tile are $\Delta_x$ and $\Delta_y$, respectively; $K$ is a positive integer, $x_0$ is a longitude value, and $y_0$ is a latitude value.
[0126] Step 208: Add the corresponding tile number to each aggregation result to obtain multiple target data points.
[0127] In this embodiment of the invention, after determining the tile number of each aggregation result, a corresponding tile number can be added to each aggregation result to obtain multiple target data points.
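Formula (1) is not reproduced in the text, so the sketch below uses one plausible form that is consistent with the stated definitions: a grid of K tiles per row, numbered row by row from the south-west corner. The row-major numbering, the zero-based start, and the function name `tile_number` are all assumptions for illustration, not the formula of the invention.

```python
import math


def tile_number(x, y, x0, y0, dx, dy, k):
    """Assumed row-major tile index for a point at longitude x, latitude y.

    The study area satisfies x0 <= x < x0 + k * dx and y >= y0, as in the text.
    """
    if not (x0 <= x < x0 + k * dx) or y < y0:
        raise ValueError("point lies outside the study area")
    # Column index along longitude; clamp guards against float rounding at the edge.
    col = min(math.floor((x - x0) / dx), k - 1)
    # Row index along latitude (unbounded to the north, as y only has a lower bound).
    row = math.floor((y - y0) / dy)
    return row * k + col


def add_tile_numbers(results, x0, y0, dx, dy, k):
    """Attach a tile number to each aggregation result to form target data points."""
    return [
        {**r, "tile": tile_number(r["lon"], r["lat"], x0, y0, dx, dy, k)}
        for r in results
    ]
```

Whatever the exact numbering scheme, the point of the step is that data points falling in the same rectangular tile share one integer key, which makes later spatial lookups cheap.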
[0128] Step 209: Based on the improved cross-ray method, match the target data points with the outlines of the landmarks to obtain an updated landmark-data file association table.
[0129] In one embodiment of the present invention, step 209 may include the following sub-steps:
[0130] Sub-step S21: Obtain the latitude and longitude range of the landmark facility.
[0131] In this embodiment of the invention, the longitude and latitude of the lower left corner of the landmark facility rectangle can be set as (x1, y1), and the longitude and latitude of the upper right corner can be set as (x2, y2).
[0132] Sub-step S22: Filter multiple target data points based on the latitude and longitude range of the landmark facilities.
[0133] In this embodiment of the invention, multiple target data points can be preliminarily screened based on the longitude and latitude range of the landmark facility. Figure 3 shows a schematic diagram of a landmark outline provided by an embodiment of the present invention. The polygon represents the landmark's outline, and the rectangle (actually a curved rectangle on the Earth's surface) represents the landmark's longitude and latitude range. The points in the diagram are road test target data points. Clearly, if a test point is within the polygon, it must also be within the rectangle. Based on this principle, let the relation representing the road test data points be denoted R, with x and y being the fields of R that represent longitude and latitude, respectively. The relational algebra expression for the preliminary screening (not reproduced in the source text) keeps only the data points of R that lie within the rectangle, and is used to screen the multiple target data points. This preliminary screening filters out most of the data points that fall outside the landmark polygon.
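The preliminary screening can be sketched as a bounding-box test. This is a minimal illustration under two assumptions: points and outline vertices are (longitude, latitude) tuples, and the rectangle bounds are treated as inclusive, since the exact expression is not reproduced in the text.

```python
def bounding_box(outline):
    """Return (x1, y1, x2, y2): lower-left and upper-right corners of an outline."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return min(xs), min(ys), max(xs), max(ys)


def prefilter(points, box):
    """Keep only the points that fall inside the rectangle (bounds inclusive)."""
    x1, y1, x2, y2 = box
    return [(x, y) for x, y in points if x1 <= x <= x2 and y1 <= y <= y2]
```

Because the test costs four comparisons per point, it is worth running before any polygon test, which must visit every edge of the outline.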
[0134] Sub-step S23: Use the improved cross-ray method to determine whether the filtered data points are located within the polygon defined by the landmark. The polygon defined by the landmark is determined based on the contour information in the landmark information table. The contour information includes the longitude and latitude of multiple points on the polygon defined by the contour.
[0135] In this embodiment of the invention, after filtering multiple target data points, the improved cross-ray method can be used to determine whether the filtered data points are located within the polygon defined by the landmark.
[0136] In one embodiment of the present invention, determining whether the filtered data points are located within the polygon defined by the landmark using the improved cross-ray method includes: drawing an eastward ray from the filtered data points and calculating the number of intersections between the eastward ray and the polygon; if the number of intersections is odd, then it is determined that the filtered data points are located within the polygon defined by the landmark.
[0137] In embodiments of the present invention, Figure 4 shows a schematic diagram of a cross ray provided by an embodiment of the present invention. A ray can be drawn from the filtered data point A. If it intersects the polygon at a single point (an odd number of intersections), point A is determined to be inside the polygon. Point B is a point outside the polygon: if the number of intersection points between its ray and the polygon is even, this indicates that the point is outside the polygon.
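The odd/even rule for the eastward ray can be sketched as the standard ray-casting test below. It is an illustration only: the polygon is assumed to be a list of (longitude, latitude) vertices in order, and the half-open edge rule used to handle rays that pass exactly through a vertex is a common convention, not something the text specifies.

```python
def crossings_east(point, polygon):
    """Count intersections between the eastward ray from point and the polygon edges."""
    px, py = point
    count = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Half-open rule: the edge counts only if its endpoints lie on
        # different sides of the horizontal line through the point.
        if (y1 > py) != (y2 > py):
            # Longitude at which the edge crosses that horizontal line.
            x_at = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at > px:
                count += 1
    return count


def is_inside(point, polygon):
    """A point is inside when its eastward ray crosses the outline an odd number of times."""
    return crossings_east(point, polygon) % 2 == 1
```

For example, with the square `[(0, 0), (4, 0), (4, 4), (0, 4)]`, the point `(2, 2)` yields one crossing and is inside, while `(-1, 2)` yields two crossings and is outside.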
[0138] In one embodiment of the present invention, if the number of intersection points is not odd, south-facing, west-facing, and north-facing rays are drawn from the filtered data points, and the corresponding number of intersection points is calculated respectively; if the north-facing and east-facing rays drawn from the data points have no intersection points with the polygon, and the south-facing and west-facing rays drawn from the data points both have intersection points with the polygon, then the data points are determined to be points in the first quadrant; based on the first quadrant points, first target quadrant points located in the landmark facility auxiliary information table are filtered out; the first target quadrant points are deleted from the landmark facility auxiliary information table, and the first quadrant points are added to the landmark facility auxiliary information table.
[0139] Specifically, Figure 5 shows a schematic diagram of another landmark outline provided by an embodiment of the present invention. The areas labeled I, II, III, and IV correspond to the first, second, third, and fourth quadrants, respectively. If the northward and eastward rays emanating from the data point do not intersect the polygon, and the southward and westward rays emanating from the data point both intersect the polygon, then the data point can be determined to be a point in the first quadrant. If neither the eastward nor the southward ray intersects the polygon, then the data point is determined to be a point in the second quadrant. If neither the southward nor the westward ray intersects the polygon, then the data point is determined to be a point in the third quadrant. If neither the westward nor the northward ray intersects the polygon, then the data point is determined to be a point in the fourth quadrant.
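The four-ray quadrant test can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the polygon is a list of (longitude, latitude) vertices, the rules are evaluated in the order the text lists them (I, II, III, IV), and points for which no rule applies (for example, a point in a notch of a concave outline) are returned as `None`, a case the text does not cover.

```python
def ray_hits(point, polygon):
    """Count polygon-edge intersections for the east, west, north and south rays."""
    px, py = point
    hits = {"east": 0, "west": 0, "north": 0, "south": 0}
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Edge crosses the horizontal line through the point: east or west hit.
        if (y1 > py) != (y2 > py):
            x_at = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at > px:
                hits["east"] += 1
            elif x_at < px:
                hits["west"] += 1
        # Edge crosses the vertical line through the point: north or south hit.
        if (x1 > px) != (x2 > px):
            y_at = y1 + (px - x1) * (y2 - y1) / (x2 - x1)
            if y_at > py:
                hits["north"] += 1
            elif y_at < py:
                hits["south"] += 1
    return hits


def classify_point(point, polygon):
    """Return 'inside', a quadrant label 'I'..'IV', or None if no rule applies."""
    h = ray_hits(point, polygon)
    if h["east"] % 2 == 1:
        return "inside"
    miss = {direction: count == 0 for direction, count in h.items()}
    if miss["north"] and miss["east"] and not miss["south"] and not miss["west"]:
        return "I"
    if miss["east"] and miss["south"]:
        return "II"
    if miss["south"] and miss["west"]:
        return "III"
    if miss["west"] and miss["north"]:
        return "IV"
    return None
```

With the diamond `[(2, 0), (4, 2), (2, 4), (0, 2)]`, the corner points `(3.5, 3.5)`, `(3.5, 0.5)`, `(0.5, 0.5)` and `(0.5, 3.5)` are classified as quadrants I, II, III and IV respectively, which matches the clockwise labelling implied by the rules.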
[0140] In this embodiment of the invention, after determining that the target data point is located outside the polygon and determining the quadrant in which the data point is located, the first target quadrant point in the landmark facility auxiliary information table can be selected according to the quadrant in which the data point is located, and then the data point is added to the landmark facility auxiliary information table, as shown in Table 4, which illustrates a landmark facility auxiliary information provided by this embodiment of the invention.
[0141]
[0142] Table 4
[0143] Specifically, if the data point is a first quadrant point, the first target quadrant point located in the landmark facility auxiliary information table can be selected based on the first quadrant point; the first target quadrant point can be deleted from the landmark facility auxiliary information table, and the first quadrant point can be added to the landmark facility auxiliary information table.
[0144] Specifically, if the latitude and longitude coordinates of the point in the first quadrant are P(x0, y0), then through the relational formula... Select the first target quadrant point from the landmark facility auxiliary information table, where S is the existing set of records in the landmark facility auxiliary information table, delete the selected first target quadrant point from the landmark facility auxiliary information table, and add point P to the landmark facility auxiliary information table.
[0145] This invention fully mines the landmark facility information in the road test data and marks each test data result table; the improved cross-ray algorithm introduces the concept of four quadrant points and a corresponding analysis algorithm, which improves matching efficiency.
[0146] If the data point is a point in the second quadrant, then based on the point in the second quadrant, the second target quadrant point located in the landmark facility auxiliary information table can be selected; the second target quadrant point can be deleted from the landmark facility auxiliary information table, and the second quadrant point can be added to the landmark facility auxiliary information table.
[0147] If the data point is a third quadrant point, then based on the third quadrant point, the third target quadrant point located in the landmark facility auxiliary information table can be filtered out; the third target quadrant point can be deleted from the landmark facility auxiliary information table, and the third quadrant point can be added to the landmark facility auxiliary information table.
[0148] If the data point is a fourth quadrant point, then based on the fourth quadrant point, the fourth target quadrant point located in the landmark facility auxiliary information table can be filtered out; the fourth target quadrant point can be deleted from the landmark facility auxiliary information table, and the fourth quadrant point can be added to the landmark facility auxiliary information table.
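The delete-then-add update of the landmark facility auxiliary information table can be sketched as follows. The relational formula that selects the target quadrant point is elided in the text, so this sketch does not reproduce it: the `select` argument is a hypothetical caller-supplied predicate standing in for that formula, and the record fields (`quadrant`, `lon`, `lat`) are likewise assumptions.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]  # hypothetical fields: "quadrant", "lon", "lat"


def update_aux_table(
    table: List[Record],
    point: Record,
    select: Callable[[Record, Record], bool],
) -> List[Record]:
    """Return a new table with the selected target quadrant points replaced.

    `select(point, record)` stands in for the relational formula that the
    description elides; it decides which existing records of the same
    quadrant are superseded by the new quadrant point.
    """
    kept = [
        r for r in table
        if not (r["quadrant"] == point["quadrant"] and select(point, r))
    ]
    kept.append(point)  # the new quadrant point is always added
    return kept


aux = [
    {"quadrant": 1, "lon": 1.0, "lat": 1.0},
    {"quadrant": 2, "lon": 3.0, "lat": 0.0},
]
new_point = {"quadrant": 1, "lon": 2.0, "lat": 2.0}
# Placeholder rule for the demo only: supersede every record in the same quadrant.
updated = update_aux_table(aux, new_point, lambda p, r: True)
```

Returning a new list rather than mutating the table in place keeps the sketch easy to test; a database-backed table would instead issue a delete followed by an insert.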
[0149] Figure 6 illustrates a flowchart of a method for determining the location of a data point according to an embodiment of the present invention. First, an eastward ray is drawn from the point, and the number of intersection points with the polygon is calculated. If the number of intersection points is odd, the point is determined to be inside the polygon and is added to the landmark and data file association table. If the number of intersection points is even, the numbers of intersection points for rays in the other three directions are calculated, and it is determined whether the point is a quadrant point. If it is not a quadrant point, the point is determined to be outside the polygon. If the point is a quadrant point, it is matched against the data in the landmark facility auxiliary information table and added to the landmark facility auxiliary information table.
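The decision flow of this flowchart can be sketched end to end as below. This is an illustrative reading, not the authors' code. Two points are assumptions: the function names and return labels are invented here, and for the second to fourth quadrants the sketch additionally requires the two opposite rays to hit the polygon, by analogy with the first quadrant rule (the description only states which rays miss).

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed layout: (longitude, latitude)


def ray_crossings(p: Point, polygon: List[Point], direction: str) -> int:
    """Count polygon edges crossed by a ray from p heading "E", "W", "N" or "S"."""
    vertical = direction in ("N", "S")
    positive = direction in ("E", "N")
    # For vertical rays swap the axes, so every case reduces to a horizontal ray.
    px, py = (p[1], p[0]) if vertical else p
    count = 0
    for i in range(len(polygon)):
        a, b = polygon[i], polygon[(i + 1) % len(polygon)]
        (x1, y1), (x2, y2) = ((a[1], a[0]), (b[1], b[0])) if vertical else (a, b)
        if (y1 > py) != (y2 > py):  # edge straddles the ray's line
            xc = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if (xc > px) if positive else (xc < px):
                count += 1
    return count


def locate(p: Point, polygon: List[Point]) -> str:
    """Return "inside", "quadrant 1".."quadrant 4", or "outside"."""
    east = ray_crossings(p, polygon, "E")
    if east % 2 == 1:
        return "inside"  # coarse screening succeeds on the first ray
    e = east > 0
    s, w, n = (ray_crossings(p, polygon, d) > 0 for d in ("S", "W", "N"))
    if not n and not e and s and w:
        return "quadrant 1"
    if not e and not s and w and n:  # assumption: opposite rays must hit
        return "quadrant 2"
    if not s and not w and n and e:  # assumption: opposite rays must hit
        return "quadrant 3"
    if not w and not n and e and s:  # assumption: opposite rays must hit
        return "quadrant 4"
    return "outside"


diamond = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]
```

For the diamond above, `(2.0, 2.0)` is inside, `(3.5, 3.5)` sits in the north-east corner of the bounding range and is classified as a first quadrant point, and `(10.0, 10.0)` is outside.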
[0150] In this embodiment of the invention, the existing cross-ray method is improved by dividing the process of matching road test data with landmark facilities into two steps: coarse screening and fine matching. A self-learning mechanism is introduced to significantly increase the proportion of points handled in the coarse screening stage, which has a very short computation time, and correspondingly reduce the proportion handled in the fine matching stage, which has a longer computation time, thereby reducing the overall computation time of matching road test data with landmark facilities.
[0151] Sub-step S24: If it is located within the polygon defined by the landmark, then match the outline of the landmark facility information for each type to generate a new landmark and data file association table.
[0152] In this embodiment of the invention, if it is determined that the data point is located within the polygon, the outline of each type of landmark facility information can be matched to generate a record of the landmark and data file association table in the database. This table may include the landmark number, data file number, number of test points, and number of test points that meet the coverage conditions. Table 5 shows a landmark and data file association table provided by this embodiment of the invention.
[0153]
[0154] Table 5
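A record of the landmark and data file association table, with the four fields listed above, might be modeled as in the following sketch. The class name, field names, the in-memory dictionary keyed by `(landmark, data file)`, and the `coverage_ratio` helper are assumptions for illustration; the description stores the records in a database.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class AssociationRecord:
    landmark_id: str          # landmark number
    data_file_id: str         # data file number
    test_points: int = 0      # number of test points
    covered_points: int = 0   # test points that meet the coverage conditions

    def coverage_ratio(self) -> float:
        """Share of test points meeting the coverage conditions (0.0 if empty)."""
        return self.covered_points / self.test_points if self.test_points else 0.0


Table = Dict[Tuple[str, str], AssociationRecord]


def add_matched_point(table: Table, landmark_id: str, data_file_id: str,
                      covered: bool) -> AssociationRecord:
    """Update (or create) the record for a point matched inside a landmark."""
    key = (landmark_id, data_file_id)
    rec = table.setdefault(key, AssociationRecord(landmark_id, data_file_id))
    rec.test_points += 1
    rec.covered_points += int(covered)
    return rec


table: Table = {}
add_matched_point(table, "LM001", "F42", covered=True)
add_matched_point(table, "LM001", "F42", covered=False)
```

Keeping the two counters per record is what allows a quick coverage query per landmark without rescanning the raw test points.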
[0155] This invention utilizes an improved cross-ray method to establish and maintain a landmark-data file association table, significantly increasing the proportion of points handled in the initial screening stage, which has a short computation time, and correspondingly reducing the proportion handled in the fine matching stage, which has a longer computation time. This reduces the overall computation time of matching road test data with landmark facilities. Furthermore, establishing and maintaining the landmark-data file association table allows the coverage status of each landmark facility to be queried quickly.
[0156] This invention acquires drive-test data files based on landmark facilities; groups the drive-test data in the files to obtain multiple sets of drive-test data; aggregates these multiple sets of data to obtain multiple aggregation results; adds tile numbers to each aggregation result to obtain multiple target data points; and matches these target data points with the outlines of landmarks using an improved cross-ray method to obtain an updated landmark-data file association table. This invention subdivides the data preprocessing work into two stages, flexibly adapting to the actual configuration requirements of the data platform. The improved cross-ray method for matching landmark facilities linearizes and standardizes the analysis results, improving the accuracy of the data results.
[0157] It should be noted that, for the sake of simplicity, the method embodiments are all described as a series of actions. However, those skilled in the art should understand that the embodiments of the present invention are not limited to the described order of actions, because according to the embodiments of the present invention, some steps can be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily essential to the embodiments of the present invention.
[0158] Referring to Figure 7, a structural block diagram of a data processing device provided in an embodiment of the present invention is shown, which may specifically include the following modules:
[0159] The acquisition module 301 is used to acquire road test data files based on landmark facilities.
[0160] Grouping module 302 is used to group the road test data in the road test data file to obtain multiple groups of road test data;
[0161] The calculation module 303 is used to perform aggregation calculations on the multiple sets of road test data respectively to obtain multiple aggregation results;
[0162] The numbering module 304 is used to add tile numbers to the multiple aggregation results respectively to obtain multiple target data points;
[0163] The matching module 305 is used to match the target data points with the outlines of landmarks according to the improved cross-ray method, so as to obtain an updated landmark-data file association table.
[0164] This invention discloses a data processing apparatus that acquires a drive test data file based on landmark facilities; groups the drive test data in the drive test data file to obtain multiple sets of drive test data; performs aggregation calculations on the multiple sets of drive test data to obtain multiple aggregation results; adds tile numbers to each of the multiple aggregation results to obtain multiple target data points; and matches the target data points with the contours of landmarks using an improved cross-ray method to obtain an updated landmark-data file association table. This invention subdivides the data preprocessing work into two stages, flexibly adapting to the actual configuration requirements of the data platform. By using the improved cross-ray method to match landmark facilities, the analysis results are linearized and standardized, improving the accuracy of the data results.
[0165] In one embodiment of the present invention, the grouping module 302 may include:
[0166] The first acquisition submodule is used to acquire the instantaneous speed and average speed of the data points in the road test data;
[0167] The state determination submodule is used to determine the state of the test equipment based on the instantaneous speed and average speed of the road test data.
[0168] The storage submodule is used to store the road test data into a corresponding array according to the state of the test equipment.
[0169] In one embodiment of the present invention, the state determination submodule may include:
[0170] The first state determination unit is configured to determine that if the instantaneous speed is less than a first instantaneous speed threshold and the average speed is less than an average speed threshold, then the test device is in a stationary state.
[0171] The storage submodule may include: a first storage unit, used to store the road test data into a first array if the test equipment is in a stationary state.
[0172] In one embodiment of the present invention, it may further include:
[0173] The second state determination unit is used to determine that the test device is in a fast fading state if the instantaneous velocity is greater than or equal to a second instantaneous velocity threshold, wherein the first instantaneous velocity threshold is less than the second instantaneous velocity threshold.
[0174] The storage submodule may include: a second storage unit, used to store the drive test data into a second array if the test device is in a fast fading state.
[0175] In one embodiment of the present invention, it may further include:
[0176] The third state determination unit is used to determine that the test device is in a slow fading state if the instantaneous velocity is greater than or equal to the first instantaneous velocity threshold and less than the second instantaneous velocity threshold, or if the average velocity is greater than the average velocity threshold.
[0177] The storage submodule may include: a third storage unit, used to store the drive test data into a third array if the test device is in a slow fading state.
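The three state rules and the corresponding arrays can be sketched as follows. The threshold values, field names, and state labels are hypothetical. The rules as stated do not cover the boundary case where the instantaneous velocity is below the first threshold and the average velocity exactly equals the average threshold; this sketch returns `None` for it rather than guessing.

```python
from typing import Dict, List, Optional


def classify_state(inst: float, avg: float, v1: float, v2: float,
                   avg_thr: float) -> Optional[str]:
    """Classify the test device state from instantaneous and average velocity.

    v1 < v2 are the first and second instantaneous velocity thresholds.
    """
    if inst < v1 and avg < avg_thr:
        return "stationary"
    if inst >= v2:
        return "fast_fading"
    if v1 <= inst < v2 or avg > avg_thr:
        return "slow_fading"
    return None  # inst < v1 and avg == avg_thr: not covered by the stated rules


def group_by_state(records: List[dict], v1: float, v2: float,
                   avg_thr: float) -> Dict[str, List[dict]]:
    """Store each record in the array that matches the device state."""
    arrays: Dict[str, List[dict]] = {
        "stationary": [], "fast_fading": [], "slow_fading": [],
    }
    for rec in records:
        state = classify_state(rec["inst"], rec["avg"], v1, v2, avg_thr)
        if state is not None:
            arrays[state].append(rec)
    return arrays


# Hypothetical thresholds and samples, for illustration only.
samples = [
    {"inst": 0.5, "avg": 0.5},
    {"inst": 12.0, "avg": 8.0},
    {"inst": 5.0, "avg": 1.0},
    {"inst": 0.5, "avg": 3.0},
]
groups = group_by_state(samples, v1=1.0, v2=10.0, avg_thr=2.0)
```

The checks are ordered so that the stationary and fast fading rules take precedence; the slow fading rule then catches the remaining moving cases.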
[0178] In one embodiment of the present invention, the calculation module 303 may include:
[0179] The calculation submodule is used to calculate the aggregation result corresponding to each set of road test data using aggregation functions. The aggregation functions include: summation, maximum value, average value, and the first valid value.
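The four aggregation functions named above can be illustrated with a short sketch. Treating a "valid value" as one that is neither `None` nor NaN is an assumption, as are the function name and the `how` labels.

```python
import math
from typing import Optional, Sequence


def _is_valid(v) -> bool:
    """Assumed definition of a valid value: not None and not NaN."""
    return v is not None and not (isinstance(v, float) and math.isnan(v))


def aggregate(values: Sequence[Optional[float]], how: str) -> Optional[float]:
    """Aggregate one group of measurements; returns None if nothing is valid."""
    valid = [v for v in values if _is_valid(v)]
    if not valid:
        return None
    if how == "sum":
        return sum(valid)
    if how == "max":
        return max(valid)
    if how == "mean":
        return sum(valid) / len(valid)
    if how == "first":
        return valid[0]  # first valid value in arrival order
    raise ValueError(f"unknown aggregation: {how}")


# Hypothetical signal-strength samples with gaps.
rsrp = [None, float("nan"), -95.0, -90.0]
```

Which function applies to which field is a per-field choice; for example, a counter might be summed while a position or identifier would typically take the first valid value.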
[0180] In one embodiment of the present invention, the numbering module 304 may include:
[0181] The second acquisition submodule is used to acquire the longitude and latitude of each of the multiple aggregation results.
[0182] The numbering calculation submodule is used to calculate the tile number corresponding to each aggregation result based on the longitude and latitude of each aggregation result;
[0183] The numbering submodule is used to add a corresponding tile number to each aggregation result to obtain multiple target data points.
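As an illustration of deriving a tile number from longitude and latitude, the sketch below uses the widely known Web Mercator "slippy map" tiling scheme. This is an assumption: this passage does not state which tiling scheme or zoom level is used, so the formula, the `zoom` parameter, and the field names are stand-ins only.

```python
import math
from typing import Dict, List, Tuple


def tile_number(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Web Mercator tile (x, y) containing the given longitude and latitude."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def add_tile_numbers(results: List[Dict], zoom: int) -> List[Dict]:
    """Return target data points: each aggregation result plus its tile number."""
    return [
        {**r, "tile": tile_number(r["lon"], r["lat"], zoom)}
        for r in results
    ]


# Hypothetical aggregation result at the origin, zoom level 1.
points = add_tile_numbers([{"lon": 0.0, "lat": 0.0, "value": -90.0}], zoom=1)
```

Attaching a tile number lets later stages bucket points spatially before any polygon test is run.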
[0184] In one embodiment of the present invention, the matching module 305 may include:
[0185] The third acquisition submodule retrieves the latitude and longitude range of landmark facilities;
[0186] The filtering submodule is used to filter the multiple target data points according to the latitude and longitude range of the landmark facilities;
[0187] The judgment submodule is used to determine whether the filtered data points are located within the polygon defined by the landmark using an improved cross ray method. The polygon defined by the landmark is determined based on the contour information in the landmark information table. The contour information includes the longitude and latitude of multiple points on the polygon defined by the contour.
[0188] The matching submodule is used to match the outline of each type of landmark facility information if it is located within the polygon defined by the landmark, and generate a new landmark and data file association table.
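The filtering step performed before the cross-ray test can be sketched as a bounding-range check. Deriving the latitude and longitude range from the outline vertices is an assumption (the range could equally be stored with the landmark), and the function names are illustrative.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # assumed layout: (longitude, latitude)


def lon_lat_range(outline: List[Point]) -> Tuple[float, float, float, float]:
    """Return (min_lon, max_lon, min_lat, max_lat) of a landmark outline."""
    lons = [p[0] for p in outline]
    lats = [p[1] for p in outline]
    return min(lons), max(lons), min(lats), max(lats)


def filter_by_range(points: List[Point], outline: List[Point]) -> List[Point]:
    """Keep only points inside the landmark's latitude and longitude range."""
    min_lon, max_lon, min_lat, max_lat = lon_lat_range(outline)
    return [
        p for p in points
        if min_lon <= p[0] <= max_lon and min_lat <= p[1] <= max_lat
    ]


outline = [(2.0, 0.0), (4.0, 2.0), (2.0, 4.0), (0.0, 2.0)]
candidates = filter_by_range([(2.0, 2.0), (3.5, 3.5), (10.0, 10.0)], outline)
```

Only the surviving candidates need the more expensive ray intersection counts; here `(10.0, 10.0)` is discarded by two comparisons per axis.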
[0189] In one embodiment of the present invention, the determination submodule may include:
[0190] The first calculation unit is used to draw an eastward ray from the filtered data points and calculate the number of intersections between the eastward ray and the polygon;
[0191] The first determining unit is used to determine that if the number of intersection points is odd, the filtered data points are located within the polygon defined by the landmark.
[0192] In one embodiment of the present invention, it further includes:
[0193] The second calculation unit is used to draw south, west, and north rays from the filtered data points if the number of intersection points is not odd, and calculate the corresponding number of intersection points respectively.
[0194] The second determining unit is used to determine the data point as a first quadrant point if the north-facing and east-facing rays emanating from the data point do not intersect with the polygon, and the south-facing and west-facing rays emanating from the data point both intersect with the polygon.
[0195] The matching submodule includes:
[0196] The first filtering unit is used to filter out the first target quadrant points located in the landmark facility auxiliary information table based on the first quadrant points;
[0197] The first deletion unit is used to delete the first target quadrant point from the landmark facility auxiliary information table and add the first quadrant point to the landmark facility auxiliary information table.
[0198] In one embodiment of the present invention, it further includes:
[0199] The third determining unit is used to determine that the data point is a second quadrant point if neither the eastward ray nor the southward ray intersects the polygon.
[0200] The second filtering unit filters out the second target quadrant points located in the landmark facility auxiliary information table based on the second quadrant points;
[0201] The second deletion unit deletes the second target quadrant point from the landmark facility auxiliary information table and adds the second quadrant point to the landmark facility auxiliary information table.
[0202] In one embodiment of the present invention, it further includes:
[0203] The fourth determining unit is used to determine that the data point is a third quadrant point if neither the southward ray nor the westward ray intersects the polygon.
[0204] The third filtering unit is used to filter out the third target quadrant points located in the landmark facility auxiliary information table based on the third quadrant points.
[0205] The third deletion unit is used to delete the third target quadrant point from the landmark facility auxiliary information table and add the third quadrant point to the landmark facility auxiliary information table.
[0206] In one embodiment of the present invention, it further includes:
[0207] The fifth determining unit is used to determine that the data point is a fourth quadrant point if neither the westward ray nor the northward ray intersects the polygon.
[0208] The fourth filtering unit is used to filter out the fourth target quadrant points located in the landmark facility auxiliary information table based on the fourth quadrant points;
[0209] The fourth deletion unit is used to delete the fourth target quadrant point from the landmark facility auxiliary information table and add the fourth quadrant point to the landmark facility auxiliary information table.
[0210] This invention discloses a data processing apparatus that acquires a drive test data file based on landmark facilities; groups the drive test data in the drive test data file to obtain multiple sets of drive test data; performs aggregation calculations on the multiple sets of drive test data to obtain multiple aggregation results; adds tile numbers to each of the multiple aggregation results to obtain multiple target data points; and matches the target data points with the contours of landmarks using an improved cross-ray method to obtain an updated landmark-data file association table. This invention subdivides the data preprocessing work into two stages, flexibly adapting to the actual configuration requirements of the data platform. By using the improved cross-ray method to match landmark facilities, the analysis results are linearized and standardized, improving the accuracy of the data results.
[0211] As the device embodiment is basically similar to the method embodiment, the description is relatively simple, and relevant parts can be found in the description of the method embodiment.
[0212] This invention also provides an electronic device, comprising:
[0213] It includes a processor, a memory, and a computer program stored in the memory and capable of running on the processor. When the computer program is executed by the processor, it implements the various processes of the above-described data processing method embodiments and achieves the same technical effect. To avoid repetition, it will not be described again here.
[0214] This invention also provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, it implements the various processes of the above-described data processing method embodiments and achieves the same technical effect. To avoid repetition, it will not be described again here.
[0215] The various embodiments in this specification are described in a progressive manner, with each embodiment focusing on the differences from other embodiments. The same or similar parts between the various embodiments can be referred to each other.
[0216] Those skilled in the art will understand that embodiments of the present invention can be provided as methods, apparatus, or computer program products. Therefore, embodiments of the present invention can take the form of entirely hardware embodiments, entirely software embodiments, or embodiments combining software and hardware aspects. Furthermore, embodiments of the present invention can take the form of computer program products implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
[0217] This invention is described with reference to flowchart illustrations and / or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and / or block of the flowchart illustrations and / or block diagrams, and combinations of flows and / or blocks in the flowchart illustrations and / or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, produce means for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0218] These computer program instructions may also be stored in a computer-readable storage medium that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable storage medium produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0219] These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, causing a series of operational steps to be performed on the computer or other programmable terminal equipment to produce a computer-implemented process, such that the instructions that execute on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more flows of the flowchart and / or one or more blocks of the block diagram.
[0220] Although preferred embodiments of the present invention have been described, those skilled in the art, upon learning the basic inventive concept, can make other changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments as well as all changes and modifications falling within the scope of the embodiments of the present invention.
[0221] Finally, it should be noted that in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes said element.
[0222] The above provides a detailed description of the data processing method, apparatus, device, and storage medium provided by the present invention. Specific examples have been used to illustrate the principles and implementation methods of the present invention. The description of the above embodiments is only for the purpose of helping to understand the method and core ideas of the present invention. At the same time, for those skilled in the art, there will be changes in the specific implementation methods and application scope based on the ideas of the present invention. Therefore, the content of this specification should not be construed as a limitation of the present invention.
Claims
1. A data processing method, characterized in that the method comprises: obtaining a drive test data file based on landmark facilities; grouping the drive test data in the drive test data file to obtain multiple groups of drive test data; performing aggregation calculations on the multiple groups of drive test data respectively to obtain multiple aggregation results, and calculating the total number of test points and the number of covered test points in the data file; adding tile numbers to the multiple aggregation results respectively to obtain multiple target data points; and matching the multiple target data points with the outlines of landmarks according to the improved cross-ray method to obtain an updated landmark-data file association table; wherein the step of matching the multiple target data points with the outlines of landmarks according to the improved cross-ray method to obtain an updated landmark-data file association table includes: obtaining the latitude and longitude range of the landmark facilities; filtering the multiple target data points based on the latitude and longitude range of the landmark facilities; determining, using the improved cross-ray method, whether the filtered data points are located within the polygon defined by the landmark, wherein the polygon defined by the landmark is determined based on the contour information in the landmark information table, and the contour information includes the longitude and latitude of multiple points on the polygon defined by the contour; and, if a filtered data point is located within the polygon defined by the landmark, matching the outline of each type of landmark facility information to generate a new landmark and data file association table; wherein the step of determining whether the filtered data points are located within the polygon defined by the landmark using the improved cross-ray method includes: drawing an eastward ray from the filtered data point and calculating the number of intersection points between the eastward ray and the polygon; if the number of intersection points is odd, determining that the filtered data point is located within the polygon defined by the landmark; if the number of intersection points is not odd, drawing southward, westward, and northward rays from the filtered data point and calculating the corresponding numbers of intersection points respectively; if the northward and eastward rays drawn from the data point do not intersect the polygon, and the southward and westward rays drawn from the data point both intersect the polygon, determining that the data point is a first quadrant point; selecting, based on the first quadrant point, the first target quadrant point located in the landmark facility auxiliary information table; and deleting the first target quadrant point from the landmark facility auxiliary information table and adding the first quadrant point to the landmark facility auxiliary information table.
2. The method according to claim 1, characterized in that, The process of grouping the road test data in the road test data file to obtain multiple groups of road test data includes: Obtain the instantaneous and average speeds of the data points in the road test data; The state of the test equipment is determined based on the instantaneous and average speeds of the road test data. Based on the state of the test equipment, the road test data is stored in the corresponding array.
3. The method according to claim 2, characterized in that, The step of determining the state of the test equipment based on the instantaneous and average velocities of the data points in the road test data includes: If the instantaneous velocity is less than the first instantaneous velocity threshold and the average velocity is less than the average velocity threshold, then the test equipment is in a stationary state. The step of storing the road test data into a corresponding array according to the state of the test equipment includes: If the test equipment is stationary, the road test data is stored in the first array.
4. The method according to claim 3, characterized in that, Also includes: If the instantaneous velocity is greater than or equal to the second instantaneous velocity threshold, then the test device is in a fast fading state, wherein the first instantaneous velocity threshold is less than the second instantaneous velocity threshold; The step of storing the road test data into a corresponding array according to the state of the test equipment includes: If the test device is in a fast fading state, the road test data is stored in the second array.
5. The method according to claim 4, characterized in that, Also includes: If the instantaneous velocity is greater than or equal to the first instantaneous velocity threshold and less than the second instantaneous velocity threshold, or if the average velocity is greater than the average velocity threshold, then the test device is in a slow fading state. The step of storing the road test data into a corresponding array according to the state of the test equipment includes: If the test equipment is in a slow fading state, the road test data is stored in the third array.
6. The method according to claim 1, characterized in that, The aggregation calculations are performed on the multiple sets of road test data to obtain multiple aggregation results, including: For the multiple sets of road test data, an aggregation function is used to calculate the aggregation result corresponding to each set of road test data. The aggregation function includes one of the following: summation, maximum value, average value, and first valid value.
7. The method according to claim 1, characterized in that, The process of adding tile numbers to the multiple aggregation results yields multiple target data points, including: For each of the multiple aggregation results, obtain the longitude and latitude of each aggregation result; Calculate the tile number corresponding to each aggregation result based on the longitude and latitude of each aggregation result; Each aggregation result is assigned a corresponding tile number to obtain multiple target data points.
8. The method according to claim 1, characterized in that the method further comprises: if neither the eastward ray nor the southward ray intersects the polygon, determining that the data point is a second quadrant point; selecting, based on the second quadrant point, the second target quadrant point located in the landmark facility auxiliary information table; and deleting the second target quadrant point from the landmark facility auxiliary information table and adding the second quadrant point to the landmark facility auxiliary information table.
9. The method according to claim 1, characterized in that the method further comprises: if neither the southward ray nor the westward ray intersects the polygon, determining that the data point is a third quadrant point; selecting, based on the third quadrant point, the third target quadrant point located in the landmark facility auxiliary information table; and deleting the third target quadrant point from the landmark facility auxiliary information table and adding the third quadrant point to the landmark facility auxiliary information table.
10. The method according to claim 1, characterized in that the method further comprises: if neither the westward ray nor the northward ray intersects the polygon, determining that the data point is a fourth quadrant point; selecting, based on the fourth quadrant point, the fourth target quadrant point located in the landmark facility auxiliary information table; and deleting the fourth target quadrant point from the landmark facility auxiliary information table and adding the fourth quadrant point to the landmark facility auxiliary information table.
11. A data processing apparatus, characterized in that, The device includes: The acquisition module is used to acquire drive test data files based on landmark facilities. The grouping module is used to group the road test data in the road test data file to obtain multiple groups of road test data; The calculation module is used to perform aggregation calculations on each group of road test data, obtain multiple aggregation results, and calculate the total number of test points and the number of covered test points in the data file. The numbering module is used to add tile numbers to the multiple aggregation results to obtain multiple target data points; The matching module is used to match the target data points with the outlines of landmarks according to the improved cross-ray method, so as to obtain an updated landmark-data file association table. The matching module includes: The third acquisition submodule retrieves the latitude and longitude range of landmark facilities; The filtering submodule is used to filter the multiple target data points according to the latitude and longitude range of the landmark facilities; The judgment submodule is used to determine whether the filtered data points are located within the polygon defined by the landmark using an improved cross ray method. The polygon defined by the landmark is determined based on the contour information in the landmark information table. The contour information includes the longitude and latitude of multiple points on the polygon defined by the contour. The matching submodule is used to match the outline of each type of landmark facility information if it is located within the polygon defined by the landmark, and generate a new landmark and data file association table. The judgment submodule includes: The first calculation unit is used to draw an eastward ray from the filtered data points and calculate the number of intersections between the eastward ray and the polygon. 
The first determining unit is used to determine that if the number of intersection points is odd, the filtered data points are located within the polygon defined by the landmark. The second calculation unit is used to draw south, west, and north rays from the filtered data points if the number of intersection points is not odd, and calculate the corresponding number of intersection points respectively. The second determining unit is used to determine the data point as a first quadrant point if the north-facing and east-facing rays emanating from the data point do not intersect with the polygon, and the south-facing and west-facing rays emanating from the data point both intersect with the polygon. The first filtering unit is used to filter out the first target quadrant points located in the landmark facility auxiliary information table based on the first quadrant points; The first deletion unit is used to delete the first target quadrant point from the landmark facility auxiliary information table and add the first quadrant point to the landmark facility auxiliary information table.
12. An electronic device, characterized in that it comprises: a processor, a memory, and a computer program stored in the memory and capable of running on the processor, wherein the computer program, when executed by the processor, implements the steps of the data processing method as described in any one of claims 1-10.
13. A computer-readable storage medium, characterized in that, A computer program is stored on the computer-readable storage medium, which, when executed by a processor, implements the steps of the data processing method as described in any one of claims 1-10.