Vehicle waybill track data cleaning and abnormal marking method and device, electronic equipment and medium
By calculating spherical distances and time differences and combining them with intermediate-point matching, the method addresses the problem of false driver trajectories being forged through location positioning plugins, improves the accuracy of anomaly detection for waybill and vehicle trajectory information, and helps ensure the authenticity of driver trajectories.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Patents(China)
- Current Assignee / Owner
- CHINA ACAD OF TRANSPORTATION SCI
- Filing Date
- 2023-06-30
- Publication Date
- 2026-06-26
Figure CN117009449B_ABST
Abstract
Description
Technical Field
[0001] This invention relates to the field of computer technology, and more specifically, to a method, apparatus, electronic device, and medium for cleaning and anomaly labeling of vehicle waybill trajectory data.
Background Technology
[0002] With the digitalization of the logistics industry and the increasing popularity of online freight platforms, some non-compliant online freight platforms have been found to forge waybills and purchase vehicle trajectory data, with consequences such as evading supervision, issuing false invoices, and distorting statistical data. Therefore, some relevant regulations and rules require driver trajectory data to be added when standardizing freight information, with the aim of preventing data fraud through cross-verification of waybill information, vehicle trajectory data, and driver trajectory data.
[0003] Current technology primarily uses a location positioning plugin installed on the driver's device to obtain the starting point, ending point, and intermediate points of the driver's trajectory, and then matches them against the origin and destination in the waybill information and against the vehicle's trajectory. However, in practice it is relatively easy to forge a driver trajectory using a location positioning plugin, especially one that matches the origin and destination of the waybill information and the vehicle's trajectory. Therefore, the authenticity of the driver's trajectory needs to be verified.
Summary of the Invention
[0004] The purpose of this invention is to provide a method, apparatus, electronic device, and medium for cleaning and anomaly labeling of vehicle waybill trajectory data. This addresses the technical problem in existing technologies where false driver trajectories can be easily forged using location-based plugins, leading to inaccurate identification of waybill or vehicle trajectory anomalies.
[0005] This invention provides a method for cleaning and anomaly labeling vehicle waybill trajectory data, including: acquiring waybill information, which includes a waybill identifier, shipping address, delivery address, shipping time, and delivery time; acquiring corresponding driver trajectory information and vehicle trajectory information based on the waybill identifier, wherein the vehicle trajectory information includes a first starting point, a first ending point, a sampling time of the first starting point, a sampling time of the first ending point, multiple first intermediate points, and a first sampling time for each first intermediate point, and the driver trajectory information includes a second starting point, a second ending point, a sampling time of the second starting point, a sampling time of the second ending point, multiple second intermediate points, and a second sampling time for each second intermediate point; calculating a first spherical distance between the first starting point and the shipping address, and a second spherical distance between the first ending point and the delivery address; when both the first and second spherical distances are within a first distance threshold range, calculating a first time difference between the sampling time of the first starting point and the shipping time, and a second time difference between the sampling time of the first ending point and the delivery time; when either the first or second spherical distance is not within the first distance threshold range, or when either the first or second time difference is not within a first time threshold range, determining that the vehicle trajectory information is abnormal; calculating a third spherical distance between the first starting point and the second starting point, and a fourth spherical distance between the first ending point and the second ending point; when both the third and fourth spherical distances are within a second distance threshold range, calculating a third time difference between the sampling times of the first and second starting points, and a fourth time difference between the sampling times of the first and second ending points; when either the third or fourth spherical distance is not within the second distance threshold range, or when either the third or fourth time difference is not within a second time threshold range, determining that the driver trajectory information is abnormal; when both the third and fourth spherical distances are within the second distance threshold range, and both the third and fourth time differences are within the second time threshold range, determining whether the multiple second intermediate points match the multiple first intermediate points; and marking the waybill information as abnormal when they do not match, or when the vehicle trajectory information is abnormal, or when the driver trajectory information is abnormal.
[0006] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: delineating a first range region corresponding to each second intermediate point with each second intermediate point as the center and a preset third distance threshold as the radius; forming a first trajectory line for each first intermediate point according to each first sampling time; detecting whether each first range region is tangent to or intersects with the first trajectory line; if so, determining that they match; otherwise, determining that they do not match.
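The tangent-or-intersect test in this embodiment can be sketched as follows. This is only an illustration under stated assumptions: points are (lon, lat) pairs in degrees, the vehicle points are already ordered by first sampling time, and a local equirectangular projection is accurate enough for the distances involved. The specification does not say how tangency or intersection is computed, and the names `to_xy`, `point_segment_dist`, and `regions_touch_trajectory` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def to_xy(lon, lat, lat0):
    """Project (lon, lat) in degrees to local planar metres around latitude lat0."""
    x = math.radians(lon) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y

def point_segment_dist(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection parameter so the nearest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def regions_touch_trajectory(driver_points, vehicle_points, radius_m):
    """True if every circle of radius_m around a driver point touches or crosses
    the vehicle polyline (needs at least two vehicle points)."""
    lat0 = vehicle_points[0][1]
    line = [to_xy(lon, lat, lat0) for lon, lat in vehicle_points]
    for lon, lat in driver_points:
        p = to_xy(lon, lat, lat0)
        nearest = min(point_segment_dist(p, a, b) for a, b in zip(line, line[1:]))
        if nearest > radius_m:  # circle neither tangent to nor intersecting the line
            return False
    return True
```

A circle touches or crosses the polyline exactly when the shortest distance from its center to the polyline is no greater than its radius, which is why the sketch reduces the check to a point-to-segment distance.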
[0007] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: when each first sampling time and each second sampling time are synchronized, calculating the fifth spherical distance between the first intermediate point and the second intermediate point corresponding to the same first sampling time and second sampling time; when each fifth spherical distance is within the range of a third distance threshold, it is determined that they match; otherwise, it is determined that they do not match.
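For synchronized sampling, the comparison reduces to one spherical distance per shared timestamp. The sketch below assumes each trajectory is a dict mapping a sampling time to a (lon, lat) pair in degrees and uses an assumed mean Earth radius; `haversine_m` and `synchronized_match` are hypothetical names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = math.sin((phi2 - phi1) / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

def synchronized_match(vehicle_points, driver_points, threshold_m):
    """Match only if every pair sampled at the same time is within threshold_m."""
    if vehicle_points.keys() != driver_points.keys():
        raise ValueError("sampling times are not synchronized")
    return all(
        haversine_m(*vehicle_points[t], *driver_points[t]) <= threshold_m
        for t in vehicle_points
    )
```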
[0008] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: delineating multiple corresponding second range regions with two adjacent first intermediate points as diameters, determining whether each second intermediate point is within at least one second range region, and if so, determining that they match; otherwise, determining that they do not match.
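One way to test membership in a circle whose diameter is the segment between two adjacent vehicle points is Thales' theorem: a point P lies in the closed disc with diameter AB exactly when (A - P) · (B - P) ≤ 0. The sketch below applies this in a local planar projection, which is an assumption not stated in the specification; all function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def to_xy(lon, lat, lat0):
    """Project (lon, lat) in degrees to local planar metres around latitude lat0."""
    x = math.radians(lon) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y

def in_diameter_circle(p, a, b):
    """True if p lies in the closed disc whose diameter is the segment a-b."""
    return (a[0] - p[0]) * (b[0] - p[0]) + (a[1] - p[1]) * (b[1] - p[1]) <= 0.0

def diameter_regions_match(driver_points, vehicle_points):
    """Match if every driver point falls inside at least one second range region."""
    lat0 = vehicle_points[0][1]
    line = [to_xy(lon, lat, lat0) for lon, lat in vehicle_points]
    discs = list(zip(line, line[1:]))  # one disc per pair of adjacent vehicle points
    return all(
        any(in_diameter_circle(to_xy(lon, lat, lat0), a, b) for a, b in discs)
        for lon, lat in driver_points
    )
```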
[0009] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: forming a first trajectory line for each first intermediate point based on each first sampling time, forming a second trajectory line for each second intermediate point based on each second sampling time; calculating the similarity between the first trajectory line and the second trajectory line, and determining that they match when the similarity exceeds a similarity threshold, otherwise determining that they do not match.
[0010] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: when the first sampling times and the second sampling times are not synchronized, using the first sampling time of each first intermediate point as a reference, calculating the driver's third intermediate point at that first sampling time based on the multiple second intermediate points and their second sampling times; calculating the sixth spherical distance between each first intermediate point and the corresponding third intermediate point at the same time; when each sixth spherical distance is within the range of the third distance threshold, determining that they match; otherwise, determining that they do not match.
[0011] In one embodiment, the calculation of the driver's third intermediate point at the first sampling time based on the first sampling time of each first intermediate point includes: selecting the two second sampling times closest to the first sampling time, calculating the intermediate point corresponding to the driver at the first sampling time based on the two second intermediate points corresponding to the two selected second sampling times, and using the calculated intermediate point as the corresponding third intermediate point.
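The unsynchronized case can be sketched as below. The specification only says the third intermediate point is calculated from the two second intermediate points whose sampling times are closest to the first sampling time; linear interpolation in longitude and latitude is an assumption made here for illustration, as are the sample layout `(time_s, lon, lat)` and the function names.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    a = math.sin((phi2 - phi1) / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

def interpolate_driver(t, driver_samples):
    """Estimate the driver's position at time t from the two samples closest in time.
    Assumes linear motion between (or, if both lie on one side of t, beyond) them."""
    closest_two = sorted(driver_samples, key=lambda s: abs(s[0] - t))[:2]
    (t1, lon1, lat1), (t2, lon2, lat2) = sorted(closest_two)
    if t1 == t2:
        return lon1, lat1
    w = (t - t1) / (t2 - t1)
    return lon1 + w * (lon2 - lon1), lat1 + w * (lat2 - lat1)

def unsynchronized_match(vehicle_samples, driver_samples, threshold_m):
    """Match if every vehicle point is within threshold_m of the interpolated driver point."""
    for t, lon, lat in vehicle_samples:
        d_lon, d_lat = interpolate_driver(t, driver_samples)
        if haversine_m(lon, lat, d_lon, d_lat) > threshold_m:
            return False
    return True
```

Because the two closest samples may both precede or both follow the reference time, the weight `w` can fall outside [0, 1], in which case the sketch extrapolates rather than interpolates.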
[0012] According to a second aspect of the present invention, a vehicle waybill trajectory data cleaning and anomaly labeling device is provided, comprising: a waybill information acquisition module for acquiring a waybill identifier, shipping address, delivery address, shipping time, and delivery time; a trajectory information acquisition module for acquiring corresponding driver trajectory information and vehicle trajectory information based on the waybill identifier; a trajectory information verification module for calculating a first spherical distance between a first starting point and the shipping address and a second spherical distance between a first ending point and the delivery address, wherein when both the first spherical distance and the second spherical distance are within a first distance threshold range, the module calculates a first time difference between the sampling time of the first starting point and the shipping time and a second time difference between the sampling time of the first ending point and the delivery time, and when either the first spherical distance or the second spherical distance is not within the first distance threshold range, or when either the first time difference or the second time difference is not within the first time threshold range, the module determines that the vehicle trajectory information is abnormal; the module further calculates a third spherical distance between the first starting point and the second starting point, and a fourth spherical distance between the first ending point and the second ending point; when both the third and fourth spherical distances are within the second distance threshold range, the module calculates a third time difference between the sampling times of the first and second starting points and a fourth time difference between the sampling times of the first and second ending points; when either the third or fourth spherical distance is outside the second distance threshold range, or when either the third or fourth time difference is outside the second time threshold range, the module determines that the driver trajectory information is abnormal; when both the third and fourth spherical distances are within the second distance threshold range, and both the third and fourth time differences are within the second time threshold range, the module determines whether multiple second intermediate points match multiple first intermediate points; and a waybill information marking module for marking the waybill information as abnormal when there is a mismatch, or when the vehicle trajectory information is abnormal, or when the driver trajectory information is abnormal.
[0013] According to a third aspect of the present invention, an electronic device is provided, comprising: one or more processors; and
[0014] a memory for storing one or more programs,
[0015] wherein, when the one or more programs are executed by the one or more processors, the one or more processors execute any of the vehicle waybill trajectory data cleaning and anomaly labeling methods in the embodiments of the present invention.
[0016] According to a fourth aspect of the present invention, a computer-readable medium is provided having executable instructions stored thereon, which, when executed by a processor, cause the processor to perform any of the vehicle waybill trajectory data cleaning and anomaly labeling methods in the embodiments of the present invention.
[0017] This invention provides a method, apparatus, and electronic device for cleaning and anomaly labeling of vehicle waybill trajectory data. By verifying whether the intermediate points of the driver's trajectory match the intermediate points of the vehicle's trajectory, the authenticity of the driver's trajectory is ensured, and the accuracy of anomaly detection for waybills and vehicle trajectories is improved.
Attached Figure Description
[0018] Figure 1 is a flowchart of a method for cleaning and anomaly labeling vehicle waybill trajectory data in one embodiment;
[0019] Figure 2 is a schematic diagram illustrating the matching of multiple second intermediate points with multiple first intermediate points in one embodiment;
[0020] Figure 3 is a schematic diagram illustrating the matching of multiple second intermediate points with multiple first intermediate points in another embodiment;
[0021] Figure 4 is a schematic diagram illustrating the matching of multiple second intermediate points with multiple first intermediate points in yet another embodiment;
[0022] Figure 5 is a flowchart of a method for cleaning and anomaly labeling vehicle waybill trajectory data in another embodiment;
[0023] Figure 6 is a structural diagram of a vehicle waybill trajectory data cleaning and anomaly marking device in one embodiment;
[0024] Figure 7 is a schematic diagram of the interface of an electronic device in one embodiment.
Detailed Implementation
[0025] Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings. However, it should be understood that these descriptions are merely exemplary and not intended to limit the scope of the invention. Furthermore, descriptions of well-known structures and techniques are omitted in the following description to avoid unnecessarily obscuring the concept of the invention.
[0026] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The words “a,” “an,” and “the,” as used herein, should also include the meanings of “a plurality” and “multiple,” unless the context clearly indicates otherwise. Furthermore, the terms “comprising,” “including,” etc., as used herein, indicate the presence of features, steps, operations, and / or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.
[0027] Furthermore, although the terms "first," "second," etc., are used repeatedly in this document to describe various elements (or various thresholds, or various applications, or various instructions, or various operations), these elements (or thresholds, applications, instructions, or operations) should not be limited by these terms. These terms are only used to distinguish one element (or threshold, application, instruction, or operation) from another element (or threshold, application, instruction, or operation). For example, a first starting point may be referred to as a second starting point, and a second starting point may be referred to as a first starting point, without departing from the scope of the invention. The first starting point and the second starting point are not starting addresses recorded by the same device.
[0028] All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein are to be interpreted in a manner consistent with the context of this specification, and not in an idealized or overly rigid way.
[0029] This invention provides a method for identifying anomalies in vehicle waybill data, specifically a method for cleaning and anomaly labeling vehicle waybill trajectory data. Figure 1 illustrates this method, which comprises the following steps:
[0030] Step S202: Obtain waybill information.
[0031] In this embodiment, the waybill information includes a waybill identifier, shipping address, delivery address, shipping time, and delivery time. The waybill information is the information filled in by the user when placing an order on the online freight platform. The shipping address and delivery address are the origin and destination of the shipment, respectively, filled in by the user when placing the order on the online freight platform. The latitude and longitude of the corresponding addresses can be obtained on a map based on the shipping address and delivery address. The shipping time is the start time of the shipment filled in by the user, and the delivery time is the time the user signs for the package.
[0032] In this embodiment, the waybill identifier is a unique identifier generated when the waybill is uploaded to the monitoring platform to match driver trajectory information and vehicle trajectory information. The waybill identifier can be numbers, characters, or a combination of both. Specifically, the waybill information also includes the vehicle's license plate number, the driver's mobile phone number, and the driver's device ID. When uploading the waybill information, the waybill identifier includes a portion automatically generated based on the driver's mobile phone number and driver's device ID that matches the driver's trajectory information, and a portion automatically generated based on the license plate number that matches the vehicle trajectory information. Therefore, the corresponding vehicle trajectory information and driver trajectory information can be matched based on the waybill identifier. The driver's device ID can be the mobile device identification code of the driver's mobile phone, which includes the factory serial number. For example, if the middle four digits of the driver's mobile phone number are 2345, the factory serial number of the driver's device ID is 223344, and the license plate number is Shaanxi A01234, then the generated waybill identifier is Shaanxi A012342345223344. In the waybill identifier, 2345223344 is the part that matches the driver's trajectory information, which is composed of the middle four digits of the driver's mobile phone number and the driver's device ID, and Shaanxi A01234 is the part that matches the vehicle trajectory information.
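The worked example above can be reproduced with a short sketch. The concatenation order (license plate first, then the driver part) is read off the example identifier; the function and parameter names are hypothetical, and the sketch assumes the three inputs have already been extracted from the waybill information.

```python
def build_waybill_identifier(plate_number: str, phone_middle_digits: str, device_serial: str) -> str:
    """Concatenate the vehicle-matching part and the driver-matching part."""
    vehicle_part = plate_number                        # matches vehicle trajectory information
    driver_part = phone_middle_digits + device_serial  # matches driver trajectory information
    return vehicle_part + driver_part

def driver_part_of(identifier: str, plate_number: str) -> str:
    """Recover the driver-matching part when the license plate number is known."""
    if not identifier.startswith(plate_number):
        raise ValueError("identifier does not belong to this license plate")
    return identifier[len(plate_number):]

identifier = build_waybill_identifier("Shaanxi A01234", "2345", "223344")
print(identifier)  # Shaanxi A012342345223344
```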
[0033] In one embodiment, the waybill information includes a waybill number, which is a unique value. When the driver's trajectory is uploaded to the monitoring platform, a unique waybill identifier is generated based on the waybill number, and the corresponding vehicle trajectory information and driver trajectory information are identified based on the waybill identifier.
[0034] In one embodiment, the driver's device may be a driver's mobile device, such as a mobile phone.
[0035] In one embodiment, the monitoring platform is a backend device that verifies whether vehicle trajectory information and driver trajectory information are abnormal, and marks the waybill information corresponding to abnormal vehicle trajectory information or driver trajectory information as abnormal.
[0036] Step S204: Obtain the corresponding driver trajectory information and vehicle trajectory information based on the waybill identifier.
[0037] In this embodiment, the vehicle trajectory information includes a first starting point, a first ending point, a sampling time of the first starting point, a sampling time of the first ending point, multiple first intermediate points, and a first sampling time of each first intermediate point. The driver trajectory information includes a second starting point, a second ending point, a sampling time of the second starting point, a sampling time of the second ending point, multiple second intermediate points, and a second sampling time of each second intermediate point.
[0038] The driver trajectory information is obtained by the location positioning plugin, which is software installed in the driver's device. The second starting point of the driver trajectory is the geographical location automatically reported by the location positioning plugin at the start of transportation, and the sampling time of the second starting point is the time when the location positioning plugin reports the geographical location at the start of transportation. The second ending point of the driver trajectory is the geographical location automatically reported by the location positioning plugin at the end of transportation, and the sampling time of the second ending point is the time when the location positioning plugin reports the geographical location at the end of transportation.
[0039] In this embodiment, the multiple second intermediate points of the driver's trajectory are multiple geographical locations automatically reported by the location positioning plugin during transportation, and the second sampling time of each second intermediate point is the time when the location positioning plugin automatically reports multiple geographical locations during transportation.
[0040] In this embodiment, the vehicle trajectory information is obtained by the vehicle-mounted Beidou positioning system, which is fixed in the cab of the truck. The first starting point of the vehicle trajectory is the geographical location automatically reported by the vehicle-mounted Beidou positioning system at the start of transportation, and the sampling time of the first starting point is the time when the vehicle-mounted Beidou positioning system reports the geographical location at the start of transportation. The first ending point of the vehicle trajectory is the geographical location automatically reported by the vehicle-mounted Beidou positioning system at the end of transportation, and the sampling time of the first ending point is the time when the vehicle-mounted Beidou positioning system reports the geographical location at the end of transportation.
[0041] In this embodiment, the multiple first intermediate points of the vehicle trajectory are multiple geographical locations automatically reported by the vehicle-mounted Beidou positioning system during transportation, and the first sampling time of each first intermediate point is the time when the vehicle-mounted Beidou positioning system automatically reports multiple geographical locations during transportation.
[0042] In one embodiment, the geographical location is the latitude and longitude of that location.
[0043] Step S206: Calculate the first spherical distance between the first starting point and the shipping address and the second spherical distance between the first ending point and the delivery address. When both the first spherical distance and the second spherical distance are within the first distance threshold range, calculate the first time difference between the sampling time of the first starting point and the shipping time and the second time difference between the sampling time of the first ending point and the delivery time. When either the first spherical distance or the second spherical distance is not within the first distance threshold range, or when either the first time difference or the second time difference is not within the first time threshold range, determine that the vehicle trajectory information is abnormal.
[0044] In this embodiment, the first starting point sampling time and the first ending point sampling time are based on the clock signal built into the vehicle-mounted Beidou positioning system.
[0045] In this embodiment, the administrative divisions of the first starting point and the first ending point are first obtained based on their latitude and longitude. It is then determined whether the administrative division of the first starting point matches that of the shipping address, and whether the administrative division of the first ending point matches that of the delivery address. If either does not match, the vehicle trajectory information is deemed abnormal. When both match, it is further determined whether the first spherical distance and the second spherical distance exceed the first distance threshold, and whether the first time difference and the second time difference exceed the first time threshold.
[0046] In one embodiment, the first spherical distance is calculated based on the latitude and longitude of the first starting point and the shipping address, and the second spherical distance is calculated based on the latitude and longitude of the first ending point and the delivery address. For example, if the longitude and latitude of the first starting point are (lon1, lat1) and those of the shipping address are (lon2, lat2), then the first spherical distance d between the first starting point and the shipping address is calculated using the haversine formula, as shown in equation (1):
[0047] $$d = 2R \arcsin\left(\sqrt{\sin^2\left(\frac{\mathrm{lat2} - \mathrm{lat1}}{2}\right) + \cos(\mathrm{lat1})\,\cos(\mathrm{lat2})\,\sin^2\left(\frac{\mathrm{lon2} - \mathrm{lon1}}{2}\right)}\right) \quad (1)$$
where R is the radius of the Earth and the latitudes and longitudes are expressed in radians.
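A minimal sketch of the haversine distance referred to above, assuming coordinates in degrees and a mean Earth radius of 6,371 km (the specification does not state which radius is used). The function name `haversine_m` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))

# One tenth of a degree of latitude is roughly 11.1 km.
print(round(haversine_m(108.90, 34.20, 108.90, 34.30)))
```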
[0048] In this embodiment, the first distance threshold and the first time threshold are set based on the area of the shipping address and the delivery address, and on whether vehicles queue at those addresses. For example, when the delivery address is Community A, which has a small area and no vehicle queuing, the first distance threshold is set to 100 m and the first time threshold is set to 5 min. When the shipping address is Logistics Park B, which has a large area and vehicles queuing to pick up goods, the first distance threshold is set to 1 km and the first time threshold is set to 1 h.
[0049] In this embodiment, whether the first distance threshold is exceeded is determined based on the absolute values of the first spherical distance and the second spherical distance, and whether the first time threshold is exceeded is determined based on the absolute values of the first time difference and the second time difference.
[0050] This embodiment verifies whether the first starting point and the first ending point match the shipping address and the delivery address. If they match, it further verifies whether the sampling times of the first starting point and the first ending point match the shipping time and the delivery time. This filters out vehicle trajectory information that obviously does not match the waybill.
[0051] Step S208: Calculate the third spherical distance between the first starting point and the second starting point, and the fourth spherical distance between the first ending point and the second ending point. When both the third and fourth spherical distances are within the second distance threshold range, calculate the third time difference between the sampling times of the first and second starting points, and the fourth time difference between the sampling times of the first and second ending points. When either the third or fourth spherical distance is not within the second distance threshold range, or when either the third or fourth time difference is not within the second time threshold range, determine that the driver's trajectory information is abnormal.
[0052] In this embodiment, the second starting point sampling time and the second ending point sampling time are based on the clock signal built into the driver's device.
[0053] In this embodiment, the administrative divisions of the first starting point and the first ending point of the vehicle trajectory are first obtained based on their latitude and longitude. The administrative divisions of the second starting point and the second ending point of the driver trajectory are then obtained in the same way. It is then determined whether the administrative division of the second starting point is consistent with that of the first starting point, and whether the administrative division of the second ending point is consistent with that of the first ending point. If either is inconsistent, the driver trajectory information is determined to be abnormal. When both are consistent, it is further determined whether the third spherical distance and the fourth spherical distance exceed the second distance threshold, and whether the third time difference and the fourth time difference exceed the second time threshold.
[0054] In one embodiment, the third and fourth spherical distances are calculated using the haversine formula based on the latitude and longitude of the first and second starting points, and the latitude and longitude of the first and second ending points.
[0055] In one embodiment, the second distance threshold is any distance value between 0m and 100m, and the second time threshold is any time value between 0min and 10min.
[0056] Step S210: When the third spherical distance and the fourth spherical distance are both within the second distance threshold range, and the third time difference and the fourth time difference are both within the second time threshold range, determine whether the multiple second intermediate points match the multiple first intermediate points.
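The order of checks in steps S206 to S210, together with the final marking of the waybill, can be sketched as below. The sketch assumes the spherical distances and time differences have already been computed; `EndpointCheck`, `endpoints_ok`, and `waybill_is_abnormal` are hypothetical names, and the intermediate-point matcher is passed in as a callable because the embodiments describe several alternatives.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EndpointCheck:
    """Distances (m) and signed time differences (s) at the start and end of a trip."""
    start_dist_m: float
    end_dist_m: float
    start_dt_s: float
    end_dt_s: float

def endpoints_ok(check: EndpointCheck, dist_threshold_m: float, time_threshold_s: float) -> bool:
    # Distances are tested first; time differences only matter if both distances pass.
    if check.start_dist_m > dist_threshold_m or check.end_dist_m > dist_threshold_m:
        return False
    return abs(check.start_dt_s) <= time_threshold_s and abs(check.end_dt_s) <= time_threshold_s

def waybill_is_abnormal(vehicle_vs_waybill: EndpointCheck, driver_vs_vehicle: EndpointCheck,
                        midpoints_match: Callable[[], bool],
                        d1_m: float, t1_s: float, d2_m: float, t2_s: float) -> bool:
    vehicle_abnormal = not endpoints_ok(vehicle_vs_waybill, d1_m, t1_s)  # step S206
    driver_abnormal = not endpoints_ok(driver_vs_vehicle, d2_m, t2_s)    # step S208
    if vehicle_abnormal or driver_abnormal:
        return True
    return not midpoints_match()                                         # step S210
```

For instance, with the example thresholds mentioned in the text (100 m and 5 min for the vehicle check, 100 m and 10 min for the driver check), a trip whose endpoints all pass is still marked abnormal if the intermediate points fail to match.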
[0057] In one embodiment, the number of multiple second intermediate points for the driver's trajectory that need to be reported can be preset based on the transportation distance. For example, the transportation distance is estimated based on the shipping address and delivery address in the waybill information. The unit distance of the multiple second intermediate points that need to be reported is set according to the transportation distance. For example, when the transportation distance is less than or equal to 100 kilometers, the unit distance can be preset to 10 kilometers, that is, a second intermediate point is reported once every 10 kilometers, so the maximum number of second intermediate points that need to be reported is 10; when the transportation distance is greater than 100 kilometers and less than or equal to 500 kilometers, the unit distance can be preset to 20 kilometers, so the number of second intermediate points that need to be reported ranges from 5 to 25; when the transportation distance is greater than 500 kilometers and less than or equal to 1000 kilometers, the unit distance can be preset to 25 kilometers, so the number of second intermediate points that need to be reported ranges from 20 to 40; when the transportation distance is greater than 1000 kilometers, the unit distance can be preset to 50 kilometers, so the number of second intermediate points that need to be reported is no less than 20.
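The distance bands in this example can be written as a small lookup. Using floor division to turn the transportation distance into a report count is an assumption (the text only gives ranges), and the function names are hypothetical.

```python
def unit_distance_km(transport_km: float) -> int:
    """Reporting interval for second intermediate points, by transportation distance."""
    if transport_km <= 100:
        return 10
    if transport_km <= 500:
        return 20
    if transport_km <= 1000:
        return 25
    return 50

def expected_report_count(transport_km: float) -> int:
    """Number of second intermediate points expected over the whole trip."""
    return int(transport_km // unit_distance_km(transport_km))

for km in (100, 300, 500, 600, 1000):
    print(km, unit_distance_km(km), expected_report_count(km))
```

The band boundaries reproduce the ranges stated in the text: at most 10 points up to 100 km, 5 to 25 points up to 500 km, 20 to 40 points up to 1000 km, and at least 20 points beyond that.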
[0058] In one embodiment, the number of second intermediate points of the driver's trajectory that need to be reported is adjusted according to the vehicle speed. The location positioning plugin can detect the real-time vehicle speed. When the vehicle speed is high over a given segment of the trip, no second intermediate point needs to be reported for that segment. For example, if the transportation distance is 300 kilometers and a second intermediate point is reported every 20 kilometers, then the number of second intermediate points that need to be reported is 15. However, if the location positioning plugin detects that the average vehicle speed over a certain 20-kilometer segment is between 80 km/h and 100 km/h, then no second intermediate point needs to be reported for that segment, that is, the number of second intermediate points that need to be reported is reduced to 14.
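The speed-based reduction can be sketched by counting the segments whose average speed falls outside the skip band. The 80 to 100 km/h band is taken from the example; treating it as inclusive and representing the trip as a list of per-segment average speeds are assumptions, and the function name is hypothetical.

```python
def reports_after_speed_skips(segment_avg_speeds_kmh, skip_min_kmh=80.0, skip_max_kmh=100.0):
    """Count the segments that still require a second intermediate point to be reported."""
    return sum(
        1 for v in segment_avg_speeds_kmh
        if not (skip_min_kmh <= v <= skip_max_kmh)
    )

# 300 km at one report per 20 km gives 15 segments; one of them is driven at 90 km/h.
speeds = [60.0] * 14 + [90.0]
print(reports_after_speed_skips(speeds))  # 14
```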
[0059] In one embodiment, if the driver's trajectory is a same-city trajectory, the number of second intermediate points that need to be reported is increased, which means the preset unit distance is shortened. For example, if the driver's trajectory is determined to be a same-city trajectory in Xi'an based on the shipping address and delivery address in the waybill information, and the transportation distance is 50 kilometers, then the number of second intermediate points to be reported increases to 10, which means that a second intermediate point is reported once every 5 kilometers.
[0060] In one embodiment, if the driver's trajectory is determined to be an inter-provincial trajectory based on the shipping and receiving addresses in the waybill information, then in addition to the preset number of second intermediate points to be reported, landmark geographical locations are also reported as second intermediate points. For example, for an inter-provincial trajectory with a transportation distance of 600 kilometers, if a second intermediate point is reported every 25 kilometers, then 24 second intermediate points need to be reported. In addition, if the 600-kilometer inter-provincial trajectory passes through 3 provinces, then 2 more second intermediate points need to be reported, one at a service area at each of the two provincial borders crossed.
[0061] In one embodiment, the number of second intermediate points on a driver's trajectory is affected by weather. The monitoring platform can query the weather conditions at the corresponding time of the driver's trajectory. Severe weather such as rain or fog affects vehicle speed. If the number of second intermediate points on the driver's trajectory does not change under severe weather conditions, the driver's trajectory information can also be determined to be abnormal. For example, under normal weather conditions, if the transportation distance is 300 kilometers, and the average speed for 40 kilometers is greater than 80 km / h, then the second intermediate point does not need to be reported for those 40 kilometers, reducing the number of reports by 2. The final number of reported second intermediate points is 13. However, under severe weather conditions, the average speed decreases, and the number of reported second intermediate points should normally be greater than 13. If it is less than or equal to 13, the driver's trajectory information can be determined to be abnormal.
[0062] Step S212: When there is a mismatch, or when the vehicle trajectory information is abnormal, or when the driver trajectory information is abnormal, mark the waybill information as abnormal.
[0063] In this embodiment, the anomaly identifier is specific text information. Vehicle trajectory information anomalies include: vehicle origin address anomalies (i.e., the first spherical distance is not within the first distance threshold range); vehicle origin time anomalies (i.e., the first time difference is not within the first time threshold range); vehicle destination address anomalies (i.e., the second spherical distance is not within the first distance threshold range); vehicle destination time anomalies (i.e., the second time difference is not within the first time threshold range); vehicle origin administrative division anomalies (i.e., the administrative division of the latitude and longitude of the first starting point is inconsistent with the administrative division of the shipping address); and vehicle destination administrative division anomalies (i.e., the administrative division of the latitude and longitude of the first ending point is inconsistent with the administrative division of the delivery address). Similarly, driver trajectory information anomalies include: driver origin address anomalies (i.e., the third spherical distance is not within the second distance threshold range); driver start time anomalies (i.e., the third time difference is not within the second time threshold range); driver destination address anomalies (i.e., the fourth spherical distance is not within the second distance threshold range); driver end time anomalies (i.e., the fourth time difference is not within the second time threshold range); driver origin administrative division anomalies (i.e., the administrative division of the latitude and longitude of the first starting point is inconsistent with the administrative division of the second starting point); driver destination administrative division anomalies (i.e., the administrative division of the latitude and longitude of the first ending point is inconsistent with the administrative division of the second ending point); driver intermediate point address anomalies (i.e., the multiple second intermediate points do not match the multiple first intermediate points); and driver intermediate point report count anomalies (i.e., the number of reported second intermediate points is abnormal).
[0064] In one embodiment, when a vehicle passes through a tunnel, the driver's device has no GPS signal and indicates that there is no GPS signal at that location. If a second intermediate point is not reported but the driver's device indicates that there was no GPS signal at that location, that second intermediate point is not treated as one that needs to be verified.
[0065] In one embodiment, when a driver stops or stays overnight at a service area, the driver and vehicle may be separated, resulting in an abnormal match between the second intermediate point and the first intermediate point. When the latitude and longitude of the detected second intermediate point are in a service area, hotel, or other place where people can rest and stay, the second intermediate point is not considered as the second intermediate point that needs to be verified.
[0066] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: delineating a first range region corresponding to each second intermediate point, with the latitude and longitude of that second intermediate point as the center and a preset third distance threshold as the radius; connecting the first intermediate points in order of their first sampling times to form a first trajectory line; detecting whether each first range region is tangent to or intersects the first trajectory line; if so, determining that they match; otherwise, determining that they do not match.
[0067] In this embodiment, the third distance threshold can be set to any distance value between 0m and 100m, wherein the magnitude of the third distance threshold is set according to the strength of the GPS signal at the geographical location of the second intermediate point.
[0068] For example, as shown in Figure 2, there exists a trajectory of driver A, which corresponds to the trajectory of vehicle A. The multiple second intermediate points of driver A's trajectory are a1, a2, a3, a4, etc. The GPS signal is strong at the location of the second intermediate point a1 and weak at the location of the second intermediate point a3. Therefore, the third distance threshold at a1 is set to 50m, the third distance threshold at a3 is set to 100m, and the third distance threshold at the other second intermediate points is 80m. Using the second intermediate points a1, a2, a3, a4... as centers and the corresponding third distance thresholds as radii, circular areas are delineated. It is determined whether the circular area corresponding to each second intermediate point is tangent to or intersects the trajectory of vehicle A. If it is neither tangent nor intersecting, the second intermediate point does not correspond to any point in the trajectory of vehicle A, meaning the second intermediate point is false. The presence of any false second intermediate point indicates that the driver's trajectory is false, i.e., the driver's intermediate point address is abnormal. Only when all the circular areas corresponding to the second intermediate points are tangent to or intersect the trajectory of vehicle A is the driver's trajectory considered genuine. As shown by the trajectory of driver A in Figure 2, all the circular areas of the second intermediate points are tangent to or intersect the vehicle's trajectory, so driver A's trajectory is real. A false driver trajectory is shown by the trajectory of driver D in Figure 2: although the circular area of the second intermediate point d3 is tangent to the trajectory of vehicle D, those of the second intermediate points d1, d2, and d4 are neither tangent to nor intersect the trajectory of vehicle D. Therefore, the trajectory of driver D is a false driver trajectory, and it is determined that the driver trajectory has an abnormal driver intermediate point address.
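The circle test of paragraphs [0066] to [0068] amounts to checking whether the shortest distance from each second intermediate point to the vehicle trajectory line is no greater than that point's third distance threshold. The sketch below is one possible way to do this and is not taken from the patent: it projects coordinates onto a local flat plane around each circle center (an approximation that is reasonable for radii of at most 100m), and all function names are hypothetical. Points are given as (longitude, latitude) in degrees.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, used for the local projection


def _to_local_xy(lon, lat, lon0, lat0):
    """Project (lon, lat) to metres on a plane centred at (lon0, lat0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def _point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB on the plane."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def circle_touches_track(center, radius_m, track):
    """True if the circle is tangent to or intersects the trajectory line."""
    lon0, lat0 = center
    pts = [_to_local_xy(lon, lat, lon0, lat0) for lon, lat in track]
    if len(pts) == 1:
        return math.hypot(*pts[0]) <= radius_m
    return any(
        _point_segment_distance(0.0, 0.0, ax, ay, bx, by) <= radius_m
        for (ax, ay), (bx, by) in zip(pts, pts[1:])
    )


def driver_midpoints_match(driver_points, radii_m, track):
    """All second intermediate points must touch the vehicle trajectory."""
    return all(circle_touches_track(p, r, track) for p, r in zip(driver_points, radii_m))
```

Passing one radius per point mirrors the example above, where the threshold varies with GPS signal strength (50m, 80m, or 100m).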
[0069] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: when each first sampling time and each second sampling time are synchronized, calculating the fifth spherical distance between the first intermediate point and the second intermediate point corresponding to the same first sampling time and second sampling time; when each fifth spherical distance is within the range of a third distance threshold, it is determined that they match; otherwise, it is determined that they do not match.
[0070] In this embodiment, based on the second intermediate point of the preset driver trajectory, the corresponding first intermediate point is simultaneously set in the vehicle-mounted Beidou positioning system. That is to say, when the vehicle passes through the set second intermediate point, the location positioning plug-in automatically reports the second intermediate point once, and at the same time, the vehicle-mounted Beidou positioning system also reports the first intermediate point once at the same geographical location.
[0071] For example, if the location positioning plugin reports a second intermediate point every 20 kilometers, then the vehicle-mounted BeiDou positioning system is also set to report a first intermediate point every 20 kilometers. If the driver's trajectory reports a second intermediate point at a landmark geographical location across provinces, then the vehicle-mounted BeiDou positioning system is also set to report a first intermediate point at that landmark geographical location.
[0072] In this embodiment, it is verified whether the fifth spherical distance between each first intermediate point and the corresponding second intermediate point exceeds the third distance threshold. For example, as shown in Figure 3, there exists a trajectory for driver B and a corresponding trajectory for vehicle B. The second intermediate points of driver B's trajectory are b1, b2, b3, b4…, and the first intermediate points of vehicle B's trajectory are B1, B2, B3, B4…. The first intermediate point B1 and the second intermediate point b1 are at the same geographical location at the same time, with their latitude and longitude reported by the vehicle-mounted BeiDou positioning system and the location positioning plugin, respectively. The same applies to B2 and b2, B3 and b3, and B4 and b4. Based on the latitude and longitude of B1 and b1, the spherical distance between B1 and b1 is calculated. Similarly, the spherical distances between the other second intermediate points and their corresponding first intermediate points are calculated. It is then determined whether any fifth spherical distance exceeds the third distance threshold. If none exceeds it, the driver's trajectory is considered real, as shown by the trajectory of driver B in Figure 3. If any fifth spherical distance exceeds the third distance threshold, the driver's trajectory is determined to be false, meaning the driver's intermediate point address is abnormal. As shown by the trajectory of driver C in Figure 2, although the fifth spherical distances between C1 and c1 and between C4 and c4 do not exceed the third distance threshold, the fifth spherical distances between C2 and c2 and between C3 and c3 do exceed it. Therefore, the trajectory of driver C is determined to be false, that is, it is determined to have an abnormal driver intermediate point address.
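For the synchronized case of paragraphs [0069] to [0072], a minimal sketch of the pairwise check might look like the following. The helper names `haversine_m` and `mismatched_pairs` are hypothetical, a mean Earth radius of 6371 km is assumed, and points are (longitude, latitude) pairs in degrees, paired by index because the two devices report at the same moments.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle (spherical) distance in metres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def mismatched_pairs(vehicle_points, driver_points, threshold_m):
    """Indices of pairs whose fifth spherical distance exceeds the threshold."""
    if len(vehicle_points) != len(driver_points):
        raise ValueError("synchronized sampling requires equally many points")
    return [
        i
        for i, ((vlon, vlat), (dlon, dlat)) in enumerate(zip(vehicle_points, driver_points))
        if haversine_m(vlon, vlat, dlon, dlat) > threshold_m
    ]


if __name__ == "__main__":
    vehicle = [(108.0, 34.0), (108.2, 34.1)]
    driver = [(108.0002, 34.0), (108.25, 34.1)]
    bad = mismatched_pairs(vehicle, driver, threshold_m=100.0)
    print("abnormal driver intermediate point address" if bad else "match", bad)
```

An empty result means every pair is within the third distance threshold, so the trajectories match; any non-empty result corresponds to an abnormal driver intermediate point address.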
[0073] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: delineating multiple corresponding second range regions, each taking the line segment between two adjacent first intermediate points as its diameter; determining whether each second intermediate point is within at least one second range region; if so, determining that they match; otherwise, determining that they do not match.
[0074] In this embodiment, a circular range is defined with the spherical distance between two adjacent first intermediate points as the diameter. It is then determined whether the second intermediate point is included within this range. If all the second intermediate points are within the circular range, the driver's trajectory is determined to be real. If any second intermediate point is not within the circular range, the driver's trajectory is determined to be fake.
[0075] For example, as shown in Figure 4, E1 and E2 are adjacent first intermediate points. A circular region is defined with the spherical distance between E1 and E2 as its diameter, and it is then determined whether a second intermediate point lies within this circular region. As shown by the trajectory of driver E in Figure 4, if all the second intermediate points are contained within the circular regions, the trajectory of driver E is determined to be real. As shown by the trajectory of driver F in Figure 4, if there are second intermediate points, such as f2 and f3, that are not contained in any circular region, the trajectory of driver F is determined to be false, that is, the driver intermediate point address is abnormal.
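The diameter-circle test of paragraphs [0073] to [0075] can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: the circle center is taken as the arithmetic mean of the two endpoints' coordinates, distances use an equirectangular approximation, and the function names are hypothetical. Points are (longitude, latitude) in degrees.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius


def approx_distance_m(lon1, lat1, lon2, lat2):
    """Equirectangular approximation of the distance between two nearby points."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)


def in_diameter_circle(point, a, b):
    """True if point lies in the circle whose diameter is the segment a-b."""
    center = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    radius = approx_distance_m(a[0], a[1], b[0], b[1]) / 2
    return approx_distance_m(point[0], point[1], center[0], center[1]) <= radius


def driver_points_match(driver_points, vehicle_points):
    """Every second intermediate point must fall in at least one circle."""
    circles = list(zip(vehicle_points, vehicle_points[1:]))
    return all(
        any(in_diameter_circle(p, a, b) for a, b in circles) for p in driver_points
    )
```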
[0076] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: connecting the first intermediate points in order of their first sampling times to form a first trajectory line, and connecting the second intermediate points in order of their second sampling times to form a second trajectory line; calculating the similarity between the first trajectory line and the second trajectory line; and determining that they match when the similarity exceeds a similarity threshold, otherwise determining that they do not match.
[0077] In this embodiment, each first intermediate point is connected together according to the time sequence of each first sampling moment to form an image of the first trajectory line. Similarly, each second intermediate point is connected together according to the time sequence of each second sampling moment to form an image of the second trajectory line. The shape difference between the images is judged, and the similarity is judged based on the shape difference to determine whether the similarity exceeds the similarity threshold. The similarity threshold can be set to any ratio between 95% and 100%.
[0078] In one embodiment, shape differences can be determined based on pixel variations in the image.
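The text does not specify how the pixel-based shape difference is computed. Purely as an assumed illustration, the sketch below draws both trajectory lines onto a shared grid of cells (standing in for image pixels) and uses the ratio of shared cells to all occupied cells (the Jaccard index) as the similarity. The grid size, the sampling density, and the choice of Jaccard index are all assumptions, and the function names are hypothetical.

```python
def _rasterize(track, bbox, grid):
    """Return the set of grid cells touched by a polyline of (lon, lat) points."""
    track = list(track)
    min_lon, min_lat, max_lon, max_lat = bbox
    span_lon = (max_lon - min_lon) or 1e-12  # avoid division by zero
    span_lat = (max_lat - min_lat) or 1e-12
    steps = 4 * grid  # dense enough that no cell along a segment is skipped
    cells = set()
    for (lon1, lat1), (lon2, lat2) in zip(track, track[1:]):
        for s in range(steps + 1):
            t = s / steps
            lon = lon1 + t * (lon2 - lon1)
            lat = lat1 + t * (lat2 - lat1)
            col = min(grid - 1, int((lon - min_lon) / span_lon * grid))
            row = min(grid - 1, int((lat - min_lat) / span_lat * grid))
            cells.add((row, col))
    return cells


def track_similarity(first_track, second_track, grid=64):
    """Jaccard overlap of the two rasterized trajectory lines, in [0, 1]."""
    pts = list(first_track) + list(second_track)
    bbox = (
        min(p[0] for p in pts), min(p[1] for p in pts),
        max(p[0] for p in pts), max(p[1] for p in pts),
    )
    a = _rasterize(first_track, bbox, grid)
    b = _rasterize(second_track, bbox, grid)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tracks_match(first_track, second_track, similarity_threshold=0.95):
    """Match when the similarity exceeds the threshold (95% to 100% in the text)."""
    return track_similarity(first_track, second_track) > similarity_threshold
```

A coarser grid makes the comparison more tolerant of small positional noise, while a finer grid makes it stricter.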
[0079] In one embodiment, determining whether multiple second intermediate points match multiple first intermediate points includes: when each first sampling time and each second sampling time are not synchronized, using the first sampling time of each first intermediate point as a reference, calculating the driver's third intermediate point at the first sampling time based on the second intermediate point and the second sampling time; calculating the sixth spherical distance between the first intermediate point and the corresponding third intermediate point at the same time; if each sixth spherical distance is within the range of the third distance threshold, a match is determined; otherwise, a mismatch is determined.
[0080] In this embodiment, the timing of the location positioning plugin reporting the second intermediate point and the timing of the BeiDou positioning system reporting the first intermediate point are sequential, and the time difference between the two exceeds the second time threshold. For example, at geographical location P, the BeiDou positioning system reports first, and the location positioning plugin may report 20 minutes later, or the location positioning plugin may report first, and the BeiDou positioning system may report 15 minutes later. Therefore, due to the influence of vehicle speed, the spherical distance between the second intermediate point reported by the location positioning plugin and the first intermediate point reported by the BeiDou positioning system exceeds the third distance threshold. Therefore, it is necessary to calculate the theoretical position of each second intermediate point, i.e., the third intermediate point, based on the first sampling time of each first intermediate point, and determine whether it matches the corresponding first intermediate point.
[0081] In this embodiment, the third intermediate point corresponding to the first intermediate point at the same time is calculated based on the average vehicle speed.
[0082] In one embodiment, the sixth spherical distance is calculated using the haversine formula based on the latitude and longitude of the first intermediate point and the third intermediate point.
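For reference, the standard haversine formula is written out below. The symbols are chosen here for illustration and are not the patent's notation: \varphi_1, \varphi_2 are the latitudes and \lambda_1, \lambda_2 the longitudes of the two points in radians, R is the Earth's mean radius (about 6371 km), and d is the resulting spherical distance.

```latex
d = 2R \arcsin\!\left( \sqrt{ \sin^{2}\!\frac{\varphi_2 - \varphi_1}{2}
    + \cos\varphi_1 \cos\varphi_2 \, \sin^{2}\!\frac{\lambda_2 - \lambda_1}{2} } \right)
```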
[0083] In one embodiment, the calculation of the driver's third intermediate point at the first sampling time based on the first sampling time of each first intermediate point includes: selecting the two second sampling times closest to the first sampling time, calculating the intermediate point corresponding to the driver at the first sampling time based on the two second intermediate points corresponding to the two selected second sampling times, and using the calculated intermediate point as the corresponding third intermediate point.
[0084] In this embodiment, the second sampling time of each second intermediate point is stored in the form of an array. Similarly, the latitude and longitude of each second intermediate point and the first sampling time of each first intermediate point are also stored in the form of an array. The latitude and longitude of the second intermediate point corresponding to the second sampling time are matched according to the data position stored in the array.
[0085] For example, consider the first sampling time array (T1, T2, T3, T4, T5), the second sampling time array (t1, t2, t3, t4, t5), and the second intermediate point array [(m1, n1), (m2, n2), (m3, n3), (m4, n4), (m5, n5)]. The times in the second sampling time array and the latitudes and longitudes in the second intermediate point array are arranged in chronological order. Therefore, based on the position in the array, the latitude and longitude of the second intermediate point corresponding to a second sampling time can be matched. For example, t1 is the first time value in the second sampling time array, and it matches the first latitude and longitude value in the second intermediate point array.
[0086] For example, taking the first sampling time T1 as the reference, the latitude and longitude of the first intermediate point corresponding to T1 is (M1, N1). The time differences between T1 and all second sampling times in the second sampling time array are calculated, and the two second sampling times with the smallest time differences are found, for example, t1 and t2. Based on t1 and t2, the latitude and longitude of the corresponding second intermediate points, (m1, n1) and (m2, n2), are matched. Based on the time difference between T1 and t1 and the time difference between T1 and t2, and according to the average speed recorded by the BeiDou positioning system for traveling to T1, the travel distances d1 and d2 within those time differences are calculated, respectively. The latitude and longitude (k1, p1) reached from (m1, n1) after travel distance d1, and the latitude and longitude (k2, p2) reached from (m2, n2) after travel distance d2, are then calculated. The average latitude and longitude (k3, p3) of (k1, p1) and (k2, p2) is calculated and taken as the third intermediate point, and it is checked whether the sixth spherical distance between the third intermediate point (k3, p3) and the corresponding first intermediate point (M1, N1) exceeds the third distance threshold, where m, M, and k are longitudes, and n, N, and p are latitudes.
[0087] In one embodiment, time t1 may be earlier or later than T1, and similarly, time t2 may be earlier or later than T1.
[0088] In this embodiment, the orientation of the first intermediate point and the corresponding second intermediate point is due north, or due east, or due west, or due south.
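The estimation of the third intermediate point in paragraphs [0083] to [0088] can be sketched as follows. Several details are assumptions made for illustration only: times are in seconds, the average speed is in metres per second, the direction of travel is supplied as one of the four cardinal headings, and a report made before the reference time is moved forward along the heading while a report made after it is moved backward. All names are hypothetical, and points are (longitude, latitude) in degrees.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius

# (east, north) unit components for the four cardinal headings
HEADINGS = {
    "north": (0.0, 1.0),
    "south": (0.0, -1.0),
    "east": (1.0, 0.0),
    "west": (-1.0, 0.0),
}


def shift_point(lon, lat, distance_m, heading):
    """Move a point by a signed distance along a cardinal heading."""
    east, north = HEADINGS[heading]
    dlat = math.degrees(distance_m * north / EARTH_RADIUS_M)
    dlon = math.degrees(distance_m * east / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lon + dlon, lat + dlat


def third_intermediate_point(first_time_s, second_times_s, second_points, avg_speed_mps, heading):
    """Estimate where the driver was at the vehicle's sampling time."""
    if len(second_times_s) != len(second_points) or len(second_times_s) < 2:
        raise ValueError("need at least two index-aligned driver samples")
    # The two second sampling times closest to the first sampling time.
    nearest = sorted(
        range(len(second_times_s)),
        key=lambda i: abs(second_times_s[i] - first_time_s),
    )[:2]
    estimates = []
    for i in nearest:
        # Positive when the driver sample is earlier than the reference time.
        signed_distance = avg_speed_mps * (first_time_s - second_times_s[i])
        lon, lat = second_points[i]
        estimates.append(shift_point(lon, lat, signed_distance, heading))
    # Average of the two projected positions, (k3, p3) in the text.
    return (
        (estimates[0][0] + estimates[1][0]) / 2,
        (estimates[0][1] + estimates[1][1]) / 2,
    )
```

The returned point would then be compared with the first intermediate point using the sixth spherical distance and the third distance threshold.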
[0089] Figure 5 shows a flowchart of a method for cleaning and anomaly labeling vehicle waybill trajectory data in another embodiment. The method includes the following steps:
[0090] Step S402: Obtain waybill information.
[0091] Step S404: Obtain the corresponding driver trajectory information and vehicle trajectory information based on the waybill identifier.
[0092] In this embodiment, the content of step S402 is the same as that of step S202, and the content of step S404 is the same as that of step S204, so they will not be described again here.
[0093] Step S406: Calculate the first spherical distance between the first starting point and the shipping address and the second spherical distance between the first ending point and the receiving address.
[0094] Step S408: Determine whether the first spherical distance and the second spherical distance are both within the first distance threshold range.
[0095] Step S410: Calculate the first time difference between the first starting sampling time and the shipping time, and the second time difference between the first ending sampling time and the receiving time.
[0096] Step S412: Determine whether the first time difference and the second time difference are both within the first time threshold range.
[0097] In one embodiment, when the execution result of step S408 is negative, the vehicle trajectory information is determined to be abnormal, that is, step S426 is executed, and the corresponding waybill information is marked as abnormal vehicle origin address, abnormal vehicle destination address, abnormal vehicle origin administrative division, or abnormal vehicle destination administrative division, that is, step S430 is executed, and the process ends.
[0098] In one embodiment, when the execution result of step S412 is negative, the vehicle trajectory information is determined to be abnormal, i.e., step S426 is executed, and the corresponding waybill information is marked with an abnormal vehicle start time or an abnormal vehicle end time, i.e., step S430 is executed, and the process ends.
[0099] In one embodiment, when the execution result of step S408 is yes, step S410 is executed, followed by step S412. When the execution result of step S412 is yes, it indicates that the vehicle trajectory information is normal, and step S414 and subsequent steps can continue to be executed.
[0100] In one embodiment, steps S406 and S408 can be swapped with steps S410 and S412. That is, step S410 is executed first, followed by step S412. When the result of step S412 is yes, step S406 is executed, followed by step S408. When the result of step S408 is yes, it indicates that the vehicle trajectory information is normal, and step S414 and subsequent steps can continue to be executed.
[0101] Step S414: Calculate the third spherical distance between the first starting point and the second starting point, and the fourth spherical distance between the first ending point and the second ending point.
[0102] Step S416: Determine whether the third spherical distance and the fourth spherical distance are both within the second distance threshold range.
[0103] Step S418: Calculate the third time difference between the first starting sampling time and the second starting sampling time, and the fourth time difference between the first ending sampling time and the second ending sampling time.
[0104] Step S420: Determine whether the third time difference and the fourth time difference are both within the second time threshold range.
[0105] In one embodiment, when the execution result of step S416 is negative, the driver trajectory information is determined to be abnormal, that is, step S428 is executed, and the corresponding waybill information is marked as abnormal driver origin address, abnormal driver destination address, abnormal driver origin administrative division, or abnormal driver destination administrative division, that is, step S430 is executed, and the process ends.
[0106] In one embodiment, when the execution result of step S420 is negative, the driver trajectory information is determined to be abnormal, that is, step S428 is executed, the corresponding waybill information is marked with abnormal driver start time or abnormal driver end time, that is, step S430 is executed, and the process ends.
[0107] In one embodiment, when the execution result of step S416 is yes, step S418 is executed, followed by step S420. When the execution result of step S420 is yes, it means that the second starting point and the second ending point of the driver's trajectory match the first starting point and the first ending point of the vehicle trajectory normally, including normal geographical location, normal administrative division, and normal sampling time. Step S422 and subsequent steps can continue to be executed.
[0108] In one embodiment, steps S414 and S416 can be swapped with steps S418 and S420. That is, step S418 is executed first, followed by step S420. When the execution result of step S420 is yes, step S414 is executed, followed by step S416. When the execution result of step S416 is yes, it indicates that the second starting point and second ending point of the driver's trajectory match the first starting point and first ending point of the vehicle trajectory normally, and step S422 and subsequent steps can continue to be executed.
[0109] Step S422: Determine whether multiple second intermediate points match multiple first intermediate points.
[0110] In this embodiment, the determination of whether multiple second intermediate points match multiple first intermediate points in steps S422 and S210 is consistent, and will not be repeated here.
[0111] Step S424: Determine that the waybill information is normal.
[0112] In this embodiment, after step S404 is executed, it is first determined whether the vehicle trajectory information is normal. That is, if the first spherical distance and the second spherical distance are within the first distance threshold range, and the first time difference and the second time difference are within the first time threshold range, it indicates that the vehicle trajectory information is normal. If the vehicle trajectory information is normal, it is then determined whether the driver trajectory information is normal.
[0113] In this embodiment, once the vehicle trajectory information is normal, it is determined whether the driver trajectory information is normal. That is, when the third spherical distance and the fourth spherical distance are both within the second distance threshold range, and the third time difference and the fourth time difference are both within the second time threshold range, it is determined whether the multiple second intermediate points of the driver trajectory match the multiple first intermediate points. When they match, the waybill information corresponding to the driver trajectory information and the vehicle trajectory information is normal.
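The decision order of steps S406 to S430 can be summarized in a short sketch. It is illustrative only: the `Thresholds` class, the dictionary keys, and the reading of "within the threshold range" as "absolute value not greater than the threshold" are assumptions, and the administrative division checks are omitted for brevity. Distances are in metres and time differences in seconds.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Thresholds:
    first_distance_m: float   # vehicle vs. waybill addresses
    first_time_s: float       # vehicle vs. waybill times
    second_distance_m: float  # driver vs. vehicle start and end points
    second_time_s: float      # driver vs. vehicle start and end times


def label_waybill(dist: Dict[str, float], tdiff: Dict[str, float],
                  thr: Thresholds, midpoints_match: bool) -> List[str]:
    """Return anomaly labels, stopping at the first failed stage."""
    stages = [
        # S408: vehicle start and end addresses
        [("first", dist, thr.first_distance_m, "abnormal vehicle origin address"),
         ("second", dist, thr.first_distance_m, "abnormal vehicle destination address")],
        # S412: vehicle start and end times
        [("first", tdiff, thr.first_time_s, "abnormal vehicle start time"),
         ("second", tdiff, thr.first_time_s, "abnormal vehicle end time")],
        # S416: driver start and end addresses
        [("third", dist, thr.second_distance_m, "abnormal driver origin address"),
         ("fourth", dist, thr.second_distance_m, "abnormal driver destination address")],
        # S420: driver start and end times
        [("third", tdiff, thr.second_time_s, "abnormal driver start time"),
         ("fourth", tdiff, thr.second_time_s, "abnormal driver end time")],
    ]
    for stage in stages:
        labels = [label for key, values, limit, label in stage if abs(values[key]) > limit]
        if labels:
            return labels  # S426 or S428, then S430: mark and end
    if not midpoints_match:  # S422
        return ["abnormal driver intermediate point address"]
    return ["normal"]  # S424
```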
[0114] In one embodiment, as shown in Figure 6, a device for cleaning and anomaly labeling vehicle waybill trajectory data is provided. The device includes:
[0115] The waybill information acquisition module 602 is used to acquire waybill identifier, shipping address, delivery address, shipping time, and delivery time.
[0116] The trajectory information acquisition module 604 is used to acquire the corresponding driver trajectory information and vehicle trajectory information based on the waybill identifier.
[0117] The trajectory information verification module 606 is used to calculate the first spherical distance between the first starting point and the shipping address, and the second spherical distance between the first ending point and the delivery address. When both the first and second spherical distances are within a first distance threshold range, it calculates the first time difference between the sampling time of the first starting point and the shipping time, and the second time difference between the sampling time of the first ending point and the delivery time. When either the first or second spherical distance is not within the first distance threshold range, or when either the first or second time difference is not within the first time threshold range, the vehicle trajectory information is determined to be abnormal. The module also calculates the third spherical distance between the first and second starting points, and the fourth spherical distance between the first and second ending points. When the third and fourth spherical distances are both within the second distance threshold range, it calculates the third time difference between the sampling times of the first and second starting points and the fourth time difference between the sampling times of the first and second ending points. When either the third or fourth spherical distance is not within the second distance threshold range, or when either the third or fourth time difference is not within the second time threshold range, the driver's trajectory information is determined to be abnormal. When both the third and fourth spherical distances are within the second distance threshold range, and both the third and fourth time differences are within the second time threshold range, it determines whether multiple second intermediate points match multiple first intermediate points. If they do not match, the driver's trajectory information is determined to be abnormal.
[0118] The waybill information marking module 608 is used to mark the waybill information as abnormal when there is a mismatch, or when the vehicle trajectory information is abnormal, or when the driver trajectory information is abnormal.
[0119] In one embodiment, a computer storage medium is provided that stores computer-executable instructions which, when executed by a processor, cause the processor to perform the steps of the vehicle waybill trajectory data cleaning and anomaly labeling method in any of the above embodiments.
[0120] In one embodiment, an electronic device is provided, which may be the server described above. It includes a memory and a processor. The memory stores a computer program, which, when executed by the processor, causes the processor to perform the steps of the vehicle waybill trajectory data cleaning and anomaly labeling method in any of the above embodiments.
[0121] In one embodiment, as shown in Figure 7, server 800 includes a central processing unit (CPU) 801, which can perform various appropriate actions and processes based on programs stored in read-only memory (ROM) 802 or programs loaded from storage section 808 into random access memory (RAM) 803. RAM 803 also stores various programs and data required for the operation of server 800. CPU 801, ROM 802, and RAM 803 are interconnected via bus 804. Input/output (I/O) interface 805 is also connected to bus 804.
[0122] The following components are connected to I/O interface 805: an input section 806 including a keyboard, mouse, etc.; an output section 807 including a cathode ray tube (CRT), liquid crystal display (LCD), etc., and speakers, etc.; a storage section 808 including a hard disk, etc.; and a communication section 809 including a network interface card such as a LAN card, modem, etc. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, optical disk, magneto-optical disk, semiconductor memory, etc., is installed on drive 810 as needed so that computer programs read from it can be installed into storage section 808 as needed.
[0123] The integrated units described above, when implemented as software functional units, can be stored in a computer-readable storage medium. These software functional units, stored in a storage medium, include several instructions that cause a computer device (which may be a personal computer, server, network device, etc.) or a processor to execute some of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
[0124] Although exemplary embodiments have been described, it will be apparent to those skilled in the art that various changes and modifications can be made without departing from the spirit and scope of the invention. Therefore, it should be understood that the above exemplary embodiments are not restrictive but illustrative.
Claims
1. A method for cleaning and anomaly labeling of vehicle waybill trajectory data, characterized in that the method includes: obtaining waybill information, which includes a waybill identifier, shipping address, delivery address, shipping time, and delivery time; obtaining, based on the waybill identifier, the corresponding driver trajectory information and vehicle trajectory information, wherein the vehicle trajectory information includes a first starting point, a first ending point, the sampling time of the first starting point, the sampling time of the first ending point, multiple first intermediate points, and the first sampling time of each first intermediate point; the driver trajectory information includes a second starting point, a second ending point, the sampling time of the second starting point, the sampling time of the second ending point, multiple second intermediate points, and the second sampling time of each second intermediate point; and the driver trajectory information is obtained by a location positioning plugin; calculating the first spherical distance between the first starting point and the shipping address and the second spherical distance between the first ending point and the delivery address; when both the first spherical distance and the second spherical distance are within a first distance threshold range, calculating the first time difference between the sampling time of the first starting point and the shipping time and the second time difference between the sampling time of the first ending point and the delivery time; when either the first spherical distance or the second spherical distance is not within the first distance threshold range, or when either the first time difference or the second time difference is not within a first time threshold range, determining that the vehicle trajectory information is abnormal;
calculating the third spherical distance between the first starting point and the second starting point, and the fourth spherical distance between the first ending point and the second ending point; when both the third spherical distance and the fourth spherical distance are within a second distance threshold range, calculating the third time difference between the sampling time of the first starting point and the sampling time of the second starting point, and the fourth time difference between the sampling time of the first ending point and the sampling time of the second ending point; when the third spherical distance or the fourth spherical distance is not within the second distance threshold range, or when the third time difference or the fourth time difference is not within a second time threshold range, determining that the driver trajectory information is abnormal; when the third spherical distance and the fourth spherical distance are both within the second distance threshold range, and the third time difference and the fourth time difference are both within the second time threshold range, determining whether the multiple second intermediate points match the multiple first intermediate points; wherein the number of second intermediate points of the driver trajectory to be reported is preset according to the transportation distance or vehicle speed, and the driver trajectory information is judged to be abnormal based on the number of times the second intermediate points are reported; and when there is a mismatch, or when the vehicle trajectory information is abnormal, or when the driver trajectory information is abnormal, marking the waybill information as abnormal.
2. The method according to claim 1, characterized in that the step of determining whether the multiple second intermediate points match the multiple first intermediate points includes: taking each second intermediate point as the center and a preset third distance threshold as the radius, delineating a first range area corresponding to each second intermediate point, and connecting the first intermediate points in order of their first sampling times to form a first trajectory line; and detecting whether each first range area is tangent to or intersects the first trajectory line; if so, a match is determined; otherwise, a mismatch is determined.
3. The method according to claim 1, characterized in that the step of determining whether the multiple second intermediate points match the multiple first intermediate points includes: when the first sampling times and the second sampling times are synchronized, calculating the fifth spherical distance between the first intermediate point and the second intermediate point that correspond to the same sampling time; when each fifth spherical distance is within a third distance threshold range, a match is determined; otherwise, a mismatch is determined.
4. The method according to claim 1, characterized in that the step of determining whether the multiple second intermediate points match the multiple first intermediate points includes: delineating multiple second range areas, each taking the line segment between two adjacent first intermediate points as its diameter; and determining whether each second intermediate point lies within at least one second range area; if so, a match is determined; otherwise, a mismatch is determined.
5. The method according to claim 1, characterized in that the step of determining whether the multiple second intermediate points match the multiple first intermediate points includes: connecting the first intermediate points in order of their first sampling times to form a first trajectory line, and connecting the second intermediate points in order of their second sampling times to form a second trajectory line; and calculating the similarity between the first trajectory line and the second trajectory line; if the similarity exceeds a similarity threshold, a match is determined; otherwise, a mismatch is determined.
6. The method according to claim 1, characterized in that the step of determining whether the multiple second intermediate points match the multiple first intermediate points includes: when the first sampling times and the second sampling times are not synchronized, calculating, for the first sampling time of each first intermediate point, a third intermediate point of the driver at that first sampling time according to the multiple second intermediate points and their second sampling times; and calculating the sixth spherical distance between each first intermediate point and the corresponding third intermediate point at the same time; if each sixth spherical distance is within a third distance threshold range, a match is determined; otherwise, a mismatch is determined.
7. The method according to claim 6, characterized in that the step of calculating the third intermediate point of the driver at the first sampling time, according to the multiple second intermediate points and their second sampling times, includes: selecting the two second sampling times closest to the first sampling time, calculating the intermediate point at which the driver is located at the first sampling time based on the two second intermediate points corresponding to the two selected second sampling times, and using the calculated intermediate point as the corresponding third intermediate point.
8. A device for cleaning and anomaly labeling of vehicle waybill trajectory data, characterized in that the device includes: a waybill information acquisition module, used to acquire waybill information, which includes a waybill identifier, shipping address, delivery address, shipping time, and delivery time; a trajectory information acquisition module, used to acquire the corresponding driver trajectory information and vehicle trajectory information based on the waybill identifier, wherein the vehicle trajectory information includes a first starting point, a first ending point, the sampling time of the first starting point, the sampling time of the first ending point, multiple first intermediate points, and the first sampling time of each first intermediate point; the driver trajectory information includes a second starting point, a second ending point, the sampling time of the second starting point, the sampling time of the second ending point, multiple second intermediate points, and the second sampling time of each second intermediate point; and the driver trajectory information is obtained by a location positioning plugin; a trajectory information verification module, used to calculate the first spherical distance between the first starting point and the shipping address, and the second spherical distance between the first ending point and the delivery address; when both the first spherical distance and the second spherical distance are within a first distance threshold range, to calculate the first time difference between the sampling time of the first starting point and the shipping time, and the second time difference between the sampling time of the first ending point and the delivery time;
when either the first spherical distance or the second spherical distance is not within the first distance threshold range, or when either the first time difference or the second time difference is not within a first time threshold range, to determine that the vehicle trajectory information is abnormal; to calculate the third spherical distance between the first starting point and the second starting point, and the fourth spherical distance between the first ending point and the second ending point; when both the third spherical distance and the fourth spherical distance are within a second distance threshold range, to calculate the third time difference between the sampling time of the first starting point and the sampling time of the second starting point, and the fourth time difference between the sampling time of the first ending point and the sampling time of the second ending point; when the third spherical distance or the fourth spherical distance is not within the second distance threshold range, or when the third time difference or the fourth time difference is not within a second time threshold range, to determine that the driver trajectory information is abnormal; when both the third and fourth spherical distances are within the second distance threshold range, and both the third and fourth time differences are within the second time threshold range, to determine whether the multiple second intermediate points match the multiple first intermediate points; wherein the number of second intermediate points of the driver trajectory to be reported is preset according to the transportation distance or vehicle speed, and the driver trajectory information is determined to be abnormal based on the number of times the second intermediate points are reported; and
a waybill information marking module, used to mark the waybill information as abnormal when there is a mismatch, or when the vehicle trajectory information is abnormal, or when the driver trajectory information is abnormal.
9. An electronic device, characterized in that it includes: one or more processors; and a memory, used to store one or more programs, wherein, when the one or more programs are executed by the one or more processors, the one or more processors perform the method as described in any one of claims 1 to 7.
10. A computer-readable medium, characterized in that the computer-readable medium stores executable instructions that, when executed by a processor, cause the processor to perform the method as described in any one of claims 1 to 7.
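As an illustration of the unsynchronized matching described in claims 6 and 7, the following minimal sketch estimates the driver position at each vehicle sampling time and compares it with the vehicle point. Several choices here are assumptions rather than anything stated in the claims: the two driver samples that bracket the vehicle sampling time stand in for "the two closest second sampling times", the position is linearly interpolated in latitude and longitude, the haversine formula serves as the spherical distance, and the names `haversine_m`, `interpolate_driver_point`, and `intermediate_points_match` are hypothetical.

```python
import math
from bisect import bisect_left

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine formula, assumed)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def interpolate_driver_point(t, driver_points):
    """Estimate the driver's (lat, lon) at time t.

    driver_points is a list of (time_s, lat, lon) sorted by time with
    at least two entries and distinct times. The two samples bracketing
    t are used; outside the sampled range the nearest sample is used.
    """
    times = [p[0] for p in driver_points]
    i = bisect_left(times, t)
    i = min(max(i, 1), len(driver_points) - 1)  # pick a valid pair (i-1, i)
    t0, lat0, lon0 = driver_points[i - 1]
    t1, lat1, lon1 = driver_points[i]
    w = (t - t0) / (t1 - t0)
    w = min(max(w, 0.0), 1.0)  # clamp instead of extrapolating
    return lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0)


def intermediate_points_match(vehicle_points, driver_points, max_dist_m):
    """True when every vehicle point lies within max_dist_m of the
    interpolated driver position at the same sampling time."""
    for t, lat, lon in vehicle_points:
        d_lat, d_lon = interpolate_driver_point(t, driver_points)
        if haversine_m(lat, lon, d_lat, d_lon) > max_dist_m:
            return False
    return True


# Example: driver sampled at 0 s and 10 s, vehicle sampled at 5 s.
driver = [(0, 0.0, 0.0), (10, 0.0, 0.001)]
vehicle_near = [(5, 0.0, 0.0005)]
vehicle_far = [(5, 0.0, 0.01)]
near_matches = intermediate_points_match(vehicle_near, driver, max_dist_m=100.0)
far_matches = intermediate_points_match(vehicle_far, driver, max_dist_m=100.0)
```

When the two sampling sequences are synchronized, as in claim 3, the interpolation returns the driver sample itself, so the same comparison reduces to a point-by-point distance check.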