A marking line coding method and device, electronic equipment and storage medium
By identifying and encoding road signs and markings in road video frames, the problem of the lack of digital information in traditional road signs and markings has been solved, and standardized coding that can be recognized by machines has been achieved, thereby improving the perception and decision-making capabilities of autonomous driving systems.
Patent Information
- Authority / Receiving Office
- CN · China
- Patent Type
- Applications(China)
- Current Assignee / Owner
- RES INST OF HIGHWAY MINIST OF TRANSPORT
- Filing Date
- 2026-03-31
- Publication Date
- 2026-06-19
Figure CN122244834A_ABST
Abstract
Description
Technical Field
[0001] This application relates to the field of computer technology, and more specifically, to a method, apparatus, electronic device, and storage medium for encoding traffic signs and markings.
Background Technology
[0002] With the widespread application of autonomous driving technology and intelligent transportation, traditional traffic signs and markings can no longer fully meet the high-precision and high-reliability requirements of autonomous driving. Autonomous vehicles rely on perception systems (such as LiDAR, cameras, and ultrasonic sensors) to "understand" road conditions. This requires that signs and markings not only be accurate and clear in physical form, but also provide readable digital information for the autonomous driving system. Currently, most road signs and markings are presented primarily in a human visual form, guiding traditional drivers, but lacking sufficient digital information support for autonomous driving systems, thus limiting their performance in complex traffic environments.
Summary of the Invention
[0003] The purpose of some embodiments of this application is to provide a method, apparatus, electronic device, and storage medium for encoding traffic signs and markings. In the technical solutions of these embodiments, road video frames are acquired, wherein the road video frames include station numbers and latitude and longitude information; a pre-trained recognition model is used to recognize the road video frames and determine the traffic recognition area image in the road video frames, wherein the recognition model is obtained by training a neural network model using sample data, and the traffic recognition area image includes text data and symbol identification data; a target template corresponding to the traffic recognition area image is determined according to a pre-stored traffic template database; the text data and symbol identification data from the traffic recognition area image are filled into the target template to obtain the filled traffic sign; and the filled traffic sign is encoded according to preset traffic rules and a preset encoding method to obtain the encoded information of the traffic recognition area image corresponding to the station number and the latitude and longitude information. In other words, the signs and markings in the collected road video frames are identified to determine their location information and traffic type; the traffic recognition area image is then matched against the traffic template database, and its text data and symbol identification data are filled into the target template; finally, the filled traffic sign is encoded. In this way, various types of traffic signs and markings can be standardized into corresponding encoded information that can be used for machine recognition.
[0004] In a first aspect, some embodiments of this application provide a method for encoding signs and markings, including: acquiring road video frames, wherein the road video frames include station numbers and latitude and longitude information; using a pre-trained recognition model to recognize the road video frames and determine the traffic recognition area image in the road video frames, wherein the recognition model is obtained by training a neural network model with sample data, and the traffic recognition area image includes text data and symbol identification data; determining, based on a pre-stored traffic template database, a target template corresponding to the traffic recognition area image; and filling the text data and symbol identification data in the traffic recognition area image into the target template to obtain the filled traffic sign, and encoding the filled traffic sign according to preset traffic rules and a preset encoding method to obtain the encoded information of the traffic recognition area image corresponding to the station number and the latitude and longitude information.
[0005] Some embodiments of this application identify traffic identification area images in collected road video frames, specifically identifying signs and markings in the road video frames to determine the location information and traffic type of the signs and markings. Then, the traffic identification area images are matched with a traffic template database, and the text data and symbol data from the traffic identification area images are filled into a target template. According to preset traffic rules and preset encoding methods, the filled traffic signs are encoded to obtain the encoded information of the traffic identification area images corresponding to the station number and the latitude and longitude information. In this way, various types of traffic signs and markings can be standardized to generate corresponding encoded information, which can be applied to various machine recognition methods.
[0006] Optionally, determining the target template corresponding to the traffic recognition area image based on a pre-stored traffic template database includes: The traffic identification area image is input into a pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic identification area image; Based on a pre-stored template library, the similarity between the second feature vector in the template library and the first feature vector is calculated respectively; wherein, the pre-stored template library includes templates and second feature vectors corresponding to the templates; If the similarity is less than a preset value, the template corresponding to the second feature vector corresponding to the similarity will be used to determine the target template corresponding to the traffic recognition area image.
[0007] In some embodiments of this application, after obtaining the signs and markings in the road video frame, the signs and markings are matched with a pre-stored template. This can standardize various similar traffic recognition area images and obtain a template that can be automatically recognized by the machine.
[0008] Optionally, the step of filling the target template with text data and symbol data from the traffic recognition area image to obtain the filled traffic sign includes: Obtain the configuration parameters of the target template, including slot name, slot address information and keyword mapping information; Map the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image; If the text location coordinates and symbol identification coordinates of the target template match the traffic recognition area image, the text data and the symbol identification data are filled into the corresponding slots in the target template according to the configuration parameters.
[0009] Some embodiments of this application match the positions of each slot in the obtained target template with the text and symbols in the traffic recognition area image, and then fill the text and symbols into the corresponding slots of the target template to facilitate subsequent unified encoding.
[0010] Optionally, filling the text data and the symbol identification data into the corresponding slots in the target template according to the configuration parameters includes: The text data and the symbol identification data are verified using preset verification rules to obtain text data and symbol identification data that pass the verification. The preset verification rules include format verification rules, logic verification rules and ambiguity resolution verification rules. By using a pre-defined traffic domain mapping library, the verified text data and symbol identification data are standardized to obtain standardized text data and symbol identification data. Based on the slot name, slot address information, and keyword mapping information in the configuration parameters, the standardized text data and symbol identification data are respectively filled into the corresponding slots in the target template.
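The verify, standardize, and fill sequence described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the configuration layout (`TEMPLATE_CONFIG`), the mapping library contents (`DOMAIN_MAP`), the item kinds, and the specific format, logic, and ambiguity rules are hypothetical and are not taken from the application.

```python
import re

# Hypothetical configuration parameters of a target template:
# slot names, slot address information, and keyword mapping information.
TEMPLATE_CONFIG = {
    "slots": {"speed_value": {"address": (0, 0)}, "direction": {"address": (0, 1)}},
    "keyword_map": {"speed": "speed_value", "arrow": "direction"},
}

# Hypothetical traffic domain mapping library used for standardization.
DOMAIN_MAP = {"up_arrow": "straight", "left_arrow": "left_turn", "right_arrow": "right_turn"}


def resolve_ambiguity(kind, value):
    """Ambiguity rule (assumed): in numeric slots, read 'O' as '0' and 'l' as '1'."""
    if kind == "speed":
        return value.replace("O", "0").replace("l", "1")
    return value


def passes_format(kind, value):
    """Format rule (assumed): a speed is one to three digits; other values are non-empty."""
    if kind == "speed":
        return re.fullmatch(r"\d{1,3}", value) is not None
    return bool(value)


def passes_logic(kind, value):
    """Logic rule (assumed): a speed limit lies between 5 and 120."""
    if kind == "speed":
        return 5 <= int(value) <= 120
    return True


def fill_slots(items, config=TEMPLATE_CONFIG):
    """Verify each recognized item, standardize it, and fill it into its slot."""
    filled = {name: None for name in config["slots"]}
    for item in items:
        slot = config["keyword_map"].get(item["kind"])
        if slot is None:
            continue  # no slot is configured for this kind of item
        value = resolve_ambiguity(item["kind"], item["value"].strip())
        if not (passes_format(item["kind"], value) and passes_logic(item["kind"], value)):
            continue  # item fails verification and is not filled
        filled[slot] = DOMAIN_MAP.get(value, value)
    return filled
```

In this sketch an item that fails verification is simply skipped, so its slot stays empty; a real system might instead flag it for review.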
[0011] In some embodiments of this application, standardized text data and symbol identification data are obtained by verifying and standardizing the text data and symbol identification data, and then the standardized data is filled into the corresponding slots of the target template, thereby improving the accuracy of the sign and marking encoding.
Optionally, mapping the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image includes: according to a preset transformation matrix, mapping the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image to obtain the mapped target template; calculating the overlap region and Euclidean distance between the mapped target template and the traffic recognition area image; and, if the overlapping area is smaller than a preset area value and the Euclidean distance is smaller than a preset distance, determining that the traffic recognition area image and the target template match.
[0012] Some embodiments of this application map the target template onto the traffic recognition area image, effectively resisting coordinate offset caused by shooting distortion; and adopt a "nearest neighbor priority" strategy for overlapping areas to avoid multiple characters being mistakenly filled in the same slot.
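As a rough illustration of the coordinate mapping and the "nearest neighbor priority" idea, the sketch below maps template slot centers into the image with a 3x3 transformation matrix and assigns each slot its closest unused detection. The greedy assignment, the data layout, and the names `apply_homography` and `assign_nearest` are assumptions; the overlap-area test mentioned in the text is omitted here.

```python
import math


def apply_homography(H, point):
    """Map a template point (x, y) into image coordinates with a 3x3 matrix H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def assign_nearest(slot_centers, detections, H, max_dist):
    """Give each slot at most one detection, closest pairs first (assumed strategy)."""
    mapped = {name: apply_homography(H, c) for name, c in slot_centers.items()}
    # All (distance, slot, detection index) pairs, sorted by Euclidean distance.
    pairs = sorted(
        (math.dist(mapped[slot], det["center"]), slot, i)
        for slot in mapped
        for i, det in enumerate(detections)
    )
    assignment, used = {}, set()
    for dist, slot, i in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if slot in assignment or i in used:
            continue  # one detection per slot, one slot per detection
        assignment[slot] = detections[i]["text"]
        used.add(i)
    return assignment
```

Because every detection is consumed at most once and every slot is filled at most once, two characters cannot end up in the same slot, which is the effect the paragraph above describes.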
[0013] Optionally, the method further includes: Having obtained the encoding information of the traffic identification area image corresponding to the station number and the latitude and longitude information, the image is verified in the preset road network database based on the latitude and longitude information and semantic information of the traffic identification area image. If the latitude, longitude, and semantic information of the traffic identification area image do not match the information in the preset road network database, the traffic identification area image will be corrected.
[0014] Some embodiments of this application combine road network information to verify, correct, and improve the coded signs, thereby improving the accuracy of the generated signs and markings.
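One possible shape of this road network check is sketched below. The record fields (`lat`, `lon`, `road_name`), the 50 m search radius, and the rule of overwriting the semantic field with the road network value are all assumptions; the application does not specify how the correction is carried out.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def verify_against_road_network(encoded, road_network, max_offset_m=50.0):
    """Return (is_consistent, record); the record is corrected when a mismatch is found."""
    def offset(rec):
        return haversine_m(encoded["lat"], encoded["lon"], rec["lat"], rec["lon"])

    nearest = min(road_network, key=offset)
    if offset(nearest) > max_offset_m:
        return False, encoded  # no nearby reference record; left unchanged
    if encoded["road_name"] == nearest["road_name"]:
        return True, encoded
    # Assumed correction rule: trust the road network for the semantic field.
    return False, dict(encoded, road_name=nearest["road_name"])
```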
[0015] Optionally, the acquisition of road video frames, wherein the road video frames include station numbers and latitude and longitude information, includes: Acquire road data, wherein the road data includes at least road video data, radar data, latitude and longitude information, and station numbers; Based on the timestamp, the road video data, the radar data, the latitude and longitude information, and the station number are associated to obtain the associated road data; The road video frame is determined based on the associated road data.
[0016] Optionally, the station number is determined based on latitude and longitude information, operating time, radar data, and the mileage station number identified through a station number recognition model.
[0017] In a second aspect, some embodiments of this application provide a device for encoding signs and markings, including: an acquisition module, used to acquire road video frames, wherein the road video frames include station numbers and latitude and longitude information; a recognition module, used to recognize the road video frames using a pre-trained recognition model to determine the traffic recognition area image in the road video frames, wherein the recognition model is obtained by training a neural network model using sample data, and the traffic recognition area image includes text data and symbol identification data; a matching module, used to determine the target template corresponding to the traffic recognition area image based on a pre-stored traffic template database; and a generation module, used to fill the target template with the text data and symbol identification data from the traffic recognition area image to obtain the filled traffic sign, and to encode the filled traffic sign according to preset traffic rules and a preset encoding method to obtain the encoded information of the traffic recognition area image corresponding to the station number and the latitude and longitude information.
[0018] Some embodiments of this application identify traffic identification area images in collected road video frames, specifically identifying signs and markings in the road video frames to determine the location information and traffic type of the signs and markings. Then, the traffic identification area images are matched with a traffic template database, and the text data and symbol data from the traffic identification area images are filled into a target template. According to preset traffic rules and preset encoding methods, the filled traffic signs are encoded to obtain the encoded information of the traffic identification area images corresponding to the station number and the latitude and longitude information. In this way, various types of traffic signs and markings can be standardized to generate corresponding encoded information, which can be applied to various machine recognition methods.
[0019] Optionally, the matching module is used to: The traffic identification area image is input into a pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic identification area image; Based on a pre-stored template library, the similarity between the second feature vector in the template library and the first feature vector is calculated respectively; wherein, the pre-stored template library includes templates and second feature vectors corresponding to the templates; If the similarity is less than a preset value, the template corresponding to the second feature vector corresponding to the similarity will be used to determine the target template corresponding to the traffic recognition area image.
[0020] In some embodiments of this application, after obtaining the signs and markings in the road video frame, the signs and markings are matched with a pre-stored template. This can standardize various similar traffic recognition area images and obtain a template that can be automatically recognized by the machine.
[0021] Optionally, the matching module is used to: Obtain the configuration parameters of the target template, including slot name, slot address information and keyword mapping information; Map the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image; If the text location coordinates and symbol identification coordinates of the target template match the traffic recognition area image, the text data and the symbol identification data are filled into the corresponding slots in the target template according to the configuration parameters.
[0022] Some embodiments of this application match the positions of each slot in the obtained target template with the text and symbols in the traffic recognition area image, and then fill the text and symbols into the corresponding slots of the target template to facilitate subsequent unified encoding.
[0023] Optionally, the matching module is used to: The text data and the symbol identification data are verified using preset verification rules to obtain text data and symbol identification data that pass the verification. The preset verification rules include format verification rules, logic verification rules and ambiguity resolution verification rules. By using a pre-defined traffic domain mapping library, the verified text data and symbol identification data are standardized to obtain standardized text data and symbol identification data. Based on the slot name, slot address information, and keyword mapping information in the configuration parameters, the standardized text data and symbol identification data are respectively filled into the corresponding slots in the target template.
[0024] In some embodiments of this application, standardized text data and symbol identification data are obtained by verifying and standardizing the text data and symbol identification data, and then the standardized data is filled into the corresponding slots of the target template, thereby improving the accuracy of the sign and marking encoding.
Optionally, the matching module is used to: according to a preset transformation matrix, map the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image to obtain the mapped target template; calculate the overlap region and Euclidean distance between the mapped target template and the traffic recognition area image; and, if the overlapping area is smaller than a preset area value and the Euclidean distance is smaller than a preset distance, determine that the traffic recognition area image and the target template match.
[0025] Some embodiments of this application map the target template onto the traffic recognition area image, effectively resisting coordinate offset caused by shooting distortion; and adopt a "nearest neighbor priority" strategy for overlapping areas to avoid multiple characters being mistakenly filled in the same slot.
[0026] Optionally, the generation module is used for: Having obtained the encoding information of the traffic identification area image corresponding to the station number and the latitude and longitude information, the image is verified in the preset road network database based on the latitude and longitude information and semantic information of the traffic identification area image. If the latitude, longitude, and semantic information of the traffic identification area image do not match the information in the preset road network database, the traffic identification area image will be corrected.
[0027] Some embodiments of this application combine road network information to verify, correct, and improve the coded signs, thereby improving the accuracy of the generated signs and markings.
[0028] Optionally, the acquisition module is used to: Acquire road data, wherein the road data includes at least road video data, radar data, latitude and longitude information, and station numbers; Based on the timestamp, the road video data, the radar data, the latitude and longitude information, and the station number are associated to obtain the associated road data; The road video frame is determined based on the associated road data.
[0029] Optionally, the station number is determined based on latitude and longitude information, operating time, radar data, and the mileage station number identified through a station number recognition model.
[0030] In a third aspect, some embodiments of this application provide an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, can implement the sign and marking encoding method as described in any embodiment of the first aspect.
[0031] In a fourth aspect, some embodiments of this application provide a computer-readable storage medium having a computer program stored thereon, which, when executed by a processor, can implement the sign and marking encoding method as described in any embodiment of the first aspect.
[0032] In a fifth aspect, some embodiments of this application provide a computer program product, the computer program product including a computer program, wherein when the computer program is executed by a processor, it can implement the sign and marking encoding method as described in any embodiment of the first aspect.
Description of the Drawings
[0033] To more clearly illustrate the technical solutions of some embodiments of this application, the accompanying drawings used in some embodiments of this application will be briefly described below. It should be understood that the following drawings only show some embodiments of this application and should not be regarded as a limitation of the scope. For those skilled in the art, other related drawings can be obtained based on these drawings without creative effort.
[0034] Figure 1 is a flowchart of a method for encoding signs and markings provided in an embodiment of this application;
Figure 2 is a flowchart of another method for encoding signs and markings provided in an embodiment of this application;
Figure 3 is a schematic diagram of the identified signs and markings provided in an embodiment of this application;
Figure 4 is a schematic diagram of a template provided in an embodiment of this application;
Figure 5 is a flowchart of the training process of the feature extraction model provided in an embodiment of this application;
Figure 6 is a schematic diagram of the slot filling process provided in an embodiment of this application;
Figure 7 is a schematic diagram of the filled template provided in an embodiment of this application;
Figure 8 is a schematic diagram of the encoded information corresponding to the filled template provided in an embodiment of this application;
Figure 9 is a schematic diagram of the structure of a device for encoding signs and markings provided in an embodiment of this application;
Figure 10 is a schematic diagram of an electronic device provided in an embodiment of this application.
Detailed Description
[0035] The technical solutions of some embodiments of this application will now be described with reference to the accompanying drawings.
[0036] It should be noted that similar reference numerals and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures. Furthermore, in the description of this application, terms such as "first," "second," etc., are used only to distinguish descriptions and should not be construed as indicating or implying relative importance.
[0037] With the widespread application of autonomous driving technology and intelligent transportation, traditional traffic signs and markings can no longer fully meet the high-precision and high-reliability requirements of autonomous driving. Autonomous vehicles rely on perception systems (such as LiDAR, cameras, and ultrasonic sensors) to "understand" road conditions. This requires that signs and markings not only be accurate and clear in physical form, but also provide readable digital information for the autonomous driving system. However, most current road signs and markings are primarily visual, guiding traditional drivers but lacking sufficient digital information support for autonomous driving systems, thus limiting their performance in complex traffic environments.
[0038] The current traffic sign and marking design system is essentially a human-computer interaction system built on human cognitive patterns, and its information transmission mechanism has significant limitations in scenario adaptability. Specifically, in traditional driving scenarios, directional signs follow the principle of "minimizing information presentation," only needing to convey key route selection elements to the driver; the completeness, semantic uniqueness, and logical consistency of the sign's information are not core considerations. However, when the application scenario switches to autonomous driving systems, the information interaction logic of traffic signs undergoes a fundamental change—requiring the establishment of a standardized semantic system based on machine recognition. Each visual element must carry unique and definite directional information, and any semantic ambiguity or missing information will directly affect the vehicle's environmental perception accuracy and decision-making reliability.
[0039] The fundamental difference in this design paradigm stems from the essential distinction between human drivers and autonomous driving systems in terms of information processing: humans rely on pattern recognition and experience-based inference to complete information, while autonomous driving systems strictly adhere to structured data-driven decision-making logic. Therefore, upgrading traffic signage systems for vehicle-road cooperation requires breaking through the traditional "human-centered" design framework and constructing a standardized information coding system that meets the perception needs of autonomous driving, ensuring precise alignment between physical traffic signs and semantic mapping in the digital world.
[0040] In the process of coding road signs and markings, it is necessary not only to meet the design specifications of the signs and markings, but also to master the digital coding method and be familiar with the local road network. Therefore, coding directional signs requires a combination of digital coding logic, directional sign design, and a certain level of understanding of the local road network. Therefore, this application provides a vehicle-mounted automatic coding method for road signs and markings, which can automatically identify road signs and markings along the route (including location, content, attributes, etc.) as the vehicle travels, and generate standard codes that can express the meaning of the signs and markings.
[0041] As shown in Figure 1, an embodiment of this application provides a method for encoding signs and markings, the method comprising: S101. Acquire road video frames, wherein the road video frames include station numbers and latitude and longitude information. Specifically, the terminal device acquires road data, which includes road video data, radar data, latitude and longitude information, and station numbers. The road video data is acquired through industrial cameras or action cameras, and the radar data is acquired through radar equipment. Station number recognition is then performed on the acquired road video data to obtain the station numbers corresponding to video frames within a preset time period. At the same time, GPS is used to acquire the latitude and longitude information of the video frames.
[0042] S102. Using a pre-trained recognition model, the road video frame is identified to determine the traffic recognition area image in the road video frame and the traffic type corresponding to the traffic recognition area image. The recognition model is obtained by training a neural network model with sample data. The recognition model is used to identify the location and traffic type of the traffic recognition area image in the road video frame. The traffic recognition area image includes text data and symbol identification data. Specifically, the terminal device trains a neural network model based on sample data. The neural network model is a YOLOv11 model, which is used to identify the location information of signs and markings and traffic types in road video frames.
[0043] For each road video frame, the terminal device extracts features to obtain a feature vector corresponding to the road video frame. This feature vector is then input into a pre-trained recognition model to obtain the location information and traffic signs of the traffic recognition area image in the road video frame. Specifically, the traffic recognition area image consists of road signs and markings that conform to the traffic standard GB5768. The traffic signs include speed limit signs, intersection signs, warning signs, exit number signs, and directional signs.
[0044] Subsequently, after acquiring the traffic recognition area image, the terminal device uses PaddleOCR to recognize the text data in the traffic recognition area image, which includes text and numbers. Then, the YOLOv11 model is used to recognize the symbols in the traffic recognition area image other than text and numbers, including straight-ahead markings, turning markings, and arrows.
[0045] S103. Determine the target template corresponding to the traffic recognition area image based on the pre-stored traffic template database; Specifically, the terminal device has a traffic template database pre-stored, which includes a variety of standard templates. After acquiring a traffic recognition area image, the terminal device extracts the feature vector of the traffic recognition area image and calculates the similarity between the feature vector of the traffic recognition area image and the feature vector of each standard template in the traffic template database. If the similarity is less than a preset value, the standard template corresponding to the similarity is determined as the target template corresponding to the traffic recognition area image.
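The template lookup in S103 can be sketched as follows. The text accepts a template when the similarity is less than a preset value, so this sketch assumes the measure behaves like a distance (smaller means closer) and uses cosine distance; the choice of measure, the threshold of 0.2, the rule of keeping only the closest template, and the toy feature vectors are all assumptions for illustration.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, larger means less alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / norm


def match_template(query_vec, template_library, threshold=0.2):
    """Return the name of the closest template, or None if none is below the threshold."""
    best_name, best_dist = None, float("inf")
    for name, template_vec in template_library.items():
        dist = cosine_distance(query_vec, template_vec)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None
```

For example, with a hypothetical library `{"speed_limit": [1.0, 0.0, 0.0], "exit_direction": [0.0, 1.0, 0.0]}`, a query vector close to the first entry selects `speed_limit`, while a vector orthogonal to both entries matches nothing.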
[0046] S104. Fill the target template with the text data and symbol identification data in the traffic identification area image to obtain the filled traffic sign. Then, according to the preset traffic rules and preset encoding method, encode the filled traffic sign to obtain the encoding information of the traffic identification area image corresponding to the station number and the latitude and longitude information.
[0047] Specifically, the terminal device extracts the text data and symbol data from the traffic recognition area image from the original image, and fills the text data and symbol data into the corresponding positions of the target template according to the positional mapping relationship between the traffic recognition area image and the target template to obtain the filled traffic sign. Then, according to the preset traffic rules and preset encoding method, the filled traffic sign is encoded to obtain the encoded information corresponding to the traffic recognition area image with the station number and latitude and longitude information. The preset traffic rules are various standard rules that comply with traffic regulations, and the preset encoding method can be JSON, etc.
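Since the text names JSON as one possible encoding method, the final step might look like the sketch below. The field names (`station_number`, `location`, `sign_type`, `slots`) and the example values are hypothetical; the actual schema is the one shown in the application's figures and is not reproduced here.

```python
import json


def encode_sign(filled_slots, sign_type, station_number, latitude, longitude):
    """Serialize a filled traffic sign together with its station number and position."""
    record = {
        "station_number": station_number,
        "location": {"latitude": latitude, "longitude": longitude},
        "sign_type": sign_type,
        "slots": filled_slots,  # slot name -> standardized text or symbol value
    }
    # sort_keys gives a stable field order; ensure_ascii=False keeps non-ASCII place names readable.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)
```

Calling `encode_sign({"speed_value": "60"}, "speed_limit", "K12+300", 39.9, 116.4)` returns a JSON string that downstream systems can parse back into the same fields.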
[0048] Some embodiments of this application identify traffic identification area images in collected road video frames, specifically identifying signs and markings in the road video frames to determine the location information and traffic type of the signs and markings. Then, the traffic identification area images are matched with a traffic template database, and the text data and symbol data from the traffic identification area images are filled into a target template. According to preset traffic rules and preset encoding methods, the filled traffic signs are encoded to obtain the encoded information of the traffic identification area images corresponding to the station number and the latitude and longitude information. In this way, various types of traffic signs and markings can be standardized to generate corresponding encoded information, which can be applied to various machine recognition methods.
[0049] Another embodiment of this application further supplements the description of the encoding method for signs and markings provided in the above embodiments.
[0050] As shown in Figure 2, an embodiment of this application provides a complete vehicle-mounted method for rapidly encoding signs and markings, which recognizes and interprets road signs and markings during driving, performs topology analysis and correction in combination with road network information, and finally produces the encoding.
[0051] The overall process is shown in Figure 2 as a swimlane diagram divided into four main modules: the data acquisition module, the information association module, the vision module, and the correction output module.
[0052] Data acquisition module: collects basic road information (i.e., road data) through integrated vehicle-mounted equipment, including road video data, matched radar data, latitude and longitude information, station numbers, and external information (geographic information system (GIS) information).
[0053] Information association module: associates the road video frames, radar data, latitude and longitude information, and station numbers with one another according to a preset unit (such as a timestamp).
[0054] Vision module: identifies traffic recognition areas, i.e., signs and markings, in road video frames. Specifically, it: 1. identifies the positions of signs and markings in the image; 2. identifies the elements (text, arrows, layout) in the signs and markings; 3. performs template matching and relationship analysis on the layout to understand the semantics of the signs and markings; 4. encodes the semantics of the signs and markings.
[0055] Correction output module: corrects the encoded information generated for the traffic identification area image. Specifically, it performs road network matching in a geographic information system (GIS) based on the encoded information to verify and correct the encoding result; corrects the encoding based on standards knowledge built from GB 5768.2, GB 5768.3, etc. and on the road network results; and generates sign layouts from the encoded information.
[0056] Specifically, acquiring the road video frames, wherein the road video frames include station numbers and latitude and longitude information, includes: acquiring road data, wherein the road data includes at least road video data, radar data, latitude and longitude information, and station numbers; associating, based on timestamps, the road video data, the radar data, the latitude and longitude information, and the station numbers to obtain associated road data; and determining the road video frames based on the associated road data.
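The timestamp-based association described above can be sketched as follows. This is a minimal illustration, not the method claimed in this application: it assumes each video frame is linked to the nearest per-second positioning fix, and the function names (`nearest_index`, `associate`) and the `max_gap` cutoff are hypothetical.

```python
from bisect import bisect_left


def nearest_index(sorted_times, t):
    """Return the index of the entry in sorted_times that is closest to t."""
    i = bisect_left(sorted_times, t)
    if i == 0:
        return 0
    if i == len(sorted_times):
        return len(sorted_times) - 1
    # Pick whichever neighbour is closer in time (ties go to the earlier one).
    return i if sorted_times[i] - t < t - sorted_times[i - 1] else i - 1


def associate(frame_times, gps_fixes, max_gap=1.0):
    """Attach the nearest positioning fix to each video frame timestamp.

    gps_fixes: list of (timestamp, lat, lon) tuples sorted by timestamp.
    Frames with no fix within max_gap seconds (assumed cutoff) get None.
    """
    gps_times = [fix[0] for fix in gps_fixes]
    linked = []
    for t in frame_times:
        j = nearest_index(gps_times, t)
        fix = gps_fixes[j] if abs(gps_times[j] - t) <= max_gap else None
        linked.append({"t": t, "fix": fix})
    return linked


if __name__ == "__main__":
    frames = [i / 30 for i in range(90)]  # 3 s of video at 30 frames per second
    fixes = [(0.0, 39.7469, 116.6556), (1.0, 39.7470, 116.6557), (2.0, 39.7471, 116.6558)]
    print(associate(frames, fixes)[45])
```

Radar point clouds and station numbers could be attached to each frame in the same way, keyed on the same timestamps.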
[0057] Optionally, the station number is determined based on latitude and longitude information, operating time, radar data, and the mileage station number identified through a station number recognition model.
[0058] Specifically, the acquisition module includes a road video acquisition submodule, a latitude and longitude information acquisition submodule, a station number information acquisition submodule, and an external information access submodule, wherein: Road video acquisition submodule: acquires road video data at 30 frames per second using an industrial camera or action camera mounted on the outside of the vehicle.
[0059] Latitude and longitude information acquisition submodule: uses an external antenna to acquire latitude and longitude coordinates and a timestamp once per second. Station number information acquisition submodule: at the start of and during acquisition, the station number can be entered on a handheld device to obtain the frame corresponding to that station number. Alternatively, a YOLOv11 model can be used to detect mileage markers in the acquired road video frames, the station number shown on the marker can be read through OCR, and the distance to the marker can be measured by radar to deduce the current station number.
[0060] External information access submodule: based on the latitude and longitude information, automatically matches road network information within 300 kilometers and key-node road network information within 1000 kilometers (for subsequent correction of directional signs).
[0061] Information association module: includes an information management submodule and a station number calculation submodule. The information association module associates video frames, radar data (point cloud data), latitude and longitude, and station numbers through timestamps, and matches the time range of the radar data.
[0062] Station number calculation submodule: based on the per-second latitude and longitude information, calculates the distance traveled between consecutive seconds. Using the recognized station numbers and the manually entered station numbers as anchor points, it extrapolates the station number, in meters, in both directions from each anchor.
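The station number calculation above can be sketched as follows. This is a minimal illustration under stated assumptions: the distance between consecutive fixes is approximated with the haversine formula on a spherical Earth, a single anchor is used, and the function names and the `K<km>+<m>` formatter are hypothetical helpers (the format itself follows the K121+496 style used in this description).

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stations_from_anchor(fixes, anchor_index, anchor_station_m, increasing=True):
    """Propagate a station value (metres) from one anchor fix to every fix.

    fixes: list of (lat, lon), one per second, in driving order.
    increasing: True if the station number grows in the driving direction.
    """
    cumulative = [0.0]  # distance along the track, measured from the first fix
    for (la1, lo1), (la2, lo2) in zip(fixes, fixes[1:]):
        cumulative.append(cumulative[-1] + haversine_m(la1, lo1, la2, lo2))
    sign = 1.0 if increasing else -1.0
    return [anchor_station_m + sign * (c - cumulative[anchor_index]) for c in cumulative]


def format_station(station_m):
    """Format a non-negative station value in metres as K<km>+<m>."""
    whole = int(round(station_m))
    return f"K{whole // 1000}+{whole % 1000:03d}"


if __name__ == "__main__":
    track = [(39.7460, 116.6556), (39.7462, 116.6556), (39.7464, 116.6556)]
    for s in stations_from_anchor(track, anchor_index=1, anchor_station_m=121496):
        print(format_station(s))
```

With several anchors, one option would be to extrapolate from the nearest anchor on each side and blend the two estimates; the description does not specify how multiple anchors are reconciled.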
[0063] The vision module consists of a sign and marking detection submodule, an OCR (optical character recognition) module, a tracking module, a ReID (re-identification) module, an image detection module, a layout matching module, a semantic prediction module, and an encoding module.
[0064] The sign and marking detection submodule uses YOLOv11 to detect and identify signs and markings, and to perceive the position of signs and markings, traffic types, etc. in the interval frames.
[0065] The OCR module is used to crop out directional signs and signs with text and numbers from the original image, and then use PaddleOCR to recognize the text and numbers within them.
[0066] The tracking module uses Multitarget-tracker together with the sign and marking detection submodule to track all signs and non-continuous markings across multiple consecutive video frames.
[0067] The ReID module re-identifies signs and non-continuous markings by combining the tracking results with the text content recognized by the OCR module, associates the same sign or marking across all frames, and assigns it a unique ID. At the same time, it combines the latitude and longitude and station number assigned to each frame by the information association module with the point cloud data synchronized by the acquisition module to deduce the specific latitude and longitude information, station number, and type of the sign or marking for each ID.
[0068] The image detection module crops the sign panels and the road text, symbols, and markings out of the image and uses a trained YOLOv11 model to identify the arrows, graphics, and symbols within the sign panels, as shown in Figure 3.
[0069] Optionally, determining the target template corresponding to the traffic recognition area image based on a pre-stored traffic template database includes: inputting the traffic identification area image into a pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic identification area image; calculating, based on a pre-stored template library, the similarity between each second feature vector in the template library and the first feature vector, wherein the pre-stored template library includes templates and the second feature vectors corresponding to the templates; and, if the similarity is less than a preset value, determining the template corresponding to the second feature vector that yields this similarity as the target template corresponding to the traffic recognition area image.
[0070] Specifically, the terminal device stores a traffic template database, which contains 60 sets of directional sign templates covering all layout formats of directional signs in GB 5768.2. Examples of the templates are shown in Figure 4.
[0071] The terminal device trains the feature extraction model following the model training flowchart shown in Figure 5. A lightweight triplet network is constructed to identify which standard template corresponds to a detected directional sign. The triplet network acts as an interference-resistant "feature fingerprint" generator: during training, a real-world sign image is used as the anchor, the corresponding standard template is the positive sample, and template images with different layout structures are negative samples. 128-dimensional features are extracted through a shared lightweight CNN backbone (such as EfficientNetB0), and the distances $d_{ap} = \lVert f_a - f_p \rVert_2$ and $d_{an} = \lVert f_a - f_n \rVert_2$ are computed, where $f_a$, $f_p$, and $f_n$ are the features of the anchor, the positive sample, and the negative sample, respectively. The triplet loss is then calculated, which forces the network to learn an embedding space in which features of the same template cluster tightly and features of different templates are clearly separated. As a result, the model automatically focuses on essential layout structure, such as arrow layout and text partitioning, while remaining insensitive to shooting interference.
The terminal device inputs the traffic recognition area image into the pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic recognition area image; then, based on a pre-stored template library, it calculates the similarity between each second feature vector in the template library and the first feature vector, wherein the pre-stored template library includes templates and the second feature vectors corresponding to the templates. That is, at deployment, the feature indexes of the standard template library are generated offline (accelerated with FAISS); after features are extracted from the real-world image in a single inference pass, cosine similarity matching completes within seconds, without geometric correction or OCR, and adding a new template only requires adding its features to the feature library.
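To make the training objective concrete, the following sketch computes the two distances and the standard hinge form of the triplet loss, max(d_ap - d_an + margin, 0), for a single triplet. The margin value is an assumption (the description does not state one), and the short vectors stand in for the 128-dimensional features.

```python
import math


def l2_distance(u, v):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(f_a, f_p, f_n, margin=0.2):
    """Hinge-form triplet loss for one (anchor, positive, negative) triplet.

    margin is an assumed hyperparameter; it is not given in the description.
    """
    d_ap = l2_distance(f_a, f_p)  # anchor to matching template
    d_an = l2_distance(f_a, f_n)  # anchor to a template with a different layout
    return max(d_ap - d_an + margin, 0.0)


if __name__ == "__main__":
    anchor = [0.9, 0.1, 0.0]
    positive = [1.0, 0.0, 0.0]
    negative = [0.0, 1.0, 0.0]
    print(triplet_loss(anchor, positive, negative))
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes different layouts apart in the embedding space.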
[0072] In some embodiments of this application, after obtaining the signs and markings in the road video frame, the signs and markings are matched with a pre-stored template. This can standardize various similar traffic recognition area images and obtain a template that can be automatically recognized by the machine.
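The template lookup can be illustrated with a brute-force cosine similarity search. This sketch makes several assumptions: it replaces the FAISS index mentioned above with a plain loop, the template IDs and the `min_similarity` threshold are hypothetical, and it treats a higher cosine similarity as a better match. The description states the acceptance test as the similarity being less than a preset value, which would fit if the score were a distance; the threshold direction below is therefore an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def match_template(query, library, min_similarity=0.8):
    """Return (template_id, similarity) for the best match.

    library: dict mapping template_id -> feature vector computed offline.
    Returns (None, best_similarity) when nothing reaches min_similarity.
    """
    best_id, best_sim = None, -1.0
    for template_id, vector in library.items():
        sim = cosine_similarity(query, vector)
        if sim > best_sim:
            best_id, best_sim = template_id, sim
    if best_sim < min_similarity:
        return None, best_sim
    return best_id, best_sim


if __name__ == "__main__":
    library = {"roundabout": [1.0, 0.0, 0.0], "t_junction": [0.0, 1.0, 0.0]}
    print(match_template([0.9, 0.1, 0.0], library))
```

Adding a new template in this scheme only means adding one more entry to `library`, which mirrors the point made above about supplementing the feature library.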
[0073] Optionally, the step of filling the target template with text data and symbol data from the traffic recognition area image to obtain the filled traffic sign includes: Obtain the configuration parameters of the target template, including slot name, slot address information and keyword mapping information; Map the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image; If the text location coordinates and symbol identification coordinates of the target template match the traffic recognition area image, the text data and the symbol identification data are filled into the corresponding slots in the target template according to the configuration parameters.
[0074] Some embodiments of this application match the positions of each slot in the obtained target template with the text and symbols in the traffic recognition area image, and then fill the text and symbols into the corresponding slots of the target template to facilitate subsequent unified encoding.
[0075] Optionally, filling the text data and the symbol identification data into the corresponding slots in the target template according to the configuration parameters includes: The text data and the symbol identification data are verified using preset verification rules to obtain text data and symbol identification data that pass the verification. The preset verification rules include format verification rules, logic verification rules and ambiguity resolution verification rules. By using a pre-defined traffic domain mapping library, the verified text data and symbol identification data are standardized to obtain standardized text data and symbol identification data. Based on the slot name, slot address information, and keyword mapping information in the configuration parameters, the standardized text data and symbol identification data are respectively filled into the corresponding slots in the target template.
[0076] In some embodiments of this application, standardized text data and symbol identification data are obtained by verifying and standardizing the text data and symbol identification data, and the standardized data is then filled into the corresponding slots of the target template, thereby improving the accuracy of the sign and marking encoding. Optionally, mapping the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image includes: mapping, according to the pre-set transformation matrix, the text position coordinates and symbol identification position coordinates of the target template to the traffic recognition area image to obtain the mapped target template; calculating the overlap region and the Euclidean distance between the mapped target template and the traffic recognition area image; and, if the overlapping area is smaller than a preset area value and the Euclidean distance is smaller than a preset distance, determining that the traffic recognition area image and the target template match.
[0077] Specifically, the semantic prediction module in this embodiment generates the final semantics by combining the matched layout with the OCR results. The overall process is shown in Figure 6 and specifically includes: 1. Loading predefined semantic rules from a JSON template-based rule library. Each template defines a structured schema, including slot names, normalized region coordinates, keyword mapping tables, and unit rules. The rules are loaded dynamically through a lightweight rule engine that supports hot updates: when a template for a local standard is added, only the configuration file needs to be supplemented.
[0078] 2. Coordinate region matching: based on the homography transformation matrix obtained in the template matching stage, the normalized region coordinates in the rules are mapped into the pixel domain of the real-world image; the IoU between the OCR text / icon bounding box and the target region and the Euclidean distance between their center points are calculated, and a tolerance threshold of ±15% is set to resist coordinate offsets caused by shooting distortion; for overlapping regions, a "nearest neighbor first" strategy is adopted. Nearest neighbor first means selecting the candidate at the smallest distance. In the embodiments of this application, when text or symbols are mapped to a slot on the template, the text or symbol closest to the slot is selected and filled into that slot, which avoids filling multiple texts into the same slot by mistake.
[0079] 3. Slot filling: the text content whose coordinates were matched successfully is filled into the predefined slots, and three checks are performed at the same time: ① format verification (using a regular expression to check that a distance is a number followed by a unit); ② logic verification (checking that the arrow direction is consistent with direction text such as "turn right"); ③ context verification (resolving ambiguity in ambiguous text data, such as "exit ahead"). For low-confidence slots, lightweight LLM inference is triggered to assist completion and ensure that the slots are completely filled.
[0080] 4. Unit standardization / keyword mapping: a knowledge-base text cleaning pipeline is integrated, which normalizes units with regular expressions and corrects typos, and a keyword mapping library for the transportation field is built. This library includes traffic types, names, priorities, and grades. For example, text misrecognized during text recognition is corrected, such as changing "Medical Anhui" to "Hospital". Each term is stored as follows: "Service area" is represented as {"type": "service_area", "priority": 2}; "Hospital" is represented as {"type": "hospital"}.
[0081] After the text data is cleaned, it is standardized into a unified format that machines can recognize, yielding machine-readable semantic tags. That is, the recognized text data, such as Chinese characters and numbers, is converted into English-format identifiers for easy machine recognition.
[0082] 5. Structured semantic output: following the schema-on-read / schema-on-write principle, the standardized data is output as standardized JSON; that is, the semantics output by the semantic prediction module are converted from JSON into the encoding according to the encoding rules of GB/T 30699.
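Steps 3 to 5 above (format and logic verification, keyword mapping, and structured JSON output) can be sketched as follows. This is only an illustration under stated assumptions: the regular expression, the function names, and the output field names are hypothetical; the two entries in the mapping table mirror the examples given in step 4; and the context verification and LLM-assisted completion are omitted.

```python
import json
import re

# Hypothetical keyword mapping library; the two entries mirror the examples in the text.
KEYWORD_MAP = {
    "Service area": {"type": "service_area", "priority": 2},
    "Hospital": {"type": "hospital"},
}

# Format check: a number followed by a unit (assumed units: km or m).
DISTANCE_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*(km|m)$", re.IGNORECASE)


def parse_distance_m(text):
    """Return the distance in metres, or None if the text is not number + unit."""
    match = DISTANCE_RE.match(text.strip())
    if not match:
        return None
    value, unit = float(match.group(1)), match.group(2).lower()
    return value * 1000 if unit == "km" else value


def direction_consistent(arrow_direction, text):
    """Logic check: a 'left'/'right' word in the text must agree with the arrow."""
    for word in ("left", "right"):
        if word in text.lower() and word != arrow_direction:
            return False
    return True


def fill_slots(destination, distance_text, arrow_direction, note=""):
    """Validate the raw OCR strings and return a normalized semantic record."""
    distance_m = parse_distance_m(distance_text)
    if distance_m is None:
        raise ValueError(f"distance is not number + unit: {distance_text!r}")
    if not direction_consistent(arrow_direction, note):
        raise ValueError("arrow direction contradicts the direction text")
    semantic = KEYWORD_MAP.get(destination, {"type": "place", "name": destination})
    return {"destination": semantic, "distance": distance_m, "direction": arrow_direction}


if __name__ == "__main__":
    record = fill_slots("Hospital", "2 km", "right", note="turn right")
    print(json.dumps(record, ensure_ascii=False))
```

The resulting JSON record is the kind of intermediate form that would then be converted into the GB/T 30699 encoding; that conversion is not shown here because the code table is not reproduced in this description.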
[0083] GB/T 30699 specifies the coding of signs and markings, defining how to express (encode) information such as the type, content, location, and performance of signs and markings. For example, the directional sign shown in Figure 7 expresses the following through its prescribed meaning: driving toward the "upper left" ahead (distance not shown) leads to "City C"; turning right ahead (distance not shown) leads to "City D"; "E102" and "City A" are linked together (regardless of direction or order); "F304" and "City B" are linked together (regardless of direction or order); turning left ahead (distance not shown) leads to "City A" and "E102"; driving toward the "upper right" ahead (distance not shown) leads to "City B" and "F304"; the roundabout contains four directions in sequence: left, upper left, upper right, and right (clockwise); ahead (distance and direction not shown) is a roundabout; the sign is located at chainage K121+496 (uphill); the sign is located at (116.65565900, 39.74691800); the sign is located on National Highway 201, uphill; the sign is located on the right-hand side of the road; the sign is a single-suspension sign; the sign is a rectangular sign with a blue background, white lettering, a white border, and a blue trim; the sign's retroreflection coefficient; and the sign's asset management ID and maintenance time. After the sign in Figure 7 is encoded, the encoded information shown in Figure 8 is generated.
[0084] After being encoded according to the coding requirements for traffic signs and markings in GB/T 30699, traffic signs and markings have a digital form of expression for navigation, autonomous driving, and intelligent vehicles. This can effectively support the development of fields related to intelligent driving and addresses the limitation that traffic signs and markings designed around human cognitive patterns have an information transmission mechanism with limited scene adaptability.
[0085] Some embodiments of this application map the target template onto the traffic recognition area image, effectively resisting coordinate offset caused by shooting distortion; and adopt a "nearest neighbor priority" strategy for overlapping areas to avoid multiple characters being mistakenly filled in the same slot.
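The mapping and "nearest neighbor priority" assignment summarized above can be sketched as follows. This is a minimal illustration, assuming the homography is given as a 3x3 nested list, boxes are axis-aligned `(x1, y1, x2, y2)` tuples, and the thresholds `max_dist` and `min_iou` are hypothetical parameters; the greedy per-slot assignment shown here is one simple way to realize nearest neighbor first, not necessarily the one used in this application.

```python
import math


def apply_homography(H, x, y):
    """Map a point through a 3x3 homography given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w


def map_box(H, box):
    """Map the four corners of a box and return their axis-aligned bounds."""
    x1, y1, x2, y2 = box
    pts = [apply_homography(H, x, y) for x, y in ((x1, y1), (x2, y1), (x2, y2), (x1, y2))]
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def center(box):
    return (box[0] + box[2]) / 2, (box[1] + box[3]) / 2


def assign_slots(slots, detections, H, max_dist, min_iou=0.3):
    """Fill each slot with the closest unused detection that overlaps it.

    slots: dict of slot name -> normalized template region.
    detections: list of (text, pixel_box) from OCR / symbol detection.
    """
    result, used = {}, set()
    for name, region in slots.items():
        mapped = map_box(H, region)
        cx, cy = center(mapped)
        best, best_d = None, max_dist
        for i, (_, box) in enumerate(detections):
            if i in used or iou(mapped, box) < min_iou:
                continue
            dx, dy = center(box)
            d = math.hypot(dx - cx, dy - cy)
            if d <= best_d:
                best, best_d = i, d
        if best is not None:
            used.add(best)  # one detection fills at most one slot
            result[name] = detections[best][0]
    return result
```

For example, with a homography that scales normalized coordinates to a 200 x 100 pixel crop, two slots, and two OCR boxes, each slot receives only the text whose box overlaps it and lies closest to its center.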
[0086] Optionally, the method further includes: after the encoded information of the traffic identification area image corresponding to the station number and the latitude and longitude information is obtained, verifying the traffic identification area image in the preset road network database based on its latitude and longitude information and semantic information; and, if the latitude and longitude information and semantic information of the traffic identification area image do not match the information in the preset road network database, correcting the traffic identification area image.
[0087] Specifically, the terminal device also includes a correction output module for correcting the encoded information of the generated traffic recognition area image. The correction output module includes a road network matching module, a sign knowledge module, a correction module, an encoding module, and a layout generation module, wherein: The road network matching module is used to retrieve geographic information related to directional signs from the GIS based on the type and location of those signs. It uses the GPS coordinates and sign type as joint query conditions and performs spatial and semantic dual filtering in the GIS road network database: a buffer zone is defined centered on the sign coordinates, filtering geographic features within the buffer that match the sign type, and returning the topological relationships of the features.
[0088] For example, the buffer radius is 50 km for highway scenes and 5 km for urban roads, and major place names can be drawn from the topology within 500 km, depending on the sign type. Suppose an exit warning sign (type: exit_preview) is detected at K88+200 on an expressway (coordinates: 30.2561°N, 120.1892°E).
[0089] The terminal device returns the following content from the GIS: exit name: exit to "Hangzhou West"; exit code: G60-12; connecting road: XXXXX; topology: exit XX is 480 meters from the current sign, the road that the ramp leads to, etc. This is in effect a road network topology graph: based on the interpreted semantics of the sign, the related semantic geographic information is obtained from the GIS.
[0090] The structured results output by the semantic parsing module are used as the query input, for example {"destination":"West Lake","distance":3000,"direction":"left"}. The terminal device can also use geocoding for correction, calling the stored map API or a local POI library to standardize "a certain lake" as a unique geographic entity and eliminate ambiguity such as "a certain lake district" or "a certain lake street". The terminal device calculates the azimuth and straight-line distance between the entity coordinates and the sign location and verifies whether they conform to the semantic direction and distance (allowing ±15% error), i.e., it performs spatial verification. Then, combined with the road network topology, it supplements the "recommended path node sequence from the current point to a certain lake" to achieve path enhancement. The terminal device is also equipped with a sign knowledge module, which is a reasoning engine based on traffic engineering specifications and road network topology. Its core is to structure industry standards such as "GB 5768.2-2022 Road Traffic Signs and Markings Part 2: Road Traffic Signs" and "JTG D81-2017 Design Specifications for Highway Traffic Safety Facilities" into a computable knowledge graph. This module not only stores static specifications such as the placement conditions, information selection principles, layout combination rules, and spacing requirements of the various directional signs, but also integrates dynamic reasoning capabilities.
[0091] The terminal device inputs the road network location of the sign, the sign type (e.g., exit warning sign, place direction sign, etc.), the road level (highway / expressway / arterial road), and the topological relationships of adjacent nodes (upstream / downstream entrances and exits, intersections, POI distribution). The rule engine (a custom AST parser) performs multi-layered reasoning. It first determines compliance: whether the current location allows the placement of this type of sign (e.g., highway exit warning signs should be placed at three levels: 2 km, 1 km, and 500 m from the exit). Based on the principles of "priority by importance, distance from nearest to farthest, and complementary functions," it automatically filters from the road network the destinations to be displayed. It then verifies content integrity, checking for missing mandatory information (e.g., a highway exit sign must include the exit number and the connecting road name). After processing, it generates the "required semantic content" that conforms to national standards, which is compared with the actual recognition results to support sign compliance assessment, high-precision map updates, or anomaly detection. The generated encoded information is then corrected: the directional sign semantics are revised based on the semantic results output by the sign knowledge module in combination with the original semantics, and the revised sign semantics are converted from JSON to the standard encoding according to GB/T 30699. The layout generation module is a vector-level automatic directional sign drawing engine that conforms to mandatory standards. Its core function is to reverse-interpret the structured semantic encoding according to GB/T 30699-2025.
That is, it parses the semantics of the sign from the obtained encoded information, combines them with the standard template for directional signs, and follows the technical requirements for layout, font, color, size, etc., specified in GB 5768.2-2022 to generate the directional sign layout.
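Two of the checks described for the sign knowledge module, the three-level placement of exit warning signs and the mandatory content of a highway exit sign, can be sketched as simple rules. The distances and mandatory items come from the examples above; the field names, the function names, and the 50 m placement tolerance are assumptions made for illustration, and the sketch does not reproduce the rule engine or knowledge graph itself.

```python
# Advance-warning distances in metres, taken from the example in the text.
EXPECTED_ADVANCE_M = (2000, 1000, 500)

# Mandatory items for a highway exit sign per the example in the text;
# the field names themselves are hypothetical.
REQUIRED_EXIT_FIELDS = ("exit_number", "connecting_road")


def missing_warning_levels(distances_to_exit_m, tolerance_m=50):
    """Return the expected warning distances that have no sign nearby.

    distances_to_exit_m: measured distances from each detected warning
    sign to the exit. tolerance_m is an assumed placement tolerance.
    """
    return [
        expected
        for expected in EXPECTED_ADVANCE_M
        if not any(abs(d - expected) <= tolerance_m for d in distances_to_exit_m)
    ]


def missing_fields(semantic):
    """Return the mandatory fields that are absent or empty in a sign record."""
    return [field for field in REQUIRED_EXIT_FIELDS if not semantic.get(field)]


if __name__ == "__main__":
    print(missing_warning_levels([1990, 480]))        # the 1 km level is missing
    print(missing_fields({"exit_number": "G60-12"}))  # connecting road is missing
```

Each non-empty result would flag the sign, or the sign sequence, for correction or for a compliance report.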
[0092] Some embodiments of this application combine road network information to verify, correct, and improve the coded signs, thereby improving the accuracy of the generated signs and markings.
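The spatial verification step (comparing the azimuth and straight-line distance to the geocoded entity against the sign's semantics, with ±15% error allowed on distance) can be sketched as follows. The assumptions are: a spherical Earth, a known vehicle heading at the sign, a three-way left / straight / right classification with a hypothetical 30 degree "straight" cone, and illustrative function names.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlmb = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2, degrees clockwise from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def relative_direction(heading_deg, target_bearing_deg, straight_cone_deg=30.0):
    """Classify the target as left, straight, or right of the travel heading."""
    diff = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0  # [-180, 180)
    if abs(diff) <= straight_cone_deg:
        return "straight"
    return "right" if diff > 0 else "left"


def spatially_consistent(sign, heading_deg, entity, semantic, tolerance=0.15):
    """Check semantic distance (within tolerance) and direction against geometry.

    sign, entity: (lat, lon); semantic: dict with 'distance' (m) and 'direction'.
    """
    dist = haversine_m(sign[0], sign[1], entity[0], entity[1])
    ok_dist = abs(dist - semantic["distance"]) <= tolerance * semantic["distance"]
    direction = relative_direction(heading_deg, bearing_deg(*sign, *entity))
    return ok_dist and direction == semantic["direction"]


if __name__ == "__main__":
    semantic = {"destination": "West Lake", "distance": 3000, "direction": "left"}
    # Illustrative coordinates only: an entity about 3 km due west of a sign
    # passed by a vehicle heading north.
    print(spatially_consistent((30.0, 120.0), 0.0, (30.0, 119.969), semantic))
```

A failed check would mark the semantic record as a candidate for correction against the road network, as described above.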
[0093] It should be noted that each of the implementable methods in this embodiment can be implemented individually or in any combination without conflict. This application does not limit this.
[0094] Another embodiment of this application provides a sign and marking encoding device for performing the sign and marking encoding method provided in the above embodiments.
[0095] Figure 9 is a structural schematic of the sign and marking encoding device provided in an embodiment of this application. As shown in Figure 9, the sign and marking encoding device includes an acquisition module 1001, a recognition module 1002, a matching module 1003, and a generation module 1004, wherein: the acquisition module 1001 is used to acquire road video frames, wherein the road video frames include station numbers and latitude and longitude information; the recognition module 1002 is used to recognize the road video frame using a pre-trained recognition model to determine the traffic recognition area image in the road video frame, wherein the recognition model is obtained by training a neural network model using sample data, and the traffic recognition area image includes text data and symbol identification data; the matching module 1003 is used to determine the target template corresponding to the traffic recognition area image based on a pre-stored traffic template database; and the generation module 1004 is used to fill the text data and symbol identification data in the traffic identification area image into the target template to obtain the filled traffic sign, and to encode the filled traffic sign according to the preset traffic rules and preset encoding method to obtain the encoded information of the traffic identification area image corresponding to the station number and the latitude and longitude information.
[0096] Regarding the apparatus in this embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0097] Some embodiments of this application identify traffic identification area images in collected road video frames, specifically identifying signs and markings in the road video frames to determine the location information and traffic type of the signs and markings. Then, the traffic identification area images are matched with a traffic template database, and the text data and symbol data from the traffic identification area images are filled into a target template. According to preset traffic rules and preset encoding methods, the filled traffic signs are encoded to obtain the encoded information of the traffic identification area images corresponding to the station number and the latitude and longitude information. In this way, various types of traffic signs and markings can be standardized to generate corresponding encoded information, which can be applied to various machine recognition methods.
[0098] Another embodiment of this application further supplements the description of the sign and marking encoding device provided in the above embodiments.
[0099] Optionally, the matching module is used to: input the traffic identification area image into a pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic identification area image; calculate, based on a pre-stored template library, the similarity between each second feature vector in the template library and the first feature vector, wherein the pre-stored template library includes templates and the second feature vectors corresponding to the templates; and, if the similarity is less than a preset value, determine the template corresponding to the second feature vector that yields this similarity as the target template corresponding to the traffic recognition area image.
[0100] In some embodiments of this application, after obtaining the signs and markings in the road video frame, the signs and markings are matched with a pre-stored template. This can standardize various similar traffic recognition area images and obtain a template that can be automatically recognized by the machine.
[0101] Optionally, the matching module is used to: Obtain the configuration parameters of the target template, including slot name, slot address information and keyword mapping information; Map the text location coordinates and symbol identification location coordinates of the target template to the traffic recognition area image; If the text location coordinates and symbol identification coordinates of the target template match the traffic recognition area image, the text data and the symbol identification data are filled into the corresponding slots in the target template according to the configuration parameters.
[0102] Some embodiments of this application match the positions of each slot in the obtained target template with the text and symbols in the traffic recognition area image, and then fill the text and symbols into the corresponding slots of the target template to facilitate subsequent unified encoding.
[0103] Optionally, the matching module is used to: The text data and the symbol identification data are verified using preset verification rules to obtain text data and symbol identification data that pass the verification. The preset verification rules include format verification rules, logic verification rules and ambiguity resolution verification rules. By using a pre-defined traffic domain mapping library, the verified text data and symbol identification data are standardized to obtain standardized text data and symbol identification data. Based on the slot name, slot address information, and keyword mapping information in the configuration parameters, the standardized text data and symbol identification data are respectively filled into the corresponding slots in the target template.
[0104] In some embodiments of this application, standardized text data and symbol identification data are obtained by verifying and standardizing the text data and symbol identification data, and the standardized data is then filled into the corresponding slots of the target template, thereby improving the accuracy of the sign and marking encoding. Optionally, the matching module is used to: map, according to the pre-set transformation matrix, the text position coordinates and symbol identification position coordinates of the target template to the traffic recognition area image to obtain the mapped target template; calculate the overlap region and the Euclidean distance between the mapped target template and the traffic recognition area image; and, if the overlapping area is smaller than a preset area value and the Euclidean distance is smaller than a preset distance, determine that the traffic recognition area image and the target template match.
[0105] Some embodiments of this application map the target template onto the traffic recognition area image, effectively resisting coordinate offset caused by shooting distortion; and adopt a "nearest neighbor priority" strategy for overlapping areas to avoid multiple characters being mistakenly filled in the same slot.
[0106] Optionally, the generation module is used to: after the encoded information of the traffic identification area image corresponding to the station number and the latitude and longitude information is obtained, verify the traffic identification area image in the preset road network database based on its latitude and longitude information and semantic information; and, if the latitude and longitude information and semantic information of the traffic identification area image do not match the information in the preset road network database, correct the traffic identification area image.
[0107] Some embodiments of this application combine road network information to verify, correct, and improve the coded signs, thereby improving the accuracy of the generated signs and markings.
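The road network check could be sketched as follows. This is a hypothetical illustration: the record layout (`lat`, `lon`, `semantic`), the 30 m search radius, and the returned status strings are assumptions, and how a mismatched sign is actually corrected is not specified in this application, so the sketch only reports the reference record that a correction step could use.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def check_against_road_network(observation, road_network, radius_m=30.0):
    """Compare an encoded sign with the nearest road network record.

    Returns (status, record), where status is one of the assumed labels
    "ok", "semantic_mismatch", or "no_nearby_record".
    """
    nearest, nearest_d = None, None
    for record in road_network:
        d = haversine_m(observation["lat"], observation["lon"],
                        record["lat"], record["lon"])
        if d <= radius_m and (nearest_d is None or d < nearest_d):
            nearest, nearest_d = record, d
    if nearest is None:
        return "no_nearby_record", None
    if nearest["semantic"] != observation["semantic"]:
        return "semantic_mismatch", nearest
    return "ok", nearest
```

A "semantic_mismatch" or "no_nearby_record" result would mark the encoded sign as a candidate for correction.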
[0108] Optionally, the acquisition module is used to: Acquire road data, wherein the road data includes at least road video data, radar data, latitude and longitude information, and station numbers; Based on the timestamp, the road video data, the radar data, the latitude and longitude information, and the station number are associated to obtain the associated road data; The road video frame is determined based on the associated road data.
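The timestamp-based association could be sketched as follows. This is a minimal sketch, assuming each data stream is a list of `(timestamp, value)` pairs sorted by timestamp and that samples are paired with a video frame only when they fall within a tolerance window. The 0.05 s tolerance, the field names, and the station number format in the usage note are hypothetical.

```python
from bisect import bisect_left

def nearest_sample(samples, t, tolerance):
    """Return the value whose timestamp is closest to t, or None if too far.

    samples: list of (timestamp, value) pairs sorted by timestamp.
    """
    times = [s[0] for s in samples]
    i = bisect_left(times, t)
    # Only the neighbors on either side of the insertion point can be nearest.
    candidates = samples[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    best = min(candidates, key=lambda s: abs(s[0] - t))
    return best[1] if abs(best[0] - t) <= tolerance else None

def associate(frames, radar, positions, stations, tolerance=0.05):
    """Attach radar, latitude/longitude, and station number to each video frame."""
    associated = []
    for t, frame in frames:
        associated.append({
            "timestamp": t,
            "frame": frame,
            "radar": nearest_sample(radar, t, tolerance),
            "lat_lon": nearest_sample(positions, t, tolerance),
            "station": nearest_sample(stations, t, tolerance),
        })
    return associated
```

For example, a frame at 0.10 s is paired with a radar sample at 0.11 s, while a station number last recorded at 0.00 s (such as a hypothetical "K12+300") falls outside the window and is left as `None` for that frame.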
[0109] Optionally, the station number is determined based on latitude and longitude information, operating time, radar data, and the mileage station number identified through a station number recognition model.
[0110] Regarding the apparatus in this embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments related to the method, and will not be elaborated upon here.
[0111] It should be noted that each of the implementable methods in this embodiment can be implemented individually or in any combination without conflict. This application does not limit this.
[0112] This application also provides a computer-readable storage medium storing a computer program. When the program is executed by a processor, it can implement the operations of any of the sign and marking encoding methods described in the above embodiments.
[0113] This application also provides a computer program product, which includes a computer program. When the computer program is executed by a processor, it can implement the operations of any of the sign and marking encoding methods described in the above embodiments.
[0114] As shown in Figure 10, some embodiments of this application provide an electronic device 1100, which includes: a memory 1110, a processor 1120, and a computer program stored in the memory 1110 and executable on the processor 1120. When the processor 1120 reads the program from the memory 1110 via a bus 1130 and executes the program, it can implement any of the sign and marking encoding methods described above.
[0115] Processor 1120 can process digital signals and may include various computing architectures. For example, it may be a complex instruction set computer architecture, a reduced instruction set computer architecture, or an architecture that implements multiple instruction set combinations. In some examples, processor 1120 may be a microprocessor.
[0116] Memory 1110 can be used to store instructions executed by processor 1120 or data related to the execution of those instructions. These instructions and/or data may include code for implementing some or all of the functions of one or more modules described in the embodiments of this application. In the embodiments of this application, processor 1120 can be used to execute the instructions in memory 1110 to implement the methods described above. Memory 1110 may include dynamic random access memory, static random access memory, flash memory, optical memory, or other memories well known to those skilled in the art.
[0117] The above are merely embodiments of this application and are not intended to limit the scope of protection of this application. Various modifications and variations can be made to this application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc., made within the spirit and principles of this application should be included within the scope of protection of this application. It should be noted that similar reference numerals and letters in the following figures indicate similar items; therefore, once an item is defined in one figure, it does not need to be further defined and explained in subsequent figures.
[0118] The above description is merely a specific embodiment of this application, but the scope of protection of this application is not limited thereto. Any variations or substitutions that can be easily conceived by those skilled in the art within the scope of the technology disclosed in this application should be included within the scope of protection of this application. Therefore, the scope of protection of this application should be determined by the scope of the claims.
[0119] It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Furthermore, the terms "comprising," "including," or any other variations thereof are intended to cover non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitations, an element defined by the phrase "comprising one..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that includes said element.
Claims
1. A method for encoding signs and markings, characterized in that the method includes: acquiring road video frames, wherein the road video frames include station numbers and latitude and longitude information; using a pre-trained recognition model to identify the road video frame and determine the traffic identification area image in the road video frame and the traffic type corresponding to the traffic identification area image, wherein the recognition model is obtained by training a neural network model with sample data, and the recognition model is used to identify the location and traffic type of the traffic identification area image in the road video frame, wherein the traffic identification area image includes text data and symbol identification data; determining, based on a pre-stored traffic template database, a target template corresponding to the traffic identification area image; filling the text data and symbol identification data in the traffic identification area image into the target template to obtain the filled traffic sign; and encoding the filled traffic sign according to preset traffic rules and a preset encoding method to obtain the encoded information of the traffic identification area image corresponding to the station number and the latitude and longitude information.
2. The encoding method for signs and markings according to claim 1, characterized in that, The step of determining the target template corresponding to the traffic recognition area image based on a pre-stored traffic template database includes: The traffic identification area image is input into a pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic identification area image; Based on a pre-stored template library, the similarity between the second feature vector in the template library and the first feature vector is calculated respectively; wherein, the pre-stored template library includes templates and second feature vectors corresponding to the templates; If the similarity is less than a preset value, the template corresponding to the second feature vector corresponding to the similarity will be used to determine the target template corresponding to the traffic recognition area image.
3. The encoding method for signs and markings according to claim 1, characterized in that the step of filling the target template with the text data and symbol identification data from the traffic recognition area image to obtain the filled traffic sign includes: obtaining the configuration parameters of the target template, including slot name, slot address information and keyword mapping information; mapping the text position coordinates and symbol identification position coordinates of the target template to the traffic recognition area image; and, if the text position coordinates and symbol identification position coordinates of the target template match the traffic recognition area image, filling the text data and the symbol identification data into the corresponding slots in the target template according to the configuration parameters.
4. The encoding method for signs and markings according to claim 3, characterized in that, The step of filling the text data and the symbol identification data into the corresponding slots in the target template according to the configuration parameters includes: The text data and the symbol identification data are verified using preset verification rules to obtain text data and symbol identification data that pass the verification. The preset verification rules include format verification rules, logic verification rules and ambiguity resolution verification rules. By using a pre-defined traffic domain mapping library, the verified text data and symbol identification data are standardized to obtain standardized text data and symbol identification data. Based on the slot name, slot address information, and keyword mapping information in the configuration parameters, the standardized text data and symbol identification data are respectively filled into the corresponding slots in the target template.
5. The encoding method for signs and markings according to claim 3, characterized in that the step of mapping the text position coordinates and symbol identification position coordinates of the target template to the traffic recognition area image includes: mapping, according to a preset transformation matrix, the text position coordinates and symbol identification position coordinates of the target template to the traffic recognition area image to obtain the mapped target template; calculating the overlap region and the Euclidean distance between the mapped target template and the traffic recognition area image; and, if the overlapping area is smaller than a preset area value and the Euclidean distance is smaller than a preset distance, determining that the traffic recognition area image and the target template match.
6. The encoding method for signs and markings according to claim 1, characterized in that, The method further includes: Having obtained the encoding information of the traffic identification area image corresponding to the station number and the latitude and longitude information, the image is verified in a preset road network database based on the latitude and longitude information and semantic information of the traffic identification area image. If the latitude, longitude, and semantic information of the traffic identification area image do not match the information in the preset road network database, the traffic identification area image will be corrected.
7. The encoding method for signs and markings according to claim 1, characterized in that, The acquisition of road video frames, wherein the road video frames include station numbers and latitude and longitude information, includes: Acquire road data, wherein the road data includes at least road video data, radar data, latitude and longitude information, and station numbers; Based on the timestamp, the road video data, the radar data, the latitude and longitude information, and the station number are associated to obtain the associated road data; The road video frame is determined based on the associated road data.
8. The encoding method for signs and markings according to claim 7, characterized in that the station number is determined based on latitude and longitude information, operating time, radar data, and the mileage station number identified through a station number recognition model.
9. A coding device for signs and markings, characterized in that, The device includes: An acquisition module is used to acquire road video frames, wherein the road video frames include station numbers and latitude and longitude information; The recognition module is used to identify the road video frame using a pre-trained recognition model to determine the traffic recognition region image in the road video frame. The recognition model is obtained by training a neural network model using sample data. The traffic recognition region image includes text data and symbol identification data. The matching module is used to determine the target template corresponding to the traffic recognition area image based on a pre-stored traffic template database; The generation module is used to fill the target template with text data and symbol identification data from the traffic identification area image to obtain the filled traffic sign, and to encode the filled traffic sign according to preset traffic rules and preset encoding method to obtain the encoding information of the traffic identification area image corresponding to the station number and the latitude and longitude information.
10. The coding device for signs and markings according to claim 9, characterized in that the matching module is used for: inputting the traffic identification area image into a pre-trained feature extraction model to obtain a first feature vector corresponding to the traffic identification area image; calculating, based on a pre-stored template library, the similarity between each second feature vector in the template library and the first feature vector, wherein the pre-stored template library includes templates and second feature vectors corresponding to the templates; and, if the similarity is less than a preset value, determining the template corresponding to the second feature vector with that similarity as the target template corresponding to the traffic recognition area image.
11. An electronic device, characterized in that it includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the program, it can implement the encoding method for signs and markings of any one of claims 1-8.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and when the program is executed by a processor, it can implement the encoding method for signs and markings as described in any one of claims 1-8.