Road surface character recognition processing method, device and medium

By separating road surface point clouds from laser point cloud data and converting them into two-dimensional BEV images, and combining image semantic segmentation and text recognition networks, the accuracy and efficiency issues of road surface text recognition in high-precision maps are solved, achieving efficient and automated text extraction and map production.

CN116229446B · Active · Publication Date: 2026-06-23 · WUHAN NAVINFO TECH CO LTD

Patent Information

Authority / Receiving Office: CN · China
Patent Type: Patents(China)
Current Assignee / Owner: WUHAN NAVINFO TECH CO LTD
Filing Date: 2023-03-10
Publication Date: 2026-06-23

Abstract

The application provides a road surface character recognition processing method, device and medium. The method comprises: obtaining the laser point cloud of the road surface to be identified, collected by a map collection vehicle; separating the road surface point cloud from the laser point cloud data according to a road surface separation algorithm, and converting the road surface point cloud into a two-dimensional intensity top-view BEV image; using an image semantic segmentation neural network to process the BEV image to obtain semantic information containing road surface signs, and separating the pixels containing characters from that semantic information; and, according to the pixels containing characters and using a text recognition network, obtaining character detection frames containing road surface character content and saving them to a map database. The method recognizes road surface characters with high accuracy and efficiency, thereby improving the accuracy and efficiency of producing high-precision maps that carry road surface printed characters.

Description

Technical Field

[0001] This invention relates to the field of high-precision map technology, and in particular to a method, apparatus and medium for road surface text recognition.

Background Technology

[0002] High-definition maps (HD maps) are of great significance in the field of autonomous driving. Autonomous vehicles can use HD maps to understand the road conditions and surrounding environment, enabling them to make appropriate driving decisions. Given the auxiliary role of HD maps in autonomous driving, the accurate production of HD maps has become a hot research topic for those skilled in the art, especially the production of HD maps containing printed text related to road regulations.

[0003] In existing technologies, the production of high-precision maps with printed road text first requires acquiring road images with printed road text, using a target detection model to perform target detection processing on the road images to identify the text box pixels of the printed road text, and then performing image depth transformation processing on the coordinates corresponding to the text box pixels to obtain the text recognition results; then the road recognition results are marked on the corresponding positions of the high-precision map to complete the production of the high-precision map with printed road text.

[0004] However, existing technologies for acquiring road surface images are easily affected by weather factors and obstructions, which may result in unclear road surface images in the initial acquisition. Based on these unclear images, the road surface text recognition results obtained by using depth image conversion processing methods are not accurate enough. Furthermore, the positional deviation of depth image processing increases the workload of manual correction, thereby reducing the processing efficiency of text recognition and thus affecting the efficiency of high-precision map production.

Summary of the Invention

[0005] This invention provides a method, apparatus, and medium for recognizing road surface text, in order to solve the problems of low accuracy and efficiency in the production of high-precision maps carrying printed road surface text in the prior art.

[0006] A first aspect of the present invention provides a method for recognizing and processing road surface text, comprising:

[0007] Obtain the laser point cloud of the road surface to be identified, collected by the map data collection vehicle;

[0008] According to the road surface separation algorithm, the road surface point cloud is separated from the laser point cloud data and converted into a two-dimensional intensity top-view BEV image.

[0009] An image semantic segmentation neural network is used to perform recognition processing on the BEV image to obtain semantic information containing road markings, and to separate and obtain pixels containing text from the semantic information containing road markings.

[0010] Based on the pixels containing text, and using a text recognition network, text detection boxes containing road surface text are obtained and saved to the map database.

[0011] In one optional implementation, the step of separating and obtaining the road surface point cloud from the laser point cloud according to the road surface separation algorithm includes:

[0012] The coordinate information of each point in the laser point cloud is traversed to obtain the maximum X value, the maximum Y value, the maximum Z value, the minimum X value, the minimum Y value, and the minimum Z value. The maximum X value, the maximum Y value, the maximum Z value, the minimum X value, the minimum Y value, and the minimum Z value are used as the boundary of the laser point cloud.

[0013] Based on the boundaries of the laser point cloud, determine the mesh division parameters, and based on the mesh division parameters, divide the laser point cloud to construct a point cloud mesh;

[0014] Obtain the GPS trajectory of the map collection vehicle, and based on the GPS trajectory, obtain the height range of the collection vehicle;

[0015] Using the GPS trajectory as seed points, effective grids are obtained from the point cloud grid based on the maximum height value in the height range of the map acquisition vehicle, a preset height threshold range, and a preset density threshold, wherein the point cloud density in each effective grid exceeds the preset density threshold.

[0016] The point set composed of the points of the effective grid is taken as the road surface point cloud.

[0017] In one optional embodiment, converting the road surface point cloud into a two-dimensional intensity top-view BEV image includes:

[0018] The GPS trajectory is divided according to a preset distance to obtain multiple trajectory segments, wherein each trajectory segment includes multiple GPS points.

[0019] Traverse each trajectory segment to verify whether the multiple GPS points in the trajectory segment actually fall within the trajectory segment;

[0020] For each trajectory segment after verification:

[0021] Using the central GPS point in the trajectory segment as the center, the maximum and minimum recording times among the recording times of all GPS points in the trajectory segment are used as the time interval. The road surface point cloud is traversed to obtain the first road surface point cloud within the time interval.

[0022] Based on the distance from the first road surface point cloud to each GPS point in the trajectory segment, a second road surface point cloud that meets a preset range is selected, and the second road surface point cloud is mapped to a preset two-dimensional pixel coordinate system to record the position index of the second road surface point cloud and the pixel coordinate system.

[0023] The BEV image is generated from the second road surface point cloud based on the location index.
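The mapping in [0022] and [0023] can be sketched as a simple rasterization. This is only an illustrative sketch under stated assumptions: the function name `rasterize_bev`, the square window around the segment center, the `resolution` parameter, and the choice of keeping the maximum intensity per pixel are all hypothetical and are not specified in the source.

```python
import math

def rasterize_bev(points, center_xy, half_extent, resolution):
    """Project (x, y, z, intensity) points onto a top-view intensity image.

    Returns the image (list of rows) and a position index that maps each
    pixel (row, col) to the indices of the points that fell into it.
    All parameters are illustrative assumptions, not values from the source.
    """
    size = int(round(2 * half_extent / resolution))
    image = [[0] * size for _ in range(size)]
    index = {}
    cx, cy = center_xy
    for i, (x, y, _z, intensity) in enumerate(points):
        col = math.floor((x - cx + half_extent) / resolution)
        row = math.floor((cy + half_extent - y) / resolution)  # rows grow downward
        if 0 <= row < size and 0 <= col < size:
            index.setdefault((row, col), []).append(i)
            # Assumption: keep the strongest return when several points share a pixel.
            image[row][col] = max(image[row][col], int(intensity))
    return image, index

points = [
    (-1.5, 1.5, 0.0, 200),
    (0.5, -0.5, 0.0, 90),
    (0.6, -0.4, 0.0, 120),
    (5.0, 0.0, 0.0, 50),   # outside the window, ignored
]
image, index = rasterize_bev(points, center_xy=(0.0, 0.0), half_extent=2.0, resolution=1.0)
```

The position index makes it possible to go back from a recognized pixel to the original three-dimensional points, which is what allows a detection in the image to be placed on the map.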

[0024] In one optional implementation, traversing each trajectory segment to verify whether the plurality of GPS points in the trajectory segment actually fall within the trajectory segment includes:

[0025] For the k-th trajectory segment, based on the preset distance D, the formula is used:

[0026] centerDis_k = (k * (1 - overlay) + 0.5) * D

[0027] Obtain the first distance, centerDis_k, from the center GPS point in the k-th trajectory segment to the starting GPS point of the GPS trajectory, and iterate through the other GPS points in the k-th trajectory segment to check whether their second distances to the center GPS point are within the range of centerDis_k / 2, thereby determining whether the other GPS points fall within the k-th trajectory segment;

[0028] Here, overlay represents the repetition rate of GPS trajectory lengths between adjacent trajectory segments.
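As a worked illustration of the formula in [0026], the following sketch computes the center distance for the first few trajectory segments. The function name and the sample values D = 50 m and overlay = 0.2 are assumptions chosen only for the example; the sketch covers the formula alone and does not implement the membership check.

```python
def segment_center_distance(k, segment_length, overlay):
    """Distance along the trajectory from its start to the center of segment k.

    Implements centerDis_k = (k * (1 - overlay) + 0.5) * D, where
    segment_length is D and overlay is the overlap rate between
    adjacent segments (0 <= overlay < 1).
    """
    return (k * (1.0 - overlay) + 0.5) * segment_length

# Hypothetical values: 50 m segments with 20 % overlap.
centers = [segment_center_distance(k, 50.0, 0.2) for k in range(3)]
```

With these values the segment centers lie 25 m, 65 m and 105 m along the trajectory, so consecutive centers are D * (1 - overlay) = 40 m apart and neighboring segments share 10 m of trajectory.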

[0029] In one optional implementation, the step of obtaining a text detection box containing road surface text content based on the text-containing pixels and using a text recognition network includes:

[0030] For each text pixel, the associated text pixels are obtained using the connected component search method. The text pixels and their associated text pixels are then clustered to obtain text string pixels.

[0031] Based on the pixel coordinates of the text string pixels, obtain the corner points of the minimum bounding box of the text string pixels, and based on the corner points, obtain the text string atomic image corresponding to the text string pixels and the black and white text image corresponding to the text string atomic image.

[0032] A text recognition network is used to perform text recognition processing on the black and white text image to obtain text detection boxes containing road surface text content.
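The clustering and bounding box steps in [0030] and [0031] can be sketched as follows. This is a minimal sketch, assuming 8-connectivity and an axis-aligned bounding box; the source does not state the connectivity or whether the minimum bounding box may be rotated, and the function names are hypothetical.

```python
from collections import deque

def cluster_text_pixels(pixels):
    """Group (row, col) text pixels into 8-connected components."""
    remaining = set(pixels)
    clusters = []
    while remaining:
        seed = remaining.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbor = (r + dr, c + dc)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        cluster.append(neighbor)
                        queue.append(neighbor)
        clusters.append(sorted(cluster))
    return sorted(clusters)

def bounding_box_corners(cluster):
    """Corners of the axis-aligned box around a cluster, clockwise from top-left."""
    rows = [r for r, _ in cluster]
    cols = [c for _, c in cluster]
    top, bottom, left, right = min(rows), max(rows), min(cols), max(cols)
    return [(top, left), (top, right), (bottom, right), (bottom, left)]

pixels = [(0, 0), (0, 1), (1, 1), (5, 5), (5, 6)]
clusters = cluster_text_pixels(pixels)
corners = bounding_box_corners(clusters[0])
```

The corners delimit the region that would be cropped from the BEV image to form the text string atomic image before binarization and recognition.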

[0033] A second aspect of the present invention provides a processing apparatus for road surface text recognition, comprising:

[0034] The acquisition module is used to acquire the laser point cloud of the road surface to be identified, collected by the map acquisition vehicle;

[0035] The processing module is used to separate and obtain the road surface point cloud from the laser point cloud data according to the road surface separation algorithm, and convert the road surface point cloud into a two-dimensional intensity top-view BEV image.

[0036] The acquisition module is further configured to use an image semantic segmentation neural network to perform recognition processing on the BEV image to obtain semantic information containing road markings, and to separate and obtain pixels containing text from the semantic information containing road markings.

[0037] The processing module is further configured to obtain text detection boxes containing road surface text based on the text-containing pixels and using a text recognition network, and save them to the map database.

[0038] In one optional implementation, the acquisition module is specifically used for:

[0039] The coordinate information of each point in the laser point cloud is traversed to obtain the maximum X value, the maximum Y value, the maximum Z value, the minimum X value, the minimum Y value, and the minimum Z value. The maximum X value, the maximum Y value, the maximum Z value, the minimum X value, the minimum Y value, and the minimum Z value are used as the boundary of the laser point cloud.

[0040] Obtain the GPS trajectory of the map collection vehicle, and based on the GPS trajectory, obtain the height range of the collection vehicle;

[0041] Based on the boundary of the laser point cloud and the height of the acquisition vehicle, the grid division parameters are determined, and the laser point cloud is divided according to the grid division parameters to construct a point cloud grid.

[0042] Using the GPS trajectory as seed points, effective grids are obtained from the point cloud grid based on the maximum height value in the height range of the map acquisition vehicle, a preset height threshold range, and a preset density threshold, wherein the point cloud density in each effective grid exceeds the preset density threshold.

[0043] The point set composed of the points of the effective grid is taken as the road surface point cloud.

[0044] A third aspect of the present invention provides a processing apparatus for road surface text recognition, the processing apparatus comprising: at least one processor and a cloud server;

[0045] The cloud server is used to store computer execution instructions;

[0046] The at least one processor executes the computer execution instructions of the cloud server to implement the above-described method for road surface text recognition.

[0047] A fourth aspect of the present invention provides a computer-readable storage medium storing computer-executable instructions, which, when executed by a processor, implement the above-described method for road surface text recognition.

[0048] A fifth aspect of the present invention provides a computer program product including computer instructions that, when executed by a processor, implement the above-described method for recognizing road surface text.

[0049] This invention provides a method, apparatus, and medium for road surface text recognition. The method includes: acquiring laser point clouds of the road surface to be recognized collected by a map acquisition vehicle; separating the road surface point cloud from the laser point cloud data according to a road surface separation algorithm, and converting the road surface point cloud into a two-dimensional intensity top-view BEV image; using an image semantic segmentation neural network to perform recognition processing on the BEV image to obtain semantic information containing road surface markings, and separating pixels containing text from the semantic information containing road surface markings; based on the pixels containing text, and using a text recognition network, obtaining text detection boxes containing road surface text content, and saving them to a map database. Compared with existing technologies, the road surface text recognition processing method provided by this invention automatically extracts and outputs road surface text elements, reducing labor costs while improving the accuracy and efficiency of high-precision map production carrying road surface printed text.

Attached Figure Description

[0050] Figure 1 is a flowchart of Embodiment 1 of the road surface text recognition processing method provided by the present invention;

[0051] Figure 2 is a flowchart of Embodiment 2 of the road surface text recognition processing method provided by the present invention;

[0052] Figure 3 is a flowchart of Embodiment 3 of the road surface text recognition processing method provided by the present invention;

[0053] Figure 4 is a flowchart of Embodiment 4 of the road surface text recognition processing method provided by the present invention;

[0054] Figure 5 is a schematic structural diagram of Embodiment 1 of the road surface text recognition processing device provided by the present invention;

[0055] Figure 6 is a schematic structural diagram of Embodiment 2 of the road surface text recognition processing device provided by the present invention.

Detailed Implementation

[0056] To make the objectives, technical solutions, and advantages of this invention clearer, the technical solutions in the embodiments of this invention will be clearly and completely described below in conjunction with the embodiments of this invention. Obviously, the described embodiments are only some embodiments of this invention, not all embodiments. Based on the embodiments of this invention, all other embodiments obtained by those skilled in the art without creative effort are within the scope of protection of this invention.

[0057] High-precision maps, which can serve autonomous driving systems, represent a new map data paradigm for self-driving vehicles. High-precision maps accurately and comprehensively represent road features and require higher real-time performance. Therefore, how to create high-precision maps with high accuracy has become a hot research topic for researchers in the field, especially the creation of high-precision maps with printed text on the road surface.

[0058] Existing technologies include methods for creating high-precision maps with printed road text based on data reported by map collection vehicles. However, since the data processed in this method is presented as images, and the clarity of the collected images is easily affected by weather, the high-precision maps with printed road text created from images with low clarity are not accurate enough. In addition, existing methods include a step of transforming the coordinates of the collected images into three-dimensional coordinates. This step has a deviation in the transformation position, which requires manual correction by those skilled in the art, thus reducing the efficiency of creating high-precision maps with printed road text.

[0059] Based on the above-mentioned technical problems, the inventive concept of this invention is: how to design a road surface text recognition method with high accuracy and high efficiency.

[0060] Figure 1 is a flowchart illustrating an embodiment of the road surface text recognition processing method provided by the present invention. The execution entity of the method flow shown in Figure 1 can be a road surface text recognition processing device, which can be implemented by any software and / or hardware. As shown in Figure 1, the road surface text recognition processing method provided in this embodiment may include:

[0061] S201, acquire the laser point cloud of the road surface to be identified collected by the map collection vehicle.

[0062] In this embodiment, when the map acquisition vehicle performs map acquisition and processing on the road surface to be identified, the high-precision acquisition equipment, such as LiDAR, mounted on the vehicle can acquire multiple LiDAR point clouds of the scene where the road surface to be identified is located. The LiDAR point clouds include three-dimensional coordinates, reflection intensity, etc. Specifically, the three-dimensional coordinates refer to the three-dimensional pixel coordinates of all objects on the road surface to be identified, in the airspace above it, and in its surroundings. The reflection intensity refers to the degree of reflection of different ground objects by the LiDAR.

[0063] It should be noted that after the map data collection vehicle acquires the laser point cloud of the road surface to be identified, the multiple single-frame laser point clouds acquired from the same location need to be fused to obtain a more accurate laser point cloud. Optionally, the map data collection vehicle can fuse multiple single-frame laser point clouds according to the positional relationship between consecutive frames and the coordinates of the acquisition area to obtain a fused laser point cloud. For ease of explanation, the fused laser point cloud will still be referred to as the laser point cloud in the following description.

[0064] Correspondingly, after the map data collection vehicle completes the fusion process, it sends the fused laser point cloud to the road surface text recognition processing device so that the processing device can perform road surface text recognition processing.

[0065] It is conceivable that the method provided in this embodiment can simultaneously process multiple different road surface laser point clouds to obtain text content for multiple different road surfaces to be identified, and then create high-precision maps based on the text content of multiple different road surfaces to be identified. Since the principle of processing laser point clouds of multiple different road surfaces to be identified is similar to that of processing laser point clouds of a single road surface to be identified, this embodiment only uses a single road surface to be identified as an example to illustrate the road surface text recognition processing.

[0066] S202, based on the road surface separation algorithm, separate and obtain the road surface point cloud from the laser point cloud, and convert the road surface point cloud into a two-dimensional intensity top-view BEV image.

[0067] In this embodiment, after receiving the laser point cloud uploaded by the acquisition vehicle, the processing device immediately retrieves the road separation algorithm stored in its internal memory. This algorithm was developed by technicians through repeated analysis and processing of a large number of point clouds. The algorithm is used to separate the point clouds corresponding to objects on the road surface to be identified and those above the road surface from the laser point cloud.

[0068] After the processing device separates the road surface point cloud from the laser point cloud, it retrieves the GPS driving trajectory of the map collection vehicle and combines it with the preset regional road segment index rules inside the processing device to map the road surface point cloud into a two-dimensional intensity top-view BEV image for subsequent road surface text recognition processing.

[0069] The preset regional road segment index rule specifically refers to the mapping relationship between GPS points in the GPS trajectory and road surface point cloud within a preset area. The specific implementation of the regional road segment index rule includes, but is not limited to, being set by those skilled in the art based on experience.

[0070] It is conceivable that, in this embodiment, after separating the road surface point cloud related to the road surface to be identified from the laser point cloud, the point cloud related to each GPS point is determined from the road surface point cloud according to the regional road segment index rules. In this way, the processing device will finally obtain the point cloud with the highest correlation to the road surface to be identified. This greatly reduces the amount of data to be processed and can improve the processing efficiency of road surface text recognition. In addition, since a preset area range of GPS points is set, the loss of point clouds related to the road surface text content is avoided. Furthermore, this embodiment maps the road surface point cloud into a high-resolution BEV image, which can improve the accuracy of road surface text recognition.

[0071] S203, use an image semantic segmentation neural network to perform recognition processing on the BEV image to obtain semantic information containing road markings, and separate pixels containing text from the semantic information containing road markings.

[0072] In this embodiment, after the processing device acquires the BEV image, it needs to input the BEV image into a preset image semantic segmentation neural network to identify the semantic information of the road markings, and extract the semantic information corresponding to the text markings from the semantic information of the road markings based on the text marking identifiers, that is, extract the text pixels containing the road text.

[0073] Optionally, the semantic segmentation network used in this embodiment can be the DeeplabV3Plus semantic segmentation network model, but is not limited to it; other neural network models that can be used for semantic segmentation may also be used. Specifically, the DeeplabV3Plus semantic segmentation network model in this embodiment has been trained using a road-related image training set before being put into use.

[0074] For example, a road sign can be “sign 1, text 2, and railing 3”. Correspondingly, after inputting the BEV image into the DeeplabV3Plus semantic segmentation network model, the semantic information corresponding to sign 1, text 2, and railing 3 can be obtained. Then, the processing device will extract the semantic information corresponding to text 2 from this semantic information.
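The extraction step in this example can be sketched as a filter over the per-pixel class labels output by the segmentation network. The class identifiers below (1 for sign, 2 for text, 3 for railing) simply mirror the numbering of the example and are assumptions; an actual model defines its own label map.

```python
TEXT_CLASS_ID = 2  # hypothetical label for "text", mirroring "text 2" in the example

def extract_text_pixels(label_map, text_class_id=TEXT_CLASS_ID):
    """Return the (row, col) coordinates of all pixels labeled as text."""
    return [
        (r, c)
        for r, row in enumerate(label_map)
        for c, label in enumerate(row)
        if label == text_class_id
    ]

# Toy 3x3 label map: 0 = background, 1 = sign, 2 = text, 3 = railing.
label_map = [
    [0, 1, 1],
    [0, 2, 2],
    [3, 2, 0],
]
text_pixels = extract_text_pixels(label_map)
```

The resulting coordinates are the text pixels that the later clustering step groups into text strings.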

[0075] S204. Based on pixels containing text, and using a text recognition network, obtain text detection boxes containing road surface text content, and save them to the map database.

[0076] In this embodiment, considering that the text pixels obtained from the image are all independent pixels, and that independent pixels cannot form continuous text content, the processing device needs to perform clustering processing on the text pixels after obtaining them to obtain a set of pixels associated with each road surface text. Optionally, the processing device can use a clustering algorithm to cluster the text pixels to obtain a set of multiple pixels, and generate a text string pixel based on these pixel sets.

[0077] In order to identify the text content corresponding to the pixels of the text string, the processing device converts the text string pixels into an atomic image of the text string, and then obtains the text content of the road surface to be identified based on the recognition processing of the image.

[0078] Furthermore, to improve the efficiency and accuracy of text recognition and processing, the processing device needs to perform color processing on the atomic images of the text string. Optionally, the processing device can use an adaptive single-peak thresholding method to perform color processing on the atomic images of the text string to obtain a binarized black and white text image.

[0079] Next, the processing device inputs the black and white text image into a pre-stored text recognition network. Optionally, the text recognition network used in this embodiment can be the RCNN text recognition network model, but is not limited to it; other neural network models that can be used for text recognition may also be used. As with the DeeplabV3Plus semantic segmentation network model, the RCNN text recognition network model was trained using a relevant training dataset before being put into use. Based on this, the processing device can obtain text detection boxes containing road surface text content, read the text detection boxes containing the text content, obtain the text content of the road surface to be recognized, and simultaneously save the text content in the map data.

[0080] This embodiment provides a method for recognizing road surface text. It acquires laser point clouds of the road surface to be identified from a map-collecting vehicle. Based on a road surface separation algorithm, it extracts the road surface point cloud from the laser point cloud and converts it into a two-dimensional intensity-based overhead BEV image. An image semantic segmentation neural network is then used to process the BEV image to obtain semantic information containing road markings. Pixels containing text are then extracted from this semantic information. Finally, based on these text-containing pixels, a text recognition network is used to obtain text detection boxes containing the road surface text content, which are then saved to a map database. Compared to existing technologies, this embodiment utilizes laser point clouds to extract text content from the road surface to be identified, solving the problem of insufficient accuracy in high-precision map production due to weather conditions. It also avoids the need for manual calibration by technicians, thus improving the efficiency of high-precision map production.

[0081] The following, with reference to Figure 2, further explains how the road surface text recognition processing method provided by the present invention separates and obtains the road surface point cloud from the laser point cloud according to the road surface separation algorithm. Figure 2 is a flowchart illustrating a second embodiment of the road surface text recognition processing method provided by the present invention. As shown in Figure 2, the road surface text recognition processing method provided in this embodiment may include:

[0082] S301, traverse the coordinate information of each point in the laser point cloud to obtain the maximum X value, maximum Y value, maximum Z value, minimum X value, minimum Y value and minimum Z value, and use the maximum X value, maximum Y value, maximum Z value, minimum X value, minimum Y value and minimum Z value as the boundary of the laser point cloud.

[0083] In this embodiment, when separating the road surface point cloud from the laser point cloud, it is first necessary to determine the boundary of the space formed by the laser point cloud, and then perform the road surface point cloud extraction process based on the boundary of the space.

[0084] It is conceivable that laser point clouds carry three-dimensional coordinates, and a large number of three-dimensional coordinates can be gathered together to form a three-dimensional space. Correspondingly, the boundary of the space formed by the laser point cloud can be determined by the range of values of the three-dimensional coordinates of the laser point cloud on the X-axis, Y-axis and Z-axis.

[0085] Specifically, the processing device traverses the coordinate information of each point in the laser point cloud, that is, the coordinate values of its three-dimensional coordinates on the X-axis, Y-axis and Z-axis, and then determines the maximum and minimum X values in the X-axis direction, the maximum and minimum Y values in the Y-axis direction, and the maximum and minimum Z values in the Z-axis direction. Thus, the processing device determines the spatial boundary corresponding to the laser point cloud.
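The traversal described in S301 amounts to a single pass that tracks the running minimum and maximum on each axis. A minimal sketch, assuming points are plain (x, y, z) tuples and using a hypothetical function name:

```python
def point_cloud_bounds(points):
    """Return ((MinX, MinY, MinZ), (MaxX, MaxY, MaxZ)) in one pass over the points."""
    iterator = iter(points)
    try:
        x, y, z = next(iterator)
    except StopIteration:
        raise ValueError("the point cloud is empty")
    min_x = max_x = x
    min_y = max_y = y
    min_z = max_z = z
    for x, y, z in iterator:
        min_x, max_x = min(min_x, x), max(max_x, x)
        min_y, max_y = min(min_y, y), max(max_y, y)
        min_z, max_z = min(min_z, z), max(max_z, z)
    return (min_x, min_y, min_z), (max_x, max_y, max_z)

cloud = [(1.0, 2.0, 0.1), (4.5, -3.0, 0.3), (-2.0, 7.5, 2.2)]
mins, maxs = point_cloud_bounds(cloud)
```

The two returned tuples together form the axis-aligned boundary of the laser point cloud that the grid division step starts from.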

[0086] This can be understood as follows: when the map collection vehicle is performing map collection and processing on the road surface to be identified, all objects in the scene that the map collection vehicle passes through can be collected by the LiDAR and form point clouds. Therefore, the LiDAR point cloud is equivalent to displaying the scene that the road surface to be identified passes through in a three-dimensional form, and the boundary of the space formed by the LiDAR point cloud is actually the boundary of the scene that the road surface to be identified passes through.

[0087] S302: Determine the meshing parameters based on the boundary of the laser point cloud, and perform meshing processing on the laser point cloud according to the meshing parameters to construct the point cloud mesh.

[0088] In this embodiment, the processing device pre-stores the size of a single voxel grid, which serves as the unit volume of the space formed by the laser point cloud. This size can be used to perform grid division processing on the space, and the size of the single voxel grid can be set by technicians according to the actual scene. For example, if the surrounding scene of the road surface to be identified is relatively complex, the value of the single voxel grid should be set to be smaller to ensure the accuracy of the processing.

[0089] Specifically, for ease of explanation, this embodiment uses the following example to illustrate the mesh division process:

[0090] For example, the maximum and minimum X values in the X-axis direction, the maximum and minimum Y values in the Y-axis direction, and the maximum and minimum Z values in the Z-axis direction are denoted MaxX, MinX, MaxY, MinY, MaxZ, and MinZ, respectively. The size of a single voxel grid is set to gridSize = 0.5m (the length of a single grid in the XYZ directions is 0.5m). At the same time, it is assumed that the maximum slope of the road surface to be identified, maxSlope, is 0.11.

[0091] The processing device can use the following formula (1) to calculate the number of grids in the X-axis direction countX, use the following formula (2) to calculate the number of grids in the Y-axis direction countY, use the following formula (3) to calculate the number of grids in the Z-axis direction countZ, and use the following formula (4) to calculate the XY diagonal range maxXY.

[0092] countX = (MaxX - MinX) / gridSize   Formula (1)

[0093] countY = (MaxY - MinY) / gridSize   Formula (2)

[0094] countZ = (MaxZ - MinZ) / gridSize   Formula (3)

[0095]

[0096] Then, the processing device obtains the grid division parameters: countX, countY, countZ, and maxXY.
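As an illustrative sketch only, formulas (1) to (3) can be evaluated as follows. The function name `grid_counts` and the rounding up to a whole number of grids are assumptions made for this example; the text gives only the quotient. Formula (4) for maxXY is not reproduced in the text, so it is not computed here.

```python
import math

def grid_counts(min_xyz, max_xyz, grid_size=0.5):
    """Number of voxel grids along X, Y and Z per formulas (1)-(3).

    Rounding up with math.ceil (and keeping at least one grid per axis)
    is an assumption of this sketch, not something stated in the text.
    """
    return tuple(
        max(1, math.ceil((hi - lo) / grid_size))
        for lo, hi in zip(min_xyz, max_xyz)
    )

# Hypothetical boundary values (MinX, MinY, MinZ) and (MaxX, MaxY, MaxZ).
count_x, count_y, count_z = grid_counts((0.0, 0.0, 0.0), (100.0, 50.0, 10.2))
print(count_x, count_y, count_z)
```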

[0097] Optionally, the processing device can divide the corresponding number of grids in the X-axis, Y-axis and Z-axis directions, and determine the maximum boundary of the grid by combining the length range of the XY diagonal. Then, the processing device constructs the space corresponding to the laser point cloud into a point cloud grid and obtains the number of the point cloud grid on the corresponding X-axis, Y-axis and Z-axis. This number can be used to represent the spatial position of each grid in the point cloud grid on the X-axis, Y-axis and Z-axis.

[0098] It is conceivable that after the processing device processes the laser point cloud data according to S302, the space corresponding to the laser point cloud is divided into multiple small cubes, each of which contains at least one laser point.
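The division into small cubes can be sketched as a grouping of points by grid index. This is a minimal example under stated assumptions: the name `voxelize` is hypothetical, points are plain (x, y, z) tuples, and grid indices start from 0 here for simplicity, whereas the embodiment numbers grids from 1.

```python
from collections import defaultdict

def voxelize(points, min_xyz, grid_size=0.5):
    """Group (x, y, z) points by the index of the cube that contains them.

    Only cubes that receive at least one point appear in the result.
    Zero-based indices are an assumption of this sketch.
    """
    cells = defaultdict(list)
    for p in points:
        key = tuple(int((c - lo) // grid_size) for c, lo in zip(p, min_xyz))
        cells[key].append(p)
    return dict(cells)

cells = voxelize([(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (0.6, 0.1, 0.1)],
                 (0.0, 0.0, 0.0))
print(sorted(cells))
```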

[0099] After dividing the laser point cloud into point cloud grids, the processing device also needs to obtain the GPS trajectory of the map collection vehicle according to S303, and determine the grids related to the GPS trajectory in order to obtain the road surface point cloud.

[0100] S303: Obtain the GPS trajectory of the map collection vehicle and, based on the GPS trajectory, obtain the height range of the collection vehicle. Using the GPS trajectory as seed points, obtain valid grids from the point cloud grid based on the maximum height value in the height range of the map collection vehicle, the preset height threshold range, and the preset density threshold. The point cloud density in each valid grid exceeds the preset density threshold. The point set composed of the points of the valid grids is taken as the road surface point cloud.

[0101] In this embodiment, the processing device acquires the GPS trajectory recorded by the map collection vehicle while it collects the road surface to be identified. Optionally, the GPS trajectory can be generated by the dashcam of the map collection vehicle; this embodiment does not exclude acquiring the GPS trajectory of the map collection vehicle by other means.

[0102] It should be noted that the high-precision data acquisition equipment of the map acquisition vehicle is set at a certain height. The processing device needs to determine the vehicle height based on the coordinates of each GPS point in the GPS trajectory along the Z-axis; that is, the maximum value of the GPS trajectory point coordinates along the Z-axis is used as the vehicle height. Furthermore, since the road surface to be identified may have slopes, to better reflect the actual scene, the vehicle height range can be determined based on the GPS trajectory point coordinates. Then, point cloud meshes within this height range can be selected.

[0103] Specifically, before filtering out point cloud grids within the height range, the processing device needs to establish an index relationship between each GPS point in the GPS trajectory and the point cloud grid. Optionally, to reduce the amount of data processing, the processing device only processes the coordinate values of the laser point cloud in the X-axis direction and the coordinate values in the Y-axis direction in this step. Preferably, the processing device calculates the coordinates (x, y) of each point cloud grid in the point cloud grid using the following formula (5) based on MinX, MinY, and gridSize.

[0104] x = MinX + m*gridSize, y = MinY + n*gridSize   Formula (5)

[0105] Where m represents the grid number in the X-axis direction (numbering starts from 1), and n represents the grid number in the Y-axis direction (numbering starts from 1).

[0106] Then, for each point cloud grid, the processing device calculates the distance between the coordinates of the point cloud grid and the coordinates of each GPS point, determines the GPS point closest to the point cloud grid, and establishes an index relationship between the point cloud grid and that GPS point. For example, the index relationship established between the first GPS point and the point cloud grid numbered (1,1) can be written as (1,(1,1)).
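The index relationship can be sketched as a nearest-neighbor lookup in the XY plane. The function names `cell_xy` and `index_cells_to_gps` are hypothetical, and the brute-force search over all GPS points is an assumption chosen for brevity; the text does not say how the nearest GPS point is searched.

```python
import math

def cell_xy(m, n, min_x, min_y, grid_size):
    """Formula (5): planar coordinates of the grid numbered (m, n), numbering from 1."""
    return (min_x + m * grid_size, min_y + n * grid_size)

def index_cells_to_gps(cell_numbers, gps_xy, min_x, min_y, grid_size=0.5):
    """Map each grid number (m, n) to the 1-based number of its nearest GPS point."""
    index = {}
    for (m, n) in cell_numbers:
        cx, cy = cell_xy(m, n, min_x, min_y, grid_size)
        nearest = min(
            range(len(gps_xy)),
            key=lambda i: math.hypot(gps_xy[i][0] - cx, gps_xy[i][1] - cy),
        )
        index[(m, n)] = nearest + 1  # GPS points are numbered from 1 here
    return index

# Hypothetical GPS points and grid numbers.
index = index_cells_to_gps([(1, 1), (10, 10)], [(0.5, 0.5), (5.0, 5.0)], 0.0, 0.0)
print(index)
```

In this sketch the entry for grid (1, 1) corresponds to the pair written as (1,(1,1)) in the example above.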

[0107] More specifically, the processing device filters the point cloud grid that has an index relationship with the GPS points according to the aforementioned established index relationship, so as to obtain the road surface point cloud.

[0108] It should be noted that the processing device stores a preset height threshold range and a preset density threshold. The height threshold range represents the range of height fluctuations of the vehicle traveling on the road surface to be identified, and the preset density threshold represents the reasonableness of the point cloud grid density, i.e., the basis for ensuring that the point cloud grid contains neither too many nor too few points. The specific implementation methods of these two preset thresholds include, but are not limited to, being set by those skilled in the art based on experience.

[0109] Furthermore, the processing device uses GPS points as seed points to extend the search for valid point cloud grids. Specifically, based on the maximum height value within the height range of the map acquisition vehicle, the processing device iterates through the coordinates of all GPS points in the Z-axis direction, calculates the difference between the maximum height value and the coordinates of each GPS point in the Z-axis direction, and determines whether the absolute value of this difference falls within a preset height threshold range. If so, it proves that the point cloud grid corresponding to the current GPS point falls on the road surface where the map acquisition vehicle is traveling, and this point cloud grid is designated as a qualified grid.

[0110] Furthermore, the processing device needs to perform further screening of the qualified grids to determine whether the current qualified grid is a valid grid. Optionally, for each qualified grid, the processing device calculates the point cloud density of the qualified grid and compares the point cloud density with a preset density threshold. When the point cloud density exceeds the preset density threshold, the qualified grid corresponding to the point cloud density is regarded as a valid grid.

[0111] Similarly, the processing device can determine whether the point cloud grid corresponding to each GPS point is a valid grid, and then use the point set composed of the point clouds of the valid point cloud grids as the road surface point cloud.
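The two-stage screening of [0109] to [0111] can be sketched as below. This is a simplified example under assumptions: `select_road_points` and its argument layout are hypothetical, GPS points are indexed from 0, and point cloud density is approximated by the number of points in a grid, which the text does not specify.

```python
def select_road_points(gps_z, gps_to_cells, cell_points,
                       height_range, density_threshold):
    """Collect points of valid grids, using GPS points as seeds.

    gps_z:         Z coordinate of each GPS point
    gps_to_cells:  GPS point index -> grids indexed to that GPS point
    cell_points:   grid index -> points inside that grid
    height_range:  (low, high) preset height threshold range
    """
    max_height = max(gps_z)  # maximum value in the vehicle height range
    low, high = height_range
    road = []
    for i, z in enumerate(gps_z):
        # Stage 1: qualified grid if |max_height - z| lies in the height range.
        if not (low <= abs(max_height - z) <= high):
            continue
        for cell in gps_to_cells.get(i, []):
            pts = cell_points.get(cell, [])
            # Stage 2: valid grid if its density exceeds the preset threshold
            # (density taken as point count, an assumption of this sketch).
            if len(pts) > density_threshold:
                road.extend(pts)
    return road

road = select_road_points(
    gps_z=[10.0, 10.2, 3.0],
    gps_to_cells={0: [(0, 0)], 1: [(1, 0)], 2: [(2, 0)]},
    cell_points={(0, 0): ["a", "b", "c"], (1, 0): ["d"], (2, 0): ["e", "f", "g"]},
    height_range=(0.0, 0.5),
    density_threshold=2,
)
print(road)
```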

[0112] This embodiment specifically illustrates the processing steps for quickly separating road surface point clouds from laser point clouds. By utilizing the processing steps provided in this embodiment, the amount of data processed for road surface text recognition can be reduced, thereby improving processing efficiency.

[0113] With reference to Figure 3, the following further explains how the road surface point cloud is converted into a two-dimensional intensity top-view BEV image in the road surface text recognition processing method provided by the present invention. Figure 3 is a flowchart illustrating Embodiment 3 of the road surface text recognition processing method provided by the present invention. As shown in Figure 3, the road surface text recognition processing method provided in this embodiment may include:

[0114] S401: Divide the GPS track into multiple track segments according to the preset distance, where each track segment includes multiple GPS points.

[0115] In this embodiment, in order to convert the road surface point cloud into a two-dimensional intensity top-down BEV image, the processing device divides the GPS trajectory into segments so as to determine the road surface point cloud to be converted based on the center GPS point of the segmented trajectory. This can be understood as the processing device filtering out the road surface point cloud to be converted according to preset regional road segment area search rules.

[0116] Specifically, the processing device stores a preset distance, which represents the distance of each track segment after the GPS track has been divided. The specific implementation of the preset distance includes, but is not limited to, methods set by those skilled in the art based on experience. The processing device equally divides the GPS track segment into multiple track segments according to the preset distance, and each track segment includes multiple GPS points.

[0117] S402: Traverse each track segment to verify whether the multiple GPS points in the track segment actually fall within the track segment.

[0118] In this embodiment, considering the curvature of the road surface to be identified, it is necessary to verify the GPS points in each trajectory segment to prevent the GPS points corresponding to the curves formed by the map collection vehicle from deviating from their intended road segments when passing through large curves. This also prevents the conversion processing using the center GPS point of each trajectory segment as the conversion center from being inaccurate.

[0119] Specifically, the processing device calculates the distance centerDis_k from the center GPS point of each trajectory segment to the starting GPS point of the GPS trajectory according to the following formula (6), where k (k = 1, 2, ...) denotes the k-th segment of the GPS track, D denotes the preset distance, and overlay denotes the repetition rate of the GPS track length between adjacent track segments.

[0120] centerDis_k = (k*(1-overlay)+0.5)*D   Formula (6)

[0121] This can be understood as the processing device obtaining the first distance, centerDis_k, from the center GPS point in the k-th trajectory segment to the starting GPS point of the GPS trajectory.

[0122] More specifically, the processing device iterates through the k-th trajectory segment and checks whether the second distance between each other GPS point and the center GPS point is within centerDis_k / 2, in order to determine whether the other GPS points fall within the k-th trajectory segment.

[0123] In other words, the processing device calculates the second distance from each non-center GPS point in the k-th trajectory segment to the center GPS point of that trajectory segment, and determines whether the second distance is within half of the first distance. If so, it determines that the current GPS point actually falls within the current k-th trajectory segment.
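Formula (6) and the verification in S402 can be sketched as follows. The function names are hypothetical, GPS points are treated as planar (x, y) tuples, and the second distance is taken as the straight-line distance to the center GPS point; these are assumptions of the example rather than details given in the text.

```python
import math

def center_distance(k, preset_distance, overlay):
    """Formula (6): centerDis_k = (k * (1 - overlay) + 0.5) * D."""
    return (k * (1 - overlay) + 0.5) * preset_distance

def verify_segment(k, gps_points, center, preset_distance, overlay):
    """For each GPS point, report whether its distance to the center GPS
    point is within centerDis_k / 2, as described in [0122]."""
    limit = center_distance(k, preset_distance, overlay) / 2
    return [math.dist(p, center) <= limit for p in gps_points]

# Hypothetical values: D = 40 m, no overlap between adjacent segments.
print(center_distance(1, 40.0, 0.0))
print(verify_segment(1, [(10.0, 0.0), (40.0, 0.0)], (0.0, 0.0), 40.0, 0.0))
```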

[0124] Furthermore, after the processing device completes the verification process of the trajectory segment where the GPS point falls, the processing device will continue to execute S403.

[0125] S403: Using the center GPS point in the trajectory segment as the center, the maximum and minimum recording times of the recording times corresponding to the GPS points in the trajectory segment are used as the time interval. The road point cloud is traversed to obtain the first road point cloud within the time interval.

[0126] In this embodiment, the processing device determines the center GPS point for each trajectory segment. Specifically, it uses the starting GPS point of the trajectory segment as the first GPS point and obtains the GPS point in the middle of the sequence number as the center GPS point. Simultaneously, the processing device also needs to obtain the recording time corresponding to each GPS point collected within the trajectory segment, determine the maximum and minimum recording times for that trajectory segment, and determine the time interval based on these maximum and minimum recording times.

[0127] Subsequently, the processing device retrieves the point cloud grid corresponding to each GPS point in the trajectory segment, as well as the point cloud in the point cloud grid and the recording time of the point cloud, so that the first road surface point cloud located within the time interval can be determined based on the recording time of the point cloud.

[0128] As can be seen, this embodiment performs filtering processing on the road surface point cloud, filtering out point clouds that only fall within the time interval of the current trajectory segment. This reduces the amount of data to be processed and improves processing efficiency while ensuring the accuracy of road surface text recognition.
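The time-interval filtering of S403 can be sketched in a few lines. The tuple layout (x, y, z, intensity, t) and the function name `points_in_time_interval` are assumptions made for this example.

```python
def points_in_time_interval(points, gps_times):
    """Keep points whose recording time lies between the minimum and
    maximum recording times of the GPS points in the trajectory segment.

    points:    iterable of (x, y, z, intensity, t) tuples (assumed layout)
    gps_times: recording times of the GPS points in the segment
    """
    t_min, t_max = min(gps_times), max(gps_times)
    return [p for p in points if t_min <= p[4] <= t_max]

first_road_points = points_in_time_interval(
    [(0.0, 0.0, 0.0, 12.0, 99.0),
     (1.0, 0.0, 0.0, 30.0, 100.5),
     (2.0, 0.0, 0.0, 25.0, 103.0)],
    gps_times=[100.0, 101.0, 102.0],
)
print(first_road_points)
```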

[0129] S404: Based on the distance from the first road surface point cloud to each GPS point in the trajectory segment, filter out the second road surface point cloud that meets the preset range, and map the second road surface point cloud to the preset two-dimensional pixel coordinates to record the position index between the second road surface point cloud and the pixel coordinates; generate a BEV image from the second road surface point cloud based on the position index.

[0130] In this embodiment, since the first road surface point cloud data obtained in S403 may be far from the ground, that is, in some scenarios, the point cloud within the vehicle height range may be objects above the road surface, such as speed measuring devices, the processing device needs to further filter the first road surface point cloud to obtain a more accurate road surface point cloud, so that the processing device can obtain the text content of the road surface to be identified with a smaller amount of data.

[0131] Specifically, for each point of the first road surface point cloud, the processing device drops a perpendicular from the point to the current trajectory segment, takes the length of the perpendicular as the first distance and its foot as the perpendicular foot point, and at the same time calculates the second distance between the perpendicular foot point and the starting GPS point of the current trajectory segment.

[0132] It should be noted that the processing device has a preset distance threshold, which is used as the radius of the rectangle with the central GPS point as the interception center, and the specific implementation of the distance threshold includes, but is not limited to, being set by those skilled in the art based on experience.

[0133] Subsequently, the processing device compares the first distance and the second distance with the distance threshold respectively. When both the first distance and the second distance are less than the distance threshold, the point of the first road surface point cloud is determined to belong to the second road surface point cloud. In this way, the processing device can extract the second road surface point cloud from the first road surface point cloud and simultaneously determine the maximum and minimum values of the second road surface point cloud's coordinates along the X and Y axes, thereby determining the spatial boundary corresponding to the second road surface point cloud. For example, the maximum value of the second road surface point cloud on the X-axis is denoted X_max1, the minimum value on the X-axis is denoted X_min1, the maximum value on the Y-axis is denoted Y_max1, and the minimum value on the Y-axis is denoted Y_min1.
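The perpendicular-foot screening can be sketched as below. As an assumption of this example, the trajectory segment is approximated by the straight line from its starting GPS point to its last GPS point, and only X and Y coordinates are used; the names `foot_and_distances` and `filter_second_points` are hypothetical.

```python
import math

def foot_and_distances(p, start, end):
    """Project point p onto the line start -> end (assumed straight segment).

    Returns (foot, first_distance, second_distance), where first_distance is
    the perpendicular distance and second_distance is the distance from the
    perpendicular foot point to the starting GPS point.
    """
    sx, sy = start
    dx, dy = end[0] - sx, end[1] - sy
    t = ((p[0] - sx) * dx + (p[1] - sy) * dy) / (dx * dx + dy * dy)
    foot = (sx + t * dx, sy + t * dy)
    first = math.hypot(p[0] - foot[0], p[1] - foot[1])
    second = math.hypot(foot[0] - sx, foot[1] - sy)
    return foot, first, second

def filter_second_points(points, start, end, threshold):
    """Keep points whose first and second distances are both below threshold."""
    kept = []
    for p in points:
        _, first, second = foot_and_distances(p, start, end)
        if first < threshold and second < threshold:
            kept.append(p)
    return kept

kept = filter_second_points([(3.0, 4.0), (3.0, 6.0), (8.0, 1.0)],
                            start=(0.0, 0.0), end=(10.0, 0.0), threshold=5.0)
print(kept)
```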

[0134] More specifically, the processing device constructs two-dimensional pixel image coordinates based on the distance threshold and a preset resolution. Optionally, the square of the distance threshold can be used to represent the range of the two-dimensional pixel image, while the preset resolution is used as the basis for dividing the two-dimensional pixel image into coordinates, thereby constructing the two-dimensional pixel coordinates. For example, if the distance threshold is 20 m (X_max1 = Y_min1 = 20 m) and the preset grid resolution is 0.02 m, the pixel coordinate grid of the two-dimensional image corresponding to the second road surface point cloud is (2000, 2000), which corresponds to a side length of 2 × 20 m = 40 m divided by 0.02 m per pixel.

[0135] Furthermore, the processing device traverses the second road surface point cloud corresponding to each trajectory segment and calculates, according to the following formula (7), the coordinate index of each point (X_i, Y_j) of the second road surface point cloud in the constructed two-dimensional pixel coordinate grid:

[0136]

[0137] Where i represents the row vector index number corresponding to the second road surface point cloud and the two-dimensional pixel coordinates, and j represents the column vector index number corresponding to the second road surface point cloud and the two-dimensional pixel coordinates. Accordingly, the row vector index number and the column vector index number constitute the two-dimensional pixel coordinates corresponding to the second road surface point cloud.

[0138] Based on this, the processing device obtains the pixel coordinates corresponding to the second road surface point cloud. By calculating the average reflection intensity of all point clouds at each pixel coordinate, the pixel value of each pixel coordinate can be obtained. The processing device combines the pixel coordinates corresponding to the second road surface point cloud and the pixel values of the pixel coordinates for processing to generate a BEV image.
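The generation of the BEV image from average reflection intensity can be sketched as follows. Because formula (7) is not reproduced in the text, the pixel mapping used here (row from the Y offset, column from the X offset, each divided by the resolution) is an assumption of this sketch, as are the name `build_bev` and the (x, y, intensity) tuple layout.

```python
from collections import defaultdict

def build_bev(points, x_min, y_min, resolution, size):
    """Rasterize (x, y, intensity) points into a size x size intensity image.

    Returns (image, index): image[i][j] is the mean intensity of the points
    in pixel (i, j); index maps each pixel to the positions of its points,
    i.e. the position index between point cloud and pixel coordinates.
    """
    sums = defaultdict(float)
    index = defaultdict(list)
    for n, (x, y, intensity) in enumerate(points):
        i = int((y - y_min) / resolution)  # row index (assumed mapping)
        j = int((x - x_min) / resolution)  # column index (assumed mapping)
        if 0 <= i < size and 0 <= j < size:
            index[(i, j)].append(n)
            sums[(i, j)] += intensity
    image = [[0.0] * size for _ in range(size)]
    for (i, j), members in index.items():
        image[i][j] = sums[(i, j)] / len(members)
    return image, dict(index)

image, index = build_bev(
    [(0.01, 0.01, 10.0), (0.015, 0.012, 20.0), (0.05, 0.01, 30.0)],
    x_min=0.0, y_min=0.0, resolution=0.02, size=4,
)
print(image[0])
```

Keeping the pixel-to-point index alongside the image is what later allows recognized text pixels to be traced back to their original points.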

[0139] This embodiment specifically illustrates the processing method for converting road surface point clouds into two-dimensional BEV images. This method reduces the amount of data processing while ensuring the accuracy of extracting relevant road surface point clouds through the correspondence between GPS trajectory points and point cloud positions.

[0140] With reference to Figure 4, the following further explains how the road surface text recognition processing method provided by the present invention obtains text detection boxes containing road surface text content based on the pixels containing text and by using a text recognition network. Figure 4 is a flowchart illustrating Embodiment 4 of the road surface text recognition processing method provided by the present invention. As shown in Figure 4, the road surface text recognition processing method provided in this embodiment may include:

[0141] S501: For each text pixel, use the connected component search method to obtain the associated text pixels, and perform clustering processing on the text pixels and the associated text pixels to obtain the text string pixels.

[0142] In this embodiment, after the processing device processes the BEV image using a text semantic segmentation neural network, it obtains the text pixels corresponding to the text content. The text pixels need to be processed into a detection image containing semantic information.

[0143] Specifically, the processing device performs clustering processing on the text pixels. Optionally, an eight-neighborhood connected component range search method can be used to search for associated text pixels in eight directions: top, bottom, left, right, top left, top right, bottom left, and bottom right. The distance between the text pixel and each associated text pixel is calculated, and it is determined whether the distance is less than a preset range threshold. If so, the current associated text pixel and the current text pixel belong to the same text string. Subsequently, the text pixel and its associated text pixels that belong to the same text string can be stored in the same text pixel set to obtain the clustered text pixels, i.e., the text string pixels.
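The eight-neighborhood search can be sketched as a breadth-first traversal over the text pixels. The function name `cluster_text_pixels` is hypothetical, and the parameter `reach` is an assumed stand-in for the preset range threshold: with `reach=1` only the eight immediate neighbors are examined.

```python
from collections import deque

def cluster_text_pixels(pixels, reach=1):
    """Cluster (row, col) text pixels into text strings.

    Two pixels are associated when their row and column offsets are both
    at most `reach` (reach=1 gives standard eight-neighborhood connectivity).
    """
    remaining = set(pixels)
    offsets = [(dr, dc)
               for dr in range(-reach, reach + 1)
               for dc in range(-reach, reach + 1)
               if (dr, dc) != (0, 0)]
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            r, c = queue.popleft()
            for dr, dc in offsets:
                neighbor = (r + dr, c + dc)
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    cluster.append(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(cluster))
    return sorted(clusters)

print(cluster_text_pixels([(0, 0), (1, 1), (5, 5), (5, 6)]))
```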

[0144] S502: Based on the pixel coordinates of the text string pixels, obtain the corner points of the minimum bounding box of the text string pixels, and based on the corner points, obtain the text string atomic image corresponding to the text string pixels and the black and white text image corresponding to the text string atomic image.

[0145] Specifically, since text string pixels exist as a set of text pixels, the corner points of the text string pixels can be determined based on their coordinates. Optionally, by determining the boundary values of the text string pixels, the corner points of the minimum bounding box of the text string pixels can be determined, and cropping based on the corner points of the minimum bounding box can obtain the atomic map corresponding to the text string pixels.
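Determining the corner points from boundary values and cropping the atomic image can be sketched as below. An axis-aligned bounding box is assumed here, since the text speaks of boundary values; the names `bounding_box` and `crop` are hypothetical, and the image is a plain list of rows.

```python
def bounding_box(pixels):
    """Corner values (min_row, min_col, max_row, max_col) of the smallest
    axis-aligned box enclosing the (row, col) text string pixels."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)

def crop(image, box):
    """Cut the atomic image out of a row-major image, corners inclusive."""
    r0, c0, r1, c1 = box
    return [row[c0:c1 + 1] for row in image[r0:r1 + 1]]

# Hypothetical 4 x 4 image whose pixel value encodes its position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
box = bounding_box([(1, 1), (2, 3)])
print(box, crop(image, box))
```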

[0146] Furthermore, the processing device uses an adaptive single-peak threshold segmentation method to process the atomic image corresponding to the text string, thereby obtaining a binarized black and white text image.

[0147] S503: Use a text recognition network to perform text recognition processing on the black and white text image to obtain text detection boxes containing road surface text content.

[0148] In this embodiment, the processing device pre-stores a text recognition network. Optionally, this text recognition network can be an RCNN text recognition network that has completed transfer training, although other neural network models capable of recognizing text content may also be used. During transfer training, the RCNN text recognition network uses training data covering a large amount of Chinese, English, and numerical text. Furthermore, to adapt to black and white images, the RCNN text recognition network is fine-tuned on binarized black and white images during training so that it can handle various scenarios such as horizontal and vertical orientations, single characters, and multiple characters.

[0149] Specifically, the processing device inputs the binarized black and white text image into the transfer-trained RCNN text recognition network to detect the content in the black and white text image, thereby obtaining text detection boxes containing road surface text content. Then, by recognizing the detection boxes, the text content of the road surface to be recognized can be obtained.

[0150] Optionally, after recognizing the text content, the processor needs to bind the text content of the road surface to be recognized with an electronic map to complete the creation of a high-precision map carrying the printed text on the road surface, and to display the text content in the high-precision map for user use. Specifically, the processing device can obtain the point cloud set corresponding to the text string based on the corner points of the text string and the index of the point cloud corresponding to the text pixels in each text string. According to the average coordinates of the coordinates of the point cloud set, the coordinates in the actual electronic map corresponding to the road surface to be recognized can be obtained, thus completing the recording of the actual coordinates of the text content. Therefore, when creating and applying the high-precision map, the actual coordinates and the corresponding content can be directly read for creation or use.
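Recovering the actual coordinates of a text string from the pixel-to-point index can be sketched as follows. The name `text_string_position` and the data layout (a dictionary from pixel to point positions, and points as (x, y, z) tuples) are assumptions made for this example.

```python
def text_string_position(string_pixels, pixel_to_points, points):
    """Average the coordinates of all points indexed by the pixels of one
    text string; returns None if no point is associated with the string."""
    members = [n for px in string_pixels for n in pixel_to_points.get(px, [])]
    if not members:
        return None
    count = len(members)
    return tuple(sum(points[n][d] for n in members) / count for d in range(3))

points = [(0.0, 0.0, 0.0), (2.0, 2.0, 2.0), (10.0, 10.0, 10.0)]
pixel_to_points = {(0, 0): [0], (0, 1): [1], (5, 5): [2]}
print(text_string_position([(0, 0), (0, 1)], pixel_to_points, points))
```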

[0151] This embodiment specifically illustrates the method for quickly and accurately recognizing text content based on BEV image features by using a transfer-trained RCNN text recognition network. By utilizing the mapping relationship between point clouds and two-dimensional pixel coordinates, the impact of weather conditions on the collection of information about the road surface to be recognized is avoided, as is the need for manual correction of the recognized content by technicians. This improves the accuracy and efficiency of high-precision map production and, by enabling the production of more accurate high-precision maps, enhances user safety when driving with these maps.

[0152] Figure 5 is a schematic structural diagram of Embodiment 1 of the processing device for road surface text recognition provided by the present invention. As shown in Figure 5, the processing device 600 includes: an acquisition module 601 and a processing module 602.

[0153] The acquisition module 601 is used to acquire the laser point cloud of the road surface to be identified collected by the map acquisition vehicle.

[0154] The processing module 602 is used to separate and obtain the road surface point cloud from the laser point cloud data according to the road surface separation algorithm, and convert the road surface point cloud into a two-dimensional intensity top-view BEV image.

[0155] The acquisition module 601 is also used to perform recognition processing on the BEV image using an image semantic segmentation neural network to obtain semantic information containing road markings, and to separate and obtain pixels containing text from the semantic information containing road markings.

[0156] The processing module 602 is also used to obtain text detection boxes containing road surface text based on the pixels containing text and using a text recognition network, and save them to the map database.

[0157] Optionally, module 601 is specifically used to:

[0158] Traverse the coordinate information of each point in the laser point cloud to obtain the maximum X value, maximum Y value, maximum Z value, minimum X value, minimum Y value, and minimum Z value, and use the maximum X value, maximum Y value, maximum Z value, minimum X value, minimum Y value, and minimum Z value as the boundary of the laser point cloud.

[0159] Obtain the GPS trajectory of the map data collection vehicle, and based on the GPS trajectory, determine the altitude range of the vehicle.

[0160] Based on the boundaries of the laser point cloud and the height of the acquisition vehicle, the grid division parameters are determined, and the laser point cloud is divided according to the grid division parameters to construct the point cloud grid.

[0161] Using GPS trajectory as seed points, effective grids are obtained from point cloud grids based on the maximum height value in the height range of the map collection vehicle, the preset height threshold range, and the preset density threshold. The point cloud density in each effective grid exceeds the preset density threshold.

[0162] The point set composed of points from the effective grid is used as the road surface point cloud.

[0163] Optionally, processing module 602 is specifically used for:

[0164] The GPS track is divided according to a preset distance to obtain multiple track segments, each of which includes multiple GPS points.

[0165] Traverse each track segment to verify whether multiple GPS points within the track segment actually fall within the track segment;

[0166] For each trajectory segment after verification:

[0167] Using the central GPS point in the trajectory segment as the center, the maximum and minimum recording times of the recording times corresponding to all GPS points in the trajectory segment are used as the time interval. The road point cloud is traversed to obtain the first road point cloud within the time interval.

[0168] Based on the distance from the first road surface point cloud to each GPS point in the trajectory segment, a second road surface point cloud that meets the preset range is selected, and the second road surface point cloud is mapped to the preset two-dimensional pixel coordinates to record the position index of the second road surface point cloud and the pixel coordinates.

[0169] Based on the location index, generate a BEV image from the second road surface point cloud.

[0170] Optionally, the processing module 602 is also specifically used for:

[0171] For the k-th trajectory segment, based on the preset distance D, the formula is used:

[0172] centerDis_k = (k*(1-overlay)+0.5)*D

[0173] Obtain the first distance from the center GPS point in the k-th trajectory segment to the starting GPS point of the GPS trajectory, and iterate through the other GPS points in the k-th trajectory segment to check whether their second distances from the center GPS point are within centerDis_k / 2, so as to determine whether the other GPS points fall within the k-th trajectory segment;

[0174] Here, overlay represents the repetition rate of GPS trajectory lengths between adjacent trajectory segments.

[0175] Optionally, the processing module 602 is also specifically used for:

[0176] For each text pixel, the connected component search method is used to obtain the associated text pixels. Then, the text pixels and their associated text pixels are clustered to obtain the text string pixels.

[0177] Based on the pixel coordinates of the text string pixels, obtain the corner points of the minimum bounding box of the text string pixels, and based on the corner points, obtain the text string atomic image corresponding to the text string pixels and the black and white text image corresponding to the text string atomic image.

[0178] A text recognition network is used to perform text recognition processing on black and white text images to obtain text detection boxes containing road surface text content.

[0179] The road surface text recognition processing device provided in this embodiment is similar in principle and technical effect to the road surface text recognition processing method described above, and will not be described in detail here.

[0180] Figure 6 is a schematic structural diagram of Embodiment 2 of the road surface text recognition processing device provided by the present invention. This road surface text recognition processing device can be, for example, a server. As shown in Figure 6, the processing device 700 includes a cloud server 701 and at least one processor 702.

[0181] Cloud server 701 is used to store computer execution instructions, laser point clouds, and GPS tracks.

[0182] The processor 702 is used to implement the road surface text recognition processing method in this embodiment when the computer execution instructions in the cloud server 701 are executed. For the specific implementation principle, please refer to the above embodiment. This embodiment will not be repeated here.

[0183] The road surface text recognition processing device 700 may also include an input / output interface 703.

[0184] The input / output interface 703 may include independent output and input interfaces, or it may be an integrated interface that combines input and output. The output interface is used to output data, and the input interface is used to acquire input data.

[0185] The present invention also provides a computer-readable storage medium storing computer-executable instructions, which, when at least one processor of the road surface text recognition processing device executes the computer-executable instructions, implement the road surface text recognition processing method in the above embodiments.

[0186] The present invention also provides a computer program product, including computer instructions, wherein when the computer instructions are executed by a processor, the processing method for road surface text recognition provided in the various embodiments described above is implemented.

[0187] Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, and not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that modifications can still be made to the technical solutions described in the foregoing embodiments, or equivalent substitutions can be made to some or all of the technical features; and these modifications or substitutions do not cause the essence of the corresponding technical solutions to deviate from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for recognizing road surface text, characterized in that it comprises: obtaining the laser point cloud of the road surface to be identified, collected by the map collection vehicle; determining the boundary of the laser point cloud based on the coordinate information of the laser point cloud; dividing the laser point cloud based on the boundary of the laser point cloud to construct a point cloud grid; using the GPS trajectory of the map collection vehicle as seed points, and combining a preset threshold range related to the driving height of the map collection vehicle and a point cloud density threshold, selecting effective grids from the point cloud grid; taking the set of point clouds within the effective grids as the road surface point cloud, and converting the road surface point cloud into a two-dimensional intensity top-view BEV image; using an image semantic segmentation neural network to perform recognition processing on the BEV image to obtain semantic information containing road markings, and separating pixels containing text from the semantic information containing road markings; and, based on the pixels containing text and using a text recognition network, obtaining text detection boxes containing road surface text and saving them to the map database.

2. The method according to claim 1, characterized in that determining the boundary of the laser point cloud based on its coordinate information comprises: traversing the coordinate information of each point in the laser point cloud to obtain the maximum X value, the maximum Y value, the maximum Z value, the minimum X value, the minimum Y value, and the minimum Z value, and using these six values as the boundary of the laser point cloud; dividing the laser point cloud according to its boundary to construct a point cloud grid comprises: determining grid division parameters based on the boundary of the laser point cloud, and dividing the laser point cloud according to the grid division parameters to construct the point cloud grid; and using the GPS trajectory of the map acquisition vehicle as seed points, combining a preset threshold range related to the driving height of the map acquisition vehicle with a point cloud density threshold to select effective grids from the point cloud grid, and taking the set of points within the effective grids as the road surface point cloud comprises: obtaining the GPS trajectory of the map acquisition vehicle and, based on the GPS trajectory, obtaining the height range of the map acquisition vehicle; using the GPS trajectory as seed points, obtaining effective grids from the point cloud grid based on the maximum height value in the height range of the map acquisition vehicle, a preset height threshold range, and a preset density threshold, wherein the point cloud density in each effective grid exceeds the preset density threshold; and taking the point set composed of the points in the effective grids as the road surface point cloud.
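The road surface separation recited in claim 2 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes points are `(x, y, z, intensity)` tuples, a square grid in the XY plane, and region growing over the 8 neighbouring cells starting from the cells that contain GPS points. The function names and the parameters `cell`, `z_offset`, and `min_density` are hypothetical, and treating the height threshold range as an offset from the maximum trajectory height is an assumption.

```python
from collections import defaultdict, deque


def bounding_box(points):
    """Return ((min_x, min_y, min_z), (max_x, max_y, max_z)) of the cloud."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    zs = [p[2] for p in points]
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def cell_key(x, y, mins, cell):
    """Map an XY position to the integer index of its grid cell."""
    return (int((x - mins[0]) // cell), int((y - mins[1]) // cell))


def road_points(points, gps, cell=1.0, z_offset=(-3.0, -1.0), min_density=3):
    """Grow the road surface outward from GPS seed cells (illustrative only).

    gps: list of (x, y, z) trajectory points.
    z_offset: assumed height window relative to the maximum GPS height.
    """
    mins, _ = bounding_box(points)
    grid = defaultdict(list)
    for p in points:
        grid[cell_key(p[0], p[1], mins, cell)].append(p)

    z_max = max(g[2] for g in gps)
    lo, hi = z_max + z_offset[0], z_max + z_offset[1]

    seeds = {cell_key(g[0], g[1], mins, cell) for g in gps}
    queue, seen, road = deque(seeds), set(seeds), []
    while queue:
        key = queue.popleft()
        # A cell is "effective" if enough of its points lie in the height window.
        pts = [p for p in grid.get(key, ()) if lo <= p[2] <= hi]
        if len(pts) < min_density:
            continue
        road.extend(pts)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nb = (key[0] + dx, key[1] + dy)
                if nb not in seen and nb in grid:
                    seen.add(nb)
                    queue.append(nb)
    return road
```

Because growth stops at any cell that fails the height or density test, cells that are disconnected from the trajectory (for example, a parallel road behind a barrier) are never reached in this sketch.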

3. The method according to claim 2, characterized in that converting the road surface point cloud into a two-dimensional intensity top-down BEV image comprises: dividing the GPS trajectory according to a preset distance to obtain multiple trajectory segments, wherein each trajectory segment includes multiple GPS points; traversing each trajectory segment to verify whether the multiple GPS points in the trajectory segment actually fall within the trajectory segment; and, for each verified trajectory segment: taking the central GPS point in the trajectory segment as the center, and using the maximum and minimum recording times among the recording times of all GPS points in the trajectory segment as a time interval; traversing the road surface point cloud to obtain a first road surface point cloud within the time interval; based on the distance from the first road surface point cloud to each GPS point in the trajectory segment, selecting a second road surface point cloud that meets a preset range, and mapping the second road surface point cloud to a preset two-dimensional pixel coordinate system to record a position index between the second road surface point cloud and the pixel coordinate system; and generating the BEV image from the second road surface point cloud based on the position index.
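The per-segment BEV generation of claim 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation. Points are assumed to be `(x, y, z, intensity, t)` tuples and GPS points `(x, y, t)` tuples. The function name `segment_bev`, the parameters `max_dist` and `res`, the image extent, and the use of the mean intensity per pixel are all hypothetical choices. The segment verification step is omitted.

```python
import math


def segment_bev(points, segment, max_dist=10.0, res=0.5):
    """Build an intensity BEV image for one trajectory segment.

    Returns (image, index), where index maps (row, col) to the list of
    indices of the points that fall into that pixel.
    """
    t_min = min(g[2] for g in segment)
    t_max = max(g[2] for g in segment)
    cx, cy, _ = segment[len(segment) // 2]  # central GPS point

    # Half extent large enough to hold every point that passes the filter.
    reach = max(math.hypot(g[0] - cx, g[1] - cy) for g in segment)
    half = reach + max_dist
    size = int(2 * half / res) + 1

    index = {}
    for i, (x, y, _z, _intensity, t) in enumerate(points):
        if not (t_min <= t <= t_max):  # "first" cloud: inside time interval
            continue
        near = min(math.hypot(x - g[0], y - g[1]) for g in segment)
        if near > max_dist:  # "second" cloud: close to the track
            continue
        col = int((x - cx + half) / res)
        row = int((cy + half - y) / res)  # image rows grow downward
        index.setdefault((row, col), []).append(i)

    image = [[0] * size for _ in range(size)]
    for (row, col), ids in index.items():
        image[row][col] = round(sum(points[i][3] for i in ids) / len(ids))
    return image, index
```

Keeping the `index` dictionary makes it possible to map a pixel of the BEV image back to the original 3D points, which is what allows a detection in the image to be written back to map coordinates.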

4. The method according to claim 3, characterized in that traversing each trajectory segment to verify whether the multiple GPS points in the trajectory segment actually fall within the trajectory segment comprises: for the k-th trajectory segment, based on a preset distance D, using a formula (not reproduced in this text) to obtain the first distance from the center GPS point in the k-th trajectory segment to the starting GPS point of the GPS trajectory; and traversing the other GPS points in the k-th trajectory segment to check whether their second distances from the center GPS point are within a range of [expression missing]/2, so as to determine whether the other GPS points fall within the k-th trajectory segment; wherein [symbol missing] denotes the repetition rate of GPS track length between adjacent trajectory segments.

5. The method according to claim 1, characterized in that obtaining text detection boxes containing road surface text content based on the pixels containing text and using a text recognition network comprises: for each pixel containing text, using a connected component search method to obtain the text pixels associated with that pixel, and clustering the pixel containing text together with its associated text pixels to obtain text string pixels; based on the pixel coordinates of the text string pixels, obtaining the corner points of the minimum bounding box of the text string pixels and, based on the corner points, obtaining the text string atomic image corresponding to the text string pixels and the black and white text image corresponding to the text string atomic image; and using a text recognition network to perform text recognition processing on the black and white text image to obtain text detection boxes containing road surface text content.
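The clustering and cropping steps of claim 5 can be illustrated with a minimal Python sketch. It assumes a binary `mask` of text pixels (from the segmentation step) and a `bev` intensity image, both given as lists of rows. It uses 8-connectivity and an axis-aligned bounding box, although the claim's minimum bounding box could equally be a rotated rectangle. The function names and the `threshold` parameter are hypothetical, and the text recognition network itself is not shown.

```python
from collections import deque


def connected_components(mask):
    """Group text pixels into 8-connected components via breadth-first search."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            comp, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            comps.append(comp)
    return comps


def crop_binary(bev, comp, threshold=128):
    """Return the box corners and a black-and-white crop for one component."""
    rows = [p[0] for p in comp]
    cols = [p[1] for p in comp]
    r0, r1, c0, c1 = min(rows), max(rows), min(cols), max(cols)
    corners = [(r0, c0), (r0, c1), (r1, c1), (r1, c0)]
    crop = [[255 if bev[r][c] >= threshold else 0
             for c in range(c0, c1 + 1)]
            for r in range(r0, r1 + 1)]
    return corners, crop
```

Each black-and-white crop would then be passed to the text recognition network, and the corner points converted back to map coordinates before the result is saved.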

6. A processing device for road surface text recognition, characterized in that it comprises: an acquisition module, used to acquire a laser point cloud of the road surface to be identified, collected by a map acquisition vehicle; and a processing module, used to determine the boundary of the laser point cloud based on the coordinate information of the laser point cloud; divide the laser point cloud according to its boundary to construct a point cloud grid; use the GPS trajectory of the map acquisition vehicle as seed points, and combine a preset threshold range related to the driving height of the map acquisition vehicle with a point cloud density threshold to select effective grids from the point cloud grid; and take the set of points within the effective grids as the road surface point cloud and convert the road surface point cloud into a two-dimensional intensity top-down BEV image; wherein the acquisition module is further configured to use an image semantic segmentation neural network to perform recognition processing on the BEV image to obtain semantic information containing road markings, and to separate pixels containing text from the semantic information containing road markings; and the processing module is further configured to obtain text detection boxes containing road surface text content based on the pixels containing text and using a text recognition network, and to save them to a map database.

7. The device according to claim 6, characterized in that the acquisition module is specifically used for: traversing the coordinate information of each point in the laser point cloud to obtain the maximum X value, the maximum Y value, the maximum Z value, the minimum X value, the minimum Y value, and the minimum Z value, and using these six values as the boundary of the laser point cloud; obtaining the GPS trajectory of the map acquisition vehicle and, based on the GPS trajectory, obtaining the height range of the map acquisition vehicle; determining grid division parameters based on the boundary of the laser point cloud and the height of the map acquisition vehicle, and dividing the laser point cloud according to the grid division parameters to construct a point cloud grid; using the GPS trajectory as seed points, obtaining effective grids from the point cloud grid based on the maximum height value in the height range of the map acquisition vehicle, a preset height threshold range, and a preset density threshold, wherein the point cloud density in each effective grid exceeds the preset density threshold; and taking the point set composed of the points in the effective grids as the road surface point cloud.

8. A processing device for road surface text recognition, characterized in that the processing device comprises: at least one processor and a cloud server; the cloud server is used to store computer-executable instructions, laser point clouds, and GPS trajectories; and the at least one processor executes the computer-executable instructions stored on the cloud server to implement the method according to any one of claims 1-5.

9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions that, when executed by a processor, implement the method according to any one of claims 1-5.

10. A computer program product comprising computer instructions, characterized in that the computer instructions, when executed by a processor, implement the method according to any one of claims 1-5.