An intelligent recognition and vectorization method for irregularly shaped building structural primitives

A multi-task convolutional neural network recognizes and vectorizes irregular building primitives, addressing the poor adaptability of existing methods to irregular primitives and their low spatial alignment accuracy. The resulting high-precision vector output is suitable for 3D reconstruction and BIM integration.

CN122244894A · Pending · Publication Date: 2026-06-19 · CHENGDU UNIV OF INFORMATION TECH

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Applications(China)
Current Assignee / Owner
CHENGDU UNIV OF INFORMATION TECH
Filing Date
2026-05-19
Publication Date
2026-06-19

AI Technical Summary

Technical Problem

Existing technologies adapt poorly to irregularly shaped architectural drawings, align elements with low spatial accuracy, and are not robust to low-quality drawings, which leads to incomplete vectorization results that do not comply with building codes.

Method used

A multi-task convolutional neural network is used to identify and vectorize irregular building structure primitives. The pipeline applies pixel-level semantic masking, line skeletonization, clustering, and vector parameter extraction, and then assigns each door and window to a wall using distance and angle as dual criteria, ensuring topological correctness and spatial alignment accuracy.

Benefits of technology

It achieves adaptive processing of irregularly shaped primitives, ensuring the integrity and high accuracy of vectorization results. It supports accurate alignment of walls and doors and windows at any angle, and the output vector format is compatible with mainstream CAD/BIM tools, making it suitable for 3D reconstruction and BIM integration.

✦ Generated by Eureka AI based on patent content.


Abstract

This invention discloses an intelligent recognition and vectorization method for irregular building structural primitives, involving the intersection of building digitization and computer vision. This method directly predicts the key parameters of irregular primitives through a multi-task convolutional neural network, achieving complete vectorization of irregular structures and solving the problems of fracture and misjudgment of tilted primitives. It innovatively designs a "distance-angle dual criterion" association model to solve the alignment problem between doors and windows and irregular walls. It supports walls tilted at any angle and irregular doors and windows, and can adapt to different types of irregular buildings without modifying the network structure. The output vector format is compatible with mainstream tools such as AutoCAD and Revit, and can be directly used for 3D reconstruction or BIM integration without format conversion. Applicable scenarios cover fields such as renovation of old residential areas and digitization of historical buildings.

Description

Technical Field

[0001] The invention relates to the intersection of building digitization and computer vision, specifically to a method for intelligent recognition and vectorization of irregular building structural primitives.

Background Technology

[0002] In recent years, architectural drawing vectorization technology has gradually evolved from traditional image processing to deep learning-driven approaches. Early research often employed mathematical morphology and traditional edge detection algorithms, such as using dilation and erosion operations for noise reduction followed by Hough transform to extract straight line primitives. However, these methods are only suitable for regular drawings with horizontal / vertical walls, and are prone to line breakage when dealing with slanted structures. Other studies have attempted to compress lines using skeletonization algorithms, but when processing slanted (irregular) primitives, the lack of geometric constraints easily leads to topological distortion, such as misalignment of wall connection points and misalignment of doors and windows with the walls.

[0003] Specifically, existing technologies share the following shortcomings: (1) Poor adaptability to irregular primitives: mainstream vectorization methods rely on the "regular primitive assumption" and presuppose geometric constraints such as horizontal / vertical walls and orthogonal doors and windows. Faced with inclined walls or components, they cannot accurately capture irregular geometric features, which causes structural breaks or topological disorder in the vectorization results, such as an inclined wall being mistakenly split into multiple straight lines.

[0004] (2) Low spatial alignment accuracy of primitives: the spatial relationship between opening elements such as doors and windows and the wall is not modeled. Existing technologies mostly determine which wall a door or window belongs to using a fixed distance threshold. With inclined walls, doors and windows are then likely to end up not parallel to the wall or offset in position, which fails to meet building code requirements for component alignment accuracy.

[0005] (3) Weak robustness to low-quality drawings: old building drawings often contain scanning noise, blurred lines, and partial defects. Existing models have no enhancement strategies designed for such data, so recognition accuracy drops significantly and noise seriously interferes with the vectorization results.

[0006] Therefore, there is an urgent need for an intelligent vectorization method that can adaptively process irregular (tilted) primitives, ensure vectorization integrity and spatial alignment accuracy, and adapt to low-quality drawings, breaking through the dependence of existing technologies on regular structures and supporting the digital upgrade of irregular (tilted) buildings.

Summary of the Invention

[0007] To address the aforementioned shortcomings in existing technologies, this invention provides an intelligent recognition and vectorization method for irregularly shaped building structural elements, which solves the problems of poor adaptability of irregularly shaped elements, low spatial alignment accuracy of elements, and weak robustness of low-quality drawings in existing technologies.

[0008] To achieve the above-mentioned objectives, the technical solution adopted by this invention is as follows: A method for intelligent recognition and vectorization of irregular building structural primitives is provided, which includes the following steps:
Obtain floor plans of irregularly shaped buildings through scanning or photography;
The receptive field of the backbone network of the multi-task convolutional neural network is expanded by the decoder, and the pixel-level semantic mask corresponding to the floor plan of the irregular building is output to distinguish different categories of walls and doors and windows, thus solving the problem of category recognition of irregular primitives; the input of the backbone network of the multi-task convolutional neural network is the floor plan of the irregular building;
The pixel-level semantic masks belonging to walls, doors, and windows are skeletonized into lines, and the lines are compressed into single-pixel-width skeletons to preserve the topological structure. Similar line segments are merged by clustering, and the deduplicated line segment set is output to solve the problem of erroneous splitting of tilted primitives and ensure the curve continuity of irregular structures;
Based on the deduplicated set of line segments, vector parameters are extracted for different types of primitives, and the extracted vector parameters are converted into standard vector format;
The wall to which doors and windows belong is determined by both distance and angle criteria, and the positions of the corresponding doors and windows are adaptively adjusted according to the type of wall to ensure compliance with building codes;
Abnormal vectors are eliminated by checking the closure of walls and verifying the intersection of doors / windows and walls to ensure the correctness of the topology.

[0009] Furthermore, the multi-task convolutional neural network is a symmetric encoder-decoder architecture based on an hourglass network. The encoder feature extraction and decoder upsampling processes achieve multi-scale feature fusion through lateral connections to preserve shallow detail features and deep semantic features of irregular primitives. The encoder uses ImageNet pre-trained ResNet-152 with the original classification layer removed and the first 8 convolutional modules retained as the backbone network, and downsampling feature extraction is achieved through 5 convolution operations with a stride of 2. The decoder achieves upsampling through 5 transposed convolutions, and simultaneously fuses the shallow detail features and deep semantic features output by the encoder through skip connections, and performs semantic segmentation and interest point localization. The semantic segmentation outputs a semantic segmentation map of the room and a semantic segmentation map of the building icon; the point of interest localization generates a heat map of points of interest, which includes wall nodes, door and window endpoints, and icon corner points.

[0010] Furthermore, the pixel-level semantic mask includes a room semantic segmentation map and a building icon semantic segmentation map.

[0011] Furthermore, the specific method for performing line skeletonization on the pixel-level semantic mask belonging to the wall, compressing the lines into a single-pixel-width skeleton to preserve the topological structure, includes: Select the pixel to be processed, $P_1$, from the pixel-level semantic mask belonging to the wall. The 8 neighboring pixels of $P_1$ are numbered sequentially in a clockwise direction as $P_2, P_3, \dots, P_9$, from which the number of foreground pixels $B(P_1) = \sum_{i=2}^{9} P_i$ and the boundary connectivity $A(P_1)$ are obtained; here $A(P_1)$ is the number of 0-to-1 transitions in the cyclic sequence $P_2, P_3, \dots, P_9, P_2$, $P_3$ is the pixel value of the top right neighbor, the foreground pixel value is 1, and the background pixel value is 0. First iteration: check whether $2 \le B(P_1) \le 6$, $A(P_1) = 1$, $P_2 \cdot P_4 \cdot P_6 = 0$, and $P_4 \cdot P_6 \cdot P_8 = 0$; if all conditions are met, mark $P_1$ as a pixel to be deleted, then proceed to the second iteration. Second iteration: check whether $2 \le B(P_1) \le 6$, $A(P_1) = 1$, $P_2 \cdot P_4 \cdot P_8 = 0$, and $P_2 \cdot P_6 \cdot P_8 = 0$; if all conditions are met, mark $P_1$ as a pixel to be deleted; otherwise, retain it. Repeat until no pixels can be deleted, and finally output a wall skeleton image S with a width of one pixel.

[0012] Furthermore, the specific method for merging similar line segments through clustering includes: The target contour in the wall skeleton image S is extracted using the Canny edge detection operator; Line segments in the target contour are extracted using the probabilistic Hough transform, which maps straight lines in the image space $(x, y)$ to points in the parameter space $(\rho, \theta)$ through $\rho = x\cos\theta + y\sin\theta$, and the initial set of line segments is output, where $\rho$ is the distance from the origin to the line and $\theta$ is the angle of the line's normal; The initial line segments are clustered based on the preset line segment similarity criteria; For the line segments in each cluster, the Euclidean distance between all endpoint pairs is calculated and the endpoint pair with the largest distance is selected to form the merged line segment, which completes the merging of similar line segments in a single cluster, and the deduplicated line segment set is output.

[0013] Furthermore, the preset line segment similarity criteria include: The directional angle difference between the two line segments is less than or equal to 5°, the spatial distance is less than or equal to 20 pixels, and the overlap is greater than or equal to 0.1.

[0014] Furthermore, specific methods for extracting vector parameters for different types of primitives and converting the extracted vector parameters into a standard vector format include: A fitting algorithm is used to determine the endpoint coordinates of the linear primitives and output the vector line segment parameters; The extracted parameters are converted into a standard vector format, including layer assignment and linetype settings.

[0015] Furthermore, the specific method for determining the wall to which a door or window belongs based on both distance and angle criteria, and adaptively adjusting the position of the corresponding door or window according to the wall type, includes: For each door / window segment, calculate the angle difference and shortest distance between it and the wall segment, filter candidate walls that meet the preset angle difference and shortest distance conditions, and select the best matching wall from the candidate walls; Calculate the coordinates of the midpoint of the door and window line segments and project them onto the optimal matching wall to obtain the projection point; Based on the projection point, the optimal matching wall orientation angle, and the length of the doors and windows, the endpoints of the corrected door and window segments are calculated to complete the adaptive adjustment of the door and window positions.

[0016] Furthermore, based on the projection point, the optimal matching wall orientation angle, and the door / window length, the endpoints of the corrected door / window line segment are calculated as follows:

$$P_1' = \left(x_p - \tfrac{l}{2}\cos\theta_w,\; y_p - \tfrac{l}{2}\sin\theta_w\right), \qquad P_2' = \left(x_p + \tfrac{l}{2}\cos\theta_w,\; y_p + \tfrac{l}{2}\sin\theta_w\right)$$

[0017] where $P_1'$ and $P_2'$ are the coordinates of the two endpoints of the corrected line segment; $P$ is the projection point; $\theta_w$ is the orientation angle of the wall; $l$ is the length of the door or window; and $x_p$ and $y_p$ are the horizontal and vertical coordinates of the projection point $P$.

[0018] Furthermore, the loss function of the multi-task convolutional neural network during training is:

$$L_{total} = L_{seg} + L_{heat}$$

[0019] where $L_{total}$ is the total loss value, $L_{seg}$ is the semantic segmentation loss, and $L_{heat}$ is the interest point regression loss.

[0020] The beneficial effects of this invention are as follows: 1. Adaptive processing capability for irregular primitives: Breaking through the dependence of traditional methods on regular primitives, it directly predicts the key parameters of primitives through a multi-task convolutional neural network, realizes the complete vectorization of inclined structures, and solves the problems of fracture and misjudgment of inclined walls.

[0021] 2. Fully automated vectorization process: For the first time, an end-to-end process from image input to DXF vector output is realized without manual intervention: pixel-level segmentation results are converted into editable vector parameters, and the output DXF file can be directly imported into CAD / BIM tools.

[0022] 3. High-precision spatial alignment of elements: The innovative design of the "distance-angle dual criterion" association model solves the alignment problem between doors and windows and irregular walls. Through position correction, the parallelism deviation between doors and windows and walls is ≤5° and the position deviation is ≤2 pixels, which meets the requirements of the "Architectural Drawing Standard" (GB / T 50103-2010) for component alignment accuracy.

[0023] 4. Excellent versatility and scalability: Supports walls, doors, and windows tilted at any angle, and can adapt to different types of irregular buildings without modifying the network structure; the output vector format is compatible with mainstream tools such as AutoCAD and Revit, can be directly used for 3D reconstruction or BIM integration without format conversion, and is applicable to scenarios such as renovation of old communities and digitization of historical buildings.

Attached Figure Description

[0024] Figure 1 is a flowchart of the method;
Figure 2 is a schematic diagram of the structure of the multi-task convolutional neural network;
Figure 3 is the room semantic segmentation map in the embodiment;
Figure 4 is the icon semantic segmentation map in the embodiment;
Figure 5 is a flowchart of the Zhang-Suen thinning algorithm in the embodiment;
Figure 6 is the input image used in the embodiment that delivers the complete vector data in dual formats;
Figures 7 and 8 are the semantic segmentation maps in that embodiment;
Figure 9 is the final vectorized output of that embodiment.

Detailed Implementation

[0025] The specific embodiments of the present invention are described below to enable those skilled in the art to understand the present invention. However, it should be understood that the present invention is not limited to the scope of the specific embodiments. For those skilled in the art, various changes are obvious as long as they are within the spirit and scope of the present invention as defined and determined by the appended claims. All inventions utilizing the concept of the present invention are protected.

[0026] As shown in Figure 1, the intelligent recognition and vectorization method for irregular building structure primitives includes the following steps:
S1. Obtain floor plans of irregularly shaped buildings through scanning or photography;
S2. The receptive field of the backbone network of the multi-task convolutional neural network is expanded by the decoder to output the pixel-level semantic mask corresponding to the floor plan of the irregular building, so as to distinguish different types of walls and doors and windows and solve the problem of category recognition of irregular primitives; the input of the backbone network of the multi-task convolutional neural network is the floor plan of the irregular building.
S3. Perform line skeletonization on the pixel-level semantic mask belonging to walls, doors and windows, compress the lines into a single-pixel-width skeleton to preserve the topological structure, and merge similar line segments through clustering to output a set of deduplicated line segments, which solves the problem of erroneous splitting of tilted primitives and ensures the curve continuity of irregular structures.
S4. Based on the deduplicated set of line segments, extract vector parameters for different types of primitives and convert the extracted vector parameters into a standard vector format.
S5. Based on the standard vector parameters, determine the wall to which each door and window belongs using both distance and angle criteria, and adaptively adjust the positions of the corresponding doors and windows according to the type of wall to which they belong, so as to ensure compliance with building codes.
S6. Based on the vector parameters in the standard vector format, eliminate abnormal vectors through wall closure detection and door / window-wall intersection verification to ensure topology correctness.

[0027] In this embodiment, as shown in Figures 2, 3 and 4, the multi-task convolutional neural network is a symmetric encoder-decoder architecture based on a single-stage hourglass network. The encoder uses a ResNet-152 network pre-trained on ImageNet as the backbone, removing the original classification layer and retaining the first 8 convolutional modules. Downsampling is achieved through 5 convolutional operations with a stride of 2, outputting a deep semantic feature map of 16×16×2048. The encoder feature extraction and decoder upsampling processes achieve multi-scale feature fusion through lateral paths. This architecture effectively preserves the shallow details and deep semantic features of irregular primitives (tilted walls, tilted doors, tilted windows), laying a feature foundation for their accurate recognition. The network finally outputs a 44-channel dense feature map, which serves as the unified feature input for semantic segmentation and interest point detection.

[0028] The decoder achieves upsampling through five transposed convolutions, and simultaneously fuses shallow detail features such as wall edges and door / window corners with deep semantic features such as component categories from the encoder output via skip connections, ultimately outputting two types of results: (1) Semantic segmentation branch: After decoupling the 44-channel feature map, the semantic segmentation maps of 12 types of rooms and 11 types of building icons are output first to complete the pixel-level classification of room functional areas and irregular opening components such as doors, windows, and stairs, and output a 512×512×6 feature map. The pixel-level semantic probability map is obtained with the softmax activation function:

$$p_c(x, y) = \frac{\exp\left(z_c(x, y)\right)}{\sum_{k} \exp\left(z_k(x, y)\right)}$$

[0029] where $z_c(x, y)$ is the c-th semantic feature value output by the branch at pixel $(x, y)$ and $p_c(x, y)$ is the probability that pixel $(x, y)$ belongs to class c. The class with the highest probability is taken as the final label of the pixel, which generates the semantic segmentation map.
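The softmax-and-argmax labelling described above can be illustrated with a minimal sketch. It assumes the branch output is available as a nested list of per-pixel class scores; the function names `pixel_softmax` and `label_map` are hypothetical and not part of the patent.

```python
import math


def pixel_softmax(scores):
    """Turn one pixel's per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(z - peak) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_map(score_map):
    """Assign every pixel the index of its most probable class.

    score_map is a list of rows; each row is a list of per-class score lists.
    """
    labels = []
    for row in score_map:
        label_row = []
        for scores in row:
            probs = pixel_softmax(scores)
            label_row.append(probs.index(max(probs)))
        labels.append(label_row)
    return labels
```

In practice the same operation would run as a tensor operation inside the network framework; the loop form is only meant to make the per-pixel rule explicit.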

[0030] (2) Interest point heat map branch: outputs a 512×512×21 feature map (21 channels corresponding to wall nodes, door and window endpoints, icon corners, etc.) and locates the interest points using a Gaussian heat map regression strategy. The heat map $H$ is calculated as:

$$H(x, y) = \exp\left(-\frac{(x - x_0)^2 + (y - y_0)^2}{2\sigma^2}\right)$$

[0031] where $(x_0, y_0)$ is the annotated position of the interest point and $\sigma$ controls the diffusion range of the heat map. By minimizing the MSE loss between the predicted heat map and the labeled heat map, the interest points are accurately located and the interest point heat map is generated.
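As an illustration of Gaussian heat map regression targets, the following sketch renders a labeled heat map for a list of annotated points and computes the MSE against a prediction. The default `sigma=2.0`, the choice to combine several points by taking the per-pixel maximum, and the function names are assumptions made for this example; the text does not specify them.

```python
import math


def gaussian_heatmap(width, height, points, sigma=2.0):
    """Render a target heat map with one Gaussian peak per annotated point.

    sigma controls how far each peak spreads; overlapping peaks are combined
    with a per-pixel maximum (an assumption of this sketch).
    """
    heat = [[0.0] * width for _ in range(height)]
    for px, py in points:
        for y in range(height):
            for x in range(width):
                dist_sq = (x - px) ** 2 + (y - py) ** 2
                value = math.exp(-dist_sq / (2.0 * sigma ** 2))
                if value > heat[y][x]:
                    heat[y][x] = value
    return heat


def mse(predicted, target):
    """Mean squared error between two heat maps of the same size."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(predicted, target):
        for p, t in zip(pred_row, target_row):
            total += (p - t) ** 2
            count += 1
    return total / count
```

A smaller `sigma` gives sharper peaks and tighter localization targets, at the cost of sparser gradients far from the annotated point.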

[0032] The three types of output (room categories, icon categories, and interest point heat maps) jointly realize primitive recognition. The room map defines the structural space range, the icon map accurately locates irregular opening primitives, and the heat map captures the key geometric points of primitives such as walls, doors and windows. The three complement each other to ensure the completeness and accuracy of irregular primitive recognition.

[0033] The total loss $L_{total}$ during training of the multi-task convolutional neural network is composed of the semantic segmentation loss $L_{seg}$ and the interest point regression loss $L_{heat}$, and the weight of each task is learned automatically.

[0034] The semantic segmentation loss $L_{seg}$ is computed against the one-hot encoding of the semantic annotation. It includes a learnable uncertainty parameter that dynamically adjusts the loss weight for difficult-to-classify items such as tilted walls and tilted windows.

[0035] The interest point regression loss $L_{heat}$ is computed between the predicted heat map and the annotated heat map.

[0036] It likewise includes an uncertainty parameter, together with a regularization term that is used to avoid overfitting.

[0037] In this embodiment, for wall regions with a width of multiple pixels in the semantic segmentation map, the foreground pixel value is 1 and the background pixel value is 0. As shown in Figure 5, the Zhang-Suen thinning algorithm is used to compress lines into single-pixel skeletons. While preserving the wall topology, such as intersections and turning points, it eliminates redundant width information, providing accurate single-pixel line input for subsequent line segment detection. The algorithm achieves thinning through multiple rounds of iterative erosion. Each round contains two sub-steps, eroding only boundary pixels that satisfy specific neighborhood conditions. The specific logic is as follows: Let the currently processed pixel be $P_1$. Its 8 neighboring pixels are denoted $P_2, P_3, \dots, P_9$ in clockwise order ($P_2$ is the pixel above and $P_3$ is the upper right corner). Define the number of foreground pixels $B(P_1) = \sum_{i=2}^{9} P_i$ and the boundary connectivity $A(P_1)$, the number of 0-to-1 transitions in the cyclic sequence $P_2, P_3, \dots, P_9, P_2$, where $A(P_1)$ is used to determine whether a pixel is a boundary point.

[0038] In the first iteration, $P_1$ is marked as a pixel to be deleted only when it satisfies $2 \le B(P_1) \le 6$, $A(P_1) = 1$, $P_2 \cdot P_4 \cdot P_6 = 0$, and $P_4 \cdot P_6 \cdot P_8 = 0$. The second iteration uses a symmetric condition: $2 \le B(P_1) \le 6$, $A(P_1) = 1$, $P_2 \cdot P_4 \cdot P_8 = 0$, and $P_2 \cdot P_6 \cdot P_8 = 0$; pixels that satisfy it are likewise marked for deletion. After the two iterations, delete all marked pixels, and repeat the above process until there are no pixels to delete. Finally, output a wall skeleton image S with a width of one pixel, where skeleton pixels are 1 and non-skeleton pixels are 0. This process ensures the continuity of curves in irregular structures such as sloping walls, avoids line breaks, and provides clear single-pixel line input for the subsequent Hough transform.
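The thinning procedure can be sketched in plain Python as follows. This is a minimal illustration of the standard Zhang-Suen formulation, in which marked pixels are removed at the end of each sub-iteration; the text above describes removing them after both sub-iterations, so this ordering is an assumption of the sketch. The function name `zhang_suen_thin` and the list-of-rows mask representation are hypothetical.

```python
def zhang_suen_thin(mask):
    """Thin a binary mask (list of rows of 0/1) to a one-pixel-wide skeleton."""
    img = [row[:] for row in mask]  # work on a copy; the input is not modified
    rows, cols = len(img), len(img[0])
    # Offsets of P2..P9: clockwise, starting with the pixel directly above.
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

    def neighbours(r, c):
        values = []
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            inside = 0 <= rr < rows and 0 <= cc < cols
            values.append(img[rr][cc] if inside else 0)
        return values

    def transitions(values):
        # Count 0 -> 1 steps in the cyclic sequence P2, P3, ..., P9, P2.
        shifted = values[1:] + values[:1]
        return sum(1 for a, b in zip(values, shifted) if a == 0 and b == 1)

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            marked = []
            for r in range(rows):
                for c in range(cols):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(r, c)
                    p2, _, p4, _, p6, _, p8, _ = n
                    if not (2 <= sum(n) <= 6 and transitions(n) == 1):
                        continue
                    if step == 0:
                        erodible = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        erodible = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if erodible:
                        marked.append((r, c))
            for r, c in marked:
                img[r][c] = 0
            if marked:
                changed = True
    return img
```

Applied to a wall mask, the function returns an image of the same size in which each wall band is reduced to a centerline. Because the loop stops only when a full round deletes nothing, running it again on its own output leaves the skeleton unchanged.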

[0039] For the input binarized image (i.e., pixel-level semantic mask), after color extraction and thinning, the wall and door / window areas are identified as foreground pixels. First, the target contour is extracted using the Canny edge detection operator. This operator uses a dual-threshold method (high threshold H=150, low threshold L=50) to suppress noise and preserve continuous edges. With $G(x, y)$ denoting the gradient magnitude, the calculation formula is as follows:

$$E(x, y) = \begin{cases} 1, & G(x, y) \ge H \\ 1, & L \le G(x, y) < H \text{ and } (x, y) \text{ is connected to a pixel with } G \ge H \\ 0, & \text{otherwise} \end{cases}$$

[0040] where E(x,y) is the edge pixel marker (1 indicates an edge). The edge detection output is used as the input to the Hough transform, and line segments are extracted using the probabilistic Hough transform (HoughLinesP), which maps straight lines in the image space $(x, y)$ to points in the parameter space $(\rho, \theta)$ through $\rho = x\cos\theta + y\sin\theta$, where $\rho$ is the distance from the origin to the line and $\theta$ is the angle of the line's normal, to achieve robust detection of line segments. The algorithm parameters are: distance resolution $\Delta\rho$ in pixels, angular resolution $\Delta\theta$ in radians, detection threshold = 10 (at least 10 edge points are required to support a line segment), minimum line segment length = 1 pixel, and maximum line segment gap = 50 pixels (line segments with a gap ≤ 50 pixels are allowed to be merged). The final output is the initial set of line segments, each given by the coordinates of its two endpoints.
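To make the $(\rho, \theta)$ parameterization concrete, the sketch below converts a segment given by two endpoints into the normal form $\rho = x\cos\theta + y\sin\theta$. The convention used here, $\theta$ in $[0, \pi)$ with a signed $\rho$, is an assumption of the example, and the function names are hypothetical.

```python
import math


def line_normal_form(p, q):
    """Return (rho, theta) of the infinite line through points p and q.

    theta is the angle of the line's normal, reduced to [0, pi);
    rho is the signed distance from the origin along that normal.
    """
    (x1, y1), (x2, y2) = p, q
    direction = math.atan2(y2 - y1, x2 - x1)
    theta = (direction + math.pi / 2.0) % math.pi
    rho = x1 * math.cos(theta) + y1 * math.sin(theta)
    return rho, theta


def residual(point, rho, theta):
    """How far a point is from satisfying rho = x*cos(theta) + y*sin(theta)."""
    x, y = point
    return abs(x * math.cos(theta) + y * math.sin(theta) - rho)
```

Every point on the same straight line maps to the same $(\rho, \theta)$ pair, which is why collinear edge pixels accumulate votes in a single cell of the Hough parameter space.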

[0041] The line segments extracted by the Hough transform contain redundancy, such as overlapping segments and approximately parallel repeating segments, which requires optimization through clustering and deduplication. This embodiment first defines the line segment similarity criterion: For any two line segments $l_i$ and $l_j$, calculate the angle difference $\Delta\theta = |\theta_i - \theta_j|$ (where $\theta$ is the direction angle of a line segment, computed from its endpoint coordinates), the spatial distance $d$ (the shortest distance between the two line segments), and the overlap degree $o$ (the percentage of overlapping pixels). If $\Delta\theta \le 5°$, $d \le 20$ pixels, and $o \ge 0.1$, the two are similar line segments and are classified into the same cluster.
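A minimal sketch of the three-part similarity test follows, using the thresholds stated in [0013] (5°, 20 pixels, 0.1). The text does not spell out how the overlap degree is measured, so this example assumes it is the shared length of the two segments projected onto the first segment's direction, divided by the shorter segment's length. The distance helper is exact only for segments that do not cross. All function names are hypothetical.

```python
import math


def direction_angle(seg):
    """Undirected direction of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def angle_difference(a, b):
    diff = abs(direction_angle(a) - direction_angle(b))
    return min(diff, 180.0 - diff)  # 179 degrees vs 1 degree differ by 2, not 178


def point_segment_distance(p, seg):
    (x1, y1), (x2, y2) = seg
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - x1, p[1] - y1)
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp the foot of the perpendicular to the segment
    return math.hypot(p[0] - (x1 + t * dx), p[1] - (y1 + t * dy))


def segment_distance(a, b):
    """Shortest endpoint-to-segment distance between two segments."""
    return min(
        point_segment_distance(a[0], b), point_segment_distance(a[1], b),
        point_segment_distance(b[0], a), point_segment_distance(b[1], a),
    )


def overlap_ratio(a, b):
    """Assumed overlap measure: shared projected length / shorter segment length."""
    (x1, y1), (x2, y2) = a
    len_a = math.hypot(x2 - x1, y2 - y1)
    len_b = math.hypot(b[1][0] - b[0][0], b[1][1] - b[0][1])
    if len_a == 0 or len_b == 0:
        return 0.0
    ux, uy = (x2 - x1) / len_a, (y2 - y1) / len_a
    lo, hi = sorted((px - x1) * ux + (py - y1) * uy for px, py in b)
    shared = min(len_a, hi) - max(0.0, lo)
    return max(0.0, shared) / min(len_a, len_b)


def is_similar(a, b, max_angle=5.0, max_dist=20.0, min_overlap=0.1):
    return (angle_difference(a, b) <= max_angle
            and segment_distance(a, b) <= max_dist
            and overlap_ratio(a, b) >= min_overlap)
```

Requiring all three conditions at once keeps parallel walls on opposite sides of a corridor apart (distance test) and keeps collinear but disjoint wall runs apart (overlap test), while still merging the near-duplicate detections that the Hough transform tends to produce along one wall.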

[0042] For each cluster of line segments, the longest valid line segment is retained using an endpoint fusion algorithm: Let the set of endpoints of the line segments within a cluster be $E = \{e_1, e_2, \dots, e_n\}$. Calculate the Euclidean distance $d(e_i, e_j) = \lVert e_i - e_j \rVert_2$ between all endpoint pairs and select the pair with the largest distance, $(e_a, e_b) = \arg\max_{i,j} d(e_i, e_j)$. The merged line segment is the segment $\overline{e_a e_b}$ joining this pair. The final output is the set of deduplicated line segments, which serves as input for spatial correction.
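The clustering and endpoint fusion steps might be sketched as below. Grouping segments into connected components of the pairwise similarity relation (via union-find) is an assumption of this example, since the text does not say whether similarity is applied transitively; the similarity predicate is passed in as a parameter, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cluster_segments(segments, similar):
    """Group segments into connected components of the `similar` relation."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(segments)), 2):
        if similar(segments[i], segments[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    return list(groups.values())


def merge_cluster(cluster):
    """Replace a cluster by the segment joining its two farthest endpoints."""
    endpoints = [point for seg in cluster for point in seg]
    return max(combinations(endpoints, 2),
               key=lambda pair: math.dist(pair[0], pair[1]))


def deduplicate(segments, similar):
    return [merge_cluster(cluster) for cluster in cluster_segments(segments, similar)]
```

Choosing the farthest endpoint pair means the merged segment spans the full extent of the cluster, so a wall detected as several overlapping fragments is restored to its full length.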

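[0043] Door and window lines must maintain spatial consistency with the wall they belong to, with a parallelism deviation of ≤15° and a centered position. The correction process is as follows: For each door and window line $l_d$, traverse the set of wall segments, calculate the angle difference $\Delta\theta$ and the shortest distance $d$ between $l_d$ and each wall segment $l_w$, and filter the candidate walls that meet the preset angle difference and shortest distance conditions. Find the optimal matching wall $l_w^*$. Calculate the midpoint $M$ of the door and window line segment and project $M$ onto $l_w^*$ to obtain the projection point $P = (x_p, y_p)$. With $A$ an endpoint of $l_w^*$ and $u = (\cos\theta_w, \sin\theta_w)$ its unit direction vector, the projection is $P = A + \left((M - A) \cdot u\right) u$. Based on the wall orientation angle $\theta_w$ and the door and window length $l$, recalculate the corrected endpoints:

$$P_1' = \left(x_p - \tfrac{l}{2}\cos\theta_w,\; y_p - \tfrac{l}{2}\sin\theta_w\right), \qquad P_2' = \left(x_p + \tfrac{l}{2}\cos\theta_w,\; y_p + \tfrac{l}{2}\sin\theta_w\right)$$

[0044] This ensures that the door and window lines are parallel to and centered on the wall, and the corrected set of vector lines is output. Here $P_1'$ and $P_2'$ are the coordinates of the two endpoints of the corrected line segment, and $x_p$ and $y_p$ are the horizontal and vertical coordinates of the projection point $P$.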
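The position correction can be illustrated with a short sketch: project the midpoint of the door or window segment onto the matched wall, then lay the segment out along the wall direction, centered on the projection point and keeping its original length. Candidate filtering is left out because the threshold values are not given in the text; the function name `snap_opening_to_wall` is hypothetical.

```python
import math


def snap_opening_to_wall(opening, wall):
    """Return the opening segment re-drawn parallel to and centered on the wall.

    opening and wall are ((x1, y1), (x2, y2)) tuples; wall is assumed to be the
    already selected best-matching wall segment.
    """
    (ox1, oy1), (ox2, oy2) = opening
    (wx1, wy1), (wx2, wy2) = wall

    theta = math.atan2(wy2 - wy1, wx2 - wx1)  # wall orientation angle
    ux, uy = math.cos(theta), math.sin(theta)  # unit direction of the wall

    # Midpoint of the opening and its orthogonal projection onto the wall line.
    mx, my = (ox1 + ox2) / 2.0, (oy1 + oy2) / 2.0
    t = (mx - wx1) * ux + (my - wy1) * uy
    px, py = wx1 + t * ux, wy1 + t * uy

    # Half the original opening length, applied on both sides of the projection.
    half = math.hypot(ox2 - ox1, oy2 - oy1) / 2.0
    return ((px - half * ux, py - half * uy), (px + half * ux, py + half * uy))
```

Because only the wall angle, the projection point, and the opening length enter the result, the same code handles horizontal, vertical, and tilted walls without special cases.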

[0045] In some embodiments, such as Figure 6 , Figure 7 , Figure 8 and Figure 9 As shown, this method achieves complete delivery of vector data through a dual-format approach, as detailed below: ① Visualized images, with a white background, use differentiated color coding to draw vector line segments of various graphic elements, intuitively showing the spatial distribution of vector line segments, which facilitates manual verification of the accuracy of vectorization results; ② Structured text output uses CSV format to store vector data. The first line of the file indicates the image size (width × height). Starting from the second line, the data is stored in the format of "type endpoint 1 coordinate endpoint 2 coordinate". The text file is encoded in UTF-8 and can be directly imported into engineering software such as AutoCAD and Revit. The built-in script parses the coordinates and parameters and automatically generates editable vector objects, eliminating the need for manual secondary drawing and greatly improving the efficiency of subsequent modeling.
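The structured text output could look like the sketch below. The text specifies only that the first line holds the image size and that each following line holds the type and the two endpoint coordinates; the comma delimiter, the column order `type,x1,y1,x2,y2`, and the function names are assumptions of this example.

```python
import csv
import io


def write_vectors(stream, width, height, segments):
    """Write the image size, then one row per vector line segment.

    segments is an iterable of (kind, (x1, y1), (x2, y2)) tuples.
    """
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow([width, height])
    for kind, (x1, y1), (x2, y2) in segments:
        writer.writerow([kind, x1, y1, x2, y2])


def read_vectors(stream):
    """Parse the format produced by write_vectors."""
    rows = [row for row in csv.reader(stream) if row]
    width, height = (int(value) for value in rows[0])
    segments = [
        (row[0], (float(row[1]), float(row[2])), (float(row[3]), float(row[4])))
        for row in rows[1:]
    ]
    return width, height, segments


if __name__ == "__main__":
    buffer = io.StringIO()
    write_vectors(buffer, 512, 512, [("wall", (10, 20), (200, 20))])
    print(buffer.getvalue())
```

When writing to disk, opening the file with `encoding="utf-8"` and `newline=""` matches the UTF-8 encoding named in the text and avoids blank lines on platforms that translate line endings.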

[0046] In some embodiments, this method can be directly applied to the digital surveying stage of old residential community renovation. Addressing issues such as scanning noise, blurred lines, and partial defects in existing irregularly shaped building drawings, it accurately extracts vector information of irregular structures such as tilted walls through multi-dimensional data enhancement preprocessing and improved multi-task CNN recognition. After acquiring paper or scanned building floor plans of old residential communities, the system automatically completes semantic segmentation, line skeletonization, line segment clustering, and door and window correction. The output DXF format vector data can be directly imported into tools such as AutoCAD and Revit. Renovation designers can quickly carry out wall reinforcement planning, door and window replacement selection, and spatial layout optimization based on vector data without manual secondary drawing. Simultaneously, the element spatial alignment optimization function ensures that the parallelism deviation between doors / windows and walls is ≤5°, meeting building code requirements, significantly reducing on-site measurement costs and design cycles, and improving the accuracy and efficiency of renovation projects.

[0047] In some embodiments, this method can address the limitations of traditional vectorization techniques in handling irregular structures, enabling high-precision digital archiving of historical building drawings. After acquiring the original drawings of a historical building through scanning, this method extracts key parameters using a fitting algorithm, combined with line refinement and topology optimization techniques, to fully preserve the structural features and spatial relationships of the historical building, avoiding topological distortion during the vectorization process. The generated vector data can be used to construct a digital twin model of the historical building, providing precise data support for conservation and restoration work—restoration personnel can analyze the structural stress characteristics of the building, reconstruct the original design intent, and formulate targeted restoration plans based on the vector model; simultaneously, the structured vector archives facilitate long-term storage and retrieval, enabling the digital inheritance and sharing of historical building cultural heritage, and providing continuous data support for subsequent research and restoration work.

Claims

1. A method for intelligent recognition and vectorization of irregularly shaped building structural primitives, characterized in that it includes the following steps:
Obtain floor plans of irregularly shaped buildings through scanning or photography;
The receptive field of the backbone network of the multi-task convolutional neural network is expanded by the decoder, and the pixel-level semantic mask corresponding to the floor plan of the irregular building is output to distinguish different categories of walls and doors and windows, thus solving the problem of category recognition of irregular primitives; the input of the backbone network of the multi-task convolutional neural network is the floor plan of the irregular building;
The pixel-level semantic masks belonging to walls, doors, and windows are skeletonized into lines, and the lines are compressed into single-pixel-width skeletons to preserve the topological structure. Similar line segments are merged by clustering, and the deduplicated line segment set is output to solve the problem of erroneous splitting of tilted primitives and ensure the curve continuity of irregular structures;
Based on the deduplicated set of line segments, vector parameters are extracted for different types of primitives, and the extracted vector parameters are converted into standard vector format;
The wall to which doors and windows belong is determined by both distance and angle criteria, and the positions of the corresponding doors and windows are adaptively adjusted according to the type of wall to ensure compliance with building codes;
Abnormal vectors are eliminated by checking the closure of walls and verifying the intersection of doors / windows and walls to ensure the correctness of the topology.

2. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 1, characterized in that the multi-task convolutional neural network is a symmetric encoder-decoder architecture based on an hourglass network, in which the encoder feature extraction and decoder upsampling processes achieve multi-scale feature fusion through lateral connections so as to preserve the shallow detail features and deep semantic features of irregular primitives; the encoder uses an ImageNet-pretrained ResNet-152, with the original classification layer removed and the first 8 convolutional modules retained, as the backbone network, and achieves downsampling feature extraction through 5 convolution operations with a stride of 2; the decoder achieves upsampling through 5 transposed convolutions, fuses the shallow detail features and deep semantic features output by the encoder through skip connections, and performs semantic segmentation and point-of-interest localization; the semantic segmentation outputs a room semantic segmentation map and a building icon semantic segmentation map; and the point-of-interest localization generates a heat map of points of interest, which include wall nodes, door and window endpoints, and icon corner points.
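The symmetry between the 5 stride-2 convolutions and the 5 transposed convolutions can be checked with simple arithmetic. The sketch below assumes that each stride-2 stage halves the spatial size (rounding up) and that each transposed convolution doubles it. The claim does not state kernel sizes or padding, so this is an idealized model, and the 256-pixel input used in the usage note is a hypothetical value.

```python
import math

def encoder_decoder_sizes(size, stages=5):
    """Spatial size after each stride-2 stage down, then each 2x stage up."""
    down = [size]
    for _ in range(stages):
        # Idealized stride-2 convolution: halves the resolution, rounding up.
        down.append(math.ceil(down[-1] / 2))
    up = [down[-1]]
    for _ in range(stages):
        # Idealized transposed convolution: doubles the resolution.
        up.append(up[-1] * 2)
    return down, up
```

For a 256-pixel input this gives encoder sizes 256, 128, 64, 32, 16, 8 and the mirrored decoder sizes, so each skip connection can join an encoder feature map to a decoder feature map of the same resolution. Input sizes that are not multiples of 32 would need padding or cropping for the two paths to line up.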

3. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 2, characterized in that the pixel-level semantic mask includes the room semantic segmentation map and the building icon semantic segmentation map.

4. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 1, characterized in that the specific method for performing line skeletonization on the pixel-level semantic mask belonging to walls and compressing the lines into single-pixel-width skeletons to preserve the topological structure includes:
selecting a pixel to be processed from the pixel-level semantic mask belonging to the wall, and numbering the 8 neighboring pixels of the pixel to be processed sequentially in a clockwise direction [symbols omitted in source], thereby obtaining the number of foreground pixels and the boundary connectivity of the pixel to be processed, wherein one of the numbered values represents the pixel value of the top-right neighbor, the foreground pixel value is 1, and the background pixel value is 0;
first iteration: checking the deletion conditions [formulas omitted in source]; if all conditions are met, marking the pixel to be processed as a pixel to be deleted, and then proceeding to the second iteration;
second iteration: checking the deletion conditions [formulas omitted in source]; if all conditions are met, marking the pixel to be processed as a pixel to be deleted; otherwise, retaining the pixel to be processed;
repeating the iterations until no pixel can be deleted, and finally outputting a wall skeleton image S with a width of one pixel.
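The two-pass deletion test described in this claim (eight clockwise-numbered neighbors, a foreground-neighbor count, a connectivity measure, and two alternating sub-iterations) resembles the classic Zhang-Suen thinning algorithm. The sketch below implements Zhang-Suen under that assumption. Because the claim's own conditions are not reproduced in this text, the neighbor numbering (P2 directly above, continuing clockwise) and the four conditions per pass are taken from the standard algorithm, not from the patent.

```python
def zhang_suen_thin(mask):
    """Thin a binary mask (list of lists of 0/1) to a one-pixel-wide skeleton."""
    img = [row[:] for row in mask]
    h, w = len(img), len(img[0])
    # P2..P9 clockwise, starting from the pixel directly above (standard order).
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

    def neighbours(r, c):
        # Pixels outside the image are treated as background.
        return [img[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
                for dr, dc in offsets]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            for r in range(h):
                for c in range(w):
                    if img[r][c] != 1:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of foreground neighbours
                    # number of 0 -> 1 transitions around the ring P2, ..., P9, P2
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_delete.append((r, c))
            # Deletions are applied only after the whole pass has been evaluated.
            for r, c in to_delete:
                img[r][c] = 0
            changed = changed or bool(to_delete)
    return img
```

Marking pixels first and deleting them only at the end of each pass keeps the result independent of scan order, which is what allows the skeleton to stay connected.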

5. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 4, characterized in that the specific method for merging similar line segments through clustering includes:
extracting the target contour in the wall skeleton image S using the Canny edge detection operator;
extracting line segments in the target contour using the probabilistic Hough transform, which maps straight lines in the image space $(x, y)$ to the parameter space $(\rho, \theta)$, and outputting the initial set of line segments, where $\rho$ is the distance from the origin to the line and $\theta$ is the angle of the line's normal;
clustering the initial line segments based on a preset line segment similarity criterion;
for the line segments in each cluster, calculating the Euclidean distance between all endpoint pairs, selecting the endpoint pair with the largest distance to form the merged line segment, thereby completing the merging of similar line segments within the cluster, and outputting the deduplicated line segment set.
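The merging rule in the last step (keep the two endpoints that are farthest apart within a cluster) can be sketched as follows. The segment representation as endpoint pairs and the function name are illustrative assumptions; the clustering itself is taken as already done.

```python
import math
from itertools import combinations

def merge_cluster(segments):
    """Merge one cluster of similar segments into a single segment.

    Each segment is ((x1, y1), (x2, y2)). The merged segment joins the two
    endpoints that are farthest apart among all endpoints in the cluster.
    """
    points = [p for seg in segments for p in seg]
    return max(combinations(points, 2), key=lambda pair: math.dist(*pair))
```

For example, three overlapping fragments of one wall, `((0, 0), (5, 0))`, `((4, 0), (10, 0))`, and `((2, 1), (7, 1))`, collapse to the single segment from `(0, 0)` to `(10, 0)`. The pairwise search is quadratic in the number of endpoints, which is acceptable because a cluster holds only a handful of segments.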

6. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 5, characterized in that the preset line segment similarity criterion includes: the direction angle difference between the two line segments is less than or equal to 5°, the spatial distance is less than or equal to 20 pixels, and the overlap is greater than or equal to 0.1.
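The similarity criterion can be sketched as a predicate over two segments. The thresholds come from the claim, but the claim does not define how spatial distance or overlap is measured. This sketch assumes the distance is the perpendicular distance from the midpoint of one segment to the line through the other, and the overlap is the shared extent of the two segments along the first segment's direction divided by the shorter segment's length. Both definitions and all function names are illustrative, and segments are assumed to have nonzero length.

```python
import math

def direction_deg(seg):
    """Undirected orientation of a segment in degrees, in [0, 180)."""
    (x1, y1), (x2, y2) = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def angle_diff_deg(a, b):
    d = abs(direction_deg(a) - direction_deg(b))
    return min(d, 180.0 - d)

def perpendicular_distance(a, b):
    """Distance from the midpoint of b to the infinite line through a (assumed metric)."""
    (x1, y1), (x2, y2) = a
    mx, my = (b[0][0] + b[1][0]) / 2, (b[0][1] + b[1][1]) / 2
    return abs((x2 - x1) * (my - y1) - (y2 - y1) * (mx - x1)) / math.dist(*a)

def overlap_ratio(a, b):
    """Shared extent along a's direction over the shorter segment length (assumed metric)."""
    (x1, y1), (x2, y2) = a
    la = math.dist(*a)
    ux, uy = (x2 - x1) / la, (y2 - y1) / la
    lo, hi = sorted((px - x1) * ux + (py - y1) * uy for px, py in b)
    shared = min(la, hi) - max(0.0, lo)
    return max(0.0, shared) / min(la, math.dist(*b))

def similar(a, b, max_angle=5.0, max_dist=20.0, min_overlap=0.1):
    """Thresholds as stated in the claim: 5 degrees, 20 pixels, overlap 0.1."""
    return (angle_diff_deg(a, b) <= max_angle
            and perpendicular_distance(a, b) <= max_dist
            and overlap_ratio(a, b) >= min_overlap)
```

Taking orientations modulo 180° makes the test independent of which endpoint a detector reports first.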

7. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 1, characterized in that the specific method for extracting vector parameters for different types of primitives and converting the extracted vector parameters into a standard vector format includes: using a fitting algorithm to determine the endpoint coordinates of the linear primitives and outputting the vector line segment parameters; and converting the extracted parameters into a standard vector format, including layer assignment and linetype settings.

8. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 1, characterized in that the specific method for determining the wall to which doors and windows belong using both distance and angle criteria, and adaptively adjusting the positions of the corresponding doors and windows based on the wall type, includes:
for each door/window line segment, calculating the angle difference and the shortest distance between it and each wall line segment, filtering the candidate walls that meet the preset angle difference and shortest distance conditions, and selecting the best-matching wall from the candidate walls;
calculating the coordinates of the midpoint of the door/window line segment and projecting the midpoint onto the best-matching wall to obtain the projection point;
calculating the endpoints of the corrected door/window line segment based on the projection point, the orientation angle of the best-matching wall, and the length of the door or window, thereby completing the adaptive adjustment of the door/window position.
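The dual-criterion matching and the midpoint projection can be sketched as below. Several details are assumptions and not taken from the claim: the thresholds `max_angle_deg` and `max_dist` are hypothetical, the distance between an opening and a wall is approximated by the distance from the opening's midpoint to the wall segment, and the best match is taken to be the candidate with the smallest such distance.

```python
import math

def _direction(seg):
    """Undirected orientation of a segment in radians, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def _angle_diff(a, b):
    d = abs(_direction(a) - _direction(b))
    return min(d, math.pi - d)

def _project(p, wall):
    """Closest point to p on the wall segment (projection clamped to the segment)."""
    (x1, y1), (x2, y2) = wall
    dx, dy = x2 - x1, y2 - y1
    t = ((p[0] - x1) * dx + (p[1] - y1) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return (x1 + t * dx, y1 + t * dy)

def match_wall(opening, walls, max_angle_deg=10.0, max_dist=15.0):
    """Return (wall, projection point) for the best-matching wall, or None."""
    mid = ((opening[0][0] + opening[1][0]) / 2, (opening[0][1] + opening[1][1]) / 2)
    best = None
    for wall in walls:
        # Angle criterion: skip walls that are not nearly parallel to the opening.
        if math.degrees(_angle_diff(opening, wall)) > max_angle_deg:
            continue
        foot = _project(mid, wall)
        dist = math.dist(mid, foot)
        # Distance criterion: keep the nearest wall within the threshold.
        if dist <= max_dist and (best is None or dist < best[0]):
            best = (dist, wall, foot)
    return None if best is None else (best[1], best[2])
```

Applying the angle test before the distance test means a door lying near a wall corner is not attached to the perpendicular wall merely because that wall happens to be closer.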

9. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 8, characterized in that the endpoints of the corrected door/window line segment are calculated from the projection point, the orientation angle of the best-matching wall, and the door/window length by the following expression: [formula omitted in source], in which the quantities are the coordinates of the two endpoints of the corrected line segment, the projection point, the orientation angle of the wall, the length of the door or window, and the horizontal and vertical coordinates of the projection point.
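The expression itself is not reproduced in this text, but the quantities it uses suggest one natural reading: the corrected segment is centered on the projection point and extends half the door/window length in each direction along the wall. The sketch below implements that reading as an assumption. The function name is hypothetical and the angle is taken in radians.

```python
import math

def corrected_endpoints(projection, wall_angle, length):
    """Endpoints of an opening of the given length, centred on the projection
    point and aligned with the wall direction (assumed form of the expression)."""
    xp, yp = projection
    hx = 0.5 * length * math.cos(wall_angle)
    hy = 0.5 * length * math.sin(wall_angle)
    return (xp - hx, yp - hy), (xp + hx, yp + hy)
```

Under this reading the door or window keeps its original length and is snapped exactly onto the wall's direction, whatever angle the wall is drawn at.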

10. The method for intelligent recognition and vectorization of irregular building structure primitives according to claim 2, characterized in that the loss function for training the multi-task convolutional neural network is: [formula omitted in source], in which the terms are the total loss value, the semantic segmentation loss, and the regression loss for points of interest.
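Since the formula is not reproduced in this text, the following is only a sketch of how a two-term multi-task loss of this kind is commonly assembled. It assumes a cross-entropy segmentation loss, a mean-squared-error heat-map regression loss, and a weighted sum with hypothetical weights `w_seg` and `w_reg`. None of these choices is confirmed by the claim.

```python
import math

def cross_entropy(probs, labels):
    """Mean negative log-likelihood; probs[i] is a list of class probabilities
    for pixel i and labels[i] is the index of its true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def mse(pred, target):
    """Mean squared error between predicted and target heat-map values."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)

def total_loss(seg_probs, seg_labels, heat_pred, heat_target, w_seg=1.0, w_reg=1.0):
    # w_seg and w_reg are hypothetical weights; the claim's formula is not available here.
    return w_seg * cross_entropy(seg_probs, seg_labels) + w_reg * mse(heat_pred, heat_target)
```

With both weights left at 1.0 this reduces to a plain sum of the segmentation and regression terms.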