Method, encoder and decoder for encoding and decoding 3D point clouds

By adjusting the vertex position and centroid on the decoder side, the triangle modeling is refined, which solves the problem of low point cloud compression efficiency in the existing technology and achieves more efficient point cloud reconstruction and real-time transmission.

CN119586143B · Active · Publication Date: 2026-06-30 · BEIJING XIAOMI MOBILE SOFTWARE CO LTD

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
BEIJING XIAOMI MOBILE SOFTWARE CO LTD
Filing Date
2023-06-30
Publication Date
2026-06-30


Abstract

The invention provides a method for decoding the geometry of a 3D point cloud from a bitstream, preferably implemented in a decoder. The method includes: receiving a bitstream, wherein the bitstream includes vertex information, the vertex information including information about the positions of vertices on the edges of a cube of leaf nodes of an octree structure of the point cloud; acquiring the vertex information to determine the positions of the vertices, wherein at least one vertex is adjusted toward the average position of surface points; and determining points of the point cloud based on the at least one adjusted vertex, wherein the bitstream further includes planar information indicating whether the point cloud data is planar, and the determination of points of the point cloud is further based on the acquired planar information.

Description

Technical Field

[0001] This invention relates to a method, encoder, and decoder for encoding and decoding 3D point clouds.

Background Technology

[0002] Point clouds have recently gained attention as a format for representing 3D data due to their versatility in representing all types of 3D objects or scenes. However, for all compression schemes, the quality of point cloud reconstruction is crucial.

Summary of the Invention

[0003] Therefore, the object of the present invention is to provide a method for decoding the geometry of a 3D point cloud from a bitstream and a method for encoding a 3D point cloud into a bitstream, both with improved efficiency.

[0004] In a first aspect, a method is provided for decoding the geometry of a 3D point cloud from a bitstream, preferably implemented in a decoder. The method includes: receiving a bitstream, wherein the bitstream includes vertex information, the vertex information including information about the positions of vertices on the edges of a cube of leaf nodes of an octree structure of the point cloud; acquiring the vertex information to determine the positions of vertices, wherein at least one vertex is adjusted toward the average position of surface points; and determining points of the point cloud based on the at least one adjusted vertex, characterized in that the bitstream further includes planar information indicating whether the point cloud data is planar, the determination of points of the point cloud being further based on the acquired planar information.

[0005] Preferably, obtaining vertex information to determine the position of a vertex, wherein at least one vertex is adjusted toward the average position of surface points, includes: determining a virtual position based on the position of a cube's vertices associated with the leaf nodes of an octree structure; constructing a triangle based on the vertices of the cube and the virtual position; determining the normal vector of the cube based on the constructed triangle; determining the centroid position based on the normal vector; and adjusting at least one vertex toward the centroid position.

[0006] Therefore, in the first step, a bitstream is received, and this bitstream may contain information about the octree structure of the volume of the point cloud to be decoded. Preferably, the geometry of the point cloud is GPCC encoded. Therefore, preferably, octree information about the volume of the point cloud can be obtained by decoding the bitstream. Furthermore, the bitstream includes vertex information, which includes the positions of vertices on the edges of the cubes associated with the leaf nodes in the octree structure. Preferably, it also includes information about the presence of vertices. Therefore, vertex information can be obtained by receiving and/or decoding the bitstream. Preferably, the bitstream is encoded at the encoder using the TriSoup encoding scheme.

[0007] After decoding the vertex information and preferably also the octree information from the bitstream described in the previous step, triangles can be determined for each cube in the next step for reconstructing the point cloud geometry. Specifically, virtual positions can first be determined based on the vertex positions of a cube associated with the leaf nodes of the octree structure, preferably by averaging the positions of the cube's vertices. Then, based on the vertices and virtual positions of a cube, a set of triangles is constructed, preferably by connecting two consecutive vertices on the cube's edges clockwise with the determined virtual positions. Thus, the surface of the first set of triangles is determined by the vertex positions included in the bitstream and the determined virtual positions (i.e., the average positions of the vertices).

[0008] Subsequently, the normal vector of the cube can be determined based on the constructed triangles. Here, a normal vector is a vector perpendicular to a given object. The normal vector of each triangle in the first set of triangles can therefore be determined separately, and the normal vector of the cube can then be determined based on the sum of these triangle normal vectors. Next, the centroid position can be determined based on the virtual position of the cube and the normal vector. Through this adjustment, the determined centroid position may be closer to the original points of the point cloud in the leaf node. Furthermore, according to the invention, at least one vertex can be adjusted toward the corresponding average position of the surface points. Surface points are points constituting the outer surface of the 3D point cloud, specifically located on edges that converge at vertices within the original point cloud. Original vertex information is established using surface points derived from the original point cloud. Surface points are a subset of the 3D point cloud. In particular, vertices can be associated with a target surface; that is, different vertices can correspond to different surfaces. For example, surface point 1405 in Figure 14 could be a point corresponding to the target surface; specifically, in Figure 14, vertex 1401 can correspond to surface point 1405. The corresponding surface points are original points of the point cloud that are close to the edge to which the at least one vertex belongs. Preferably, they are the subset of original points located at a distance of less than 1/4 of the leaf node width from that edge, and more preferably less than 1/8 of the leaf node width. Preferably, the at least one vertex is adjusted toward the determined centroid position to obtain a better model of the original point cloud surface.
In particular, if the point cloud data is planar according to the planar information, the vertices on the edges parallel to the axis along which the vector ... has its maximum value can be adjusted along the edges they belong to, to make the reconstructed surface more natural. Here, the vector ... is the vector from ... to C. More specifically, the axis along which the vector ... has its maximum value is determined first. Then, the vertices in the leaf node that lie on edges parallel to that axis are found. Finally, these vertices are adjusted based on ... . Then, based on the adjusted vertices and the centroid, triangles are constructed in a manner similar to that described earlier. Whether the point cloud data is planar can be determined using various methods or dedicated planarity determination algorithms. For example, planarity information can be obtained from features of the input point cloud data, determined at the encoder based on the percentage of leaf nodes with a centroid residual of 0, or determined automatically at the encoder based on the direction of the normal vector ... of each leaf node.
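As a non-limiting illustration, the derivation of the cube normal vector and of the centroid described above may be sketched as follows. The function name, the use of unnormalized (area-weighted) triangle normals, and the model "centroid = virtual position + residual × unit normal" are assumptions made for this sketch; the description only states that the normal is based on the sum of the triangle normals and that the centroid is based on the virtual position and the normal vector.

```python
import math


def cube_normal_and_centroid(vertices, residual):
    """Return (virtual_position, unit_normal, centroid) for one leaf node.

    vertices: ordered TriSoup vertices as (x, y, z) tuples.
    residual: signed scalar offset along the normal (assumed 1D residual).
    """
    m = len(vertices)
    # Virtual position: average of the vertex positions.
    c_mean = tuple(sum(v[k] for v in vertices) / m for k in range(3))

    # Sum the normals of the triangles (V_i, V_{i+1}, virtual position).
    normal = [0.0, 0.0, 0.0]
    for i in range(m):
        a = [vertices[i][k] - c_mean[k] for k in range(3)]
        b = [vertices[(i + 1) % m][k] - c_mean[k] for k in range(3)]
        normal[0] += a[1] * b[2] - a[2] * b[1]
        normal[1] += a[2] * b[0] - a[0] * b[2]
        normal[2] += a[0] * b[1] - a[1] * b[0]

    length = math.sqrt(sum(n * n for n in normal))
    if length == 0.0:
        # Degenerate configuration: keep the virtual position as centroid.
        return c_mean, (0.0, 0.0, 0.0), c_mean

    unit = tuple(n / length for n in normal)
    # Assumption: the centroid is displaced from the virtual position
    # by the residual along the unit normal.
    centroid = tuple(c_mean[k] + residual * unit[k] for k in range(3))
    return c_mean, unit, centroid
```

For four vertices lying in the plane z = 2, the sketch yields a normal along the z axis and a centroid lifted out of that plane by the residual.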

[0009] To reconstruct the points of the point cloud from the reconstructed triangles, voxelization is performed, preferably via a ray tracing process in which rays are emitted along the three directions parallel to the three axes. The ray origins have integer coordinates corresponding to the sampling precision required for rendering. The intersection point (if any) of a ray with one of the reconstructed triangles is then determined and added to the list of rendered points, i.e., to the points of the point cloud. In other words, voxelization samples the surface of the reconstructed triangles with rays to determine the points of the point cloud.

[0010] Preferably, the at least one vertex is adjusted only when the difference between the virtual position and the centroid position exceeds a threshold. Preferably, the threshold is based on the leaf node size, and more preferably, it is a proportion s of the leaf node size, where preferably s = 1/128. Therefore, before performing the method according to the invention, it is first determined whether the convexity of the modeled surface in the leaf node is sufficiently large, for example by comparing the centroid residual with the threshold. This constraint avoids unnecessary adjustments, thereby improving computational efficiency and reducing complexity.

[0011] Preferably, the at least one vertex is adjusted based on a fixed value, which is determined based on the type of point cloud and is encoded into or decoded from the bitstream. Specifically, the fixed value can be determined based on statistics of the input point cloud sequence that indicate attributes, characteristics, or features of the 3D point cloud sequence (e.g., ordinary or non-ordinary). Statistics of the input point cloud sequence can include various measures and attributes, such as the following. Descriptive statistics: measures such as the mean, median, mode, variance, and standard deviation of the point coordinates (x, y, z), as well as of other attributes such as color, intensity, or normal vectors. Geometric attributes: features such as point density, spatial distribution, or point cloud extent, which can be quantified and summarized. Temporal attributes: for point cloud sequences that change over time, attributes such as motion or deformation of the point cloud, which can be analyzed and summarized. Topological attributes: the connectivity or relationships between points in the point cloud, which can be studied to provide a deeper understanding of the underlying structure of the data. Using a fixed value provides simplicity and consistency across similar point cloud types.

[0012] Preferably, at least one vertex is adjusted based on the difference between the virtual position and the centroid position, and more preferably, based on the boundaries of the edges of the leaf nodes. As previously mentioned, the vertex can be adjusted along certain defined edges toward the centroid position. Furthermore, the adjustment can be further based on the boundaries of the edges of the leaf nodes, thereby ensuring that the adjusted vertex does not extend beyond its respective leaf node. Therefore, this technique allows for more precise vertex adjustment, resulting in a more accurate representation of the point cloud geometry.
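Purely by way of example, the threshold test and the boundary constraint described above may be combined as in the following sketch. The function `adjust_vertex`, its parameters, and in particular the displacement rule (moving the vertex by `gain` times the component of the centroid-minus-virtual-position difference along the edge axis) are hypothetical choices for illustration; the description only requires that the adjustment be based on that difference, be skipped below the threshold, and keep the vertex within its leaf node.

```python
import math


def adjust_vertex(pos, axis, edge_start, leaf_size,
                  c_mean, centroid, s=1 / 128, gain=1.0):
    """Move one vertex along its edge toward the centroid side.

    pos:        (x, y, z) position of the vertex.
    axis:       index (0, 1, 2) of the axis the edge is parallel to.
    edge_start: coordinate of the edge origin along that axis.
    leaf_size:  width of the leaf node (edge length).
    s:          proportion of the leaf size used as threshold.
    gain:       hypothetical scaling of the displacement (assumption).
    """
    diff = [centroid[k] - c_mean[k] for k in range(3)]
    magnitude = math.sqrt(sum(d * d for d in diff))

    # Skip the adjustment when the surface is nearly flat in this node.
    if magnitude <= s * leaf_size:
        return tuple(pos)

    moved = pos[axis] + gain * diff[axis]
    # Keep the vertex on its edge, i.e. inside the leaf node.
    clamped = min(max(moved, edge_start), edge_start + leaf_size)

    out = list(pos)
    out[axis] = clamped
    return tuple(out)
```

With a leaf size of 8 the threshold is 8/128 = 0.0625, so a centroid displaced by 0.05 leaves the vertex untouched, while a displacement of 2 moves it and may be clipped at the end of the edge.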

[0013] Preferably, the planar information is encoded into the geometry parameter set (GPS).

[0014] In another aspect of the invention, a method for encoding a 3D point cloud into a bitstream is provided, preferably implemented in an encoder. The method for encoding a 3D point cloud includes: obtaining vertex information of each cube associated with a leaf node from the surface of the point cloud, wherein the vertex information includes position information of vertices on the edges of the cubes of the leaf nodes with respect to the octree structure of the point cloud; encoding the vertex information into a bitstream, wherein at least one vertex is adjusted toward the average position of surface points; and determining points of the point cloud based on the encoded vertex information.

[0015] The method is characterized by further comprising determining and encoding planar information indicating whether the point cloud data is planar, wherein the determination of points in the point cloud is further based on the encoded planar information.

[0016] Therefore, vertex information and preferably octree information are generated using the encoding method. This information can be encoded into a bitstream. Subsequently, a reconstruction step can be performed on the encoder side. In this reconstruction step, the point cloud geometry is reconstructed, wherein the reconstruction step is the same as the steps in the decoding method described above. Then, on the encoder side, the attributes (color, reflectivity, etc.) of the points in the point cloud are encoded using the reconstructed geometry of the point cloud, for example, by sequentially encoding the attributes of the points in the point cloud through RAHT (Region Adaptive Hierarchical Transform), predictive transform, or lifting transform.

[0017] Preferably, the geometric structure of the point cloud is encoded into a bit stream by geometry-based point cloud compression (G-PCC).

[0018] Preferably, the bitstream is a bitstream conforming to MPEG G-PCC.

[0019] Preferably, the encoding method is further constructed based on the features described above in conjunction with the decoding method.

[0020] In another aspect of the invention, an encoder for encoding 3D point clouds into a bitstream is provided. The encoder includes a memory and a processor, wherein instructions are stored in the memory, and when the instructions are executed by the processor, the steps of the aforementioned encoding method are performed.

[0021] In another aspect of the invention, a decoder for decoding 3D point clouds from a bitstream is provided. The decoder includes a memory and a processor, wherein instructions are stored in the memory, and when the instructions are executed by the processor, the steps of the aforementioned decoding method are performed.

[0022] In another aspect of the invention, a bitstream is provided, wherein the bitstream is encoded by the steps of the aforementioned encoding method.

[0023] In another aspect of the invention, a computer-readable storage medium is provided, comprising instructions for performing the steps of the method described above for encoding a 3D point cloud into a bit stream.

[0024] In another aspect of the invention, a computer-readable storage medium is provided, comprising instructions for performing the steps of the method described above for decoding a 3D point cloud from a bitstream.

Description of the Drawings

[0025] The invention is described in more detail below with reference to the accompanying drawings.

[0026] The accompanying drawings show:

[0027] Figure 1 shows a flowchart of a method for decoding the geometry of a 3D point cloud according to the present invention.

[0028] Figure 2 shows an example of generating an octree structure.

[0029] Figure 3 shows the octree according to Figure 2.

[0030] Figure 4 shows an example of determining the vertices on the edges of a cube.

[0031] Figure 5 shows an example of generating a triangle.

[0032] Figure 6 shows examples of vertices on the edges of a cube.

[0033] Figure 7 shows how triangles are generated from vertices.

[0034] Figure 8 shows an example of determining the order of the triangles according to Figure 7.

[0035] Figure 9 shows a schematic diagram of the voxelization process.

[0036] Figure 10 shows an example of reconstructing triangles using the centroid C as the pivot point.

[0037] Figure 11 shows an example of a normal vector.

[0038] Figure 12 shows an example of a 1D residual along the normal vector.

[0039] Figure 13 shows an example of a smooth convex surface in a leaf node, modeled by triangles constructed from the vertices and the centroid.

[0040] Figure 14 shows an example of a refined surface.

[0041] Figure 15 shows an example of planar data modeled using the Trisoup method without vertex refinement.

[0042] Figure 16 shows an example of planar data modeled using the Trisoup method with vertex refinement.

[0043] Figure 17 shows a decoder or an encoder.

Detailed Description

[0044] Point clouds have recently gained attention as a format for representing 3D data because of their versatility in representing all types of 3D objects or scenes.

[0045] Therefore, many use cases can be handled using point clouds, including:

[0046] • Film post-production.

[0047] • Real-time 3D immersive experiences and virtual reality (VR) / augmented reality (AR) applications.

[0048] • Free-viewpoint video (e.g., for watching sports).

[0049] • Geographic information systems (mapping).

[0050] • Cultural heritage (storing scans of rare items in digital form).

[0051] • Autonomous driving, including 3D environmental mapping and real-time LiDAR data acquisition.

[0052] A point cloud is a set of points in 3D space, and additional values can optionally be assigned to each point. These additional values are often referred to as point attributes. Therefore, a point cloud is a combination of geometry (the 3D position of each point) and attributes.

[0053] The attributes can be, for example, three-component color, material properties such as reflectivity, and / or the two-component normal vector of the surface associated with that point.

[0054] Point clouds can be captured by various types of devices, such as camera arrays, depth sensors, LiDAR, or scanners, or can be generated by computers (e.g., in film post-production). Depending on the use case, a point cloud can have from thousands of points up to billions of points for mapping applications.

[0055] The raw representation of a point cloud requires a very high number of bits per point; each spatial component X, Y, or Z requires at least twelve bits, and optionally, attributes require even more bits, such as color requiring three times 10 bits. Practical deployment of point cloud-based applications requires compression techniques that enable the storage and distribution of point clouds through a suitable storage and transmission infrastructure.

[0056] For distribution to and visualization by end users, such as on AR/VR glasses or any other 3D-enabled device, compression can be lossy (as in video compression). Other use cases, such as medical applications or autonomous driving, do require lossless compression to avoid altering decisions derived from the analysis of the compressed and transmitted point cloud.

[0057] Until recently, point cloud compression (also known as PCC) had not been addressed by the mass market, and no standardized point cloud codec was available. In 2017, the standardization working group ISO/IEC JTC1/SC29/WG11, also known as the Moving Picture Experts Group or MPEG, launched a work item on point cloud compression. This resulted in two standards, namely:

[0058] • MPEG-I Part 5 (ISO / IEC 23090-5) or Video-Based Point Cloud Compression (V-PCC)

[0059] • MPEG-I Part 9 (ISO / IEC 23090-9) or Geometry-Based Point Cloud Compression (G-PCC)

[0060] Both the V-PCC and G-PCC standards completed their first versions at the end of 2020.

[0061] The V-PCC encoding method compresses point clouds by performing multiple projections of the 3D object to obtain 2D patches that are packaged into an image (or, when processing moving point clouds, into video). The acquired image or video is then compressed using existing image / video codecs, allowing the utilization of already deployed image and video solutions. By its very nature, V-PCC is only effective on dense and continuous point clouds because image / video codecs cannot compress non-smooth patches, such as those obtained from projections of sparse geometry acquired from LiDAR.

[0062] The G-PCC coding method has two schemes for geometry compression. The first scheme is based on an occupancy tree (octree / quadtree / binary tree) representation of the point cloud geometry. Occupied nodes are split until a certain size is reached, and the occupied leaf nodes provide the locations of points, typically at the center of these nodes. High levels of compression for dense point clouds can be achieved by using neighbor-based prediction techniques. Sparse point clouds are handled by directly coding the positions of points within nodes of non-minimal size, stopping tree construction when only isolated points exist in a node; this technique is called Direct Coding Mode (DCM).

[0063] The second scheme is based on a prediction tree, where each node represents the 3D location of a point, and the relationship between nodes is a spatial prediction from parent to child. This method can only handle sparse point clouds and has the advantages of lower latency and simpler decoding compared to occupancy-based methods. However, compared to the first, occupancy-based scheme, the compression performance is only slightly better, and the encoding is complex, requiring an intensive search for the best predictor (among a long list of potential predictors) when constructing the prediction tree.

[0064] In both schemes, attribute (de)coding is performed after geometry (de)coding is complete, resulting in two-pass coding. Low latency is therefore achieved by using slices that decompose the 3D space into independently coded sub-volumes, without requiring prediction between sub-volumes. However, using many slices can severely impact compression performance.

[0065] One important use case is the delivery of dynamic AR/VR point clouds. Dynamic means that the point cloud evolves over time. Furthermore, AR/VR point clouds are typically locally 2D because they represent the surface of objects most of the time. Therefore, AR/VR point clouds are highly connected (or dense), in the sense that points are rarely isolated and instead have many neighbors.

[0066] Dense (or solid) point clouds represent continuous surfaces with a resolution that allows the volumes associated with points (small cubes called voxels) to contact each other without revealing any visible holes on the surface.

[0067] These point clouds are typically used in AR/VR environments and viewed by end users through devices such as TVs, smartphones, or headsets. They are either transmitted to the device or stored locally. Many AR/VR applications use dynamic point clouds, which change over time, rather than static ones. Therefore, the data volume is enormous and must be compressed. Currently, lossless compression based on an octree representation of the point cloud geometry can achieve slightly less than one bit per point (1 bpp). This may not be sufficient for real-time transmission, which can involve millions of points per frame at frame rates up to 50 frames per second (fps), resulting in hundreds of megabits of data per second.

[0068] Therefore, lossy compression can be used to maintain an acceptable visual quality while compressing sufficiently to fit within the bandwidth provided by the transmission channel and to ensure real-time transmission of frames. In many applications, bit rates as low as 0.1 bpp (ten times more compressed than lossless coding) would already make real-time transmission possible.
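The orders of magnitude mentioned above can be checked with a short calculation. The figure of 4 million points per frame is an assumed example value (the description only says "millions of points per frame"); the frame rate and bit rates are those given above.

```python
def bitrate_mbps(points_per_frame, fps, bits_per_point):
    """Geometry bit rate in megabits per second."""
    return points_per_frame * fps * bits_per_point / 1e6


# Assumed example: 4 million points per frame at 50 fps.
lossless = bitrate_mbps(4_000_000, 50, 1.0)   # about 1 bpp (lossless octree)
lossy = bitrate_mbps(4_000_000, 50, 0.1)      # 0.1 bpp (lossy target)
print(f"lossless: {lossless:.0f} Mbit/s, lossy: {lossy:.0f} Mbit/s")
```

Under these assumptions the lossless stream needs about 200 Mbit/s, whereas the lossy stream needs about 20 Mbit/s.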

[0069] VPCC, a codec based on MPEG-I Part 5 (ISO / IEC 23090-5) or Video-Based Point Cloud Compression (V-PCC), can achieve such low bit rates by using lossy compression from a video codec that compresses 2D frames obtained from the projection of a point cloud onto a plane. The geometry is represented by a series of projected patches assembled into a frame, each patch being a small local depth map. However, VPCC is not universal and is limited to narrow types of point clouds that do not exhibit complex local geometries (such as trees or hair) because the resulting projected depth maps are not smooth enough to be effectively compressed by the video codec.

[0070] Pure 3D compression technology can handle any type of point cloud. Whether 3D compression can compete with VPCC (or any projection + image coding scheme) on dense point clouds remains an open question. Standardization is still moving towards providing an extension (amendment) to GPCC that offers competitive lossy compression, that is, one that compresses dense point clouds as well as VPCC intra-frame coding does, while maintaining GPCC's versatility in handling any type of point cloud (dense point clouds, LiDAR, 3D maps). This extension may use the so-called TriSoup scheme, which is applied on top of an octree and is discussed in detail later. The ISO/IEC standardization working group JTC1/SC29/WG7 is exploring TriSoup.

[0071] Figure 1 shows a schematic diagram of a method for decoding the geometry of a 3D point cloud from a bitstream.

[0072] A preferred method for decoding the geometry of a 3D point cloud from a bitstream, implemented in a decoder, includes the following steps:

[0073] S11: Receive bit stream, wherein the bit stream contains vertex information, the vertex information including information about the vertex positions on the cube edges of the leaf nodes of the octree structure of the point cloud;

[0074] S12: Obtain vertex information to determine the position of the vertex, wherein at least one vertex is adjusted toward the average position of the surface points;

[0075] S13: Determine the points of the point cloud based on at least one adjusted vertex, characterized in that the bit stream further includes planar information indicating whether the point cloud data is planar, and the determination of the points of the point cloud is further based on the obtained planar information.

[0076] To determine octree information, the first step in the geometry coding process is to construct and encode the octree, as shown in Figures 2 and 3. The bounding box is the volume 100 containing all points and is associated with the root node 112 (i.e., the single node at the top of the tree 110). This volume 100 is first divided into eight sub-volumes 102 called octants, each represented by a node 114 in the tree 110. Then, the octants are recursively subdivided into sub-volumes 104 until a target level is reached, where each octant 106 occupied by at least one point is shown shaded in Figures 2 and 3. The target level is the level at which the volume is no longer subdivided, and it can be determined based on the compression ratio.

[0077] Each octant (or node) is represented by an occupancy byte containing one bit for each child octant. If the child octant is occupied by at least one point, the corresponding bit is set to 1; otherwise, it is set to 0. All occupancy bytes 118, such as "1000100 00001000 11000011", are serialized (in breadth-first order) and entropy-coded using a binary arithmetic coder.
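For illustration only, the breadth-first serialization of occupancy bytes may be sketched as follows. The function name, the mapping of child octants to bit positions (child 0 in the most significant bit, with x as the most significant index bit), and the stopping criterion at unit-size nodes are assumptions of this sketch; the entropy coding stage is omitted.

```python
from collections import deque


def occupancy_bytes(points, depth):
    """Return octree occupancy bytes in breadth-first order.

    points: iterable of integer (x, y, z) with coordinates in [0, 2**depth).
    depth:  number of subdivision levels (target level).
    """
    out = []
    queue = deque([((0, 0, 0), 2 ** depth, sorted(set(points)))])
    while queue:
        origin, size, pts = queue.popleft()
        if size == 1:
            continue  # target level reached: no further subdivision
        half = size // 2

        # Distribute the points of this node among its eight children.
        children = {}
        for p in pts:
            idx = 0
            for k in range(3):
                if p[k] >= origin[k] + half:
                    idx |= 1 << (2 - k)  # assumed bit order: x, y, z
            children.setdefault(idx, []).append(p)

        # One bit per child; assumed mapping: child 0 -> most significant bit.
        byte = 0
        for idx in children:
            byte |= 1 << (7 - idx)
        out.append(byte)

        # Only occupied children are subdivided further.
        for idx in sorted(children):
            child_origin = tuple(
                origin[k] + (half if (idx >> (2 - k)) & 1 else 0)
                for k in range(3))
            queue.append((child_origin, half, children[idx]))
    return out
```

For two points at opposite corners of a 4×4×4 volume, `occupancy_bytes([(0, 0, 0), (3, 3, 3)], 2)` gives one byte for the root followed by one byte for each of the two occupied children.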

[0078] Figure 4 shows a block representation of a 3D surface 210 and an example of a block 220 in Trisoup. Surface 210 intersects block 220, which is therefore an occupied block, and block 220 is one of multiple blocks 200 in 3D space. Within block 220, the enclosed portion of surface 210 intersects the edges of the block at the six illustrated vertices 232 of polygon 230. If an edge of block 220 contains a vertex, that edge is said to be selected.

[0079] Figure 5 shows block 220 in Trisoup. For clarity, surface 210 is omitted. An unselected edge 270, a selected edge 260, and the i-th edge 250 are shown. The vertex on edge i is specified by a scalar value that indicates the corresponding fraction of the length of edge 250.

[0080] As shown in Figures 4 and 5, within each block 220 at the target level of the octree, Trisoup represents the original surface 210 as a set of triangles 245. This surface is encoded and used to obtain the positions of the reconstructed (or decoded) points. First, the intersections of the surface represented by the original points with the edges of the block are estimated by averaging the positions of the original points within the block that are closest to those edges. Second, all twelve edges of the block and their associated intersections (if any) are stored as segments and vertices, respectively. Then, each (unique) segment is encoded as follows: first, a single bit is arithmetically coded, set to 1 if the segment is occupied by a vertex and 0 otherwise; if the segment is occupied, the relative position of the vertex on the segment is also arithmetically coded.

[0081] The vertices 310 of the triangles are coded along the edges 320 of the volumes associated with the leaf nodes 300 of the tree, as shown in Figure 6. Vertices 310 on an edge 320 are shared among the leaf nodes 300 that have that edge 320 in common. This means that at most one vertex is coded per edge belonging to at least one leaf node. In this way, the continuity of the model is ensured across leaf nodes.

[0082] As mentioned above, encoding TriSoup vertices may require two pieces of information for each edge: a vertex flag indicating whether a TriSoup vertex exists on the edge, and, if it does exist, the vertex position along the edge.

[0083] Therefore, the encoded data includes octree data and TriSoup data.

[0084] Vertex flags are coded by an adaptive binary arithmetic coder that uses dedicated contexts. The position of a vertex on an edge of length n = 2^s can be coded with unit precision by pushing s bits into the bitstream (bypass, i.e., without entropy coding).
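By way of example, the two pieces of information coded per edge can be modeled as a flag plus s raw bits. The helpers `encode_edge` and `decode_edge` are hypothetical names, and the most-significant-bit-first order is an assumption; the context-adaptive arithmetic coding of the flag is not modeled here.

```python
def encode_edge(vertex_offset, s):
    """Return (flag, bits) for one edge of length 2**s.

    vertex_offset: None if the edge carries no vertex, otherwise an
                   integer position in [0, 2**s) along the edge.
    """
    if vertex_offset is None:
        return 0, []
    if not 0 <= vertex_offset < 2 ** s:
        raise ValueError("vertex position outside the edge")
    # Bypass coding: the s bits are written as-is, MSB first (assumed order).
    bits = [(vertex_offset >> (s - 1 - i)) & 1 for i in range(s)]
    return 1, bits


def decode_edge(flag, bits):
    """Inverse of encode_edge: return the vertex position or None."""
    if flag == 0:
        return None
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value
```

For an edge of length 8 (s = 3), a vertex at position 5 is written as the flag 1 followed by the three bits 1, 0, 1.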

[0085] Within a leaf node 300, if there are at least three vertices 310 on the edges 320 of the leaf node, triangles are constructed from the TriSoup vertices. Figure 7 depicts the reconstructed triangles 330 and 340.

[0086] Clearly, combinations of triangles other than 330 and 340 are also possible. The selection of the triangles involves a three-step process.

[0087] 1. Determine the dominant direction along one of the three axes.

[0088] 2. Sort the TriSoup vertices according to the dominant direction.

[0089] 3. Construct triangles based on the ordered list of vertices.

[0090] The exact positions of the triangles in the current leaf node need not be known in advance; they can be derived from the vertices.

[0091] Figure 8 is used to explain the process. Each of the three axes is tested, and the axis that maximizes the total surface area of the triangles is chosen as the principal (dominant) axis. For simplicity, Figure 8 illustrates the test on only two axes.

[0092] The first test (top) along the vertical axis is performed by projecting the cube and the TriSoup vertices 310 vertically onto a 2D plane. The vertices 310 are then ordered clockwise relative to the center of the projected node (a square). Triangles 330, 340 are then constructed from the ordered vertices according to fixed rules. Here, triangles 123 and 134 are systematically constructed when there are 4 vertices. When there are 3 vertices, the only possible triangle is 123. When there are 5 vertices, the fixed rule can be to construct triangles 123, 134, and 451. Rules are defined in the same way up to the maximum of 12 vertices.

[0093] A second test (left side) is performed along the horizontal vertical axis by horizontally projecting the cube and Trisoup vertices onto a 2D plane.

[0094] The vertical projection shows the largest total 2D surface area of ​​the triangle; therefore, the principal axis is chosen as the vertical axis, and the constructed TriSoup triangles are obtained by following the order of the vertical projections, as follows: Figure 8 As shown, it is located inside the node. It's important to note that using the horizontal axis as the principal axis will result in a different construction of the triangle.

[0095] Selecting the principal axis by maximizing the projected surface area makes it possible to reconstruct continuous point clouds without holes.
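
The three steps above can be sketched as follows. This is a minimal illustration, not the codec's implementation: the function names are hypothetical, the node center is passed in explicitly, and the fan rule (123, 134, 145, ...) is assumed to extend the fixed rule that the text gives for 3, 4, and 5 vertices.

```python
import math

def project(p, axis):
    """Drop the coordinate along `axis` (0=x, 1=y, 2=z) to get a 2D point."""
    return tuple(c for i, c in enumerate(p) if i != axis)

def order_clockwise(vertices, center, axis):
    """Sort 3D vertices clockwise around `center` in the 2D projection."""
    cu, cv = project(center, axis)

    def angle(p):
        u, v = project(p, axis)
        return math.atan2(v - cv, u - cu)

    # Decreasing angle is clockwise when u points right and v points up.
    return sorted(vertices, key=angle, reverse=True)

def fan_triangles(ordered):
    """Triangles 123, 134, 145, ... built from an ordered vertex list."""
    return [(ordered[0], ordered[k], ordered[k + 1])
            for k in range(1, len(ordered) - 1)]

def projected_area(tri, axis):
    """Area of the triangle after projection along `axis`."""
    (u0, v0), (u1, v1), (u2, v2) = (project(p, axis) for p in tri)
    return abs((u1 - u0) * (v2 - v0) - (u2 - u0) * (v1 - v0)) / 2.0

def choose_dominant_axis(vertices, node_center):
    """Test each axis and keep the one with the largest projected area."""
    best = None
    for axis in range(3):
        ordered = order_clockwise(vertices, node_center, axis)
        tris = fan_triangles(ordered)
        area = sum(projected_area(t, axis) for t in tris)
        if best is None or area > best[0]:
            best = (area, axis, tris)
    return best[1], best[2]
```

For four vertices lying in a horizontal plane of the node, the projections along x and y collapse to segments of zero area, so the sketch selects the z axis and returns two triangles.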

[0096] Rendering a TriSoup triangle into points can be done using ray tracing. The collection of all points rendered by ray tracing will form the decoded point cloud.

[0097] In the ray tracing shown in Figure 9, rays are launched along the three directions parallel to the axes. Their origins are points with integer (voxelized) coordinates at the sampling precision required for rendering. The intersection point (dashed point), if any, with one of the TriSoup triangles is then voxelized (rounded to the nearest point at the required sampling precision) and added to the list of rendered points.

[0098] After TriSoup has been applied to all leaf nodes, i.e., after triangles have been constructed and points obtained, preferably by ray tracing, duplicates of identical points in the list of all rendered points are discarded (i.e., only one voxel is retained among all voxels sharing the same position and volume), and a set of decoded (unique) points is obtained.
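
The rendering and de-duplication described above can be sketched as follows. The sketch assumes a sampling precision of one unit, uses a hypothetical function name, and relies on a Python set to discard duplicate voxels; it intersects axis-parallel rays with one triangle using barycentric coordinates in the plane orthogonal to the ray.

```python
import math

def render_triangle(tri, points):
    """Add to the set `points` the voxelized hits of axis-parallel rays on `tri`."""
    for axis in range(3):
        # b and c are the two coordinates that define the ray origin grid.
        b, c = [i for i in range(3) if i != axis]
        (b0, c0), (b1, c1), (b2, c2) = ((p[b], p[c]) for p in tri)
        det = (b1 - b0) * (c2 - c0) - (b2 - b0) * (c1 - c0)
        if det == 0:
            continue  # triangle is parallel to these rays: no unique hit
        for u in range(math.ceil(min(b0, b1, b2)), math.floor(max(b0, b1, b2)) + 1):
            for v in range(math.ceil(min(c0, c1, c2)), math.floor(max(c0, c1, c2)) + 1):
                # Barycentric weights of (u, v) in the projected triangle.
                w1 = ((u - b0) * (c2 - c0) - (b2 - b0) * (v - c0)) / det
                w2 = ((b1 - b0) * (v - c0) - (u - b0) * (c1 - c0)) / det
                w0 = 1.0 - w1 - w2
                if min(w0, w1, w2) < 0:
                    continue  # the ray misses the triangle
                hit = w0 * tri[0][axis] + w1 * tri[1][axis] + w2 * tri[2][axis]
                voxel = [0, 0, 0]
                voxel[axis] = math.floor(hit + 0.5)  # round to nearest voxel
                voxel[b], voxel[c] = u, v
                points.add(tuple(voxel))
```

Calling `render_triangle` for every triangle of every leaf node with the same set yields the unique decoded points, since a set keeps a single copy of each voxel.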

[0099] Based on the above concepts, TriSoup coding can be improved, for example by computing a centroid whose coordinates are the average of the coordinates of all (ordered) vertices $V_i$; see Figure 10, in which the centroid C is depicted with a checkerboard pattern. The centroid is used as a pivot point: from the ordered vertices $(V_1, V_2, \ldots, V_M)$, the following M triangles are constructed by pivoting around the centroid C: $V_1V_2C$, $V_2V_3C$, ..., $V_{M-1}V_MC$, $V_MV_1C$.

[0100] The vertices are ordered clockwise, and the choice of the first vertex is unimportant.

[0101] This construction preserves the natural symmetry of the model without arbitrarily privileging any triangle. Furthermore, it provides an additional degree of freedom for improving the accuracy of the model, namely the position of the centroid C.
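
As a minimal sketch of the pivot construction, assuming the vertices are already ordered and using hypothetical function names, the centroid and the M triangles can be computed as:

```python
def centroid(vertices):
    """Average position of the vertices (component-wise mean)."""
    m = len(vertices)
    return tuple(sum(v[i] for v in vertices) / m for i in range(3))

def pivot_triangles(ordered_vertices, c):
    """M triangles (V_k, V_{k+1}, C), wrapping from the last vertex to the first."""
    m = len(ordered_vertices)
    return [(ordered_vertices[k], ordered_vertices[(k + 1) % m], c)
            for k in range(m)]
```

Because every vertex appears in exactly two triangles and the wrap-around closes the fan, rotating the starting vertex yields the same set of triangles.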

[0102] Accordingly, the position of the centroid C can be further improved by encoding a residual position in the bitstream, so that the centroid C lies closer to the original points of the point cloud.

[0103] For example, $C = C_{mean} + \vec{C}_{res}$,

[0104] where $C_{mean}$ is the average position obtained by averaging the coordinates of all (ordered) vertices, and $\vec{C}_{res}$ is the coded residual.

[0105] The encoded residual can be a 3D residual. However, it has been observed that 3D residuals are rarely advantageous because they require encoding many bits, and such a large number of bits is not fully compensated for by the improved accuracy of the model. Therefore, a 1D residual is preferably encoded.

[0106] For example, a normal vector $\vec{n}$ can be constructed as shown in Figure 11, and the residual can be determined by the following formula:

[0107] $\vec{C}_{res} = \alpha \, \vec{n}$,

[0108] where α is a 1D signed scalar value encoded in the bitstream; see Figure 12. The normal vector $\vec{n}$ can be derived in the following two steps.

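[0109] 1. $\vec{n} = \sum_{k=1}^{M} \overrightarrow{C_{mean}V_k} \times \overrightarrow{C_{mean}V_{k+1}}$

[0110] 2. Then normalize: $\vec{n} \leftarrow \vec{n} / \|\vec{n}\|$,

[0111] where $\times$ is the cross product (also called the vector product) between two vectors, and the boundary convention is:

[0112] $V_{M+1} = V_1$.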

[0113] It will be understood that the vector $\vec{n}$ can instead be chosen parallel to an axis, which simplifies its determination and the calculation of the value α. In particular, the vector $\vec{n}$ can be parallel to the principal axis, which is a good approximation of the vector calculated above.
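
The two options for the normal vector can be sketched as follows. The first function is an assumption made for illustration: it takes the normal as the normalized sum of the cross products of consecutive centroid-to-vertex vectors, which is one common way to realize a cross-product step followed by a normalization step. The second function is the axis-parallel simplification. All names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_normal(ordered_vertices):
    """Assumed rule: normalized sum of cross products around the mean point."""
    m = len(ordered_vertices)
    c_mean = [sum(v[i] for v in ordered_vertices) / m for i in range(3)]
    n = [0.0, 0.0, 0.0]
    for k in range(m):
        a = [ordered_vertices[k][i] - c_mean[i] for i in range(3)]
        b = [ordered_vertices[(k + 1) % m][i] - c_mean[i] for i in range(3)]
        n = [n[i] + x for i, x in enumerate(cross(a, b))]
    norm = math.sqrt(sum(x * x for x in n))
    if norm == 0:
        return None  # degenerate vertex configuration
    return tuple(x / norm for x in n)

def axis_normal(principal_axis):
    """Simplified alternative: unit vector parallel to the principal axis."""
    return tuple(1.0 if i == principal_axis else 0.0 for i in range(3))
```

For vertices lying in a plane orthogonal to the principal axis, both functions return the same direction up to sign, which is why the axis-parallel choice can serve as an approximation.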

[0114] The value α is determined by the encoder, encoded into the bitstream, and obtained by the decoder by decoding the bitstream. The value α can be binarized, and each bit can be encoded using a binary entropy encoder such as an arithmetic encoder or a context-adaptive binary encoder such as CABAC.

[0115] The value α can be binarized as:

[0116] a flag indicating whether α is equal to 0,

[0117] a sign bit indicating whether α > 0 or α < 0,

[0118] a flag indicating whether |α| is equal to 1,

[0119] a remainder |α| - 2 coded by an exponential-Golomb coder.
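
The binarization above can be illustrated with the following sketch. It only produces and parses the bin string; the entropy coding of each bin is left out. The bit polarity of each flag and the use of an order-0 exponential-Golomb code are assumptions, and the function names are hypothetical.

```python
def exp_golomb_bits(value):
    """Order-0 exponential-Golomb code of a non-negative integer, as a bit list."""
    v = value + 1
    prefix = [0] * (v.bit_length() - 1)
    return prefix + [int(b) for b in bin(v)[2:]]

def binarize_alpha(alpha):
    """Bins: zero flag, sign, is-one flag, then exp-Golomb of |alpha| - 2."""
    bits = [1 if alpha == 0 else 0]        # flag: alpha == 0
    if alpha == 0:
        return bits
    bits.append(1 if alpha > 0 else 0)     # sign: 1 for positive (assumed)
    mag = abs(alpha)
    bits.append(1 if mag == 1 else 0)      # flag: |alpha| == 1
    if mag > 1:
        bits += exp_golomb_bits(mag - 2)   # remainder |alpha| - 2
    return bits

def debinarize_alpha(bits):
    """Inverse of binarize_alpha."""
    it = iter(bits)
    if next(it) == 1:
        return 0
    sign = 1 if next(it) == 1 else -1
    if next(it) == 1:
        return sign
    zeros = 0
    while next(it) == 0:                   # count the exp-Golomb prefix
        zeros += 1
    v = 1
    for _ in range(zeros):
        v = (v << 1) | next(it)
    return sign * (v - 1 + 2)
```

Small magnitudes, which are the most frequent when surfaces are smooth, cost only one to three bins under this scheme.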

[0120] The value α can be determined by the encoder by considering all points P of the point cloud that belong to the current leaf node. For each point P, its distance d from the straight line $(C_{mean}, \vec{n})$ is given by

[0121] $d = \| \overrightarrow{C_{mean}P} \times \vec{n} \|$,

[0122] and, if this distance d is below a predefined threshold $d_{th}$ (for example, $d_{th} = 2$), the point P is used to calculate the value α. The 1D residual $\alpha_P$ of the point P relative to the average point $C_{mean}$ is obtained through the scalar product (also known as the inner product or dot product)

[0123] $\alpha_P = \overrightarrow{C_{mean}P} \cdot \vec{n}$.

[0124] The value α is then obtained by the following formula:

[0125] $\alpha = \frac{1}{|S|} \sum_{P \in S} \alpha_P$,

[0126] where S is the set of points P whose distance d is below the threshold $d_{th}$, and |S| is the number of points belonging to this set.
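
The encoder-side estimate of α can be sketched as follows, assuming that the normal vector passed in has unit length. The function name, the returned value of 0 when no point is close enough to the line, and the omission of any quantization of α are assumptions made for illustration.

```python
import math

def estimate_alpha(points, c_mean, n, dist_threshold=2.0):
    """Mean 1D residual of the points lying close to the line (c_mean, n)."""
    total, count = 0.0, 0
    for p in points:
        d = [p[i] - c_mean[i] for i in range(3)]
        # Distance to the line is the norm of the cross product with unit n.
        cx = (d[1] * n[2] - d[2] * n[1],
              d[2] * n[0] - d[0] * n[2],
              d[0] * n[1] - d[1] * n[0])
        dist = math.sqrt(sum(x * x for x in cx))
        if dist < dist_threshold:
            total += sum(d[i] * n[i] for i in range(3))  # dot product
            count += 1
    return total / count if count else 0.0  # 0.0 for an empty set (assumed)
```

Restricting the average to points near the line keeps the estimate local to the pivot, so points elsewhere in the leaf node do not bias α.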

[0127] In the current TriSoup coding, as described in conjunction with the preceding figures, the determined centroid C and the vertices $(V_0, \ldots, V_i)$ of the leaf node are used to construct triangles, and ray tracing is then applied to the constructed triangles to obtain the reconstructed points.

[0128] Point cloud data captured from natural scenes often has smooth surfaces, even in concave or convex regions. However, in current TriSoup coding, when concave or convex regions are modeled using triangles in leaf nodes, the centroid residual $C_{res}$ may be large (e.g., $C_{res}$ may be about half the width of a leaf node). The surface modeled by the triangles constructed inside the leaf node (using the centroid C and the vertices $(V_0, \ldots, V_i)$) then has large protrusions, and the resulting surface is far from naturally smooth and exhibits artifacts. For example, as shown in Figure 13, point 1301 represents a vertex along an edge of a leaf node, and point 1303 represents the centroid C of a leaf node. The small points 1305 represent points on the original surface (for clarity, only one of these small points is labeled 1305). It can be seen that a modeled surface with sharp protrusions does not model the original surface well enough, and the reconstructed surface may not retain the smoothness of the original surface. This can result in poor visual quality of the reconstructed point cloud, especially when the leaf node size is relatively large, i.e., at lower bit rates. Therefore, the compression efficiency of TriSoup coding in the current MPEG G-PCC is not optimal.

[0129] Therefore, the problem to be solved by the present invention is to improve the compression efficiency of point clouds by selectively reducing the non-smoothness of the reconstructed surface on the decoder side.

[0130] Therefore, according to the present invention, a method is proposed to refine the triangle modeling in leaf nodes based on information indicating whether the point cloud is planar data, so that the reconstructed surface on the decoder side is closer to the original surface. Specifically, if, for example, the point cloud is not planar data, then after the decoded vertices and the decoded centroid C have been obtained on the decoder side, the proposed method can be used to refine the vertices in the leaf nodes toward the convex or concave regions. The proposed method adjusts the positions of the vertices along a determined direction, thereby reducing the non-smoothness of the reconstructed point cloud without increasing the bitstream size.

[0131] In some embodiments, the method according to the present invention follows the steps of:

[0132] On the decoder side, after decoding the vertices of each leaf node from the bitstream, each leaf node can be iterated to construct triangles for obtaining the reconstructed point cloud, preferably by a ray tracing method.

[0133] In detail, for each leaf node, for example when the point cloud data is not planar data:

[0134] If the current (i.e., the processed) leaf node has more than 3 vertices,

[0135] First, the centroid C can be determined in order to construct a modeling surface consisting of triangles formed by the vertices and the centroid C. Specifically, the average point $C_{mean}$ of the vertices $(V_0, \ldots, V_i)$ in the leaf node and the unit vector $\vec{n}$ of the centroid residual are determined first. The magnitude $C_{res}$ of the centroid residual can then be decoded from the bitstream. Finally, the centroid C can be obtained as $C = C_{mean} + C_{res} \, \vec{n}$, where the vector $C_{res} \, \vec{n}$ from $C_{mean}$ to C can also be denoted $\vec{C}_{res}$.

[0136] Then, the convexity of the modeling surface in the leaf node can be determined; in a preferred embodiment, it is determined whether the centroid residual $C_{res}$ is large relative to the size of the leaf node.

[0137] If the centroid residual $C_{res}$ is relatively large compared to the leaf node size (in which case the surface constructed from the current vertices $(V_0, \ldots, V_i)$ and the centroid C is likely to exhibit sharp protrusion artifacts), then, among all vertices $(V_0, \ldots, V_i)$ in the leaf node, the vertices lying on edges parallel to the axis along which the vector $\vec{C}_{res}$ has its maximum value among the three axes are refined. The refinement is performed along the edges to which these vertices belong, within the boundaries of the edges of the leaf node, and in a direction that makes the reconstructed surface more natural. (Vertex refinement is described in detail later.)

[0138] Otherwise, the vertices $(V_0, \ldots, V_i)$ in the leaf node are not further refined.

[0139] Then, all vertices in the leaf node (including the refined vertices V' and the other, non-refined vertices V) are combined with the centroid C to construct triangles, as described above with reference to the figures, and ray tracing can be applied to each triangle to obtain the reconstructed points in the leaf node.

[0140] Otherwise, if the leaf node has 3 vertices, the processed leaf node has neither an average point $C_{mean}$ nor a centroid C, and only one triangle $V_1V_2V_3$ can be constructed; ray tracing can be applied to this triangle to obtain the reconstructed points in the leaf node.

[0141] Otherwise, if a leaf node has fewer than 3 vertices, no triangle will be constructed for the leaf node, and therefore no reconstruction point will be generated for the leaf node.

[0142] On the encoder side, after encoding the vertices of each leaf node into the bitstream, each leaf node can be iterated to encode information about the centroid position C. For subsequent attribute encoding, it may also be necessary to obtain the reconstructed geometry of the point cloud by constructing triangles based on the vertices and centroid positions C. Furthermore, to obtain a refined modeled surface in the leaf nodes, the leaf nodes can be iteratively processed using the same steps described above on the decoder side. Of course, information indicating whether the point cloud is planar is also encoded at the encoder.

[0143] In some embodiments, in order to determine whether the centroid residual $C_{res}$ is large relative to the size of the leaf node, the magnitude of the centroid residual is compared with a threshold Th that is based on the leaf node size N. The threshold Th can be a proportion s of the leaf node size, $Th = s \cdot N$, where the ratio s can be, for example, $s = 1/128$. Furthermore, in actual implementations, to improve the accuracy of division and multiplication operations, $C_{res}$ is multiplied by 64 (which can be achieved by a left shift of 6 bits), so the comparison threshold Th also needs to be multiplied by 64 to become $64 \cdot Th$. For a given leaf node, if $64 \cdot C_{res} > 64 \cdot Th$, the vertices in the leaf node can be refined according to the proposed method; otherwise, the proposed surface refinement method is not applied to the leaf node.
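
The scaled comparison can be sketched with integer arithmetic as follows. The function name, the representation of the ratio s as a numerator and denominator, the default value 1/128, and the use of the absolute value of the residual are assumptions made for illustration.

```python
def needs_refinement(c_res, node_size, s_num=1, s_den=128, shift=6):
    """Compare the residual magnitude with Th = s * N, both scaled by 2**shift."""
    scaled_residual = abs(c_res) << shift                     # 64 * |C_res|
    scaled_threshold = (node_size << shift) * s_num // s_den  # 64 * s * N
    return scaled_residual > scaled_threshold
```

Scaling both sides by 64 before dividing by the denominator of s keeps the comparison in integers while losing less precision than dividing the unscaled node size.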

[0144] As mentioned above, if the centroid residual $C_{res}$ is large relative to the size of the leaf node, then, among all vertices $(V_0, \ldots, V_i)$ in the leaf node, the vertices on edges parallel to the axis along which the vector $\vec{C}_{res}$ has its maximum value among the three axes are refined, along the edges to which these vertices belong, within the boundaries of the edges of the leaf node, and in a direction that makes the reconstructed surface more natural. In a preferred embodiment, detailed vertex refinement may follow these steps:

[0145] First, the axis along which the vector $\vec{C}_{res}$ has its maximum value, denoted axis_max, is determined;

[0146] Then, the vertices $(V_0, \ldots, V_i)$ on the edges parallel to axis_max in the leaf node are found;

[0147] Then, based on the centroid residual vector $\vec{C}_{res}$ of the leaf node, the axis_max coordinate of these vertices is changed to obtain refined vertices $V'$, which can be described as $V' = V + \text{offset} \cdot \vec{d}$,

[0148] where offset is the offset distance and $\vec{d}$ is a unit vector pointing in the offset direction. In a preferred embodiment, offset and $\vec{d}$ are determined based on $\vec{C}_{proj}$ and the boundary of the edge, where $\vec{C}_{proj}$ represents the projection of the centroid residual vector $\vec{C}_{res}$ onto the axis_max direction, and the boundary of the edge is a constraint that keeps the vertex from leaving its own leaf node. Furthermore, in such an embodiment, $\vec{d}$ can have the same direction as $\vec{C}_{proj}$, and offset can be obtained from the modulus of $\vec{C}_{proj}$ and the boundary of the edge of the leaf node. In a preferred embodiment, offset can be 1/3 of the modulus of $\vec{C}_{proj}$. Figure 14 illustrates a refined surface generated according to the present invention. Point 1401 on the edge is a vertex V used for modeling the surface according to the current TriSoup method. The arrow parallel to the X-axis represents the projection vector $\vec{C}_{proj}$ of the centroid residual, and point 1401a (only one of these points is labeled for clarity) is a refined vertex $V'$ obtained by moving the vertex V along the direction $\vec{d}$. As can be seen, the refined surface is closer to the original surface in the leaf node and reduces the sharp protrusion artifacts.
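
The refinement steps above can be sketched as follows. The sketch assumes that the maximum value is taken in absolute terms, that each vertex carries the axis of the cube edge it lies on, and that the offset is one third of the projected residual, clamped to the edge; the function and parameter names are hypothetical.

```python
def refine_vertices(vertices, edge_axes, c_res_vec, node_origin, node_size):
    """Shift vertices on edges parallel to axis_max along the projected residual.

    vertices:  list of (x, y, z) positions on the cube edges
    edge_axes: edge_axes[k] is the axis (0, 1, 2) of the edge carrying vertex k
    """
    # Axis along which the centroid residual vector is largest (absolute value).
    axis_max = max(range(3), key=lambda i: abs(c_res_vec[i]))
    proj = c_res_vec[axis_max]           # projection of the residual on axis_max
    offset = abs(proj) / 3.0             # assumed: 1/3 of the modulus
    direction = 1.0 if proj >= 0 else -1.0
    lo = node_origin[axis_max]
    hi = node_origin[axis_max] + node_size
    refined = []
    for v, axis in zip(vertices, edge_axes):
        v = list(v)
        if axis == axis_max:
            # Move along the edge, but never outside the leaf node.
            v[axis_max] = min(hi, max(lo, v[axis_max] + direction * offset))
        refined.append(tuple(v))
    return refined
```

Only one coordinate of each selected vertex changes, so a refined vertex stays on its original edge and can still be combined with the non-refined vertices and the centroid to build triangles.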

[0149] According to the present invention, point cloud data can be divided into two categories, planar data and non-planar data, based on whether planar regions predominate.

[0150] In planar data, since planar regions constitute the majority, the centroid residuals of many leaf nodes in TriSoup coding are 0. Figure 15 shows an example: planar point cloud data (a building with 4 walls, i.e., 1501) modeled using the TriSoup method (i.e., 1503). Since the front wall is planar, the centroid residuals of leaf nodes M and N are 0. When planar data is modeled, many leaf nodes thus have a centroid residual of 0 and require no refinement of their vertices. However, if the aforementioned vertex refinement method is used for all types of point cloud data, every leaf node is checked to determine whether its vertices are to be refined, and for many leaf nodes the result is negative. Performing such determinations for planar point cloud data is therefore time-consuming.

[0151] Similarly, the edge portions of planar data often contain right-angle corners, such as the one in leaf node O in Figure 15, and the centroid residual in such a leaf node is not zero. According to the refinement method described above, vertices in such a leaf node would therefore be offset and refined along the edge direction. For example, as shown in Figure 16, two vertices are moved in the refinement direction to obtain refined vertices. When the refined vertices are then combined with the other, non-refined vertices to construct triangles, the surface reconstructed from the triangles in leaf nodes N and O has gaps with respect to the original front surface of the point cloud. In this case, when vertex refinement is used to process planar data, the reconstructed point cloud cannot preserve the original planar features well.

[0152] As mentioned above, applying vertex refinement to planar point cloud data wastes time, because the centroid residual is zero in many leaf nodes. Furthermore, in the leaf nodes that model the surface at the corner portions of the original point cloud, where the centroid residual is not zero, it degrades the visual quality. Vertex refinement is therefore not suitable for planar point cloud data, and experimental results confirm that it leads to a loss of compression performance for such data.

[0153] However, based on our experimental observations and analysis of the above refinement method, this method can achieve better compression performance when applied to non-planar data.

[0154] Therefore, when vertex refinement is introduced, the present invention can further improve the overall compression performance of point cloud encoding for both planar and non-planar point cloud data.

[0155] One possible implementation of this invention is to introduce a flag (e.g., TrisoupPlaneFlag) indicating whether the point cloud is planar data, in order to enable or disable vertex refinement in TriSoup coding. If the flag is true (the point cloud data is planar data), the vertex refinement method is disabled; otherwise, it is enabled. The proposed method avoids the loss caused by vertex refinement on planar data and achieves optimal overall compression efficiency for point cloud coding.

[0156] In some embodiments, the TrisoupPlaneFlag indicating whether the point cloud is planar data can be obtained from the features of the input point cloud data; the flag is written into the GPS (geometry parameter set) and encoded into the bitstream at the encoder, and is then decoded from the bitstream at the decoder and used to enable or disable the vertex refinement method.

[0157] In some embodiments, on the encoder side, the flag TrisoupPlaneFlag indicating whether the point cloud is planar data can be determined based on the percentage of leaf nodes whose centroid residual is 0. Specifically, on the encoder side, before triangles are constructed within each leaf node, the leaf nodes are iterated to calculate the percentage of leaf nodes whose centroid residual is 0. Let $N_{ctrod\_is\_0}$ be the number of leaf nodes with zero centroid residual and $N_{total}$ the total number of leaf nodes in the point cloud; the percentage $P_0$ can then be obtained through the following formula:

[0158] $P_0 = N_{ctrod\_is\_0} / N_{total}$,

[0159] and the percentage $P_0$ is compared with a threshold $Th_0$: if $P_0 > Th_0$, TrisoupPlaneFlag is set to true; otherwise, it is set to false. The threshold $Th_0$ can be defined by the user; preferably, it is a number greater than 0.7 and less than 1. The flag is then written into the GPS (geometry parameter set) and encoded into the bitstream at the encoder, decoded from the bitstream at the decoder, and used to enable or disable the vertex refinement method.
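
The encoder-side decision can be sketched as follows. The function name, the default threshold of 0.8 (one value within the preferred range of 0.7 to 1), the strict comparison, and the handling of an empty list are assumptions made for illustration.

```python
def trisoup_plane_flag(centroid_residuals, threshold=0.8):
    """True when the share of leaf nodes with zero centroid residual is high."""
    n_total = len(centroid_residuals)
    if n_total == 0:
        return False  # no leaf nodes: treated as non-planar (assumed)
    n_zero = sum(1 for r in centroid_residuals if r == 0)
    return n_zero / n_total > threshold
```

Since the flag is computed once per point cloud and carried in the parameter set, the decoder does not need to repeat this count.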

[0160] In some embodiments, the flag TrisoupPlaneFlag indicating whether a point cloud is planar data can be determined automatically based on the direction of the normal vector $\vec{n}$ of each leaf node. In detail:

[0161] First, the leaf nodes are iterated to count the number $N_{same}$ of leaf nodes whose normal vectors $\vec{n}$ point in the same direction.

[0162] Then, after the iteration, the percentage $P_{same}$ is calculated using the following formula:

[0163] $P_{same} = N_{same} / N_{total}$,

[0164] where $N_{total}$ is the total number of leaf nodes in the point cloud, and $P_{same}$ can be used to indicate whether the encoded point cloud is planar data. A threshold $Th_{same}$ needs to be determined: if $P_{same} > Th_{same}$, the encoded point cloud is planar data; otherwise, it is not planar data. The threshold $Th_{same}$ can be defined by the user; preferably, it is a number greater than 0.7 and less than 1.

[0165] In some embodiments, the present invention follows the steps of:

[0166] On the decoder side, the TrisoupPlaneFlag can first be decoded from the bitstream so that it can be used in the subsequent TriSoup decoding process. During TriSoup decoding, after the vertices of each leaf node have been decoded from the bitstream, the leaf nodes can be iterated to construct triangles for obtaining the reconstructed point cloud using ray tracing.

[0167] In detail, for each leaf node,

[0168] If the leaf node being processed has more than 3 vertices

[0169] First, the centroid C can be determined in order to construct a modeling surface consisting of triangles formed by the vertices and the centroid C. Specifically, the average point $C_{mean}$ of the vertices $(V_0, \ldots, V_i)$ in the leaf node and the unit vector $\vec{n}$ of the centroid residual can be determined first. Then, the magnitude $C_{res}$ of the centroid residual can be decoded from the bitstream, and the centroid C can be obtained as $C = C_{mean} + C_{res} \, \vec{n}$, where the vector $C_{res} \, \vec{n}$ from $C_{mean}$ to C can also be denoted $\vec{C}_{res}$.

[0170] Then, the TrisoupPlaneFlag flag can be used to determine whether the current point cloud is planar data, in order to decide whether the vertices in the current leaf node need to be refined.

[0171] If the TrisoupPlaneFlag flag is true, the vertex refinement method is disabled;

[0172] Otherwise, the vertex refinement method is enabled and then applied to the vertices in the leaf node. In one embodiment, the protrusion of the modeled surface in the leaf node can first be determined; for example, it can be determined whether the centroid residual $C_{res}$ is large relative to the size of the leaf node.

[0173] If the centroid residual $C_{res}$ is large relative to the size of the leaf node, then, among all vertices $(V_0, \ldots, V_i)$ in the leaf node, the vertices on edges parallel to the axis along which the vector $\vec{C}_{res}$ has its maximum value among the three axes are refined, along the edges to which these vertices belong, within the boundaries of the edges of the leaf node, and in a direction that makes the reconstructed surface more natural.

[0174] Otherwise, the vertices $(V_0, \ldots, V_i)$ in the leaf node do not need to be refined.

[0175] Then, triangles are constructed by combining all vertices in the leaf node (including the refined vertices V' and the other, non-refined vertices V) with the centroid C, and ray tracing can be applied to each triangle to obtain the reconstructed points in the leaf node.

[0176] Otherwise, if the leaf node has 3 vertices, the processed leaf node has neither an average point $C_{mean}$ nor a centroid C, and only one triangle $V_1V_2V_3$ can be constructed. Ray tracing can be applied to this triangle to obtain the reconstructed points in the leaf node.

[0177] Otherwise, if a leaf node has fewer than 3 vertices, a triangle will not be constructed for the leaf node, and therefore no reconstruction point will be generated for the leaf node.
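
The per-leaf-node decision flow on the decoder side can be summarized by the following sketch. It only returns a label for the branch taken; the function name, the labels, and the default ratio s = 1/128 are assumptions made for illustration.

```python
def leaf_node_action(num_vertices, plane_flag, c_res, node_size, s=1 / 128):
    """Select the reconstruction branch for one leaf node."""
    if num_vertices < 3:
        return "skip"                      # no triangle, no reconstructed point
    if num_vertices == 3:
        return "single_triangle"           # V1V2V3, no centroid
    if plane_flag:
        return "centroid_fan"              # planar data: refinement disabled
    if abs(c_res) > s * node_size:
        return "refine_then_centroid_fan"  # large residual: refine vertices first
    return "centroid_fan"                  # small residual: no refinement
```

The planar flag is tested before the residual, so for planar data no leaf node pays for the residual comparison.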

[0178] On the encoding side, after encoding the vertices of each leaf node into a bitstream, each leaf node can then be iterated to encode information about the centroid position C. Furthermore, for subsequent attribute encoding, it may be necessary to obtain the reconstructed geometry of the point cloud by constructing triangles based on the vertices and centroid positions C. Before constructing triangles at the encoder, the leaf nodes are iteratively processed following the same steps described above on the decoder side.

[0179] In some embodiments, the flag TrisoupPlaneFlag indicating whether a point cloud is planar data can be determined automatically based on the direction of the normal vector $\vec{n}$ of each leaf node, as mentioned earlier. For point clouds with more than one plane, a variant of this method can be used. In detail:

[0180] First, three categories can be defined based on the direction of the normal vector $\vec{n}$: parallel to the x-axis, parallel to the y-axis, or parallel to the z-axis.

[0181] Then, the leaf nodes can be iterated to count separately the number $N_x$ of leaf nodes whose normal vector is parallel to the x-axis, the number $N_y$ of leaf nodes whose normal vector is parallel to the y-axis, and the number $N_z$ of leaf nodes whose normal vector is parallel to the z-axis.

[0182] Then, after the iteration, the percentage $P_x$ is calculated using the following formula:

[0183] $P_x = N_x / N_{total}$,

[0184] the percentage $P_y$ is calculated using the following formula:

[0185] $P_y = N_y / N_{total}$,

[0186] and the percentage $P_z$ is calculated using the following formula:

[0187] $P_z = N_z / N_{total}$,

[0188] where $N_{total}$ is the total number of leaf nodes in the point cloud, and $P_x$, $P_y$ and $P_z$ can be used together to indicate whether the encoded point cloud is planar data. A threshold may also need to be determined: if the condition on the percentages relative to the threshold is met, the encoded point cloud is planar data; otherwise, it is not planar data. The threshold is preferably a number greater than 0.7 and less than 1.

[0189] Reference is now made to Figure 17, which shows a simplified block diagram of an example embodiment of an encoder or decoder 300. The encoder or decoder 300 includes a processor 301 and a memory storage device 303. The memory storage device 303 may store a computer program or application containing instructions that, when executed, cause the processor 301 to perform operations such as those described herein. For example, the instructions may encode and output an encoded bitstream, or decode a bitstream and output points of a point cloud, according to the methods described herein. It should be understood that the instructions may be stored on a non-transitory computer-readable medium, such as an optical disc, flash memory, random access memory, hard disk drive, etc. When the instructions are executed, the processor 301 performs the operations and functions specified in the instructions so as to operate as a dedicated processor implementing the processes. In some examples, such a processor may be referred to as a "processor circuit" or "processor circuitry."

[0190] It is understood that the decoder and / or encoder according to this application can be implemented in many computing devices, including but not limited to servers, appropriately programmed general-purpose computers, machine vision systems, and mobile devices. The decoder or encoder can be implemented by software containing instructions for configuring one or more processors to perform the functions described herein. The software instructions can be stored on any suitable non-transitory computer-readable storage medium, including CDs, RAM, ROM, flash memory, etc.

[0191] It should be understood that the decoders and / or encoders described herein, as well as modules, routines, procedures, threads, or other software components implementing the methods / processes for configuring the encoders or decoders, can be implemented using standard computer programming techniques and languages. This application is not limited to specific processors, computer languages, computer programming conventions, data structures, or other such implementation details. Those skilled in the art will recognize that the described processes can be implemented as part of computer-executable code stored in volatile or non-volatile memory, as part of an application-specific integrated circuit (ASIC), etc.

[0192] Certain adjustments and modifications can be made to the described embodiments. Therefore, the embodiments discussed above are considered illustrative rather than restrictive. In particular, the embodiments can be freely combined with each other.

Claims

1. A method for decoding the geometry of a 3D point cloud from a bitstream, implemented in a decoder, the method comprising: receiving a bitstream, wherein the bitstream includes vertex information, the vertex information including information about the positions of vertices on the edges of the cubes of the leaf nodes of an octree structure of the point cloud; and obtaining the vertex information to determine the positions of the vertices; characterized in that the bitstream further includes planar information indicating whether the point cloud data is planar, and the method further comprises: determining, based on the planar information, whether to adjust at least one vertex toward the average position of surface points; if it is determined that the adjustment is to be performed, performing the adjustment on the vertex; and determining the points of the point cloud based on the at least one adjusted vertex.

2. The method of claim 1, wherein obtaining the vertex information to determine the positions of the vertices, with at least one vertex being adjusted toward the average position of the surface points, comprises: determining a virtual position based on the positions of the vertices of the cube associated with a leaf node of the octree structure; constructing triangles based on the vertices of the cube and the virtual position; determining a normal vector of the cube based on the constructed triangles; determining a centroid position based on the normal vector; and adjusting at least one of the vertices toward the centroid position.

3. The method of claim 2, wherein at least one vertex is adjusted only when the difference between the virtual position and the centroid position exceeds a threshold, wherein the threshold is determined based on the size of the leaf node.

4. The method according to claim 3, wherein the threshold is a proportion s of the size of the leaf node.

5. The method of claim 4, wherein s = 1/128.

6. The method of any one of claims 1 to 5, wherein at least one of the vertices is adjusted according to a fixed value, wherein the fixed value is determined based on the type of the point cloud and is encoded into or decoded from the bitstream.

7. The method according to claim 2, wherein at least one vertex is adjusted based on the difference between the virtual position and the centroid position, and based on the boundary of the edge of the leaf node.

8. The method according to claim 1, wherein the planar information is encoded into the geometry parameter set (GPS).

9. A method for encoding a 3D point cloud into a bitstream, implemented in an encoder, the method comprising: obtaining, from the point cloud surface, vertex information for each cube associated with a leaf node, wherein the vertex information includes information about the positions of vertices on the edges of the cubes of the leaf nodes of the octree structure of the point cloud; encoding the vertex information into a bitstream, wherein the bitstream further includes planar information indicating whether the point cloud data is planar; determining, based on the planar information, whether to adjust at least one vertex toward the average position of surface points; if it is determined that the adjustment is to be performed, performing the adjustment on the vertex; and determining the points of the point cloud based on the encoded vertex information.

10. The method of claim 9, wherein obtaining the vertex information to determine the positions of the vertices, with at least one vertex being adjusted toward the average position of surface points, comprises: determining a virtual position based on the positions of the vertices of the cube associated with a leaf node of the octree structure; constructing triangles based on the vertices of the cube and the virtual position; determining a normal vector of the cube based on the constructed triangles; determining a centroid position based on the normal vector; and adjusting at least one of the vertices toward the centroid position.

11. The method according to claim 10, wherein at least one vertex is adjusted only when the difference between the virtual position and the centroid position exceeds a threshold, wherein the threshold is determined based on the size of the leaf node.

12. The method of claim 11, wherein the threshold is a proportion s of the size of the leaf node.

13. The method according to claim 12, wherein s = 1/128.

14. The method according to any one of claims 9 to 13, wherein at least one of the vertices is adjusted according to a fixed value, wherein the fixed value is determined based on the type of the point cloud and is encoded into or decoded from the bitstream.

15. The method according to claim 10, wherein at least one vertex is adjusted based on the difference between the virtual position and the centroid position, and based on the boundary of the edge of the leaf node.

16. The method according to claim 9, wherein the planar information is encoded into the geometry parameter set (GPS).

17. An encoder for encoding 3D point clouds into a bitstream, the encoder comprising at least one processor and a memory, wherein the memory stores instructions which, when executed by the processor, perform the steps of the method according to any one of claims 9 to 16.

18. A decoder for decoding 3D point clouds from a bitstream, the decoder comprising at least one processor and a memory, wherein the memory stores instructions which, when executed by the processor, perform the steps of the method according to any one of claims 1 to 8.

19. A computer-readable storage medium comprising instructions that, when executed by a processor, perform the steps of the method according to any one of claims 1 to 8 or claims 9 to 16.

20. A method for transmitting a bitstream, characterized by comprising: generating a bitstream by performing the point cloud encoding method according to claim 9; and transmitting the bitstream.