Method for encoding and decoding 3D point clouds, encoder, and decoder

JP7881066B2 · Active · Publication Date: 2026-06-26 · BEIJING XIAOMI MOBILE SOFTWARE CO LTD

Patent Information

Authority / Receiving Office
JP · JP
Patent Type
Patents
Current Assignee / Owner
BEIJING XIAOMI MOBILE SOFTWARE CO LTD
Filing Date
2022-10-17
Publication Date
2026-06-26

Abstract

A method for decoding the geometry of a 3D point cloud from a bitstream, preferably implemented in a decoder, the method comprising the steps of: receiving and decoding a bitstream, wherein the bitstream comprises octree information and vertex information, the octree information comprising information about the octree structure of a volume of the point cloud, and the vertex information comprising information about the presence and location of vertices on edges of cubes associated with leaf nodes of the octree structure; determining triangles by connecting the vertices of one cube associated with a leaf node of the octree structure; and voxelizing the triangles to determine points of the point cloud. The method further comprises the steps of: determining whether additional information included in the bitstream satisfies a predefined condition, the additional information being determined based on the density of the point cloud, preferably evaluated by a sampling distance d_sampl of the point cloud; and, if the predefined condition is met, performing the voxelization by expanding at least one triangle along at least one side based on the sampling distance d_sampl.

Description

[Technical Field]

[0001] The present invention relates to a method for decoding a 3D point cloud from a bitstream. Furthermore, the present invention aims to provide a method for encoding a 3D point cloud into a bitstream. In addition, the present invention aims to provide an encoder and a decoder, a bitstream encoded by the present invention, and software. In particular, the present invention aims to provide a method for improving the accuracy of the 3D point cloud decoding or reconstruction process and enabling better overall compression performance.

[Background Art]

[0002] Point clouds have attracted considerable attention as a format for representing 3D data, because they can represent all types of 3D objects or scenes. Many use cases can therefore be addressed with point clouds, such as the following:
• Post-production of films
• Real-time 3D immersive telepresence or virtual reality (VR) / augmented reality (AR) applications
• Free-viewpoint video (for example, for watching sports)
• Geographic information systems (also known as cartography)
• Cultural heritage (digital preservation of scans of rare items)
• Autonomous driving, including 3D mapping of the environment and real-time LiDAR data acquisition

[0003] A point cloud is a set of points in 3D space, to which additional values can optionally be assigned. These additional values are typically called point attributes. A point cloud is therefore a combination of geometry (the 3D location of each point) and attributes.

[0004] The attributes may be, for example, three-component colors, material properties such as reflectance, and/or two-component normal vectors to a surface associated with a point.

[0005] Point clouds can be captured using a variety of devices, such as camera arrays, depth sensors, LiDAR, and scanners, or they can be generated on a computer (for example, during post-production of a film). Depending on the use case, point clouds can contain from thousands up to billions of points, the latter in mapping applications.

[0006] The original representation of a point cloud requires a very high number of bits per point, with each spatial component X, Y, or Z requiring at least 12 bits, and optionally, attributes requiring even more bits; for example, color requires three times 10 bits. To actually implement point cloud applications, compression techniques are needed that allow the point cloud to be stored and distributed via appropriate storage and transmission infrastructure.

[0007] For distribution to and visualization by end users, for example on AR/VR glasses or other 3D-capable devices, compression may be lossy (as in video compression). In other use cases, such as medical applications and autonomous driving, lossless compression is required so that decisions derived from analysis of the compressed and transmitted point cloud are not altered.

[0008] Until recently, point cloud compression (also known as PCC) had not been addressed by the mass market, and no standardized point cloud codec was available. In 2017, the standardization working group ISO/IEC JTC1/SC29/WG11 (also known as the Moving Picture Experts Group, or MPEG) launched a work item on the subject. As a result, two standards were established, namely:
• MPEG-I Part 5 (ISO/IEC 23090-5), or Video-based Point Cloud Compression (V-PCC)
• MPEG-I Part 9 (ISO/IEC 23090-9), or Geometry-based Point Cloud Compression (G-PCC)

[0009] The first versions of the V-PCC and G-PCC standards were completed at the end of 2020 and were made available to the market soon afterwards.

[0010] The V-PCC encoding method compresses a point cloud by performing multiple projections of a 3D object to obtain 2D patches that are packed into an image (or into a video when processing a moving point cloud). The resulting images or videos can then be compressed using existing image/video codecs, leveraging image and video solutions that are already deployed. By its nature, V-PCC is only effective for dense, continuous point clouds, because image/video codecs cannot compress non-smooth patches such as those obtained from projections of sparse geometric data collected by LiDAR.

[0011] There are two types of geometry compression methods in the G-PCC encoding method.

[0012] The first method is based on an occupancy tree (octree / quadtree / binary tree) representation of the point cloud geometry. Occupied nodes are split until they reach a certain size, and the occupied leaf nodes provide the locations of the points, usually at the center of these nodes. By using neighbor-based prediction techniques, a high level of compression can be achieved for dense point clouds. Sparse point clouds are also handled, by directly coding the positions of points within a node of non-minimal size and stopping tree construction when only isolated points are present in a node; this technique is called Direct Coding Mode (DCM).
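As a rough illustration of the occupancy-tree idea, the following Python sketch splits a cubic volume recursively and records one 8-bit occupancy code per internal node in breadth-first order. It is a minimal sketch under stated assumptions, not the G-PCC coder: the function name encode_occupancy, the child-bit ordering and the unit-size leaves are choices made here for illustration, and the neighbor-based prediction, entropy coding and DCM mentioned above are omitted.

```python
from collections import deque

def encode_occupancy(points, depth):
    """Return one 8-bit occupancy code per internal node, breadth-first.

    points: iterable of integer (x, y, z) tuples in [0, 2**depth).
    Illustrative convention (an assumption): x selects bit 2 of the
    child index, y bit 1, z bit 0.
    """
    codes = []
    queue = deque([((0, 0, 0), depth, list(set(points)))])
    while queue:
        origin, level, pts = queue.popleft()
        if level == 0:
            continue  # unit-size leaf: its position is implied by the path
        half = 1 << (level - 1)
        children = [[] for _ in range(8)]
        for p in pts:
            idx = 0
            for axis in range(3):
                if p[axis] - origin[axis] >= half:
                    idx |= 1 << (2 - axis)
            children[idx].append(p)
        code = 0
        for idx, child in enumerate(children):
            if child:
                code |= 1 << idx  # mark this child as occupied
                child_origin = tuple(
                    origin[a] + (half if (idx >> (2 - a)) & 1 else 0)
                    for a in range(3)
                )
                queue.append((child_origin, level - 1, child))
        codes.append(code)
    return codes

codes = encode_occupancy([(0, 0, 0), (3, 3, 3)], depth=2)
```

Only occupied children are enqueued, which is why empty regions of the volume cost nothing beyond a zero bit in their parent's code.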

[0013] The second method is based on a prediction tree, where each node represents the 3D location of a single point and the relationships between nodes are spatial predictions from parent nodes to child nodes. This method can only process sparse point clouds, and it has the advantages of lower latency and simpler decoding than the occupancy tree. However, compared to the occupancy-based first method, its compression performance is only slightly better and the encoding is more complex, because constructing the prediction tree requires an intensive search for the best predictor (among a long list of potential predictors).

[0014] In both methods, attribute coding (decoding) is performed after geometry coding (decoding) is complete, which results in two-pass coding. Low latency is therefore obtained by using slices that decompose the 3D space into subvolumes that are coded independently, without prediction between subvolumes. Using many slices can significantly degrade compression performance.

[0015] One important use case is the transmission of dynamic AR/VR point clouds, where dynamic means that the point cloud evolves over time. Because AR/VR point clouds represent the surface of an object most of the time, they are usually locally 2D. AR/VR point clouds are therefore highly connected (i.e., dense), in the sense that points are rarely isolated and instead have many neighboring points.

[0016] A dense (or solid) point cloud represents a continuous surface at a resolution such that the volumes associated with the points (small cubes called voxels) touch each other without leaving any visible holes in the surface.

[0017] Such point clouds are typically used in AR/VR environments, where end users view them on devices such as TVs, smartphones, and headsets. They are either transmitted to the device or stored locally. Many AR/VR applications use moving point clouds that change over time rather than static ones, so the amount of data is enormous and must be compressed. Currently, lossless compression based on an octree representation of the point cloud geometry can achieve slightly less than 1 bit per point (bpp). This may be insufficient for real-time transmission, which can involve millions of points per frame at frame rates of up to 50 frames per second (fps), leading to hundreds of megabits of data per second.

[0018] Lossy compression can therefore be used, with the usual requirement of maintaining acceptable visual quality while compressing enough to fit the bandwidth of the transmission channel and still transmit frames in real time. In many applications, real-time transmission becomes possible at bitrates as low as 0.1 bpp (i.e., ten times more compression than lossless coding).
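The orders of magnitude quoted above can be checked with a short calculation. The following Python sketch assumes a frame of 4 million points, which is an illustrative value chosen here and not taken from the text; the helper name bitrate_mbps is likewise hypothetical.

```python
def bitrate_mbps(points_per_frame, fps, bits_per_point):
    """Geometry bitrate in megabits per second."""
    return points_per_frame * fps * bits_per_point / 1e6

POINTS_PER_FRAME = 4_000_000  # assumed frame size, for illustration only
FPS = 50                      # frame rate mentioned in the text

lossless = bitrate_mbps(POINTS_PER_FRAME, FPS, 1.0)  # about 1 bpp
lossy = bitrate_mbps(POINTS_PER_FRAME, FPS, 0.1)     # about 0.1 bpp

print(f"lossless: {lossless:.0f} Mbit/s, lossy: {lossy:.0f} Mbit/s")
```

Under these assumptions the lossless stream needs about 200 Mbit/s, while the lossy stream at 0.1 bpp needs about 20 Mbit/s.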

[0019] The V-PCC codec, based on MPEG-I Part 5 (ISO/IEC 23090-5) or Video-based Point Cloud Compression, can achieve this low bitrate through lossy compression, using a video codec to compress 2D frames obtained by projecting the point cloud onto planes. The geometry is represented by a series of projected patches assembled into a frame, each patch being a small local depth map. However, V-PCC is not general-purpose and is limited to a narrow class of point clouds that do not exhibit locally complex geometry (such as trees or hair), because for such geometry the resulting projected depth maps are not smooth and cannot be compressed efficiently by the video codec.

[0020] Pure 3D compression techniques can handle all types of point clouds. Whether 3D compression techniques can compete with V-PCC (or any projection plus image coding scheme) on dense point clouds remains an open question. Standardization is moving towards an extension (amendment) of G-PCC that would offer competitive lossy compression, matching V-PCC intra coding on dense point clouds while retaining the versatility of G-PCC and accommodating any type of point cloud (dense, LiDAR, 3D maps). This extension may use a so-called TriSoup (triangle soup) scheme built on the octree, which is under consideration in the ISO/IEC standardization working group JTC1/SC29/WG7. For more information on TriSoup coding, see A. Dricot et al., "Adaptive multi-level triangle soup for geometry-based point cloud coding", 21st IEEE International Workshop on Multimedia Signal Processing (MMSP), 2019; O. Nakagami, "Report on triangle soup decoding", ISO/IEC JTC1/SC29/WG11 document m52279, 2020; and US 10,192,353.

[0021] However, in all lossy compression methods, the quality of point cloud reconstruction is crucial.

[Summary of the Invention]

[0022] Therefore, the present invention aims to provide a method for decoding the geometry of a 3D point cloud from a bitstream, and a method for encoding a 3D point cloud into a bitstream, wherein the method improves accuracy.

[0023] This problem is solved by the decoding method described in claim 1, the encoding method described in claim 2, the encoder described in claim 16, the decoder described in claim 17, the bitstream described in claim 18, and the software described in claim 19.

[0024] In a first aspect of the present invention, a method for decoding the geometry of a 3D point cloud from a bitstream is provided, preferably implemented in a decoder. The method comprises the steps of:
• receiving and decoding a bitstream, wherein the bitstream includes octree information and vertex information, the octree information includes information about the octree structure of a volume of the point cloud, and the vertex information includes information about the presence and position of vertices on the edges of the cubes associated with leaf nodes of the octree structure;
• determining triangles by connecting the vertices of a cube associated with a leaf node of the octree structure; and
• voxelizing the triangles to determine the points of the point cloud.
The method further comprises the steps of:
• determining whether additional information contained in the bitstream satisfies a predefined condition, wherein the additional information is determined based on the density of the point cloud, preferably evaluated by the sampling distance d_sampl of the point cloud; and
• if the predefined condition is met, performing the voxelization by extending at least one triangle along at least one side based on the sampling distance d_sampl.

[0025] Therefore, in the first step, a bitstream is received that contains information about the octree structure of the volume of the point cloud to be decoded. Preferably, the geometry of the point cloud is G-PCC encoded. Octree information about the point cloud volume is thus obtained by decoding the bitstream. The bitstream further contains vertex information, which includes information about the presence and location of vertices on the edges of the cubes associated with the leaf nodes of the octree structure. The vertex information is likewise obtained by decoding the bitstream. Here, the bitstream is preferably encoded by the encoder using the TriSoup coding scheme.

[0026] After the octree information and vertex information have been decoded from the bitstream as described above, the next step in reconstructing the point cloud geometry is to determine triangles for each cube by connecting the vertices on the edges of the cube. The surface of each triangle is thus determined by the vertex positions contained in the bitstream. To reconstruct the points of the point cloud from the triangles, voxelization is performed by ray tracing: rays are launched along three directions, each parallel to one of the three coordinate axes, from origins that are points with integer coordinates corresponding to the sampling precision required for rendering. The intersection of each ray with any of the triangles, if there is one, is determined and added to the list of rendered points, i.e., to the points of the point cloud. In this way the surfaces of the triangles are sampled by the rays during voxelization to determine the points of the point cloud.
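The ray-tracing voxelization described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the codec's actual routine: the function name voxelize_triangle is hypothetical, a single triangle is processed without any halo, the intersection is found through 2D barycentric coordinates of the ray origin in the projected triangle, and the intersection depth is rounded to the nearest integer with Python's built-in round.

```python
import math

def voxelize_triangle(tri):
    """Sample one triangle with axis-parallel rays from integer origins.

    tri: three (x, y, z) vertices. Returns a set of integer points.
    """
    points = set()
    for axis in range(3):  # direction of the rays
        a, b = [i for i in range(3) if i != axis]  # coordinates of the ray origin
        (a0, b0), (a1, b1), (a2, b2) = [(v[a], v[b]) for v in tri]
        det = (a1 - a0) * (b2 - b0) - (a2 - a0) * (b1 - b0)
        if abs(det) < 1e-12:
            continue  # triangle is parallel to the rays: no unique intersection
        a_lo, a_hi = math.ceil(min(a0, a1, a2)), math.floor(max(a0, a1, a2))
        b_lo, b_hi = math.ceil(min(b0, b1, b2)), math.floor(max(b0, b1, b2))
        for ia in range(a_lo, a_hi + 1):
            for ib in range(b_lo, b_hi + 1):
                # Barycentric coordinates (u, v, w) of the ray origin
                # in the triangle projected along the ray direction.
                v = ((ia - a0) * (b2 - b0) - (a2 - a0) * (ib - b0)) / det
                w = ((a1 - a0) * (ib - b0) - (ia - a0) * (b1 - b0)) / det
                u = 1.0 - v - w
                if u < 0 or v < 0 or w < 0:
                    continue  # the ray misses the triangle
                depth = u * tri[0][axis] + v * tri[1][axis] + w * tri[2][axis]
                p = [0, 0, 0]
                p[a], p[b], p[axis] = ia, ib, round(depth)
                points.add(tuple(p))
    return points

pts = voxelize_triangle([(0, 0, 0), (4, 0, 0), (0, 4, 0)])
```

For the flat example triangle only the rays parallel to the z axis produce intersections, since the triangle is parallel to the other two ray directions.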

[0027] In this context, according to the present invention, there are different methods for determining a triangle based on additional information contained in the bitstream.

[0028] Method 1, adaptive halo: for or during voxelization, at least one triangle is extended along at least one side based on the sampling distance d_sampl of the point cloud, so that the surface of the triangle is enlarged along at least one direction. Here, the sampling distance is a property of the original point cloud data; assuming that no points were lost during data acquisition, it corresponds to the actual distance between sampled points of the point cloud, expressed in units of the sampling resolution. d_sampl is set, for example, by the device that acquires the points of the point cloud (e.g., LiDAR). Expanding the triangle during voxelization makes it possible to reliably recover additional points of the original point cloud that would otherwise be missed, which improves the accuracy of the voxelization process. Because the triangle is sampled with a fixed precision and sampling resolution, enlarging its surface along at least one side captures points of the point cloud that lie just outside the triangle. Since the expansion is based on the sampling distance of the point cloud, it is applicable to any point cloud regardless of its sampling distance. Preferably, the expansion is directly proportional to the sampling distance, so that the triangle is expanded further as the sampling distance increases. Details of the adaptive halo method are described in the dependent claims. In many cases, this achieves higher accuracy in reconstructing the 3D point cloud and reduces the number of sampling errors during voxelization, while the complexity of the encoding and/or decoding algorithms is maintained. However, this method does not work well for all types of point clouds, and in some cases it may degrade compression performance.

[0029] Method 2, fixed or other value halo: in contrast to the adaptive halo, the expansion of the triangle in this method is based on a fixed value that is independent of the sampling distance of the point cloud. It should be understood that Method 2 may also be another approach to determining the triangle, for example not expanding the triangle at all.

[0030] Therefore, according to the present invention, additional information contained in the bitstream is used to select between an adaptive halo scheme or a non-adaptive halo scheme. By introducing such an indicator, each aspect of the present invention can be applied to an appropriate use case and can achieve better overall compression performance than a solution that implements only a single aspect.
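The selection between the two schemes can be pictured with the following Python sketch. It is only an illustration under stated assumptions: the function name select_halo and its parameters are hypothetical, the additional information is modeled as a boolean flag, the proportionality factor 1/4 is an illustrative value, and the fixed halo defaults to 0 (no expansion), which is one of the options mentioned for Method 2.

```python
def select_halo(adaptive_flag, d_sampl, fraction=0.25, fixed_halo=0.0):
    """Return the halo used to expand triangles during voxelization.

    adaptive_flag: additional information decoded from the bitstream
                   (modeled here as a boolean; an assumption).
    d_sampl:       sampling distance of the point cloud.
    fraction:      illustrative proportionality factor (assumption).
    fixed_halo:    value used by the non-adaptive scheme (assumption).
    """
    if adaptive_flag:
        # Method 1: halo directly proportional to the sampling distance.
        return fraction * d_sampl
    # Method 2: halo independent of the sampling distance.
    return fixed_halo

halo_adaptive = select_halo(True, d_sampl=2.0)
halo_fixed = select_halo(False, d_sampl=2.0)
```

With a sampling distance of 2, the adaptive branch yields a halo of 0.5, whereas the non-adaptive branch ignores the sampling distance entirely.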

[0031] Preferably, at least one triangle is extended along more than one side to further enlarge the surface of the corresponding triangle. A triangle can thus be extended along one side, two sides, or all three sides so as to include points of the original point cloud that lie just beyond the triangle determined by the vertices on the edges of the cube.

[0032] Preferably, if one cube of a leaf node in the octree structure contains one or more triangles, each triangle in this cube is extended along at least one edge and used for voxelization. Thus, the extension of the surface of a triangle can be applied to all triangles in the cube. Alternatively, or in addition to this, in each cube of the octree structure, at least one triangle is extended along at least one edge and used for voxelization. Alternatively, the extension of one or more edges of a triangle is applied only to a subset of leaf nodes in the octree structure, where this subset may be determined, for example, by the application, the density of points in the leaf nodes of the point cloud, or the requirements for accuracy and decoding speed. More preferably, one or more edges of a triangle are extended based on a local sampling distance. Thus, the triangles in each subset of leaf nodes can be extended to achieve locally optimal performance.

[0033] Preferably, the extension of each side is the same. Therefore, in order to enlarge the surface of the triangle, the triangle is extended by the same amount in at least two directions. More preferably, the amount of extension in all three directions is the same. Alternatively, the extensions along at least two directions are different. Therefore, by treating different directions differently, the decoding accuracy is improved.

[0034] Preferably, the leaf nodes of the octree structure have the same or different expansions. If more than one side of a triangle is extended, or the sides are extended by different amounts, in one leaf node of the octree structure, the expansion in other leaf nodes of the octree structure may be the same or different. Here, the expansion may be selected in advance, for example determined by the application, the density of points in the leaf node of the point cloud, or the requirements on accuracy or decoding speed.

[0035] Preferably,

[Equation]

[0036] Preferably,

[Equation]

[Equation]

[0037] Preferably, the convex hull condition is set to -ε_u_a ≤ u, -ε_v_a ≤ v and -ε_w_a ≤ w, where ε_u_a, ε_v_a, ε_w_a ≥ 0, u, v and w are the barycentric coordinates of the triangle, and at least one of ε_u_a, ε_v_a and ε_w_a is determined based on the sampling distance d_sampl of the point cloud. The extension of the triangle under consideration can thus be controlled individually by providing separate convex hull conditions for the different directions. Here, ε_u_a ≠ ε_w_a; alternatively or additionally, ε_u_a ≠ ε_v_a; alternatively or additionally, ε_v_a ≠ ε_w_a. The expansion in one or more directions can therefore be chosen independently of the other directions and determined separately.
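The effect of the relaxed condition can be seen in a small Python sketch. The helper names barycentric_2d and inside_with_halo are hypothetical, the triangle is taken in 2D for simplicity, and the halo values are illustrative; with all halo values at 0 the test reduces to the ordinary inside-triangle test u, v, w ≥ 0.

```python
def barycentric_2d(p, a, b, c):
    """Barycentric coordinates (u, v, w) of 2D point p in triangle (a, b, c)."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    v = ((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1])) / det
    w = ((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])) / det
    return 1.0 - v - w, v, w

def inside_with_halo(u, v, w, eps_u=0.0, eps_v=0.0, eps_w=0.0):
    """Relaxed convex hull test: -eps_u <= u, -eps_v <= v, -eps_w <= w."""
    return u >= -eps_u and v >= -eps_v and w >= -eps_w

# A point lying just beyond the edge opposite the first vertex.
tri = ((0.0, 0.0), (4.0, 0.0), (0.0, 4.0))
u, v, w = barycentric_2d((2.2, 2.0), *tri)

rejected = inside_with_halo(u, v, w)             # no halo
accepted = inside_with_halo(u, v, w, eps_u=0.1)  # halo along one direction
```

Here u is slightly negative, so the point is rejected without a halo and accepted once the condition on u alone is relaxed, which mirrors the per-direction control described above.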

[0038] Preferably, the extension is provided by an adaptive halo parameter, where

[Equation]

[0039] Preferably, the adaptive halo parameter is set to a value smaller than 1/4 d_sampl, more preferably smaller than 1/8 d_sampl. By selecting the adaptive halo parameter, the amount of expansion can be tuned to obtain the best result: the larger the value, the more points are determined during the voxelization process. The preferred range for the adaptive halo parameter is between 0 and d_sampl. As the sampling distance increases, the adaptive halo parameter also increases, and so does the amount of expansion. Thus, even if the sampling distance changes, the present invention provides an adaptive solution for expanding the triangle, ensuring that a reasonable number of points is always covered by the expanded triangle.

[0040] Preferably, the adaptive halo parameters are pre-set. Therefore, the encoder and decoder may agree on the adaptive halo parameters, and these parameters remain constant for each point cloud generated by the encoder and reconstructed by the decoder. Information regarding the adaptive halo parameters does not need to be encoded into the bitstream.

[0041] Alternatively, the adaptive halo parameters are encoded into a bitstream, preferably into a bitstream geometry parameter set (GPS). This can be done all at once when setting adaptive halo parameters for point clouds to be subsequently decoded. Alternatively, each adaptive halo parameter or adaptive halo parameter set can be encoded individually for each point cloud.

[0042] Alternatively, the halo parameter also depends on the size of the cube's volume, i.e., the octree level of the current leaf node.

[0043] Preferably, the sampling distance d_sampl of the point cloud is

[Equation]

[0044] Preferably, the voxelization is performed by extending at least one triangle along at least one edge based on a weighted halo parameter ε_a_t, where ε_a_t = ε_a * t. Here, ε_a is the adaptive halo parameter, which is based on the sampling distance d_sampl of the point cloud and provides the extension of the at least one triangle, and t is a corresponding weight associated with the sampling distance, preferably set to 2. In some embodiments, t is selected between 1 and 4, more preferably between 1.5 and 2.5. The value of t can be determined using a heuristic method, that is, an optimization method that attempts to find the best feasible solution to the problem under consideration. A heuristic method is inherently iterative: each iteration determines a feasible solution, and when the method terminates after a certain time or number of iterations, its output is the best solution found in any iteration. More preferably, the weights tried in the iterations are integers from the range 1 to 4, with the adaptive halo parameter being less than 1. Weights that are too large may affect the overall accuracy of the TriSoup model, so the upper limit can be set to 4. For example, if the best results are obtained with an adaptive halo parameter of 1/4 and a weight of 2 assigned to the sampling distance, and the adaptive halo parameter is directly proportional to the sampling distance, the updated adaptive halo parameter may be 1/4 * 2 = 1/2. Providing a suitable range for the weights thus further improves the overall efficiency and accuracy of the algorithm. Different weights can also be determined for the different directions of the triangle.
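The weighting and the iterative choice of t can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names weighted_halo and best_weight are hypothetical, the search simply tries each integer weight from 1 to 4 and keeps the best one, and the cost function is a placeholder supplied by the caller (in practice it might be a rate-distortion measure, which the text does not specify).

```python
def weighted_halo(eps_a, t=2.0):
    """Weighted halo parameter: eps_a_t = eps_a * t."""
    return eps_a * t

def best_weight(eps_a, cost, candidates=(1, 2, 3, 4)):
    """Try each candidate weight and keep the one with the lowest cost.

    cost: caller-supplied function mapping a weighted halo to a number
          to minimize (a placeholder; an assumption of this sketch).
    """
    best_t, best_cost = None, float("inf")
    for t in candidates:  # one iteration per candidate weight
        c = cost(weighted_halo(eps_a, t))
        if c < best_cost:
            best_t, best_cost = t, c
    return best_t

# Worked example from the text: eps_a = 1/4 and t = 2 give 1/2.
halo = weighted_halo(0.25, 2)

# Toy cost that is smallest when the weighted halo equals 0.5.
t_star = best_weight(0.25, cost=lambda eps: abs(eps - 0.5))
```

With the toy cost the search returns a weight of 2, matching the preferred value given above; a different cost function would in general select a different weight.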

[0045] Preferably, the additional information is a flag for enabling or disabling the functionality of the encoding or decoding method, and is preferably one bit. In its simplest form, the additional information can be a flag indicating whether or not to enable the adaptive halo scheme. It will be understood that the additional information may be multiple bits, as long as they can indicate the necessary information according to the present invention.

[0046] Preferably, the additional information is encoded in the bitstream's geometry parameter set (GPS).

[0047] In another aspect of the present invention, a method for encoding a 3D point cloud into a bitstream is provided, preferably implemented in an encoder. The method comprises the steps of:
• obtaining octree information, wherein the octree information includes the octree structure of a volume, and the volume includes a plurality of cubes;
• obtaining vertex information from the surface of the point cloud in each cube associated with a leaf node, wherein the vertex information includes information about the presence of vertices on the edges of the cube and the positions of the vertices;
• encoding the octree information and the vertex information into a bitstream; and
• reconstructing the geometry data of the point cloud based on the octree information and vertex information obtained during the encoding.
Here, the step of reconstructing the geometry data of the point cloud comprises:
• determining triangles by connecting the vertices of a cube associated with a leaf node of the octree structure; and
• voxelizing the triangles to determine the points of the point cloud.
The method further comprises the steps of:
• determining additional information based on the density of the point cloud, preferably evaluated by the sampling distance d_sampl of the point cloud;
• encoding the additional information into the bitstream;
• determining whether the additional information satisfies a predefined condition; and
• if the predefined condition is met, performing the voxelization by extending at least one triangle along at least one side based on the sampling distance d_sampl.

[0048] The above encoding method therefore generates octree information and vertex information. Additional information is determined and generated based on the density of the point cloud, for example based on the sampling distance of the point cloud. The density can also be determined by other methods, which are not described in detail here. This information is encoded into a bitstream. The encoder then performs a reconstruction step in which the point cloud geometry is reconstructed; this reconstruction step is the same as the steps of the decoding method described above. Finally, the encoder uses the reconstructed geometry to encode the attributes of the points in the point cloud (color, reflectance, etc.), for example using RAHT (Region-Adaptive Hierarchical Transform), a predicting transform, or a lifting transform.

[0049] Preferably, the geometry of the point cloud is encoded into a bitstream by geometry-based point cloud compression (G-PCC).

[0050] Preferably, the bitstream is a bitstream compliant with MPEG G-PCC.

[0051] Preferably, the encoding method is further configured according to the features described above in connection with the decoding method.

[0052] In another aspect of the present invention, an encoder for encoding a 3D point cloud into a bitstream is provided. The encoder includes memory and a processor, where instructions are stored in memory, and when the instructions are executed by the processor, the steps of the encoding method described above are performed.

[0053] In another aspect of the present invention, a decoder is provided for decoding a 3D point cloud from a bitstream. The decoder includes a memory and a processor, where instructions are stored in the memory, and when the instructions are executed by the processor, the steps of the above decoding method are performed.

[0054] In another aspect of the present invention, a bitstream encoded based on the steps of the above encoding method is provided.

[0055] In another aspect of the present invention, a computer-readable storage medium is provided which includes instructions for performing the steps of the method for encoding the 3D point cloud into a bitstream as described above.

[0056] In another aspect of the present invention, a computer-readable storage medium is provided which includes instructions for performing the steps of the method for decoding a 3D point cloud from a bitstream as described above.

[0057] In another aspect of the present invention, a computer-readable storage medium is provided, comprising instructions for performing the steps of the method for encoding the 3D point cloud into a bitstream as described above, and further comprising a configuration file specifying the type of point cloud, which indicates the density of the point cloud. Here, the type of point cloud may be, for example, solid, dense, sparse, or scant. It should be understood, however, that these types can essentially be distinguished by the sampling distance of the point cloud as described above. In any of the above embodiments, the additional information may also be determined based on the type of point cloud (for example, by obtaining information from a configuration file).

[Brief Description of the Drawings]

[0058] The present invention will be described in more detail below with reference to the attached drawings, which show the following:
[Figure 1a] A flowchart of the method for decoding 3D point cloud geometry according to the present invention.
[Figure 1b] A simplified flowchart of the decoding method according to the present invention.
[Figure 2] An example of how an octree structure is generated.
[Figure 3] The octree of Figure 2.
[Figure 4] An example of how a vertex on an edge of a cube is determined.
[Figure 5] An example of generating triangles.
[Figure 6] An example of vertices on the edges of a cube.
[Figure 7] How triangles are generated from the vertices.
[Figure 8] An example of determining the order of the triangles of Figure 7.
[Figure 9] A schematic diagram of the voxelization step.
[Figure 10] A 2D representation of a triangle in a leaf node of an octree.
[Figure 11] An example of voxelization of the triangle of Figure 10.
[Figure 12] The barycentric coordinates of the triangle of Figure 10 and their definition.
[Figure 13] A comparison between the triangle defined by the vertices and the original point cloud.
[Figure 14a] The triangle of Figure 10 extended along one direction of the barycentric coordinate system based on a fixed halo parameter.
[Figure 14b] The triangle of Figure 10 extended along one direction of the barycentric coordinate system based on the adaptive halo parameter.
[Figure 15a] The triangle of Figure 10 extended along all three directions based on the fixed halo parameter.
[Figure 15b] The triangle of Figure 10 extended along all three directions based on the adaptive halo parameters.
[Figure 16] A representation of a triangle extended in three directions based on the weighted halo parameter ε_a_t and the sampling distance of the point cloud.
[Figure 17a] A representation of a triangle extended in three directions based on the sampling distance of the point cloud.
[Figure 17b] A representation of a triangle extended in three directions by a certain amount.
[Figure 17c] A representation of a triangle extended in three directions by a constant amount when the sampling distance is set to 1.
[Figure 18] A schematic flowchart of the encoding method.
[Figure 19a] Performance on the longdress data for different halo parameters.
[Figure 19b] Performance on the house_without_roof data for different halo parameters.
[Figure 19c] Performance on the ulb_unicorn data for different halo parameters.

[Modes for Carrying Out the Invention]

[0059] Figure 1a is a schematic flowchart of the method for decoding 3D point cloud geometry information from a bitstream.

[0060] The method for decoding 3D point cloud geometry from a bitstream is preferably implemented in a decoder and includes the following steps. In step S01, the bitstream is received and decoded, where the bitstream includes octree information and vertex information, the octree information includes information about the octree structure of the volume of the point cloud, and the vertex information includes information about the presence and position of vertices on the edges of the cubes of the leaf nodes of the octree structure. In step S02, triangles are determined by connecting the vertices of a cube associated with a leaf node of the octree structure. In step S03, the triangles are voxelized to determine the points of the point cloud. The method further determines whether additional information contained in the bitstream satisfies a predefined condition, where the additional information is determined based on the density of the point cloud, preferably evaluated by the sampling distance d_sampl of the point cloud. If the predefined condition is met, at least one triangle is extended along at least one edge based on the sampling distance d_sampl and then voxelized.

[0061] To determine the octree information, the first step in the geometry coding process is to construct and code the octree, as shown in Figures 2 and 3. The bounding box is the main volume 100, which contains all the points and is associated with the root node 112 (i.e., the single node at the top of tree 110). This main volume 100 is first divided into eight subvolumes 102 called octants, each subvolume 102 being represented by a node 114 in tree 110. Next, the subvolumes 104 are recursively divided until the target level is reached. Each octant 106 occupied by at least one point is shown hatched in Figures 2 and 3.

[0062] Each octant (or node) is represented by an occupancy byte, which contains one bit for each child octant. If a child octant is occupied by at least one point, the corresponding bit is set to 1; otherwise, it is set to 0. All occupancy bytes 118 are serialized (in breadth-first order) and entropy encoded using a binary arithmetic encoder.
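The occupancy coding of [0061] and [0062] can be illustrated with a minimal Python sketch that splits a cubic volume recursively and emits one occupancy byte per occupied node in breadth-first order. The function names, the bit order within the byte (x as the most significant of the three child-index bits), and the stop at unit-size leaves are assumptions made for illustration and are not taken from this description; the entropy coding stage is omitted.

```python
from collections import deque

def child_index(point, origin, half):
    """Index (0..7) of the child octant containing `point`.

    Assumed bit order: x is bit 2, y is bit 1, z is bit 0.
    """
    x, y, z = point
    ox, oy, oz = origin
    return (int(x - ox >= half) << 2) | (int(y - oy >= half) << 1) | int(z - oz >= half)

def occupancy_bytes(points, size, leaf_size=1):
    """Breadth-first list of occupancy bytes for the cube [0, size)^3."""
    out = []
    queue = deque([((0, 0, 0), size, list(points))])
    while queue:
        origin, node_size, node_points = queue.popleft()
        if node_size <= leaf_size:
            continue  # target level reached: leaves carry no occupancy byte
        half = node_size // 2
        children = {}
        for p in node_points:
            children.setdefault(child_index(p, origin, half), []).append(p)
        byte = 0
        for idx in children:
            byte |= 1 << idx  # bit is 1 for each occupied child octant
        out.append(byte)
        for idx in sorted(children):
            child_origin = (
                origin[0] + half * ((idx >> 2) & 1),
                origin[1] + half * ((idx >> 1) & 1),
                origin[2] + half * (idx & 1),
            )
            queue.append((child_origin, half, children[idx]))
    return out
```

For example, two points at opposite corners of a 4×4×4 volume occupy child octants 0 and 7 of the root, so the root byte has bits 0 and 7 set, followed by one byte for each of the two occupied children.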

[0063] Figure 4 shows a block representation of a 3D surface 210 and an example of a block 220 in TriSoup. Surface 210 intersects block 220, so block 220 is an occupied block among the multiple blocks 200 in 3D space. Within block 220, the enclosed portion of surface 210 intersects the edges of the block at the six illustrated vertices of polygon 230. If an edge of block 220 contains a vertex, that edge is said to be selected.

[0064] Figure 5 shows block 220 in TriSoup, with surface 210 omitted for clarity, showing the unselected edges 270, the selected edges 260, and the i-th edge 250. Assume the i-th edge 250 is selected. To specify the vertex v_i on edge i, a scalar value is specified that indicates the corresponding fraction of the length of edge 250.

[0065] As shown in Figures 4 and 5, within each octant 220 at the target level of the octree, TriSoup represents the original surface 210 as a set of triangles 245. This surface is coded and used to obtain the positions of the reconstructed (or decoded) points. First, the intersections between the surface represented by the original points and the edges of the octant are estimated by averaging the positions of the points closest to those edges within the octant. Next, all 12 edges of the octant and their associated intersections (if any) are stored as segments and vertices, respectively. Each (unique) segment is then coded as follows: first, a bit is arithmetic coded, set to 1 if the segment is occupied by a vertex and to 0 otherwise; if the segment is occupied, the relative position of the vertex on the segment is also arithmetic coded.
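The vertex estimation step of [0065] can be sketched as follows. The description only states that the positions of the points closest to an edge are averaged; the selection rule used here (all points within a hypothetical distance `radius` of the edge) and the function name are assumptions made for illustration.

```python
def estimate_edge_vertex(points, edge_start, axis, length, radius=1.0):
    """Return (flag, position) for one cube edge.

    The edge starts at `edge_start` and runs `length` units along `axis`
    (0 = x, 1 = y, 2 = z). `radius` is a hypothetical closeness threshold.
    flag is 1 if a vertex exists on the edge, else 0; position is the mean
    offset along the edge of the nearby points, or None if there are none.
    """
    others = [i for i in range(3) if i != axis]
    offsets = []
    for p in points:
        t = p[axis] - edge_start[axis]
        if not 0 <= t <= length:
            continue  # point lies beyond the ends of the edge
        dist_sq = sum((p[i] - edge_start[i]) ** 2 for i in others)
        if dist_sq <= radius ** 2:
            offsets.append(t)
    if not offsets:
        return 0, None
    return 1, sum(offsets) / len(offsets)
```

The returned flag corresponds to the bit that signals whether the segment is occupied, and the returned offset is the value that would subsequently be quantized and coded.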

[0066] As shown in Figure 6, the vertices 310 of the triangles are encoded along the edges 320 of the volumes associated with the leaf nodes 300 of the tree. A vertex 310 on an edge 320 is shared among all leaf nodes 300 that share that edge 320. This means that at most one vertex is encoded for each edge belonging to at least one leaf node. In this way, the continuity of the model across leaf nodes is ensured.

[0067] As mentioned earlier, encoding TriSoup vertices requires two pieces of information for each edge:
• a vertex flag indicating whether or not a TriSoup vertex exists on the edge;
• if the vertex exists, the position of the vertex along the edge.

[0068] Therefore, the encoded data consists of octree data and TriSoup data.

[0069] The vertex flags are encoded by an adaptive binary arithmetic encoder using a specific context. For an edge of length N = 2^s, the position of a vertex on the edge can be encoded with unit precision by pushing s bits into the bitstream in bypass mode (i.e., without entropy coding).

[0070] Within a leaf node, triangles are constructed from the TriSoup vertices if there are at least three vertices 310 on the edges 320 of the leaf node 300. Figure 7 shows the constructed triangles 330 and 340.

[0071] Of course, other combinations of triangles 330 and 340 are also possible. The triangles are selected in three steps:
1. Determine the principal direction along one of the three axes.
2. Order the TriSoup vertices based on the principal direction.
3. Construct the triangles from the ordered list of vertices.

[0072] Knowledge of the exact position of the triangle within the current leaf node is not essential; it can be derived from the vertices.

[0073] Figure 8 is used to illustrate this process. Each of the three axes is tested, and the axis that maximizes the total surface area of the triangles is designated as the principal axis. For simplicity of illustration, only the tests along two axes are shown in Figure 8.

[0074] The first test (top) along the vertical axis is performed by projecting the cube and the TriSoup vertices 310 vertically onto a 2D plane. The vertices 310 are then ordered clockwise with respect to the center of the projected node (a square). Based on the ordered vertices, triangles 330 and 340 are constructed according to fixed rules. Here, if there are four vertices, triangles 123 and 134 are constructed systematically. If there are three vertices, only triangle 123 is possible. If there are five vertices, the fixed rules may construct triangles 123, 134, and 451. This continues up to 12 vertices.

[0075] The second test (left) along the horizontal axis is performed by projecting the cube and the TriSoup vertices horizontally onto a 2D plane.

[0076] Since the vertical projection yields the largest total 2D surface area of the triangles, the vertical axis is selected as the principal axis, and the TriSoup triangles are constructed from the ordering obtained by the vertical projection, as shown in the node in Figure 8. Note that using the horizontal axis as the principal axis would result in a different construction of the triangles.

[0077] By selecting the principal axis that maximizes the projected area, the point cloud is reconstructed continuously, without holes.
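The three-step triangle construction of [0071] to [0077] can be sketched as follows. This is a simplified illustration under stated assumptions: the triangles are built as a fan from the first ordered vertex (which reproduces the 123, 134, 451 pattern mentioned above), "clockwise" is taken to mean decreasing angle in the projection plane, and all function names are hypothetical. The fixed rules actually used for larger vertex counts may differ.

```python
import math

def order_vertices(vertices, axis, center):
    """Order vertices clockwise around `center` after projecting along `axis`."""
    u, v = [i for i in range(3) if i != axis]
    def angle(p):
        return math.atan2(p[v] - center[v], p[u] - center[u])
    return sorted(vertices, key=angle, reverse=True)  # decreasing angle

def fan_triangles(ordered):
    """Triangles (1,2,3), (1,3,4), ... from an ordered vertex list."""
    return [(ordered[0], ordered[i], ordered[i + 1])
            for i in range(1, len(ordered) - 1)]

def projected_area(triangle, axis):
    """Area of the triangle after projection along `axis`."""
    u, v = [i for i in range(3) if i != axis]
    a, b, c = triangle
    return 0.5 * abs((b[u] - a[u]) * (c[v] - a[v]) - (c[u] - a[u]) * (b[v] - a[v]))

def build_triangles(vertices, center):
    """Test the three axes and keep the one maximizing the projected area."""
    best = None
    for axis in range(3):
        triangles = fan_triangles(order_vertices(vertices, axis, center))
        area = sum(projected_area(t, axis) for t in triangles)
        if best is None or area > best[0]:
            best = (area, axis, triangles)
    return best[1], best[2]
```

For four vertices lying in a horizontal plane through the middle of an 8×8×8 cube, the projections along x and y collapse to zero area, so the z axis is selected and two triangles are produced.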

[0078] Rendering TriSoup triangles as points is achieved through ray tracing. The set of all points rendered by ray tracing constitutes a decoded point cloud.

[0079] In the ray tracing shown in Figure 9, rays are emitted along the three directions parallel to the axes. Their origins are points with integer (voxelized) coordinates corresponding to the sampling precision required for rendering. The intersections (if any, shown as dotted points) with any of the TriSoup triangles are then voxelized (rounded to the nearest point at the required sampling precision) and added to the list of rendered points.

[0080] After TriSoup has been applied to all leaf nodes, i.e., after the triangles have been constructed and the points obtained by ray tracing, duplicates of the same point in the list of all rendered points are discarded (i.e., only one voxel is kept among all voxels that share the same position and volume) to obtain one set of decoded (unique) points.
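The rendering of [0078] to [0080] can be sketched in Python as follows: rays start at integer coordinates, run parallel to each of the three axes, and every hit on the triangle is rounded and stored in a set so that duplicates are discarded. The helper names, the unit sampling step, and the small numerical tolerance `tol` are assumptions for illustration; the inside test uses barycentric weights of the intersection point, which the description introduces in the following paragraphs.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_hit(origin, axis, triangle, tol=1e-9):
    """Intersection of an axis-parallel ray with a triangle, or None."""
    a, b, c = triangle
    n = cross(sub(b, a), sub(c, a))
    if n[axis] == 0:
        return None  # ray parallel to the triangle plane, or degenerate triangle
    t = dot(sub(a, origin), n) / n[axis]
    p = list(origin)
    p[axis] += t
    nn = dot(n, n)
    u = dot(cross(sub(b, p), sub(c, p)), n) / nn
    v = dot(cross(sub(c, p), sub(a, p)), n) / nn
    w = 1.0 - u - v
    if min(u, v, w) < -tol:
        return None  # outside the convex hull 0 <= u, v, w
    return tuple(p)

def voxelize(triangle, n_size):
    """Set of unique integer points rendered from one triangle."""
    rendered = set()
    for axis in range(3):
        ua, va = [i for i in range(3) if i != axis]
        for i in range(n_size):
            for j in range(n_size):
                origin = [0.0, 0.0, 0.0]
                origin[ua], origin[va] = i, j
                hit = ray_hit(origin, axis, triangle)
                if hit is not None:
                    rendered.add(tuple(int(math.floor(x + 0.5)) for x in hit))
    return rendered
```

Using a set for the rendered points is one simple way to keep a single copy of each voxel when the same point is produced by rays along different axes or by different triangles.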

[0081] For simplicity, from here on, Figures 10 to 16 will depict 2D volumes (squares) associated with leaf nodes, rather than 3D volumes (cubes). Note that all methods described in this invention can be applied to 3D space.

[0082] Figure 10 shows an example of an N×N×N volume, where N = 2^s = 8. This volume has at least three vertices V1, V2, and V3 along its edges 410 (although depicted as a square in the figure, the volume is actually a cube).

[0083] The edges of a leaf node are located at positions -0.5 and N-0.5, ensuring the continuity of the TriSoup model when passed from one “volume” to an adjacent volume. In effect, this means that the faces of the cube are shared between adjacent volumes. In this way, the positions of vertices on an edge are independent of the cube to which that edge belongs.

[0084] The vertices are located at positions p_k along their corresponding edges. These are quantized positions 400, which are encoded into the bitstream. The positions p_k can be quantized with a unit step size so that they are integers in the interval [0, N-1]. In the example of Figure 10, p1=4, p2=2, and p3=2.
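A minimal sketch of the quantization in [0083] and [0084], combined with the s-bit bypass representation of [0069], is given below. The rounding rule (round half up, then clamp) and the most-significant-bit-first order are assumptions made for illustration; only the unit step size, the interval [0, N-1], and the use of s bits come from the description.

```python
import math

def quantize_vertex_position(x, s):
    """Quantize a position x in [-0.5, N - 0.5] on an edge of length N = 2**s.

    Returns an integer in [0, N - 1] using a unit step size.
    """
    n = 1 << s
    p = int(math.floor(x + 0.5))   # round to the nearest integer (assumed rule)
    return max(0, min(n - 1, p))   # clamp to the valid interval

def position_bits(p, s):
    """The s bits of p, most significant first, as pushed in bypass mode."""
    return [(p >> i) & 1 for i in reversed(range(s))]
```

With s = 3 (N = 8), a vertex estimated at 3.6 along its edge is quantized to 4, which is written as the three bits 1, 0, 0.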

[0085] TriSoup triangle 440 is constructed from the vertices V1, V2, and V3, and the TriSoup triangles belonging to the volume model the part of the point cloud enclosed by the volume.

[0086] The process of reconstructing points 430 (of the decoded point cloud) from triangle 440 is called voxelization of the triangle. Figure 11 shows the voxelization of the TriSoup triangle of Figure 10. Rays are emitted along all integer coordinates 420 (white and black dots), and decoded points (black dots) are generated where a ray intersects the triangle. Here, the origins of the rays are spaced at an interval D that sets the sampling resolution of the voxelization.

[0087] As shown in Figure 12, the position of the intersection point between the ray and the triangle is determined using the barycentric coordinate system.

[Math]

[0088] Any point P in the plane of a non-degenerate triangle ABC in 3D space can be uniquely represented by its barycentric coordinates with respect to that triangle (ABC corresponds to any triangle V1V2V3 of the TriSoup model).

[0089] Such a point P is uniquely represented as P = uA + vB + wC, subject to the condition u + v + w = 1.

[0090] The points of the triangle correspond to the convex hull of A, B, and C, that is, to the condition 0 ≤ u, v, w.

[0091]

[Math]

[Math]

[Math]

[0092] The intersection point P between the ray and the unique plane passing through points A, B, and C can be found by the following calculation.

[Math]

[0093] This intersection point P belongs to a triangle only if 0 ≤ u, v, w.
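As an illustration of the computation referred to in [0092] and [0093], one standard way to obtain the intersection point P and its barycentric weights is written out below. The ray origin O, the ray direction D, the ray parameter t, and the plane normal n are notation assumed here; this is a textbook formulation and is not asserted to be the exact expression used in the invention.

```latex
\begin{aligned}
n &= (B - A) \times (C - A), \\
t &= \frac{(A - O) \cdot n}{D \cdot n} \quad (D \cdot n \neq 0), \qquad P = O + t\,D, \\
u &= \frac{\big((B - P) \times (C - P)\big) \cdot n}{n \cdot n}, \quad
v = \frac{\big((C - P) \times (A - P)\big) \cdot n}{n \cdot n}, \quad
w = 1 - u - v .
\end{aligned}
```

With these expressions, P = A gives (u, v, w) = (1, 0, 0), and P lies in the triangle exactly when all three weights are non-negative.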

[0094] As shown in Figure 13, there is a slight shift between the position of the TriSoup triangle V1V2V3, determined by the vertices from the bitstream, and the natural position of this triangle 450 in the current volume. This position is natural because the encoder derives the position of each vertex V_k from the points of the original point cloud that are nearest to the corresponding edge. Therefore, voxelized points adjacent to a vertex V_k are likely to be points of the point cloud. These points are natural candidates for constructing the "natural" triangle that models the point cloud.

[0095] This shift is due to the constraint of continuity between adjacent volumes. As a result, compared with triangle 440 determined by the vertices provided by the bitstream as shown in Figure 11, ray tracing misses several points 460 (P_miss in Figure 13) that do not belong to the TriSoup triangle. The direct consequences are a degradation of the quantitative geometry metrics and a decrease in the rate-distortion performance of the scheme.

[0096] Therefore, by slightly relaxing the convex hull condition 0 ≤ u, v, w, a "halo" can be constructed around the TriSoup triangle. This slightly increases the size of the triangle, and the rays intersect the enlarged triangle, so that fewer points P_miss are missed.

[0097] Let ε > 0 be the halo parameter. As shown in Figure 14a, the condition 0 ≤ u is relaxed to -ε ≤ u, where u is the barycentric weight associated with point A. This increases the size of the triangle along side BC opposite point A, as shown by the dotted region 470.

[0098] By changing the convex hull condition 0 ≤ u, v, w to -ε ≤ u, v, w, the relaxation can be applied to all three barycentric weights u, v, and w.
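The relaxed condition of [0097] and [0098] can be sketched in 2D, matching the 2D drawings used in this description. The function names are hypothetical; the test itself is simply min(u, v, w) ≥ -ε, with ε = 0 giving the original convex hull condition.

```python
def barycentric_2d(p, a, b, c):
    """Barycentric weights (u, v, w) of p with respect to triangle abc in 2D."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    if det == 0:
        raise ValueError("degenerate triangle")
    v = ((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1])) / det
    w = ((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])) / det
    return 1.0 - v - w, v, w

def in_triangle_with_halo(p, a, b, c, eps=0.0):
    """True if p satisfies the relaxed condition -eps <= u, v, w."""
    return min(barycentric_2d(p, a, b, c)) >= -eps
```

For the triangle A = (0, 0), B = (8, 0), C = (0, 8), the point (5, 4) has u = -0.125: it is rejected without a halo and with ε = 1/16, but accepted with ε = 1/8, which shows how the halo size grows with ε.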

[0099] The halo 480 obtained around triangle 440 is shown in Figure 15a. In the first-order approximation, the size of the halo is directly proportional to the parameter ε.

[0100] The halo parameter may depend on the barycentric weight of the triangle, for example, -ε_u ≤ u, -ε_v ≤ v, and -ε_w ≤ w, where ε_u, ε_v, and ε_w are three halo parameters.

[0101] Figure 17b shows the effect on voxelization: several points P_miss are now part of the "halo" and are decoded as points of the decoded point cloud. Therefore, they are no longer missed as in the original algorithm.

[0102] Of course, the value of the halo parameter ε (or alternatively ε_u, ε_v, and ε_w) needs to be set so that the halo is large enough. If ε is too small, the halo is very small and has little effect, and the conventional problem of missed points returns. If ε is too large, the halo is large and affects the overall accuracy of the TriSoup model. In either case, the distortion of the decoded point cloud is not optimal.

[0103] The appropriate value of the halo parameter ε is

[Math]

[0104] However, setting the halo parameter to a fixed value has the drawback that the constructed "halo" does not necessarily yield the best results.

[0105] To demonstrate that an arbitrary, fixed halo value does not yield the best results across a set of data, the performance of different halo values ε was tested on three test point clouds used in MPEG G-PCC, named "longdress_viewdep_vox12", "house_without_roof_00057_vox12", and "ulb_unicorn_vox13". In the test, the halo parameter values used in the G-PCC code were obtained by multiplying ε by 256 to improve calculation accuracy and were set to 16, 32, 64, and 128, so that the corresponding values of ε were 1/16, 1/8, 1/4, and 1/2. For each data set, the encoding performance was obtained using these four halo values ε at the same rate point r02.
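The relation between the scaled halo values and ε stated in [0105] is a plain division by 256, which the following short sketch makes explicit with exact fractions. The constant and function names are illustrative assumptions.

```python
from fractions import Fraction

HALO_SCALE = 256  # scaling factor stated in the description

def halo_from_scaled(scaled_value):
    """Convert a scaled integer halo value back to the halo parameter."""
    return Fraction(scaled_value, HALO_SCALE)

# The four scaled values used in the test and their halo parameters.
tested_halos = {s: halo_from_scaled(s) for s in (16, 32, 64, 128)}
```

Working with the scaled integers lets the relaxed condition be evaluated in fixed-point arithmetic, for example by comparing 256·u against the negated scaled halo value.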

[0106] Figures 19a, 19b, and 19c show the relationship between the quality of the decoded point cloud (geometry PSNR) and the halo value ε. A higher PSNR indicates better quality. Different data sets reach their maximum PSNR at different halo values ε: the optimal halo value is 128 for the longdress data, 128 for the house_without_roof data, and 32 for the ulb_unicorn data. Therefore, no single constant value of the halo parameter ε achieves optimal compression performance across different datasets.

[0107] As shown in Figure 17c, there are two natural points (shown as solid black dots) close to some of the edges of the volume, and the TriSoup triangle V1V2V3 is derived from these points. A shift is observed between the TriSoup triangle V1V2V3 and its natural position in the current volume. In Figure 17c, the sampling distance of the points is 1, and the enlarged triangle obtained using the current fixed halo parameter ε covers many natural points, except for a few points P_miss. However, as the sampling distance increases, as shown in Figure 17b, the natural points lie farther from the triangle, and more natural points are missed with the current fixed halo parameter than when the sampling distance is 1. Therefore, to reduce the point reconstruction error, a larger halo parameter is required for point cloud data with a larger sampling distance.

[0108] Therefore, by slightly relaxing the convex hull condition 0 ≤ u, v, w, an adaptive "halo" is constructed around the TriSoup triangle based on the sampling distance of the point cloud. This makes the triangle slightly larger, and the rays intersect the larger triangle, so that fewer points P_miss are missed.

[0109] The advantages of the adaptive halo method are as follows:
• The decoded point cloud exhibits less distortion. In fact, the quantitative metric (BDBR) shows that the method proposed in this invention can achieve a bitrate gain of 2.6% at equivalent quality compared with the non-adaptive method (i.e., fixed halo parameters).
• The overall algorithm remains unchanged, so the complexity is maintained.

[0110] Let ε_a > 0 be the adaptive halo parameter determined based on the sampling distance of the point cloud. As shown in Figure 14b, the condition 0 ≤ u is relaxed to -ε_a ≤ u, where u is the barycentric weight associated with point A. This increases the size of the triangle along side BC opposite point A, as shown by the dotted region 472.

[0111] By changing the convex hull condition 0 ≤ u, v, w to -ε_a ≤ u, v, w, the relaxation can be applied to all three barycentric weights u, v, and w.

[0112] The halo 482 obtained around the triangle 440 is shown in Fig. 15b. To a first approximation, the size of the halo is proportional to the adaptive halo parameter ε_a and may thus also be proportional to the sampling distance of the point cloud.

[0113] In one embodiment, the adaptive halo parameter may depend on each barycentric weight of the triangle, for example, -ε_u_a ≤ u, -ε_v_a ≤ v, and -ε_w_a ≤ w, where ε_u_a, ε_v_a, and ε_w_a are three adaptive halo parameters.

[0114] Figs. 17a and 17b show the influence on voxelization. In Fig. 17b, where the sampling distance is large and the halo parameter is constant (a constant halo parameter is more appropriate for smaller sampling distances), there are many missed points P_miss. In Fig. 17a, since the adaptive halo parameter according to the present invention is applied, some of the missed points P_miss are now part of the halo and are thus decoded as points of the decoded point cloud. Therefore, these points are no longer lost as in the original algorithm.

[0115] Referring to Fig. 16, compared with Fig. 17a, more points P_miss are now part of the halo. A larger halo is provided by the weighted halo parameter. Here, the weighted halo parameter depends not only on the sampling distance of the point cloud but also on a weight t associated with the sampling distance. In Fig. 16, the weight t is set to 2. This further improves the accuracy of the decoding or reconstruction process of the 3D point cloud.
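The weighting of [0115] can be sketched as a single multiplication, ε_a_t = ε_a · t, with the range 1 < t < 4 and the value t = 2 taken from claim 14 and Fig. 16. The function name and the error handling are assumptions for illustration; how ε_a itself is derived from the sampling distance is not reproduced here.

```python
def weighted_halo(eps_a, t=2.0):
    """Weighted halo parameter eps_a_t = eps_a * t for a weight 1 < t < 4."""
    if eps_a <= 0:
        raise ValueError("the adaptive halo parameter must be positive")
    if not 1.0 < t < 4.0:
        raise ValueError("the weight t must satisfy 1 < t < 4")
    return eps_a * t
```

Because t is greater than 1, the weighted halo is always larger than the unweighted adaptive halo, which is why more of the points P_miss fall inside it.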

[0116] Of course, the value of the adaptive halo parameter ε_a (or alternatively ε_u_a, ε_v_a, and ε_w_a) needs to be set so as to form a halo of sufficient size. If ε_a is too small, the halo is very small and has little effect, and the problem of missed points in the prior art returns. If ε_a is too large, the halo becomes large and affects the overall accuracy of the TriSoup model. In either case, the distortion of the decoded point cloud is not optimal.

[0117] The appropriate value of the halo parameter ε_a is

[Math]

[0118] When the sampling distance is fixed, the adaptive halo parameter ε_a (or alternatively ε_u_a, ε_v_a, and ε_w_a) may be a fixed value. In one variant, the halo parameter ε_a (or alternatively ε_u_a, ε_v_a, and ε_w_a) is encoded in the bitstream, for example, in a geometry parameter set (GPS). In another variant, the halo parameter ε_a (or alternatively ε_u_a, ε_v_a, and ε_w_a) further depends on the size N of the volume. In yet another variant, for a set of volumes representing a point cloud, the adaptive halo parameter ε_a (or alternatively ε_u_a, ε_v_a, and ε_w_a) is signaled locally.

[0119] While the adaptive halo method offers many advantages, it does not work well with all types of MPEG point cloud datasets. When used directly in the MPEG G-PCC software, it results in a loss of overall compression performance in the D2 (point-to-plane distortion) metric, and the overall performance gain in the D1 (point-to-point distortion) metric is not very significant (around 2%, less than 5%). Therefore, simply applying the adaptive halo method alone does not provide optimal overall encoding performance for all types of MPEG point cloud datasets.

[0120] In particular, the MPEG G-PCC standard allows AR/VR datasets to be divided into four categories: solid, dense, sparse, and scant. The solid category consists of voxelized point clouds with continuous surfaces, the dense category consists of discontinuous voxelized point clouds, the sparse category is sparser than the dense category, and the scant category consists of very sparse data. The adaptive halo method of the TriSoup model has been tested on data in all of the above categories. As previously mentioned, two BDBR metrics are used to evaluate the quality of the reconstructed point cloud: the D1 (point-to-point distortion) metric and the D2 (point-to-plane distortion) metric. Detailed experimental results for each category are as follows:
• Solid category: The adaptive halo method does not affect the compression efficiency of solid category data in terms of the D1 and D2 metrics.
• Dense category: According to the D1 metric, the adaptive halo method is very effective in improving the compression efficiency of dense category data (a gain of 5%). According to the D2 metric, the method has little impact on the compression efficiency of dense category data.
• Sparse and scant categories: The adaptive halo method slightly improves the compression efficiency of the sparse and scant categories in terms of the D1 metric, but it causes losses in terms of the D2 metric.

[0121] Therefore, better overall encoding performance is achieved by selectively executing the adaptive halo method in the last step S03 of FIG. 1a. FIG. 1b shows a simplified flow of the proposed method. Here, the encoder can include a flag (e.g., adaptive_halo_enabled_flag) in the bitstream, and the decoder can determine whether to enable the adaptive halo method based on this flag. Preferably, this flag is included in the geometry parameter set (GPS) of the bitstream. The GPS includes parameters specifying the features and enabled tools used in the encoded geometry bitstream of a point cloud slice, and the GPS is arranged in the slice header of the geometry information stream. For example, when the flag is set to "true", the adaptive halo method is enabled; otherwise, the adaptive halo method is not used for TriSoup coding. When the adaptive halo method is not used, the triangles to be voxelized can be extended along at least one side based on a fixed value. As described above, the flag may be set based on the category of the point cloud data, which can be evaluated by the sampling distance of the point cloud data. For example, when the sampling distance d of the point cloud satisfies the condition 1 < d < 4 (i.e., the dense category), the flag is set to true; otherwise, it is set to false.
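The selection logic of [0121] can be sketched as follows: the encoder sets the flag from the sampling distance using the example condition 1 < d < 4, and the decoder uses the flag to choose between the adaptive and the fixed halo parameter. The function `select_halo` and its parameters are hypothetical names for illustration; only the flag name and the example threshold come from the description.

```python
def adaptive_halo_enabled_flag(d_sampl):
    """Encoder side: enable the adaptive halo for the example range 1 < d < 4."""
    return 1.0 < d_sampl < 4.0

def select_halo(flag, adaptive_eps, fixed_eps):
    """Decoder side: pick the halo parameter used to extend the triangles."""
    return adaptive_eps if flag else fixed_eps
```

A point cloud with sampling distance 2 would therefore be decoded with the adaptive halo, while one with sampling distance 1 or 4 would fall back to the fixed extension.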

[0122] In some embodiments, the value (true / false) of the flag adaptive_halo_enabled_flag can be determined by reading from the configuration file of the G-PCC encoder, where the data information (including the data category) is indicated in the configuration file.

[0123] FIG. 18 is a schematic flowchart of the method according to the present invention for encoding a 3D point cloud into a bitstream. This method includes the following steps.

[0124] In step S11, octree information is determined, and this octree information includes the octree structure of the volume, and this volume contains multiple cubes.

[0125] In step S12, vertex information is obtained from the surface of the point cloud of each cube associated with the leaf node, where the vertex information includes information about the presence and position of vertices on the edges of the cube.

[0126] In step S13, the octree information and vertex information are encoded into a bitstream.

[0127] In step S14, the point cloud data is reconstructed based on the octree information and vertex information obtained during the encoding process described above. Reconstructing the point cloud data here includes the following steps 141 and 142.

[0128] In step 141, triangles are determined by connecting the vertices of a cube associated with a leaf node of the octree structure.

[0129] In step 142, the triangles are voxelized to determine the points of the point cloud. This includes determining additional information based on the density of the point cloud (which can be evaluated by the sampling distance d_sampl of the point cloud), encoding the additional information into the bitstream, determining whether the additional information satisfies a predefined condition, and, if the predefined condition is met, performing voxelization by extending at least one triangle along at least one side based on the sampling distance d_sampl.

[0130] Here, steps S11 to S13 relate to TriSoup coding; see, for example, "Adaptive multi-level triangle soup for geometry-based point cloud coding" by A. DRICOT et al. at the 21st IEEE International Symposium on Multimedia Signal Processing (MMSP) in 2019, "report on triangle soup decoding" by Nakagami O., m52279 of ISO/IEC JTC1/SC29-WG11 in 2020, and US10,192,353. In addition to normal point cloud coding, this method further includes a reconstruction step, which includes the same or similar steps as those of the decoding method described earlier with reference to Figure 1a. The reconstructed point cloud is then used for interpolation of attributes (e.g., color), and the attributes of the points in the point cloud can be coded based on the reconstructed geometry.

Claims

1. A method for decoding 3D point cloud geometry from a bitstream, the method being implemented by a decoder and comprising:
a step of receiving and decoding a bitstream, wherein the bitstream includes octree information and vertex information, the octree information includes information about the octree structure of the volume of the point cloud, and the vertex information includes information about the presence and position of vertices on the edges of the cubes of the leaf nodes of the octree structure;
a step of determining triangles by connecting the vertices of a cube associated with a leaf node of the octree structure; and
a step of voxelizing the triangles to determine the points of the point cloud,
wherein the method further comprises:
a step of determining whether additional information contained in the bitstream satisfies a predefined condition, wherein the value of the additional information is determined based on whether the density of the point cloud satisfies a preset condition, and the density of the point cloud is determined by the sampling distance d_sampl of the point cloud; and
a step of, if the predefined condition is met, extending at least one triangle along at least one side based on the sampling distance d_sampl to perform voxelization.

2. A method for encoding a 3D point cloud into a bitstream, the method being implemented by an encoder and comprising:
a step of obtaining octree information, wherein the octree information includes the octree structure of a volume, and the volume includes a plurality of cubes;
a step of obtaining vertex information from the surface of the point cloud in each cube associated with a leaf node, wherein the vertex information includes information about the presence of vertices on the edges of the cube and the positions of the vertices;
a step of encoding the octree information and the vertex information into a bitstream; and
a step of reconstructing the geometry data of the point cloud based on the octree information and the vertex information obtained during the encoding,
wherein the step of reconstructing the geometry data of the point cloud comprises:
a step of determining triangles by connecting the vertices of a cube associated with a leaf node of the octree structure; and
a step of voxelizing the triangles to determine the points of the point cloud,
and wherein the method further comprises:
a step of determining additional information based on the density of the point cloud, wherein the value of the additional information is determined based on whether the density of the point cloud satisfies a preset condition, and the density of the point cloud is determined by the sampling distance d_sampl of the point cloud;
a step of encoding the additional information into the bitstream;
a step of determining whether the additional information satisfies a predefined condition; and
a step of, if the predefined condition is met, extending at least one triangle along at least one side based on the sampling distance d_sampl to perform voxelization.

3. The method according to claim 2, characterized in that the encoding is TriSoup encoding.

4. The method according to any one of claims 1 to 3, characterized in that the at least one triangle is extended along two or three sides to perform voxelization.

5. The method according to any one of claims 1 to 3, characterized in that each triangle within the cube is extended, and at least one triangle within each cube of the point cloud containing triangles is extended.

6. The method according to any one of claims 1 to 3, characterized in that the extension is the same for each side, or different for at least two sides.

7. The method according to any one of claims 1 to 3, characterized in that the voxelization is performed using the [Math 1] algorithm and/or the voxelization of a point is obtained by rounding the coordinates of the point to the nearest integer.

8. The requirement for the convex hull is -ε a ≤ u, v, w, ε a > 0, u, v, w are the centroid coordinates of the triangle, ε a The sampling distance d of the point cloud is sampl Determined based on, The method according to feature 7.

9. The requirements for the convex hull are -ε u_a ≦ u, -ε v_a ≦ v and -ε w_a ≦ w, and ε u_a , ε v_a , ε w_a ≧ 0, where u, v, and w are the barycentric coordinates of the triangle, and ε u_a ≠ ε w_a and / or ε u_a ≠ ε v_a and / or ε v_a ≠ ε w_a and ε u_a , ε v_a and ε w_a at least one of which is determined based on the sampling distance d of the point cloud sampl ​ The method according to feature 7.

10. The extension is provided by a halo parameter, and the halo parameter of the extension is d sampl Less than / 4 The method according to any one of 1 to 3, characterized by the features described herein.

11. The extension is provided by adaptive halo parameters, and the extension is pre-configured. The method according to any one of 1 to 3, characterized by the features described herein.

12. The method according to any one of claims 1 to 3, wherein the extension is provided by adaptive halo parameters, and the adaptive halo parameters are encoded in the bitstream.

13. The method according to any one of claims 1 to 3, wherein the sampling distance d_sampl of the point cloud is determined by [Math 2], where N_leaf is the number of leaf nodes, N_total is the number of points in the point cloud, and N is the size of the cube corresponding to a leaf node, or wherein the sampling distance d_sampl of the point cloud is determined by a circular method.

14. The method according to any one of claims 1 to 3, wherein the at least one triangle is extended along at least one edge, based on a weighted halo parameter ε_{a_t}, to perform the voxelization, the weighted halo parameter ε_{a_t} being determined by ε_{a_t} = ε_a * t with 1 < t < 4, where ε_a is an adaptive halo parameter that is based on the sampling distance d_sampl of the point cloud and provides the extension of the at least one triangle, and t is a corresponding weight associated with the sampling distance, t being set to 2.

15. The method according to any one of claims 1 to 3, wherein the additional information is a flag for enabling or disabling a function of the encoding or decoding method, and the flag consists of one bit.

16. An encoder for encoding a 3D point cloud into a bitstream, the encoder comprising at least one processor and a memory, wherein the memory stores instructions that, when executed by the processor, cause the steps of the method according to claim 2 or 3 to be performed.

17. A decoder for decoding a 3D point cloud from a bitstream, the decoder comprising at least one processor and a memory, wherein the memory stores instructions that, when executed by the processor, cause the steps of the method according to claim 1 to be performed.

18. A computer-readable storage medium storing instructions that, when executed by a processor, cause the steps of the method according to any one of claims 1 to 3 to be performed.
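
By way of illustration only, and not as part of the claims, the halo-based extension recited in claims 8 and 14 can be sketched as follows. The sketch assumes a standard Möller–Trumbore ray-triangle intersection (the claims as reproduced here do not name the algorithm), casts rays along the z axis only, and takes ε_a as a given input because the mapping from d_sampl to ε_a is not stated in the claims. All function names (`weighted_halo`, `intersect_with_halo`, `voxelize_triangle`) and the `margin` parameter are hypothetical.

```python
import math

Vec = tuple[float, float, float]

def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def weighted_halo(eps_a: float, t: float = 2.0) -> float:
    """Weighted halo eps_a_t = eps_a * t with 1 < t < 4 (claim 14)."""
    if not 1.0 < t < 4.0:
        raise ValueError("weight t must satisfy 1 < t < 4")
    return eps_a * t

def intersect_with_halo(origin: Vec, direction: Vec,
                        a: Vec, b: Vec, c: Vec, eps: float):
    """Intersect the line origin + s*direction with triangle (a, b, c).

    A hit is accepted when the barycentric coordinates satisfy
    -eps <= u, v, w, so eps > 0 extends the triangle along all sides.
    Returns the intersection point, or None.
    """
    e1, e2 = sub(b, a), sub(c, a)
    h = cross(direction, e2)
    det = dot(e1, h)
    if abs(det) < 1e-12:          # line is parallel to the triangle plane
        return None
    s = sub(origin, a)
    v = dot(s, h) / det           # barycentric weight of vertex b
    q = cross(s, e1)
    w = dot(direction, q) / det   # barycentric weight of vertex c
    u = 1.0 - v - w               # barycentric weight of vertex a
    if min(u, v, w) < -eps:
        return None
    dist = dot(e2, q) / det
    return tuple(origin[i] + dist * direction[i] for i in range(3))

def voxelize_triangle(a: Vec, b: Vec, c: Vec,
                      eps: float = 0.0, margin: int = 1) -> set:
    """Cast z-parallel lines through integer (x, y) positions near the
    triangle and round each accepted hit to the nearest integer point."""
    xs = [p[0] for p in (a, b, c)]
    ys = [p[1] for p in (a, b, c)]
    points = set()
    for x in range(math.floor(min(xs)) - margin, math.ceil(max(xs)) + margin + 1):
        for y in range(math.floor(min(ys)) - margin, math.ceil(max(ys)) + margin + 1):
            hit = intersect_with_halo((float(x), float(y), 0.0),
                                      (0.0, 0.0, 1.0), a, b, c, eps)
            if hit is not None:
                # floor(v + 0.5) rounds halves up, unlike Python's round()
                points.add(tuple(math.floor(v + 0.5) for v in hit))
    return points

triangle = ((0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0))
plain = voxelize_triangle(*triangle, eps=0.0)
grown = voxelize_triangle(*triangle, eps=weighted_halo(0.125, 2.0))
```

In this sketch the point (3, 2, 0) has barycentric coordinate u = -0.25, so it is rejected without a halo and accepted once the weighted halo reaches 0.25. A fuller implementation would presumably cast rays along all three axes and apply the extension only when the decoded one-bit flag enables it.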