Information processing apparatus and method

By setting reference points based on the centroid or distribution of points in point cloud coding, the problem of insufficient coding efficiency in existing technologies is solved, achieving a more efficient coding process and more accurate prediction.

CN114902284B · Active · Publication Date: 2026-06-12 · SONY GROUP CORP

Patent Information

Authority / Receiving Office
CN · China
Patent Type
Patents(China)
Current Assignee / Owner
SONY GROUP CORP
Filing Date
2020-12-24
Publication Date
2026-06-12

AI Technical Summary

Technical Problem

Existing point cloud encoding methods are inefficient, especially the method described in non-patent literature 4, which fails to effectively suppress the reduction in encoding efficiency.

Method used

The attribute information of each point in the point cloud is layered, the reference points are set based on the centroid or distribution pattern of the points, the difference between the attribute information and its predicted value is derived, and information about the setting of the reference points is encoded.

Benefits of technology

The technology suppresses the decrease in coding efficiency, improves the prediction accuracy for prediction points, reduces the amount of information, and achieves a more efficient coding process.

✦ Generated by Eureka AI based on patent content.


Abstract

The present disclosure relates to an information processing apparatus and method capable of suppressing a reduction in coding efficiency. Attribute information of each point in a point cloud, which represents a three-dimensionally shaped object as a set of points, is layered by recursively repeating, on the reference points, a classification into prediction points, for which the difference between the attribute information and a predicted value is derived, and reference points, which are used to derive the predicted value. In doing so, the reference points are set based on the centroid of the points. The present disclosure is applicable to, for example, an information processing apparatus, an image processing apparatus, an encoding apparatus, a decoding apparatus, an electronic instrument, an information processing method, and a program.

Description

Technical Field

[0001] This disclosure relates to information processing apparatus and methods, and more specifically to information processing apparatus and methods capable of suppressing reductions in coding efficiency.

Background Technology

[0002] Conventionally, methods for encoding 3D data representing three-dimensional structures, such as point clouds, have been considered (see, for example, Non-Patent Document 1). Point cloud data includes geometric data (also referred to as positional information) and attribute data (also referred to as attribute information) for each point. Therefore, the point cloud is encoded for each piece of geometric data and attribute data. Various methods have been proposed as methods for encoding attribute data. For example, a technique called lifting has been proposed (see, for example, Non-Patent Document 2). Furthermore, a method capable of scalable decoding of attribute data has been proposed (see, for example, Non-Patent Document 3).

[0003] In this lifting scheme, the attribute data is hierarchically structured by recursively repeating the process of setting each point as either a reference point or a prediction point. Then, based on this hierarchical structure, predicted values for the attribute data of the prediction points are derived using the attribute data of the reference points, and the difference between the predicted values and the attribute data is encoded. For this hierarchization of attribute data, the following method has been proposed: the first point and the last point in Morton order are alternately selected, level by level, from among the reference point candidates (see, for example, Non-Patent Document 4).

[0004] [List of Citations]

[0005] [Non-patent literature]

[0006] Non-Patent Document 1: R. Mekuria, IEEE Student Member, K. Blom, P. Cesar, IEEE Member, "Design, Implementation and Evaluation of a Point Cloud Codec for Tele-Immersive Video", tcsvt_paper_submitted_february.pdf.

[0007] Non-Patent Document 2: Khaled Mammou, Alexis Tourapis, Jungsun Kim, Fabrice Robinet, Valery Valentin, Yeping Su, "Lifting Scheme for Lossy Attribute Encoding in TMC1", ISO / IEC JTC1 / SC29 / WG11 MPEG2018 / m42640, April 2018, San Diego, USA.

[0008] Non-Patent Document 3: Ohji Nakagami, Satoru Kuma, "[G-PCC] Spatial scalability support for G-PCC", ISO / IEC JTC1 / SC29 / WG11 MPEG2019 / m47352, March 2019, Geneva, Switzerland.

[0009] Non-Patent Document 4: Hyejung Hur, Sejin Oh, "[G-PCC][New Proposal] on improved spatial scalable lifting", ISO / IEC JTC1 / SC29 / WG11 MPEG2019 / M51408, October 2019, Geneva, Switzerland.

Summary of the Invention

[0010] [The problem this invention aims to solve]

[0011] However, the method described in Non-Patent Document 4 is not always optimal, and other methods are required.

[0012] This disclosure is made in view of such circumstances and can suppress the reduction in coding efficiency.

[0013] [Solution to the problem]

[0014] According to one aspect of the present technology, an information processing apparatus is an information processing apparatus comprising: a layering unit that, for attribute information of each point in a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating the classification of predicted points and reference points relative to a reference point, wherein the predicted points are used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference points are used to derive the predicted value, wherein the layering unit sets the reference points based on the centroid of each point.

[0015] According to one aspect of the present technology, an information processing method is an information processing method comprising: when performing hierarchical classification of attribute information for each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating the classification of predicted points and reference points, setting a reference point based on the centroid of the point, the predicted point being used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference point being used to derive the predicted value.

[0016] According to another aspect of the present technology, an information processing apparatus is an information processing apparatus comprising: a layering unit that, for attribute information of each point in a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating the classification of predicted points and reference points relative to reference points, wherein the predicted points are used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference points are used to derive the predicted value, wherein the layering unit sets the reference points based on the distribution pattern of each point.

[0017] According to another aspect of the present technology, an information processing method is an information processing method comprising: when performing hierarchical classification of attribute information for each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating the classification of prediction points and reference points relative to reference points, setting the reference points based on the distribution pattern of the points, the prediction points being used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference points being used to derive the predicted value.

[0018] According to another aspect of the present technology, an information processing apparatus is an information processing apparatus comprising: a layering unit that, for attribute information of each point in a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating the classification of prediction points and reference points relative to reference points, wherein prediction points are used to derive the difference between the attribute information and the predicted value of the attribute information, and reference points are used to derive the predicted value; and an encoding unit that encodes information related to the setting of reference points by the layering unit.

[0019] According to another aspect of the present technology, the information processing method is an information processing method comprising: for attribute information of each point in a point cloud representing an object having a three-dimensional shape as a set of points, stratifying the attribute information by recursively repeating the classification of prediction points and reference points relative to a reference point, wherein prediction points are used to derive the difference between the attribute information and the predicted value of the attribute information, reference points are used to derive the predicted value, and encoding information related to the setting of the reference points.

[0020] According to another aspect of the present invention, an information processing apparatus includes: a layering unit that, for attribute information of each point in a point cloud representing an object having a three-dimensional shape as a set of points, layers the attribute information by recursively repeating the classification of predicted points and reference points relative to reference points, wherein predicted points are used to derive the difference between the attribute information and the predicted value of the attribute information, and reference points are used to derive the predicted value, wherein the layering unit alternately selects points closer to the center of the bounding box and points farther away from the center of the bounding box as reference points from candidates for reference points in each layer.

[0021] According to another aspect of the present invention, an information processing method is an information processing method comprising: when performing hierarchical classification of attribute information for each point of a point cloud representing an object having a three-dimensional shape as a set of points by recursively repeating the classification of predicted points and reference points relative to reference points, in each level, alternately selecting points closer to the center of the bounding box and points farther from the center of the bounding box as reference points from candidates for reference points, the predicted points being used to derive the difference between the attribute information and the predicted values of the attribute information, and the reference points being used to derive the predicted values relative to the reference points.

[0022] In an information processing apparatus and method according to one aspect of the present technology, when the attribute information of each point in a point cloud representing an object having a three-dimensional shape as a set of points is hierarchically classified by recursively repeating the classification of the predicted point and the reference point relative to a reference point, the reference point is set based on the centroid of the point, the predicted point is used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference point is used to derive the predicted value relative to the reference point.

[0023] In another aspect of the information processing apparatus and method according to the present technology, when performing hierarchical classification of attribute information by recursively repeating the classification of predicted points and reference points relative to reference points for the attribute information of each point in a point cloud that represents an object having a three-dimensional shape as a set of points, a reference point is set based on the distribution pattern of the points, the predicted point is used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference point is used to derive the predicted value relative to the reference point.

[0024] In another aspect of the information processing apparatus and method according to the present technology, for the attribute information of each point in a point cloud that represents an object having a three-dimensional shape as a set of points, the attribute information is hierarchically classified by recursively repeating the classification of prediction points and reference points relative to reference points, the prediction points are used to derive the difference between the attribute information and the predicted value of the attribute information, the reference points are used to derive the predicted value, and information related to the setting of the reference points is encoded.

[0025] In another aspect of the information processing apparatus and method according to the present technology, when performing hierarchical classification of attribute information for each point of a point cloud that represents an object having a three-dimensional shape as a set of points, by recursively repeating the classification of predicted points and reference points relative to reference points, points closer to the center of the bounding box and points farther away from the center of the bounding box are alternately selected as reference points from the candidates for reference points at each level, the predicted points are used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference points are used to derive the predicted value.

Attached Figure Description

[0026] Figure 1 is a diagram illustrating an example of lifting.

[0027] Figure 2 is a diagram illustrating an example of a method for setting reference points based on Morton order.

[0028] Figure 3 is a diagram illustrating an example of a method for setting reference points based on Morton order.

[0029] Figure 4 is a diagram illustrating an example of a method for setting reference points based on Morton order.

[0030] Figure 5 is a diagram illustrating an example of a method for setting reference points based on Morton order.

[0031] Figure 6 is a diagram illustrating an example of how to set a reference point.

[0032] Figure 7 is a diagram illustrating an example of a method for setting a reference point based on the centroid.

[0033] Figure 8 is a diagram illustrating an example of a method for deriving the centroid.

[0034] Figure 9 is a diagram illustrating an example of a method for deriving the centroid.

[0035] Figure 10 is a diagram showing an example of the region from which the centroid is derived.

[0036] Figure 11 is a diagram illustrating an example of a method for selecting points.

[0037] Figure 12 is a diagram illustrating examples of the same conditions.

[0038] Figure 13 is a block diagram illustrating a main configuration example of the encoding device.

[0039] Figure 14 is a block diagram showing a main configuration example of the attribute information encoding unit.

[0040] Figure 15 is a block diagram illustrating a main configuration example of the layering processing unit.

[0041] Figure 16 is a flowchart illustrating an example of the encoding process.

[0042] Figure 17 is a flowchart illustrating an example of the attribute information encoding process.

[0043] Figure 18 is a flowchart illustrating an example of the layering process.

[0044] Figure 19 is a flowchart illustrating an example of the reference point setting process.

[0045] Figure 20 is a block diagram illustrating a main configuration example of the decoding device.

[0046] Figure 21 is a block diagram showing a main configuration example of the attribute information decoding unit.

[0047] Figure 22 is a flowchart illustrating an example of the decoding process.

[0048] Figure 23 is a flowchart illustrating an example of the attribute information decoding process.

[0049] Figure 24 is a flowchart illustrating an example of the de-stratification process.

[0050] Figure 25 is a diagram showing an example of a table.

[0051] Figure 26 is a diagram illustrating examples of table descriptions and signaling transmission.

[0052] Figure 27 is a flowchart illustrating an example of the reference point setting process.

[0053] Figure 28 is a diagram showing an example of information to be notified using signaling.

[0054] Figure 29 is a diagram illustrating an example of a target notified via signaling.

[0055] Figure 30 is a diagram showing an example of information to be notified using signaling.

[0056] Figure 31 is a diagram showing an example of information to be notified using signaling.

[0057] Figure 32 is a diagram illustrating an example of the syntax for communicating information in fixed-length bits using signaling.

[0058] Figure 33 is a diagram illustrating an example of the syntax for communicating information in fixed-length bits using signaling.

[0059] Figure 34 is a diagram illustrating an example of a variable-length bit message to be communicated via signaling.

[0060] Figure 35 is a diagram illustrating an example of a variable-length bit message to be communicated via signaling.

[0061] Figure 36 is a diagram illustrating an example of the syntax for notifying variable-length bits using signaling.

[0062] Figure 37 is a diagram illustrating an example of the syntax for notifying variable-length bits using signaling.

[0063] Figure 38 is a flowchart illustrating an example of the reference point setting process.

[0064] Figure 39 is a diagram illustrating an example of the search order.

[0065] Figure 40 is a flowchart illustrating an example of the reference point setting process.

[0066] Figure 41 is a block diagram illustrating a typical configuration example of a computer.

Detailed Implementation

[0067] The following describes the manner in which this disclosure is carried out (hereinafter referred to as implementation). Note that the description will proceed in the following order.

[0068] 1. Setting up reference points

[0069] 2. First Implementation Method (Method 1)

[0070] 3. Second Implementation Method (Method 2)

[0071] 4. Third Implementation Method (Method 3)

[0072] 5. Fourth Implementation Method (Method 4)

[0073] 6. Appendix

[0074] <1. Setting up reference points>

[0075] <Documents supporting technical content and terminology>

[0076] The scope of this technology includes not only what is described in the embodiments, but also what is described in the following non-patent documents known at the time of filing.

[0077] Non-patent literature 1: (as described above)

[0078] Non-patent literature 2: (as described above)

[0079] Non-patent literature 3: (as described above)

[0080] Non-patent literature 4: (as described above)

[0081] In other words, the content described in the aforementioned non-patent documents, as well as the content of other documents referenced in them, also serves as a basis for judging the support requirements.

[0082] <Point Cloud>

[0083] Conventionally, there exist 3D data such as point clouds, which represent a 3D structure by the location information and attribute information of points, and meshes, which are composed of vertices, edges, and faces and define a 3D shape using a polygonal representation.

[0084] For example, in the case of a point cloud, a 3D structure (3D object) is represented as a set of a large number of points. The data of a point cloud (also referred to as point cloud data) includes the location information (also referred to as geometric data) and attribute information (also referred to as attribute data) of each point. Attribute data can include any information. For example, it can include color information, reflectivity information, normal information, and the like for each point. Thus, point cloud data has a relatively simple data structure and can represent any 3D structure with sufficient accuracy by using a sufficiently large number of points.

[0085] <Quantization of location information using voxels>

[0086] Because such point cloud data has a relatively large data volume, an encoding method using voxels has been conceived in order to compress the data volume by encoding or the like. A voxel is a three-dimensional region used to quantize geometric data (location information).

[0087] That is, the 3D region containing the point cloud (also referred to as the bounding box) is divided into small 3D regions called voxels, and each voxel indicates whether or not it contains a point. In this way, the position of each point is quantized in units of voxels. Therefore, by converting point cloud data into such voxel-based data (also referred to as voxel data), the increase in the amount of information can be suppressed (typically, the amount of information can be reduced).
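The voxel quantization described above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions: the function name `voxelize`, the cubic voxel of edge length `voxel_size`, and the bounding box origin `origin` are hypothetical and do not appear in the disclosure.

```python
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], origin: Point, voxel_size: float) -> Set[Voxel]:
    """Quantize each point to the integer index of the voxel containing it.

    Assumption: voxels are axis-aligned cubes of edge `voxel_size`,
    counted from the bounding box corner `origin`.
    """
    occupied: Set[Voxel] = set()
    for p in points:
        idx = tuple(int((c - o) // voxel_size) for c, o in zip(p, origin))
        occupied.add(idx)  # the set records only "this voxel contains a point"
    return occupied


points = [(0.1, 0.2, 0.3), (0.4, 0.1, 0.2), (1.6, 0.2, 0.9)]
occupied = voxelize(points, origin=(0.0, 0.0, 0.0), voxel_size=1.0)
```

In this example the first two points fall into the same voxel, so three points are represented by two occupied voxels, which is the reduction in information that the paragraph refers to.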

[0088] <Octree>

[0089] Furthermore, for geometric data, it has been envisioned to construct octrees using such voxel data. An octree is obtained by converting the voxel data into a tree structure. The value of each bit in the lowest node of the octree indicates whether a point is present or absent in each voxel. For example, a value "1" indicates that the voxel contains a point, while a value "0" indicates that the voxel does not contain a point. In an octree, one node corresponds to eight voxels. That is, each node of the octree includes 8 bits of data, and these 8 bits indicate whether a point is present or absent in eight voxels.

[0090] Then, the higher nodes of the octree indicate whether a point exists or not in the region where the eight voxels corresponding to the lower nodes belonging to that node are combined into a single voxel. In other words, higher nodes are generated by collecting information from the voxels of the lower nodes. Note that nodes with a value of "0," meaning that all eight corresponding voxels do not include the point, are deleted.

[0091] In this way, a tree structure (octree) is constructed from the nodes whose values are other than "0". That is, the octree can indicate whether a point is present or absent in a voxel at each resolution. When the location information is converted into an octree and encoded, decoding it from the highest level (lowest resolution) down to a desired level (resolution) allows the point cloud data to be recovered at that resolution. In other words, decoding can easily be performed at any resolution without decoding information at unnecessary levels (resolutions). That is, voxel (resolution) scalability can be achieved.

[0092] Furthermore, as mentioned above, by omitting nodes with a value of "0", the resolution of voxels in regions where no points exist can be reduced, thereby further suppressing the increase in information content (generally, reducing information content).
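As a rough sketch of the octree construction described above, the following builds the 8-bit occupancy value of each node from a set of occupied leaf voxels. The function name `build_octree_levels` and the assignment of child voxels to bit positions are assumptions made for illustration; the disclosure does not specify a bit order.

```python
from collections import defaultdict


def build_octree_levels(occupied, depth):
    """Return a list of {node_index: 8-bit occupancy} dicts, root level first.

    Assumption: child (x, y, z) maps to bit ((x&1)<<2 | (y&1)<<1 | (z&1)).
    """
    levels = []
    current = set(occupied)
    for _ in range(depth):
        parents = defaultdict(int)
        for (x, y, z) in current:
            child = ((x & 1) << 2) | ((y & 1) << 1) | (z & 1)
            # Set the bit for this child voxel in its parent node.
            parents[(x >> 1, y >> 1, z >> 1)] |= 1 << child
        # Nodes whose value would be 0 are never created, i.e. they are omitted.
        levels.append(dict(parents))
        current = set(parents)
    levels.reverse()
    return levels


leaf_voxels = {(0, 0, 0), (1, 0, 0), (3, 3, 3)}
levels = build_octree_levels(leaf_voxels, depth=2)
```

Here `levels[0]` is the root and `levels[1]` the next finer level, so decoding only `levels[0]` yields the point cloud at the coarser resolution.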

[0093] <Lifting>

[0094] On the other hand, when attribute data (attribute information) is encoded, the geometric data (location information), including the degradation caused by its encoding, is assumed to be known, and encoding is performed using the positional relationships between points. As methods for encoding such attribute data, methods using the Region Adaptive Hierarchical Transform (RAHT) or a transform called lifting, as described in Non-Patent Document 2, have been considered. By applying these techniques, attribute data can be hierarchically structured like the octree of geometric data.

[0095] For example, in the case of the lifting described in Non-Patent Document 2, the attribute data is hierarchically structured by recursively repeating the process of setting points as either reference points or prediction points. Then, based on this hierarchical structure, the predicted values of the attribute data of the prediction points are derived using the attribute data of the reference points, and the difference between the predicted values and the attribute data is encoded.

[0096] For example, in the case of Figure 1, assume that point P5 is selected as a reference point. A search for prediction points is then performed within a circular region of radius R centered on point P5. Since point P9 is located within this region, point P9 is set as a prediction point whose predicted value is derived with reference to point P5.

[0097] Through such processing, for example, the differences corresponding to points P7 to P9 (indicated by white circles), the differences corresponding to points P1, P3, and P6 (indicated by diagonal lines), and the differences corresponding to points P0, P2, P4, and P5 (indicated by gray circles) are derived as differences at different levels.
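One step of the classification and difference derivation described above can be sketched as follows. This is a simplified illustration, not the lifting of Non-Patent Document 2: the predicted value is taken to be the attribute of the single reference point, and the function names `classify_by_radius` and `residuals` are hypothetical.

```python
import math


def classify_by_radius(points, reference_index, radius):
    """Return indices of points within `radius` of the reference point."""
    ref = points[reference_index]
    return [i for i, p in enumerate(points)
            if i != reference_index and math.dist(p, ref) <= radius]


def residuals(attributes, reference_index, prediction_indices):
    """Difference between each prediction point's attribute and its predicted value.

    Assumption: the predicted value is simply the reference point's attribute.
    """
    predicted = attributes[reference_index]
    return {i: attributes[i] - predicted for i in prediction_indices}


positions = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
attributes = [100, 104, 30]
pred_idx = classify_by_radius(positions, reference_index=0, radius=2.0)
diffs = residuals(attributes, 0, pred_idx)
```

Only the point within radius 2.0 of the reference becomes a prediction point, and its small difference is what would be encoded; the distant point is left for another reference point.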

[0098] Note that although a point cloud is arranged in three-dimensional space and the above processing is actually performed in three-dimensional space, Figure 1 schematically illustrates the three-dimensional space using a two-dimensional plane for ease of description. That is, the description given with reference to Figure 1 applies similarly to processing and phenomena in three-dimensional space.

[0099] In the following description as well, three-dimensional space is described using a two-dimensional plane where appropriate. Unless otherwise stated, the description applies similarly to processing, phenomena, and the like in three-dimensional space.

[0100] Conventionally, the selection of reference points in this hierarchization has been performed according to Morton order. For example, as in the tree structure shown in Figure 2, when a reference point is selected from multiple nodes at a certain level and set as a node at a higher level, the nodes are searched in Morton order, and the first node to appear is selected as the reference point. In Figure 2, each circle represents a node, and a black circle represents a node selected as a reference point (i.e., selected as a higher-level node). In Figure 2, the nodes are sorted from left to right in Morton order. That is, in the example of Figure 2, the leftmost node is always selected.

[0101] In contrast, for such hierarchization of attribute data, Non-Patent Document 4 proposes a method in which the first point and the last point in Morton order are alternately selected, level by level, from among the reference point candidates. That is, as in the example of Figure 3, the first node in Morton order is selected as the reference point at the LoD N level, and the last node in Morton order is selected as the reference point at the next level (LoD N-1).
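The two Morton-order selection rules described above can be sketched as follows. The bit interleaving order (x highest, then y, then z), the 10-bit coordinate width, and the mapping of even and odd values of the hypothetical `lod_index` to "first" and "last" are all assumptions for illustration.

```python
def morton3d(x, y, z, bits=10):
    """Interleave the bits of x, y, z into a Morton code (assumed order: x, y, z)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i + 2)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i)
    return code


def select_reference(candidates, lod_index):
    """Pick the first candidate in Morton order on even levels, the last on odd levels."""
    ordered = sorted(candidates, key=lambda p: morton3d(*p))
    return ordered[0] if lod_index % 2 == 0 else ordered[-1]


candidates = [(1, 1, 0), (0, 0, 1), (1, 0, 0)]
first = select_reference(candidates, lod_index=0)
last = select_reference(candidates, lod_index=1)
```

Always passing an even `lod_index` reproduces the rule of paragraph [0100], in which the first node in Morton order is selected at every level.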

[0102] Figure 4 uses a two-dimensional plane to show an example of how reference points are selected in three-dimensional space. Each square in A of Figure 4 indicates a voxel at a certain level, and each circle indicates a reference point candidate to be processed. For example, when a reference point is selected from the 2×2 points shown in A of Figure 4, the first point in Morton order (the gray point) is selected as the reference point. At the higher level shown in B of Figure 4, the last point in Morton order among the 2×2 points (the gray point) is selected as the reference point. At the even higher level shown in C of Figure 4, the first point in Morton order among the 2×2 points (the gray point) is selected as the reference point.

[0103] The arrows shown in A to C of Figure 4 indicate the movement of the reference point. In this case, the range of movement of the reference point is limited to the narrow range indicated by the dashed box in C of Figure 4, so the reduction in prediction accuracy is suppressed.

[0104] Figure 5, like Figure 4, uses a two-dimensional plane to show another example of how reference points are selected in three-dimensional space. When the points are located as shown in Figure 5 and reference points are selected in the same manner as in Figure 4, the position of the reference point moves as shown in A to C of Figure 5. That is, as indicated by the dashed box in C of Figure 5, the reference point moves over a wider range than in the case of Figure 4, and the prediction accuracy may decrease.

[0105] As mentioned above, in the method described in Non-Patent Document 4, the prediction accuracy decreases depending on the location of the point, and the coding efficiency may also decrease.

[0106] <Methods for setting reference points>

[0107] Therefore, for example, as in "Method 1" shown in the top row of the table in Figure 6, the centroid of the points may be derived in the hierarchization of attribute data, and the reference point may be set based on that centroid. For example, a point close to the derived centroid can be selected as the reference point.

[0108] In addition, for example, as in "Method 2" shown in the second row from the top of the table in Figure 6, the reference point may be selected in the hierarchization of attribute data based on the distribution pattern (manner of distribution) of the points.

[0109] In addition, for example, as in "Method 3" shown in the third row from the top of the table in Figure 6, information about the setting of the reference point may be transmitted from the encoding side to the decoding side in the hierarchization of attribute data.

[0110] In addition, for example, as in "Method 4" shown in the fourth row from the top of the table in Figure 6, in the hierarchization of attribute data, points close to the center of the bounding box and points far from the center of the bounding box may be alternately selected as reference points for each level.

[0111] By applying any of these methods, the reduction in encoding efficiency can be suppressed. Note that the methods described above can be applied in any combination. Furthermore, each of the methods described above can be applied to the encoding or decoding of attribute data compatible with scalable decoding, and can also be applied to the encoding or decoding of attribute data incompatible with scalable decoding.
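Of the methods listed above, "Method 4" can be sketched with very little machinery. The following is an illustrative sketch only: the function names `bbox_center` and `select_by_bbox_center` are hypothetical, the bounding box is assumed to be the tight axis-aligned box of the points, and which level parity selects the near point and which the far point is an assumption.

```python
import math


def bbox_center(points):
    """Center of the axis-aligned bounding box of the points (assumed tight box)."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    return tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))


def select_by_bbox_center(candidates, center, lod_index):
    """Alternate by level between the candidate nearest to and farthest from the center."""
    def distance(p):
        return math.dist(p, center)
    if lod_index % 2 == 0:
        return min(candidates, key=distance)
    return max(candidates, key=distance)


cloud = [(0, 0, 0), (8, 8, 8), (3, 3, 3), (1, 1, 1)]
center = bbox_center(cloud)
cands = [(3, 3, 3), (1, 1, 1)]
near = select_by_bbox_center(cands, center, lod_index=0)
far = select_by_bbox_center(cands, center, lod_index=1)
```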

[0112] <2. First Implementation Method>

[0113] Method 1

[0114] The application of "Method 1" described above will be described. In the case of "Method 1", the centroid of the point is derived, and a reference point is selected based on the centroid. Any point can be set as the reference point relative to the derived centroid. For example, a point closer to the derived centroid (e.g., a point located closer to the centroid) can be selected as the reference point.

[0115] Figure 7 Figure A shows an example of a target area where a reference point is set. Figure 7 In A, squares indicate voxels, and circles indicate points. That is to say, Figure 7 A is a diagram illustrating an example of a voxel structure in three-dimensional space using a two-dimensional plane. For example, when assuming as Figure 7 When points A to C, arranged as shown in Figure A, are selected as candidates for reference points, the following can be chosen: Figure 7 Point B, which is close to the centroid of these candidates, is shown as a reference point. Figure 7 B shows the relationship with Figure 2This is a hierarchical structure of similar attribute data, with black circles indicating reference points. In other words, point B is chosen as the reference point for the route from point A to point C.

[0116] By setting a point close to the centroid as the reference point in this way, a point that is close to a larger number of other points can be set as the reference point. In short, the reference point can be set so as to suppress the decrease in prediction accuracy for more prediction points, and the decrease in coding efficiency can therefore be suppressed.
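The selection described above can be sketched as follows. This is a minimal illustration, assuming that points are given as (x, y, z) tuples and that "close to the centroid" means the smallest Euclidean distance; the function names and coordinates are hypothetical and do not come from the disclosure.

```python
from math import dist


def centroid(points):
    """Arithmetic mean of a non-empty list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def select_reference_point(candidates):
    """Return the candidate closest to the centroid of all candidates."""
    c = centroid(candidates)
    return min(candidates, key=lambda p: dist(p, c))


# Hypothetical coordinates for three candidates (points A, B and C).
point_a, point_b, point_c = (0, 0, 0), (1, 1, 0), (3, 1, 0)
reference = select_reference_point([point_a, point_b, point_c])
```

With these coordinates, point B is selected, and points A and C would become prediction points.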

[0117] <Methods for deriving centroids>

[0118] The method for deriving the centroid is arbitrary. For example, the centroid of any set of points can be used to select a reference point. For instance, the centroid of the points within a predetermined range can be derived, and this centroid can be used to select a reference point. In this way, an increase in the number of points used to derive the centroid can be suppressed, and an increase in load can be suppressed.

[0119] The range of points used to derive the centroid (also referred to as the centroid derivation target range) can be any range. For example, as in method (1) shown in the second row from the top of the table "Methods for Deriving Centroids" in Figure 8, the centroid of the candidate points for the reference point can be derived. That is, for example, as shown in Figure 9A, a voxel region of 2×2×2 voxels in which the candidate points for the reference point exist can be set as the centroid derivation target range. In Figure 9A, a 2×2×2 voxel region in three-dimensional space is schematically shown on a two-dimensional plane (as a 2×2 square). In this case, the centroid of the three points indicated by circles in Figure 9A is derived and used to set the reference point.

[0120] In this way, since it is sufficient to derive the centroid of the points that are the processing target (the reference point candidates), there is no need to search for other points, and the centroid can be derived easily.

[0121] Note that the voxel region to be set as the centroid derivation target range is arbitrary and is not limited to 2×2×2. For example, the centroid of the points located in a voxel region of N×N×N (N>=2) can be derived. That is, an N×N×N voxel region can be set as the centroid derivation target range.
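Method (1) with an N×N×N range can be sketched as follows. The sketch assumes non-negative integer voxel coordinates and regions aligned to multiples of N; the names region_key and centroids_per_region are illustrative only.

```python
def region_key(point, n=2):
    """Index of the n×n×n voxel region that contains an integer voxel coordinate."""
    return tuple(v // n for v in point)


def centroids_per_region(points, n=2):
    """Group points by n×n×n voxel region and return {region index: centroid}."""
    groups = {}
    for p in points:
        groups.setdefault(region_key(p, n), []).append(p)
    return {
        key: tuple(sum(p[i] for p in members) / len(members) for i in range(3))
        for key, members in groups.items()
    }


# Three points share the region at index (0, 0, 0); one lies in region (1, 1, 1).
points = [(0, 0, 0), (1, 0, 1), (1, 1, 0), (2, 3, 2)]
result = centroids_per_region(points, n=2)
```

Because only the points already inside the region are visited, no neighbor search is needed, which matches the low-load property noted for this method.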

[0122] For example, as shown in Figure 10A, the voxel region set as the centroid derivation target range (the voxel region indicated by the thick line in Figure 10A) and the voxel region from which the reference point is to be derived (the voxel region indicated in gray in Figure 10A) can be at the same location (the two ranges can match exactly). Note that Figure 10 schematically illustrates, on a two-dimensional plane, voxel regions that are actually configured in three-dimensional space. Furthermore, in Figure 10A, for ease of description, the centroid derivation target range and the voxel region from which the reference point is to be derived are drawn slightly offset from each other, but the two ranges actually match exactly.

[0123] In addition, for example, as shown in Figure 10B, the voxel region set as the centroid derivation target range (the voxel region indicated by the thick line in Figure 10B) can be wider than the voxel region from which the reference point is to be derived (the voxel region indicated in gray in Figure 10B). In the example in Figure 10B, a 4×4×4 voxel region is set as the centroid derivation target range.

[0124] In addition, for example, as shown in Figure 10C, the center of the voxel region set as the centroid derivation target range (the voxel region indicated by the thick line in Figure 10C) need not coincide with the center of the voxel region from which the reference point is to be derived (the voxel region indicated in gray in Figure 10C). That is, the centroid derivation target range can extend unevenly in a predetermined direction relative to the voxel region from which the reference point is to be derived. For example, near the edge of a bounding box, the extension of the centroid derivation target range can be biased in this way to prevent it from protruding from the bounding box.
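One way to obtain such a biased range is sketched below for a single axis. It assumes that the bounding box spans voxels 0 to bbox - 1 and is at least as wide as the range; the function name and parameters are hypothetical.

```python
def derivation_range(lo, size, extent, bbox):
    """Per-axis [start, end) of a range of `extent` voxels around a region.

    lo     -- first voxel of the region the reference point is derived for
    size   -- width of that region in voxels
    extent -- width of the range used to derive the centroid (extent >= size)
    bbox   -- width of the bounding box (voxels 0 .. bbox - 1, bbox >= extent)
    """
    start = lo - (extent - size) // 2            # centered placement
    start = max(0, min(start, bbox - extent))    # shift inward at the edges
    return start, start + extent


interior = derivation_range(lo=2, size=2, extent=4, bbox=8)
lower_edge = derivation_range(lo=0, size=2, extent=4, bbox=8)
upper_edge = derivation_range(lo=6, size=2, extent=4, bbox=8)
```

In the interior the range stays centered on the region; at either edge it keeps its full width but is shifted so that it does not protrude from the bounding box.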

[0125] In addition, for example, as in method (2) shown in the third row from the top of the table "Methods for Deriving Centroids" in Figure 8, the centroid of N nearby points can be derived. That is, for example, as shown in Figure 9B, N points can be searched for in order of proximity to the center coordinates of the voxel region of 2×2×2 voxels in which the candidate points for the reference point exist, and the centroid of those N points can be derived. In Figure 9B, the distribution of points actually arranged in three-dimensional space is schematically shown on a two-dimensional plane. The black circle indicates the center coordinates of the voxel region from which the reference point is to be derived. That is, N points (white circles) are selected in order, starting from those closest to the black circle, and their centroid is derived.

[0126] In this way, the number of points to be searched can be limited to N, and thus the increase in load caused by the search can be suppressed.
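Method (2) can be sketched as follows, assuming Euclidean distance and a simple full sort in place of whatever neighbor search an actual implementation would use; the function name is illustrative.

```python
from math import dist


def centroid_of_nearest(points, center, n):
    """Centroid of the n points closest to `center` (fewer if not available).

    `points` is assumed to be non-empty and `n` to be at least 1.
    """
    nearest = sorted(points, key=lambda p: dist(p, center))[:n]
    count = len(nearest)
    return tuple(sum(p[i] for p in nearest) / count for i in range(3))


# The far point (9, 9, 9) is not among the three nearest and is ignored.
points = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (9, 9, 9)]
c = centroid_of_nearest(points, center=(1, 1, 1), n=3)
```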

[0127] In addition, for example, as in method (3) shown in the fourth row from the top of the table "Methods for Deriving Centroids" in Figure 8, the candidate points for the reference point (the points existing in the voxel region of 2×2×2 voxels) can be excluded from the N nearby points derived by method (2). That is, as shown in Figure 9C, the 2×2×2 voxel region can be excluded from the centroid derivation target range, and the centroid of points located outside the 2×2×2 voxel region can be derived. In Figure 9C, the distribution of points actually arranged in three-dimensional space is schematically shown on a two-dimensional plane.
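A sketch of method (3) follows. It assumes that the points inside the region are removed first and that the N nearest of the remaining points are then used; the text could also be read as taking the N nearest points first and then removing the candidates. All names are hypothetical.

```python
from math import dist


def in_region(point, origin, size=2):
    """True if the point lies inside the size×size×size region at `origin`."""
    return all(o <= v < o + size for v, o in zip(point, origin))


def centroid_outside_region(points, origin, n, size=2):
    """Centroid of the n points nearest the region center, excluding the region."""
    center = tuple(o + size / 2 for o in origin)
    outside = [p for p in points if not in_region(p, origin, size)]
    nearest = sorted(outside, key=lambda p: dist(p, center))[:n]
    if not nearest:
        return None  # no points outside the region
    return tuple(sum(p[i] for p in nearest) / len(nearest) for i in range(3))


points = [(0, 0, 0), (1, 1, 1), (2, 0, 0), (0, 2, 0), (5, 5, 5)]
c = centroid_outside_region(points, origin=(0, 0, 0), n=2)
```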

[0128] In addition, for example, as in method (4) shown in the fifth row from the top of the table "Methods for Deriving Centroids" in Figure 8, the centroid of the points in a region of radius r, centered on the center coordinates of the voxel region of 2×2×2 voxels in which the candidate points for the reference point exist, can be derived. That is, in this case, for example, as shown in Figure 9D, the centroid is derived from the points located in a region of radius r centered on the center coordinates of the voxel region of 2×2×2 voxels indicated by the dashed box. Note that in Figure 9D, the distribution of points actually arranged in three-dimensional space is schematically shown on a two-dimensional plane. Furthermore, the black circle indicates the center coordinates of the voxel region from which the reference point is to be derived, and the white circles indicate points in the region of radius r centered on those center coordinates.

[0129] In this way, the present technology can also be applied to Lifting that does not use a voxel structure and is not compatible with scalable decoding.
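Method (4) depends only on distances, not on a voxel grid, as the following sketch shows. It assumes Euclidean distance and an inclusive radius; the function name is illustrative.

```python
from math import dist


def centroid_within_radius(points, center, r):
    """Centroid of all points within distance r of `center`, or None if empty."""
    inside = [p for p in points if dist(p, center) <= r]
    if not inside:
        return None
    return tuple(sum(p[i] for p in inside) / len(inside) for i in range(3))


# The point (6, 6, 6) lies outside the radius and does not contribute.
points = [(1, 1, 1), (2, 1, 1), (1, 3, 1), (6, 6, 6)]
c = centroid_within_radius(points, center=(1, 1, 1), r=2.5)
```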

[0130] <Methods for selecting reference points>

[0131] In the case of "Method 1" as described above, for example, a point close to the derived centroid can be set as the reference point. If there are multiple points close to the centroid, any one of these points can be selected as the reference point. The method of selection is arbitrary. For example, the reference point can be set according to each of the methods shown in the table "Methods for Selecting Reference Points from Multiple Candidates" in Figure 11.

[0132] For example, as shown in Figure 12A, there may be multiple points at the same distance from the centroid. Furthermore, for example, as shown in Figure 12B, to suppress the increase in load caused by computation, all points located sufficiently close to the centroid can be regarded as "points close to the centroid". In the case of Figure 12B, all points within a radius Dth from the centroid are regarded as "points close to the centroid". In this case, there may be multiple "points close to the centroid".
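The threshold-based grouping can be sketched as follows, assuming Euclidean distance and an inclusive threshold; d_th stands for the radius Dth, and the function name is hypothetical.

```python
from math import dist


def points_close_to_centroid(points, centroid, d_th):
    """All points within distance d_th of the centroid, kept in input order."""
    return [p for p in points if dist(p, centroid) <= d_th]


points = [(1, 0, 0), (0, 1, 0), (3, 3, 3)]
close = points_close_to_centroid(points, centroid=(0, 0, 0), d_th=1.5)
```

Here two points qualify, so a further rule is needed to choose one of them as the reference point.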

[0133] In such cases, for example, as in method (1) shown in the second row from the top of the table "Methods for Selecting Reference Points from Multiple Candidates" in Figure 11, the point processed first in a predetermined search order can be selected.

[0134] In addition, for example, as in method (2) shown in the third row from the top of the table "Methods for Selecting Reference Points from Multiple Candidates" in Figure 11, either the point processed first or the point processed last in a predetermined search order can be selected. For example, the selection can be switched, for each level, between the point processed first and the point processed last in the predetermined search order.

[0135] In addition, for example, as in method (3) shown in the fourth row from the top of the table "Methods for Selecting Reference Points from Multiple Candidates" in Figure 11, the point processed in the middle (number of points / 2) of a predetermined search order can be selected.

[0136] In addition, for example, as in method (4) shown in the fifth row from the top of the table "Methods for Selecting Reference Points from Multiple Candidates" in Figure 11, the point processed at a designated position in a predetermined search order can be selected. That is, the Nth point processed in the predetermined search order can be selected. The designated position (N) can be predetermined or can be set by the user, an application, etc. Furthermore, if the designated position can be set, information about the designated position (N) can be signaled (transmitted).

[0137] As in methods (1) to (4) above, when there are multiple candidates with substantially the same conditions relative to the centroid, a reference point can be set from the multiple candidates based on a predetermined search order.
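Methods (1) to (4) can be sketched as a single selection function. The sketch assumes that the candidates are already listed in the search order, that method (2) takes the first point on even levels and the last on odd levels, and that N in method (4) is 0-based and clamped to the list length. None of these details is fixed by the description, and the function name is hypothetical.

```python
def pick_candidate(candidates, method, level=0, n=0):
    """Pick one of several tied candidates, given in search order.

    method 1: first; method 2: first on even levels, last on odd levels;
    method 3: middle (len // 2); method 4: the n-th (0-based, clamped).
    """
    if not candidates:
        raise ValueError("no candidates")
    if method == 1:
        index = 0
    elif method == 2:
        index = 0 if level % 2 == 0 else len(candidates) - 1
    elif method == 3:
        index = len(candidates) // 2
    elif method == 4:
        index = min(n, len(candidates) - 1)
    else:
        raise ValueError(f"unknown method {method}")
    return candidates[index]


tied = ["p0", "p1", "p2", "p3"]
chosen_middle = pick_candidate(tied, method=3)
```

Because the result depends only on the search order and a few integers, a decoder that knows the same order and parameters can reproduce the choice without extra per-point information.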

[0138] Note that the search order is arbitrary. For example, the order can be Morton order or an order different from Morton order. Furthermore, the search order can be predefined by a standard, or it can be set by the user, an application, etc. Where the search order can be set, information about the search order can be signaled (transmitted).
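For reference, a Morton (Z-order) code interleaves the bits of the three coordinates. The sketch below assumes non-negative integer coordinates of at most `bits` bits and places x in the least significant position; the axis ordering is a convention and is not specified here.

```python
def morton_code(x, y, z, bits=10):
    """Interleave the bits of non-negative integers x, y, z (Z-order curve)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


points = [(1, 1, 1), (0, 0, 0), (1, 0, 0), (0, 1, 0)]
ordered = sorted(points, key=lambda p: morton_code(*p))
```

Sorting by this code visits the points of one 2×2×2 region before moving to the next, which is why it is a common search order for voxelized point clouds.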

[0139] In addition, for example, as in method (5) shown in the sixth row from the top of the table "Methods for Selecting Reference Points from Multiple Candidates" in Figure 11, the centroid derivation target range can be widened, and the centroid of the points within the widened centroid derivation target range can be derived. A point can then be selected using the newly derived centroid. In other words, the centroid can be derived again under changed conditions.
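Method (5) can be sketched as a retry loop. For brevity the sketch widens a spherical range given by a list of radii rather than a voxel region, and it falls back to the first candidate if the tie never breaks; both choices are assumptions, and the names are hypothetical.

```python
from math import dist


def centroid(points):
    """Arithmetic mean of a non-empty list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def select_with_widening(candidates, all_points, center, radii):
    """Try successively wider ranges until one candidate is strictly closest."""
    for r in radii:
        in_range = [p for p in all_points if dist(p, center) <= r]
        if not in_range:
            continue
        c = centroid(in_range)
        ranked = sorted(candidates, key=lambda p: dist(p, c))
        if len(ranked) == 1 or dist(ranked[0], c) < dist(ranked[1], c):
            return ranked[0]
    return candidates[0]  # still tied: fall back to the first in search order


candidates = [(0, 0, 0), (2, 0, 0)]
all_points = candidates + [(4, 0, 0)]
chosen = select_with_widening(candidates, all_points, (1, 0, 0), [1.5, 4.0])
```

With the narrow range both candidates are equally far from the centroid; after widening, the extra point pulls the centroid toward (2, 0, 0), which is then selected.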

[0140] <Encoding device>

[0141] Next, an apparatus to which the present technology is applied will be described. Figure 13 is a block diagram illustrating an example configuration of an encoding device as one aspect of an information processing apparatus to which the present technology ("Method 1") is applied. The encoding device 100 shown in Figure 13 is an apparatus that encodes point clouds (3D data). The encoding device 100 encodes the point cloud by applying the technique described in this embodiment.

[0142] Note that Figure 13 shows the main elements, such as processing units and data flows, and does not necessarily show all elements. That is, in the encoding device 100, there may be processing units not shown as blocks in Figure 13, and there may be processing or data flows not shown as arrows or the like in Figure 13.

[0143] As shown in Figure 13, the encoding device 100 includes a position information encoding unit 101, a position information decoding unit 102, a point cloud generation unit 103, an attribute information encoding unit 104, and a bitstream generation unit 105.

[0144] The position information encoding unit 101 encodes the geometric data (position information) of the point cloud (3D data) input to the encoding device 100. The encoding method is arbitrary, as long as it is compatible with scalable decoding. For example, the position information encoding unit 101 hierarchically divides the geometric data to generate an octree and encodes the octree. Furthermore, for example, processing such as filtering or quantization for noise suppression (denoising) can be performed. The position information encoding unit 101 provides the encoded data of the geometric data to the position information decoding unit 102 and the bitstream generation unit 105.

[0145] The position information decoding unit 102 acquires the encoded data of the geometric data provided by the position information encoding unit 101 and decodes the encoded data. The decoding method is arbitrary, as long as it corresponds to the encoding method of the position information encoding unit 101. For example, processing such as filtering for denoising or inverse quantization can be performed. The position information decoding unit 102 provides the generated geometric data (decoding result) to the point cloud generation unit 103.

[0146] The point cloud generation unit 103 acquires the attribute data (attribute information) of the point cloud input to the encoding device 100 and the geometric data (decoding result) provided by the position information decoding unit 102. The point cloud generation unit 103 performs a process (recoloring process) to match the attribute data with the geometric data (decoding result). The point cloud generation unit 103 provides the attribute data corresponding to the geometric data (decoding result) to the attribute information encoding unit 104.

[0147] The attribute information encoding unit 104 acquires geometric data (decoding result) and attribute data provided by the point cloud generation unit 103. The attribute information encoding unit 104 uses the geometric data (decoding result) to encode the attribute data and generates encoded attribute data.

[0148] At this time, the attribute information encoding unit 104 applies the technique (method 1) described above to encode the attribute data. The attribute information encoding unit 104 provides the encoded attribute data to the bitstream generation unit 105.

[0149] The bitstream generation unit 105 acquires the encoded data of the geometric data provided by the position information encoding unit 101. Additionally, the bitstream generation unit 105 acquires the encoded data of the attribute data provided by the attribute information encoding unit 104. The bitstream generation unit 105 generates a bitstream including these encoded data. The bitstream generation unit 105 outputs the generated bitstream to the outside of the encoding device 100.

[0150] With this configuration, the encoding device 100 can derive the centroid of points in the hierarchical structure of the attribute data and set reference points based on the centroid. By setting a point close to the centroid as the reference point in this way, a point that is close to a larger number of other points can be set as the reference point. In short, the reference point can be set so as to suppress the decrease in prediction accuracy for more prediction points, and the decrease in encoding efficiency can therefore be suppressed.

[0151] Note that each of these processing units (the position information encoding unit 101 to the bitstream generation unit 105) of the encoding device 100 has an arbitrary configuration. For example, each processing unit can be configured by logic circuitry that implements the above-described processing. Furthermore, each processing unit may include, for example, a central processing unit (CPU), read-only memory (ROM), random access memory (RAM), etc., and use them to execute a program that implements the above-described processing. Of course, each processing unit can have both configurations, implementing one part of the above-described processing by logic circuitry and another part by executing a program. The configurations of the processing units can be independent of each other. For example, some processing units can implement part of the above-described processing by logic circuitry, other processing units can implement it by executing a program, and still other processing units can implement it by both logic circuitry and program execution.

[0152] <Attribute Information Encoding Unit>

[0153] Figure 14 is a block diagram showing a main configuration example of the attribute information encoding unit 104 (Figure 13). Note that Figure 14 shows the main elements, such as processing units and data flows, and does not necessarily show all elements. That is, within the attribute information encoding unit 104, there may be processing units not shown as blocks in Figure 14, and there may be processing or data flows not shown as arrows or the like in Figure 14.

[0154] As shown in Figure 14, the attribute information encoding unit 104 includes a hierarchical processing unit 111, a quantization unit 112, and an encoding unit 113.

[0155] The hierarchical processing unit 111 performs processing related to the layering of attribute data. For example, the hierarchical processing unit 111 acquires the attribute data and geometric data (decoding result) provided by the point cloud generation unit 103. The hierarchical processing unit 111 uses the geometric data to layer the attribute data. At this time, the hierarchical processing unit 111 performs layering by applying the above-described technique (method 1). That is, the hierarchical processing unit 111 derives the centroid of the points in each level and selects a reference point based on the centroid. Then, the hierarchical processing unit 111 sets the reference relationship in each level of the hierarchical structure, derives the predicted value of the attribute data of each prediction point from the attribute data of the reference points based on the reference relationship, and derives the difference between the attribute data and the predicted value. The hierarchical processing unit 111 provides the layered attribute data (difference) to the quantization unit 112.

[0156] At this time, the hierarchical processing unit 111 can also generate control information about the hierarchical structure. The hierarchical processing unit 111 can also provide the generated control information together with the attribute data (difference) to the quantization unit 112.

[0157] Quantization unit 112 acquires attribute data (differences) and control information provided by hierarchical processing unit 111. Quantization unit 112 quantizes the attribute data (differences). The quantization method is arbitrary. Quantization unit 112 provides the quantized attribute data (differences) and control information to encoding unit 113.
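Since the quantization method is left open, the following is only one possible example: a uniform scalar quantizer with a hypothetical step size, together with the matching reconstruction that a decoder would apply. The function names are illustrative.

```python
def quantize(diffs, step):
    """Map each difference to the nearest integer multiple of `step`."""
    return [round(d / step) for d in diffs]


def dequantize(levels, step):
    """Reconstruct approximate differences from quantized levels."""
    return [q * step for q in levels]


diffs = [9.0, -3.0, 0.4]
levels = quantize(diffs, step=4)
restored = dequantize(levels, step=4)
```

A larger step reduces the amount of information to be encoded at the cost of a larger reconstruction error in the differences.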

[0158] Encoding unit 113 acquires the quantized attribute data (difference) and control information provided by quantization unit 112. Encoding unit 113 encodes the quantized attribute data (difference) and generates encoded attribute data. The encoding method is arbitrary. Furthermore, encoding unit 113 includes control information in the generated encoded data. In other words, it generates encoded attribute data that includes control information. Encoding unit 113 provides the generated encoded data to bitstream generation unit 105.

[0159] By performing layering as described above, the attribute information encoding unit 104 can set points close to the centroid as reference points, and can thus set points that are close to a larger number of other points as reference points. In short, reference points can be set so as to suppress the decrease in prediction accuracy for more prediction points, and the decrease in coding efficiency can therefore be suppressed.

[0160] Note that these processing units (the hierarchical processing unit 111 to the encoding unit 113) have arbitrary configurations. For example, each processing unit can be configured by logic circuitry that implements the above-described processing. Furthermore, each processing unit may include, for example, a CPU, ROM, RAM, etc., and use them to execute a program that implements the above-described processing. Of course, each processing unit can have both configurations, implementing one part of the above-described processing by logic circuitry and another part by executing a program. The configurations of the processing units can be independent of each other. For example, some processing units can implement part of the above-described processing by logic circuitry, other processing units can implement it by executing a program, and still other processing units can implement it by both logic circuitry and program execution.

[0161] <Hierarchical Processing Unit>

[0162] Figure 15 is a block diagram showing a main configuration example of the hierarchical processing unit 111 (Figure 14). Note that Figure 15 shows the main elements, such as processing units and data flows, and does not necessarily show all elements. That is, within the hierarchical processing unit 111, there may be processing units not shown as blocks in Figure 15, and there may be processing or data flows not shown as arrows or the like in Figure 15.

[0163] As shown in Figure 15, the hierarchical processing unit 111 includes a reference point setting unit 121, a reference relationship setting unit 122, an inversion unit 123, and a weighted value derivation unit 124.

[0164] The reference point setting unit 121 performs processing related to the setting of reference points. For example, based on the geometric data of each point, the reference point setting unit 121 classifies the set of points that is the processing target into reference points, whose attribute data is referenced, and prediction points, for which predicted values of the attribute data are derived. That is, the reference point setting unit 121 sets reference points and prediction points. The reference point setting unit 121 recursively repeats this processing on the reference points. That is, the reference point setting unit 121 takes the reference points set in the previous level as the processing target and sets the reference points and prediction points of the level that is the processing target. In this way, a hierarchical structure is constructed. That is, the attribute data is hierarchically structured. The reference point setting unit 121 provides information indicating the reference points and prediction points set for each level to the reference relationship setting unit 122.

[0165] The reference relationship setting unit 122 performs processing related to setting the reference relationships of each level based on the information provided by the reference point setting unit 121. That is, for each prediction point of each level, the reference relationship setting unit 122 sets the reference point (i.e., the reference destination) to be referenced when deriving the predicted value. Then, the reference relationship setting unit 122 derives the predicted value of the attribute data of each prediction point based on the reference relationships. In other words, the reference relationship setting unit 122 uses the attribute data of the reference point set as the reference destination to derive the predicted value of the attribute data of the prediction point. Furthermore, the reference relationship setting unit 122 derives the difference between the derived predicted value and the attribute data of the prediction point. The reference relationship setting unit 122 provides the derived differences (the layered attribute data) for each level to the inversion unit 123.
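The prediction and difference derivation can be sketched as follows. The description does not state here how the predicted value is computed from the reference points, so the sketch assumes an inverse-distance weighted mean of scalar attributes; the function names and values are hypothetical.

```python
from math import dist


def predict(pred_pos, refs):
    """Inverse-distance weighted mean of the reference attributes.

    refs -- list of (position, attribute) pairs for the reference points
    """
    weights = [1.0 / max(dist(pred_pos, pos), 1e-9) for pos, _ in refs]
    total = sum(weights)
    return sum(w * attr for w, (_, attr) in zip(weights, refs)) / total


def residual(pred_pos, pred_attr, refs):
    """Difference between the actual attribute and its predicted value."""
    return pred_attr - predict(pred_pos, refs)


# A prediction point midway between two reference points.
refs = [((0, 0, 0), 100.0), ((2, 0, 0), 200.0)]
diff = residual((1, 0, 0), 160.0, refs)
```

The predicted value is 150, so only the difference of 10 has to be quantized and encoded; the closer the reference points are to the prediction point, the smaller such differences tend to be.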

[0166] Note that the reference point setting unit 121 can generate control information, such as information about the layering of the attribute data described above, and provide the control information to the quantization unit 112 so that it is transmitted to the decoding side.

[0167] The inversion unit 123 performs processing related to the inversion of levels. For example, the inversion unit 123 acquires the layered attribute data provided by the reference relationship setting unit 122. In this attribute data, the information of each level is arranged in the order of generation. The inversion unit 123 inverts the hierarchy of the attribute data. For example, the inversion unit 123 assigns a level number to each level of the attribute data in the reverse order of generation, so that the generation order runs from the lowest level to the highest level. The level number identifies the level: the highest level is 0, the value increments by 1 for each level down, and the lowest level has the maximum value. The inversion unit 123 provides the hierarchically inverted attribute data to the weighted value derivation unit 124.
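The renumbering can be sketched as follows, assuming that the levels are held in a list in generation order (lowest level first); the function name and labels are illustrative.

```python
def invert_levels(levels_in_generation_order):
    """Return {level number: data}, numbering in reverse generation order."""
    reversed_levels = list(reversed(levels_in_generation_order))
    return dict(enumerate(reversed_levels))


# The lowest level is generated first and the highest level last.
generated = ["lowest", "middle", "highest"]
numbered = invert_levels(generated)
```

After inversion the highest level has number 0 and the lowest level has the largest number.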

[0168] The weighted value derivation unit 124 performs weighting-related processing. For example, the weighted value derivation unit 124 acquires the attribute data provided by the inversion unit 123. The weighted value derivation unit 124 derives the weighted values of the acquired attribute data. The method of deriving the weighted values is arbitrary. The weighted value derivation unit 124 provides the attribute data (difference) and the derived weighted values to the quantization unit 112 (Figure 14). Furthermore, the weighted value derivation unit 124 can provide the derived weighted values as control information to the quantization unit 112 so that the weighted values are transmitted to the decoding side.

[0169] In the hierarchical processing unit 111 described above, the present technology can be applied to the reference point setting unit 121. That is, the reference point setting unit 121 can apply "Method 1" to derive the centroid of the points and set reference points based on the centroid. In this way, the reduction in prediction accuracy and coding efficiency can be suppressed.

[0170] Note that the procedure of this layering is arbitrary. For example, the processing of the reference point setting unit 121 and the processing of the reference relationship setting unit 122 can be performed in parallel. For example, for each level, the reference point setting unit 121 can set the reference points and prediction points, and the reference relationship setting unit 122 can set the reference relationships.

[0171] Note that these processing units (the reference point setting unit 121 to the weighted value derivation unit 124) have arbitrary configurations. For example, each processing unit can be configured by logic circuitry that implements the above-described processing. Furthermore, each processing unit may include, for example, a CPU, ROM, RAM, etc., and use them to execute a program that implements the above-described processing. Of course, each processing unit can have both configurations, implementing one part of the above-described processing by logic circuitry and another part by executing a program. The configurations of the processing units can be independent of each other. For example, some processing units can implement part of the above-described processing by logic circuitry, other processing units can implement it by executing a program, and still other processing units can implement it by both logic circuitry and program execution.

[0172] <Encoding Process>

[0173] Next, the processing performed by the encoding device 100 will be described. The encoding device 100 encodes the point cloud data by performing the encoding process. An example of the flow of the encoding process will be described with reference to the flowchart in Figure 16.

[0174] When the encoding process begins, in step S101, the position information encoding unit 101 of the encoding device 100 encodes the geometric data (position information) of the input point cloud and generates encoded data of the geometric data.

[0175] In step S102, the position information decoding unit 102 decodes the encoded data of the geometric data generated in step S101 and generates geometric data (decoding result).

[0176] In step S103, the point cloud generation unit 103 uses the attribute data (attribute information) of the input point cloud and the geometric data (decoding result) generated in step S102 to perform recoloring processing and associate the attribute data with the geometric data.

[0177] In step S104, the attribute information encoding unit 104 performs attribute information encoding processing to encode the attribute data that underwent recoloring processing in step S103, and generates encoded attribute data. At this time, the attribute information encoding unit 104 performs the processing by applying the aforementioned technique (method 1). For example, in the layering of attribute data, the attribute information encoding unit 104 derives the centroid of the points and sets a reference point based on the centroid. Details of the attribute information encoding processing will be described later.

[0178] In step S105, the bitstream generation unit 105 generates and outputs a bitstream, which includes encoded data of the geometric data generated in step S101 and encoded data of the attribute data generated in step S104.

[0179] When step S105 is completed, the encoding process ends.

[0180] By performing each step of the processing in this manner, the encoding device 100 can suppress the decrease in prediction accuracy and the decrease in encoding efficiency.

[0181] <Attribute Information Encoding Process>

[0182] Next, an example of the flow of the attribute information encoding process performed in step S104 of Figure 16 will be described with reference to the flowchart in Figure 17.

[0183] When the attribute information encoding process begins, in step S111, the hierarchical processing unit 111 of the attribute information encoding unit 104 layers the attribute data by executing the layering process. That is, reference points and prediction points are set for each level, and reference relationships are also set. At this time, the hierarchical processing unit 111 performs layering by applying the above-described technique (method 1). For example, in the layering of the attribute data, the hierarchical processing unit 111 derives the centroid of the points and sets reference points based on the centroid. Details of the layering process will be described later.

[0184] In step S112, the hierarchical processing unit 111 derives the predicted value of the attribute data of each prediction point in each level of the attribute data hierarchically processed in step S111, and derives the difference between the attribute data of the prediction point and the predicted value.

[0185] In step S113, the quantization unit 112 quantizes each difference derived in step S112.

[0186] In step S114, the encoding unit 113 encodes the differences that were quantized in step S113 and generates encoded data of the attribute data.

[0187] When step S114 is completed, the attribute information encoding process ends, and the process returns to Figure 16.

[0188] By performing each step of the processing in this manner, the hierarchical processing unit 111 can apply "Method 1" described above to derive the centroids of points in the hierarchical structure of the attribute data and set reference points based on the centroids. Therefore, the hierarchical processing unit 111 can hierarchically process the attribute data to suppress the reduction in prediction accuracy and thus suppress the reduction in coding efficiency.

[0189] <Hierarchical Processing Flow>

[0190] Next, an example of the flow of the layering process performed in step S111 of Figure 17 will be described with reference to the flowchart in Figure 18.

[0191] When the layering process begins, in step S121, the reference point setting unit 121 of the hierarchical processing unit 111 sets the variable LoD index (LoD Index), which indicates the level that is the processing target, to an initial value (e.g., "0").

[0192] In step S122, the reference point setting unit 121 performs reference point setting processing and sets the reference points in the hierarchy as processing targets (i.e., prediction points are also set). The details of the reference point setting processing will be described later.

[0193] In step S123, the reference relationship setting unit 122 sets the reference relationship as the level of the processing target (which reference point to refer to when deriving the predicted value of each prediction point).

[0194] In step S124, the reference point setting unit 121 increments the LoD index and sets the processing target to the next level.

[0195] In step S125, the reference point setting unit 121 determines whether all points have been processed. If it is determined that there are unprocessed points, that is, if it is determined that the layering is not complete, the processing returns to step S122 and repeats the processing of step S122 and subsequent steps. As described above, the processing of steps S122 to S125 is performed for each layer, and if it is determined in step S125 that all points have been processed, the processing proceeds to step S126.

[0196] In step S126, the inversion unit 123 reverses the hierarchy of the attribute data generated as described above and assigns a layer number to each layer in the order opposite to the generation order.

[0197] In step S127, the weighted value derivation unit 124 derives the weight values of the attribute data for each layer.

[0198] When the processing in step S127 is completed, the process returns to Figure 14.

[0199] By performing each step of the processing in this manner, the layering processing unit 111 can apply "Method 1" described above, deriving the centroids of the points when layering the attribute data and setting reference points based on those centroids. The layering processing unit 111 can therefore layer the attribute data in a way that suppresses the reduction in prediction accuracy, and thus suppresses the reduction in coding efficiency.
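The loop of steps S121 to S126 can be sketched as follows. This is only an illustration under assumptions: `build_layers` and the callback `select_refs` are hypothetical names, points are represented by hashable identifiers, and the setting of reference relationships (step S123) and weight values (step S127) is omitted.

```python
def build_layers(points, select_refs):
    """Sketch of steps S121-S126 (all names are hypothetical).

    select_refs(remaining, lod_index) returns the subset of `remaining`
    kept as reference points; the rest become prediction points.
    """
    remaining = list(points)
    generated = []                      # prediction points, in generation order
    lod_index = 0                       # S121: initial LoD index
    while remaining:                    # S125: repeat until all points are processed
        # S122: classify the points of this layer
        refs = select_refs(remaining, lod_index) if len(remaining) > 1 else []
        ref_set = set(refs)
        preds = [p for p in remaining if p not in ref_set]
        if not preds:                   # guard: no prediction point was produced
            preds, refs = remaining, []
        generated.append(preds)
        remaining = list(refs)          # reference points are classified again
        lod_index += 1                  # S124: move to the next layer
    generated.reverse()                 # S126: number layers opposite to generation
    return generated


layers = build_layers(list(range(8)), lambda pts, lod: pts[::2])
```

Here every second point is kept as a reference point purely for demonstration; in "Method 1" the callback would select points based on centroids.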

[0200] <Flow of Reference Point Setting Process>

[0201] Next, an example of the flow of the reference point setting process performed in step S122 of Figure 18 will be described with reference to the flowchart of Figure 19.

[0202] When the reference point setting process begins, in step S141, the reference point setting unit 121 specifies the set of points used for deriving the centroid, and derives the centroid of that set of points. As described above, the method for deriving the centroid is arbitrary. For example, the centroid can be derived using any of the methods shown in the table of Figure 8.

[0203] In step S142, the reference point setting unit 121 selects a point close to the centroid derived in step S141 as a reference point. The method for selecting the reference point is arbitrary. For example, the reference point can be selected using any of the methods shown in the table of Figure 11.

[0204] When step S142 is completed, the reference point setting process ends, and the process returns to Figure 18.

[0205] By performing each step of the processing in this manner, the reference point setting unit 121 can apply "Method 1" described above, deriving the centroids of the points when layering the attribute data and setting reference points based on those centroids. The layering processing unit 111 can therefore layer the attribute data in a way that suppresses the reduction in prediction accuracy, and thus suppresses the reduction in coding efficiency.
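Steps S141 and S142 can be sketched as below. The text leaves both the centroid derivation and the selection rule arbitrary, so this example assumes the simplest choices: the arithmetic mean of the point positions as the centroid and the smallest Euclidean distance as "close". The function names are hypothetical.

```python
def centroid(points):
    """S141 (assumed variant): arithmetic mean of the point positions."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def select_reference_point(points):
    """S142 (assumed variant): the point nearest to the centroid."""
    c = centroid(points)
    # Squared Euclidean distance; on a tie the first point in the list wins.
    return min(points, key=lambda p: sum((p[k] - c[k]) ** 2 for k in range(3)))


pts = [(0, 0, 0), (2, 0, 0), (1, 1, 0), (4, 4, 4)]
ref = select_reference_point(pts)   # centroid is (1.75, 1.25, 1.0)
```

Because the decoder can repeat the same computation on the decoded geometry, a rule of this kind needs no extra signaling, provided that both sides use the same tie-breaking.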

[0206] <Decoding device>

[0207] Next, another example of a device to which this technology is applied will be described. Figure 20 is a block diagram illustrating an example configuration of a decoding device as one aspect of an information processing apparatus to which this technology is applied. The decoding device 200 shown in Figure 20 is a device that decodes encoded data of a point cloud (3D data). The decoding device 200 decodes the encoded data of the point cloud by applying the present technology (Method 1) described in this embodiment.

[0208] Note that Figure 20 shows the main elements, such as processing units and data flows, and does not necessarily show all of them. That is, the decoding device 200 may include processing units that are not shown as blocks in Figure 20, and there may be processing or data flows that are not shown as arrows or the like in Figure 20.

[0209] As shown in Figure 20, the decoding device 200 includes an encoded data extraction unit 201, a position information decoding unit 202, an attribute information decoding unit 203, and a point cloud generation unit 204.

[0210] The encoded data extraction unit 201 acquires and maintains the bitstream input to the decoding device 200. The encoded data extraction unit 201 extracts encoded data of geometric data (position information) and attribute data (attribute information) from the maintained bitstream. The encoded data extraction unit 201 provides the encoded data of the extracted geometric data to the position information decoding unit 202. The encoded data extraction unit 201 provides the encoded data of the extracted attribute data to the attribute information decoding unit 203.

[0211] The position information decoding unit 202 acquires the encoded data of the geometric data provided by the encoded data extraction unit 201. The position information decoding unit 202 decodes the encoded data of the geometric data and generates geometric data (decoding result). The decoding method is arbitrary, as long as it is similar to the method used by the position information decoding unit 102 of the encoding device 100. The position information decoding unit 202 provides the generated geometric data (decoding result) to the attribute information decoding unit 203 and the point cloud generation unit 204.

[0212] The attribute information decoding unit 203 acquires the encoded attribute data provided by the encoded data extraction unit 201. The attribute information decoding unit 203 acquires the geometric data (decoding result) provided by the position information decoding unit 202. The attribute information decoding unit 203 uses the position information (decoding result) to decode the encoded attribute data using the method described above (Method 1), and generates attribute data (decoding result). The attribute information decoding unit 203 provides the generated attribute data (decoding result) to the point cloud generation unit 204.

[0213] Point cloud generation unit 204 acquires geometric data (decoding result) provided by position information decoding unit 202. Point cloud generation unit 204 acquires attribute data (decoding result) provided by attribute information decoding unit 203. Point cloud generation unit 204 uses geometric data (decoding result) and attribute data (decoding result) to generate a point cloud (decoding result). Point cloud generation unit 204 outputs the generated point cloud (decoding result) data to the outside of decoding device 200.

[0214] With this configuration, the decoding device 200 can select a point close to the centroid of the points as a reference point in the de-layering process. Therefore, for example, the decoding device 200 can correctly decode the encoded data of the attribute data encoded by the encoding device 100 described above. Thus, the reduction in prediction accuracy and the reduction in encoding efficiency can be suppressed.

[0215] Note that these processing units (encoded data extraction unit 201 to point cloud generation unit 204) have arbitrary configurations. For example, each processing unit can be configured with a logic circuit that implements the above-described processing. Alternatively, each processing unit may include, for example, a CPU, ROM, and RAM, and implement the above-described processing by executing a program with them. Of course, each processing unit can have both configurations, implementing part of the above-described processing with a logic circuit and the rest by executing a program. The configurations of the processing units can be independent of each other; for example, some processing units can implement the processing with logic circuits, others by executing a program, and still others by both.

[0216] <Attribute Information Decoding Unit>

[0217] Figure 21 is a block diagram showing a main configuration example of the attribute information decoding unit 203 (Figure 20). Note that Figure 21 shows the main elements, such as processing units and data flows, and does not necessarily show all of them. That is, the attribute information decoding unit 203 may include processing units that are not shown as blocks in Figure 21, and there may be processing or data flows that are not shown as arrows or the like in Figure 21.

[0218] As shown in Figure 21, the attribute information decoding unit 203 includes a decoding unit 211, an inverse quantization unit 212, and a de-layering processing unit 213.

[0219] Decoding unit 211 performs processing related to decoding the encoded data of the attribute data. For example, decoding unit 211 acquires the encoded data of the attribute data provided to attribute information decoding unit 203.

[0220] Decoding unit 211 decodes the encoded attribute data and generates attribute data (decoding result). The decoding method is arbitrary, as long as it corresponds to the encoding method used by the encoding unit 113 (Figure 14) of the encoding device 100. The generated attribute data (decoding result) corresponds to the attribute data before encoding; that is, it is the quantized difference between the attribute data and its predicted value. Decoding unit 211 provides the generated attribute data (decoding result) to inverse quantization unit 212.

[0221] Note that when the encoded attribute data includes control information about the weight values and control information about the hierarchical structure of the attribute data, the decoding unit 211 also provides this control information to the inverse quantization unit 212.

[0222] The inverse quantization unit 212 performs processing related to the inverse quantization of the attribute data. For example, the inverse quantization unit 212 acquires the attribute data (decoding result) and control information provided by the decoding unit 211.

[0223] The inverse quantization unit 212 performs inverse quantization on the attribute data (decoding result). At this time, if control information about the weight values is provided from the decoding unit 211, the inverse quantization unit 212 also acquires that control information and performs inverse quantization on the attribute data (decoding result) based on it (using the weight values derived from the control information).

[0224] Furthermore, when the decoding unit 211 provides control information about the hierarchical structure of the attribute data, the inverse quantization unit 212 also obtains control information.

[0225] The inverse quantization unit 212 provides the inverse-quantized attribute data (decoding result) to the de-layering processing unit 213. Furthermore, if control information regarding the layering of the attribute data is obtained from the decoding unit 211, the inverse quantization unit 212 also provides this control information to the de-layering processing unit 213.

[0226] The de-layering processing unit 213 acquires the inverse-quantized attribute data (decoding result) provided by the inverse quantization unit 212. As described above, this attribute data is a difference. Furthermore, the de-layering processing unit 213 acquires the geometric data (decoding result) provided by the position information decoding unit 202. Using the geometric data, the de-layering processing unit 213 performs de-layering on the acquired attribute data (differences); de-layering is the inverse of the layering performed by the layering processing unit 111 (Figure 14) of the encoding device 100.

[0227] De-layering will be described here. For example, the de-layering processing unit 213 layers the attribute data based on the geometric data provided from the position information decoding unit 202, using a method similar to that of the encoding device 100 (layering processing unit 111). That is, the de-layering processing unit 213 sets reference points and prediction points for each layer based on the decoded geometric data, and sets the hierarchical structure of the attribute data. The de-layering processing unit 213 also uses the reference points and prediction points to set the reference relationships (the reference destinations of each prediction point) for each layer of the hierarchical structure.

[0228] Then, the de-layering processing unit 213 de-layers the acquired attribute data (differences) using the hierarchical structure and the reference relationships of each layer. That is, the de-layering processing unit 213 derives the predicted value of each prediction point from its reference points according to the reference relationships, and recovers the attribute data of each prediction point by adding the predicted value to the difference. The de-layering processing unit 213 performs this processing layer by layer, from the higher layers to the lower layers. In other words, prediction points whose attribute data has already been recovered in a layer higher than the layer being processed are used as reference points to recover the attribute data of the prediction points in the layer being processed.

[0229] In the de-layering process performed in this way, the de-layering processing unit 213 sets reference points by applying the aforementioned technique (Method 1) when layering the attribute data based on the decoded geometric data. That is, the de-layering processing unit 213 derives the centroid of the points and selects points close to the centroid as reference points. The de-layering processing unit 213 provides the de-layered attribute data to the point cloud generation unit 204 (Figure 20) as the decoding result.
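The recovery described above can be sketched as follows. The names `reconstruct`, `layers`, `references`, and `residuals` are hypothetical, attribute values are scalars, and the predicted value is assumed to be the plain average of the referenced attributes (the text does not fix the prediction rule here). A point without references is assumed to have a predicted value of zero.

```python
def reconstruct(layers, references, residuals):
    """Sketch of de-layering (all names and the averaging rule are assumptions).

    layers:     point ids per layer, highest layer first
    references: point id -> ids of its reference points in higher layers
    residuals:  point id -> inverse-quantized difference
    """
    attrs = {}
    for layer in layers:                         # process from higher to lower layers
        for pid in layer:
            refs = references.get(pid, [])
            # Predicted value from already-recovered reference points
            pred = sum(attrs[r] for r in refs) / len(refs) if refs else 0.0
            attrs[pid] = pred + residuals[pid]   # predicted value + difference
    return attrs


recovered = reconstruct(
    layers=[[0], [1], [2]],
    references={1: [0], 2: [0, 1]},
    residuals={0: 10.0, 1: 2.0, 2: -1.0},
)
```

Processing the higher layers first guarantees that every referenced attribute has already been recovered when it is needed.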

[0230] By performing de-stratification as described above, the de-stratification processing unit 213 can set points close to the centroid as reference points, and thus can stratify the attribute data to suppress the decrease in prediction accuracy. In other words, the attribute information decoding unit 203 can correctly decode encoded data encoded using a similar method. For example, the attribute information decoding unit 203 can correctly decode encoded data of attribute data encoded by the attribute information encoding unit 104 described above. Therefore, the decrease in encoding efficiency can be suppressed.

[0231] Note that these processing units (decoding unit 211 to de-layering processing unit 213) have arbitrary configurations. For example, each processing unit can be configured with a logic circuit that implements the above-described processing. Alternatively, each processing unit may include, for example, a CPU, ROM, and RAM, and implement the above-described processing by executing a program with them. Of course, each processing unit can have both configurations, implementing part of the above-described processing with a logic circuit and the rest by executing a program. The configurations of the processing units can be independent of each other; for example, some processing units can implement the processing with logic circuits, others by executing a program, and still others by both.

[0232] <Decoding Process>

[0233] Next, the processing performed by the decoding device 200 will be described. The decoding device 200 decodes the encoded data of the point cloud by performing the decoding process. An example of the flow of the decoding process will be described with reference to the flowchart of Figure 22.

[0234] When the decoding process begins, in step S201, the encoded data extraction unit 201 of the decoding device 200 acquires and holds the bitstream, and extracts the encoded data of the geometric data and of the attribute data from the bitstream.

[0235] In step S202, the position information decoding unit 202 decodes the encoded data of the extracted geometric data and generates geometric data (decoding result).

[0236] In step S203, the attribute information decoding unit 203 performs attribute information decoding processing, decoding the encoded data of the attribute data extracted in step S201, and generating attribute data (decoding result). At this time, the attribute information decoding unit 203 performs the processing by applying the aforementioned technique (method 1). For example, in the layering of attribute data, the attribute information decoding unit 203 derives the centroid of the points and sets points close to the centroid as reference points. Details of the attribute information decoding processing will be described later.

[0237] In step S204, the point cloud generation unit 204 uses the geometric data (decoding result) generated in step S202 and the attribute data (decoding result) generated in step S203 to generate and output the point cloud (decoding result).

[0238] The decoding process ends when step S204 is completed.

[0239] By performing each step of the processing in this manner, the decoding device 200 can correctly decode the encoded data of attribute data encoded using a similar method. For example, the decoding device 200 can correctly decode the encoded data of attribute data encoded by the encoding device 100 described above. Therefore, the reduction in prediction accuracy and the reduction in encoding efficiency can be suppressed.

[0240] <Attribute Information Decoding Process>

[0241] Next, an example of the flow of the attribute information decoding process performed in step S203 of Figure 22 will be described with reference to the flowchart of Figure 23.

[0242] When the attribute information decoding process begins, in step S211, the decoding unit 211 of the attribute information decoding unit 203 decodes the encoded data of the attribute data and generates attribute data (decoding result). This attribute data (decoding result) is quantized as described above.

[0243] In step S212, the inverse quantization unit 212 performs inverse quantization on the attribute data (decoding result) generated in step S211 by performing inverse quantization processing.

[0244] In step S213, the de-stratification processing unit 213 performs de-stratification processing to de-stratify the attribute data (differences) that were inversely quantized in step S212 and derive the attribute data for each point. At this time, the de-stratification processing unit 213 performs de-stratification by applying the aforementioned technique (method 1). For example, in the stratification of attribute data, the de-stratification processing unit 213 derives the centroids of the points and sets points close to the centroids as reference points. Details of the de-stratification processing will be described later.

[0245] When step S213 is completed, the attribute information decoding process ends, and the process returns to Figure 22.

[0246] By performing each step of the processing in this manner, the attribute information decoding unit 203 can apply "Method 1" described above and set a point close to the centroid of the points as a reference point when layering the attribute data. Therefore, the de-layering processing unit 213 can layer the attribute data in a way that suppresses the decrease in prediction accuracy. In other words, the attribute information decoding unit 203 can correctly decode encoded data encoded using a similar method. For example, the attribute information decoding unit 203 can correctly decode the encoded data of the attribute data encoded by the attribute information encoding unit 104 described above. Therefore, the decrease in encoding efficiency can be suppressed.

[0247] <De-layering process>

[0248] Next, an example of the flow of the de-layering process performed in step S213 of Figure 23 will be described with reference to the flowchart of Figure 24.

[0249] When the de-layering process begins, in step S221, the de-layering processing unit 213 layers the attribute data (decoding result) using the geometric data (decoding result), recovering the reference points and prediction points set on the encoding side for each layer, as well as the reference relationships for each layer. That is, the de-layering processing unit 213 performs a process similar to the layering process performed by the layering processing unit 111, setting the reference points and prediction points for each layer, and also setting the reference relationships for each layer.

[0250] For example, the de-layering processing unit 213 applies the above-described "method 1" similarly to the layering processing unit 111, derives the centroid of the point, and sets the point close to the centroid as the reference point.

[0251] In step S222, the de-layering processing unit 213 uses the hierarchical structure and the reference relationships to de-layer the attribute data (decoding result) and recover the attribute data of each point. That is, the de-layering processing unit 213 derives the predicted value of the attribute data of each prediction point from the attribute data of its reference points based on the reference relationships, and adds the predicted value to the difference (the decoded attribute data) to recover the attribute data.

[0252] When step S222 is completed, the de-layering process ends, and the process returns to Figure 23.

[0253] By performing each step of the processing in this manner, the de-layering processing unit 213 can achieve layering similar to that during encoding. That is, the attribute information decoding unit 203 can correctly decode the encoded data encoded using a similar method. For example, the attribute information decoding unit 203 can correctly decode the encoded data of the attribute data encoded by the attribute information encoding unit 104 described above. Therefore, the reduction in encoding efficiency can be suppressed.

[0254] <3. Second Embodiment>

[0255] <Method 2>

[0256] Next, the case of applying "Method 2" described above with reference to Figure 6 will be described. In "Method 2", reference points are selected based on the distribution pattern (distribution mode) of the points when layering the attribute data.

[0257] For example, the table (table information) shown in Figure 25 associates each distribution pattern (distribution mode) of points in the processing target region where a reference point is set with information (an index) indicating the point to be selected in that case. For example, the second row from the top of the table shows that when the distribution pattern of points in a 2×2×2 voxel region where a reference point is set is "10100001", the point with index "2", i.e., the second point to appear, is selected. Each bit of the distribution pattern "10100001" indicates whether a point is present in the corresponding voxel of the 2×2×2 region: the value "1" indicates that a point is present in the voxel to which the bit is allocated, and the value "0" indicates that no point is present in that voxel.

[0258] Similarly, this table indicates the index of the point to be selected for each distribution pattern. That is, in the hierarchical structure of the attribute data, this table is referenced, and the points whose indices correspond to the distribution patterns of the points in the processing target area where reference points are set are selected as reference points.
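The table lookup can be sketched as follows. Only the entry quoted in the text ("10100001" mapped to index "2") is taken from the source; the dictionary `TABLE`, the function name, the left-to-right scan of the bits, and the 1-based interpretation of the index ("the second point to appear") are assumptions.

```python
# Only this entry is quoted in the text; a real table would cover every pattern.
TABLE = {"10100001": 2}


def select_by_pattern(pattern, table):
    """Return the bit position of the voxel whose point becomes the reference point.

    Assumes bits are scanned left to right and that the index counts
    occupied voxels starting from 1.
    """
    index = table[pattern]
    occupied = [i for i, bit in enumerate(pattern) if bit == "1"]
    return occupied[index - 1]


voxel_position = select_by_pattern("10100001", TABLE)  # second occupied voxel
```

A lookup of this kind replaces the centroid and distance computations of "Method 1" with a single table access per voxel region.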

[0259] This method makes it easier to select reference points.

[0260]

[0261] The table information can be any information, as long as it associates each distribution pattern of points with information indicating the point to be selected. For example, as in method (1) shown in the second row from the top of the table in A of Figure 26, table information that selects a point close to the centroid for each point distribution pattern can be used. That is, the index of the point close to the centroid position can be associated with each distribution pattern.

[0262] In addition, for example, as in method (2) shown in the third row from the top of the table in A of Figure 26, table information that selects an arbitrary point for each point distribution pattern can be used. Furthermore, for example, as in method (3) shown in the fourth row from the top of the table, the table to be used can be selected from multiple tables. For example, the table to be used can be switched according to the layer (depth of the LoD).
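Table information of the kind described in method (1) could be generated as in the following sketch. The mapping from bit position to voxel coordinates (`voxel_coord`), the 1-based index, and the function names are assumptions; the actual bit allocation of Figure 25 is not given in the text.

```python
from itertools import product


def voxel_coord(i):
    """Assumed bit-to-voxel mapping inside a 2x2x2 region: bit i -> (x, y, z)."""
    return ((i >> 2) & 1, (i >> 1) & 1, i & 1)


def build_centroid_table():
    """Associate each non-empty occupancy pattern with the 1-based index
    of the occupied voxel nearest to the centroid of the occupied voxels."""
    table = {}
    for bits in product("01", repeat=8):
        pattern = "".join(bits)
        occupied = [voxel_coord(i) for i, b in enumerate(pattern) if b == "1"]
        if not occupied:
            continue                      # the empty pattern has no point to select
        c = [sum(p[k] for p in occupied) / len(occupied) for k in range(3)]
        dist = [sum((p[k] - c[k]) ** 2 for k in range(3)) for p in occupied]
        table[pattern] = dist.index(min(dist)) + 1   # first minimum wins on ties
    return table


centroid_table = build_centroid_table()
```

Under this assumed bit mapping the generated entry for "10100001" is 2, which happens to agree with the example quoted from Figure 25, but that agreement depends on the assumed mapping.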

[0263] <Signaling transmission of table information>

[0264] Note that this table information can be prepared in advance. For example, predefined table information can be defined by a standard. In this case, the signaling transmission of the table information (transmission from the encoding side to the decoding side) is unnecessary.

[0265] Furthermore, the location information decoding unit 102 can derive table information from the geometric data. Additionally, the location information decoding unit 202 can also derive table information from the geometric data (decoding result). In this case, signaling transmission of the table information (transmission from the encoding side to the decoding side) is unnecessary.

[0266] Of course, table information can be generated (or updated) by users, applications, etc. In this case, the generated (or updated) table information can be notified by signaling. That is, for example, encoding unit 113 can perform encoding of information about the table information and include its encoded data in the bit stream, etc., in order to perform signaling transmission.

[0267] Furthermore, as mentioned above, the table information can be switched according to the layer (depth of the LoD). In this case, as in method (1) shown in the second row from the top of the table in B of Figure 26, the switching method can be predefined by a standard or the like, so that information indicating the switching method is not signaled.

[0268] In addition, as in method (2) shown in the third row from the top of the table in B of Figure 26, the index (identification information) of the selected table can be signaled. For example, the index can be signaled in the attribute parameter set.

[0269] In addition, for example, as in method (3) shown in the fourth row from the top of the table in B of Figure 26, the selected table information itself can be signaled. For example, the table information can be signaled in the attribute brick header.

[0270] In addition, as in method (4) shown in the fifth row from the top of the table in B of Figure 26, a portion of the selected table information can be signaled. That is, the table information may be partially updatable. For example, the table information can be signaled in the attribute brick header.

[0271] Similarly, when applying method 2, the configuration of the encoding device 100 and the decoding device 200 is substantially similar to that when applying method 1. Therefore, the encoding device 100 can perform each process, such as encoding processing, attribute information encoding processing, and layering processing, in a process similar to that in the first embodiment.

[0272] <Reference Point Setting Process>

[0273] An example of the flow of the reference point setting process in this case will be described with reference to the flowchart of Figure 27. When the reference point setting process begins, in step S301, the reference point setting unit 121 refers to the table information and selects a reference point according to the point distribution pattern.

[0274] In step S302, the reference point setting unit 121 determines whether to notify information about the table being used via signaling. If it is determined that signaling notification should be used, the process proceeds to step S303.

[0275] In step S303, the reference point setting unit 121 signals information about the table being used. When the processing in step S303 ends, the reference point setting process ends, and the process returns to Figure 18.

[0276] Since the encoding device 100 transmits the table information in this manner, the decoding device 200 can use the table information to perform decoding.

[0277] Note that the decoding device 200 can perform each process in a similar manner to that in the case of the first embodiment, such as decoding process, attribute information decoding process, and de-layering process.

[0278] <4. Third Embodiment>

[0279] <Method 3>

[0280] Next, the case of applying "Method 3" described above with reference to Figure 6 will be described. In "Method 3", the reference points set when layering the attribute data can be signaled.

[0281] For example, as in method (1) shown in the second row from the top of the "Target Notified by Signaling" table of Figure 28, signaling can be used to indicate, for all nodes (all points), whether each node is referenced, i.e., whether the node is set as a reference point or a prediction point. For example, as shown in A of Figure 29, all nodes in all layers can be sorted in Morton order, and an index (index 0 to index K) can be assigned to each node. In other words, each node (and the information assigned to each node) can be identified by index 0 to index K.

[0282] In addition, for example, as in method (2) shown in the third row from the top of the "Target Notified by Signaling" table of Figure 28, signaling can be used to indicate, for only some of the layers, which nodes (points) are selected as reference points. For example, as shown in B of Figure 29, the layer (LoD) to be targeted by signaling can be specified, all nodes of that layer can be sorted in Morton order, and an index can be assigned to each node.
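Sorting nodes in Morton order and assigning indices can be sketched as follows. The bit interleaving shown here (x as the most significant of each bit triple) and the names `morton_code` and `assign_indices` are assumptions; the text does not state the axis order.

```python
def morton_code(x, y, z, bits=10):
    """Interleave the bits of x, y, z (assumed order: x, then y, then z)."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b + 2)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b)
    return code


def assign_indices(nodes):
    """Sort nodes in Morton order and give each one an index 0..K."""
    ordered = sorted(nodes, key=lambda n: morton_code(*n))
    return {node: k for k, node in enumerate(ordered)}


indices = assign_indices([(1, 1, 1), (0, 0, 1), (1, 0, 0)])
```

Since both sides can derive the same order from the geometry, the per-node information can be transmitted as a plain sequence without coordinates.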

[0283] For example, suppose the attribute data has the following characteristics: Figure 30 The hierarchical structure shown in A. That is, points are selected one by one from LoD2's #0, LoD2's #1, and LoD2's #2 as reference points to form each point of LoD1's #0. In this case, when following... Figure 30 When the search order shown in B assigns the index to points in LoD2, as... Figure 30The index of LoD2 in C indicates each point of LoD1#0. In other words, the distribution of points of LoD1#0 can be expressed by specifying "Lod20, Lod21, Lod20".

[0284] In this way, a 2×2×2 voxel region of LoD N can be specified by the index of LoD N-1. That is, a 2×2×2 voxel can be specified by LoD (hierarchical specification) and level (m-th). In this way, by specifying the level and index, signaling transmission can be performed only for a part of the level as needed, thus suppressing the increase in code size and the decrease in coding efficiency compared to method (1).

[0285] In addition, for example, as in method (3) shown in the fourth row from the top of the table "Target Notified by Signaling" in Figure 28, the points to be notified by signaling may be limited according to the number of points in an N×N×N voxel region at a lower level. That is, information about the setting of reference points can be signaled only for points that meet predetermined conditions. In this way, as shown in C of Figure 29, the number of nodes to be notified via signaling can be further reduced. Therefore, the reduction in coding efficiency can be suppressed.

[0286] For example, suppose the attribute data has the hierarchical structure shown in A of Figure 31. In this case, the target notified by signaling is limited to 2×2×2 voxel regions comprising three or more points. When indexes are assigned to the points in LoD2 according to the search order shown in B of Figure 31, the voxel region shown on the right of LoD2 is excluded from the target notified by signaling. Therefore, no index is assigned to this voxel region. Thus, as shown in C of Figure 31, the amount of data notified by signaling can be reduced and the increase in code size suppressed compared to the case of C of Figure 30.
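The restriction of method (3) can be sketched as follows. The sketch groups integer voxel coordinates into their parent 2×2×2 regions and assigns indexes only to regions holding at least three points. The helper name `signaled_regions` is hypothetical, and plain tuple ordering is used as a stand-in for the search order of B of Figure 31.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[int, int, int]


def signaled_regions(points: List[Point], min_points: int = 3) -> Dict[Point, int]:
    """Index only the 2x2x2 voxel regions that hold at least min_points points."""
    regions = defaultdict(list)
    for x, y, z in points:
        # Dropping the lowest coordinate bit gives the parent 2x2x2 region.
        regions[(x >> 1, y >> 1, z >> 1)].append((x, y, z))
    # Assumption: tuple order stands in for the actual search order.
    eligible = sorted(key for key, pts in regions.items() if len(pts) >= min_points)
    return {key: i for i, key in enumerate(eligible)}


points = [
    (0, 0, 0), (0, 1, 0), (1, 1, 1),   # region (0, 0, 0): three points
    (2, 0, 0), (3, 1, 1),              # region (1, 0, 0): two points, excluded
    (4, 4, 4), (5, 5, 5), (4, 5, 4),   # region (2, 2, 2): three points
]
region_index = signaled_regions(points)
```

Regions below the threshold receive no index, so nothing is signaled for them, which is the source of the reduction in code size described above.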

[0287] Note that methods (2) and (3) can be used in combination, for example, as in method (4) shown in the fifth row from the top of the table “Target Notified by Signaling” in Figure 28.

[0288] <Fixed-Length Signaling Transmission>

[0289] The signaling described above can be performed using fixed-length data. For example, signaling can be performed using the syntax shown in A of Figure 32. In the syntax of A of Figure 32, num_Lod is a parameter indicating the number of LoDs to be signaled. lodNo[i] is a parameter indicating the LoD number. voxelType[i] is a parameter indicating the type of voxel to be signaled. By specifying this parameter, the transmission target within the LoD can be restricted. num_node is a parameter indicating the number of nodes actually signaled. This num_node can be derived from the geometry data. node[k] represents the information signaled for each 2×2×2 voxel region. k represents the node number in Morton order.

[0290] Note that when parsing is required before the geometry data is obtained, for parallel processing or the like, signaling can be performed using the syntax shown in B of Figure 32. In that case, it is only necessary to signal num_node.

[0291] Furthermore, when performing signaling using fixed-length data, the syntax shown in A of Figure 33 can be applied. In this case, a flag Flag[k] that controls the signaling is signaled. Similarly, in this case, when parsing is required before the geometry data is obtained, for parallel processing or the like, signaling can be performed using the syntax shown in B of Figure 33. In that case, it is only necessary to signal num_node.

[0292] <Variable Length Signaling Transmission>

[0293] Furthermore, the signaling described above can be performed using variable-length data. For example, as in the example of Figure 34, the position of a node in a 2×2×2 voxel region can be signaled. In this case, the bit length of signaling can be set according to the number of nodes in the 2×2×2 voxel region, based on, for example, the table information shown in A of Figure 34.

[0294] On the decoding side, the number of nodes included in each 2×2×2 voxel region can be determined from the geometry data. Therefore, as in the example shown in B of Figure 34, even if a bit string such as “10111010…” is input, the information of each voxel region can be correctly obtained by partitioning the bit string with the appropriate bit lengths.
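The partitioning on the decoding side can be sketched as below. The table of A of Figure 34 is not reproduced in this text, so the sketch assumes a bit length just large enough to address one of the nodes in the region (zero bits for a single node); `bit_length_for` and `split_bitstring` are hypothetical names.

```python
from typing import List


def bit_length_for(num_nodes: int) -> int:
    """Assumed table: the fewest bits that can address num_nodes positions."""
    return (num_nodes - 1).bit_length()


def split_bitstring(bits: str, node_counts: List[int]) -> List[int]:
    """Partition a bit string into one signaled position per voxel region.

    node_counts holds the number of nodes in each 2x2x2 region, which the
    decoder is assumed to have already derived from the geometry data.
    """
    positions = []
    offset = 0
    for n in node_counts:
        length = bit_length_for(n)
        chunk = bits[offset:offset + length]
        if len(chunk) != length:
            raise ValueError("bit string ended before all regions were read")
        positions.append(int(chunk, 2) if length else 0)
        offset += length
    return positions


positions = split_bitstring("10111010", [2, 4, 8, 3])
```

With node counts of 2, 4, 8, and 3, the assumed lengths are 1, 2, 3, and 2 bits, so the eight input bits are consumed exactly and no length fields need to be transmitted.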

[0295] In addition, as in the example of Figure 35, the index of the table information to be used can be signaled. In this case, for example, as shown in A of Figure 35, the bit length of signaling can be varied according to the number of nodes in the 2×2×2 voxel region.

[0296] For example, in the case where the number of nodes is five to eight according to the table information shown in A of Figure 35, two bits are allocated, and the table information shown in B of Figure 35 is selected. In this case, bit string "00" indicates that the first node in the predetermined search order (e.g., Morton order) is selected. Bit string "01" indicates that the first node in the reverse of the predetermined search order is selected. Bit string "10" indicates that the second node in the predetermined search order (not reversed) is selected. Bit string "11" indicates that the second node in the reverse of the predetermined search order is selected.

[0297] Furthermore, for example, in the case where the number of nodes is three or four according to the table information shown in A of Figure 35, one bit is allocated, and the table information shown in C of Figure 35 is selected. In this case, bit string "0" indicates that the first node in the predetermined search order (e.g., Morton order) is selected. Bit string "1" indicates that the first node in the reverse of the predetermined search order is selected.

[0298] On the decoding side, the number of nodes included in each 2×2×2 voxel region can be determined from the geometry data. Therefore, as in the example shown in D of Figure 35, even if a bit string such as “10111010…” is input, the information of each voxel region can be correctly obtained by partitioning the bit string with the appropriate bit lengths.
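The table-based selection described for Figure 35 can be sketched as a small decoder. The two-bit and one-bit code meanings follow the text above. The handling of regions with one or two nodes (no bits, first node taken) is an assumption, since that case is not stated here, and all names are illustrative.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")

# code -> (rank in the order, whether the search order is reversed)
TWO_BIT: Dict[str, Tuple[int, bool]] = {
    "00": (0, False), "01": (0, True), "10": (1, False), "11": (1, True),
}
ONE_BIT: Dict[str, Tuple[int, bool]] = {"0": (0, False), "1": (0, True)}


def code_length(num_nodes: int) -> int:
    """Bits allocated per region: 2 for five to eight nodes, 1 for three or four."""
    if 5 <= num_nodes <= 8:
        return 2
    if 3 <= num_nodes <= 4:
        return 1
    return 0  # assumption: one or two nodes need no signaling


def select_node(nodes: Sequence[T], code: str) -> T:
    """Pick the reference node; nodes are listed in the predetermined search order."""
    length = code_length(len(nodes))
    if len(code) != length:
        raise ValueError("code length does not match the number of nodes")
    if length == 0:
        return nodes[0]  # assumption for the unsignaled case
    rank, reverse = (TWO_BIT if length == 2 else ONE_BIT)[code]
    ordered = list(reversed(nodes)) if reverse else list(nodes)
    return ordered[rank]


def decode(bits: str, regions: List[Sequence[T]]) -> List[T]:
    """Split the bit string per region and resolve each code to a node."""
    selected, offset = [], 0
    for nodes in regions:
        length = code_length(len(nodes))
        selected.append(select_node(nodes, bits[offset:offset + length]))
        offset += length
    return selected


regions = [list("abcdef"), list("xyz"), list("pq"), list("lmnop")]
chosen = decode("11110", regions)
```

Here the five input bits are read as "11", "1", no bits, and "10", which resolve to the second node from the end of the first region, the last node of the second, the first node of the third, and the second node of the fourth.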

[0299] Figure 36 shows an example of the syntax for the variable-length case. In the syntax of Figure 36, bitLength is a parameter indicating the bit length. Additionally, signalType[i] is a parameter indicating the variable-length encoding method for each LoD.

[0300] Note that, also in the variable-length case, when parsing is required before the geometry data is obtained, for parallel processing or the like, num_node or flag[j] can be signaled as in the syntax shown in Figure 37.

[0301] <Reference Point Setting Process>

[0302] An example of the flow of the reference point setting process in this case will be described with reference to the flowchart of Figure 38. When the reference point setting process begins, the reference point setting unit 121 selects a reference point in step S321.

[0303] In step S322, the reference point setting unit 121 uses signaling to notify information about the reference point set in step S321. When the processing of step S322 ends, the reference point setting process ends, and the process returns to Figure 18. Since the encoding device 100 transmits the table information in this manner, the decoding device 200 can use the table information to perform decoding.

[0304] <5. Fourth Implementation Method>

[0305] <Method 4>

[0306] Next, the case of applying "Method 4" described above with reference to Figure 6 will be described. In the case of "Method 4", points closer to the center of the bounding box and points farther away from the center of the bounding box can be alternately selected as reference points for each level.

[0307] For example, when reference points are selected at the positions shown in Figure 39, points closer to the center of the bounding box and points farther away from the center of the bounding box are alternately selected for each level, as shown in A of Figure 39 to C of Figure 39. In this way, as shown in C of Figure 39, the range of movement of the reference point is limited to a narrow range as indicated by the dashed box, and thus the reduction in prediction accuracy is suppressed.

[0308] In this way, by determining the direction in which points are selected with reference to the center of the bounding box, the decrease in prediction accuracy can be suppressed regardless of the position of the bounding box.

[0309] Note that such point selection can be achieved by changing the search order of points based on the position of the bounding box. For example, the search order could be based on distance from the center of the bounding box. In this way, the search order can be changed according to the position of the bounding box. Furthermore, for example, the search order can be changed for each of the eight partitioned regions obtained by dividing the bounding box into eight (two along each of the x, y, and z directions).

[0310] <Reference Point Setting Process>

[0311] An example of the flow of the reference point setting process in this case will be described with reference to the flowchart of Figure 40. When the reference point setting process begins, in step S341, the reference point setting unit 121 determines whether a point closer to the center of the bounding box has already been selected as the reference point of the previous level.

[0312] Once it is determined that a closer point has been selected, the process proceeds to step S342.

[0313] In step S342, the reference point setting unit 121 selects the point farthest from the center of the bounding box as the reference point from the reference point candidates. When the processing of step S342 ends, the reference point setting process ends, and the process returns to Figure 18.

[0314] Furthermore, if it is determined in step S341 that no closer point has been selected, the process proceeds to step S343.

[0315] In step S343, the reference point setting unit 121 selects the point closest to the center of the bounding box from the reference point candidates as the reference point. When the processing of step S343 ends, the reference point setting process ends, and the process returns to Figure 18.

[0316] By selecting reference points in this manner, the encoding device 100 can suppress the decrease in the prediction accuracy of the reference points. Therefore, the decrease in encoding efficiency can be suppressed.
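The alternation of steps S341 to S343 can be sketched as follows. The sketch assumes that the first level, which has no previous level, starts with the closest point, and that ties are resolved by the order of the candidate list, standing in for the search order; both choices and all names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance; sufficient for comparing distances."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def select_reference(candidates: Sequence[Point], center: Point,
                     previous_was_closer: bool) -> Point:
    """Steps S341 to S343: pick the farthest or the closest candidate."""
    def distance_to_center(p: Point) -> float:
        return squared_distance(p, center)

    if previous_was_closer:                           # S341 yes -> S342
        return max(candidates, key=distance_to_center)
    return min(candidates, key=distance_to_center)    # S341 no -> S343


def select_per_level(levels: List[Sequence[Point]], center: Point) -> List[Point]:
    """Alternate closest and farthest selection across successive levels."""
    references = []
    previous_was_closer = False  # assumption: the first level picks the closest point
    for candidates in levels:
        references.append(select_reference(candidates, center, previous_was_closer))
        previous_was_closer = not previous_was_closer
    return references


center = (4, 4, 4)
levels = [
    [(0, 0, 0), (3, 3, 3)],
    [(1, 1, 1), (5, 5, 5)],
    [(2, 2, 2), (7, 7, 7)],
]
references = select_per_level(levels, center)
```

Because the distance is measured from the bounding box center rather than along a fixed axis direction, the same rule behaves consistently wherever the candidates lie inside the bounding box.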

[0317] <6. Appendix>

[0318] <Methods for layering and de-layering>

[0319] In the above description, lifting has been described as an example of a method for hierarchizing and de-hierarchizing attribute information; however, this technology can be applied to any technique for hierarchizing attribute information. That is, the method for hierarchizing and de-hierarchizing attribute information can be a method other than lifting. Furthermore, the method for hierarchizing and de-hierarchizing attribute information can be a non-scalable method or a scalable method as described in Non-Patent Document 3.

[0320] <Control Information>

[0321] Control information regarding the present technology described in each of the above embodiments can be transmitted from the encoding side to the decoding side. For example, control information (e.g., enabled_flag) that controls whether the application of the present technology is allowed (or prohibited) can be transmitted. Furthermore, control information that specifies the scope of the application of the present technology that is allowed (or prohibited) can be transmitted (e.g., upper or lower limit of block size or both, slice, image, sequence, component, view, layer, etc.).

[0322] <Surroundings and vicinity>

[0323] Note that in this description, positional relationships such as “nearby” or “around” can include not only spatial positional relationships but also temporal positional relationships.

[0324] <Computer>

[0325] The aforementioned series of processes can be performed either by hardware or by software. In the case where the processes are performed by software, the program constituting the software is installed in a computer. Here, "computer" includes a computer incorporated in dedicated hardware and, for example, a general-purpose personal computer that can perform various functions by installing various programs.

[0326] Figure 41 is a block diagram illustrating an example configuration of computer hardware that performs the above series of processes through a program.

[0327] In the computer 900 shown in Figure 41, the central processing unit (CPU) 901, read-only memory (ROM) 902, and random access memory (RAM) 903 are interconnected via bus 904.

[0328] The input-output interface 910 is also connected to the bus 904. The input unit 911, output unit 912, storage unit 913, communication unit 914, and drive 915 are connected to the input-output interface 910.

[0329] Input unit 911 includes, for example, a keyboard, mouse, microphone, touchpad, input terminals, etc. Output unit 912 includes, for example, a display, speaker, output terminals, etc. Storage unit 913 includes, for example, a hard disk, RAM disk, non-volatile memory, etc. Communication unit 914 includes, for example, a network interface. Drive 915 drives removable media 921, such as a magnetic disk, optical disk, magneto-optical disk, or semiconductor memory.

[0330] In the computer configured as described above, the CPU 901 loads a program, for example, stored in storage unit 913, into RAM 903 via input-output interface 910 and bus 904 and executes the program to perform the aforementioned series of processes. RAM 903 also appropriately stores data, etc., required by the CPU 901 to perform various processes.

[0331] For example, a program executed by a computer can be provided by being recorded on a removable medium 921, such as a package medium. In this case, the program can be installed in the storage unit 913 via the input-output interface 910 by attaching the removable medium 921 to the drive 915.

[0332] Furthermore, the program can be provided via wired or wireless transmission media such as a local area network, the Internet, or digital satellite broadcasting. In this case, the program can be received by the communication unit 914 and installed in the storage unit 913.

[0333] Alternatively, the program can be pre-installed in ROM 902 or storage unit 913.

[0334] <Applicable Targets of this Technology>

[0335] Although the application of this technology to the encoding and decoding of point cloud data has been described above, this technology is not limited to these examples and can be applied to the encoding and decoding of 3D data of any standard. That is, the specifications of the various types of processing, such as encoding and decoding methods, and of the various types of data, such as 3D data and metadata, are arbitrary, as long as they do not contradict this technology. Furthermore, some of the aforementioned processing and specifications can be omitted, provided this does not contradict this technology.

[0336] Furthermore, in the above description, the encoding device 100 and the decoding device 200 have been described as application examples of this technology, but this technology can be applied to any configuration.

[0337] For example, this technology can be applied to various electronic devices, such as transmitters and receivers for satellite broadcasting, cable broadcasting such as cable television, distribution over the Internet, and distribution to terminals via cellular communication (e.g., television receivers and mobile phones), or devices for recording images on media such as optical discs, magnetic disks, and flash memory, or for reproducing images from storage media (e.g., hard disk recorders and cameras).

[0338] Furthermore, for example, this technology can also be implemented as a partial configuration of a device, such as a processor (e.g., a video processor) in the form of a system large-scale integration (LSI) circuit, a module (e.g., a video module) using multiple processors, a unit (e.g., a video unit) using multiple modules, or a set obtained by further adding other functions to the unit (e.g., a video set).

[0339] Furthermore, this technology can also be applied to network systems comprising multiple devices. For example, this technology can be implemented as cloud computing, where multiple devices share and collaborate over a network. For example, this technology can be implemented in cloud services that provide image (moving image) related services to any terminal such as a computer, audiovisual (AV) device, portable information processing terminal, or Internet of Things (IoT) device.

[0340] Note that in this specification, "system" means a collection of multiple components (devices, modules (parts), etc.), and it is not important whether all components are housed in the same housing. Therefore, multiple devices housed in different housings and connected via a network, as well as a device in which multiple modules are housed in a single housing, are all systems.

[0341] <Fields and Applications for which this technology is applicable>

[0342] Note that systems, devices, processing units, etc., applying this technology can be used in any field, such as transportation, medical care, crime prevention, agriculture, animal husbandry, mining, beauty, factories, home appliances, weather, nature monitoring, etc. Furthermore, its use is arbitrary.

[0343] <Other>

[0344] Note that in this specification, a "flag" is information used to identify multiple states, and includes not only information for identifying two states, true (1) or false (0), but also information that can identify three or more states. Therefore, the value of the "flag" can be, for example, two values, 1 and 0, or three or more values. That is, the number of bits constituting the "flag" is arbitrary and can be one bit or more. Furthermore, identification information (including the flag) is assumed to include not only the identification information itself in the bitstream, but also difference information of the identification information relative to certain reference information in the bitstream. Therefore, in this specification, "flag" and "identification information" include not only that information itself, but also the difference information relative to the reference information.

[0345] Furthermore, various types of information (metadata, etc.) related to encoded data (bitstream) can be transmitted or recorded in any form, as long as the information is associated with the encoded data. Here, the term "associated" means, for example, that a piece of data can be used (linked) when processing other data. That is, associated data can be combined into one piece of data, or it can be separate pieces of data. For example, information associated with encoded data (image) can be transmitted on a different transmission path than the transmission path of the encoded data (image). Furthermore, for example, information associated with encoded data (image) can be recorded on a different recording medium (or another recording area of the same recording medium) than the encoded data (image). Note that this "association" can apply to a part of the data rather than the entire data. For example, an image and the information corresponding to that image can be associated with each other in any unit, such as multiple frames, a single frame, or a portion of a frame.

[0346] Note that in this specification, terms such as “combine,” “multiplex,” “add,” “integrate,” “include,” “store,” “put in,” “insert,” and “embed” mean combining multiple items into one, such as combining coded data and metadata into one data, and also mean one method of “association” as described above.

[0347] Furthermore, the implementation of this technology is not limited to the above-described implementation, and various modifications are possible without departing from the scope of this technology.

[0348] For example, a configuration described as a single device (or processing unit) can be divided and configured into multiple devices (or processing units). Conversely, a configuration described above as multiple devices (or processing units) can be combined and configured into a single device (or processing unit). Furthermore, configurations other than those described above can, of course, be added to the configuration of each device (or each processing unit). Additionally, if the configuration and operation of the entire system are substantially the same, a portion of the configuration of one device (or processing unit) can be included in the configuration of another device (or another processing unit).

[0349] Furthermore, for example, the above procedure can be executed in any device. In this case, it is sufficient as long as the device has the necessary functions (function blocks, etc.) and can obtain the necessary information.

[0350] Furthermore, for example, each step of a flowchart can be executed by a single device, or it can be shared and executed by multiple devices. Additionally, when a step includes multiple processes, those multiple processes can be executed by a single device, or they can be shared and executed by multiple devices. In other words, multiple processes included in a step can be executed as processes of multiple steps. Conversely, processes described as multiple steps can be executed together as a single step.

[0351] Furthermore, for example, in a program executed by a computer, the processes describing the steps of the program can be executed sequentially in the order described in this specification, or they can be executed in parallel or individually at necessary time intervals, such as when called. That is, as long as there is no contradiction, the processes in the corresponding steps can be executed in an order different from the above-described order. In addition, the processes in the steps used to describe the program can be executed in parallel with the processes in another program, or they can be combined with the processes in another program.

[0352] Furthermore, for example, multiple technologies related to this technology can be implemented independently as a single entity, provided there are no contradictions. Of course, any multiple technologies can also be combined and implemented. For example, part or all of the technology described in any of the embodiments can be combined with part or all of the technology described in other embodiments. Additionally, any part or all of the above-described technology can be implemented by using it in conjunction with another technology not described above.

[0353] Note that this technology can have the following configurations.

[0354] (1) An information processing device, comprising:

[0355] A hierarchical unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, hierarchically layers the attribute information by recursively repeating, with respect to the reference points, the classification into prediction points and reference points, the prediction points being used to derive the difference between the attribute information and a predicted value of the attribute information, and the reference points being used to derive the predicted value, wherein:

[0356] The hierarchical unit sets the reference point based on the centroid of the points.

[0357] (2) The information processing apparatus according to (1), wherein:

[0358] The hierarchical unit sets the point closer to the centroid among the candidate points as the reference point.

[0359] (3) The information processing apparatus according to (1) or (2), wherein:

[0360] The hierarchical unit sets the reference point based on the centroid of the points located within a predetermined range.

[0361] (4) The information processing apparatus according to any one of (1) to (3), wherein:

[0362] The hierarchical unit sets a reference point from multiple candidates that are under substantially the same conditions relative to the centroid based on a predetermined search order.

[0363] (5) An information processing method, comprising:

[0364] When performing hierarchical classification of attribute information for each point in a point cloud that represents objects with three-dimensional shapes as a set of points, the reference point is set based on the centroid of the point, the prediction point is used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference point is used to derive the predicted value.

[0365] (6) An information processing apparatus, comprising:

[0366] A hierarchical unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, hierarchically layers the attribute information by recursively repeating, with respect to the reference points, the classification into prediction points and reference points, the prediction points being used to derive the difference between the attribute information and a predicted value of the attribute information, and the reference points being used to derive the predicted value, wherein:

[0367] The hierarchical unit sets reference points based on the distribution of each point.

[0368] (7) The information processing apparatus according to (6), wherein:

[0369] The hierarchical unit sets reference points based on table information, which specifies a point close to the centroid of the points for each distribution pattern of the points.

[0370] (8) The information processing apparatus according to (6) or (7), wherein:

[0371] The hierarchical unit sets reference points based on table information, which specifies predetermined points for each distribution pattern of the points.

[0372] (9) The information processing apparatus according to (8), further comprising:

[0373] An encoding unit that encodes information related to table information.

[0374] (10) An information processing method, comprising:

[0375] When performing attribute information hierarchical classification of each point in a point cloud that represents objects with three-dimensional shapes as a set of points, the reference point is set based on the distribution of the points, the prediction point is used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference point is used to derive the predicted value.

[0376] (11) An information processing apparatus, comprising:

[0377] A hierarchical unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, hierarchically layers the attribute information by recursively repeating, with respect to the reference points, the classification into prediction points and reference points, the prediction points being used to derive the difference between the attribute information and a predicted value of the attribute information, and the reference points being used to derive the predicted value; and

[0378] An encoding unit that encodes information related to the setting of reference points by the hierarchical unit.

[0379] (12) The information processing apparatus according to (11), wherein:

[0380] The encoding unit encodes information related to the setting of reference points for all points.

[0381] (13) The information processing apparatus according to (11), wherein:

[0382] The encoding unit encodes information related to the setting of reference points for a subset of levels.

[0383] (14) The information processing apparatus according to (13), wherein:

[0384] The encoding unit also encodes information related to the setting of reference points for points that meet predetermined conditions.

[0385] (15) An information processing method, comprising:

[0386] For the attribute information of each point in a point cloud that represents an object with a three-dimensional shape as a set of points, the attribute information is hierarchically layered by recursively repeating, with respect to the reference points, the classification into prediction points and reference points, the prediction points being used to derive the difference between the attribute information and a predicted value of the attribute information, and the reference points being used to derive the predicted value.

[0387] Information related to the setting of reference points is encoded.

[0388] (16) An information processing apparatus, comprising:

[0389] A hierarchical unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, hierarchically layers the attribute information by recursively repeating, with respect to the reference points, the classification into prediction points and reference points, the prediction points being used to derive the difference between the attribute information and a predicted value of the attribute information, and the reference points being used to derive the predicted value, wherein:

[0390] The hierarchical unit alternately selects points closer to the center of the bounding box and points farther away from the center of the bounding box from the candidates for reference points at each level.

[0391] (17) The information processing apparatus according to (16), wherein:

[0392] The hierarchical unit selects points closer to the center of the bounding box and points farther away from the center of the bounding box from the candidates based on the position of the bounding box and the search order.

[0393] (18) The information processing apparatus according to (17), wherein:

[0394] The search order is based on the distance from the center of the bounding box.

[0395] (19) The information processing apparatus according to (17), wherein:

[0396] The search order is set for each region obtained by dividing the bounding box into eight regions.

[0397] (20) An information processing method, comprising:

[0398] When the attribute information of a point cloud representing an object with a three-dimensional shape as a set of points is hierarchically classified by recursively repeating the classification of predicted points and reference points relative to reference points, points closer to the center of the bounding box and points farther away from the center of the bounding box are alternately selected as reference points from the candidates for reference points at each level. Predicted points are used to derive the difference between the attribute information and the predicted value of the attribute information, and reference points are used to derive the predicted value.

[0399] [List of reference numerals]

[0400] 100: Encoding device

[0401] 101: Location Information Encoding Unit

[0402] 102: Location Information Decoding Unit

[0403] 103: Point Cloud Generation Unit

[0404] 104: Attribute Information Encoding Unit

[0405] 105: Bitstream Generation Unit

[0406] 111: Hierarchical Processing Unit

[0407] 112: Quantization unit

[0408] 113: Coding Unit

[0409] 121: Reference point setting unit

[0410] 122: Reference Relationship Setting Unit

[0411] 123: Inversion Unit

[0412] 124: Weighted Value Derivation Unit

[0413] 200: Decoding device

[0414] 201: Encoded Data Extraction Unit

[0415] 202: Location Information Decoding Unit

[0416] 203: Attribute Information Decoding Unit

[0417] 204: Point Cloud Generation Unit

[0418] 211: Decoding Unit

[0419] 212: Inverse quantization unit

[0420] 213: De-stratification processing unit

Claims

1. An information processing apparatus, comprising: a layering unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, layers the attribute information by recursively repeating, with respect to the reference points, the classification into prediction points and reference points, wherein the prediction points are used to derive the difference between the attribute information and the predicted value of the attribute information, and the reference points are used to derive the predicted value, and wherein the layering unit sets the reference point based on the centroid of the points.

2. The information processing apparatus according to claim 1, wherein the layering unit sets, as the reference point, the candidate point that is closer to the centroid.

3. The information processing apparatus according to claim 1, wherein the layering unit sets the reference point based on the centroid of the points located within a predetermined range.

4. The information processing apparatus according to claim 1, wherein the layering unit sets the reference point, in accordance with a predetermined search order, from among a plurality of candidates that satisfy substantially the same condition with respect to the centroid.

5. An information processing method, comprising: when the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points is layered by recursively repeating, on the reference points, the classification of points into predicted points and reference points, setting the reference point based on the centroid of the points, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value.

6. An information processing apparatus, comprising: a layering unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, layers the attribute information by recursively repeating, on the reference points, the classification of points into predicted points and reference points, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value, wherein the layering unit sets the reference point based on the distribution pattern of the points, and wherein the layering unit sets the reference point based on table information that specifies, for each distribution pattern of the points, the point closest to the centroid.

7. The information processing apparatus according to claim 6, further comprising: an encoding unit that encodes information related to the table information.

8. An information processing method, comprising: when the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points is layered by recursively repeating, on the reference points, the classification of points into predicted points and reference points, setting the reference point based on the distribution pattern of the points, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value, and wherein the reference point is set based on table information that specifies, for each distribution pattern of the points, the point closest to the centroid.

9. An information processing apparatus, comprising: a layering unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, layers the attribute information by recursively repeating, on the reference points, the classification of points into predicted points and reference points, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value; and an encoding unit that encodes information related to the setting of the reference point by the layering unit, wherein the layering unit sets the reference point based on table information that specifies, for each distribution pattern of the points, the point closest to the centroid.

10. The information processing apparatus according to claim 9, wherein the encoding unit encodes the information related to the setting of the reference point for all points.

11. The information processing apparatus according to claim 9, wherein the encoding unit encodes the information related to the setting of the reference point for points in a subset of the layers.

12. The information processing apparatus according to claim 11, wherein the encoding unit further encodes the information related to the setting of the reference point for points that satisfy a predetermined condition.

13. An information processing method, comprising: for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, layering the attribute information by recursively repeating, on the reference points, the classification of points into predicted points and reference points, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value; and encoding information related to the setting of the reference point, wherein the reference point is set based on table information that specifies, for each distribution pattern of the points, the point closest to the centroid.

14. An information processing apparatus, comprising: a layering unit that, for the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points, layers the attribute information by recursively repeating, on the reference points, the classification of points into predicted points and reference points, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value, and wherein the layering unit alternately selects, layer by layer, points closer to the center of the bounding box and points farther from the center of the bounding box as the reference points from among the reference point candidates.

15. The information processing apparatus according to claim 14, wherein the layering unit selects the points closer to the center of the bounding box and the points farther from the center of the bounding box from among the candidates based on the position of the bounding box and a search order.

16. The information processing apparatus according to claim 15, wherein the search order is based on the distance from the center of the bounding box.

17. The information processing apparatus according to claim 15, wherein the search order is set for each of the eight regions obtained by dividing the bounding box.

18. An information processing method, comprising: when the attribute information of each point in a point cloud representing an object with a three-dimensional shape as a set of points is layered by recursively repeating, on the reference points, the classification of points into predicted points and reference points, alternately selecting, layer by layer, points closer to the center of the bounding box and points farther from the center of the bounding box as the reference points from among the reference point candidates, wherein a predicted point is a point for which the difference between the attribute information and a predicted value of the attribute information is derived, and a reference point is a point used to derive the predicted value.
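
For illustration only, the centroid-based selection recited in claims 1 to 4 and the table information recited in claim 6 might be sketched as below. Several things here are assumptions rather than content of the claims: candidates are modeled as the occupied positions of a 2x2x2 block, a distribution pattern is modeled as an 8-bit occupancy mask, the table information is read as a lookup from each pattern to the position nearest the centroid, and `SEARCH_ORDER`, `nearest_to_centroid`, and `build_pattern_table` are hypothetical names.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[int, int, int]

# Hypothetical predetermined search order over the eight positions of a
# 2x2x2 block; the claims only require that some fixed order exists.
SEARCH_ORDER: List[Position] = [
    (x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)
]


def nearest_to_centroid(occupied: List[Position]) -> Position:
    """Return the occupied position closest to the centroid of all occupied positions.

    Candidates at substantially the same distance are resolved by SEARCH_ORDER.
    """
    n = len(occupied)
    centroid = tuple(sum(p[i] for p in occupied) / n for i in range(3))
    return min(
        occupied,
        # Rounding makes equal distances compare equal despite float error,
        # so the search-order index decides ties.
        key=lambda p: (round(math.dist(p, centroid), 9), SEARCH_ORDER.index(p)),
    )


def build_pattern_table() -> Dict[int, Position]:
    """Precompute the reference position for every non-empty occupancy pattern.

    Assumption: bit i of the pattern marks whether SEARCH_ORDER[i] is occupied.
    """
    table: Dict[int, Position] = {}
    for pattern in range(1, 256):
        occupied = [
            pos for bit, pos in enumerate(SEARCH_ORDER) if (pattern >> bit) & 1
        ]
        table[pattern] = nearest_to_centroid(occupied)
    return table


if __name__ == "__main__":
    table = build_pattern_table()
    # Look up the reference position for a block with three occupied positions.
    print(table[0b10001001])
```

Because the table depends only on the occupancy pattern, an encoder and a decoder that share it can agree on the reference point without recomputing centroids, and only information about the table itself would need to be signaled, in the manner of claim 7.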